OM: Validate utf8 in labelvalues #312

leecalcote · 2018-09-25T23:51:35Z

Working with @girishranganathan to knock out a TODO for OpenMetrics compliant client.

Allow gsum, gcount, and created to be sanely returned in Prometheus format. Extend openmetrics parser unittests to cover Info and StateSet. Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

brian-brazil · 2018-09-26T11:29:11Z

prometheus_client/openmetrics/parser.py

    return labels

+def _validate_utf8(char):
+    return ord(char) < 65536


Why is this a test for a utf-8 character?

I think you were going the right direction with the encode.
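As an illustration of the review comment above, here is a minimal sketch of why a code point range check is not a UTF-8 validity check. The helper names `below_65536` and `encodes_as_utf8` are hypothetical and are not part of the PR. The snippet assumes Python 3 string semantics.

```python
def below_65536(char):
    # The check from the diff: true for any code point in the Basic Multilingual Plane.
    return ord(char) < 65536


def encodes_as_utf8(text):
    # Ask the codec directly whether the text can be encoded as UTF-8.
    try:
        text.encode('utf-8')
    except UnicodeEncodeError:
        return False
    return True


lone_surrogate = '\ud802'   # inside the BMP, but not encodable as UTF-8
emoji = '\U0001F600'        # outside the BMP, but valid UTF-8

# The range check and the codec disagree on both inputs.
print(below_65536(lone_surrogate), encodes_as_utf8(lone_surrogate))
print(below_65536(emoji), encodes_as_utf8(emoji))
```

The range check accepts the lone surrogate and rejects the emoji, which is the opposite of what a UTF-8 validation should do.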

@brian-brazil ok. Starting on line 109, how's this instead?

    utf8_str = ''.join(labelvalue)
    try:
        utf8_str.encode('utf')
    except:
        raise ValueError("Invalid line: " + text)

That sounds about right.

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

brian-brazil · 2018-09-29T10:05:32Z

prometheus_client/openmetrics/parser.py

-                labels[''.join(labelname)] = ''.join(labelvalue)
+                utf8_str = ''.join(labelvalue)
+                try:
+                    utf8_str.encode('utf')


I believe it's utf-8

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

brian-brazil · 2018-10-01T09:30:22Z

That looks right, the test you added is failing though.

brian-brazil · 2018-10-01T09:30:59Z

prometheus_client/openmetrics/parser.py

+                utf8_str = ''.join(labelvalue)
+                try:
+                    utf8_str.encode('utf-8')
+                except:


Catch the specific exception type that you're looking for

@brian-brazil we're scratching our heads on finding a set of characters as a test case...

Added UnicodeEncodeError exception type.
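A minimal sketch of what catching the specific exception might look like, assuming Python 3. The function `checked_labelvalue` is a hypothetical stand-in for the inline parser code; `chars` and `text` mirror `labelvalue` and `text` from the diff. This is not the code that was merged.

```python
def checked_labelvalue(chars, text):
    # Join the collected characters and verify the result is encodable as UTF-8.
    value = ''.join(chars)
    try:
        value.encode('utf-8')
    except UnicodeEncodeError:
        # Only encoding failures are translated; any other error propagates unchanged.
        raise ValueError("Invalid line: " + text)
    return value


print(checked_labelvalue(list('bar'), 'a{foo="bar"} 1'))

try:
    # A lone surrogate such as u'\ud802' cannot be encoded as UTF-8 in Python 3.
    checked_labelvalue(['\ud802'], 'a{foo="?"} 1')
except ValueError as exc:
    print(exc)
```

A bare `except:` would also swallow unrelated errors such as `KeyboardInterrupt`, which is the usual reason for naming the exception type.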

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

brian-brazil · 2018-10-25T08:18:42Z

prometheus_client/openmetrics/parser.py

+                    if sys.version_info >= (3,):
+                        utf8_str.encode('utf-8')
+                    else:
+                        utf8_str.decode('utf-8')


This doesn't make sense to me, it should be an encode in both cases.

Yes, the issue is that when we created the invalid test, Python 2 just continued to process the data, while the test yielded the expected results in Python 3. This is the workaround we came up with for that.
Please let us know if the test we have in place is indeed a good one to cover the negative case.

Maybe you need a different test for Python 2
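One way to picture the Python 2 versus Python 3 difference discussed here: text is validated by encoding, while raw bytes are validated by decoding. The helper below is a hypothetical sketch, not what the PR does. It relies only on the fact that both `UnicodeEncodeError` and `UnicodeDecodeError` derive from `UnicodeError`.

```python
def is_valid_utf8(value):
    # bytes (which is `str` on Python 2) is checked by decoding;
    # text (`str` on Python 3, `unicode` on Python 2) is checked by encoding.
    try:
        if isinstance(value, bytes):
            value.decode('utf-8')
        else:
            value.encode('utf-8')
    except UnicodeError:
        # Covers both UnicodeEncodeError and UnicodeDecodeError.
        return False
    return True


print(is_valid_utf8(u'ok'))
print(is_valid_utf8(b'\xff'))
```

This also suggests why the negative test may need different input per interpreter: an invalid text value and an invalid byte string are not the same kind of object.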

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>

…thon into openmetrics

Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com> # Conflicts: # .gitignore

brian-brazil · 2018-11-08T10:28:41Z

.gitignore

 .*cache
-htmlcov
+htmlcov
+.vscode


This should go in your global gitignore

brian-brazil · 2018-11-08T10:30:51Z

tests/openmetrics/test_parser.py


+    def test_labels_with_invalid_utf8_values(self):
+        try:
+            if sys.version_info < (2, 7):


I think you want to go back to >= 3

Not sure why this is failing on 2.6

…thon into openmetrics Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com> # Conflicts: # prometheus_client/openmetrics/parser.py # tests/openmetrics/test_parser.py

Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>

brian-brazil · 2018-11-20T18:02:00Z

tests/openmetrics/test_parser.py

                inj = u'\uD802'
            else:
-                inj = '\xf8\xa1\xa1\xa1\xa1'
+                inj = '\xfc'


\xff should do it

Using this, or any other sequence, in this way didn't help, but using it as a binary string seems to have done the trick, because of the way Python 2 deals with it, I guess.
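For reference, a small sketch showing that the byte sequences tried in this thread are all rejected by the UTF-8 decoder when treated as binary strings. It assumes Python 3, and `decodes_as_utf8` is a hypothetical helper name.

```python
def decodes_as_utf8(raw):
    # Return True only if the byte string is well-formed UTF-8.
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True


# Sequences mentioned in the review: the original five-byte attempt,
# the single byte 0xFC, and the suggested 0xFF.
candidates = [b'\xf8\xa1\xa1\xa1\xa1', b'\xfc', b'\xff']

for raw in candidates:
    print(repr(raw), decodes_as_utf8(raw))
```

The byte `0xFF` never appears in well-formed UTF-8, so it makes a simple negative test input.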

Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>

brian-brazil · 2018-11-21T10:06:19Z

Thanks!

leecalcote · 2018-11-21T10:41:17Z

Excellent. :)

brian-brazil and others added 2 commits September 20, 2018 14:28

Add gsum/gcount to GaugeHistogram.

313f12f

Allow gsum, gcount, and created to be sanely returned in Prometheus format. Extend openmetrics parser unittests to cover Info and StateSet. Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

checks to validate labelvalues as utf-8

7577d64

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

brian-brazil reviewed Sep 26, 2018


switching to encode('utf-8') test

2b485d3

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

brian-brazil reviewed Sep 29, 2018


update from encode('utf') to 'utf-8'

53bcb89

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

brian-brazil reviewed Oct 1, 2018


brian-brazil force-pushed the openmetrics branch 2 times, most recently from 9de686e to 5c5c3e2 Compare October 3, 2018 13:14

Identified a potential invalid utf-8 character - u'\uD802

bf70f58

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

leecalcote force-pushed the valid-utf8 branch from fbd2ce6 to bf70f58 Compare October 24, 2018 16:20

leecalcote added 4 commits October 24, 2018 11:39

rebase

c988520

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

Merge branch 'openmetrics' into valid-utf8

d44eb30

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

catching specific error type

1012e6f

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

added python2.6 compatibility

3b02812

Signed-off-by: Lee Calcote <leecalcote@gmail.com>

brian-brazil reviewed Oct 25, 2018


brian-brazil and others added 4 commits November 5, 2018 13:37

Check for negative counter-like and gauge histogram values.

244ba2e

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>

trying to only use encode

3bdeed2

Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>

Merge branch 'openmetrics' of https://github.com/prometheus/client_py…

c4db135

…thon into openmetrics

Merge branch 'openmetrics' into valid-utf8

e1ca18a

Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com> # Conflicts: # .gitignore

brian-brazil reviewed Nov 8, 2018


brian-brazil force-pushed the openmetrics branch from 244ba2e to c593db8 Compare November 9, 2018 11:34

girishranganathan added 5 commits November 20, 2018 11:24

Merge branch 'openmetrics' of https://github.com/prometheus/client_py…

2af9025

…thon into openmetrics Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com> # Conflicts: # prometheus_client/openmetrics/parser.py # tests/openmetrics/test_parser.py

Merge branch 'openmetrics' into valid-utf8

3238046

Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>

Merge branch 'openmetrics' into valid-utf8

6708b0c

Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>

trying another sequence for negative test with python2

1f5bee3

Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>

trying another sequence for negative test with python2

8c38c1d

Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>

brian-brazil reviewed Nov 20, 2018


trying a binary string for python2

e26b1ef

Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>

brian-brazil merged this pull request into prometheus:openmetrics Nov 21, 2018
