-
Notifications
You must be signed in to change notification settings - Fork 830
OM: Validate utf8 in labelvalues #312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Allow gsum, gcount, and created to be sanely returned in Prometheus format. Extend openmetrics parser unittests to cover Info and StateSet. Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Signed-off-by: Lee Calcote <leecalcote@gmail.com>
return labels | ||
|
||
def _validate_utf8(char): | ||
return ord(char) < 65536 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a test for a utf-8 character?
I think you were going the right direction with the encode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brian-brazil ok. Starting on line 109, how's this instead?
utf8_str = ''.join(labelvalue) try: utf8_str.encode('utf') except: raise ValueError("Invalid line: " + text
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds about right.
Signed-off-by: Lee Calcote <leecalcote@gmail.com>
labels[''.join(labelname)] = ''.join(labelvalue) | ||
utf8_str = ''.join(labelvalue) | ||
try: | ||
utf8_str.encode('utf') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's utf-8
Signed-off-by: Lee Calcote <leecalcote@gmail.com>
That looks right, the test you added is failing though. |
utf8_str = ''.join(labelvalue) | ||
try: | ||
utf8_str.encode('utf-8') | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catch the specific exception type that you're looking for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brian-brazil we're scratching our heads on finding a set of characters as a test case...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added UnicodeEncodeError exception type.
9de686e
to
5c5c3e2
Compare
Signed-off-by: Lee Calcote <leecalcote@gmail.com>
fbd2ce6
to
bf70f58
Compare
Signed-off-by: Lee Calcote <leecalcote@gmail.com>
Signed-off-by: Lee Calcote <leecalcote@gmail.com>
Signed-off-by: Lee Calcote <leecalcote@gmail.com>
if sys.version_info >= (3,): | ||
utf8_str.encode('utf-8') | ||
else: | ||
utf8_str.decode('utf-8') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't make sense to me, it should be an encode in both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the issue is that when we created the invalid test, python2 just continued to process the data while the test yielded expected results in python3. This is a workaround we came up for that.
Please let us know if the test we have in place is indeed a good one to cover the negative case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you need a different test for Python 2
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>
Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com> # Conflicts: # .gitignore
.gitignore
Outdated
.*cache | ||
htmlcov | ||
htmlcov | ||
.vscode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should go in your global gitignore
tests/openmetrics/test_parser.py
Outdated
|
||
def test_labels_with_invalid_utf8_values(self): | ||
try: | ||
if sys.version_info < (2, 7): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to go back to >= 3
Not sure why this is failing on 2.6
244ba2e
to
c593db8
Compare
…thon into openmetrics Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com> # Conflicts: # prometheus_client/openmetrics/parser.py # tests/openmetrics/test_parser.py
Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>
Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>
Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>
Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>
tests/openmetrics/test_parser.py
Outdated
inj = u'\uD802' | ||
else: | ||
inj = '\xf8\xa1\xa1\xa1\xa1' | ||
inj = '\xfc' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\xff should do it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so using this or any in this way dint help but using them as a binary string seemed to have done the trick bcoz of the way python2 deals with it I guess.
Signed-off-by: Girish Ranganathan <girish.rranganathan@gmail.com>
Thanks! |
Excellent. :) |
Working with @girishranganathan to knock out a TODO for OpenMetrics compliant client.