Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LB-158 / LB-170: Fix listenstore code for users with special characters, add tests and upgrade influx #199

Merged
merged 4 commits into from Jun 19, 2017

Conversation

Projects
None yet
2 participants
@paramsingh
Copy link
Member

paramsingh commented Jun 17, 2017

So I spent the day looking through the influxdb-python source code. And the reason this was so hard to debug is a combination of random weird behaviour in influx with backslashes and some problems in our code.

Influxdb-python automatically escapes measurement names when we use write_points which can be seen here, so when we send a measurement with name "para\m" to write_points, it shows up in influx as "para\\m". As a result, the line protocol only requires escaping of \n characters. On the other hand, the query language requires escaping of everything and since influx had replaced the \ in the measurement name with \\, we need FOUR backslashes in the query to get the data.

So, I have changed the code so that it does this and added a bunch of comments explaining stuff as best I could. I expect more changes will be requested to make it clearer. Also, I have upgraded to influx 1.2.4 and influxdb-python 4.1.0 .

Also, there is some weird behaviour when we backslash quotes in influx. Influx SOMETIMES totally ignores backslashes present in measurement names, if they come before quotes. This behaviour is present in influx itself (not influxdb-python) and I am not sure if this is a bug or if I am misunderstanding something. For example:

Backslash away from quotes:

> insert "random\user"",val=1 hello=1
> show measurements
name: measurements
name
----
"random\user""

> select * from "\"random\\user\"\""
name: "random\user""
time                hello val
----                ----- ---
1497719311488498867 1     1

Backslash in front of a quote gets ignored

> insert "randomuser\"",val=1 hello=1
> show measurements
name: measurements
name
----
"random\user""
"randomuser""

> select * from "\"randomuser\"\""
name: "randomuser""
time                hello val
----                ----- ---
1497719359874380564 1     1

Stuff like this is hard to write code for, unless there is something that I am missing completely...

@paramsingh paramsingh closed this Jun 17, 2017

@paramsingh paramsingh reopened this Jun 17, 2017

@paramsingh paramsingh changed the title Escape users LB-158 / LB-170: Fix listenstore code for users with special characters, add tests and upgrade influx Jun 17, 2017

@paramsingh

This comment has been minimized.

Copy link
Member Author

paramsingh commented Jun 17, 2017

Clicked Create Pull Request before it was ready. 😅

@@ -196,7 +197,7 @@ def to_influx(self, measurement):
# add the user generated keys present in additional info to fields
for key, value in self.data['additional_info'].items():
if key not in Listen.SUPPORTED_KEYS:
data['fields'][key] = value
data['fields'][key] = escape(value)

This comment has been minimized.

Copy link
@paramsingh

paramsingh Jun 17, 2017

Author Member

Did this because additional_info variables may have random newlines in them, which would lead to parse errors.

paramsingh added some commits Jun 10, 2017

Fix handling of usernames with special characters in InfluxListenStore
Influx has different behaviour for the line protocol and for
the query language, so we have to use 2 backslashes for each
backslash in the line protocol and 4 backslashes for each
backslash in the query language. Change the code so that it
does that everywhere.

Also, because newlines cause parse errors in the line protocol
replace newlines in the measurement names and the values of
extraneous variables in additional_info with `\n`.

I've also added a unit test and an integration test with a user
with a weird user name.
Fix build errors due to bug in setuptools by upgrading
Setuptools might have an error that is causing build errors
See here: https://travis-ci.org/paramsingh/listenbrainz-server/builds/244051410

The bug seems to be known, see this:
pypa/setuptools#951

@paramsingh paramsingh force-pushed the paramsingh:escape-users branch from 43c1984 to 3b691c2 Jun 19, 2017

@mayhem

mayhem approved these changes Jun 19, 2017

Copy link
Member

mayhem left a comment

Thank you for doing all of this painful work! 🛡 💯

@mayhem mayhem merged commit 81c826c into metabrainz:master Jun 19, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@paramsingh paramsingh deleted the paramsingh:escape-users branch Jun 19, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.