Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passing command line arguments throws JSON Serialization Exception #205

Closed
madisonb opened this issue Nov 23, 2015 · 13 comments
Closed

Passing command line arguments throws JSON Serialization Exception #205

madisonb opened this issue Nov 23, 2015 · 13 comments
Assignees
Labels

Comments

@madisonb
Copy link

It appears that when passing extra arguments to sparse run via -o the topology configuration generated is not JSON Serializable and throws an exception.

Steps to reproduce:

  1. Run the example wordcount topology via quickstart
  2. Add the command line argument -o 'my.conf="TEST"' like so:
    sparse run -o 'my.conf="TEST"'
  3. Generates exception: Caught exception: Topology conf is not json-serializable

Upon further inspection, it looks like the lein command generated is missing the single quotes with the new argument, the generated command is below:

lein run -m streamparse.commands.run/-main topologies/wordcount.clj -t 0 --option 'topology.workers=2' --option 'topology.acker.executors=2' --option 'streamparse.log.path="/my/user/path/wordcount/logs"' --option 'streamparse.log.level="debug"' --option my.conf="TEST"
@amontalenti
Copy link
Contributor

@madisonb Interesting bug. Does the same thing happen to you if you do -o my.conf="TEST"? That is, don't have the single quotes in your invocation?

@madisonb
Copy link
Author

@amontalenti Same thing.
sparse run -n wordcount -o my.conf="TEST"
java.lang.IllegalArgumentException: Topology conf is not json-serializable

I just did a new wordcount quickstart via the following commands:

$ sparse quickstart wordcount
$ cd wordcount/
$ sparse run -n wordcount -o my.conf="TEST" # with or without -n flag
# error thrown

I am running OS X 10.10.5 in a pew virtualenv if that makes a difference, Python 2.7.10 and Java 1.8.0_20.

@j-bennet
Copy link
Contributor

I have the same error. I'm wondering if this line is problematic:

https://github.com/Parsely/streamparse/blob/master/streamparse/cli/run.py#L47

While all the other options are added with single quotes around them, like this:

https://github.com/Parsely/streamparse/blob/master/streamparse/cli/run.py#L40

user-provided options are not quoted.

Is this a bug?

@amontalenti
Copy link
Contributor

@madisonb @j-bennet ah hah, I see what's going on now. It's all about shell quote escaping rules, of course! Here's the proof:

$ echo sparse -o my.conf="TEST"
sparse -o my.conf=TEST

Believe it or not, if you escape this in your shell, it'll do the right thing:

$ echo sparse -o my.conf=\"TEST\"
sparse -o my.conf="TEST"

Which means, if you escape the quotes in your command-line invocation of sparse, this will also work (as a workaround). Clearly we need to put some more thought into this.

@j-bennet, @madisonb, please try e.g. sparse run -n wordcount -o my.conf=\"TEST\" (notice the escaped \" chars at the shell), and let me know if that works.

Ironically, this is only true if you actually want a string literal via configuration. That's because, for example, storm conf options can be specified with integer literals (like -o topology.max.spout.pending=1000, which shouldn't use escaped quotes or any quotes at all). This will actually get interpreted as integers when parsed by the Clojure/JVM interop layer with Storm. Very confusing, I admit.

@amontalenti
Copy link
Contributor

@j-bennet Oh, I was also looking more carefully at the code. You are right that --option {} should be surrounded by single quotes. This is because, when the command-line is taken from the Python layer (sparse) and sent to the Clojure layer via a lein invocation, I think it's possible the string escaping rules are in effect as well. I think TEST is being interpreted as an actual Clojure symbol or something. I have an idea about adding a small validator in the Clojure layer that would make finding this bug easier. Going to try to implement it now.

@j-bennet
Copy link
Contributor

Some more findings with option quoting.

This command fails:

lein run -m streamparse.commands.run/-main topologies/wordcount.clj -t 0 --option 'topology.workers=2' --option 'topology.acker.executors=2' --option 'streamparse.log.path="/Users/irina/repos/wordcount/logs"' --option 'streamparse.log.level="debug"' --option boo="foo"

This command fails:

lein run -m streamparse.commands.run/-main topologies/wordcount.clj -t 0 --option 'topology.workers=2' --option 'topology.acker.executors=2' --option 'streamparse.log.path="/Users/irina/repos/wordcount/logs"' --option 'streamparse.log.level="debug"' --option 'boo=foo'

This command works:

lein run -m streamparse.commands.run/-main topologies/wordcount.clj -t 0 --option 'topology.workers=2' --option 'topology.acker.executors=2' --option 'streamparse.log.path="/Users/irina/repos/wordcount/logs"' --option 'streamparse.log.level="debug"' --option 'boo="foo"'

@amontalenti
Copy link
Contributor

More sleuth work revealed that PR #189, contributed by @eric7j and merged by @dan-blanchard, actually fixed this issue (I think). I think the reason no one has actually got this fix in a released version is because when @dan-blanchard made 2.1.x releases of our Python code, he didn't also make a corresponding release for our Clojure code via lein/maven, thus this change never made it into the released JAR. Indeed, the fix was made in Sept 2015 but the last released JAR on clojars is July 7, 2015.

Easy fix -- want to take care of this, @dan-blanchard? I think you just need to do a lein deploy clojars using the v2.1.3 tag. Also, a reminder that because we use SNAPSHOT releases, there is no need to update the project.clj Jinja2 template or anything.

@amontalenti
Copy link
Contributor

@j-bennet yes, that makes perfect sense. And, to explain why:

  1. Your first command-line fails because --option foo="boo" gets translated into --option foo=boo by your shell. Then boo is treated like a Clojure symbol, not a string, and thus doesn't serialize.
  2. Your second command-line fails because of much the same reason as number 1. Setting --option 'foo=boo' means forcing a shell evaluation equivalent to --option foo=boo.
  3. Your third command-line succeeds because of the double-quotes inside the single-quotes. The shell expands this to --option boo="foo" when sending it lein. Thus, "foo" finally looks like a string, and serializes like one in JSON. I think --option boo=\"foo\" would also be a winning option from your shell.

@dan-blanchard
Copy link
Member

Just published a new version of our Clojure code to clojars.

@amontalenti
Copy link
Contributor

@madisonb @j-bennet Does the new streamparse JAR fix the issue? Your streamparse project should pick it up automatically.

To confirm it got picked up, run this command out of your project directory:

$ lein deps :tree 2>/dev/null | grep 'streamparse'
 [com.parsely/streamparse "0.0.4-20151215.145617-6"]

The output should include a date string of 20151215... if you are picking up the version @dan-blanchard published on Dec 15. Then check if it fixes your issues with sparse -o quoting.

@j-bennet
Copy link
Contributor

@amontalenti Yes, this issue is now fixed. No need to backslash the quotes.

@slava92
Copy link

slava92 commented Jan 4, 2016

Streamparse has been published on Clojars as [com.parsely/streamparse "0.0.4-SNAPSHOT"].
When we try to create a release, lein refuses to do it due to -SNAPSHOT suffix of Streamparse artifact. Can you publish it using just version number instead? Thank you.

@dan-blanchard
Copy link
Member

@slava92 The next release will do away with the need for the Clojure library entirely, so that won't be a problem much longer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants