jq-1.6 **extremely** slow compared to 1.5 #1826

Closed
waldner opened this issue Feb 17, 2019 · 16 comments

Comments

@waldner

waldner commented Feb 17, 2019

Sample timing test:

# time for i in {1..1000}; do echo '{"foo":"bar"}' | ./jq-1.5-linux64 '.foo' > /dev/null; done

real	0m4.852s
user	0m1.352s
sys	0m0.564s
# time for i in {1..1000}; do echo '{"foo":"bar"}' | ./jq-1.6-linux64 '.foo' > /dev/null; done

real	0m46.192s
user	0m42.448s
sys	0m0.996s

Scripts that used to finish very fast with jq 1.5 are now timing out with 1.6. This is under Debian stretch 9.6, with the official binaries downloaded from jq site.

@waldner waldner changed the title jq-1.6 *extremely* slow compared to 1.5 jq-1.6 **extremely** slow compared to 1.5 Feb 17, 2019
@nicowilliams
Contributor

This is quadratic linking of built-ins, which is now fixed in master thanks to a kind contribution by @muhmuhten.

@waldner
Author

waldner commented Feb 17, 2019

Thanks. Is a stable release with the fix in sight anytime soon?

@nicowilliams
Contributor

I think this warrants a new release soon, but we'll probably wait for a few more things...

@nicowilliams
Contributor

Ay, a few more things yet.

@pkoppstein
Contributor

pkoppstein commented Feb 18, 2019

The good news is that the "master" version is significantly faster than 1.6 as shown below.
The not-so-good news is that it's still slower than 1.5 and significantly slower than 1.4.
On a Mac:

u+s # version

 3.81 jq-1.4  
 5.85 jq-1.5  
30.6  jq-1.6 
10.7  jq-1.6-35-g80b064f 

@pkoppstein
Contributor

pkoppstein commented Feb 18, 2019

@waldner wrote:

Scripts that used to finish very fast with jq 1.5 are now timing out with 1.6.

I would venture to guess that jq is not being used optimally :-)

You might consider asking a usage question at stackoverflow.com with the jq tag.

@muhmuhten
Contributor

AFAICT the difference between jq 1.5 and jq 1.6 startup time comes down to a change in the binding strategy: 1.5 binds each builtin individually (in reverse order) only if used, while 1.6 parses a big block of builtins as a library, resolving all the symbols internally (the quadratic part).

The change seems to have been made to make the builtins easier to generate, but in the process it forces jq to link all the builtins every time.

Meanwhile, linking was always (and is still) quadratic.

@nicowilliams
Contributor

@muhmuhten I finally got a patch working that removes unbound instructions from a sub-tree of unbound instructions, but... it didn't help much, and cannot help much for reasons noted in the PR I opened for it. A bit disappointing...

@muhmuhten
Contributor

muhmuhten commented Feb 18, 2019

Which opens up a new avenue for hacking around the slow startup time: if we can separate symbol resolution from parsing (just block_join in the parser, only do the equivalent of bind_block_referenced), then we might be able to parse builtin.jq, build the array for builtins/0, and then do the binding, throwing out unreferenced blocks so we don't go through them repeatedly...

I'm not sure there are any easy ways to avoid quadratic linking within the current design, but that could go a long way toward bringing M back down to 1.5 levels.

@muhmuhten
Contributor

None of this is likely to affect any particular jq script that takes a long time, though; the impact is on shell scripts that call jq many (at least hundreds of) times.

If a single run of a particular jq script used to run "very fast" but is orders of magnitude slower now, I'd look for 9a4576e-like accidental complexity changes to builtins or primitives.

(I was totally going to suspect that particular issue looking at dates before I noticed the year changed between.)
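
For shell scripts that hit the per-invocation startup cost described above, one workaround that does not depend on the jq version is to feed many JSON documents to a single jq process instead of spawning jq once per document. A minimal sketch, assuming the inputs can be streamed as one JSON value per line and the same filter applies to each; `gen_inputs` is a hypothetical stand-in for whatever produces the data, and the `.foo` filter mirrors the timing test at the top of the thread:

```shell
#!/bin/sh
# Hypothetical stand-in for the real data source: emits 1000 JSON documents,
# one per line.
gen_inputs() {
  i=1
  while [ "$i" -le 1000 ]; do
    printf '{"foo":"bar%d"}\n' "$i"
    i=$((i + 1))
  done
}

# Slow pattern (shown for comparison, not run): one jq process per document,
# so the startup cost is paid 1000 times.
#   gen_inputs | while IFS= read -r doc; do
#     printf '%s\n' "$doc" | jq -r '.foo'
#   done

# Batched pattern: jq reads a stream of JSON values from stdin and applies the
# filter to each one, so the startup cost is paid once.
results=$(gen_inputs | jq -r '.foo')
```

Whether this fits depends on the script: it only helps when the per-document work can be expressed as a single filter over a stream, which may not hold for scripts that branch on each result.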

@waldner
Author

waldner commented Feb 18, 2019

None of this is likely to affect any particular jq script that takes a long time, though; the impact is on shell scripts that call jq many (at least hundreds of) times.

Yes, that is indeed the case. They can probably be optimized (will look at that when I have time, I didn't write them), but the increased slowness is a thing nonetheless. I was just citing the timeouts because that's how I became aware of the change.

@nicowilliams
Contributor

@muhmuhten I am not at a computer right now, but I was thinking of having a compiler flag to delay all bindings of top-level defs when the module is a library... Then we could bind all builtins backwards, one at a time...

@nicowilliams
Contributor

Another idea is to keep src/builtin.jq but use two newlines to separate defs, then compile each separately as we used to and bind them separately too.

@fadado

fadado commented Feb 19, 2019

And what about SPLITTING builtin.jq into one core module, always loaded, and extra modules that are not loaded without an explicit import? IMHO the first candidates to be removed from builtin.jq are the regexp-related functions and the uppercased relational functions. Perhaps this backward-incompatible modularity could be introduced in jq 2.0...

@nicowilliams
Contributor

Closed by #1834. Thanks @muhmuhten!

buildroot-auto-update pushed a commit to buildroot/buildroot that referenced this issue Dec 21, 2019
jq 1.6 has a severe performance regression compared to 1.5. The problem is
reported [1] and fixed [2] upstream, but there are different commits and
later subsequent fixes on top of them that make it cumbersome to patch
specifically.

Instead, bump to a recent git version.

[1] jqlang/jq#1826
[2] jqlang/jq#1834

Signed-off-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
@jpmckinney

In case anyone else comes across this issue, brew install --HEAD jq restores good performance.

FWIW, my query was against a JSON Lines file with jq 'reduce (inputs | .awards[]?.items[]?.classification.id) as $id ({}; .[$id|tostring] += 1)' myfile.jsonl

(I had written some Rust code to do a query, then realized I could do the same with jq, and was surprised that it was taking more than 10x longer. With the HEAD version it's only about 2x longer.)
