
When writing and input file is the same as piped output file, jq writes nothing #2152

Closed
LoganBarnett opened this issue Jul 3, 2020 · 4 comments


LoganBarnett commented Jul 3, 2020

Describe the bug
This is possibly related to #1433 and #1110, and peripherally to #105. If jq's input file is the same as its output file (via a pipe), jq will emit nothing, or perhaps even make a partial write (more below). I have only noticed this on write operations though it is possible a filter could produce this as well.

To Reproduce

This is the expected scenario, where the input and output are separate:

echo "with separate out file"
echo '{"foo": 1, "bar": ["baz", "qux"]}' > in.json
jq '.foo=2' in.json > out.json
cat out.json

The output of the expected scenario will be the entire document with foo set to 2.

Here is the breaking scenario, where jq reads in.json and also pipes its output there:

echo "jq pipe to same file with set from file"
echo '{"foo": 1, "bar": ["baz", "qux"]}' > in.json
jq '.foo=2' in.json > in.json
cat in.json

The output of this breaking scenario is that in.json is simply an empty file. No stderr is seen either.

Environment (please complete the following information):

  • macOS (from sysctl kern.version: kern.version: Darwin Kernel Version 18.7.0: Mon Feb 10 21:08:45 PST 2020; root:xnu-4903.278.28~1/RELEASE_X86_64)
  • jq v1.6

Additional context

This was a head scratcher for me. In hindsight, I'm not even sure this should be a bug. If jq streams (I didn't think you could get away with that in a JSON document), then I could easily understand how this breaks because the output is being written as it is being read - what should the outcome be in that case? Undefined, I imagine.

For #1433 and #1110, this initially appeared in my tests as an issue with isatty but that was a poor rabbit hole. I was only flipping between watching stdout and writing to my file (which was also my input).

I did mention this is peripherally related to the dead horse that is #106. I am not in a position to put together such a pull request, but I can record my findings and hopefully this report will help someone else who tries to get around it as I did. The workarounds listed in #106 are numerous - I went with writing to a temporary file and then moving it. No biggie.

If there were one ask from this ticket (I hesitate to call it a bug), it might be to either allow jq to buffer everything with some kind of flag (-s doesn't seem to do that, or I misunderstand it), or produce an error when the input gets funky. Though such a case seems very niche. I would not press the maintainers to pursue a fix. I will allow the maintainers to do as they wish with it though, and close it. It'll be indexed by web crawlers and that's my primary goal here :)

Thanks for maintaining such a delightful, self-standing tool! I've enjoyed jq for many years, and I've recently come across oq which adds YAML and XML to the mix.

@wtlangford
Contributor

So this isn't actually a jq bug, it's a function of how shells work. The stream redirection you did for outputting (> in.json) causes your shell to open the output file and truncate it before running the command. This means that by the time jq gets to look at whatever its input is, the file is already empty.
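The truncation can be seen without jq at all. The following is a minimal sketch, assuming a POSIX-style shell; the scratch directory and the use of `true` as a stand-in command are illustrative choices, not part of the original report.

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)"

echo '{"foo": 1, "bar": ["baz", "qux"]}' > in.json

# `true` reads nothing and prints nothing, yet in.json ends up empty:
# the shell opens and truncates the redirect target before it starts the command.
true > in.json

if [ -s in.json ]; then
  echo "in.json still has content"
else
  echo "in.json is empty"
fi
```

Any command placed before `> in.json` would see the same already-empty file, which is why jq has nothing to read in the breaking scenario.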

In general, we recommend using a tool like sponge (https://linux.die.net/man/1/sponge) to handle this- it consumes the entire input stream before writing to the output file, which avoids the truncation issue above. In your case, the command would become jq '.foo=2' in.json | sponge in.json. On macOS, you can find sponge in the moreutils brew package.
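If sponge is not installed, the temporary-file workaround mentioned in the original report does the same job. This is a sketch under assumptions: the scratch directory and the `in.json.XXXXXX` template name are hypothetical, and it relies on `mktemp` accepting a template argument, which both the GNU and BSD versions do.

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)"
echo '{"foo": 1, "bar": ["baz", "qux"]}' > in.json

# Write to a temporary file next to the original, then replace the
# original only if jq exited successfully.
tmp=$(mktemp in.json.XXXXXX)
jq '.foo=2' in.json > "$tmp" && mv "$tmp" in.json

# Show the updated document.
jq -c . in.json
```

Because `mv` replaces the file, in.json takes on the permissions of the temporary file; if that matters, adjust them afterwards with `chmod`.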

@wtlangford
Contributor

That said, re: your comments about streaming-

jq "streams" by default in that input files need not be a single JSON value, but could be multiple JSON values (imagine a file with multiple JSON objects in it- structured logging is a common example). So it passes each of those JSON values one-at-a-time to your jq program.

The -s flag modifies this default behavior where it slurps up all JSON values in the input into a single array and passes that whole array to your jq program.

It also has a streaming mode (--stream), which is mainly meant for when you have very large objects that won't fit well in memory. In that case, the parser emits values to your jq program as it parses through the JSON values. This mode is admittedly a bit complicated and tricky to use, however.
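The three modes described above can be compared side by side. This is a small sketch with made-up input values; it only illustrates how each mode feeds values to the jq program.

```shell
# Default: two JSON values in one input, so the program runs once per value.
printf '%s\n' '{"a": 1}' '{"a": 2}' | jq -c '.a'
# prints 1 and then 2, on separate lines

# -s (--slurp): all input values are collected into a single array first.
printf '%s\n' '{"a": 1}' '{"a": 2}' | jq -c -s 'map(.a)'
# prints [1,2]

# --stream: the parser emits path/leaf events as it walks the value,
# instead of handing over the whole object at once.
printf '%s\n' '{"a": [1, 2]}' | jq -c --stream '.'
```

None of these modes changes when the shell truncates a redirect target, so -s on its own does not make reading and writing the same file safe.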

@LoganBarnett
Author

So this isn't actually a jq bug, it's a function of how shells work. The stream redirection you did for outputting (> in.json) causes your shell to open the output file and truncate it before running the command. This means that by the time jq gets to look at whatever its input is, the file is already empty.

TIL. Thanks for the clarification!

The -s flag modifies this default behavior where it slurps up all JSON values in the input into a single array and passes that whole array to your jq program.

It also has a streaming mode (--stream), which is mainly meant for when you have very large objects that won't fit well in memory. In that case, the parser emits values to your jq program as it parses through the JSON values. This mode is admittedly a bit complicated and tricky to use, however.

Thankfully the documentation was pretty good here on explaining its nuances from what I was thinking jq was doing.

Thanks again! I'll close this out now.

@rainabba

rainabba commented Nov 24, 2021

I think I'm not understanding -s and my output file isn't an "appended version of the original" as intended, but ends up totally blank. Without the file redirect, I'm getting the expected output and I thought -s would address the "input file is the same as piped output file" issue. What have I missed? Thanks in advance!

(cat package.json; cat eslintrc.json) | jq -s '.[0] * .[1]' > package.json

Surely this would work, but instead I end up with a file containing '[]\n'. Again, I only have to drop the " > package.json" and I get the expected and desired output. Must I use an env-var to do this?

(cat package.json; cat eslintrc.json) | jq -s '.[0] * .[1]' | jq -s '.[]' > package.json

Here's the best I could come up with:

echo "`(cat package.json; cat eslintrc.json) | jq -s '.[0] * .[1]'`" > package.json
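Following the explanation earlier in the thread, the `> package.json` redirect is set up by the shell, so the file can be emptied before `cat package.json` reads it, and -s cannot help with that. A minimal sketch of the temporary-file approach for this merge is below; the file contents and the `package.json.XXXXXX` template name are hypothetical stand-ins, and passing both files directly to jq replaces the `cat` subshell.

```shell
# Work in a scratch directory with stand-in files (contents are made up).
cd "$(mktemp -d)"
echo '{"name": "demo", "version": "1.0.0"}' > package.json
echo '{"eslintConfig": {"root": true}}' > eslintrc.json

# -s slurps the values from both files into one array; `*` merges the objects recursively.
# The result goes to a temporary file, which replaces package.json only on success.
tmp=$(mktemp package.json.XXXXXX)
jq -s '.[0] * .[1]' package.json eslintrc.json > "$tmp" && mv "$tmp" package.json

# Show the merged document.
jq -c . package.json
```

With moreutils installed, piping the same jq command into `sponge package.json`, as suggested earlier in the thread, avoids the explicit temporary file.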


3 participants