Incapable of working with large (very large?) yaml files #1215
Comments
That's a lot of memory :( You're right that it reads the entire document into memory before doing anything; that's how the underlying YAML parsers work. The only thing I can think of that would help immediately is turning off colors with the |
On that machine it seems to be about the same with -M. On my [Windows] desktop, which has substantially more RAM and runs the latest version, it gets to 16-17 GB and then I get a golang panic:
Support for streaming processing would be great, something like this:

    #!/usr/bin/jq -Rn -f

    # Build an object from a row, using the header row as keys.
    def objectify(headers):
      def tonumberq: tonumber? // .;
      . as $in
      | reduce range(0; headers|length) as $i ({}; .[headers[$i]] = ($in[$i] | tonumberq) );

    # Strip line endings, surrounding spaces, and surrounding double quotes.
    def trim:
      sub("\n";"") | sub("\r";"") | sub("^ +";"") | sub(" +$";"") | sub("\"";"") | sub("\"$";"");

    def csv2table:
      split(",") | map(trim);

    # Read the header line first, then emit one JSON object per remaining non-empty line.
    def csv2json:
      first(inputs) | csv2table as $headers |
      inputs | select(length > 0) | csv2table | objectify($headers);

    csv2json
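For the simple "what is in this file" kind of query, a line-at-a-time scan is sometimes enough. The sketch below is a Python illustration of the streaming idea only. It is not something yq provides and it is not a YAML parser: it assumes block-style YAML whose top-level keys are unquoted and start in column 0. The function name `top_level_keys` and the sample document are made up for illustration.

```python
import io
from typing import Iterable, Iterator


def top_level_keys(lines: Iterable[str]) -> Iterator[str]:
    """Yield top-level mapping keys, reading one line at a time.

    Assumption: block-style YAML, keys unquoted and starting in column 0.
    This is a line scanner, not a YAML parser.
    """
    for line in lines:
        # Skip blanks, indented content, comments, sequence items,
        # '---' document markers, and '%' directives.
        if not line or line[0] in " \t#-%\n\r":
            continue
        # Skip the '...' document end marker.
        if line.startswith("..."):
            continue
        key, sep, _ = line.partition(":")
        if sep:
            yield key.strip()


if __name__ == "__main__":
    # Hypothetical sample; a real file would be passed as an open file object.
    sample = io.StringIO(
        "---\n"
        "MSF:\n"
        "  SuperBlock:\n"
        "    BlockSize: 4096\n"
        "# a comment\n"
        "StringTable:\n"
        "  - 'a'\n"
        "PdbStream:\n"
        "  Age: 1\n"
    )
    for key in top_level_keys(sample):
        print(key)
```

Because the function only ever holds the current line, memory use stays flat regardless of file size. Passing an open file (`with open(path) as f: top_level_keys(f)`) works the same way, since file objects iterate lazily by line.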
Describe the bug
I have a 202MB Yaml file generated by llvm-pdbutil's pdb2yaml functionality. It is a yaml export of an executable debug database. I'm trying to do simple queries against this file to try to find/understand the contents and it seems that yq is not capable of doing so.
This command did not complete. It had used about 10 GB of memory when I pressed Ctrl-C to stop it before the machine ran out of memory.
Note that any how-to questions should be posted in the discussion board and not raised as an issue.
Version of yq: 4.16.2 from ubuntu PPA
Operating system: Ubuntu 20.04.4 LTS
Installed via: ppa
Input Yaml
Concise yaml document(s) (as simple as possible to show the bug, please keep it to 10 lines or less)
Err... I think it might be more than 10 lines.
Command
The command you ran:
Actual behavior
Lots of memory and CPU use, no output (even without /dev/null). I'm guessing it loads the entire file and all structures into a parsed memory structure before doing anything.
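To make the guess above concrete, here is a small Python sketch of the general effect. It says nothing about yq's internals. It assumes a stand-in "parser" (the hypothetical `nodes` generator) that emits one small node per input line, then compares peak heap usage when every node is kept alive against consuming the nodes one at a time.

```python
import tracemalloc
from typing import Callable, Iterator


def nodes(n: int) -> Iterator[dict]:
    """Stand-in for a parser that emits one small node per input line."""
    for i in range(n):
        yield {"key": f"k{i}", "value": str(i)}


def load_all(n: int) -> int:
    tree = list(nodes(n))  # every node stays alive, like a fully parsed document
    return len(tree)


def stream(n: int) -> int:
    # Each node becomes garbage as soon as it has been counted.
    return sum(1 for _ in nodes(n))


def peak_bytes(fn: Callable[[int], int], n: int) -> int:
    """Peak Python heap usage while running fn(n), as reported by tracemalloc."""
    tracemalloc.start()
    try:
        fn(n)
        return tracemalloc.get_traced_memory()[1]
    finally:
        tracemalloc.stop()


if __name__ == "__main__":
    n = 20_000
    print("load_all peak bytes:", peak_bytes(load_all, n))
    print("stream peak bytes:  ", peak_bytes(stream, n))
```

Both functions do the same amount of work, but only `load_all` pays for every node at once. Per-node overhead of this kind is one plausible reason a parsed document can take far more memory than the file it came from.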
Expected behavior
Not sure whether large files are simply considered unsupported by yq, but ideally it would not use 10 GB of RAM to parse a 200 MB file.
Additional context