Incapable of working with large (very large?) yaml files #1215

Open
@MattMills

Describe the bug
I have a 202 MB YAML file generated by llvm-pdbutil's pdb2yaml functionality. It is a YAML export of an executable's debug database. I'm trying to run simple queries against this file to understand its contents, and yq does not seem to be capable of doing so.

time yq eval '.DbiStream.Modules.Module' [snip].yaml > /dev/null
^C
real    3m49.209s
user    4m9.729s
sys     0m21.666s

This command did not complete. It had used about 10 GB of memory when I interrupted it with Ctrl-C to avoid running out of memory.

Note that any how-to questions should be posted on the discussion board and not raised as an issue.

Version of yq: 4.16.2 from ubuntu PPA
Operating system: Ubuntu 20.04.4 LTS
Installed via: ppa

Input Yaml
Concise yaml document(s) (as simple as possible to show the bug, please keep it to 10 lines or less)
Err... I think it might be more than 10 lines.
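
Since the real file is far too large to paste here, a synthetic input may help with reproduction. This is only a minimal sketch, assuming the relevant structure is a long list under `DbiStream.Modules`. The function name `gen_modules_yaml` and the `Index` field are hypothetical and not taken from the real pdb2yaml output.

```shell
#!/bin/sh
# Hypothetical helper: write COUNT synthetic module entries to OUTFILE.
# Only the key path (DbiStream -> Modules -> Module) mirrors the query in
# this report; the values and the Index field are made up for illustration.
gen_modules_yaml() {
    count=$1
    outfile=$2
    {
        printf 'DbiStream:\n'
        printf '  Modules:\n'
        # awk runs only its BEGIN block here, so it reads no input.
        awk -v n="$count" 'BEGIN {
            for (i = 0; i < n; i++) {
                printf "    - Module: \"module_%d.obj\"\n", i
                printf "      Index: %d\n", i
            }
        }'
    } > "$outfile"
}
```

Calling `gen_modules_yaml` with a large count and then running the same `yq eval` command against the generated file might show similar behavior. Whether it does depends on how closely this synthetic shape matches the real export.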

Command
The command you ran:

yq eval '.DbiStream.Modules.Module' [snip].yaml > /dev/null

Actual behavior

High memory and CPU use, and no output (even without redirecting to /dev/null). I'm guessing yq loads the entire file into a parsed in-memory structure before evaluating anything.

Expected behavior

I'm not sure whether large files are simply considered unsupported by yq, but ideally it would not use 10 GB of RAM to parse a 200 MB file.

Additional context
