JQ dependency is too heavy for some setups #162
Comments
Scratch the last option. Unfortunately, we won't be able to interleave object types this way, unless we change the schema to embed type data into the objects themselves.
Also, check if the stock PyPI package is easy to install in AWS Lambda. If it is, try again to get stream parsing support merged upstream. Consider implementing parsing of file objects instead of iterators/generators, as the maintainer requested.
@spbnick Is there extra information I need to know?
@mrbazzan, that would be fun to do indeed! If you want to do that, here are some of the requirements:
You can work on that in your own repo, making your own package, etc., but I would need to review the solution before using it and accepting the dependency. The likelihood of success would be increased with early feedback, though.
@spbnick Okay. A pure Python implementation for binding Also, please kindly provide sample data to run test on.
Also, I'm still pretty confused. I went through the I'm really interested in this project, and I would appreciate further guidance
We can't really use a pure-Python implementation for binding jq, since jq itself is written in C. So we might need to write a pure-Python (standard library only) parser for JSON object streams. That's all we need from jq - the ability to parse a sequence of JSON objects without loading the whole file into memory. Another option may be to work further with upstream on incorporating our changes (I got to the point of the author ignoring me 😬), or to make our own binding for jq, just for stream parsing. In either case we would need a compiled package on PyPI, and verification that e.g. AWS Lambda can handle it. Here's a release tag of our fork of jq.py, if you're interested in that: https://github.com/kernelci/jq.py/tree/1.2.1.post1
You can start with the sample I already provided, just
Oh... I think I have a better understanding of the problem now. We want a package that offers the parsing ability of jq (without loading the whole file into memory) but uses only standard Python packages, so as to make it easy to install kcidb in environments like AWS Lambda, right?
Yep. Or, if compiled PyPI packages work in AWS after all, either work with upstream to integrate our changes, or make our own PyPI package binding jq just for parsing.
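For illustration, the pure-Python option discussed above (parsing a sequence of JSON objects from a file object without loading the whole file into memory) could look roughly like the sketch below. It is not kcidb or jq.py code: the function name `iter_json_objects` and the `chunk_size` parameter are made up for this example, and it assumes every top-level value is an object or an array, separated by optional whitespace. It uses only `json.JSONDecoder.raw_decode` from the standard library.

```python
import io
import json

# JSON insignificant whitespace, per the JSON grammar
_WHITESPACE = " \t\n\r"


def iter_json_objects(stream, chunk_size=65536):
    """Yield top-level JSON objects/arrays from a text file object.

    Hypothetical helper: reads `stream` in chunks and keeps only the
    not-yet-parsed tail in memory, never the whole input.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        at_eof = not chunk
        buffer += chunk
        pos = 0
        while True:
            # raw_decode() does not skip leading whitespace itself
            while pos < len(buffer) and buffer[pos] in _WHITESPACE:
                pos += 1
            if pos == len(buffer):
                break
            try:
                value, end = decoder.raw_decode(buffer, pos)
            except json.JSONDecodeError:
                if at_eof:
                    raise  # truncated or invalid input
                break  # value may be incomplete, read another chunk
            # Objects and arrays are self-delimiting; bare numbers are not,
            # so this sketch refuses other top-level values outright.
            if not isinstance(value, (dict, list)):
                raise ValueError("only objects/arrays supported at top level")
            yield value
            pos = end
        # Drop everything already parsed
        buffer = buffer[pos:]
        if at_eof:
            break


if __name__ == "__main__":
    sample = io.StringIO('{"version": 1}\n{"tests": [1, 2.5]} [3]')
    for obj in iter_json_objects(sample, chunk_size=4):
        print(obj)
```

Note that a value spanning many chunks is re-parsed from its start after each read, so a real implementation would want a larger chunk size or a smarter way to detect value boundaries.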
Depending on the jq library makes it difficult to install kcidb in restricted environments, particularly in AWS Lambda, which e.g. Tuxsuite uses.
Consider other options, e.g.:
kcidb-submit-stream, kcidb-db-dump-stream, kcidb-db-load-stream, and so on), and move them to a separate package, along with jq dependency.