work with generator source #44
Comments
@Bubu thanks for the very good question. I have pondered this myself in the past and also thought it would be nice to have, so having someone else express interest in the idea is definitely good. I think this should be possible, but it needs some care, possibly great care. The ijson functions actually already support generators as inputs, but those are assumed to be the lower-level generator functions of ijson itself (e.g., you can use ijson.parse() as the input to ijson.items; see the "Intercepting events" section of the README). It should still be possible to detect those separately from any other arbitrary generators and act differently, though. After that it should all work, because funnily enough we internally turn file objects into generators! (See line 215 in ed2182e.)
The C backend might need some extra care as well. I can't promise anything in terms of deadlines. But like I said, I'm on board with the idea, and if someone decides to step in and give it a crack in the meantime I'll be happy to review code and PRs.
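To illustrate the remark above about file objects being turned into generators internally, here is a minimal sketch of what such a conversion could look like. It is not ijson's actual code: the function name `file_chunks` and the default `buf_size` are made up for this example, and the only assumption is a binary file object with a standard `read(n)` method.

```python
import io
from typing import BinaryIO, Iterator


def file_chunks(f: BinaryIO, buf_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield successive chunks read from f until it is exhausted.

    Hypothetical helper for illustration only, not part of ijson.
    """
    while True:
        chunk = f.read(buf_size)
        if not chunk:
            # An empty read signals end of input.
            return
        yield chunk


data = b'{"a": [1, 2, 3]}'
chunks = list(file_chunks(io.BytesIO(data), buf_size=4))
```

If the parsing pipeline consumes a chunk generator like this one, then accepting a user-supplied generator of bytes would mostly be a matter of skipping this step, which is the idea discussed above.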
For those coming in the future: see #58 (comment) for an (untested, personally) example of a simple file-like wrapper around a generator as a workaround.
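For readers who want a rough idea of what such a wrapper might look like, here is a minimal sketch. It is not the code from the linked comment: the class name `IterableAsFile` and the sample generator `produce_chunks` are invented for this example, and it only assumes a synchronous iterable that yields `bytes` chunks. The standard-library `json` module stands in as the consumer so the sketch runs without extra dependencies.

```python
import json
from typing import Iterable, Iterator


class IterableAsFile:
    """Minimal read-only, file-like view over an iterable of bytes chunks.

    Hypothetical helper for illustration only.
    """

    def __init__(self, chunks: Iterable[bytes]):
        self._chunks: Iterator[bytes] = iter(chunks)
        self._buffer = b""

    def read(self, n: int = -1) -> bytes:
        if n is None or n < 0:
            # Read everything that is left.
            data = self._buffer + b"".join(self._chunks)
            self._buffer = b""
            return data
        # Pull chunks until n bytes are buffered or the source is exhausted.
        while len(self._buffer) < n:
            chunk = next(self._chunks, None)
            if chunk is None:
                break
            self._buffer += chunk
        data, self._buffer = self._buffer[:n], self._buffer[n:]
        return data


def produce_chunks() -> Iterator[bytes]:
    # Stand-in for a library that only exposes a generator of bytes.
    yield b'{"items": [1, '
    yield b'2, 3]}'


wrapped = IterableAsFile(produce_chunks())
document = json.load(wrapped)
```

An object like `wrapped` could presumably be passed wherever ijson expects a file object, for example `ijson.items(wrapped, "items.item")`, though that combination is not tested here.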
Based on the above, here is what I'm using with httpx (Python 3.10+ is needed for the built-in anext):

    import asyncio
    from contextlib import asynccontextmanager
    from typing import AsyncIterator

    import httpx
    import ijson


    class HttpxStreamAsFile:
        """Expose an httpx streaming response through an async read() method."""

        def __init__(self, url: str):
            self.url = url
            self.data = None
            self.client = httpx.AsyncClient()

        @asynccontextmanager
        async def create_stream(self) -> AsyncIterator[None]:
            try:
                await self._create_stream()
                yield
            finally:
                # Closing the client also releases the streamed connection.
                await self.client.aclose()

        async def _create_stream(self) -> None:
            req = self.client.build_request("GET", self.url)
            res = await self.client.send(req, stream=True)
            self.data = res.aiter_bytes()

        async def read(self, n: int) -> bytes:
            # n is only a hint here: each call returns the next chunk as sent.
            if self.data is None or n == 0:
                return b""
            return await anext(self.data, b"")


    async def main():
        url = "your-url"
        httpx_as_file = HttpxStreamAsFile(url)
        async with httpx_as_file.create_stream():
            # ijson.parse yields (prefix, event, value) tuples; ijson.items
            # would need a prefix argument and yields whole objects instead.
            async for prefix, event, value in ijson.parse(httpx_as_file):
                print(prefix, event, value)


    asyncio.run(main())
Hi,
I was trying to use ijson with a JSON stream coming from a zip archive through a libarchive binding. Unfortunately, the package I tried first exposes only a generator for getting the file bytes out of the zip:
https://github.com/Changaco/python-libarchive-c/blob/master/libarchive/entry.py#L48-L56
This is apparently not currently supported by ijson? At least I was getting very strange errors (internal C errors with the default C backend, "too many values to unpack" with the Python backend using .items()), which I could eventually narrow down to the generator when using .basic_parse(). Would it make sense to support generators as a source as well, or is that somehow fundamentally incompatible?
(Meanwhile I've switched to using the other Python libarchive binding, which does offer a file-like interface for reading from the archive.)