It is not possible to parse a file with newline (\n etc.) #38

Closed
k-zaytsev opened this issue Nov 9, 2020 · 3 comments
Labels
invalid This doesn't seem right

Comments

@k-zaytsev

Please help. What am I doing wrong?

import asyncio

import ijson
# ijson==3.1.2.post0

from aiofile import AIOFile

data = """
[
 "a"
]
"""


async def main():
    with open("test.json", "w") as f1:
        f1.write(data)

    with open("test.json", "r") as f2:
        for obj in ijson.items(f2, prefix="item", use_float=True):
            print(obj)

    # ijson.common.IncompleteJSONError: parse error: trailing garbage
    #                                        [  "a" ]
    #                      (right here) ------^
    async with AIOFile("test.json",  "r") as fp:
        async for obj in ijson.items_async(fp, prefix="item", use_float=True):
            print(obj)

asyncio.run(main())
@rtobar

rtobar commented Nov 9, 2020

@k-zaytsev this was a puzzling one, but I think the culprit is aiofile (version 3.1.1 locally), and the problem has nothing to do with newlines:

import asyncio

from aiofile import AIOFile

data = b'hi'

async def main():
    with open("test.data", "wb") as fp:
        fp.write(data)
    async with AIOFile("test.data", "rb") as fp:
        while True:
            buf = await fp.read(2)
            if not buf:
                break
            print(buf)

asyncio.run(main())

When I run that script locally, it never finishes. No matter how many bytes you ask fp to read (I'm reading 2), it always reads them from the start of the file without updating its current position. This is certainly not what one usually expects from a file-like object, and it goes against ijson's expectations, since ijson stops parsing only when a read returns no more data. In fact, in the original code you can see two "a"s being printed before the failure: one from the synchronous case, and another from parsing the first read from AIOFile. It's the second read from AIOFile, which should return an empty string but returns the file contents again, that generates the error.
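To make the failure mode concrete without depending on aiofile itself, here is a minimal stdlib-only sketch. `StatelessFile` is a hypothetical stand-in that I am assuming mimics the behaviour described above (a `read` with an `offset` that defaults to 0 and no internal position); it is not aiofile's actual implementation.

```python
import asyncio


class StatelessFile:
    """Hypothetical stand-in (not aiofile's real class): read() takes an
    explicit offset that defaults to 0 and keeps no internal position."""

    def __init__(self, payload: bytes):
        self._payload = payload

    async def read(self, size: int = -1, offset: int = 0) -> bytes:
        # Slice relative to the given offset; nothing is remembered between calls.
        end = len(self._payload) if size < 0 else offset + size
        return self._payload[offset:end]


async def demo():
    fp = StatelessFile(b'[ "a" ]')
    first = await fp.read(64)
    second = await fp.read(64)  # same bytes again, never an empty result
    return first, second


first, second = asyncio.run(demo())
print(first == second)
print(first + second)
```

A consumer that keeps reading until it gets an empty result effectively sees the concatenation of those reads, so a second copy of the document arrives right after the first complete one. That is consistent with the "trailing garbage" error in the original report.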

Please confirm that you can reproduce what I see with that sample script. If you agree with the diagnosis, please close this issue. I'd also suggest that you either open an issue against aiofile or switch to aiofiles, which seems to be the more popular choice.

@k-zaytsev
Author

Thanks! This helped me a lot to understand the problem!

@rtobar added the invalid ("This doesn't seem right") label Nov 11, 2020
@mosquito

This is a misuse of aiofile, because the AIOFile object expects an offset argument (which is zero by default 😀). For linear file reading you have to use the Reader helper.

The reason is that aiofile is a library for performing real asynchronous file operations. I don't know of any asynchronous API with a built-in file position pointer, so these abstractions are provided to the user as additional utilities.

Please double-check the aiofile README; it contains a solution.
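As an illustration of the idea only (this is not the aiofile Reader helper, whose actual API should be taken from the README), the sketch below wraps an offset-based file in a small adapter that tracks the position itself. Both `StatelessFile` and `SequentialReader` are hypothetical names, written with the standard library so the example runs on its own.

```python
import asyncio


class StatelessFile:
    """Hypothetical offset-based file: read() never advances a position."""

    def __init__(self, payload: bytes):
        self._payload = payload

    async def read(self, size: int = -1, offset: int = 0) -> bytes:
        end = len(self._payload) if size < 0 else offset + size
        return self._payload[offset:end]


class SequentialReader:
    """Hypothetical adapter that remembers the offset between reads."""

    def __init__(self, fp):
        self._fp = fp
        self._offset = 0

    async def read(self, size: int) -> bytes:
        chunk = await self._fp.read(size, offset=self._offset)
        self._offset += len(chunk)  # advance so the next read continues
        return chunk


async def read_all(chunk_size: int = 2):
    reader = SequentialReader(StatelessFile(b"hi there"))
    chunks = []
    while True:
        buf = await reader.read(chunk_size)
        if not buf:  # an empty result now signals end of file
            break
        chunks.append(buf)
    return chunks


chunks = asyncio.run(read_all())
print(chunks)
```

With the position tracked by the adapter, the read loop from the earlier comment terminates, and an object of this shape (an awaitable `read(size)` that eventually returns an empty result) matches what a streaming consumer expects.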
