Reading with bz2.BZ2File() returns one garbage character #44233
Comments
When comparing two files which should be equal, the last line differs. The first file is a bzip2-compressed file and is read with bz2.BZ2File(). The first file, named file.txt.bz2, is uncompressed with:

    $ bunzip2 -k file.txt.bz2

To compare I use this script:

    import bz2

    f1 = bz2.BZ2File(r'file.txt.bz2', 'r')
    f2 = open(r'file.txt', 'r')
    lines = 0
    while True:
        line1 = f1.readline()
        line2 = f2.readline()
        if line1 == '':
            break
        lines += 1
        if line1 != line2:
            print 'line number:', lines
            print repr(line1)
            print repr(line2)
    f1.close()
    f2.close()
Output:

    $ python bzp.py
    line number: 588317
    '\x07'
    ''

The offending attached file is 5.5 MB.

Sorry, I could not reproduce this problem. Tested on Fedora Core 5 with Python 2.4.3.
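For anyone trying to repeat the comparison on a current interpreter, here is a minimal Python 3 sketch of the same line-by-line check. It is not the reporter's script: `first_mismatch` and `demo` are hypothetical names, the sample payload is made up to stand in for the 5.5 MB file, and both files are read as bytes so that no newline translation gets in the way.

```python
import bz2
import os
import tempfile


def first_mismatch(bz2_path, plain_path):
    """Return (line_number, bz2_line, plain_line) for the first differing
    line, or None if both files yield identical lines."""
    with bz2.open(bz2_path, "rb") as f1, open(plain_path, "rb") as f2:
        line_number = 0
        while True:
            line1 = f1.readline()
            line2 = f2.readline()
            if not line1 and not line2:
                return None  # both files ended at the same point
            line_number += 1
            if line1 != line2:
                return line_number, line1, line2


def demo():
    # Made-up sample data; the file from the report is not reproduced here.
    payload = b"first line\nsecond line\nlast line without newline"
    with tempfile.TemporaryDirectory() as tmp:
        plain_path = os.path.join(tmp, "file.txt")
        bz2_path = plain_path + ".bz2"
        with open(plain_path, "wb") as f:
            f.write(payload)
        with open(bz2_path, "wb") as f:
            f.write(bz2.compress(payload))
        return first_mismatch(bz2_path, plain_path)


if __name__ == "__main__":
    print(demo())
```

Unlike the original loop, this sketch stops only when both files are exhausted, so an extra trailing line in either file is reported rather than silently skipped.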
I can't upload the bz2 sample file. So it is here:

With your file, I can reproduce that on Linux, Python 2.5. Which compressor did you compress your file with?

I received this file already compressed. I don't know which compressor was used.

    $ bzip2 -t file.txt.bz2
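As a rough Python-side companion to an integrity test on the command line, the sketch below decompresses a single bzip2 stream and reports whether the end-of-stream marker was reached and whether any bytes follow it. It is only an illustration in Python 3: `inspect_bz2` is a hypothetical helper, it is not claimed to be equivalent to `bzip2 -t`, and nothing here asserts that trailing bytes were the cause of the behaviour in this report.

```python
import bz2


def inspect_bz2(data):
    """Decompress one bzip2 stream and describe what follows it.

    Returns (decompressed_bytes, reached_end_of_stream, trailing_bytes).
    Raises OSError if the data is not valid bzip2.
    """
    decompressor = bz2.BZ2Decompressor()
    output = decompressor.decompress(data)
    # unused_data is b"" until the end-of-stream marker has been seen.
    return output, decompressor.eof, decompressor.unused_data


if __name__ == "__main__":
    stream = bz2.compress(b"hello\n")
    print(inspect_bz2(stream + b"\x07"))
```

A result with the end-of-stream flag set and non-empty trailing bytes means the file contains data after the compressed stream; a cleared flag means the stream was cut short.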
There are some bugs in the bz2 module. The problem boils down to the

    do { [...]

This could be fixed by putting an "if (bzerror == BZ_OK) break;" after [...]

However, I also noticed that in the universal newline section of the [...]

I changed the code around so that the read loop is unified between [...]

Please let me know if this looks good to commit.
Found some problems in the previous version; this one passes the tests [...]

This code passes the test and also correctly handles the bz2 file that [...]

I have committed this into trunk and the 2.5 maintenance branch. It [...]