S3 FileField returns zero bytes when read #4098

flooie · 2024-06-05T21:37:52Z

Summary

When attempting to read a file stored on AWS S3 using Django’s FieldFile object, the file content is returned as an empty binary string (b''), despite the file existing and having a non-zero size.

Steps to Reproduce

Example Recap Document id: 102021417

In [26]: r.filepath_local
Out[26]: <FieldFile: recap/gov.uscourts.cacd.683100/gov.uscourts.cacd.683100.178.0.pdf>

In [27]: r.filepath_local.read()
Out[27]: b''

In [28]: r.filepath_local.size
Out[28]: 100549

Expected Results

The content of the file should be read and returned as a non-empty binary string.

Actual Results

The FieldFile.read() method returns an empty binary string (b'').

This is causing the extract-RECAP-into-opinions job to crash. The failure is somewhere in the microservice, presumably because it is being sent zero bytes.
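To keep an empty read from reaching the microservice at all, a guard along these lines could fail early on the caller's side. This is only a sketch: `read_or_raise` and `EmptyReadError` are hypothetical names, not part of the codebase, and `io.BytesIO` stands in for `r.filepath_local`.

```python
import io


class EmptyReadError(Exception):
    """Raised when a file reports a non-zero size but reads back no bytes."""


def read_or_raise(file_obj, expected_size):
    """Read a binary file-like object and refuse to return an empty payload.

    file_obj: any object with a read() method (hypothetical stand-in for
        the FieldFile in this issue).
    expected_size: the size the storage backend reports, e.g. the .size value.
    """
    data = file_obj.read()
    if expected_size > 0 and len(data) == 0:
        raise EmptyReadError(f"expected {expected_size} bytes, read 0")
    return data


# Usage with an in-memory stand-in for the stored PDF.
payload = read_or_raise(io.BytesIO(b"%PDF-1.6"), expected_size=8)
```

Raising here would turn the crash in the microservice into an explicit error in the caller, which should be easier to retry or log.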

On the good-news front, the retry works as expected for timeouts and legitimate failures of DOCTOR, although I want to bump it from 3 retries to at least 5.

@mlissner, if you have any insight into how this could be occurring, I would appreciate it.


mlissner · 2024-06-05T21:59:47Z

Is this some of the time or all of the time?

quevon24 · 2024-06-06T00:34:10Z

I think it is a different document; using that id gives me another document: https://www.courtlistener.com/api/rest/v3/recap-documents/102021417/ (gov.uscourts.cacd.572997.50.0.pdf)

I cloned everything from that recap document id (docket entry, docket, and recap document) and uploaded the file to the dev bucket using the admin: https://dev-com-courtlistener-storage.s3.amazonaws.com/recap/dev.gov.uscourts.cacd.572997/gov.uscourts.cacd.572997.50.0.pdf

I was able to read the file correctly:

filepath_local.read()
b'%PDF-1.6\r%\xe2\xe3\xcf\xd3\r\n83 0 obj\r<</Filter/FlateDecode/First 5/Length 177/N 1/Type/ObjStm>>stream\r\nh\xded\xcc\xb1\n\xc20\x14@\xd1_y\x9b\xc9\xd0\xf65!m*\xa5P\x0cn\x82\x88\xd8E\x90\xdaD\x0c\x94>H"\xfa\xf9:\x88\x8b\xfb\xb9Wj@h\xdb\xa2\x7f\xa4;\x05\x16\xa3\xd4\xb5\x92\xbc\xd8\x047&O\x8b\x19\x93cf-\xb0lPc\x8d\x8d.................

filepath_local.read() 207714

Maybe it is a permissions problem that prevents the file from being read? Does it read other files correctly?
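Another thing worth ruling out, offered as a guess rather than a diagnosis: a file-like object that has already been read to the end returns `b''` on the next `read()` while its size stays non-zero. Whether that applies to the S3-backed `FieldFile` here is an assumption; the sketch below only shows the general behaviour with `io.BytesIO`, and `read_from_start` is a hypothetical helper that assumes the underlying file object supports `seek()`.

```python
import io


def read_from_start(file_obj):
    """Rewind a binary file-like object to offset 0 and read everything."""
    file_obj.seek(0)
    return file_obj.read()


# In-memory stand-in for the stored PDF.
stream = io.BytesIO(b"%PDF-1.6")

first = stream.read()    # reads all bytes and leaves the pointer at the end
second = stream.read()   # pointer is already at the end, so nothing is left
rewound = read_from_start(stream)  # rewinding makes the bytes readable again
```

If calling `seek(0)` before `read()` on the affected document returns the bytes, the problem is an earlier read in the same session rather than permissions.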

mlissner changed the title ~~Extract Recap to Opinions Bug~~ S3 FileField returns zero bytes when read Jun 5, 2024
