Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

[jpegio] Use chunked reading if CFile::Open doesn't set CFile::GetFileSize #1885

Merged
merged 1 commit into from Jan 8, 2013

Conversation

Projects
None yet
2 participants
Member

jmarshallnz commented Dec 4, 2012

This solves the problem mentioned here:

http://forum.xbmc.org/showthread.php?tid=145656

in that some webservers don't set the filesize, or the file isn't seekable (with unknown filesize).

This fixes it by reading chunked.

Alternatives might be to have a general routine available to do this stuff such as CFile::ReadFile(std::vector<uint8_t> &foo) or some such.

Could also be done a little clearer perhaps by separating the chunked and non-chunked reads (the non-chunked read is a single Read() call like it was before).

@elupus, @theuni to review.

Member

elupus commented Dec 4, 2012

Your resize code is rather slow, you should resize to something larger than
needed. Ie in multiples of previous size or similar. Now you keep
reallocating each loop.

Also READ_TRUNCATED should be set to avoid pointless internal loops in
cfile.

Also, I would very very much like to see us checking for jpeg header when
we have enough and aborting early then to avoid downloading whole files
that are not jpeg.

Member

jmarshallnz commented Dec 4, 2012

Considering we start with either the whole file (most cases) or at least 65k, it's not too bad really considering the size of most jpegs we load (around 200k or so). I don't mind having a doubling type routine but really the time spent in the code here isn't constrained by the slowness of resizing of the buffer ;)

Agreed regarding READ_TRUNCATED and checking the jpeg header, but the latter is a separate issue (also, this code really only gets run on images named "foo.jpg", "foo.tbn" or with mimetype set to "image/jpeg" which somewhat reduces the scope for non-jpeg files being parsed).

@jmarshallnz jmarshallnz added a commit that referenced this pull request Jan 8, 2013

@jmarshallnz jmarshallnz Merge pull request #1885 from jmarshallnz/chunky_jpegs
[jpegio] Use chunked reading if CFile::Open doesn't set CFile::GetFileSize
c04aa30

@jmarshallnz jmarshallnz merged commit c04aa30 into xbmc:master Jan 8, 2013

Member

jmarshallnz commented Jan 8, 2013

@elupus: pulled into master with changes as suggested by you. Please sign-off for Frodo inclusion.

Member

elupus commented Jan 8, 2013

The stop on read == total is pointless and possibly harmfull. Also it will
now allocate a huge buffer (2meg) if source file system return 1 byte at a
time, which http could very much do, since you grow on each read instead of
full buffer.

Member

jmarshallnz commented Jan 8, 2013

The stop on read == total is required to ensure the buffer doesn't grow unnecessarily in the case where filesize is known (as amount == 0 at that point).

We also only increase the buffer size if amount == 0, which happens only once we've read in enough to get to the end of the current buffer.

@jmarshallnz jmarshallnz deleted the jmarshallnz:chunky_jpegs branch Jan 8, 2013

Member

elupus commented Jan 9, 2013

So if file has grown since we started, we truncate. GetLength should only
be seen as a hint.

And if file size is unknown but actually about 100bytes, then read in steps
of 1 byte at the time (truncate allow that) we will double the allocated
buffer 100 times.

Member

jmarshallnz commented Jan 9, 2013

  1. Yes. I dunno if this is actually a real issue or not. If it is, it would probably be best to initially make chunksize slightly larger (one byte is enough) than the actual filesize, so that the second Read() operation can be done in the case where GetLength() is accurate (the second Read() returning 0 and breaking the loop). If chunksize == file length, then there's always a realloc() otherwise.
  2. Nope. We resize only when we've completely filled the existing buffer (amount == 0), which happens only once we've read chunksize bytes. We'd allocate 65k in that case.
Member

jmarshallnz commented Jan 9, 2013

The code is be a bit clearer after 9e7d52f and also takes care of the expanding file case (0 < GetLength() < real file size)

Member

elupus commented Jan 9, 2013

Looks fine. Seems, funny that it needed that big comment. Sorry about that
:-)

Member

jmarshallnz commented Jan 9, 2013

Heh - it is kinda unobvious exactly how all the cases drop through, and is nicer without that check for total == filesize. I figured this code will end up as some CFileUtils::ReadFile() or some such (or even CFile I guess?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment