Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DiskBasedCache fix with unit tests #52

Merged
merged 14 commits into from
Jul 28, 2017
Merged

DiskBasedCache fix with unit tests #52

merged 14 commits into from
Jul 28, 2017

Conversation

joebowbeer
Copy link
Contributor

Fixes Issue #12 without changing cache format

Changes:

  1. DiskBasedCache fixed to avoid NegativeArraySizeException and OutOfMemoryError thrown from streamToBytes. Enhanced CountingInputStream nested class to maintain bytesRemaining (the file length initially), so that streamToBytes can validate its length parameter before attempting to allocate a new byte array.

  2. DiskBasedCacheTest unit tests extended to 100% method coverage and 95% code coverage.
    All tests passing: https://circleci.com/gh/joebowbeer/volley/24

@joebowbeer
Copy link
Contributor Author

See #13 for prior discussion.

@jpd236
Copy link
Collaborator

jpd236 commented Jul 11, 2017

Thanks for the rebase :)

I'll get to this review soon - I want to give it a very careful look. At a high level, though, I feel much better about this approach than what we were looking at before.

Copy link
Collaborator

@jpd236 jpd236 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall - I feel pretty good about this approach, and really appreciate your thorough unit tests to ensure that nothing regresses. My comments are mostly nits, and I think there's definitely a path forward to getting this merged. Nice work!

byte[] data = streamToBytes(cis, cis.bytesRemaining());
return entry.toCacheEntry(data);
} finally {
//noinspection ThrowFromFinallyBlock
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This confused me at first, so it would be nice to add a clarifying comment explaining why this is a safe warning to suppress, e,g.:

// Any IOException thrown here is handled by the below catch block by design.

Copy link
Contributor Author

@joebowbeer joebowbeer Jul 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

BTW, this pattern is Style 2 at javapractices.com - Finally and catch

CountingInputStream cis = new CountingInputStream(
new BufferedInputStream(createInputStream(file)), (int) file.length());
try {
CacheHeader found = CacheHeader.readHeader(cis);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could you name the variable "entryHeader" or something more clearly indicative of what this variable represents?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

new BufferedInputStream(createInputStream(file)), (int) file.length());
try {
CacheHeader found = CacheHeader.readHeader(cis);
if (!key.equals(found.key)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: while this is safe, I prefer "!TextUtils.equals(key, found.key)" as a best practice in general because it offers 100% guarantee that the code won't crash even if key is null, whereas otherwise we have to look more closely at the lines above to make sure that this is a guaranteed impossibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if (!key.equals(found.key)) {
// File was shared by two keys and now holds data for a different entry!
VolleyLog.d("%s: key=%s, found=%s", file.getAbsolutePath(), key, found.key);
removeEntry(key);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be remove() or removeEntry()? If the former, maybe it would be cleaner to just do:

throw new IOException(String.format("%s: key=%s but file key = %s", file.getAbsolutePath(), key, found.key));

and let the catch block take care of the cleanup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment to clarify.

entry.size = cis.bytesRemaining();
putEntry(entry.key, entry);
} finally {
//noinspection ThrowFromFinallyBlock
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here re: comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
} catch (IOException ignored) { }
CacheHeader entry = CacheHeader.readHeader(cis);
entry.size = cis.bytesRemaining();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the previous code considered the entry size equal to the file length (i.e. including the header), but the new code would only count the number of bytes remaining after reading the header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous code was inconsistent. The entry.size was set to file.length() in initialize(), but was set to the data.length in put(), via the two-arg CacheHeader constructor. The new code consistently uses the data length.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - good catch re: the inconsistency.

However, since the purpose of tracking the size is to limit the total size of the cache on disk, and since the cache header itself can contain a decent chunk of data (in the response headers), would it be feasible/advisable to consistently use the full size of the file instead?


@RunWith(RobolectricTestRunner.class)
@Config(manifest="src/main/AndroidManifest.xml", sdk=21)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Volley does (currently) need to be compatible with SDK 8 and earlier. I'm open to relaxing that to something like API 14, but it'd be a topic for a separate discussion, and I'm worried that if we test against SDK 21 here, we may mask potential issues like calls to APIs that didn't exist back in SDK 8.

Is this necessary, and if so, why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed to sdk=16, which is the minimum supported by Robolectric.

Note that Robolectric runs sdk=16 by default.

I added the sdk attribute here because I wanted to eliminate a Robolectric warning about not being able to find the manifest, and adding a manifest attribute seems to require an sdk attribute. (The other Robolectric tests still generate this warning.)

AFAIK none of the unit tests under the test path are testing sdk=8 compliance. For that, I recommend adding some tests under the androidTest path.

cache.put("kilobyte3", randomData(1024));

// Create DataInputStream that throws IOException
InputStream mockedInputStream = spy(InputStream.class);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you actually need a spy here, or does a regular mock work?

Copy link
Contributor Author

@joebowbeer joebowbeer Jul 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think spy is needed, or at the least it is an expedient way to inject only the failure that I want. I need the methods other than read() to function normally.

private Cache.Entry randomData(int length) {
Cache.Entry entry = new Cache.Entry();
byte[] data = new byte[length];
new Random().nextBytes(data);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random data in unit tests is typically an antipattern - it can lead to flakiness if one set of data causes the tests to fail but others work fine. And flakiness can lead to a lot of pain for things like continuous integration.

Assuming we don't expect the contents of the data to matter much, I'd just write fixed data here of some sort. If we did think the data contents were actually likely to impact code flows rather than just be read agnostically as bytes, it would potentially be reasonable to try a bunch of different combinations of data, but still from a fixed set. But since AFAIK we just read the raw bytes directly and all the interesting logic happens surrounding that, that feels like overkill here.

Copy link
Contributor Author

@joebowbeer joebowbeer Jul 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made the random seed explicit and added a comment.

rules.gradle Outdated
testCompile "org.mockito:mockito-core:1.9.5"
testCompile "junit:junit:4.12"
testCompile "org.hamcrest:hamcrest-library:1.3"
testCompile "org.mockito:mockito-core:2.2.16"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any objections to using 2.2.29 instead? It's the latest from the 2.2 branch (and is available in our internal prebuilt repositories, whereas 2.2.16 isn't, so it'll make our lives a bit easier when we pull in this patch).

Alternatively, I'd certainly be fine with leaving this at 1.9.5 and dealing with upgrading Mockito separately. Do you need specific new functionality from the upgraded deps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@jpd236 jpd236 requested a review from yilanl July 14, 2017 00:26
@jpd236
Copy link
Collaborator

jpd236 commented Jul 14, 2017

@yilanl - feel free to take a look as well to see if you have any further concerns.

@joebowbeer
Copy link
Contributor Author

Thanks for review! PTAL

@joebowbeer
Copy link
Contributor Author

joebowbeer commented Jul 15, 2017 via email

@joebowbeer
Copy link
Contributor Author

joebowbeer commented Jul 15, 2017 via email

@jpd236
Copy link
Collaborator

jpd236 commented Jul 17, 2017

I'd counter that the actual disk usage seems like what is more important to developers - it's what shows up in the platform settings and is attributed as usage to the app. It seems harder for a developer to understand/control what's going on if we just say "max usage not counting for some overhead that changes depending on the number of entries you have in the cache + how many headers the HTTP server happens to return in the response" when all they're really concerned about is saying "don't use more than this much space on disk".

Could you elaborate on what makes that approach difficult?

I'm also okay deferring handling of this issue to its own separate issue, as it's certainly unrelated to the overall goal of fixing the crashes here and adding initial tests. But I would only feel comfortable doing so if this wouldn't introduce a regression where the cache might suddenly grow larger on disk than it would have with older versions of Volley due to changes in accounting.

@joebowbeer
Copy link
Contributor Author

The put() implementation works exactly as it did before: pruneIfNeeded is called first in order to release enough space for the new entry to be written (data.length), and the new entry size is recorded as its data.length.

There is no file written before prune is called, so there is no easy way to prune based on the file length of the new entry.

Previously, initialize() worked differently from put(). During initialize() the file length was used to record the entry size. So the recorded size of an entry would change based on whether the entry was initialized from a previously written file, or put() in the current session. But initialize does not prune, so the total size of the cache on disk would not have changed: it would have been pruned according to the data size not the file size. After initialize(), however, a subsequent put() might cause some previously cached entries to be removed, because the size of the cache after initialize() will appear larger than it did when those entries were put().

If this is really the behavior you want to replicate, then it is easy enough. One liner.

However, trying to put() and prune() based on the size of an entry that has not yet been written would be more difficult to achieve, and I would view this as an incompatible change.

@joebowbeer
Copy link
Contributor Author

joebowbeer commented Jul 17, 2017

BTW, Travis build failure is not related to my change and appears to be due to a flaky test in RequestQueueIntegrationTest: add_requestProcessedInCorrectOrder

All tests pass here: https://travis-ci.org/joebowbeer/volley/builds/254582764

@jpd236
Copy link
Collaborator

jpd236 commented Jul 18, 2017

Forked #57 to resolve the inconsistency in how we track sizes. In general, I prefer to keep changes as small and scoped to the bug they're trying to fix as possible - it makes it easier to understand where behavior changed / revert broken changes as needed. This way we won't have to revert this CL or dig into it more deeply if someone reports a cache size growing unexpected large, for instance :)

Forked #56 for the flaky test. I retried the CI run until it passed (took two tries).

This change looks good to me. Since it's non-trivial, let me loop in the other Volley code reviewer and give them a chance to voice any concerns before I merge.

@jpd236 jpd236 merged commit aafd2e5 into google:master Jul 28, 2017
@jpd236
Copy link
Collaborator

jpd236 commented Jul 28, 2017

Thanks for your work and patience :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants