Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(#431) FileCacheV2 + Concurrent chunked file downloading #449

Merged
merged 90 commits into from
Jan 27, 2020

Conversation

K1rakishou
Copy link

@K1rakishou K1rakishou commented Dec 8, 2019

Closes #431, #150, #466, #420, #56 and maybe some other issues I couldn't find. Also reverts back webm streaming.
Ready to be merged.

FileCache was completely rewritten (as well as CacheHandler), FileCacheDownloader was merged into FileCache. FileCache now uses RxJava for concurrency and in the future it will be completely reactive (when MultiImageView is rewritten as well).
Some additional things that were also changed in this PR:

  • FileCache now executes asynchronous requests (instead of synchronous) which can be canceled while they are in-flight (which was impossible in the previous implementation because they were blocking, the only way to do it was via Thread.currentThread().interrupt() but it was still pretty tricky to do).
  • FileCache now uses two queues with their own threadpools: one to download images when swiping in the gallery and the other one when downloading whole albums. So now you can download albums AND swipe images in the album at the same time. If a file is already being downloaded via gallery downloading and you swipe to it in the gallery, this file WON'T be downloaded twice. Instead you will subscribe to an already active request and once it completes that image image will be returned.
  • Cache size was increased to 512MB. Prefetch cache remains the same but probably it may be increased as well. The reasoning to increase the cache size is that 100MB is not enough for some boards (the cache trimming is called pretty often (and it is now slower than it was before), you just need to swipe through 20 5MB webms on /wsg/ to fill up the cache) and there are other chans where the file limit is up to 25MB (!!!). But maybe 512MB is way too high. Maybe it should be like 350MB or something.
  • CacheHandler now DOES NOT use LastModified flag to tell apart fresh cache files from the old because LastModified DOES NOT WORK on some android versions as well as some phones (it doesn't even work on the emulator for me). Instead it uses a second file (meta file) for every cache file where two parameters are stored: creation time and a "isDownloaded" flag.
  • Album image preloading now works when swiping in both directions (in the previous implementation it was only preloading when swiping "forward", now it also preloads "next" file when swiping "backwards").
  • We don't cancel previous download ("current - 1"/"current + 1") when swiping forward/backward, instead we cancel a file at "current position - 2" (or + 2 in case of backward swipes) because it would sometimes cancel an image right when the user swipes back right after swiping forward. Should be better now.
  • There are now three different OkHttpClient instances: One for posting (the one which supports proxies), one for image downloading and one for local threads downloading. All of them use their own thread pools. Also, we now don't use proxied OkHttpClient for image downloading, only for posting. Two of them are injected via "Named" annotation the third one (proxied) is not for now bu should be in the future. By doing so we will be able to remove the ProxiedHttpClient wrapper.

UPDATE:
This PR will also have a new experimental method for files/images downloading. Instead of downloading a file sequentially, there will be a new setting that will allow you to download a file in separate chunks concurrently. In theory this should drastically improve image/file download speed. Basically it will work like the old boomer software called Download Master. The setting will allow you to change the amount of chunks every file will be split into. Let's say you change it to 4. That means, that if you try to download a file that has a size of 4MB it will be split into 4 chunks 1MB each and they will be downloaded concurrently which in theory should boost the download time 4x times. So if it usually downloads in 4 seconds it should be downloaded in 1 second.

I haven't tested it yet so it's only theoretical because 4chan may limit concurrent connections to the same resource and if that's true then there is nothing I can do. But I will still leave this so that people can use it with other chans.

This PR also fixes a bug when downloading sometimes would get completely stuck and OkHttp won't accept any new requests immediatelly throwing SocketTimeoutException. This is a OkHttp bug - square/okhttp#3146 and to fix it I had to disable ConnectionPooling which is pretty bad since it will create a new thread for every new request. We should revert it back once it is fixed in OkHttp 4.3 which should be released soon!

UPDATE2:
Also, webm streaming is also reverted back in this PR. But it does not work good with 4chan because 4chan does not allow concurrent access to the same file/image. But it works when a file is cached by cloudflare and it works on other chans with good servers so I'll leave it.

UPDATE3:
Pretty much done. So this thing works really bad with 4chan images/webms/gifs that ARE NOT CACHED by the cloudflare. 4chan blocks any concurrent access to the same resource and limits download speed by A LOT (like it may take up to 15 fucking seconds for a 3MB picture while it takes like 50ms for the same picture once it gets cached by the cloudflare). The same goes to webm streaming. But it works pretty good with images that are cached by the cloudflare and on some other chans with good servers (like 2ch.hk). So I will leave both of these settings available.
The good news are for those people who live closely to the place where 4chan servers are located: they should have a really good download speed with this thing (well it should have already been great), as well as people living close to the cloudflare CDN servers (EU/US). Once anybody loads an image, and it gets cached by the cloudflare, it will be then served from the cache for anyone who lives nearby.

Now what's left are unit tests. I'm planning to write tons of unit tests to cover all this shit so we don't break it accidentally. Gonna take a little bit more of time.

UPDATE4:
Wew this thing got out of hands and now it's 6.6K LOCs.
Ok, it's done. I did a little bit of testing and here are the things I've checked manually:

- Gallery image viewing with regular and chunked downloads.....OK
- Test media prefetching.......................................OK
- Test that media prefetching canceling works..................OK
- Test gallery batch downloading...............................OK
- Test local threads downloading...............................OK
- Test and make sure that the OkHttp bug is gone...............OK
- Test and make sure that clipboard image link downloading.....OK
- Test apk update download.....................................OK

I might forgot something.
Debug apk: Kuroba-3c18460.zip

UPDATE5:
Now for vichan and futaba we also parse and use the md5 parameter for every file. I use it to check, after a file has been downloaded, that the file is not corrupted. We can also use it now to implement post filtering based on file hash.

New Debug apk: Kuroba-c3e0956.zip

UPDATE6:
New Debug apk: Kuroba-5e3cc95.zip

UPDATE7:
Now the app shouldn't hang, when downloading batches of images, until all images are downloaded.

New Debug apk:
Kuroba-c7dd250.zip

UPDATE8:
Alright, after a couple of weeks of using it, I haven't had any problems and I think it's good to go. One thing I should change though. It's the setting where you select the amount of chunks that will be downloaded concurrently. I think 6 and 8 chunks is overkill. It should be 1, 2 and 4. 6 and 8 do not really speed up the download speed and they may even lag the device. So yeah, 1, 2 and 4 should be enough.

…when swiping forward and when swiping back.

Previous downloading image canceling now happens not for a previous or next image but "previous - 1" or "next + 1".
When couldn't load an image the reason of that happening will always be shown.
Increase timeouts for the "downloader OkHttpClient" so that people with bad connections still had a chance to load an image.
Use a hack to avoid OkHttpClient randomly getting blocked for no apparent reason and not accepting any new connections (apparently this somehow tied to the ConnectionPool and HTTP/2).
Show in the logs how much time image downloading took.
Close network requests on a background thread.
… flag which does not work on some Android versions or on some phones.

Now every file in the cache directory has a meta file with two values in it: creation time and whether it's downloaded or not.
We use those two parameters to clean up the cache and to figure out whether a file should be downloaded or return from the disk right away.
…really rare cases when an attempt to load an image throws SocketException.

Load images from disk for downloading threads.
Recalculate total cache size when deleting a cache file from the disk.
Some fixes in comments.
# Conflicts:
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/core/cache/FileCache.java
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/core/cache/FileCacheDownloader.kt
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/core/di/ManagerModule.java
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/core/di/NetModule.java
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/core/manager/ThreadSaveManager.java
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/core/presenter/ThreadPresenter.java
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/core/site/common/vichan/VichanAntispam.java
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/core/site/http/HttpCallManager.java
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/ui/captcha/v2/CaptchaNoJsHtmlParser.java
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/ui/captcha/v2/CaptchaNoJsPresenterV2.java
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/ui/controller/settings/DeveloperSettingsController.java
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/ui/helper/ImagePickDelegate.java
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/ui/view/MultiImageView.java
@K1rakishou
Copy link
Author

K1rakishou commented Dec 8, 2019

Oh, I forgot to change the prefetcher to also use the batched image downloading queue, so you can view images while prefetcher downloads everything in a thread.

@K1rakishou
Copy link
Author

K1rakishou commented Dec 8, 2019

Should also add a setting to force the downloader OkHttpClient to only use HTTP 1.1 protocol. Right now it's the default so it uses HTTP 1.1 and 2.0 and the latter may be the cause of all the problems.
No, HTTP 2.0 has nothing to do with that problem.

@K1rakishou
Copy link
Author

K1rakishou commented Dec 9, 2019

We also should somehow avoid cancelation of images at positions -1, 0 and +1 (where 0 is the current image the user is viewing in the gallery) to avoid the opening canceled images.

…the current viewable windows (-1, 0, +1, i.e. if it is previous image, current or the next image in the gallery relative to the one user currently has opened)
…f copying them or returning the original list via the "Unsafe" method thus making it 100% thread-unsafe
Cancel all prefetch downloads when switching from one thread to another one.
Do not cancel prefetch downloads when swiping away from an image in the album viewer (to avoid unnecessary exceptions).
Fix a bug where an already downloading image would start downloading second time instead of just resubscribing to it.
@K1rakishou K1rakishou changed the title (#431) FileCacheV2 (#431) FileCacheV2 + Concurrent chunked file downloading Dec 20, 2019
- Fix a bug with 2ch.hk file downloading where a file size we get in the thread json is in KB which, after converting back to bytes, would often result in incorrect file size thus breaking the chunked downloading. Now we will always send the HEAD request to 2ch.hk
- Handle the "else" branch in the supportsPartialContent condition.
- Comments fixes.
- Now we do not enqueue all images of a download batch at once, we do it in small chunks (4 images per chunk). Otherwise the phone may get ANRed if there a lot of images.
- Show LoadingViewController when downloading batches of images.
…nt chunk downloader settings and make 2 chunks the default setting
@Adamantcheese
Copy link
Owner

I'll write some review notes, but as more of a general thing, do you think it would be a good idea to break out the concurrent chunked downloading to its own repository like you did for SAF?

Copy link
Owner

@Adamantcheese Adamantcheese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's probably more that I missed, but I guess it's a start?

@K1rakishou
Copy link
Author

K1rakishou commented Jan 21, 2020

@Adamantcheese

do you think it would be a good idea to break out the concurrent chunked downloading to its own repository like you did for SAF?

Nah, I only did it with FSAF because I thought it would be a good idea to make it into a separate library because SAF is a pain in the ass and some other people might use it. This one, on the other hand, is very specific for our app. You won't see stuff like this in any other regular app.

But, I could extract it into a separate module since everything is already internal to it's package.

…out swiping it (for swipes nothing changes)

Remove synchronization for nonCancelableImages since it's not needed.
…ames won't blow up everything. Periods in dir names are still not allowed!
…le hash

- 2ch.hk sends file size in KB so we can't use it to avoid sending HEAD requests, so we have to always send them.
- Wired-7.org sometimes sends incorrect md5 hash so we have to skip the hash check.
# Conflicts:
#	Kuroba/app/build.gradle
#	Kuroba/app/src/main/java/com/github/adamantcheese/chan/core/di/AppModule.java
@Adamantcheese Adamantcheese merged commit e7efcf8 into multi-feature Jan 27, 2020
@K1rakishou K1rakishou deleted the (#431)-file-cache-fixes branch January 27, 2020 08:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] When quickly swiping through images in the gallery previous downloading files do not get canceled
3 participants