Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in HlsDownloader leading to ConcurrentModificationException & NPE #101

Open
Ilya-Gh opened this issue May 24, 2021 · 6 comments
Assignees

Comments

@Ilya-Gh
Copy link

Ilya-Gh commented May 24, 2021

The Race condition in HlsDownloader leading to ConcurrentModificationException & NPE ocurred after update from v2.6.4 to v2.6.9.

Currently, the HlsDownloader#createDownloadTasks function can be called from 2 places which are run in 2 separate Threads.

  1. On track selection from TrackSelectorImp.apply, which uses AsyncTasks sDefaultExecutor.
  2. Download services after metadata completes, which uses a ThreadPoolExecutor to invoke HlsDownloader#createDownloadTasks further down the stack via AbrDownloader#apply call.

The race condition appeared after metadata fetching was moved to a dedicated ThreadPoolExecutor from AsyncTasks sDefaultExecutor which was reused reusing the same Thread for both calls.

Steps to reproduce

  1. Start download.
  2. Change track selection at the time when metadata loading finishes.

Expected Results

The download should start.

Actual Results

The following exceptions are thrown from dtg:

Fatal Exception: java.util.ConcurrentModificationException
       at java.util.ArrayList$Itr.next(ArrayList.java:860)
       at com.kaltura.dtg.hls.HlsDownloader.createDownloadTasks(HlsDownloader.java:88)
       at com.kaltura.dtg.AbrDownloader.applyInitialTrackSelection(AbrDownloader.java:122)
       at com.kaltura.dtg.hls.HlsDownloader.applyInitialTrackSelection(HlsDownloader.java:297)
       at com.kaltura.dtg.AbrDownloader.apply(AbrDownloader.java:235)
       at com.kaltura.dtg.TrackSelectorImp.lambda$apply$0(TrackSelectorImp.java:39)
       at com.kaltura.dtg.TrackSelectorImp.lambda$apply$0$TrackSelectorImp(TrackSelectorImp.java)
       at com.kaltura.dtg.-$$Lambda$TrackSelectorImp$nlSaArBap0OhKxA2tNBRIKRHnbY.run(-.java:4)
       at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:289)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
       at java.lang.Thread.run(Thread.java:919)
Fatal Exception: java.lang.NullPointerException: Attempt to read from field 'int i.n.b.b1.a$b.a' on a null object reference
       at com.kaltura.dtg.hls.HlsDownloader.createDownloadTasks(HlsDownloader.java:90)
       at com.kaltura.dtg.AbrDownloader.applyInitialTrackSelection(AbrDownloader.java:122)
       at com.kaltura.dtg.hls.HlsDownloader.applyInitialTrackSelection(HlsDownloader.java:297)
       at com.kaltura.dtg.AbrDownloader.apply(AbrDownloader.java:235)
       at com.kaltura.dtg.TrackSelectorImp.lambda$apply$0(TrackSelectorImp.java:39)
       at com.kaltura.dtg.TrackSelectorImp.lambda$apply$0$TrackSelectorImp(TrackSelectorImp.java)
       at com.kaltura.dtg.-$$Lambda$TrackSelectorImp$nlSaArBap0OhKxA2tNBRIKRHnbY.run(-.java:4)
       at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:289)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
       at java.lang.Thread.run(Thread.java:919)
@kaltura-hooks
Copy link

Hi @Ilya-Gh,

Thank for you reporting an issue and helping improve Kaltura!

To get the fastest response time, and help the maintainers review and test your reported issues or suggestions, please ensure that your issue includes the following (please comment with more info if you have not included all this info in your original issue):

  • Is the issue you're experiencing consistent and across platforms? or does it only happens on certain conditions?
    please provide as much details as possible.
  • Which Kaltura deployment you're using: Kaltura SaaS, or self-hosted?
    If self hosted, are you using the RPM or deb install?
  • Packages installed.
    When using RPM, paste the output for:
	# rpm -qa \"kaltura*\"
For deb based systems:
	# dpkg -l \"kaltura-*\"
  • If running a self hosted ENV - provide the MySQL server version used
  • If running a self hosted ENV - is this a single all in 1 server or a cluster?
  • If running a self hosted ENV, while making the problematic request, run:
	# tail -f /opt/kaltura/log/*.log /opt/kaltura/log/batch/*.log | grep -A 1 -B 1 --color \"ERR:\|PHP\|trace\|CRIT\|\[error\]\"

and paste the output.

  • When relevant, provide any screenshots or screen recordings showing the issue you're experiencing.

For general troubleshooting see:
https://github.com/kaltura/platform-install-packages/blob/Jupiter-10.13.0/doc/kaltura-packages-faq.md#troubleshooting-help

If you only have a general question rather than a bug report, please close this issue and post at:
http://forum.kaltura.org

Thank you in advance,

@GouravSna GouravSna self-assigned this May 24, 2021
@GouravSna
Copy link
Contributor

GouravSna commented May 25, 2021

Can you please share a sample of your scenario ? Better if you share a fork branch.

@Ilya-Gh
Copy link
Author

Ilya-Gh commented May 25, 2021

@GouravSna The closer I could get. Ilya-Gh@0756be0

Steps to reproduce:

  • Start multitrack HLS download
  • If it does not crash then remove the download and repeat the cycle.

It crashes 20-30% of times for me with the following exception. I guess it can be ConcurrentModificationException sometimes as well as we see in our app (track selection might needed to be adjusted for that)

    Process: com.kaltura.dtg.demo, PID: 28097
    java.lang.NullPointerException: Attempt to read from field 'int com.kaltura.dtg.hls.HlsAsset$Chunk.lineNum' on a null object reference
        at com.kaltura.dtg.hls.HlsDownloader.createDownloadTasks(HlsDownloader.java:90)
        at com.kaltura.dtg.AbrDownloader.applyInitialTrackSelection(AbrDownloader.java:122)
        at com.kaltura.dtg.hls.HlsDownloader.applyInitialTrackSelection(HlsDownloader.java:297)
        at com.kaltura.dtg.AbrDownloader.apply(AbrDownloader.java:235)
        at com.kaltura.dtg.TrackSelectorImp.lambda$apply$0$TrackSelectorImp(TrackSelectorImp.java:39)
        at com.kaltura.dtg.-$$Lambda$TrackSelectorImp$nc8jV5_c5sx4Pk5MMu1__H32Fp8.run(Unknown Source:4)
        at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:289)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
        at java.lang.Thread.run(Thread.java:919)

@GouravSna
Copy link
Contributor

GouravSna commented Jun 29, 2021

Hi @Ilya-Gh ,

As per our architecture , once you call createItem on ContentManager then item state is NEW. Then you call loadMetaData and during this we do fire onTracksAvailable() which helps to do setSelectedTracks (manually or dynamically).

onTracksAvailable() is not meant to be called startDownload because metadata is not ready yet. We do fire onDownloadMetadata which tells the app that now they can call start the download.

loadMetaData was moved to another thread to give the ability to cancel it and it handles db operations etc. Please check our new download sequence for the reference.

Please move your startDownload code to onDownloadMetadata.

@Ilya-Gh
Copy link
Author

Ilya-Gh commented Sep 21, 2022

Hey @GouravSna. Thanks for taking a look at the issue, but it's reproducible on our end.

I have double-checked our project. We already call startDownload from onDownloadMetadata callback, as you suggested.

I have remade the sample project by removing startDownload call as it's redundant for this scenario.
Ilya-Gh@e21576d

I have added the code that restarts the download when it finish so it will be easier for you to notice the crash (the reproducibility rate is about 5% for me).

The line that makes the app crash is Ilya-Gh@e21576d#diff-d59630304bf15fb3aa1e3709f4ca60a982485601c69b7eff2d94c94e37cad140R501

The stack trace with the crash:

FATAL EXCEPTION: AsyncTask #2
Process: com.kaltura.dtg.demo, PID: 6983
java.util.ConcurrentModificationException
	at java.util.ArrayList$Itr.next(ArrayList.java:860)
	at com.kaltura.dtg.hls.HlsDownloader.createDownloadTasks(HlsDownloader.java:98)
	at com.kaltura.dtg.AbrDownloader.applyInitialTrackSelection(AbrDownloader.java:122)
	at com.kaltura.dtg.hls.HlsDownloader.applyInitialTrackSelection(HlsDownloader.java:314)
	at com.kaltura.dtg.AbrDownloader.apply(AbrDownloader.java:235)
	at com.kaltura.dtg.TrackSelectorImp.lambda$apply$0$com-kaltura-dtg-TrackSelectorImp(TrackSelectorImp.java:39)
	at com.kaltura.dtg.TrackSelectorImp$$ExternalSyntheticLambda0.run(Unknown Source:4)
	at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:305)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
    at java.lang.Thread.run(Thread.java:920)

Could you please check the first part of the issue, where I described the potential concurrency issue in the code?

Currently, the HlsDownloader#createDownloadTasks function can be called from 2 places which are run in 2 separate Threads.

  1. On track selection from TrackSelectorImp.apply, which uses AsyncTasks sDefaultExecutor.
  2. Download services after metadata completes, which uses a ThreadPoolExecutor to invoke HlsDownloader#createDownloadTasks further down the stack via AbrDownloader#apply call.

The race condition appeared after metadata fetching was moved to a dedicated ThreadPoolExecutor from AsyncTasks sDefaultExecutor which was reused reusing the same Thread for both calls.

@Ilya-Gh
Copy link
Author

Ilya-Gh commented Sep 21, 2022

@GouravSna As this issue is closed. Lets continue the discussion in #115

@GouravSna GouravSna reopened this Sep 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants