
Data Race in Time Remap around mlt_frame_get_image #729

Closed
fabianschuiki opened this issue Aug 22, 2021 · 1 comment · Fixed by #730

@fabianschuiki
Contributor

fabianschuiki commented Aug 22, 2021

I ran into a problem where output frames would be corrupted when using time remapping on some 4K footage. Specifically, the source was an MP4 file, 3840x2160 at 29.8471 Hz, and the rendered output was 30 Hz. The *.mlt I was running came out of a kdenlive project. What I observed was the following:

  • Nearest mode: Single frames would be corrupted (mostly greenish, arbitrary patterns, looked like something partially decoded)
  • Blend mode: Looked like corrupted frames would be interpolated/blended with non-corrupted frames

This made me think that the corruption probably happens in the data flowing into the time remapping, rather than anywhere after the remapping. Also, with nearest mode the problem seemed to mainly appear at the boundary from a sped-up section to a 100% section, and never within the sped-up section. With blend mode the corruption would appear in many more places.

Digging through the code and running a debug build of melt (and especially libmltcore) with my own tweaks/hacks, I realized the following: holding a pthread_mutex in link_timeremap.c from just before link_get_image_blend/link_get_image_nearest call mlt_frame_get_image until after the image data has been copied/processed completely gets rid of the problem. This suggests that there might be a data race on the calls to mlt_frame_get_image, where the returned image buffer might be corrupted by a second call to the function in a concurrent thread.
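For reference, here is a minimal sketch of the kind of workaround described above. It is not the actual link_timeremap.c code; the function name, mutex name, and include path are placeholders for illustration.

```c
#include <stdint.h>
#include <pthread.h>
#include <framework/mlt.h>

/* Illustrative only: a file-scope mutex that serializes the image fetch (and
 * the copy/processing that follows it) so a concurrent mlt_frame_get_image
 * call cannot clobber the returned buffer. */
static pthread_mutex_t image_fetch_mutex = PTHREAD_MUTEX_INITIALIZER;

static int fetch_image_serialized(mlt_frame frame, uint8_t **image,
                                  mlt_image_format *format,
                                  int *width, int *height, int writable)
{
    pthread_mutex_lock(&image_fetch_mutex);
    int error = mlt_frame_get_image(frame, image, format, width, height, writable);
    /* Any copy/blend of *image would also happen here, still under the lock,
     * before releasing it. */
    pthread_mutex_unlock(&image_fetch_mutex);
    return error;
}
```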

I'm not familiar enough with the code base and how it uses threading to know where to look in order to fix this more properly than with my ugly static mutex in the critical parts of the code. Are there any considerations related to this in the design of MLT, or does the time remapping hit a new corner case because it requires more random access to input frames? Would this require adding the ability to lock/unlock the returned image data somehow?

@fabianschuiki
Contributor Author

Pushed my hack to draft PR #730 for reference.

fabianschuiki added a commit to fabianschuiki/mlt that referenced this issue Aug 22, 2021
Fix an issue in `link_timeremap.c`, where multiple concurrent threads
calling `mlt_frame_get_image` would cause the returned image data to be
corrupted, or to be consumed incorrectly. As a result, the produced
frames would be seemingly arbitrarily corrupted, more so in "blend" mode
than in "nearest" mode, due to the increased number of frames fetched.

Add a `mlt_service_lock` and `mlt_service_unlock` call around the
critical function calls to ensure the returned data is intact.

Fixes mltframework#729.
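A minimal sketch of the pattern this commit describes, using the link's own service mutex rather than an ad-hoc static pthread_mutex. The helper name and parameters are illustrative assumptions; the actual call sites in link_timeremap.c may differ.

```c
#include <stdint.h>
#include <framework/mlt.h>

/* Sketch only: hold the service's lock across the image fetch (and any copy
 * that follows) so concurrent threads cannot corrupt the buffer returned by
 * mlt_frame_get_image. Not the actual link_timeremap.c code. */
static int get_image_under_service_lock(mlt_service service, mlt_frame src_frame,
                                        uint8_t **image, mlt_image_format *format,
                                        int *width, int *height, int writable)
{
    mlt_service_lock(service);
    int error = mlt_frame_get_image(src_frame, image, format, width, height, writable);
    /* Copy or blend *image here while the lock is still held. */
    mlt_service_unlock(service);
    return error;
}
```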
bmatherly pushed a commit that referenced this issue Aug 22, 2021
* Fix data race in time remapping


Fixes #729.
@bmatherly added this to the v7.2.0 milestone Aug 22, 2021