feature/support-processing-m3u8-video-playlist-uris #188

Merged
evamaxfield merged 8 commits into main from feature/support-processing-m3u8-video-playlist-uris on May 22, 2022

Conversation

evamaxfield
Member

Link to Relevant Issue

CouncilDataProject/cdp-scrapers#96 (comment)

Description of Changes

Include a description of the proposed changes.

Allows processing of m3u8 URIs.

During resource_copy, we download the m3u8 playlist and convert it to an mp4. Then, when determining the video host, we check whether the original session video URI ends with ".m3u8"; if it does, we host the mp4 that was already converted and saved locally.
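
The two steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `copy_video`, `pick_video_to_host`, and the injected `convert_m3u8` / `copy_plain` callables are hypothetical names, and the sketch assumes that non-m3u8 URIs are used unchanged. The plain `endswith` check mirrors the description, so a URI with a query string after ".m3u8" would not match.

```python
from pathlib import Path
from typing import Callable

# Hypothetical callable type: takes (source_uri, destination_path).
Fetcher = Callable[[str, Path], None]


def is_m3u8_uri(uri: str) -> bool:
    """True when the URI ends with ".m3u8" (the check described above)."""
    return uri.endswith(".m3u8")


def copy_video(
    uri: str, dst: Path, convert_m3u8: Fetcher, copy_plain: Fetcher
) -> Path:
    """Copy a video locally, converting m3u8 playlists to mp4 on the way."""
    if is_m3u8_uri(uri):
        mp4_dst = dst.with_suffix(".mp4")
        # Stand-in for the real download-and-convert step.
        convert_m3u8(uri, mp4_dst)
        return mp4_dst
    copy_plain(uri, dst)
    return dst


def pick_video_to_host(session_video_uri: str, local_copy: Path) -> str:
    """Choose which video gets hosted for a session."""
    if is_m3u8_uri(session_video_uri):
        # The playlist itself is not hosted; the converted local mp4 is.
        return str(local_copy)
    # Assumption: any other URI is used as it is.
    return session_video_uri
```

Passing the download and copy steps in as callables keeps the branching logic testable without network access; in the real pipeline the conversion step would be the m3u8 download call discussed in the review comments.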

@evamaxfield evamaxfield added the labels "enhancement" (New feature or request) and "event gather pipeline" (A feature or bugfix relating to event processing) May 22, 2022
@evamaxfield evamaxfield self-assigned this May 22, 2022
codecov bot commented May 22, 2022

Codecov Report

Merging #188 (c85dd3f) into main (5888bca) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #188      +/-   ##
==========================================
+ Coverage   94.55%   94.60%   +0.04%     
==========================================
  Files          50       50              
  Lines        2607     2630      +23     
==========================================
+ Hits         2465     2488      +23     
  Misses        142      142              
Impacted Files Coverage Δ
cdp_backend/pipeline/event_gather_pipeline.py 86.00% <100.00%> (+0.28%) ⬆️
cdp_backend/tests/conftest.py 100.00% <100.00%> (ø)
...ckend/tests/pipeline/test_event_gather_pipeline.py 100.00% <100.00%> (ø)
cdp_backend/tests/utils/test_file_utils.py 100.00% <100.00%> (ø)
cdp_backend/utils/file_utils.py 92.99% <100.00%> (+0.37%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@dphoria dphoria left a comment

Appreciate!

@evamaxfield
Member Author

I have two actions running on my dev infrastructure to test:

  1. testing the manual / custom process event: https://github.com/JacksonMaxfield/cdp-dev/runs/6545676305?check_suite_focus=true
  2. testing the normal event gather: https://github.com/JacksonMaxfield/cdp-dev/runs/6545697019?check_suite_focus=true

If both succeed, I will link to the resulting event pages on my dev infra website and then likely merge after approvals! 🎉

Comment on lines +130 to +134
m3u8_To_MP4.download(
    uri,
    mp4_file_dir=dst.parent,
    mp4_file_name=save_name,
)
Collaborator

This downloads to an mp4 file in local storage - could that be a problem on GitHub actions with limited storage?

Member Author

We store the video on disk regardless of whether the source is an m3u8 playlist or a native mp4. The m3u8 path slightly duplicates storage because it keeps both the downloaded chunks and the converted file, but 🤷 we have never run into problems, even when processing 8 events at a time.

Collaborator

I think the limit is 500 MB. Are the temp files deleted after they have been hosted? That would help.
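
One way to keep disk use bounded, in the spirit of the question above, is to do the download and conversion inside a temporary directory that is removed once the file has been hosted. The following is a minimal sketch using only the Python standard library; `process_in_scratch_dir`, `produce_video`, and `host_video` are hypothetical names for illustration and are not functions from this PR or from cdp-backend.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def free_megabytes(path: Path) -> float:
    """Free space, in MB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1_000_000


def process_in_scratch_dir(
    produce_video: Callable[[Path], Path],
    host_video: Callable[[Path], str],
) -> str:
    """Create a video in a scratch directory, host it, then clean up.

    `produce_video` receives the scratch directory and returns the path of
    the video it wrote there (for example, chunks plus a converted mp4).
    `host_video` uploads that file and returns the hosted URI.
    """
    with tempfile.TemporaryDirectory() as scratch:
        local_video = produce_video(Path(scratch))
        hosted_uri = host_video(local_video)
    # Leaving the `with` block deletes the scratch directory and everything
    # in it, including any intermediate chunks.
    return hosted_uri
```

Calling `free_megabytes(Path("."))` before a run gives a rough idea of how much room the runner has left for the video files.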

Member Author

Collaborator

The 500MB must be for logs and artifact files then 😕

Member Author

Now that is likely

@evamaxfield
Member Author

Both the normal event gather and the manual process special event succeeded

https://jacksonmaxfield.github.io/cdp-dev/#/events/0913ecd21d43
https://jacksonmaxfield.github.io/cdp-dev/#/events/99f4c2a99447

@tohuynh are you happy with this PR? If so, I think I can merge. 🎉

@tohuynh
Collaborator

tohuynh commented May 22, 2022

yes

@evamaxfield evamaxfield merged commit acd91e0 into main May 22, 2022
@evamaxfield evamaxfield deleted the feature/support-processing-m3u8-video-playlist-uris branch May 22, 2022 21:21