
Near-constant download of uploaded stories content apparent in S3/AWS logs #30

Closed
mradamcox opened this issue May 30, 2023 · 3 comments

@mradamcox
Collaborator

We have had a jump in our AWS bill over the last week, due to an unanticipated uptick in transfer-out expenses, well beyond the free tier and well beyond normal usage of the site. I've enabled logging in AWS where I can and, I believe, have tracked this activity to node fetches originating from GitHub. The timing seems to match up with when we uploaded a large number of files through the admin-upload process after a tabling event.

@mradamcox
Collaborator Author

mradamcox commented Jun 2, 2023

As far as I can tell, the Repair AV workflow was causing this data transfer. My understanding of this problem is as follows:

  1. We had a lot of videos (24) in the queue from the admin-upload process, and the workflow doesn't commit the IDs of completed videos if it is unexpectedly killed midway through.
  2. We had some videos over 1 GB in size, and it was while transcoding one of these that the workflow process was killed every time, presumably due to the file size.
  3. Because the workflow was scheduled hourly, a new run would start (often before the previous run had even failed) and begin working on the exact same videos, until it too failed on the very large files.

For now, I have gone through the 24 un-transcoded videos in S3 and put the IDs of those that are >= 1 GB in size into a new txt file that the repair AV script will reference to skip processing those videos. Ultimately, we'll need to compress and re-upload those particular videos.
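For reference, the skip check could look something like this, assuming the repair script is Python and the txt file holds one video ID per line (the filename and function names here are illustrative, not the actual ones in the repo):

```python
# Minimal sketch of consuming a skip-list file; names are placeholders.
from pathlib import Path

SKIP_FILE = Path("transcode_skip_ids.txt")  # assumed: one video ID per line


def load_skip_ids(path: Path = SKIP_FILE) -> set[str]:
    """Return the set of video IDs that are too large (>= 1 GB) to transcode."""
    if not path.exists():
        return set()
    return {line.strip() for line in path.read_text().splitlines() if line.strip()}


def videos_to_process(queued_ids: list[str]) -> list[str]:
    """Filter the queued videos down to those not listed in the skip file."""
    skip_ids = load_skip_ids()
    return [vid for vid in queued_ids if vid not in skip_ids]
```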

@mradamcox
Collaborator Author

@mukeshchugani10 One change to the code that would help address issues like this in the future would be a modification of the workflow script so that each time a video is processed successfully, the updated list of video IDs is committed to the repo. As far as I can tell, this commit currently happens only after all of the videos have been processed, which is ultimately what caused this particular issue.
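Roughly what I have in mind, assuming a Python workflow script with git push credentials already configured on the runner (the file name and the transcode stub below are placeholders, not the real ones):

```python
# Sketch of committing the completed-IDs file after every video, so a killed
# run never loses track of work already done. Names are illustrative.
import subprocess

COMPLETED_IDS_FILE = "transcoded_ids.txt"  # assumed path of the committed list


def transcode(video_id: str) -> None:
    """Placeholder for the real per-video transcoding step."""
    ...


def record_and_commit(video_id: str) -> None:
    """Append a finished video ID to the list and push the commit immediately."""
    with open(COMPLETED_IDS_FILE, "a") as f:
        f.write(video_id + "\n")
    subprocess.run(["git", "add", COMPLETED_IDS_FILE], check=True)
    subprocess.run(
        ["git", "commit", "-m", f"Mark video {video_id} as transcoded"],
        check=True,
    )
    subprocess.run(["git", "push"], check=True)


def run_queue(queued_ids: list[str]) -> None:
    for video_id in queued_ids:
        transcode(video_id)
        record_and_commit(video_id)  # commit per video, not once at the end
```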

We can set up a different ticket/workflow for handling very large (>1 GB) video files, though arguably it would be better if files of this size were never uploaded to S3 at all and we do some preprocessing first. Will figure that out with the next batch of uploads.
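Not settled on an approach yet, but the preprocessing could be as simple as something like this before upload (the paths, CRF value, and 1 GB threshold here are placeholders):

```python
# Sketch: re-encode a large video with ffmpeg before it is ever uploaded to S3.
import subprocess
from pathlib import Path

ONE_GB = 1_073_741_824  # 1 GB in bytes


def compress_if_large(src: Path, dst_dir: Path) -> Path:
    """Return a path suitable for upload: the original if it is small enough,
    otherwise an H.264/AAC re-encode written into dst_dir."""
    if src.stat().st_size < ONE_GB:
        return src
    dst = dst_dir / f"{src.stem}_compressed.mp4"
    subprocess.run(
        [
            "ffmpeg", "-i", str(src),
            "-c:v", "libx264", "-crf", "28",  # higher CRF = smaller file
            "-c:a", "aac",
            str(dst),
        ],
        check=True,
    )
    return dst
```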

@mradamcox
Collaborator Author

Ultimately, I've addressed this by disabling the A/V repair workflow and handling transcoding locally instead.
