Refactor our resource generation script #820

@scotts

We have a script, generate_reference_resources.sh, that we use to generate all of our reference frames for testing. However, it also contains a command that generates an actual mp3 audio file from one of our reference videos: https://github.com/pytorch/torchcodec/blob/4af0bfe5f294415e14754f3ef9c5c3fc22b0d858/test/generate_reference_resources.sh#L45

On a clean repo, running this script should be an expensive no-op. That is, it should do a lot of work generating all of the test references, but what it generates should match exactly, bit-for-bit, what's already in the repo. Lately, however, when I run this script, the generated mp3 file from this line, test/resources/nasa_13013.mp4.audio.mp3, is different from what we have checked in.
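To make the "expensive no-op" expectation checkable, here is a minimal sketch that hashes every file under a resources directory before and after a run and reports anything whose bytes changed. The helper names (`sha256_of`, `snapshot`, `changed_files`) are hypothetical and not part of the repo; in practice `git status --porcelain test/resources` gives a similar answer.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large media files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(resources_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to resources_dir) to its digest."""
    return {
        str(p.relative_to(resources_dir)): sha256_of(p)
        for p in sorted(resources_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List paths that were added, removed, or whose bytes changed."""
    return sorted(
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    )
```

Taking a `snapshot` before running the script and another afterwards, an empty `changed_files` result would mean the run really was a bit-for-bit no-op.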

Questions we should resolve:

  1. Why is the newly generated mp3 different? This may be due to me having a different FFmpeg installed; I believe I used FFmpeg 4 back when first generating that mp3 and now I use FFmpeg 6 in my development environment.
  2. Should we just side-step the above question entirely and say that this script should exclusively be about generating reference frames from checked-in media files? The solution here would be to just remove the mp3 generation from the script.
  3. Since we're here, should we refactor the script into Python? A personal rule I like to follow is that once a bash script has a single branch, I would rather write it in Python. We crossed that threshold a while ago and now have nested for-loops. We also call a Python script, convert_image_to_tensor.py, from within this bash script, so we could consolidate it all into one.
  4. Even if we stick with bash, we should remove the redundant bmp generation. (Trying to do this bit is why I discovered this problem.)
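
For question 3, a rough sketch of what the nested loops might look like in Python. Everything here is an assumption for illustration: the function names, the output naming scheme, and the exact FFmpeg arguments are not copied from generate_reference_resources.sh and would need to be matched against the real script.

```python
import subprocess
from pathlib import Path


def frame_extraction_command(video: Path, frame_index: int) -> list[str]:
    """Build an ffmpeg command that writes one frame of `video` as a bmp.

    The output name and filter expression are hypothetical, not the
    script's actual conventions.
    """
    output = video.with_name(f"{video.name}.frame{frame_index:06d}.bmp")
    return [
        "ffmpeg",
        "-y",  # overwrite existing outputs
        "-i", str(video),
        # Select only the frame with this index; the comma is escaped
        # so the filtergraph parser does not treat it as a separator.
        "-vf", rf"select=eq(n\,{frame_index})",
        "-frames:v", "1",
        str(output),
    ]


def generate_all(videos, frame_indices, run=subprocess.run):
    """Run the extraction for every (video, frame index) pair."""
    for video in videos:
        for frame_index in frame_indices:
            # check=True raises if ffmpeg exits with a non-zero status.
            run(frame_extraction_command(video, frame_index), check=True)
```

Passing `run` as a parameter keeps the loops testable without FFmpeg installed, and the tensor conversion currently done by convert_image_to_tensor.py could then be called as a normal function inside the same loop.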

Labels: refactor (Improves code itself, but does not fix a bug or add new functionality.)
