[MLH Fellow] [Video] Add augment_audio augmentation #130

Closed
zpapakipos opened this issue Sep 27, 2021 · 1 comment

Comments

@zpapakipos
Contributor

🚀 Feature

Add an augmentation called augment_audio that takes a video and an audio augmentation (a Callable), applies the audio augmentation to the video's audio track, recombines the video with the augmented audio, and writes the result to the given output path. The metadata should include an audio_metadata field containing the metadata produced by applying the audio augmentation to the video's audio track. This means get_metadata() should be called twice: once for the video, and once for the audio augmentation.
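As a rough illustration of the requested metadata nesting, here is a minimal sketch. It is not AugLy code: `augment_audio_sketch`, `halve_volume`, and the in-memory `frames`/`audio` arguments are hypothetical stand-ins (a real implementation would read and write video files and call get_metadata()), and it assumes the audio augmentation accepts `sample_rate` and an optional `metadata` list.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in: a list of floats plays the role of the audio array.
Audio = List[float]
AudioAug = Callable[..., Tuple[Audio, int]]


def halve_volume(
    audio: Audio,
    sample_rate: int,
    metadata: Optional[List[Dict[str, Any]]] = None,
) -> Tuple[Audio, int]:
    """Toy audio augmentation (hypothetical) that records its own metadata."""
    out = [sample * 0.5 for sample in audio]
    if metadata is not None:
        metadata.append({"name": "halve_volume", "sample_rate": sample_rate})
    return out, sample_rate


def augment_audio_sketch(
    frames: List[Any],
    audio: Audio,
    sample_rate: int,
    audio_aug_function: AudioAug,
    metadata: Optional[List[Dict[str, Any]]] = None,
    **audio_aug_kwargs: Any,
) -> Tuple[List[Any], Audio, int]:
    """Apply an audio augmentation to a video's audio track (in-memory sketch)."""
    # First metadata collection: the audio augmentation fills its own list.
    audio_metadata: List[Dict[str, Any]] = []
    aug_audio, aug_sample_rate = audio_aug_function(
        audio, sample_rate, metadata=audio_metadata, **audio_aug_kwargs
    )
    # Second metadata collection: the video-level entry nests the audio one.
    if metadata is not None:
        metadata.append({"name": "augment_audio", "audio_metadata": audio_metadata})
    # "Recombining" here just means returning the untouched frames alongside
    # the augmented audio; real code would mux them into the output file.
    return frames, aug_audio, aug_sample_rate
```

Calling `augment_audio_sketch(frames, audio, 16000, halve_volume, metadata=meta)` leaves one entry in `meta` whose `audio_metadata` field holds the entry written by the audio augmentation.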

Motivation

One of the cool things about AugLy is that it's multimodal: users can transform data of different modalities under one unified API, and even apply augmentations to multimodal data (e.g. augment text which is then overlaid on an image which is also augmented, or augment the audio track of a video which is also augmented). To make it easier to combine audio & video augmentations, we would like to define a video augmentation which augments the audio with a given AugLy augmentation.

To do

  • Read our contributing guidelines for all the steps to add a new augmentation
@Adib234
Contributor

Adib234 commented Sep 29, 2021

Hi, I'm not sure where to ask questions about my task, so for now I'll ask here. I'm assuming the Callable is not an optional parameter. If it were optional, what would the default value be? Here's what I have so far:

audio_aug_function: Callable[
        ..., Tuple[np.ndarray, int]
    ]  # would there be a default value here?

I don't think it's possible to know the types of the arguments in the Callable, but from looking at audio/functional.py it seems the return type is always Tuple[np.ndarray, int], and I think that's what we want.
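For illustration only, the two options being weighed could be written as below. This is a sketch, not a decision about the final signature: `required_aug`, `optional_aug`, and `reverse_audio` are hypothetical names, a list of floats stands in for np.ndarray, and treating `None` as "leave the audio unchanged" is an assumption.

```python
from typing import Callable, List, Optional, Tuple

Audio = List[float]  # stand-in for np.ndarray
# `...` leaves the argument types unspecified; only the return type is fixed.
AudioAug = Callable[..., Tuple[Audio, int]]


def reverse_audio(audio: Audio, sample_rate: int) -> Tuple[Audio, int]:
    """Toy augmentation (hypothetical) used to exercise both signatures."""
    return list(reversed(audio)), sample_rate


def required_aug(
    audio: Audio, sample_rate: int, audio_aug_function: AudioAug
) -> Tuple[Audio, int]:
    """Option 1: the Callable is required, so no default value is needed."""
    return audio_aug_function(audio, sample_rate)


def optional_aug(
    audio: Audio,
    sample_rate: int,
    audio_aug_function: Optional[AudioAug] = None,
) -> Tuple[Audio, int]:
    """Option 2: the Callable defaults to None, treated here as a no-op."""
    if audio_aug_function is None:
        return audio, sample_rate
    return audio_aug_function(audio, sample_rate)
```

With the second form, the caller can omit the augmentation entirely; with the first, a type checker flags a missing argument.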
