
agarwalml/ammml_mv


ammml_mv

Context

This project is part of 11-877 Advanced Topics in Multimodal Machine Learning at Carnegie Mellon University by Santiago Benoit and Mehul Agarwal under the guidance of Professor Louis-Philippe Morency.

Description

We use deep multimodal learning techniques to generate music videos synchronized with a given piece of music and its lyrics. The project combines natural language processing (NLP) and computer vision to produce a context-specific visual experience that complements the audio.

Additional Required Repos

Place these in this directory.

giffusion: https://github.com/DN6/giffusion.git
MultimodalMusicEmotion: https://github.com/santient/MultimodalMusicEmotion.git
