This is the repository for the FYP of 2024-25 Cohort, supervised by Prof. Andrew Horner. Group code is HO3.
The chronology of branch development, in parallel with the main branch, is summarised as follows:
front-end-dev → back-end-dev & detached / experimental_multi_GPU (model training) → JS_frontend → front-end (where the core components of the system reside, including the backend)
[Main contributor: Tomy Kwong]
The dataset used is FMA (Defferrard, Benzi, Vandergheynst, and Bresson, 2017), which, in full, features 106,574 full-length soundtracks spanning 161 genres. Downloading the dataset using the link to the left gives access to all metadata files and soundtracks (specifically, 17 out of 156 folders of soundtracks, randomly sampled, are used to optimise storage).
Data cleaning procedure:
1. Identify useful information from the metadata (`tracks.csv`, found in the FMA zip file); see the comments in `txtGen.py` for a description of the useful fields
2. Generate one `(trackID).txt` file per track by running `txtGen.py`
3. Aggregate all `.txt` files (generated in step 2) into `NewTracks.csv` (or `AggTracks.csv`) by running `csvAgg.py`
4. Generate `tracks.json` from `NewTracks.csv` (or `AggTracks.csv`) by running `jsonify.py`
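The final step is, in essence, a CSV-to-JSON conversion. The sketch below is illustrative only; the actual `jsonify.py` may select different fields or produce a different output layout:

```python
import csv
import json

def csv_to_json(csv_path: str, json_path: str) -> int:
    """Convert the aggregated tracks CSV into a JSON list of records.
    Minimal sketch: field names and output structure are assumptions,
    not necessarily what jsonify.py actually produces."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))  # one dict per track row
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2)
    return len(rows)
```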
[Update: Contents in this branch have been merged with the frontend]
[Main contributor: Eric Kwok]
This branch houses the backend scripts that host the music generator model inference, as well as the SLM module. Highlights:
- `api.py`: scripts that serve the required backend modules
- `inference_class.py`: inference class definition for the music generator model
Before proceeding to the backend, please make sure all Python library dependencies are installed by running `(python -m) pip install -r (dependencies.txt or requirements.txt)`. This `.txt` file is located in the `(MuXiT\)backend` directory.
[Main contributors: Crystal Chan, Tomy Kwong]
This branch houses the frontend scripts that host the Next.js site on which the user interface of the system runs. Highlights:
- `Gradio.py`: for early prototyping purposes
- Hosts:
- Frontend hosted at localhost:3000 (127.0.0.1:3000)
- Backend hosted at localhost:8000 (127.0.0.1:8000)
- Running the system:
  - `npm run start` to start the whole frontend and backend system (for best compatibility, execute this command in the `(MuXiT\)jsfrontend` directory)
  - `npm run start-backend` to wake the backend (alternatively, run `backend\api.py` in another terminal window on the same machine)
  - `npm run start-frontend` to wake the frontend
  - `npm run build` to rebuild the frontend when initialising in a new environment, after an error, or after updates
- System features spotlight:
- Local chat history: Keep your past chats (all text and audio files), even after shutting down the server!
- Customising music generation: On top of text prompts, feel free to upload audio clips to generate more creative stuff!
- SLM integration: Get friendly responses with every message sent in the system! Powered by Google Gemma 3
(Note: To use this model, please make sure you have downloaded the model weights locally and changed the model path in `api.py` to the local path. Alternatively, make sure you have logged in with a Hugging Face token that has gated-access permission by running `huggingface-cli login` and following the on-screen instructions.)
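The note above amounts to a small path-resolution step: prefer a local weights directory and fall back to the gated Hub ID. The directory and Hub ID below are illustrative assumptions, not the values used in `api.py`:

```python
import os

# Illustrative values only - not the actual paths used in api.py.
LOCAL_GEMMA_DIR = "models/gemma-3"
HUB_MODEL_ID = "google/gemma-3-4b-it"  # gated; needs huggingface-cli login

def resolve_model_path(local_dir: str = LOCAL_GEMMA_DIR) -> str:
    """Return the local weights directory if it exists, otherwise the
    Hub ID (which requires a token with gated-access permission)."""
    return local_dir if os.path.isdir(local_dir) else HUB_MODEL_ID
```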
[Main contributor: Melvin Tong]
We performed LoRA (Low-Rank Adaptation) training on the CSE server. Please download the LoRA weights here
Training code can be found in the musicgen_trainer folder (courtesy of @chavinlo). Other files on these branches are mostly log files produced during training.
During the training process, the pre-trained model was loaded and all components were explicitly converted to float32 precision to ensure numerical stability.
The transformer layers were evenly partitioned across four GPUs, with each device responsible for twelve out of forty-eight layers.
LoRA adapters were selectively injected into key linear submodules (linear1, linear2, and out_proj), resulting in approximately 28M trainable parameters, representing only 2.8% of the total model parameters.
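The even split described above (48 transformer layers over four GPUs, 12 per device) amounts to a simple layer-to-device map. A sketch of the arithmetic, not the actual training code in musicgen_trainer:

```python
def partition_layers(n_layers: int = 48, n_gpus: int = 4) -> dict[int, int]:
    """Map each transformer layer index to a GPU index, splitting the
    layers evenly (48 layers / 4 GPUs = 12 layers per device)."""
    per_gpu = n_layers // n_gpus
    return {layer: layer // per_gpu for layer in range(n_layers)}
```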
- 20250316: Implemented `api.py`
- 20250401:
  - Updated base model loading
  - Included more output methods
- 20250413: Implemented `inference_class.py`
- Changes made outside of this branch:
  - [Contributor: Tomy Kwong | Branch: front-end] Implemented SLM integration in `api.py`
- Debugging
- Fixed a 422 (Unprocessable Entity) error in the FastAPI request validation and tested the connection with the model
- Fixed CORS error using middleware
- Fixed dark mode text
- 20250422:
- Added duration parameter
- Added a local storage system, so users keep their history even after closing the website