# Intelligent Music Generation

## Project Overview


**1. Project Goals**

This project targets **AI-based music fusion and generation** using two generative models:

1. **Variational Autoencoder (VAE)**

   * Used for **interpolation between existing MIDI tracks**.
   * Produces smooth transitions between melodies, blending two pieces into a hybrid track.
2. **Generative Adversarial Network (GAN)**

   * Trained on a large MIDI dataset.
   * Generates **completely new tracks** in the style of the training set.
   * Can produce novel compositions not directly derived from any input.

Additional components:

* **Website**: UI for users to explore examples and use the system.
* **Web Scraper**: Efficiently downloads MIDI samples from sites.
* **Transposer**: Standardizes all MIDI files to the same key, ensuring compatibility when fusing or interpolating.

---

**2. Three Main Approaches**

**A. Interpolation on Existing Music (VAE)**

1. Two MIDI tracks are input to a VAE.
2. VAE encodes each track into **latent vectors**.
3. Latent vectors are **interpolated** (weighted average or smooth blending).
4. Decoder outputs a new MIDI track that **blends the musical content** of both tracks.
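
The interpolation step (step 3) can be sketched in a few lines. This is a minimal illustration, assuming each track has already been encoded into a flat list of floats; the function names `lerp`, `slerp`, and `interpolation_path` are hypothetical and not part of any library. Linear interpolation is the weighted average mentioned above, and spherical interpolation is one common choice for "smooth blending" in VAE latent spaces.

```python
import math
from typing import Callable, List, Sequence

Latent = List[float]


def lerp(z_a: Sequence[float], z_b: Sequence[float], t: float) -> Latent:
    """Weighted average: (1 - t) * z_a + t * z_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


def slerp(z_a: Sequence[float], z_b: Sequence[float], t: float) -> Latent:
    """Spherical interpolation; falls back to lerp for degenerate inputs."""
    norm_a = math.sqrt(sum(a * a for a in z_a))
    norm_b = math.sqrt(sum(b * b for b in z_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return lerp(z_a, z_b, t)
    cos_omega = sum(a * b for a, b in zip(z_a, z_b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if abs(sin_omega) < 1e-6:  # vectors are (anti-)parallel
        return lerp(z_a, z_b, t)
    w_a = math.sin((1.0 - t) * omega) / sin_omega
    w_b = math.sin(t * omega) / sin_omega
    return [w_a * a + w_b * b for a, b in zip(z_a, z_b)]


def interpolation_path(
    z_a: Sequence[float],
    z_b: Sequence[float],
    steps: int,
    blend: Callable[[Sequence[float], Sequence[float], float], Latent] = lerp,
) -> List[Latent]:
    """Return `steps` latent vectors evenly spaced from z_a to z_b, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [blend(z_a, z_b, i / (steps - 1)) for i in range(steps)]
```

Each vector on the path would then be passed to the decoder; the midpoint (`t = 0.5`) corresponds to an even blend of the two tracks.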

**Use case:**

* Fusion of two known songs, genres, or melodies.
* Preserves structure and style of original tracks.



**B. GAN Generation**

1. GAN is trained on a **large MIDI dataset**.
2. Once trained, the generator can produce **entirely new MIDI tracks** in the learned style.
3. Output tracks can be used as **fresh material** for further processing.
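
The plan does not specify the generator's output format. As one possible sketch, assume the GAN emits a piano roll: a grid of activation values indexed by time step and pitch. The hypothetical helper below thresholds such a grid and extracts note events that could then be written to MIDI.

```python
from typing import List, Sequence, Tuple

# (pitch index, start step, end step); end is exclusive
Note = Tuple[int, int, int]


def piano_roll_to_notes(
    roll: Sequence[Sequence[float]], threshold: float = 0.5
) -> List[Note]:
    """Convert a (time_steps x pitches) activation grid into note events.

    Assumption: values >= threshold mean "note is sounding" at that step.
    """
    notes: List[Note] = []
    if not roll:
        return notes
    n_steps = len(roll)
    for pitch in range(len(roll[0])):
        start = None
        for step in range(n_steps):
            active = roll[step][pitch] >= threshold
            if active and start is None:
                start = step  # note onset
            elif not active and start is not None:
                notes.append((pitch, start, step))  # note offset
                start = None
        if start is not None:  # note still sounding at the end of the roll
            notes.append((pitch, start, n_steps))
    # Order by onset time, then pitch, which is convenient for MIDI writing
    return sorted(notes, key=lambda note: (note[1], note[0]))
```

A real generator would typically cover all 128 MIDI pitches; the threshold value is a tuning choice, not something fixed by the plan.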

**Use case:**

* Generating novel music in a specific genre.
* Expands creative possibilities beyond existing tracks.


**C. GAN + VAE (Hybrid Fusion)**

1. GAN generates two new MIDI tracks.
2. These tracks are passed through the **VAE interpolation pipeline**.
3. Output is a **blended track** that combines the GAN's novelty with the VAE's smooth interpolation.

**Use case:**

* Creating unique compositions by fusing two GAN-generated tracks.
* Produces music that is both novel and coherent.

---

**3. Supporting Tools**

**A. Transposer**

* Ensures all tracks are in the **same musical key**.
* Important because interpolating tracks in different keys can sound dissonant.
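
A minimal sketch of the transposition arithmetic, assuming the key of each track is already known (key detection is outside this example) and that notes are plain MIDI pitch numbers from 0 to 127. The names `semitone_shift` and `transpose` are hypothetical.

```python
from typing import List

# Pitch class (0-11) of each tonic name
PITCH_CLASSES = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10,
    "B": 11,
}


def semitone_shift(source_key: str, target_key: str = "C") -> int:
    """Smallest shift (in the range -5..+6) moving the source tonic to the target."""
    diff = (PITCH_CLASSES[target_key] - PITCH_CLASSES[source_key]) % 12
    return diff - 12 if diff > 6 else diff


def transpose(pitches: List[int], shift: int) -> List[int]:
    """Shift MIDI pitches, folding by octaves to stay inside 0..127."""
    result = []
    for pitch in pitches:
        moved = pitch + shift
        while moved > 127:
            moved -= 12
        while moved < 0:
            moved += 12
        result.append(moved)
    return result
```

For example, a D major triad `[62, 66, 69]` shifted by `semitone_shift("D", "C")` becomes the C major triad `[60, 64, 67]`. Choosing the smallest shift keeps the transposed track close to its original register.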

**B. Web Scraper**

* Collects MIDI samples efficiently from websites.
* Enables training the GAN and the VAE on larger datasets.
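
The link-discovery part of a scraper can be sketched with the standard library alone. This example assumes the target pages expose MIDI files as ordinary `<a href>` links ending in `.mid` or `.midi`; the class and function names are hypothetical, and downloading, rate limiting, and checking each site's terms of use are left out.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse


class MidiLinkParser(HTMLParser):
    """Collects absolute URLs of links that point at MIDI files."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        url = urljoin(self.base_url, href)  # resolve relative links
        path = urlparse(url).path.lower()   # ignore query strings
        if path.endswith((".mid", ".midi")) and url not in self.links:
            self.links.append(url)


def find_midi_links(html: str, base_url: str) -> List[str]:
    """Return de-duplicated MIDI URLs found in an HTML page, in page order."""
    parser = MidiLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

The page HTML would come from an HTTP client of your choice; keeping parsing separate from fetching makes the parser easy to test offline.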

**C. Website Interface**

* Demonstrates examples and allows users to **experiment with music fusion**.
* Planned features:

  * Upload interface for MIDI tracks
  * Selection of fusion method (VAE interpolation, GAN generation, hybrid)
  * Playback of generated music

---

**4. How the Methods Complement Each Other**

| Method            | Strength                                                        | Weakness                                         |
| ----------------- | --------------------------------------------------------------- | ------------------------------------------------ |
| VAE interpolation | Smooth blending of existing tracks, preserves musical structure | Cannot create completely new content             |
| GAN generation    | Generates novel tracks in learned style                         | May produce incoherent or unpolished outputs     |
| GAN + VAE hybrid  | Combines novelty with coherence                                 | Computationally heavier, may need careful tuning |

* Essentially, **VAE handles smoothness and interpolation**, while **GAN handles creativity and new content generation**.

---

**5. Summary Workflow Diagram**

```
Existing MIDI tracks → Transposer → VAE → Interpolated MIDI
          │
          └─> GAN (trained on dataset) → New MIDI tracks
                       │
                       └─> VAE → Interpolated GAN MIDI
```

* This captures all three approaches:

  1. Direct VAE interpolation
  2. GAN generation
  3. Hybrid GAN + VAE interpolation

---

## Website Design

**1. Website Goals**

The website should allow users to:

1. Upload **MIDI files** (audio upload is optional and depends on adding audio-to-MIDI transcription).
2. Generate new music using:

   * **VAE interpolation** (blend two tracks)
   * **GAN generation** (create new tracks)
   * **Hybrid GAN + VAE** (blend two GAN-generated tracks)
3. Listen to and download generated music.
4. Explore example tracks and demos.

---

**2. Core Services / Features**

Here’s a breakdown of **services** the website should offer:

| Service                       | Description                                                             | Implementation Notes                                                                      |
| ----------------------------- | ----------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- |
| **File Upload**               | Users upload MIDI files (and optionally audio if you add transcription) | Use an HTML file input or drag-and-drop. Backend stores files temporarily for processing. |
| **Track Fusion (VAE)**        | Blend two uploaded MIDI tracks                                          | Backend calls MusicVAE to encode, interpolate, decode, and return MIDI.                   |
| **Track Generation (GAN)**    | Generate new track in a specific genre                                  | Backend runs GAN inference. Provide parameters like style or seed.                        |
| **Hybrid Fusion (GAN + VAE)** | Generate two GAN tracks, then interpolate                               | Combine GAN generation and VAE interpolation pipelines.                                   |
| **Playback**                  | Listen to results in browser                                            | Render MIDI to WAV server-side (e.g., FluidSynth) and stream it through an HTML `<audio>` element, or synthesize in the browser with the Web Audio API. |
| **Download**                  | Download generated tracks                                               | Serve as `.mid` or `.wav` files.                                                          |
| **Examples / Gallery**        | Showcase pre-generated tracks                                           | Static page or dynamic gallery showing genre fusion examples.                             |
| **Transposition (Optional)**  | Normalize key of uploaded tracks                                        | Can be done server-side using MIDI libraries.                                             |
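
To illustrate the MIDI → WAV step in the Playback row, here is a toy renderer that uses only the standard library. It is a stand-in for a real synthesizer such as FluidSynth, not a replacement: it assumes notes are given as `(pitch, start_seconds, end_seconds)` tuples and plays each one as a plain sine wave. The function names are hypothetical.

```python
import io
import math
import struct
import wave
from typing import Iterable, Tuple

SAMPLE_RATE = 22050  # samples per second


def midi_to_hz(pitch: int) -> float:
    """Equal-temperament frequency; MIDI note 69 is A4 = 440 Hz."""
    return 440.0 * 2.0 ** ((pitch - 69) / 12.0)


def render_wav(
    notes: Iterable[Tuple[int, float, float]], sample_rate: int = SAMPLE_RATE
) -> bytes:
    """Render notes as sine tones and return 16-bit mono WAV bytes."""
    notes = list(notes)
    total = int(max((end for _, _, end in notes), default=0.0) * sample_rate)
    samples = [0.0] * total
    for pitch, start, end in notes:
        freq = midi_to_hz(pitch)
        first = int(start * sample_rate)
        last = min(int(end * sample_rate), total)
        for i in range(first, last):
            phase = 2.0 * math.pi * freq * (i - first) / sample_rate
            samples[i] += 0.3 * math.sin(phase)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(sample_rate)
        # Clip to [-1, 1] before scaling to the 16-bit integer range
        wav.writeframes(b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples
        ))
    return buffer.getvalue()
```

The returned bytes can be written to a `.wav` file or sent in an HTTP response for the `<audio>` element to play.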

---

**3. Backend Architecture**

The backend handles **all AI processing**, file conversion, and storage. Here’s a suggested setup:

**A. Stack**

* **Language:** Python (best for Magenta / GAN / VAE)
* **Web framework:** Flask, FastAPI, or Django
* **AI libraries:** Magenta (MusicVAE), TensorFlow, PyTorch (GAN)
* **MIDI/audio tools:** `pretty_midi`, FluidSynth, `soundfile`

**B. API Endpoints**

| Endpoint              | Method | Description                                 |
| --------------------- | ------ | ------------------------------------------- |
| `/upload`             | POST   | Upload MIDI files, returns file IDs         |
| `/generate/vae`       | POST   | Interpolate two tracks via VAE              |
| `/generate/gan`       | POST   | Generate a new track via GAN                |
| `/generate/hybrid`    | POST   | Generate GAN tracks and interpolate via VAE |
| `/download/<file_id>` | GET    | Download generated track                    |
| `/play/<file_id>`     | GET    | Stream generated audio                      |
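
The storage behind `/upload` and `/download/<file_id>` could look like the sketch below, which is independent of the web framework. `FileStore` is a hypothetical helper, not an existing library class. It relies on two facts: standard MIDI files begin with the bytes `MThd`, and `uuid.uuid4().hex` yields a 32-character hexadecimal ID.

```python
import re
import uuid
from pathlib import Path

MIDI_MAGIC = b"MThd"  # header chunk ID at the start of a standard MIDI file
ID_PATTERN = re.compile(r"[0-9a-f]{32}")


class FileStore:
    """Saves uploads under random IDs and resolves IDs back to paths."""

    def __init__(self, root) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save_midi(self, data: bytes) -> str:
        """Store MIDI bytes and return the new file ID."""
        if not data.startswith(MIDI_MAGIC):
            raise ValueError("not a standard MIDI file")
        file_id = uuid.uuid4().hex
        (self.root / f"{file_id}.mid").write_bytes(data)
        return file_id

    def path_for(self, file_id: str) -> Path:
        """Resolve an ID to a stored file, rejecting malformed IDs."""
        # Accepting only hex IDs prevents path traversal such as "../x"
        if not ID_PATTERN.fullmatch(file_id):
            raise KeyError(file_id)
        path = self.root / f"{file_id}.mid"
        if not path.is_file():
            raise KeyError(file_id)
        return path
```

An upload handler would call `save_midi` on the request body and return the ID; a download handler would call `path_for` and respond with 404 on `KeyError`.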


**C. Processing Flow**

1. User uploads files → backend stores them.
2. For VAE:

   * Encode files → interpolate → decode → MIDI.
3. For GAN:

   * Generate new MIDI → optionally interpolate via VAE.
4. Convert MIDI → WAV for playback.
5. Return downloadable file URLs and streaming links.
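
Steps 2 and 3 of the flow can be expressed as a single dispatcher. In this sketch, `encode`, `decode`, and `generate` are hypothetical callables supplied by the caller; they stand in for the VAE encoder, the VAE decoder, and the GAN generator, and are not Magenta or PyTorch APIs. The blend is a plain weighted average of the two latent vectors.

```python
from typing import Any, Callable, List, Sequence

Latent = List[float]


def run_pipeline(
    method: str,
    *,
    encode: Callable[[Any], Latent],
    decode: Callable[[Latent], Any],
    generate: Callable[[], Any],
    inputs: Sequence[Any] = (),
    t: float = 0.5,
) -> Any:
    """Run one of the three approaches: "vae", "gan", or "hybrid"."""
    if method == "gan":
        return generate()  # new track, no interpolation
    if method == "vae":
        if len(inputs) != 2:
            raise ValueError("VAE interpolation needs exactly two tracks")
        track_a, track_b = inputs
    elif method == "hybrid":
        # Two fresh GAN tracks feed the same interpolation path as "vae"
        track_a, track_b = generate(), generate()
    else:
        raise ValueError(f"unknown method: {method!r}")
    z_a, z_b = encode(track_a), encode(track_b)
    z_mix = [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]
    return decode(z_mix)
```

Passing the models in as callables keeps the endpoint handlers thin and lets the flow be tested with toy functions before the real models are wired in.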

---

**4. Frontend Layout**

**A. Pages / Components**

1. **Home / Landing Page**

   * Introduction and examples of AI-generated music.
2. **Upload / Fusion Page**

   * File upload inputs (2 tracks for VAE)
   * Buttons: VAE, GAN, Hybrid
   * Genre/style selection for GAN
   * Audio player to preview generated track
   * Download button
3. **Gallery / Examples**

   * Pre-generated tracks with descriptions
4. **About / Documentation**

   * How the system works (VAE, GAN, hybrid)

**B. Interactivity**

* Use the Fetch API (AJAX) to call the backend without reloading the page.
* Display **progress/loading bar** for generation (AI models can take time).
* Show **waveform or MIDI piano roll preview** (optional, for visual appeal).

---

**5. Optional Advanced Features**

1. **User Accounts**

   * Save favorite generated tracks.
   * Track history of generated music.

2. **Genre Customization**

   * Allow user to pick genres, BPM, key, or mood.

3. **Audio Upload**

   * If you implement audio → MIDI transcription, allow vocal or instrument tracks.

4. **Batch Processing**

   * Let users generate multiple fused tracks at once.

---

**6. Tech Stack Summary**

| Layer      | Technology                                      |
| ---------- | ----------------------------------------------- |
| Frontend   | HTML, CSS, JS, React or Vue (optional)          |
| Backend    | Python + Flask/FastAPI                          |
| AI Models  | MusicVAE (Magenta), GAN (PyTorch or TensorFlow) |
| MIDI/Audio | `pretty_midi`, FluidSynth, `soundfile`          |
| Storage    | Local filesystem or cloud (S3) for MIDI/WAV     |

---



## Import Libraries

```python
import numpy as np
import pandas as pd
```