<center><img src="https://github.com/insaid2018/Term-1/blob/master/Images/INSAID_Full%20Logo.png?raw=true" width="360" height="160" /></center>

**<center><h1>Automatic Music Generation Using RNN</h1></center>**

----
# **Table of Contents**
----
**1.** [**Problem Statement**](#Section1)<br>
**2.** [**Importing Libraries**](#Section2)<br>
**3.** [**Loading the data**](#section3)<br>
**4.** [**Data-Preprocessing**](#section4)<br>
**5.** [**Batch Generation**](#section5)<br>
**6.** [**Modelling**](#section6)<br>
**7.** [**Music Generation**](#section7)<br>
**8.** [**Applications**](#section8)<br>
**9.** [**Limitations**](#section9)<br>
**10.** [**Conclusion**](#section10)<br>

----
<a id=Section1></a>
# **1. Problem Statement**
----
- **ABC music prod. pvt.ltd** is a renowned **audio-video production house** based in **Mumbai, India**.

- As **COVID-19** cases increase day by day, it is almost impossible for musicians to keep up with real-time studio work.

- Hence, the company wants you to build an **AI-based music generation system**, which is the goal of this project.

- The key constraint of the problem is **accuracy.**


<center><img src = "https://cdn.dribbble.com/users/316072/screenshots/10724786/laptop_music_animation_01_1600x1200.gif"></center>

### **Scenario**

- You have been hired as a **freelance data scientist** for **ABC music prod. pvt.ltd**

- The model should read a text file in **abc format**.
 
- The model should generate the **corresponding music** from that note sequence.


----
<a id=Section2></a>
# **2. Importing Libraries**
----

In [None]:
import numpy as np                                     # Importing numpy   
from keras.models import Sequential, load_model        # Importing Sequential layers and load_model 
from keras.layers import LSTM, Dropout                 # Importing LSTM and Dropout layers  
from keras.layers import TimeDistributed               # Importing TimeDistributed layers
from keras.layers import Dense, Activation, Embedding  # Importing Dense, Activation and Embedding layers  
import os                                              # Importing OS 
import json                                            # Importing JSON  
import argparse                                        # Importing argparse

----
<a id=section3></a>
# **3. Data Reading**
----
- In this step we will **read the data** from the **input text corpus**.
- We will read the file as raw bytes and decode it as **UTF-8** to obtain a Python string.

In [None]:
with open('/content/input.txt', 'rb') as f:
    input_text = f.read()
input_text = str(input_text, 'utf-8')

In [None]:
input_text[:100]

'X: 1\nT:A and D\n% Nottingham Music Database\nS:EF\nY:AB\nM:4/4\nK:A\nM:6/8\nP:A\nf|"A"ecc c2f|"A"ecc c2f|"A"'

----
<a id=section4></a>
# **4. Data Preprocessing**
----

- In this step we will convert the data into the format the model expects.
- We will map every unique character to an integer index.
- We will store this mapping in a dictionary, along with the reverse mapping from index back to character.

In [None]:
def generate_keys(text):
    # Character-to-index dictionary, built from the sorted set of unique characters
    char_to_idx = {ch: idx for idx, ch in enumerate(sorted(set(text)))}

    # Index-to-character dictionary (the inverse mapping)
    idx_to_char = {idx: ch for ch, idx in char_to_idx.items()}

    # Printing the size of both mappings
    print("len of the  char_to_idx ", len(char_to_idx))
    print("len of the  idx_to_char ", len(idx_to_char))
    return char_to_idx, idx_to_char

In [None]:
# Generating the character-to-index and index-to-character mappings
char_to_idx, idx_to_char = generate_keys(input_text)

len of the  char_to_idx  86
len of the  idx_to_char  86
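As a quick check of the idea, the sketch below applies the same mapping construction to a short toy string and verifies that encoding followed by decoding returns the original text. The names `sample_text`, `char_to_idx_demo` and `idx_to_char_demo` are illustrative only and are not part of the notebook.

```python
# Toy ABC fragment used only for illustration (not the real corpus)
sample_text = 'K:A\nf|"A"ecc c2f|'

# Same construction as generate_keys: sorted unique characters -> indices
char_to_idx_demo = {ch: idx for idx, ch in enumerate(sorted(set(sample_text)))}
idx_to_char_demo = {idx: ch for ch, idx in char_to_idx_demo.items()}

# Encode the text as integers, then decode it back
encoded = [char_to_idx_demo[ch] for ch in sample_text]
decoded = ''.join(idx_to_char_demo[i] for i in encoded)

print(len(char_to_idx_demo), "unique characters")
print(decoded == sample_text)  # True when the mapping is lossless
```

Because the two dictionaries are exact inverses, no information is lost when the text is turned into integers.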


In [None]:
# Saving the character-to-index mapping so it can be reloaded when generating music
with open('/content/char_to_idx', 'w') as f:
    json.dump(char_to_idx, f)

In [None]:
len(input_text)

129665

----
<a id=section5></a>
# **5. Batch Generation**
----

- In this section we will generate batches in order to train our model.
- The text is **129,665** characters long.
- We will use a **batch size of 16** and a **sequence length of 64**.
- Splitting the text into 16 rows gives **129,665 / 16 ≈ 8,104 characters per row**, so the **number of batches** is **8,104 / 64 ≈ 126** (rounded down).
- Because this is sequence data, the **first 8,104 characters** supply the 1st row of every batch: 64 consecutive characters per batch, for 126 batches.
- Similarly, characters **8,105 to 16,208** supply the 2nd row of each batch, again 64 characters at a time.
- In general, row *r* of every batch is read from the chunk that starts at an offset of **r × 8,104** characters.
- So each row **continues its own sequence** from one batch to the next.
- This preserves the sequence information in the text, which matters because the LSTM layers are stateful.
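The arithmetic above can be reproduced with a few lines of integer division. This is only a sketch of the calculation; the variable names `chars_per_row`, `num_batches` and `used_per_row` are chosen here for illustration.

```python
text_length = 129665   # length of the input text
batch_size = 16
seq_len = 64

# Characters available to each of the 16 rows
chars_per_row = text_length // batch_size

# One character per row is held back, because every target is the *next* character
num_batches = (chars_per_row - 1) // seq_len

# Characters of each row that are actually consumed in one pass
used_per_row = num_batches * seq_len

print(chars_per_row, num_batches, used_per_row)

# Offsets at which the first three rows start reading
print([r * chars_per_row for r in range(3)])
```

Note that 126 × 64 = 8,064, so the last few characters of each 8,104-character chunk are not used in a pass.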

In [None]:
# Generating the batches that are fed to the model.
# batch_size and batch_sequence are globals defined in the modelling section below.
def generate_batchs(T, vocab_size):
    length = T.shape[0]                   # length = 129665
    batch_char = length // batch_size     # characters per row = 8104
    # Hold back one character per row, since each target is the next character
    num_batches = (batch_char - 1) // batch_sequence    # = 126
    for start in range(0, num_batches * batch_sequence, batch_sequence):
        X = np.zeros((batch_size, batch_sequence))               # input character indices
        Y = np.zeros((batch_size, batch_sequence, vocab_size))   # one-hot next characters

        # Filling each row (batch_index) and column (col_index)
        for batch_index in range(batch_size):
            for col_index in range(batch_sequence):
                pos = batch_char * batch_index + start + col_index
                X[batch_index, col_index] = T[pos]
                Y[batch_index, col_index, T[pos + 1]] = 1
        yield X, Y
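To see the row-wise continuation on a small scale, the sketch below uses the same layout on a toy sequence of 40 integers with a batch size of 2 and a sequence length of 4. `toy_batches` is a hypothetical helper written for this illustration; it returns plain index lists rather than one-hot targets.

```python
def toy_batches(T, batch_size, seq_len):
    """Yield (X, Y) laid out like the batch generator, but as plain lists of indices."""
    chars_per_row = len(T) // batch_size
    num_batches = (chars_per_row - 1) // seq_len
    for b in range(num_batches):
        start = b * seq_len
        X, Y = [], []
        for r in range(batch_size):
            pos = r * chars_per_row + start
            X.append(T[pos:pos + seq_len])           # inputs for row r
            Y.append(T[pos + 1:pos + seq_len + 1])   # targets: inputs shifted by one
        yield X, Y

T = list(range(40))
batches = list(toy_batches(T, batch_size=2, seq_len=4))
first_X, first_Y = batches[0]
second_X, _ = batches[1]

print(first_X)    # row 0 starts at 0, row 1 starts at 20
print(second_X)   # each row continues where it stopped in the previous batch
```

Row 0 of the second batch picks up exactly where row 0 of the first batch ended, which is what lets a stateful LSTM carry its hidden state from one batch to the next.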

In [None]:
# Checking the size of the vocabulary
vocab_size = len(char_to_idx)
vocab_size

86

----
<a id=section6></a>
# **6. Modelling**
----
- In this step we will build the **deep learning model**.
- We will train the model for **100 epochs.**
- After every **10** epochs we will save the weights to the **model directory** (`MODEL_DIR`).

In [None]:
# Defining a batch size of 16 and sequence length of 64
batch_size = 16
seq_len = 64
batch_sequence = 64

# Getting the model directory
MODEL_DIR = '/content/'

# Defining a function for saving weights
def save_weights(epoch, model):
    if not os.path.exists(MODEL_DIR):
        os.makedirs(MODEL_DIR)
    model.save_weights(os.path.join(MODEL_DIR, 'weights.{}.h5'.format(epoch)))

# Defining a function for loading weights
def load_weights(epoch, model):
    model.load_weights(os.path.join(MODEL_DIR, 'weights.{}.h5'.format(epoch)))

# Defining a function that builds the training model
def build_model(batch_size, seq_len, vocab_size):
    model = Sequential()
    model.add(Embedding(vocab_size,
                        512, 
                        batch_input_shape=(batch_size, seq_len)))
    
    for i in range(3):
        model.add(LSTM(256, return_sequences=True, stateful=True))
        model.add(Dropout(0.2))
    # Applying the same Dense layer to every time step of the returned sequence
    model.add(TimeDistributed(Dense(vocab_size))) 
    model.add(Activation('softmax'))
    return model

In [None]:
# Defining a train function that trains the model and saves the weights every save_freq epochs
def train(text, epochs=100, save_freq=10):

    # Model architecture
    model = build_model(batch_size, batch_sequence, vocab_size)
    model.summary()
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    # Train data generation: convert the complete text into numerical indices
    T = np.asarray([char_to_idx[c] for c in text], dtype=np.int32)

    print("Length of text:" + str(T.size))  # 129,665

    for epoch in range(epochs):
        print('\nEpoch {}/{}'.format(epoch + 1, epochs))
        losses, accs = [], []
        for X, Y in generate_batchs(T, vocab_size):
            loss, acc = model.train_on_batch(X, Y)
            losses.append(loss)
            accs.append(acc)
        # Reporting the mean over all batches of the epoch, not just the last batch
        print('epoch {}: loss = {}, acc = {}'.format(epoch + 1, np.mean(losses), np.mean(accs)))

        if (epoch + 1) % save_freq == 0:
            save_weights(epoch + 1, model)
            print('Saved checkpoint to', 'weights.{}.h5'.format(epoch + 1))

In [None]:
train(input_text)

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (16, 64, 512)             44032     
                                                                 
 lstm (LSTM)                 (16, 64, 256)             787456    
                                                                 
 dropout (Dropout)           (16, 64, 256)             0         
                                                                 
 lstm_1 (LSTM)               (16, 64, 256)             525312    
                                                                 
 dropout_1 (Dropout)         (16, 64, 256)             0         
                                                                 
 lstm_2 (LSTM)               (16, 64, 256)             525312    
                                                                 
 dropout_2 (Dropout)         (16, 64, 256)             0
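The parameter counts in the summary can be checked by hand. The sketch below assumes the standard formulas for an embedding table and for an LSTM layer with a single bias vector per gate; `embedding_params` and `lstm_params` are illustrative helpers, not Keras functions.

```python
def embedding_params(vocab_size, embed_dim):
    # One embed_dim-sized vector per vocabulary entry
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * ((input_dim + units + 1) * units)

print(embedding_params(86, 512))   # embedding layer
print(lstm_params(512, 256))       # first LSTM, fed by the 512-dimensional embedding
print(lstm_params(256, 256))       # second and third LSTM, fed by 256 units
```

These values agree with the 44032, 787456 and 525312 reported in the summary above.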

----
<a id=section7></a>
# **7. Music Generation**
----
- In this step we will generate the **ABC notes**.
- Later on, these notes can be pasted into the <a href="https://www.abcjs.net/abcjs-editor.html">abcjs editor</a>.
- Then we will be able to hear the **corresponding music.**

In [None]:
# Setting the data and the model directories respectively
DATA_DIR = '/content/'
MODEL_DIR = '/content/'

In [None]:
# Save the weights into the respective model directory path with respective model names
def save_weights(epoch, model):
    if not os.path.exists(MODEL_DIR):
        os.makedirs(MODEL_DIR)
    model.save_weights(os.path.join(MODEL_DIR, 'weights.{}.h5'.format(epoch)))

In [None]:
# Loading the weights into the model
def load_weights(epoch, model):
    model.load_weights(os.path.join(MODEL_DIR, 'weights.{}.h5'.format(epoch)))

In [None]:
# Building the test model with the given vocab size = 86 
def build_test_model(vocab_size):
    model = Sequential()
    model.add(Embedding(vocab_size,
                        512,
                        batch_input_shape=(1,1)))
    for i in range(3):
        model.add(LSTM(256,
                       return_sequences=(i != 2),
                       stateful=True))
        model.add(Dropout(0.2))

    model.add(Dense(vocab_size))
    model.add(Activation('softmax'))
    return model

In [None]:
# Generating ABC text with the test model
def generate_music(epoch, num_char):
    with open(os.path.join(DATA_DIR, 'char_to_idx')) as f:
        char_to_idx = json.load(f)

    idx_to_char = dict(zip(char_to_idx.values(), char_to_idx.keys()))
    vocab_size = len(char_to_idx)

    # Building the test model (batch size 1, sequence length 1) and loading the trained weights
    model = build_test_model(vocab_size)
    load_weights(epoch, model)

    sampled = []
    batch = np.zeros((1, 1))
    # Seeding the generation with a random character index
    batch[0, 0] = np.random.randint(vocab_size)
    for i in range(num_char):
        # Predicting the next-character distribution as a flat array
        result = model.predict_on_batch(batch)
        result = np.asarray(result, dtype=np.float64).ravel()
        result /= result.sum()  # guarding against float32 rounding in the softmax output

        # Sampling the next character and feeding it back in as the next input
        sample = np.random.choice(range(vocab_size), p=result)
        sampled.append(idx_to_char[sample])
        batch[0, 0] = sample
    return ''.join(sampled)

In [None]:
generate_music(10, 512)

'?fef dAG|"D"FEF "C"D2:|[2 "Dm"A2E FGG|\\\n"D"G3 cBG|"C"FA2d eec|"D"a3 fc:|\n\n\nX: 1U6\nT:PTee Cher\nogt!ers\n% Nottingham Music Datebase\nS:TKl, via EF\nY:AB\nM:6/8\nK:G\nP:D\nA/2A/2|"D"c2c d2c|"A"dee "D"ded|"D"ddd "G"dBG|"G"BGG G2B|\n"Am"fdd "G"Bcc|"D"A2c bdf|"A"edc "E"Ade|"G"d3 d2:|\nP:B\nf|"G"gba e2d|c^G Bcd|"Dm"cdd e2d|"D"c2A ABc|\n"D"ddd Aiaf|"G"ggg feg|"A"e2c "A""G7"A2c| "G"g2d a2g|"D"f2f a^ga|\n"A7"g3+b2fB/2f/2g geb|"G"g3 -gdd|"BAAB/A"A7/B#7"G2B|"D"dec "A7"c3\n|\n\n\nX: 283\nT:Jory\'d Bancle\n% Nottingham Music Database\nS:Ch'

### **Observation:**

- This is the text **generated by the model** after training on the **input text corpus**.

- The output is in **ABC notation**. You can learn more about **ABC notation** <a href="https://en.wikipedia.org/wiki/ABC_notation">here!</a>

- To hear it, copy the output, paste it <a href="https://www.abcjs.net/abcjs-editor.html">here!</a> and press the **Play button**.
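Since generation starts from a random character, the output usually begins in the middle of a tune. One possible way to tidy it before pasting is to keep only the chunks that start at an `X:` header line. The sketch below is an assumption about how this could be done; `split_tunes` is a hypothetical helper and the `generated` string is a made-up toy example.

```python
def split_tunes(generated):
    """Return the chunks of `generated` that start at a line beginning with 'X:'."""
    tunes, current = [], None
    for line in generated.splitlines():
        if line.startswith('X:'):
            # A new tune header: store the tune collected so far, if any
            if current:
                tunes.append('\n'.join(current).strip())
            current = [line]
        elif current is not None:
            current.append(line)
        # Lines before the first 'X:' header are dropped
    if current:
        tunes.append('\n'.join(current).strip())
    return tunes

# Toy text: a leading fragment, one full tune, and one tune that is cut off
generated = 'dAG|"D"FEF:|\n\n\nX: 1\nT:Toy Tune\nK:G\nA|"G"BGG G2B|\n\n\nX: 2\nT:Cut off\nK:D\n"D"dd'
tunes = split_tunes(generated)
print(len(tunes))
print(tunes[0])
```

The last chunk may still be cut off at the character limit, so it is worth checking it by eye before playing it.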


----
<a id=section8></a>
# **8. Applications**
----

- This model can be used for in-house **music production systems.**
- This can be widely used to automate **manual instruments.**
- This can also be used to build **automatic VST (Virtual Studio Technology) plugins**.

----
<a id=section9></a>
# **9. Limitations**
----
- The main limitation of this model is that it was trained on **very little data.**

- However, if it is trained on data from various instruments, the model can be extended to **different instruments** as well.

- We have **trained this model for only 100 epochs**. As the **number of epochs increases**, the **training accuracy is expected to increase**, although more epochs also raise the risk of overfitting a corpus this small.

----
<a id=section10></a>
# **10. Conclusion**
----

- In this project an **automatic music generation system** was built from scratch.
- Here, we received an **accuracy of 92%**; note that the code above measures accuracy on the training batches only, since no separate validation split is used.
- This project can be widely used for **music production systems**.
- The main limitation of this model is that it was trained on **very little data.**
- However, if it is trained on data from various instruments, the model can be extended to different instruments as well.
