# Music Generation Project

## Problem Statement

This project takes the perspective of a data scientist at a record company that relies on music samples. The accounting department is worried about the rapidly rising cost of clearing samples for artists' beats, and wants to know whether a model can generate new music from existing jazz and classical works. Generated material would give the company a larger in-house collection to sample from, and if the approach works it can be scaled up to other genres. This could help the finances of the business by reducing the expenditures that result from clearing samples. Success is defined in part by an objective measure: minimizing loss when training these models. Another factor is how the music is perceived, which can also be judged against objective criteria such as rhythm or patterns of notes that sound dynamic.

## Background Research

### Music
The art of music concerns itself with combining sounds, usually from conventional instruments, to achieve beauty of form and/or emotional expression in accordance with standards set by a culture. The origin of music is disputed, mainly over whether it emerged before, after, or alongside language. While we cannot be sure of its origin, there is no denying its significance as a cultural pillar throughout human history.

### Music Theory
Music is often studied through the lens of mathematics. For example, the strings of a string instrument vibrate at particular frequencies. In music theory, whole numbers have long been used to label pitches, such as the keys of a piano. Music theory also helped legendary composers like Bach, Mozart, and Beethoven by supplying a framework for pattern recognition, which made it easier to make sense of the notes and commit them to memory. These two principles, mapping sounds to discrete numbers and recognizing patterns, are part of what makes it possible to generate music through machine learning.
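
To illustrate how mapping notes to whole numbers and looking for patterns might feed a sequence model, here is a minimal sketch. The helper names `build_vocabulary` and `make_windows`, the toy melody, and the window length are assumptions made for illustration; they are not taken from this project's code.

```python
from typing import Dict, List, Sequence, Tuple


def build_vocabulary(notes: Sequence[str]) -> Dict[str, int]:
    """Assign each distinct note name a whole-number index (sorted for stability)."""
    return {name: index for index, name in enumerate(sorted(set(notes)))}


def make_windows(
    notes: Sequence[str], vocabulary: Dict[str, int], window: int
) -> Tuple[List[List[int]], List[int]]:
    """Slide a fixed-length window over the melody.

    Each input is `window` consecutive note indices and the target is the
    index of the note that follows, which is the pattern a model would learn.
    """
    encoded = [vocabulary[name] for name in notes]
    inputs, targets = [], []
    for start in range(len(encoded) - window):
        inputs.append(encoded[start:start + window])
        targets.append(encoded[start + window])
    return inputs, targets


# Hypothetical toy melody, not taken from the project's dataset.
melody = ["C4", "E4", "G4", "E4", "C4", "E4", "G4"]
vocab = build_vocabulary(melody)
X, y = make_windows(melody, vocab, window=3)
print(vocab)             # {'C4': 0, 'E4': 1, 'G4': 2}
print(X[0], "->", y[0])  # [0, 1, 2] -> 1
```

Each row of `X` is a short run of notes and the matching entry of `y` is the note that came next, so repeated patterns in the melody show up as repeated input and target pairs.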

### Pitch
Pitch is the perceptual property of sounds that allows them to be ordered on a frequency-related scale; it lets us judge how "high" or "low" a sound is within a melody.  
Humans tend to recognize relative relationships, not absolute physical values, and those relationships (especially in the aural domain) tend to be logarithmic. That is, we perceive not the difference (subtraction) between two frequencies but their ratio (division).
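
The ratio-based view of pitch can be checked with a few lines of arithmetic. The sketch below assumes twelve-tone equal temperament with A4 tuned to 440 Hz and uses standard MIDI note numbers; neither assumption is stated in this project, and the function names are illustrative.

```python
import math

A4_MIDI = 69   # MIDI note number of the A above middle C
A4_HZ = 440.0  # assumed tuning reference


def midi_to_frequency(note_number: int) -> float:
    """Frequency in Hz of a MIDI note under equal temperament."""
    return A4_HZ * 2.0 ** ((note_number - A4_MIDI) / 12.0)


def interval_in_semitones(f_low: float, f_high: float) -> float:
    """Perceived distance between two frequencies, measured on a log scale."""
    return 12.0 * math.log2(f_high / f_low)


# A3, A4 and A5 are each one octave apart.
a3, a4, a5 = (midi_to_frequency(n) for n in (57, 69, 81))

print(a4 - a3, a5 - a4)  # 220.0 440.0 (the differences in Hz are unequal)
print(interval_in_semitones(a3, a4), interval_in_semitones(a4, a5))  # 12.0 12.0
```

Both octaves sound like the same step even though the second spans twice as many hertz, because the ratio between the frequencies is 2 in each case.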

### Musical notes
In music, a note is a symbol denoting a musical sound.  Notes can represent the pitch and duration of a sound in musical notation.  Notes can be thought of as the building blocks for written music, and help with its comprehension and analysis.

## Software Requirements
* NumPy
* Matplotlib
* scikit-learn
* Keras
* Music21
* MuseScore 3 (in conjunction with Music21, enables the notes to be displayed in musical sheet format)

### Music21
The project uses the music21 library to parse the notes in the dataset. It is also useful for exploratory data analysis (EDA) of musical concepts, such as the distribution of frequencies and the proportion of times a note is used within a given measure.
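
As a rough illustration of this kind of parsing and EDA, the sketch below reads one MIDI file with music21 and computes how often each pitch occurs. The helper names are hypothetical, the proportions are taken over the whole file rather than per measure, and `flatten()` assumes music21 v7 or later (older releases expose `.flat` instead).

```python
from collections import Counter
from typing import Dict, Iterable, List


def extract_pitch_names(midi_path: str) -> List[str]:
    """Return pitch names (e.g. 'C4') for every note and chord tone in a MIDI file."""
    # Imported lazily so the counting helper below works without music21 installed.
    from music21 import chord, converter, note

    score = converter.parse(midi_path)
    names: List[str] = []
    for element in score.flatten().notes:
        if isinstance(element, note.Note):
            names.append(element.pitch.nameWithOctave)
        elif isinstance(element, chord.Chord):
            # A chord contributes each of its pitches separately.
            names.extend(p.nameWithOctave for p in element.pitches)
    return names


def pitch_proportions(names: Iterable[str]) -> Dict[str, float]:
    """Fraction of all notes accounted for by each pitch name."""
    counts = Counter(names)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


# Toy input standing in for the output of extract_pitch_names().
print(pitch_proportions(["C4", "E4", "C4", "G4"]))
```

With a real file, the two helpers would be chained as `pitch_proportions(extract_pitch_names(path))`, where `path` points at one of the downloaded MIDI files.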

## Description of Dataset

The dataset consists of MIDI files from three sources. The jazz files were scraped from https://bushgrafts.com/midi/. Although 301 files were scraped, some of them would need to be reformatted before the music21 library could process them, which leaves 214 jazz MIDI files to work with. The dataset also includes 295 classical MIDI files found on Kaggle at https://www.kaggle.com/soumikrakshit/classical-music-midi, along with 400 other piano MIDI files that were supplied for this project.
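
One way to separate files that parse cleanly from those that need reformatting is sketched below. This is an assumption about how such a filter could be written, not the procedure actually used to arrive at the 214 files; the function names and the `midi_files` directory are hypothetical.

```python
from pathlib import Path
from typing import Any, Callable, Iterable, List, Tuple


def music21_parse(path: Path) -> Any:
    """Default parser: load the file with music21 (imported lazily)."""
    from music21 import converter

    return converter.parse(str(path))


def partition_parseable(
    paths: Iterable[Path], parse: Callable[[Path], Any] = music21_parse
) -> Tuple[List[Path], List[Path]]:
    """Split paths into (usable, rejected) by whether `parse` raises an error."""
    usable: List[Path] = []
    rejected: List[Path] = []
    for path in paths:
        try:
            parse(path)
        except Exception:
            # Any parsing failure marks the file as needing attention.
            rejected.append(path)
        else:
            usable.append(path)
    return usable, rejected


# Example call (hypothetical directory name):
# usable, rejected = partition_parseable(sorted(Path("midi_files").glob("*.mid")))
# print(len(usable), "usable;", len(rejected), "need reformatting")
```

Passing the parser in as an argument keeps the filter easy to try out with a stand-in function before running it over several hundred real files.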

## MIDI Webscraper

In [8]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import mido 
from bs4 import BeautifulSoup
import json
import requests
import time
import urllib

## Scraping the Site for Jazz MIDI Files

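In [None]:
url = 'https://bushgrafts.com/midi/'

res = requests.get(url)

print(res.status_code)  # 200 means the page was fetched successfully

soup = BeautifulSoup(res.content, 'lxml')  # the 'lxml' parser requires the lxml package

links = soup.find_all('a')  # every anchor tag on the page, MIDI or not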
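
The anchors collected above include every link on the page, and some `href` values on a site may be relative. The sketch below shows one possible way to keep only MIDI links and turn them into absolute URLs using the standard library. The helper name `midi_urls` and the sample links are hypothetical, and the suffix check is stricter than a plain `'.mid' in href` test.

```python
from typing import Iterable, List
from urllib.parse import urljoin, urlparse


def midi_urls(base_url: str, hrefs: Iterable[str]) -> List[str]:
    """Return absolute, de-duplicated URLs whose path ends in .mid or .midi."""
    seen = set()
    urls: List[str] = []
    for href in hrefs:
        # urljoin leaves absolute links unchanged and resolves relative ones.
        absolute = urljoin(base_url, href)
        path = urlparse(absolute).path.lower()
        if not path.endswith((".mid", ".midi")):
            continue
        if absolute not in seen:
            seen.add(absolute)
            urls.append(absolute)
    return urls


# Hypothetical links for illustration only.
sample = ["song.mid", "https://example.com/a/Tune.MID", "about.html", "song.mid"]
print(midi_urls("https://example.com/midi/", sample))
# ['https://example.com/midi/song.mid', 'https://example.com/a/Tune.MID']
```

With the variables from the cell above, the call would look like `midi_urls(url, (a.get('href', '') for a in links))`.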

### For loop to scrape

In [10]:
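i = 0
for link in links:
    href = link.get('href', '')  # anchors without an href yield an empty string
    if '.mid' in href:
        i += 1
        print("Downloading file: ", i)

        response = requests.get(href)

        # The context manager closes the file even if the write fails.
        with open('midi' + str(i) + '.mid', 'wb') as midi:
            midi.write(response.content)
        print(f'File, {i}, downloaded')
        time.sleep(12)  # pause between requests to avoid overloading the server
print('All midis downloaded')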
#code influenced by:
#https://www.geeksforgeeks.org/downloading-pdfs-with-python-using-requests-and-beautifulsoup/

Downloading file:  1
File, 1, downloaded
Downloading file:  2
File, 2, downloaded
Downloading file:  3
File, 3, downloaded
Downloading file:  4
File, 4, downloaded
Downloading file:  5
File, 5, downloaded
Downloading file:  6
File, 6, downloaded
Downloading file:  7
File, 7, downloaded
Downloading file:  8
File, 8, downloaded
Downloading file:  9
File, 9, downloaded
Downloading file:  10
File, 10, downloaded
Downloading file:  11
File, 11, downloaded
Downloading file:  12
File, 12, downloaded
Downloading file:  13
File, 13, downloaded
Downloading file:  14
File, 14, downloaded
Downloading file:  15
File, 15, downloaded
Downloading file:  16
File, 16, downloaded
Downloading file:  17
File, 17, downloaded
Downloading file:  18
File, 18, downloaded
Downloading file:  19
File, 19, downloaded
Downloading file:  20
File, 20, downloaded
Downloading file:  21
File, 21, downloaded
Downloading file:  22
File, 22, downloaded
Downloading file:  23
File, 23, downloaded
Downloading file:  24
File, 2