Our goal is to build a model that takes data about a song and determines whether the song belongs to a particular genre.


In [None]:
import pandas as pd

## Cleaning up the data

In [None]:
df = pd.read_csv('/content/dataset.csv')
sorted(df)

From here, we have to decide which of these characteristics of a song are most useful in determining its genre and drop the ones that aren't.

In [None]:
df = df.dropna(axis=0)
df = df.drop(['Unnamed: 0', 'album_name', 'artists', 'explicit', 'track_name', 'mode', 'duration_ms', 'popularity', 'track_id'], axis=1)

Of the genres in our dataset, we will pick Indian. It's a fairly broad label, but let's see what happens! You could also follow these steps with artists instead of genres, though you would have to be careful to pick an artist with a distinctive style of music.

In [None]:
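# Label each track 1 if its genre is 'indian' and 0 otherwise.
# Only the label column changes; the feature columns keep their values.
df['track_genre'] = (df['track_genre'] == 'indian').astype(int)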
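
As a quick check of the labelling step, here is a minimal sketch on a made-up three-row table. The `toy` name and its values are invented for illustration and are not taken from the dataset. It shows that only the label column should change while the feature columns keep their values.

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset
toy = pd.DataFrame({
    "tempo": [120.0, 95.5, 140.2],
    "energy": [0.8, 0.4, 0.9],
    "track_genre": ["indian", "rock", "indian"],
})

# 1 where the genre is 'indian', 0 everywhere else
toy["track_genre"] = (toy["track_genre"] == "indian").astype(int)

print(toy["track_genre"].tolist())  # the labels, now 0/1
print(toy["tempo"].tolist())        # the features, unchanged
```

If the feature columns came out as all zeros for the non-matching rows, the label would have leaked into the inputs and the model would have nothing real to learn from.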

## Splitting the dataset

In [None]:
y = df.pop('track_genre')

In [None]:
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

X = df

# Convert the features and labels to float32 NumPy arrays
X = np.asarray(X).astype(np.float32)
y = np.asarray(y).astype(np.float32)

# y is the genre label and X is every other trait; 30% of the data is held out for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)


xtens = tf.convert_to_tensor(X_train, dtype=tf.float32)
ytens = tf.convert_to_tensor(y_train, dtype=tf.float32)

```test_size=0.3``` specifies that 30% of the data is held out for testing the model, while the remaining 70% is used to train it.
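
To see what the split does to the row counts, here is a small sketch on invented data. `X_toy`, `y_toy`, and the `random_state=0` argument are assumptions added for illustration; the original call does not fix a seed.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 songs with 2 features each, plus a 0/1 label
X_toy = np.arange(20, dtype=np.float32).reshape(10, 2)
y_toy = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=np.float32)

# random_state makes the shuffle repeatable between runs
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.3, random_state=0
)

print(X_tr.shape, X_te.shape)  # 7 rows for training, 3 for testing
```

Passing `stratify=y_toy` as well would keep the share of each label roughly the same in both parts, which can help when one label is much rarer than the other.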

```tf.convert_to_tensor()``` converts the data in ```X_train``` and ```y_train``` into tensors that can be passed to a TensorFlow model.



In [None]:
# Setting up the model
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(10, activation="relu"))
model.add(tf.keras.layers.Dense(1, activation="sigmoid"))


# Training the model
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(xtens, ytens, epochs=100)

```binary_crossentropy``` is appropriate for our model because this is a binary classification problem: a song either belongs to the genre (1) or it does not (0).
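
For intuition about what this loss measures, here is a hand-rolled sketch of the standard binary cross-entropy formula in plain Python. The function and variable names are made up for illustration. It is not the Keras implementation, and it assumes every predicted probability is strictly between 0 and 1.

```python
import math

def binary_crossentropy(y_true, y_pred):
    """Average binary cross-entropy over 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Penalise low probability on the true class
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Same two songs, labelled 1 and 0, with good and bad predictions
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])

print(confident_right, confident_wrong)
```

Confident correct predictions give a loss close to 0, while confident wrong ones are penalised heavily. This is why the sigmoid output, which is a probability, pairs naturally with this loss.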

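```metrics=["accuracy"]``` tells Keras to report accuracy, the fraction of songs classified correctly, while training. The metric is only reported; the loss is what the optimizer actually minimizes.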
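
As a rough illustration of the accuracy figure, the sketch below turns predicted probabilities into 0/1 predictions and counts the matches. The 0.5 threshold, the function name, and the sample values are assumptions made for the example.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    predictions = [1 if p > threshold else 0 for p in y_prob]
    correct = sum(1 for y, pred in zip(y_true, predictions) if y == pred)
    return correct / len(y_true)

# Hypothetical labels and sigmoid outputs for four songs
labels = [1, 0, 0, 1]
probs = [0.8, 0.3, 0.6, 0.4]

print(accuracy(labels, probs))
```

Keep in mind that when only a small share of songs carry the label, a model that always predicts 0 can still score a high accuracy, so it is worth looking at the predictions themselves too.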

In [None]:
tf.saved_model.save(model, '/content/model.saved_model')