<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:#5642C5;
           font-size:125%;
           font-family:Verdana;
           letter-spacing:0.5px">

<h1 style="padding: 14px !important;
          color:white; text-align:center;" id="firstHeading">Mushroom Classification 🍄</h1>
    <a class="anchor-link" href="https://www.kaggle.com/ankits29/mushroom-classification-using-ann/notebook#firstHeading">¶</a>
</div>

![Mushroom anatomy](https://image.freepik.com/free-vector/mushroom-anatomy-labeled-biology-illustration_1995-566.jpg)

Mushrooms are fungi. They belong in a kingdom of their own, separate from plants and animals. Fungi differ from plants and animals in the way they obtain their nutrients. Generally, plants make their food using the sun's energy (photosynthesis), while animals eat, then internally digest, their food. Fungi do neither: their mycelium grows into or around the food source, secretes enzymes that digest the food externally, and the mycelium then absorbs the digested nutrients.

In this kernel, you will look at various properties of a mushroom and predict whether it is edible or not.

In [None]:
import pandas as pd
import plotly.graph_objects as go

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:#5642C5;
           font-size:125%;
           font-family:Verdana;
           letter-spacing:0.5px">

<h1 style="padding: 14px !important;
              color:white; text-align:center;" id="loading"> Loading The Data</h1>
    <a class="anchor-link" href="https://www.kaggle.com/ankits29/mushroom-classification-using-ann/notebook#loading">¶</a>
</div>

In [None]:
df = pd.read_csv('/kaggle/input/mushroom-classification/mushrooms.csv')
df.head()

In [None]:
df.shape

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:#5642C5;
           font-size:125%;
           font-family:Verdana;
           letter-spacing:0.5px">

<h1 style="padding: 14px !important;
              color:white; text-align:center;" id="visualize">Visualizing Data 📊</h1>
    <a class="anchor-link" href="https://www.kaggle.com/ankits29/mushroom-classification-using-ann/notebook#visualize">¶</a>
</div>

In [None]:
fig = go.Figure()
# Count of each 'bruises' value among edible mushrooms
fig.add_trace(go.Histogram(
    x=df.loc[df['class'] == 'e', 'bruises'],
    histfunc="count",
    name='Edible',
    marker_color='#EB89B5',
))
# Count of each 'bruises' value among poisonous mushrooms
fig.add_trace(go.Histogram(
    x=df.loc[df['class'] == 'p', 'bruises'],
    histfunc="count",
    name='Poisonous',
    marker_color='#330C73',
))
fig.update_layout(
    title_text='Histogram of Bruises with Class',
    xaxis_title_text='Value',
    yaxis_title_text='Count',
)
fig.show()

In [None]:
df['class'] = df['class'].map({'p': 1, 'e': 0})

In [None]:
class_by_population = df.groupby(['population'])['class'].value_counts(normalize=True).unstack()
class_by_population = class_by_population.sort_values(by=1, ascending=False)
fig = go.Figure(data=[
    go.Bar(name='Poisonous', x=class_by_population.index, y=class_by_population[1]),
    go.Bar(name='Edible', x=class_by_population.index, y=class_by_population[0])
])
# Change the bar mode
fig.update_layout(barmode='stack')
fig.show()

In [None]:
class_by_habitat = df.groupby(['habitat'])['class'].value_counts(normalize=True).unstack()
class_by_habitat = class_by_habitat.sort_values(by=1, ascending=False)
fig = go.Figure(data=[
    go.Bar(name='Poisonous', x=class_by_habitat.index, y=class_by_habitat[1]),
    go.Bar(name='Edible', x=class_by_habitat.index, y=class_by_habitat[0])
])
# Change the bar mode
fig.update_layout(barmode='stack')
fig.show()

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:#5642C5;
           font-size:125%;
           font-family:Verdana;
           letter-spacing:0.5px">

<h1 style="padding: 14px !important;
              color:white; text-align:center;" id="preprocess">Preprocessing</h1>
    <a class="anchor-link" href="https://www.kaggle.com/ankits29/mushroom-classification-using-ann/notebook#preprocess">¶</a>
</div>

First, split the target from the independent features.

In [None]:
y = df.loc[:,'class'].values
X = df.drop(['class'], axis=1)

All the features here are categorical, so you can one-hot encode them. Keep in mind that this produces a lot of features. Let's see if an ANN can handle that.

In [None]:
from sklearn.preprocessing import OneHotEncoder

encoder = OneHotEncoder(drop='first')
X = encoder.fit_transform(X)

In [None]:
X.shape

Whoa! We have 95 features after encoding.
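
The column count follows from the number of categories per feature: with `drop='first'`, each feature contributes its category count minus one, so a feature with a single category contributes nothing. Here is a minimal sketch in plain Python. The toy rows are made up for illustration and `encoded_width` is a hypothetical helper, not part of scikit-learn.

```python
# Toy rows standing in for the mushroom table (values are made up).
rows = [
    {"cap-shape": "x", "bruises": "t", "veil-type": "p"},
    {"cap-shape": "b", "bruises": "f", "veil-type": "p"},
    {"cap-shape": "s", "bruises": "t", "veil-type": "p"},
]


def encoded_width(rows, drop_first=True):
    """Count the columns that one-hot encoding would produce."""
    width = 0
    for col in rows[0].keys():
        n_categories = len({row[col] for row in rows})
        # Dropping the first category removes one column per feature.
        width += n_categories - 1 if drop_first else n_categories
    return width


width_dropped = encoded_width(rows)                  # (3-1) + (2-1) + (1-1)
width_full = encoded_width(rows, drop_first=False)   # 3 + 2 + 1
print(width_dropped, width_full)
```

Dropping one category per feature removes a redundant column, since its value is implied by the remaining ones.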

You can now create your train, test and validation splits.

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, shuffle=True)

In [None]:
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.15, shuffle=True)
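
Because the second split is taken from what remains after the first, the validation set is 15% of the remaining 85%, not 15% of the full dataset. This quick sketch of the arithmetic shows the resulting shares (the variable names are illustrative only):

```python
test_fraction = 0.15               # first split: share held out for testing
val_fraction_of_remainder = 0.15   # second split: share of the remainder

remaining = 1.0 - test_fraction
val_share = remaining * val_fraction_of_remainder
train_share = remaining - val_share

print(f"train={train_share:.4f}, val={val_share:.4f}, test={test_fraction:.4f}")
```

Roughly 72% of the rows end up in the training set, 13% in the validation set and 15% in the test set.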

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:#5642C5;
           font-size:125%;
           font-family:Verdana;
           letter-spacing:0.5px">

<h1 style="padding: 14px !important;
              color:white; text-align:center;" id="creation">Model Creation</h1>
    <a class="anchor-link" href="https://www.kaggle.com/ankits29/mushroom-classification-using-ann/notebook#creation">¶</a>
</div>

In [None]:
from tensorflow import keras

Usually, most layers in an ANN share the same activation function and other parameters. Instead of repeating those parameters for every layer, you can use `partial` from the `functools` module, a higher-order function that returns a new callable with the given arguments preset as defaults.

In [None]:
from functools import partial

MyDense = partial(keras.layers.Dense,
                 activation="selu",
                 kernel_initializer="lecun_normal")
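
To see what `partial` does without involving Keras, here is a small sketch. `describe_layer` is a hypothetical stand-in for `keras.layers.Dense` that only records the settings it receives.

```python
from functools import partial


def describe_layer(units, activation="linear", kernel_initializer="glorot_uniform"):
    """Hypothetical stand-in for a layer constructor: returns its settings."""
    return {
        "units": units,
        "activation": activation,
        "kernel_initializer": kernel_initializer,
    }


# Preset the keyword arguments once, as MyDense does.
my_layer = partial(describe_layer,
                   activation="selu",
                   kernel_initializer="lecun_normal")

hidden = my_layer(20)                       # uses the preset defaults
output = my_layer(1, activation="sigmoid")  # a preset can still be overridden

print(hidden)
print(output)
```

Keyword arguments passed at call time take precedence over the presets, which is why the output layer can switch to a different activation later on.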

You can use the *SELU activation* function here: the problem is simple for an ANN, so a plain stack of dense layers will do. SELU is a scaled version of the ELU activation function and provides a *self-normalization* effect, so you don't have to add batch normalization separately. Self-normalization is only guaranteed for sequential architectures, that is, plain stacks of dense layers without skip connections or recurrent layers.

The SELU activation function should be used with the *LeCun initialization* technique. This combination can speed up training considerably.
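
For reference, here is a minimal sketch of the SELU function and of the standard deviation that LeCun normal initialization uses. The constants are the standard SELU values, and the function names are illustrative rather than Keras APIs.

```python
import math

# Standard SELU constants.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled ELU: linear for positive inputs, scaled exponential otherwise."""
    if x > 0:
        return SELU_SCALE * x
    return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)


def lecun_normal_std(fan_in):
    """Standard deviation of LeCun normal initialization: sqrt(1 / fan_in)."""
    return math.sqrt(1.0 / fan_in)


print(selu(1.0), selu(-1.0))
print(lecun_normal_std(95))  # 95 input features after encoding
```

Negative inputs saturate towards `-SELU_SCALE * SELU_ALPHA`, which helps keep activations centred around zero as they pass through the layers.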

While creating the model, you can also add a *dropout layer* between the dense layers to reduce overfitting to the training set.

For the final layer, you have to use the *sigmoid activation* function, as this is a binary classification problem.
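
The sigmoid squashes the final layer's raw output into a value between 0 and 1, which can be read as the probability of class 1 (poisonous, given the mapping used earlier). The sketch below assumes a decision threshold of 0.5, and `predict_label` is a hypothetical helper.

```python
import math


def sigmoid(z):
    """Map a raw score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(z, threshold=0.5):
    """Return 1 (poisonous) or 0 (edible); the 0.5 threshold is an assumption."""
    return 1 if sigmoid(z) >= threshold else 0


for score in (-2.0, 0.0, 2.0):
    print(score, round(sigmoid(score), 4), predict_label(score))
```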

You can also use callbacks to speed up the training process further.

The *EarlyStopping* callback stops training when the model's performance on the validation set has not improved for a number of consecutive epochs. The `patience` parameter sets that number of epochs. The callback can also restore the weights of the best model at the end of training.
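
The patience logic can be sketched in a few lines of plain Python. This is a simplified illustration, not the Keras implementation: it ignores options such as `min_delta`, and `early_stopping_epoch` is a hypothetical helper.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (epoch at which training stops, epoch with the best loss)."""
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # Improvement: remember it and reset the counter.
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience never ran out: training ran for all epochs.
    return len(val_losses) - 1, best_epoch


# Made-up validation losses for illustration.
losses = [0.50, 0.30, 0.20, 0.21, 0.22, 0.25, 0.19]
stopped_at, best_at = early_stopping_epoch(losses, patience=3)
print(stopped_at, best_at)
```

With `restore_best_weights=True`, the model would be rolled back to the weights from `best_at` rather than keeping those from `stopped_at`.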

The learning rate can also be adjusted during training to speed it up. The scheduler used here multiplies the learning rate by 0.5 every time the validation loss has not improved for 5 consecutive epochs. This type of scheduling, which depends on the model's performance on the validation set, is called *performance scheduling*.
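
The sketch below shows how such a schedule behaves. It is a simplified illustration that ignores the callback's `min_delta`, `cooldown` and `min_lr` options; `schedule_learning_rates` is a hypothetical helper and the loss values are made up.

```python
def schedule_learning_rates(val_losses, initial_lr=0.001, factor=0.5, patience=5):
    """Return the learning rate used in each epoch under performance scheduling."""
    lr = initial_lr
    best = float("inf")
    wait = 0
    lrs = []
    for loss in val_losses:
        lrs.append(lr)  # rate in effect during this epoch
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor  # plateau detected: shrink the learning rate
                wait = 0
    return lrs


# Two improving epochs followed by a plateau.
losses = [0.30, 0.20] + [0.25] * 6
lrs = schedule_learning_rates(losses)
print(lrs)
```

The rate stays at its initial value until five epochs in a row fail to beat the best loss, then it is halved for the following epoch.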

Finally, the optimizer used here to train the model is the *Nadam optimizer*, which combines *Adam optimization* with the *Nesterov trick*. It is generally considered to converge a little faster than the plain Adam optimizer.

In [None]:
input_layer = keras.layers.Input(shape=X_train.shape[1:])
hidden1 = MyDense(20)(input_layer)
hidden2 = MyDense(10)(hidden1)
dropout = keras.layers.Dropout(rate=0.2)(hidden2)
output = MyDense(1, activation="sigmoid", kernel_initializer="uniform")(dropout)

model = keras.models.Model(inputs=[input_layer], outputs=[output])

early_stopping_cb = keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)
lr_scheduler_cb = keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=5)
optimizer = keras.optimizers.Nadam(learning_rate=0.001, beta_1=0.9, beta_2=0.999)

model.compile(optimizer=optimizer, loss="binary_crossentropy", metrics=["accuracy"])

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:#5642C5;
           font-size:125%;
           font-family:Verdana;
           letter-spacing:0.5px">

<h1 style="padding: 14px !important;
              color:white; text-align:center;" id="training">Model Training</h1>
    <a class="anchor-link" href="https://www.kaggle.com/ankits29/mushroom-classification-using-ann/notebook#training">¶</a>
</div>

In [None]:
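# Pass the callbacks defined above; otherwise early stopping and LR scheduling never run
history = model.fit(X_train, y_train, epochs=50, batch_size=32,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stopping_cb, lr_scheduler_cb])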

In [None]:
model.evaluate(X_test, y_test)

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 7))
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

Notice how quickly the model converges. This is probably due to the Nadam optimizer and the learning rate scheduling.

<div style="color:green;
           font-size:150%;
           font-family:cursive;
           letter-spacing:0.5px">

<p style="padding: 14px !important; text-align:center; color:green; font-family:cursive;">Hope you found the notebook interesting. This was my first notebook on deep learning. If you took something away from this, please upvote the notebook as it encourages me to create more such notebooks.</p>
    <p style="padding: 14px !important; text-align:center; color:green; font-family:cursive;">Suggestions are always welcome.</p>
</div>