### Problem Statement

This dataset contains descriptions of hypothetical samples corresponding to 23 species
of gilled mushrooms in the Agaricus and Lepiota family, drawn from The Audubon Society
Field Guide to North American Mushrooms (1981). Each species is labelled as
definitely edible, definitely poisonous, or of unknown edibility and not recommended. This last
category was merged with the poisonous category. The Guide states clearly that
there is no simple rule for judging a mushroom's edibility, nothing like "leaflets three, let it
be" for Poisonous Oak and Ivy.

### Objective
The main goal is to predict whether a given mushroom is poisonous or edible.

### Dataset Link:
https://www.kaggle.com/datasets/uciml/mushroom-classification

### Attribute Information: 

- **`classes:`** edible = e, poisonous = p

- **`cap-shape:`** bell = b, conical = c, convex = x, flat = f, knobbed = k, sunken = s

- **`cap-surface:`** fibrous = f, grooves = g, scaly = y, smooth = s
 
- **`cap-color:`** brown = n, buff = b, cinnamon = c, gray = g, green = r, pink = p, purple = u, red = e, white = w, yellow = y
 
- **`bruises:`** bruises = t, no = f
 
- **`odor:`** almond = a, anise = l, creosote = c, fishy = y, foul = f, musty = m, none = n, pungent = p, spicy = s
 
- **`gill-attachment:`** attached = a, descending = d, free = f, notched = n
 
- **`gill-spacing:`** close = c, crowded = w, distant = d
 
- **`gill-size:`** broad = b, narrow = n
 
- **`gill-color:`** black = k, brown = n, buff = b, chocolate = h, gray = g, green = r, orange = o, pink = p, purple = u, red = e, white = w, yellow = y
 
- **`stalk-shape:`** enlarging = e, tapering = t
 
- **`stalk-root:`** bulbous = b, club = c, cup = u, equal = e, rhizomorphs = z, rooted = r, missing = ?
 
- **`stalk-surface-above-ring:`** fibrous = f, scaly = y, silky = k, smooth = s
 
- **`stalk-surface-below-ring:`** fibrous = f, scaly = y, silky = k, smooth = s
 
- **`stalk-color-above-ring:`** brown = n, buff = b, cinnamon = c, gray = g, orange = o, pink = p, red = e, white = w, yellow = y
 
- **`stalk-color-below-ring:`** brown = n, buff = b, cinnamon = c, gray = g, orange = o, pink = p, red = e, white = w, yellow = y
 
- **`veil-type:`** partial = p, universal = u
 
- **`veil-color:`** brown = n, orange = o, white = w, yellow = y
 
- **`ring-number:`** none = n, one = o, two = t
 
- **`ring-type:`** cobwebby = c, evanescent = e, flaring = f, large = l, none = n, pendant = p, sheathing = s, zone = z
 
- **`spore-print-color:`** black = k, brown = n, buff = b, chocolate = h, green = r, orange = o, purple = u, white = w, yellow = y
 
- **`population:`** abundant = a, clustered = c, numerous = n, scattered = s, several = v, solitary = y
 
- **`habitat:`** grasses = g, leaves = l, meadows = m, paths = p, urban = u, waste = w, woods = d

→ The stalk-root feature encodes missing values as the character `?`.
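
The single-letter codes listed above can be turned back into readable labels with a plain dictionary lookup. The following is a minimal sketch for two of the features only. The toy rows are hypothetical and exist just so the snippet runs on its own, and the column name `class` is an assumption (the attribute list calls it `classes`), so adjust it to whatever the loaded file actually uses.

```python
import pandas as pd

# Code-to-label maps copied from the attribute list above (two features shown).
CLASS_LABELS = {"e": "edible", "p": "poisonous"}
ODOR_LABELS = {
    "a": "almond", "l": "anise", "c": "creosote", "y": "fishy", "f": "foul",
    "m": "musty", "n": "none", "p": "pungent", "s": "spicy",
}

# Hypothetical toy rows, NOT taken from the real dataset.
# The column name "class" is an assumption; check dataset.columns first.
toy = pd.DataFrame({"class": ["p", "e", "e"], "odor": ["p", "a", "n"]})

# Replace each code with its human-readable label.
decoded = toy.copy()
decoded["class"] = decoded["class"].map(CLASS_LABELS)
decoded["odor"] = decoded["odor"].map(ODOR_LABELS)

print(decoded)
```

Any code that is absent from the dictionary becomes `NaN` after `map`, which makes unexpected codes easy to spot.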

In [None]:
# Importing required packages

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

In [None]:
# Reading the data
pd.read_csv('https://github.com/iamaakashpal/dataset/raw/main/mushroom_dataset.zip')

In [None]:
# Storing data into dataset variable
dataset = pd.read_csv('https://github.com/iamaakashpal/dataset/raw/main/mushroom_dataset.zip')

In [None]:
# Checking size of the dataset
dataset.shape

In [None]:
# Displaying 5 random records of the dataset
dataset.sample(5)

In [None]:
# Displaying Top 5 records of the dataset
dataset.head()

In [None]:
# Displaying Last 5 records of the dataset
dataset.tail()

In [None]:
# Checking the data type of each feature
dataset.info()

- All the features in the dataset are **`categorical`**.
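
One way to confirm this programmatically, rather than by reading the `info()` output, is to list the non-numeric columns and count the distinct levels in each. This is only a sketch on a hypothetical toy frame: the values below are made up for illustration and say nothing about the level counts in the real dataset.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical toy frame with a few of the column names from the attribute list.
toy = pd.DataFrame({
    "cap-shape": ["x", "b", "x"],
    "odor": ["p", "a", "n"],
    "veil-type": ["p", "p", "p"],
})

# Columns whose dtype is not numeric are treated as categorical here.
non_numeric = [col for col in toy.columns if not is_numeric_dtype(toy[col])]
all_categorical = len(non_numeric) == len(toy.columns)

# Number of distinct levels per column; a column with a single level
# carries no information for a classifier.
n_levels = toy.nunique()

print(all_categorical)
print(n_levels)
```

Running the same two checks on `dataset` would show whether every column is non-numeric and how many levels each feature takes.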

In [None]:
# Checking for missing values
dataset.isnull().sum()

→ `isnull()` reports no missing values, but the stalk-root feature encodes missing entries as the literal value `?`, which we will investigate further.
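
Because `?` is an ordinary string, `isnull()` cannot see it. A minimal sketch of how the placeholder could be counted and converted to a real missing value is shown here on a hypothetical toy column; the counts are not those of the actual dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column; the real stalk-root values are not reproduced here.
toy = pd.DataFrame({"stalk-root": ["b", "?", "e", "?", "c"]})

# isnull() finds nothing, because "?" is a regular string.
n_null_before = int(toy["stalk-root"].isnull().sum())

# Count the placeholder explicitly.
n_placeholder = int((toy["stalk-root"] == "?").sum())

# Convert the placeholder to NaN so pandas treats it as missing.
cleaned = toy.replace("?", np.nan)
n_null_after = int(cleaned["stalk-root"].isnull().sum())

print(n_null_before, n_placeholder, n_null_after)
```

An alternative is to pass `na_values="?"` to `pd.read_csv`, which converts the placeholder at load time. Note that this would apply to every column, which is harmless only if no other feature uses `?` as a legitimate code.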

In [None]:
# Checking for duplicate rows
dataset.duplicated().sum()

→ There are no duplicate rows in the dataset.