Detect the poisonous mushroom using its physical characteristics.
The goal is to build a machine learning model that predicts whether a mushroom is edible or poisonous based on its physical characteristics. There are two classes: ‘e’ and ‘p’.
- ‘e’ means that the mushroom is edible.
- ‘p’ means that the mushroom is poisonous.
This dataset describes mushrooms in terms of their physical characteristics:
- cap-shape: bell=b,conical=c,convex=x,flat=f, knobbed=k,sunken=s
- cap-surface: fibrous=f,grooves=g,scaly=y,smooth=s
- cap-color: brown=n,buff=b,cinnamon=c,gray=g,green=r, pink=p,purple=u,red=e,white=w,yellow=y
- bruises?: bruises=t,no=f
- odor: almond=a,anise=l,creosote=c,fishy=y,foul=f, musty=m,none=n,pungent=p,spicy=s
- gill-attachment: attached=a,descending=d,free=f,notched=n
- gill-spacing: close=c,crowded=w,distant=d
- gill-size: broad=b,narrow=n
- gill-color: black=k,brown=n,buff=b,chocolate=h,gray=g, green=r,orange=o,pink=p,purple=u,red=e, white=w,yellow=y
- stalk-shape: enlarging=e,tapering=t
- stalk-root: bulbous=b,club=c,cup=u,equal=e, rhizomorphs=z,rooted=r,missing=?
- stalk-surface-above-ring: fibrous=f,scaly=y,silky=k,smooth=s
- stalk-surface-below-ring: fibrous=f,scaly=y,silky=k,smooth=s
- stalk-color-above-ring: brown=n,buff=b,cinnamon=c,gray=g,orange=o, pink=p,red=e,white=w,yellow=y
- stalk-color-below-ring: brown=n,buff=b,cinnamon=c,gray=g,orange=o, pink=p,red=e,white=w,yellow=y
- veil-type: partial=p,universal=u
- veil-color: brown=n,orange=o,white=w,yellow=y
- ring-number: none=n,one=o,two=t
- ring-type: cobwebby=c,evanescent=e,flaring=f,large=l, none=n,pendant=p,sheathing=s,zone=z
- spore-print-color: black=k,brown=n,buff=b,chocolate=h,green=r, orange=o,purple=u,white=w,yellow=y
- population: abundant=a,clustered=c,numerous=n, scattered=s,several=v,solitary=y
- habitat: grasses=g,leaves=l,meadows=m,paths=p, urban=u,waste=w,woods=d
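Because every feature above is a single-letter category, and `stalk-root` uses `?` for missing values, the data needs encoding before most models can use it. The following is a minimal sketch of one way to do that with pandas. The sample rows are made up for illustration, and the mode-based imputation and the `preprocess` helper are assumptions, not necessarily what this project does.

```python
import numpy as np
import pandas as pd

# Tiny illustrative sample; the values are made up, not taken from the dataset.
sample = pd.DataFrame(
    {
        "class": ["e", "p", "e", "p"],
        "odor": ["n", "f", "a", "p"],
        "stalk-root": ["b", "?", "e", "?"],
        "gill-size": ["b", "n", "b", "n"],
    }
)


def preprocess(df):
    """Return (features, target) with '?' treated as missing and features one-hot encoded."""
    # '?' marks a missing stalk-root value in the attribute list above.
    df = df.replace("?", np.nan)
    # Assumption: fill each missing value with that column's most frequent value.
    df = df.fillna(df.mode().iloc[0])
    # Map the two classes to integers: edible -> 0, poisonous -> 1.
    target = df["class"].map({"e": 0, "p": 1})
    # One indicator column per (feature, category) pair.
    features = pd.get_dummies(df.drop(columns=["class"]))
    return features, target


features, target = preprocess(sample)
```

One-hot encoding avoids implying an order between categories such as `almond` and `foul`, at the cost of a wider feature matrix.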
The data is available as multiple sets of files. Each file contains the physical characteristics and a column indicating whether the mushroom is edible (‘e’) or poisonous (‘p’). Apart from the data files, schema files are provided as part of the Data Sharing Agreement. They contain all the relevant information about both the train and test data, such as:
- File name convention
- No of columns in each file
- Data type of each column
- Name of the columns
The project uses the following modules:
- Application Framework - flask, wsgiref
- Database operations - sqlite3
- Data processing and ML - numpy, pandas, matplotlib, sklearn, xgboost, kneed, pickle, seaborn
- General operations - os, shutil, csv, json, re, datetime, time
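As an illustration of how the schema information (file name convention, number of columns, column names) could be checked with the `re` and `pandas` modules listed above, here is a minimal sketch. The schema keys, the file name pattern, and the `validate` helper are all hypothetical; the actual schema files in the repository may be laid out differently. The data type check is omitted for brevity.

```python
import re

import pandas as pd

# Hypothetical schema; key names and the file name pattern are assumptions.
schema = {
    "FileNamePattern": r"^mushroom_\d{8}_\d{6}\.csv$",
    "NumberOfColumns": 3,
    "ColumnNames": ["class", "odor", "gill-size"],
}


def validate(file_name, df, schema):
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    if not re.match(schema["FileNamePattern"], file_name):
        problems.append("file name does not match the naming convention")
    if df.shape[1] != schema["NumberOfColumns"]:
        problems.append(
            f"expected {schema['NumberOfColumns']} columns, found {df.shape[1]}"
        )
    missing = [c for c in schema["ColumnNames"] if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    return problems


good = pd.DataFrame({"class": ["e"], "odor": ["n"], "gill-size": ["b"]})
bad = good.drop(columns=["odor"])

good_problems = validate("mushroom_20220101_120000.csv", good, schema)
bad_problems = validate("bad.csv", bad, schema)
```

Returning a list of problems rather than raising on the first failure makes it easy to log every reason a file was rejected.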
Clone this repo using
git clone https://github.com/Anil-45/MushroomClassification.git
Install the required modules using
pip install -r requirements.txt
Run the following command to start the application
python app.py
Open the application
- Upload Train CSV: use this option to upload custom training files.
- Train: trains the model using the uploaded training files.
- Default Train: trains the model using default files. Make sure the data files are present in data/raw/train to train the model. Trained models are saved to the models folder.
- Default Predict: predicts the output using saved models. Make sure the data files are present in data/raw/test for prediction.
- Upload Test CSV: use this option to upload custom test files.
- Predict: predicts the outcome of custom files using saved models.
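The train, save, and predict cycle behind these options can be sketched roughly as follows. This is an illustration only: the decision tree, the made-up sample rows, the `model.pkl` file name, and the temporary directory are assumptions, and the application's actual pipeline may differ.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up training rows; real data would be read from the training CSV files.
train = pd.DataFrame(
    {
        "class": ["e", "p", "e", "p"],
        "odor": ["n", "f", "a", "p"],
        "gill-size": ["b", "n", "b", "n"],
    }
)
X_train = pd.get_dummies(train.drop(columns=["class"]))
y_train = train["class"]

# Assumption: a decision tree stands in for whatever model the project trains.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Save the model together with the training columns so that prediction-time
# data can be aligned to the same one-hot layout.
model_dir = Path(tempfile.mkdtemp()) / "models"
model_dir.mkdir()
model_path = model_dir / "model.pkl"
with open(model_path, "wb") as f:
    pickle.dump({"model": model, "columns": list(X_train.columns)}, f)

# Later, at prediction time: load the saved model and score new rows.
with open(model_path, "rb") as f:
    saved = pickle.load(f)

test = pd.DataFrame({"odor": ["n", "f"], "gill-size": ["b", "n"]})
X_test = pd.get_dummies(test).reindex(columns=saved["columns"], fill_value=0)
predictions = saved["model"].predict(X_test)
```

Storing the column list next to the model matters because a test file may not contain every category seen during training, and `reindex` fills those absent indicator columns with zeros.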
You can find the logs in the logs folder.
Created by @Anil_Reddy
This project is available under the MIT License.