
equinlan/mushroom


Mushroom

Mushroom contains the data, Jupyter notebook, and supporting files behind a Medium blog post, What Decision Trees Tell Us About Deadly Mushrooms.

The article attempts to answer three questions from a dataset containing 22 features of North American mushroom samples:

  1. Can a machine learning model reliably identify poisonous mushrooms based on the data?
  2. Does any one feature of the data reliably classify mushroom toxicity?
  3. Can we formulate simple, memorizable rules from the data that reliably classify mushroom toxicity?
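The second question can be illustrated with a single-feature rule: for each value of one feature, predict the most common label, then measure how often that prediction is right. This is a minimal sketch, not the notebook's code. The records, feature names, and the `one_feature_rule` helper are all hypothetical, and the accuracy is measured on the same rows the rule was built from.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (odor, cap_color, label). Not taken from the dataset.
samples = [
    ("almond", "white", "edible"),
    ("anise", "yellow", "edible"),
    ("foul", "white", "poisonous"),
    ("foul", "brown", "poisonous"),
    ("none", "brown", "edible"),
    ("none", "white", "edible"),
    ("pungent", "white", "poisonous"),
    ("none", "yellow", "poisonous"),
]
feature_names = ["odor", "cap_color"]


def one_feature_rule(rows, feature_index):
    """Map each value of one feature to its majority label; return (rule, accuracy)."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[feature_index]][row[-1]] += 1
    rule = {value: labels.most_common(1)[0][0] for value, labels in counts.items()}
    correct = sum(rule[row[feature_index]] == row[-1] for row in rows)
    return rule, correct / len(rows)


# Score every feature on its own and keep the most accurate one.
results = {name: one_feature_rule(samples, i) for i, name in enumerate(feature_names)}
best_feature = max(results, key=lambda name: results[name][1])

for name, (rule, accuracy) in results.items():
    print(f"{name}: accuracy={accuracy:.3f} rule={rule}")
print("best single feature:", best_feature)
```

A feature that scores close to 1.0 under this check would be a candidate answer to question 2. A decision tree extends the same idea by combining several such tests.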

Data

Data was obtained from the UCI Machine Learning Repository Mushroom Data Set, donated by Jeff Schlimmer and drawn from The Audubon Society Field Guide to North American Mushrooms (1981). G. H. Lincoff (Pres.), New York: Alfred A. Knopf.

Note there are two datasets in the UCI-hosted data folder, seemingly due to some data recovery efforts on the part of the donor:

  • agaricus-lepiota.data, which has fewer records and uses single-character representations of categorical values
  • expanded, which has more records and uses full-word representations of categorical values (unzipped from expanded.Z)

My exploration makes use of the expanded dataset.
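As a rough sketch of reading full-word records like those in the expanded file, the snippet below parses comma-separated lines and tallies the class labels. It rests on assumptions that should be checked against the real file: that each record holds the 22 features plus one leading label field, and that any line with a different field count is not a record. The sample text is made up, and `read_records` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

N_FEATURES = 22                   # feature count stated in this README
EXPECTED_FIELDS = N_FEATURES + 1  # assumption: one extra leading field holds the label


def read_records(handle, expected_fields=EXPECTED_FIELDS):
    """Yield comma-separated rows with the expected field count; skip other lines."""
    for row in csv.reader(handle):
        row = [field.strip() for field in row]
        if len(row) == expected_fields:
            yield row


# Made-up stand-in text; the real file's contents and layout may differ.
filler = ",".join(["X"] * N_FEATURES)
sample_text = "\n".join([
    "some header line that is not a record",
    f"EDIBLE,{filler}",
    f"POISONOUS,{filler}",
    f"EDIBLE,{filler}",
    "",
])

records = list(read_records(io.StringIO(sample_text)))
label_counts = Counter(row[0] for row in records)
print(label_counts)
```

To try this on the actual data, pass an open file handle in place of the `io.StringIO` object, for example `open(path, newline="")`, where `path` points at your local copy of the unzipped file.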

Code

View the Jupyter notebook.

To run the code locally you will need Python and JupyterLab or Jupyter Notebook as well as the following Python libraries:

Images and DOT Data

The Jupyter notebook generates DOT data and several images from that data.

  • Images can be found in /images.
  • DOT data can be found in /dot.
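For readers unfamiliar with DOT, the sketch below shows what the format looks like by writing a tiny decision tree as Graphviz DOT text. It does not reproduce how the notebook generates its DOT data. The `tree_to_dot` helper and the toy tree are hypothetical, and the tree's rules are invented for illustration only and say nothing about real mushrooms.

```python
def tree_to_dot(tree, graph_name="MushroomTree"):
    """Serialize a nested-dict decision tree into Graphviz DOT text."""
    lines = [f"digraph {graph_name} {{"]
    counter = 0

    def visit(node):
        nonlocal counter
        node_id = counter
        counter += 1
        if isinstance(node, str):
            # Leaf: a class label drawn as a box.
            lines.append(f'  n{node_id} [label="{node}", shape=box];')
            return node_id
        # Internal node: a yes/no question with two children.
        question = node["question"]
        lines.append(f'  n{node_id} [label="{question}"];')
        for answer in ("yes", "no"):
            child_id = visit(node[answer])
            lines.append(f'  n{node_id} -> n{child_id} [label="{answer}"];')
        return node_id

    visit(tree)
    lines.append("}")
    return "\n".join(lines)


# Invented toy tree, for illustrating the format only.
toy_tree = {
    "question": "feature A is x?",
    "yes": "poisonous",
    "no": {
        "question": "feature B is y?",
        "yes": "poisonous",
        "no": "edible",
    },
}

dot_text = tree_to_dot(toy_tree)
print(dot_text)
```

If Graphviz is installed, text like this can be saved to a file and rendered with `dot -Tpng tree.dot -o tree.png`, which is one way to turn DOT data into an image.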

License

Everything outside the /data folder is licensed under an MIT License. Use of the contents of /data should be cited appropriately (see the Machine Learning Repository's citation policy).
