Kaggle competition: Leaf Classification
Kaggle competition: https://www.kaggle.com/c/leaf-classification
Kaggle instructions - Can you see the random forest for the leaves?
There are estimated to be nearly half a million species of plant in the world. Classification of species has been historically problematic and often results in duplicate identifications. Automating plant recognition might have many applications, including:
- Species population tracking and preservation
- Plant-based medicinal research
- Crop and food supply management
The objective of this playground competition is to use binary leaf images and extracted features, including shape, margin & texture, to accurately identify 99 species of plants. Leaves, due to their volume, prevalence, and unique characteristics, are an effective means of differentiating plant species. They also provide a fun introduction to applying techniques that involve image-based features.
As a first step, try building a classifier that uses the provided pre-extracted features. Next, try creating a set of your own features. Finally, examine the errors you're making and see what you can do to improve.
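A minimal sketch of that first step, assuming the training data is loaded into a pandas DataFrame with a `species` label column, an `id` column, and numeric columns for the pre-extracted features. The column names, the helper name `fit_baseline`, the random forest, and its settings are illustrative assumptions rather than a method prescribed by the competition.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder


def fit_baseline(df, label_col="species", id_col="id", seed=0):
    """Fit a random forest on pre-extracted features and score a held-out split.

    Hypothetical helper: `label_col` and `id_col` are assumed column names.
    Every remaining column is treated as a numeric feature.
    """
    # Encode species names as integer class labels.
    y = LabelEncoder().fit_transform(df[label_col])
    # Drop the label and (if present) the id column; keep the features.
    X = df.drop(columns=[c for c in (label_col, id_col) if c in df.columns])

    # Stratify so every species appears in both the train and validation parts.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)

    # Class probabilities allow a probabilistic score alongside accuracy.
    proba = model.predict_proba(X_val)
    scores = {
        "accuracy": model.score(X_val, y_val),
        "log_loss": log_loss(y_val, proba, labels=model.classes_),
    }
    return model, scores
```

With the competition's training file read through `pd.read_csv`, the call would be `model, scores = fit_baseline(df)`. The validation scores then give a reference point for judging whether hand-made features or a different model actually help.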
Kaggle is hosting this competition for the data science community to use for fun and education. This dataset originates from leaf images collected by James Cope, Thibaut Beghin, Paolo Remagnino, & Sarah Barman of the Royal Botanic Gardens, Kew, UK.
Charles Mallah, James Cope, James Orwell. Plant Leaf Classification Using Probabilistic Integration of Shape, Texture and Margin Features. Signal Processing, Pattern Recognition and Applications, in press. 2013.
We thank the UCI machine learning repository for hosting the dataset.