GitHub - daniau23/rice-type-classification-regression: Use of ML algo and Deep learning for rice classification and regression

RICE TYPE CLASSIFICATION & EXTENT PREDICTION

The project explores the use of machine learning and deep learning capablities for the prediction of the rice type. The Project begins with as follows;

The exploration of the dataset using Python for Exploratory Data Analysis
Treating of outliers using the IQR treatment as the recommended approach to treat skewness discovered in eccentricity and major axis length. There was a trade-off using this approach namely;
- Lost of some data points
- Skewness level of other features increasing
Feature engineering, data normalisation and dimensionality reduction approaches were used to prepare the data for unsupervised and supervised learning approaches
Unsupervised learning (KMeans) was used on the normalised data so as to create more features to be added back to the outlier treated dataset (outlier_treat.csv), (Feature engineering).
Having loaded the cluste.csv file, the same approach for data normalisation awas used to prepared the dataset using a Pipeline and using a Grid Search to search the best estimator. The following was discovered.
- Gradient Boosting Classifier was the best approach.
- Most of the classes were correctly classified based of the confusion matrix.
- Precision and Recall for both classes on the test data; gonen and jasmine were significantly high, so as for the ROC curve and AUC score having a value of 1; indicatingthe classifier is not a random classifier.
For regession analysis, the target values was Extent. The following was done;
- Prepared the data via a pipeline and search for optimal classifier using a grid search; best estimator was Linear Regression
- The RMSE score was extremely low and the $r^2$ score; 0.132 was horrible
- By adding polynomial features, the $r^2$ score improved to 0.135; still horrible.

DEEP LEARNING

The deep learning appraoch for classification was done in two approaches as followings
- Using the same pipline apporoach as previously used, which gave similar results to the machine learning approach, less misclassification on test data.
- Using a different pipeline approach; removing PCA, thereby feeding the mode with more features. Deep learning network architecture remained the same. But performance dropped as more misclassification were seen on the test data.
For the regression analysis, the deep learning approach did yield a better RMSE score, 0.0989 with a loss of nearly 0; 0.0098
All saved models can be found in the models folder and all saved pipelines for deep learning approaches can be found in the pipeline folder.
All project files can be explored in the project_files folder; gnatt chart, project charter, status report, Work Breakdown Structure (WBS).

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
EDA		EDA
classify		classify
clustering		clustering
data		data
deep_learning		deep_learning
img		img
model		model
my_data		my_data
pipeline		pipeline
project_files		project_files
regression		regression
.gitattributes		.gitattributes
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RICE TYPE CLASSIFICATION & EXTENT PREDICTION

DEEP LEARNING

About

Releases

Packages

Languages

daniau23/rice-type-classification-regression

Folders and files

Latest commit

History

Repository files navigation

RICE TYPE CLASSIFICATION & EXTENT PREDICTION

DEEP LEARNING

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages