Wild-Blue-Berry-Yield-Prediction

This repo contains my work on a wild blueberry yield prediction project. The dataset was generated by the Wild Blueberry Pollination Simulation Model for research purposes.

Description: The dataset used for predictive modelling was generated by the Wild Blueberry Pollination Simulation Model, an open-source, spatially explicit computer simulation program. It enables exploration of how various factors, including plant spatial arrangement, outcrossing and self-pollination, bee species composition, and weather conditions, affect pollination efficiency and yield of the wild blueberry agro-ecosystem, both in isolation and in combination. The simulation model has been validated against field observations and experimental data collected in Maine, USA, and the Canadian Maritimes over the last 30 years, and it is now a useful tool for hypothesis testing and theory development in wild blueberry pollination research. The simulated data is intended for researchers who have actual data collected from field observation, and for those who want to test how machine learning algorithms respond to real data and simulation-generated data as input for crop yield prediction models.

Problem Statement: The target feature is yield, which is a continuous variable. The task is to predict this variable from the other 17 features (a regression problem), working step by step through each day's task. The evaluation metric is RMSE (root mean squared error).
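As a reminder of what the metric measures, here is a minimal sketch of RMSE in plain Python. The function name `rmse` and the sample values are hypothetical and are not taken from the notebook.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must be non-empty")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up yields and predictions, for illustration only.
actual = [6000.0, 5500.0, 7000.0]
predicted = [6100.0, 5400.0, 7000.0]
print(rmse(actual, predicted))
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones, and the result is in the same units as yield.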

Web App: Click Here

Published article by Analytics Vidhya: Click here

Solution:

EDA using matplotlib, pandas and seaborn
Feature selection using mutual_info_regression
K-means clustering to group the bee-type columns
Standardizing input features
Baseline modelling using gradient boosted trees: RMSE - 188
Cross-validation using gradient boosted trees: RMSE - 141
Model hyperparameter tuning using a pipeline object with XGBRegressor: RMSE - 18
Explainable AI using shap
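The preprocessing steps listed above can be sketched roughly as follows. This is an illustration on synthetic data, not the notebook's code: the column names, the number of clusters, and the data itself are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_selection import mutual_info_regression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the dataset: four bee-density columns plus two
# other features. All column names here are hypothetical.
n_rows = 300
bees = rng.uniform(0.0, 1.0, size=(n_rows, 4))
other = rng.normal(size=(n_rows, 2))
X = np.hstack([bees, other])
feature_names = ["bee_a", "bee_b", "bee_c", "bee_d", "feat_e", "noise"]

# The synthetic target depends on bee_a and feat_e only.
y = 3.0 * X[:, 0] + 2.0 * X[:, 4] + rng.normal(scale=0.1, size=n_rows)

# 1. Rank features by their estimated mutual information with the target.
mi = mutual_info_regression(X, y, random_state=0)
ranking = sorted(zip(feature_names, mi), key=lambda item: item[1], reverse=True)

# 2. Summarise the bee columns as one cluster label per row.
#    n_clusters=3 is an arbitrary choice for this sketch.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
bee_cluster = kmeans.fit_predict(bees)

# 3. Standardize the inputs to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Mutual information can pick up non-linear dependence that a plain correlation would miss, which is one reason to use it for ranking features before modelling.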

Learning outcomes:

Feature selection methods
Development of machine learning pipelines using scikit-learn's Pipeline object
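To make the pipeline idea concrete, here is a hedged sketch that chains feature selection, scaling, and a regressor, then tunes them together with a grid search scored by RMSE. It uses scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBRegressor so the example needs only scikit-learn. The data, step names, and parameter grid are assumptions for illustration, not the settings used in the notebook.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data; only the first two columns drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Each step is refit inside every cross-validation fold, so the selector
# and scaler never see the held-out rows.
pipeline = Pipeline([
    ("select", SelectKBest(score_func=mutual_info_regression, k=4)),
    ("scale", StandardScaler()),
    ("model", GradientBoostingRegressor(random_state=0)),
])

# Parameters are addressed as <step name>__<parameter name>.
param_grid = {
    "select__k": [2, 4],
    "model__n_estimators": [50, 100],
    "model__max_depth": [2, 3],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="neg_root_mean_squared_error",
    cv=3,
)
search.fit(X, y)

# scikit-learn maximises scores, so RMSE is reported negated.
best_rmse = -search.best_score_
print(search.best_params_, best_rmse)
```

Wrapping the preprocessing in the pipeline, rather than applying it once to the full dataset, keeps the cross-validated RMSE from being optimistically biased by information leaking from the validation folds.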

Acknowledgement: TMLC Academy

