GitHub - akshaysuresh1/may22-barrel: Erdos Institute Data Science Bootcamp

Team Barrel

Topic: Predicting Fertilizer Input for Rice Cultivation in India

Team members:

This project was completed as part of the Erdos Institute Data Science Bootcamp, May 2022. Special mention to James Bramante for mentoring us throughout the duration of the bootcamp.

5-minute video presentation: MP4
Presentation slides: PDF
Executive summary: PDF

Project Goal

Home to over 1.38 billion people, India is tackling a severe hunger crisis. Though the country has achieved self-sufficiency in grain production, nearly 14% of the population is still undernourished. India's agriculture is primarily rural, and widespread poverty, low literacy rates, and poor infrastructure raise questions about its sustainability. Indiscriminate use of fertilizers has led to significant irregularity in crop production despite consistent agricultural subsidies.

With the current global shortage of fertilizers, precision farming is vital to eliminate redundant costs and streamline resources to ensure equitable food access for all communities. Here, we assist policy-makers in their decisions through models predicting the fertilizer consumption (nitrogen, phosphorus, and potash) required to obtain a specific rice yield.

Methodology

Rice is a hardy crop capable of thriving in a variety of soils, including loams, silts, and gravel. We collated up to 26 years of district-level rice cultivation data (cropped area, yield, irrigated area) and environmental data (temperature, precipitation, wind speed, evapotranspiration). Our analysis then involved two key steps.

First, we grouped districts with similar ecological parameters into clusters. To do so, we experimented with two unsupervised learning approaches, namely K-means and hierarchical clustering.
Second, at the level of clusters, we regressed the historical NPK consumption data against rice yield. Here, we trialed simple linear regression, random forest regression, and support vector regression.

Clustering

Based on environmental variables, both K-means and hierarchical clustering favor the grouping of Indian districts into 6 rice-growing clusters. Here is a map of India showing the spatial grouping of districts generated by our hierarchical clustering algorithm.

We note that the above map bears some visual resemblance to the Köppen-Geiger climate classification map of India. However, we caution readers against performing meticulous comparisons between these maps, as our algorithms additionally incorporate soil-dependent features such as surface runoff and evapotranspiration.
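As a rough illustration of the clustering step, here is a minimal K-means sketch in plain Python. It is not the code from this repository: the district feature values are made up, only two features and two clusters are used (the analysis above settles on 6 clusters), and the farthest-first initialization is an assumption chosen to keep the toy example deterministic. Features are standardized first so that no single variable dominates the distance.

```python
import math


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def farthest_first(points, k):
    """Deterministic initialization: start at the first point, then keep
    adding the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, n_iter=50):
    """Plain K-means: alternate assignment and centroid update steps."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical districts: (mean temperature in deg C, annual precipitation in mm).
districts = [
    [24.0, 900.0], [25.0, 950.0], [24.5, 1000.0],
    [29.0, 2400.0], [28.5, 2500.0], [29.5, 2600.0],
]
labels, centroids = kmeans(standardize(districts), k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In practice a library implementation with several random restarts would be the more usual choice. The sketch only shows the mechanics of grouping districts by environmental similarity.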

Modeling

For every cluster, we independently regressed its nitrogen, phosphorus, and potash fertilizer inputs per unit area against rice yield. Performing an 80-20 train-test split, we evaluated model performance on our test data using the SMAPE (symmetric mean absolute percentage error) metric. As shown below, support vector regression marginally outperforms other models with a smaller SMAPE.
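The evaluation procedure for a single cluster and a single nutrient can be sketched as follows. This is an illustrative example, not the repository's code. The (yield, nitrogen) pairs are invented, the model is a simple least-squares line, and the SMAPE variant used (mean of |F - A| / ((|A| + |F|) / 2), expressed in percent) is an assumption, since several definitions of SMAPE are in circulation and the text above does not say which one was used.

```python
import random


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (0 to 200).
    Assumed variant: mean of |F - A| / ((|A| + |F|) / 2)."""
    terms = []
    for a, f in zip(actual, predicted):
        denom = (abs(a) + abs(f)) / 2
        # Treat a 0/0 term as a perfect prediction.
        terms.append(0.0 if denom == 0 else abs(f - a) / denom)
    return 100 * sum(terms) / len(terms)


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical cluster data: (rice yield in t/ha, nitrogen input in kg/ha).
rows = [
    (1.8, 62.0), (2.0, 70.0), (2.2, 74.0), (2.4, 83.0), (2.6, 88.0),
    (2.8, 95.0), (3.0, 103.0), (3.2, 108.0), (3.4, 117.0), (3.6, 121.0),
]

train_rows, test_rows = train_test_split(rows, test_fraction=0.2)

# Fertilizer input is the response; rice yield is the predictor.
slope, intercept = fit_line([x for x, _ in train_rows], [y for _, y in train_rows])
predicted = [intercept + slope * x for x, _ in test_rows]
score = smape([y for _, y in test_rows], predicted)
print(f"Test SMAPE: {score:.2f}%")
```

Repeating this for each cluster, each nutrient, and each candidate model gives a table of test SMAPE values. The model with the smallest value is preferred.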

Future Work

Future extensions to our model will incorporate soil nutrient data, solar irradiance data, and knowledge of off-season farming practices (e.g., crop rotation) to improve the accuracy of our estimated fertilizer inputs.

Troubleshooting

Please submit an issue to voice any problems or requests. Suggestions that will help improve our data analyses are always welcome.

