# Agricultural Data for Rajasthan, India (2018-2019)
## Project Goals
 - Build a recommender system that takes user-supplied parameters and recommends the best crops to plant
 - The first step is to train a model on the available data that predicts the price of a crop
 - The system will run the prediction model for each crop type and produce a list of crops ranked by predicted price
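
The ranking step described in the goals can be sketched as follows. This is a minimal illustration, not the project's implementation: `rank_crops`, `toy_predictor`, the `params` dictionary, and the prices in the lookup table are all hypothetical placeholders standing in for a trained price model.

```python
from typing import Callable, Dict, List, Tuple

def rank_crops(
    crops: List[str],
    params: Dict[str, str],
    predict_price: Callable[[str, Dict[str, str]], float],
) -> List[Tuple[str, float]]:
    """Score every crop with the price model and sort from highest to lowest price."""
    scored = [(crop, predict_price(crop, params)) for crop in crops]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical stand-in for a trained model: a fixed lookup table of made-up prices.
def toy_predictor(crop: str, params: Dict[str, str]) -> float:
    made_up_prices = {"Gram": 2500.0, "Onion": 2400.0, "Chilli": 2600.0}
    return made_up_prices[crop]

ranking = rank_crops(["Gram", "Onion", "Chilli"], {"season": "Rabi"}, toy_predictor)
print(ranking)
```

Passing the predictor in as a function keeps the ranking logic independent of whichever model is eventually trained.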

In [12]:
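# The "mysql+pymysql" URL used below requires the PyMySQL driver to be installed;
# SQLAlchemy loads it from the URL, so it does not need to be imported here.
from sqlalchemy import create_engine
import pandas as pd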

In [18]:
conn_string = "mysql+pymysql://dev:devpass@localhost/agData"
engine = create_engine(conn_string)
cnx = engine.connect()

In [19]:
pd.read_sql("SHOW tables;", cnx)

Unnamed: 0,Tables_in_agData
0,crop_price
1,crop_production
2,soil_analysis
3,water_usage
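
Once the table names are known, the columns of each table can be listed with SQLAlchemy's inspector. The sketch below is an assumption-laden stand-in: it runs against an in-memory SQLite database so that it is self-contained, and the two toy tables only mirror columns mentioned in this notebook (`crop` and `price` for `crop_price` are inferred from the revenue query). Against the real database, the same `inspect` calls would be made on the MySQL `engine`.

```python
from sqlalchemy import create_engine, inspect, text

# Stand-in engine (assumption): in-memory SQLite instead of the MySQL server.
demo_engine = create_engine("sqlite://")

with demo_engine.begin() as conn:
    # Toy tables that mirror columns referenced in this notebook.
    conn.execute(text("CREATE TABLE crop_price (crop TEXT, price REAL)"))
    conn.execute(text(
        "CREATE TABLE crop_production "
        "(district TEXT, crop TEXT, season TEXT, area REAL, yield REAL, production REAL)"
    ))

inspector = inspect(demo_engine)

# Map each table name to the list of its column names.
schema = {
    table: [col["name"] for col in inspector.get_columns(table)]
    for table in inspector.get_table_names()
}
for table, columns in schema.items():
    print(table, columns)
```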


### SQL Table crop_price
### SQL Table crop_production
- district: District name where the crop was grown (Categorical)
- crop: Crop name (Categorical)
- season: Growing season, Kharif or Rabi (Categorical)
- area: field size in hectares (Numerical)
- yield: production per unit area in quintals, computed as (production / area) × 100 (Numerical)
- production: overall production in metric tons (Numerical)
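
The yield formula in the list above can be written as a small helper. This is a sketch that applies the formula exactly as stated, including the factor of 100; the function name `yield_per_area` and the example numbers are assumptions for illustration.

```python
def yield_per_area(production: float, area: float) -> float:
    """Return yield as (production / area) * 100, as stated in the column description."""
    if area <= 0:
        raise ValueError("area must be positive")
    return production / area * 100

# Made-up example: 50 units of production on 200 hectares.
example_yield = yield_per_area(production=50.0, area=200.0)
print(example_yield)
```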
### SQL Table soil_analysis
### SQL Table water_usage

Both yield and price contribute to a crop's profitability, so it makes sense to maximize both. Average yield × average price gives a measure of the average revenue per unit area. The query below lists the crops whose revenue per area is higher than average.

In [53]:
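# Revenue per area is the product of average yield and average price, per crop.
# Note: if crop_price holds several rows per crop, the join repeats each
# production row once per price row, which inflates SUM(area).
crop_revenue_df = pd.read_sql(
    "SELECT prod.crop, SUM(prod.area) AS total_area, AVG(prod.yield) AS avg_yield, "
    "AVG(price.price) AS avg_price, (AVG(prod.yield) * AVG(price.price)) AS revenue_per_area "
    "FROM crop_production AS prod "
    "JOIN crop_price AS price "
    "ON prod.crop = price.crop "
    "GROUP BY prod.crop "
    "ORDER BY revenue_per_area DESC;", cnx)
# Keep only crops whose revenue per area exceeds the mean across all crops.
mean_revenue = crop_revenue_df["revenue_per_area"].mean()
crop_revenue_df = crop_revenue_df[crop_revenue_df["revenue_per_area"] > mean_revenue]
crop_revenue_df.head()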

Unnamed: 0,crop,total_area,avg_yield,avg_price,revenue_per_area
0,Gram,100694500.0,36.970811,2544.595999,2581.56681
1,Onion,146825800.0,35.003965,2536.805576,2571.809541
2,Coriander,137686900.0,38.400358,2532.29454,2570.694898
3,Chilli,118804900.0,38.322075,2530.936034,2569.258109
4,Sugarcane,122980500.0,36.384276,2526.102832,2562.487108
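
In the output above, `revenue_per_area` equals `avg_yield + avg_price` (for Gram, 36.970811 + 2544.595999 = 2581.56681). The product that the text describes can be derived from the same columns. The following is a minimal sketch that uses only the five rows shown above, so the resulting order is illustrative rather than the full result; the column name `revenue_product` is an assumption introduced here.

```python
import pandas as pd

# The five rows displayed in the output above.
rows = pd.DataFrame({
    "crop": ["Gram", "Onion", "Coriander", "Chilli", "Sugarcane"],
    "avg_yield": [36.970811, 35.003965, 38.400358, 38.322075, 36.384276],
    "avg_price": [2544.595999, 2536.805576, 2532.29454, 2530.936034, 2526.102832],
})

# Revenue per area as a product: average yield times average price.
rows["revenue_product"] = rows["avg_yield"] * rows["avg_price"]

# Rank crops from highest to lowest product-based revenue.
ranked = rows.sort_values("revenue_product", ascending=False).reset_index(drop=True)
print(ranked[["crop", "revenue_product"]])
```

Among these five rows, Coriander ranks first under the product because its higher average yield outweighs Gram's slightly higher average price.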
