# REAP: Rice Efficacy Across Philippines: A Data-Driven Cross-Dataset Analysis with India

### Project Objectives
This project compares the agricultural practices of India and the Philippines, with a specific focus on rice cultivation. Python is used for the data analysis, with the aim of providing meaningful insights into the similarities and differences between the two countries' practices. The project seeks to identify the key factors that contribute to the effectiveness of these practices and to highlight areas for potential improvement.

### Walkthrough

#### Dataset Acquisition
This section gives an overview of the categories of datasets included in our study:

1. Rice and Production Datasets
2. Climate and Weather Datasets
3. Soil Characteristics Datasets
4. Agricultural Practices Datasets
5. Socioeconomic and Demographic Datasets
6. Policy and Trade Datasets

In [5]:
import numpy as np
import pandas as pd
import os

In [10]:
check_indProd = pd.read_csv('Datasets/ind_AgriculturalPracticesDatasets_verified.csv')
check_indProd.head(5)

Unnamed: 0,State,District,Crop,Crop_Year,Season,Area,Production,Yield
0,Andaman and Nicobar Island,NICOBARS,Arecanut,2007,Kharif,2439.6,3415.0,1.4
1,Andaman and Nicobar Island,NICOBARS,Arecanut,2007,Rabi,1626.4,2277.0,1.4
2,Andaman and Nicobar Island,NICOBARS,Arecanut,2008,Autumn,4147.0,3060.0,0.74
3,Andaman and Nicobar Island,NICOBARS,Arecanut,2008,Summer,4147.0,2660.0,0.64
4,Andaman and Nicobar Island,NICOBARS,Arecanut,2009,Autumn,4153.0,3120.0,0.75
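
The `Unnamed: 0` column in the preview usually appears when a DataFrame's row index was written into the CSV and then read back as an ordinary column. A minimal sketch, assuming that is what happened here, reads the column back as the index with `index_col=0`. The inline `sample_csv` is a stand-in for the real file and reuses two rows from the preview; `sample_df` is an illustrative name.

```python
import io
import pandas as pd

# Inline stand-in for the CSV file; rows copied from the preview above.
sample_csv = io.StringIO(
    ",State,District,Crop,Crop_Year,Season,Area,Production,Yield\n"
    "0,Andaman and Nicobar Island,NICOBARS,Arecanut,2007,Kharif,2439.6,3415.0,1.4\n"
    "1,Andaman and Nicobar Island,NICOBARS,Arecanut,2007,Rabi,1626.4,2277.0,1.4\n"
)

# index_col=0 treats the first (unnamed) column as the row index
# instead of loading it as a data column called "Unnamed: 0".
sample_df = pd.read_csv(sample_csv, index_col=0)
print(sample_df.columns.tolist())
```

If the saved index carries no meaning for the analysis, dropping the column after loading would work equally well.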


In [62]:
# Keep only the rows whose crop name contains the word "Rice"
new_indProd = check_indProd.dropna(subset=['Crop'])
new_indProd = new_indProd[new_indProd['Crop'].str.contains('Rice')]

new_indProd.head()

Unnamed: 0,State,District,Crop,Crop_Year,Season,Area,Production,Yield
468,Andaman and Nicobar Island,NICOBARS,Rice,2007,Kharif,7333.75,21864.0,2.98
469,Andaman and Nicobar Island,NICOBARS,Rice,2008,Autumn,7900.0,14730.0,1.86
470,Andaman and Nicobar Island,NICOBARS,Rice,2009,Autumn,8140.0,16600.0,2.04
471,Andaman and Nicobar Island,NICOBARS,Rice,2000,Kharif,102.0,321.0,3.15
472,Andaman and Nicobar Island,NICOBARS,Rice,2001,Kharif,83.0,300.0,3.61
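
Two details of the rice filter are worth illustrating: how missing crop names are handled, and the difference between a substring match and an exact match. The sketch below uses a toy frame with invented values (the variant label is hypothetical and may not exist in the real dataset) to show `str.contains` with `na=False` next to an equality test.

```python
import numpy as np
import pandas as pd

# Toy frame with invented values: one missing crop name and one
# hypothetical label that merely contains the word "Rice".
toy = pd.DataFrame({
    "Crop": ["Rice", "Arecanut", np.nan, "Rice (hypothetical variant)"],
    "Production": [21864.0, 3415.0, 10.0, 5.0],
})

# na=False makes rows with a missing crop name evaluate to False,
# so no separate dropna() step is needed before filtering.
contains_rice = toy[toy["Crop"].str.contains("Rice", na=False)]

# An exact match keeps only rows labeled exactly "Rice".
exact_rice = toy[toy["Crop"] == "Rice"]

print(len(contains_rice), len(exact_rice))
```

Which of the two is appropriate depends on how crops are labeled in the source file, so it may help to inspect `check_indProd['Crop'].unique()` before choosing.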


In [72]:
check_phiProd = pd.read_csv('Datasets/phi_AgriculturalPracticesDatasets_verified.csv')
check_phiProd.head(5)

Unnamed: 0,Provinces,Crop,Crop_Year,Production
0,Abra,Rice,1987,23085
1,Abra,Rice,1988,31157
2,Abra,Rice,1989,27509
3,Abra,Rice,1990,25637
4,Abra,Rice,1991,24109


In [110]:
# Convert Production to numbers; placeholder strings such as '..' become NaN, then 0
check_phiProd['Production'] = pd.to_numeric(check_phiProd['Production'], errors='coerce').fillna(0)
yr_phiVal = check_phiProd['Crop_Year'].sort_values().unique()
ph_dictio = {}

# Sum production across all provinces for each year
for value in yr_phiVal:
    ph_prodTbl = check_phiProd.loc[check_phiProd['Crop_Year'] == value, 'Production'].sum()
    ph_dictio[value] = ph_prodTbl

ph_dictio


{1987: 8536652.0, 1988: 8967526.0, 1989: 9455268.0, 1990: 9315772.0, 1991: 9669758.0, 1992: 9125436.0, 1993: 9430704.0, 1994: 10534012.0, 1995: 10538733.0, 1996: 11280946.0, 1997: 11266060.0, 1998: 8552418.0, 1999: 11783805.0, 2000: 12387126.0, 2001: 12952940.0, 2002: 13268797.0, 2003: 13497897.0, 2004: 14494221.0, 2005: 14600378.0, 2006: 15324087.0, 2007: 16239238.0, 2008: 16814649.0, 2009: 16265572.5, 2010: 15771358.0, 2011: 16683104.0, 2012: 18031629.47, 2013: 18438776.729999997, 2014: 18967177.17, 2015: 18149346.78, 2016: 17626966.82, 2017: 19276093.630000003, 2018: 19066058.939999998, 2019: 18814577.349999998, 2020: 19294510.650000002, 2021: 19959846.21, 2022: 19756054.259999998}
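
The per-year totals can also be computed in one pass with `groupby`. The following is a sketch on a toy frame, not the notebook's own method: the Abra rows come from the preview above, while the rows labeled `Hypothetical` (including the `'..'` placeholder) are invented for illustration.

```python
import pandas as pd

# Toy frame: Abra values are from the preview; "Hypothetical" rows are invented.
toy = pd.DataFrame({
    "Provinces": ["Abra", "Abra", "Hypothetical", "Hypothetical"],
    "Crop_Year": [1987, 1988, 1987, 1988],
    "Production": ["23085", "31157", "..", "1000"],
})

# Coerce placeholders such as '..' to NaN, then treat them as zero.
toy["Production"] = pd.to_numeric(toy["Production"], errors="coerce").fillna(0)

# Group rows by year, sum production, and convert the result to a plain dict.
yearly_totals = toy.groupby("Crop_Year")["Production"].sum().to_dict()
print(yearly_totals)
```

Treating a placeholder as zero is itself an assumption: it understates the total for any year in which a province simply did not report.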


In [112]:
# Convert Production to numbers; placeholder strings such as '..' become NaN, then 0
# Note: check_indProd still contains every crop; substitute new_indProd here
# if rice-only totals are intended.
check_indProd['Production'] = pd.to_numeric(check_indProd['Production'], errors='coerce').fillna(0)
yr_indVal = check_indProd['Crop_Year'].sort_values().unique()
in_dictio = {}

# Sum production across all states and districts for each year
for value in yr_indVal:
    in_prodTbl = check_indProd.loc[check_indProd['Crop_Year'] == value, 'Production'].sum()
    in_dictio[value] = in_prodTbl

in_dictio


{1997: 61317562.0, 1998: 83065010.0, 1999: 83511141.0, 2000: 83248654.0, 2001: 90197955.0, 2002: 71603326.0, 2003: 88027062.0, 2004: 84202290.0, 2005: 89613933.0, 2006: 92251794.0, 2007: 95092333.0, 2008: 97366163.0, 2009: 88907038.0, 2010: 96279758.0, 2011: 105059032.0, 2012: 107631837.0, 2013: 111364711.0, 2014: 114729889.0, 2015: 101797395.0, 2016: 118936131.0, 2017: 118179300.0, 2018: 118689951.0, 2019: 134631489.0, 2020: 724429.0}
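
The two dictionaries cover different year ranges (1987 to 2022 for the Philippines, 1997 to 2020 for India), so a comparison needs to align them on shared years first. A minimal sketch, using a handful of totals copied from the outputs above and assuming both datasets report production in the same unit (the source does not state the unit); `ph_totals`, `in_totals`, and `india_to_ph` are illustrative names.

```python
import pandas as pd

# A few yearly totals copied from the two dictionaries printed above.
ph_totals = {1987: 8536652.0, 1997: 11266060.0, 2019: 18814577.35, 2020: 19294510.65}
in_totals = {1997: 61317562.0, 2019: 134631489.0, 2020: 724429.0}

# Build one table indexed by year; a year missing for a country becomes NaN.
combined = pd.DataFrame({
    "Philippines": pd.Series(ph_totals),
    "India": pd.Series(in_totals),
})

# Keep only the years present in both datasets.
common = combined.dropna()

# Ratio of India's total to the Philippines' total for each shared year.
common = common.assign(india_to_ph=common["India"] / common["Philippines"])
print(common)
```

India's 2020 total is far below those of neighboring years, which may mean the 2020 records are incomplete. It is probably safer to exclude that year from any comparison until this is checked against the source.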
