## Context
There is a growing need for geospatial analysis of India at the subdivision level.<br>

## Content

Data description<br>
The CSV file contains:<br>

Name: the row number <br>
SUBDIVISION: regions grouped by similar topographical features <br>

1. Andaman and Nicobar Islands
2. Arunachal Pradesh
3. Assam and Meghalaya
4. Nagaland, Manipur, Mizoram and Tripura
5. Sub Himalayan West Bengal and Sikkim
6. Gangetic West Bengal
7. Orissa
8. Jharkhand
9. Bihar
10. East Uttar Pradesh
11. West Uttar Pradesh
12. Uttarakhand
13. Haryana Delhi and Chandigarh
14. Punjab
15. Himachal Pradesh
16. Jammu and Kashmir
17. West Rajasthan
18. East Rajasthan
19. West Madhya Pradesh
20. East Madhya Pradesh
21. Gujarat Region
22. Saurashtra and Kutch
23. Konkan and Goa
24. Madhya Maharashtra
25. Matathwada
26. Vidarbha
27. Chhattisgarh
28. Coastal Andhra Pradesh
29. Telangana
30. Rayalseema
31. Tamil Nadu
32. Coastal Karnataka
33. North Interior Karnataka
34. South Interior Karnataka
35. Kerala
36. Lakshadweep
YEAR: Year in which the rainfall occurred<br>
The next 12 columns are the individual months<br><br>
The last 4 columns give cumulative values over multi-month periods that represent the seasons<br>

## Acknowledgements
The data was fetched from data.gov.in, and coordinates were added to enable geospatial analysis.<br>

## Inspiration

The trend of climate change is upon us, and a detailed visualization would be beneficial to a lot of people.

In [None]:
#Modules for EDA
import numpy as np 
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
plt.style.use('fivethirtyeight')

%matplotlib inline

In [None]:
df = pd.read_csv('/kaggle/input/rainfall-data-from-1901-to-2017-for-india/Rainfall_Data_LL.csv')
df.shape

In [None]:
df.info()

In [None]:
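# Use the full option name; the short pattern 'max_columns' can match more than one option in recent pandas
pd.set_option('display.max_columns',22)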
df.head()

In [None]:
df.drop('Name',axis=1,inplace=True)
df.head()

# **Frequency of each Subdivision**

In [None]:
plt.figure(figsize=(20,15))
df['SUBDIVISION'].value_counts().plot(kind='barh')
plt.title('Frequency of each Subdivision')
plt.gca().invert_yaxis()
plt.show()

In [None]:
df['SUBDIVISION'].value_counts()
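
The counts above show how many rows each subdivision has, so subdivisions with fewer rows are the ones missing some years. The following is a minimal sketch of listing them. The helper name `incomplete_subdivisions` and the toy frame are hypothetical and only illustrate the idea; on the real data you would pass `df` instead.

```python
import pandas as pd

def incomplete_subdivisions(frame):
    """Return row counts for subdivisions that have fewer rows than the best-covered one."""
    counts = frame['SUBDIVISION'].value_counts()
    return counts[counts < counts.max()].sort_index()

# Toy data standing in for the rainfall frame (made-up values, illustration only)
toy = pd.DataFrame({
    'SUBDIVISION': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
    'YEAR': [1901, 1902, 1903, 1901, 1903, 1901, 1902, 1903],
})
short = incomplete_subdivisions(toy)
print(short)
```

Here subdivision `B` is reported because it has two rows while the others have three.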

# **State Wise analysis**

In [None]:
states = df['SUBDIVISION'].unique()
state_df = df.groupby('SUBDIVISION')

### **Unique States and Union Territories** 

In [None]:
states

In [None]:
months = df.columns[2:14]
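
Selecting the month columns by position depends on the column order that remains after `Name` is dropped. As a hedged alternative, the sketch below selects them by name. The labels in `MONTH_NAMES`, the helper `pick_month_columns`, and the toy frame are assumptions for illustration and have not been checked against the actual CSV header.

```python
import pandas as pd

# Assumed month labels; adjust them to match the real CSV header
MONTH_NAMES = ['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN',
               'JUL', 'AUG', 'SEP', 'OCT', 'NOV', 'DEC']

def pick_month_columns(frame, month_names=MONTH_NAMES):
    """Return the month labels as an Index, failing loudly if any is absent."""
    missing = [m for m in month_names if m not in frame.columns]
    if missing:
        raise KeyError(f"Missing month columns: {missing}")
    return pd.Index(month_names)

# Toy one-row frame standing in for the rainfall data (made-up values)
toy = pd.DataFrame([['Kerala', 1901] + [1.0] * 12 + [12.0]],
                   columns=['SUBDIVISION', 'YEAR'] + MONTH_NAMES + ['ANNUAL'])
toy_months = pick_month_columns(toy)
print(list(toy_months))
```

On a frame laid out like the toy one, this gives the same twelve columns as the positional slice `columns[2:14]`, but it raises a `KeyError` instead of silently picking the wrong columns if the layout changes.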

# **Get Region Rainfall Stats (Year and Month wise)**

In [None]:
def getRegionRainfallStats(regionIndex):
    """Plot one line chart per month showing a region's rainfall across all years."""
    stateName = states[regionIndex]
    state = state_df.get_group(stateName)
    years = state['YEAR'] 
    for month in months:
        plt.figure(figsize=(20,7))
        plt.plot(years,state[month],label=month,linewidth=1)
        plt.title(f"{stateName}'s rainfall stats ({years.min()}-{years.max()}) in {month}.")
        plt.xlabel("Year")
        plt.ylabel("Rainfall")
        plt.legend()
        plt.show()

# **Let's See Arunachal Pradesh's stats**

In [None]:
#Index of Arunachal Pradesh state
states[1] #1

In [None]:
getRegionRainfallStats(1)

# **Let's See West Uttar Pradesh's rainfall stats**

In [None]:
states[10]

In [None]:
getRegionRainfallStats(10)

#### **Similarly, you can check the rainfall of other regions by using the getRegionRainfallStats function**

# **Visualizing state data of given year.**

In [None]:
def getStateYearData(regionIndex,year,return_data=False):
    """Plot the monthly rainfall of one region for a single year."""
    try:
        stateName = states[regionIndex]
        # set_index returns a new frame, so the grouped data is not modified in place
        dfstate = state_df.get_group(stateName).set_index('YEAR')
        year_data = dfstate.loc[year][months]
        plt.figure(figsize=(10,7))
        sns.barplot(x=months,y=year_data.values)
        plt.title(f"Rainfall Data of {stateName} in year {year}")
        plt.show()
        if return_data:
            return year_data
    except KeyError:
        return "Enter Valid Year."
    except IndexError:
        return "Enter Valid State Index."
    except Exception as e:
        print(e)
        return "Oops! Something went wrong."

# **Arunachal Pradesh rainfall data in year 1990**

In [None]:
states[1] #Index is 1

In [None]:
getStateYearData(1,1990)

# **Jammu & Kashmir rainfall data in year 2011**

In [None]:
states[15] #Index is 15

In [None]:
getStateYearData(15,2011,return_data=True)

#### **Similarly, you can check the rainfall data of other regions for a given year.** 

# **Average Rainfall in last five years of each Region (month wise)**

In [None]:
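def getLast5AvgRainfallData(regionIndex,return_data=False):
    """Plot the month-wise average rainfall over a region's last five rows."""
    stateName = states[regionIndex]
    # tail() keeps the last five rows, assumed to be the five most recent years
    dflast5 = state_df.get_group(stateName).tail()
    
    last5years = dflast5['YEAR']
    # Average only the month columns; the text column SUBDIVISION cannot be averaged
    rainfallAvg5 = dflast5[months].mean()

    #Plotting
    plt.figure(figsize=(10,7))
    sns.barplot(x=months, y=rainfallAvg5.values)
    plt.title(f"Average rainfall of {stateName} region in last five years({last5years.min()}-{last5years.max()})")
    # Months are on the x-axis and rainfall on the y-axis
    plt.xlabel("Months")
    plt.ylabel("Rainfall")
    plt.show()
    if return_data:
        return rainfallAvg5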

# **Let's See the Average rainfall of Assam and Meghalaya region in last 5 years**

In [None]:
states[2] #Index is 2

In [None]:
getLast5AvgRainfallData(2,return_data=True)

# **Now we will see the Average Rainfall in last 5 years of each state**

In [None]:
for index in range(len(states)):
    getLast5AvgRainfallData(index)
    print("\n")

### **Now we have the average rainfall of the last five years**
### **But what if you want the average rainfall for a specific period, such as 1970-1975?**
### **Let's do that**

In [None]:
def getAverageStateDataOfPeriod(regionIndex,startYear,endYear,return_data=False):
    """Plot the month-wise average rainfall of a region from startYear to endYear (inclusive)."""
    if startYear > endYear:
        raise ValueError("startYear must not be greater than endYear.")
    try:
        stateName = states[regionIndex]
        # set_index returns a new frame, so the grouped data is not modified in place
        dfstate = state_df.get_group(stateName).set_index('YEAR')
        
        # Label-based slicing on the YEAR index includes both end points
        years_data = dfstate.loc[startYear:endYear][months].mean()

        plt.figure(figsize=(10,7))
        sns.barplot(x=months,y=years_data.values)
        plt.title(f"Average Rainfall Data of {stateName} from {startYear}-{endYear}")
        plt.show()
        
        if return_data:
            return years_data
    except Exception as e:
        print(e)
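
The period average can also be computed without plotting, which is handy when only the numbers are needed. The following is a minimal sketch that uses a boolean mask with `Series.between` instead of index slicing. The helper name `period_monthly_mean`, its parameters, and the toy frame are hypothetical and are not part of the notebook's own functions.

```python
import pandas as pd

def period_monthly_mean(frame, subdivision, start_year, end_year, month_cols):
    """Mean of each month column for one subdivision over an inclusive year range."""
    if start_year > end_year:
        raise ValueError("start_year must not be greater than end_year")
    # between() is inclusive on both ends by default
    mask = (frame['SUBDIVISION'] == subdivision) & frame['YEAR'].between(start_year, end_year)
    return frame.loc[mask, month_cols].mean()

# Toy data standing in for the rainfall frame (made-up values, illustration only)
toy = pd.DataFrame({
    'SUBDIVISION': ['X', 'X', 'X', 'Y'],
    'YEAR': [1970, 1971, 1980, 1970],
    'JAN': [10.0, 20.0, 90.0, 5.0],
    'FEB': [0.0, 4.0, 8.0, 1.0],
})
avg = period_monthly_mean(toy, 'X', 1970, 1975, ['JAN', 'FEB'])
print(avg)
```

Only the 1970 and 1971 rows of `X` fall inside the range, so the 1980 row and the `Y` row do not affect the result.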

# **Average Rainfall Data of Arunachal Pradesh from 1980-1990**

In [None]:
states[1] #Index is 1

In [None]:
getAverageStateDataOfPeriod(1,startYear=1980,endYear=1990)

# **Average Rainfall Data of Gujarat Region from 1968-1972**

In [None]:
states[20]

In [None]:
getAverageStateDataOfPeriod(20,1968, 1972, return_data=True)

# **Annual Rainfall of each state (Year wise)**

In [None]:
def getAnnualRainfallStats(regionIndex):
    stateName = states[regionIndex]
    state = state_df.get_group(states[regionIndex])
    years = state['YEAR']
    
    plt.figure(figsize=(20,7))
    plt.plot(years,state['ANNUAL'])
    plt.title(f"Annual Rainfall of {stateName} Region({years.min()}-{years.max()}).")
    plt.xlabel("Year")
    plt.ylabel("Annual Rainfall")
    plt.show()

In [None]:
for i in range(len(states)):
    getAnnualRainfallStats(i)
    print("\n")

# **Comparing Annual Rainfalls of all states of a given year**

In [None]:
def AnnaulStateGraph(year):
    """Compare the annual rainfall of every region for the given year."""
    try:
        annual_rainfall = []
        for i in range(len(states)):
            # set_index returns a new frame, so the grouped data is not modified in place
            state = state_df.get_group(states[i]).set_index('YEAR')
            # Raises KeyError if this region has no row for the requested year
            annual_rainfall.append(state.loc[year]['ANNUAL'])
        plt.figure(figsize=(10,20))
        sns.barplot(x=annual_rainfall,y=states)
        plt.title(f"Annual Rainfall of each state in {year}")
        plt.ylabel("States")
        plt.xlabel("Annual Rainfall")
        plt.show()
        print(f"Total Annual Rainfall of India in {year}:",np.sum(annual_rainfall))
        print("Average Annual Rainfall per state:",np.mean(annual_rainfall))
    except Exception as e:
        print(e)

# **Annual Rainfall of each state in 1999**

In [None]:
AnnaulStateGraph(1999)

# **Annual Rainfall of each state in 2015**

In [None]:
AnnaulStateGraph(2015)

#### **Similarly, you can get the annual rainfall data of each state for a given year.**
#### **But some states do not have data for every year, which may cause an error.**
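
One way to avoid the missing-year problem is to filter the rows for the requested year first, so that only regions with data for that year are included. The following is a minimal sketch under that assumption. The helper name `annual_for_year` and the toy frame are hypothetical; only the `SUBDIVISION`, `YEAR` and `ANNUAL` column names come from the notebook.

```python
import pandas as pd

def annual_for_year(frame, year):
    """Annual rainfall per subdivision, keeping only subdivisions that have the year."""
    rows = frame[frame['YEAR'] == year]
    return rows.set_index('SUBDIVISION')['ANNUAL']

# Toy data standing in for the rainfall frame (made-up values, illustration only)
toy = pd.DataFrame({
    'SUBDIVISION': ['A', 'A', 'B', 'C'],
    'YEAR': [1901, 1902, 1901, 1902],
    'ANNUAL': [100.0, 110.0, 50.0, 70.0],
})
annual_1902 = annual_for_year(toy, 1902)
# Subdivisions with no row for the requested year
missing = sorted(set(toy['SUBDIVISION']) - set(annual_1902.index))
print(annual_1902)
print("No data for:", missing)
```

Reporting the skipped regions separately keeps the comparison honest, because the chart then covers fewer regions than the full list.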

# **That's the analysis I have done on this dataset**

#### **If you have any other Ideas then feel free to share your thoughts in the comments.**
#### **Save it and share it.**
#### **Happy Learning :)**