Project title: Climate Risk and Disaster Management

Project statement: Rainfall variability in India significantly affects agriculture, water resources, and disaster management, and unpredictable rainfall patterns make it difficult for farmers and policymakers to plan effectively. This project analyzes historical rainfall data (1901–2015) and builds a machine learning model to predict future rainfall trends, supporting better planning and risk management.

Description: Rainfall plays a crucial role in India’s economy because a large share of agriculture and water supply depends on the monsoon. Irregular rainfall often leads to droughts, floods, and reduced crop productivity, affecting millions of people. Using historical rainfall data from 1901 to 2015, this project identifies long-term trends and predicts future rainfall patterns with machine learning techniques. The resulting insights can support planning for agriculture, water resource management, and disaster preparedness.

In [6]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score


In [7]:
# Load your dataset (uploaded to Colab)
df = pd.read_csv("/content/rainfall in india 1901-2015.csv")

# Show first few rows
print(df.head())
print("\nColumns in dataset:", df.columns)


                 SUBDIVISION  YEAR   JAN    FEB   MAR    APR    MAY    JUN  \
0  ANDAMAN & NICOBAR ISLANDS  1901  49.2   87.1  29.2    2.3  528.8  517.5   
1  ANDAMAN & NICOBAR ISLANDS  1902   0.0  159.8  12.2    0.0  446.1  537.1   
2  ANDAMAN & NICOBAR ISLANDS  1903  12.7  144.0   0.0    1.0  235.1  479.9   
3  ANDAMAN & NICOBAR ISLANDS  1904   9.4   14.7   0.0  202.4  304.5  495.1   
4  ANDAMAN & NICOBAR ISLANDS  1905   1.3    0.0   3.3   26.9  279.5  628.7   

     JUL    AUG    SEP    OCT    NOV    DEC  ANNUAL  Jan-Feb  Mar-May  \
0  365.1  481.1  332.6  388.5  558.2   33.6  3373.2    136.3    560.3   
1  228.9  753.7  666.2  197.2  359.0  160.5  3520.7    159.8    458.3   
2  728.4  326.7  339.0  181.2  284.4  225.0  2957.4    156.7    236.1   
3  502.0  160.1  820.4  222.2  308.7   40.1  3079.6     24.1    506.9   
4  368.7  330.5  297.0  260.7   25.4  344.7  2566.7      1.3    309.7   

   Jun-Sep  Oct-Dec  
0   1696.3    980.3  
1   2185.9    716.7  
2   1874.0    690.6  
3   
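Before modeling, it may be worth checking that the ANNUAL column agrees with the seasonal totals printed above and that no values are missing. The sketch below is only an illustration: it rebuilds the first three printed rows by hand under the hypothetical name `sample` (in the notebook, `df` already holds the full table), and the 0.1 mm tolerance is an assumed value chosen to absorb rounding.

```python
import pandas as pd

# First three rows of the printed output above (seasonal columns only).
# `sample` is a hypothetical stand-in for the full `df` loaded in the notebook.
sample = pd.DataFrame({
    "YEAR": [1901, 1902, 1903],
    "ANNUAL": [3373.2, 3520.7, 2957.4],
    "Jan-Feb": [136.3, 159.8, 156.7],
    "Mar-May": [560.3, 458.3, 236.1],
    "Jun-Sep": [1696.3, 2185.9, 1874.0],
    "Oct-Dec": [980.3, 716.7, 690.6],
})

seasons = ["Jan-Feb", "Mar-May", "Jun-Sep", "Oct-Dec"]

# Absolute difference between the reported annual total and the seasonal sum.
gap = (sample["ANNUAL"] - sample[seasons].sum(axis=1)).abs()

# Assumed tolerance of 0.1 mm to absorb rounding in the source data.
mismatched = sample[gap > 0.1]

# Count of missing values in each column.
missing_per_column = sample.isna().sum()

print("Rows where ANNUAL differs from the seasonal sum:", len(mismatched))
print(missing_per_column)
```

Running the same two checks on the full `df` would show whether any rows need attention before the missing values are dropped.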

In [8]:
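# Keep only the year and annual rainfall columns.
# Rows from every subdivision are retained, so a YEAR can appear more than once.
data = df[["YEAR", "ANNUAL"]]

# Drop rows with missing values, if any
data = data.dropna()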

print(data.head())


   YEAR  ANNUAL
0  1901  3373.2
1  1902  3520.7
2  1903  2957.4
3  1904  3079.6
4  1905  2566.7
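One possible next step, given the imports at the top of the notebook, is to fit a linear trend of annual rainfall against year. The sketch below is a minimal illustration, not necessarily the approach this project takes. It uses only the five rows printed above, and it assumes that averaging all subdivisions per year is an acceptable way to get one value per year. That is a simple unweighted mean, not an area-weighted national figure. The names `yearly` and `slope_mm_per_year` are hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# The five rows printed above (Andaman & Nicobar Islands, 1901-1905).
# In the notebook, `data` would already hold the full table.
data = pd.DataFrame({
    "YEAR": [1901, 1902, 1903, 1904, 1905],
    "ANNUAL": [3373.2, 3520.7, 2957.4, 3079.6, 2566.7],
})

# Assumption: collapse to one value per year by averaging across subdivisions.
# With the full dataset each YEAR appears once per subdivision.
yearly = data.groupby("YEAR", as_index=False)["ANNUAL"].mean()

X = yearly[["YEAR"]]   # 2-D feature matrix, as scikit-learn expects
y = yearly["ANNUAL"]   # target: annual rainfall

# Ordinary least-squares line through the yearly values.
model = LinearRegression().fit(X, y)
slope_mm_per_year = float(model.coef_[0])

print(f"Fitted trend: {slope_mm_per_year:.1f} per year (same units as ANNUAL)")
```

Five points from a single subdivision are far too few to support any conclusion. The snippet only shows the mechanics, and on the full table a held-out split with `train_test_split` and metrics such as `r2_score` would give a fairer picture of fit quality.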
