**DESCRIPTION**

This EDA project analyzes historical budget allocation and expenditure data from the Sarva/Samagra Shiksha Abhiyan schemes to identify patterns and trends. The analysis is then used to develop a predictive model that estimates the budget allocation for the next year.

**ABBREVIATIONS USED**

1. SSA - SARVA SHIKSHA ABHIYAN

2. SMSA - SAMAGRA SHIKSHA ABHIYAN

In [1]:
#Import required libraries
import pandas as pd

In [42]:
#IMPORTING DATASET 
#Data source - https://openbudgetsindia.org/dataset/sarva-shiksha-abhiyan-ssa-2015-16-to-2017-18
ssa_df = pd.read_csv("https://raw.githubusercontent.com/Jatansahu/EDA-SHARVA-SHIKSHA-ABHIYAN/main/ssacsv.csv?token=GHSAT0AAAAAAB5J5GGWR277QD5INAQZ4MY4ZCJL4YA")
smsa_df = pd.read_csv("https://raw.githubusercontent.com/Jatansahu/EDA-SHARVA-SHIKSHA-ABHIYAN/main/smsa.csv?token=GHSAT0AAAAAAB5J5GGWUZVBPBXPWCNRRKQKZCJL5ZQ")

In [43]:
# Work on copies so the raw SSA and SMSA frames stay untouched
ssa = ssa_df.copy()
smsa = smsa_df.copy()
ssa

Unnamed: 0,State,State_UT_Code,Financial Year,Budget Approved,Funds Released by the Government of India,Funds Released by the States/UTs,Total Funds Released (Government of India and States' Share),Expenditure Incurred by the States/UTs,Unspent Balance,Extent of Funds Released against Budget Approved,...,Unnamed: 16,Unnamed: 17,Unnamed: 18,Unnamed: 19,Unnamed: 20,Unnamed: 21,Unnamed: 22,Unnamed: 23,Unnamed: 24,Unnamed: 25
0,Andhra Pradesh,1.0,2015-2016,2116.062,723.748,447.030,1170.778,1610.515,297.883,55.328,...,,,,,,,,,,
1,Arunachal Pradesh,2.0,2015-2016,358.645,181.794,33.109,214.904,292.713,1.080,59.921,...,,,,,,,,,,
2,Assam,3.0,2015-2016,1682.157,1107.840,109.630,1217.470,1165.272,33.243,72.376,...,,,,,,,,,,
3,Bihar,4.0,2015-2016,7387.148,2515.573,2891.506,5407.079,5762.259,-4.126,73.196,...,,,,,,,,,,
4,Chhattisgarh,5.0,2015-2016,2149.343,622.197,,,1477.519,36.023,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
994,,,,,,,,,,,...,,,,,,,,,,
995,,,,,,,,,,,...,,,,,,,,,,
996,,,,,,,,,,,...,,,,,,,,,,
997,,,,,,,,,,,...,,,,,,,,,,


#PART A -> DATA CLEANING (INITIAL PHASE)

In [44]:
#Checking null values
ssa.isna().sum()

State                                                           888
State_UT_Code                                                   888
Financial Year                                                  888
Budget Approved                                                 891
Funds Released by the Government of India                       891
Funds Released by the States/UTs                                894
Total Funds Released (Government of India and States' Share)    894
Expenditure Incurred by the States/UTs                          891
Unspent Balance                                                 891
Extent of Funds Released against Budget Approved                894
Extent of Funds Utilised against Budget Approved                891
Unnamed: 11                                                     999
Unnamed: 12                                                     999
Unnamed: 13                                                     999
Unnamed: 14                                     

In [45]:
#Shape of the dataset
ssa.shape

(999, 26)
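With 999 rows, the null counts above are easier to compare against a percentage cutoff when expressed as a share of the column: 888/999 is about 88.9%, 894/999 is about 89.5%, and 999/999 is 100%. A minimal sketch of that conversion on a toy frame follows; the frame `toy` and the name `null_pct` are illustrative and not part of the notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the padded CSV: 10 rows, most of them empty.
toy = pd.DataFrame({
    "State": ["Goa", "Bihar"] + [np.nan] * 8,
    "Budget Approved": [24.238] + [np.nan] * 9,
    "Unnamed: 11": [np.nan] * 10,
})

# isna().mean() is the fraction of missing values per column;
# multiplying by 100 turns it into a percentage.
null_pct = toy.isna().mean().mul(100).round(1)
print(null_pct)
```

On the real data the same expression would be `ssa.isna().mean().mul(100)`.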

In [46]:
# Drop columns in which 90% or more of the values are NaN
perc = 90.0
min_count = int(((100 - perc) / 100) * ssa.shape[0] + 1)

# axis=1: operate on columns.
# thresh=min_count: keep only columns with at least min_count non-NaN values.
ssa = ssa.dropna(axis=1, thresh=min_count)
ssa.shape

(999, 11)
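For the 999-row frame, `min_count` works out to `int(0.1 * 999 + 1) = 100`, so the columns with 105 to 111 non-NaN values survive and the fully empty `Unnamed` columns are dropped. The boundary behaviour is easier to see on a small example; the sketch below uses a made-up 10-row frame (the column names are illustrative) with the same formula.

```python
import numpy as np
import pandas as pd

# Made-up frame with 10 rows; the column names describe the expected outcome.
toy = pd.DataFrame({
    "kept_80pct_nan": [1.0, 2.0] + [np.nan] * 8,
    "dropped_90pct_nan": [1.0] + [np.nan] * 9,
    "dropped_all_nan": [np.nan] * 10,
})

perc = 90.0
# Same formula as the notebook cell: 10% of 10 rows, plus 1, gives 2.
min_count = int(((100 - perc) / 100) * toy.shape[0] + 1)

# Keep only columns that have at least min_count non-NaN values.
trimmed = toy.dropna(axis=1, thresh=min_count)
print(min_count, list(trimmed.columns))  # 2 ['kept_80pct_nan']
```

A column that is exactly 90% NaN has one non-NaN value here, which is below the threshold of 2, so it is removed along with the empty column.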

In [47]:
# Drop rows in which 90% or more of the values are NaN
perc = 90
min_count = int(((100 - perc) / 100) * ssa.shape[1] + 1)

# axis=0: operate on rows.
# thresh=min_count: keep only rows with at least min_count non-NaN values.
ssa = ssa.dropna(axis=0, thresh=min_count)
ssa.shape

(111, 11)
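The column and row cells above apply the same rule along different axes, so they could be wrapped in one function. The helper below is a hypothetical refactoring, not something the notebook defines; `drop_sparse` and the toy frame are names chosen for illustration.

```python
import numpy as np
import pandas as pd


def drop_sparse(df: pd.DataFrame, perc: float = 90.0, axis: int = 0) -> pd.DataFrame:
    """Hypothetical helper: drop rows (axis=0) or columns (axis=1)
    in which roughly `perc` percent or more of the values are NaN."""
    # Each row holds shape[1] entries; each column holds shape[0] entries.
    size = df.shape[1] if axis == 0 else df.shape[0]
    min_count = int(((100 - perc) / 100) * size + 1)
    # Keep only rows/columns with at least min_count non-NaN values.
    return df.dropna(axis=axis, thresh=min_count)


toy = pd.DataFrame({
    "a": [1.0, np.nan, np.nan],
    "b": [2.0, 3.0, np.nan],
    "c": [np.nan, np.nan, np.nan],
})

# Columns first, then rows, in the same order as the notebook.
cleaned = drop_sparse(drop_sparse(toy, axis=1), axis=0)
print(cleaned.shape)
```

Running columns before rows matters: once the empty columns are gone, each row's threshold is computed from the remaining column count.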

In [48]:
# Display the first 10 rows
ssa.head(10)

Unnamed: 0,State,State_UT_Code,Financial Year,Budget Approved,Funds Released by the Government of India,Funds Released by the States/UTs,Total Funds Released (Government of India and States' Share),Expenditure Incurred by the States/UTs,Unspent Balance,Extent of Funds Released against Budget Approved,Extent of Funds Utilised against Budget Approved
0,Andhra Pradesh,1.0,2015-2016,2116.062,723.748,447.03,1170.778,1610.515,297.883,55.328,76.11
1,Arunachal Pradesh,2.0,2015-2016,358.645,181.794,33.109,214.904,292.713,1.08,59.921,81.62
2,Assam,3.0,2015-2016,1682.157,1107.84,109.63,1217.47,1165.272,33.243,72.376,69.27
3,Bihar,4.0,2015-2016,7387.148,2515.573,2891.506,5407.079,5762.259,-4.126,73.196,78.0
4,Chhattisgarh,5.0,2015-2016,2149.343,622.197,,,1477.519,36.023,,68.74
5,Goa,6.0,2015-2016,24.238,8.371,3.863,12.234,15.858,0.034,50.475,65.43
6,Gujarat,7.0,2015-2016,1973.598,615.638,405.792,1021.43,1824.934,-337.576,51.755,92.47
7,Haryana,8.0,2015-2016,1120.583,345.012,,,529.163,99.413,,47.22
8,Himachal Pradesh,9.0,2015-2016,345.338,121.553,46.02,167.573,325.267,-115.782,48.524,94.19
9,Jharkhand,10.0,2015-2016,1649.303,558.633,477.17,1035.803,1355.91,118.182,62.802,82.21
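The rows shown above suggest that three columns are derived from the others: the total funds released looks like the sum of the central and state releases, and the two "Extent" columns look like percentages of the approved budget. This is an inference from the printed values, not something the data source states. A quick check on three rows copied from the output, with shortened column names of my own:

```python
import numpy as np
import pandas as pd

# Three rows copied from the head(10) output; short column names are illustrative.
rows = pd.DataFrame({
    "state": ["Andhra Pradesh", "Bihar", "Goa"],
    "budget": [2116.062, 7387.148, 24.238],
    "goi": [723.748, 2515.573, 8.371],
    "states": [447.030, 2891.506, 3.863],
    "total": [1170.778, 5407.079, 12.234],
    "expenditure": [1610.515, 5762.259, 15.858],
    "released_pct": [55.328, 73.196, 50.475],
    "utilised_pct": [76.11, 78.0, 65.43],
})

# Assumed relationships, compared with a small tolerance for rounding.
total_ok = np.isclose(rows["goi"] + rows["states"], rows["total"], atol=0.01)
released_ok = np.isclose(rows["total"] / rows["budget"] * 100, rows["released_pct"], atol=0.01)
utilised_ok = np.isclose(rows["expenditure"] / rows["budget"] * 100, rows["utilised_pct"], atol=0.01)

print(total_ok.all(), released_ok.all(), utilised_ok.all())
```

If the relationships hold across the full frame, the gaps in rows such as Chhattisgarh and Haryana are limited to the state release and the two values that depend on it.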


#PART B -> UNDERSTANDING DATA