# Basic statistics and insights for the SPI score


## Scope
- [x] Check for null values.
- [x] Check basic stats such as mean, median and mode for all the columns.
- [x] Assign continents to the countries.
- [x] Save the new csv.

## Summary
- The `Top 5` countries with the highest SPI score are all from `Europe`.
- The `Bottom 5` countries with the lowest SPI score are all from `Africa`.
- `67.4` is the average SPI score among the `168` countries.
- `32.5` is the minimum SPI score, held by `South Sudan`.
- `92.6` is the maximum SPI score, held by `Norway`.
- `89` out of `168` countries have an SPI score above the average, which is almost 53%.

## Imports

In [1]:
# Import libraries
import pandas as pd
import altair as alt

In [2]:
# Read csv
df = pd.read_csv('C:/Users/Tanish/Desktop/SPI-Analysis-Project/Data/spi.csv')

In [3]:
# Preview
df.head()

,spi_rank,country,spi_score,basic_human_needs,wellbeing,opportunity,basic_nutri_med_care,water_sanitation,shelter,personal_safety,access_basic_knowledge,access_info_comm,health_wellness,env_quality,personal_rights,personal_freedom_choice,inclusiveness,access_adv_edu
0,1,Norway,92.63,95.29,93.3,89.3,98.81,98.33,93.75,90.29,98.66,95.8,89.32,89.44,96.34,91.16,83.77,85.92
1,2,Finland,92.26,95.62,93.09,88.07,98.99,99.26,96.48,87.75,96.32,95.14,85.73,95.15,96.13,88.1,82.81,85.23
2,3,Denmark,92.15,95.3,92.74,88.41,98.62,98.21,94.92,89.46,97.44,98.18,85.15,90.2,97.08,90.03,81.64,84.89
3,4,Iceland,91.78,96.66,93.65,85.04,98.99,98.82,93.16,95.66,99.51,93.12,91.02,90.93,95.14,88.01,77.63,79.39
4,5,Switzerland,91.78,95.25,93.8,86.28,98.72,98.96,92.97,90.35,98.6,95.07,91.5,90.05,96.69,90.65,74.81,82.99


The head also shows the five countries with the highest SPI score:
1. Norway
2. Finland
3. Denmark
4. Iceland
5. Switzerland

- All of the above countries are in Europe.

In [4]:
# Shape
df.shape

(169, 18)

In [5]:
# Check for col names
df.columns

Index(['spi_rank', 'country', 'spi_score', 'basic_human_needs', 'wellbeing',
       'opportunity', 'basic_nutri_med_care', 'water_sanitation', 'shelter',
       'personal_safety', 'access_basic_knowledge', 'access_info_comm',
       'health_wellness', 'env_quality', 'personal_rights',
       'personal_freedom_choice', 'inclusiveness', 'access_adv_edu'],
      dtype='object')

In [6]:
# Check for tail
df.tail(6)

,spi_rank,country,spi_score,basic_human_needs,wellbeing,opportunity,basic_nutri_med_care,water_sanitation,shelter,personal_safety,access_basic_knowledge,access_info_comm,health_wellness,env_quality,personal_rights,personal_freedom_choice,inclusiveness,access_adv_edu
163,164,Somalia,35.62,40.21,38.41,28.22,55.75,32.42,35.52,37.17,25.5,33.4,29.57,65.16,23.8,31.9,27.4,29.8
164,165,Eritrea,35.33,44.94,35.95,25.1,57.92,27.91,50.27,43.67,40.18,6.81,41.68,55.12,14.88,37.86,24.82,22.84
165,166,Chad,34.6,35.65,36.26,31.87,47.24,21.48,33.0,40.9,23.14,24.31,41.47,56.13,52.04,28.66,22.03,24.76
166,167,Central African Republic,33.53,29.91,34.83,35.84,36.42,26.95,26.79,29.46,34.81,22.57,24.6,57.35,52.39,26.67,37.87,26.43
167,168,South Sudan,32.5,39.96,34.17,23.37,59.29,24.43,33.28,42.84,27.18,9.16,37.14,63.22,27.4,32.5,13.42,20.17
168,169,World,65.05,74.18,64.42,56.54,84.92,69.99,80.63,61.2,72.03,70.22,60.18,55.27,60.16,62.22,42.22,61.58


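The last row, `World`, is an aggregate rather than a country, so it is left out here. The five countries with the lowest SPI score, starting from the lowest, are:
1. South Sudan
2. Central African Republic
3. Chad
4. Eritrea
5. Somalia

- All of the above countries are in Africa.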
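Reading the rankings off `head()` and `tail()` relies on the file already being sorted by rank. As a minimal sketch, assuming the same `country` and `spi_score` columns, the top and bottom entries can also be selected explicitly with `nlargest` and `nsmallest`. The `sample` frame below is a small illustrative subset of the rows shown above, not the full dataset.

```python
import pandas as pd

# Illustrative subset of rows taken from the head/tail previews
sample = pd.DataFrame({
    'country': ['Norway', 'Finland', 'Chad', 'South Sudan', 'World'],
    'spi_score': [92.63, 92.26, 34.60, 32.50, 65.05],
})

# Exclude the aggregate 'World' row so it cannot appear in either ranking
countries_only = sample[sample['country'] != 'World']

# Select by value instead of relying on row order
top = countries_only.nlargest(2, 'spi_score')
bottom = countries_only.nsmallest(2, 'spi_score')

print(top['country'].tolist())
print(bottom['country'].tolist())
```

On the full DataFrame the same calls with `5` instead of `2` should reproduce the two lists above.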

In [7]:
# Check for null values
df.isnull().sum()

spi_rank                   0
country                    0
spi_score                  0
basic_human_needs          0
wellbeing                  0
opportunity                0
basic_nutri_med_care       0
water_sanitation           0
shelter                    0
personal_safety            0
access_basic_knowledge     0
access_info_comm           0
health_wellness            0
env_quality                0
personal_rights            0
personal_freedom_choice    0
inclusiveness              0
access_adv_edu             0
dtype: int64

In [8]:
# Use describe for basic stats
df.describe()

,spi_rank,spi_score,basic_human_needs,wellbeing,opportunity,basic_nutri_med_care,water_sanitation,shelter,personal_safety,access_basic_knowledge,access_info_comm,health_wellness,env_quality,personal_rights,personal_freedom_choice,inclusiveness,access_adv_edu
count,169.0,169.0,169.0,169.0,169.0,169.0,169.0,169.0,169.0,169.0,169.0,169.0,169.0,169.0,169.0,169.0,169.0
mean,85.0,67.433136,76.142959,67.774379,58.381657,84.705976,76.12284,77.088166,66.656509,74.758698,66.822367,62.325562,67.189704,69.627811,62.908343,46.80284,54.188166
std,48.930222,15.01215,16.252248,15.397385,15.805868,14.41404,23.408526,18.811647,14.404784,19.46411,20.382707,16.034389,14.34008,21.535655,15.078164,17.008499,18.564111
min,1.0,32.5,29.91,34.17,23.37,36.42,14.8,26.79,29.46,23.14,6.81,21.03,23.95,14.88,26.67,4.26,19.7
25%,43.0,55.17,62.65,55.48,47.9,72.42,57.06,64.57,55.81,61.56,52.11,49.53,58.29,54.01,52.67,34.3,36.23
50%,85.0,68.09,82.46,67.35,56.44,91.33,86.15,87.3,67.21,79.08,70.28,62.37,67.28,71.2,62.42,47.24,54.32
75%,127.0,78.81,88.7,79.2,69.48,96.72,96.75,90.62,76.34,91.22,82.75,73.33,77.54,88.66,73.79,58.15,68.47
max,169.0,92.63,96.85,93.8,89.3,98.99,99.27,96.87,96.18,99.51,98.18,92.1,95.15,97.91,91.16,83.77,89.6
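`describe()` reports the mean and the median (the `50%` row) but not the mode, which the scope also lists. A minimal sketch of one way to get all three, using a small illustrative frame built from values shown in the previews above (the `sample` name and the choice of columns are assumptions for the example):

```python
import pandas as pd

# Illustrative values for Norway, Iceland, Switzerland, Somalia and South Sudan
sample = pd.DataFrame({
    'spi_score': [92.63, 91.78, 91.78, 35.62, 32.50],
    'wellbeing': [93.30, 93.65, 93.80, 38.41, 34.17],
})

# Mean and median per column in one table
summary = sample.agg(['mean', 'median'])

# mode() returns one row per modal value; columns with fewer modes are padded with NaN
modes = sample.mode()

print(summary)
print(modes)
```

For continuous scores like these, most values are unique, so the mode is often not very informative; a column with no repeated value simply returns every value as a mode.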


In [9]:
# Check that every country name is unique
df.country.nunique()

169

In [10]:
# Get the list of all the countries from the df
list(df['country'])

['Norway',
 'Finland',
 'Denmark',
 'Iceland',
 'Switzerland',
 'Canada',
 'Sweden',
 'Netherlands',
 'Japan',
 'Germany',
 'Australia',
 'New Zealand',
 'Ireland',
 'Austria',
 'Luxembourg',
 'Belgium',
 'Korea. Republic of',
 'United Kingdom',
 'France',
 'Spain',
 'Estonia',
 'Czechia',
 'Italy',
 'United States',
 'Portugal',
 'Slovenia',
 'Lithuania',
 'Malta',
 'Cyprus',
 'Singapore',
 'Greece',
 'Israel',
 'Slovakia',
 'Latvia',
 'Poland',
 'Croatia',
 'Chile',
 'Costa Rica',
 'Uruguay',
 'Barbados',
 'Argentina',
 'Hungary',
 'Bulgaria',
 'Romania',
 'Mauritius',
 'Trinidad and Tobago',
 'Serbia',
 'Ukraine',
 'Georgia',
 'Kuwait',
 'Malaysia',
 'Panama',
 'Jamaica',
 'Belarus',
 'Armenia',
 'Albania',
 'Montenegro',
 'Tunisia',
 'Ecuador',
 'Moldova',
 'Peru',
 'Russia',
 'Republic of North Macedonia',
 'Kazakhstan',
 'Brazil',
 'Bosnia and Herzegovina',
 'United Arab Emirates',
 'Mexico',
 'Paraguay',
 'Colombia',
 'Thailand',
 'Sri Lanka',
 'Dominican Republic',
 'Maldives',

In [11]:
# Make a dictionary assigning each country to its continent
country_to_continent = {
    'Norway': 'Europe',
    'Finland': 'Europe',
    'Denmark': 'Europe',
    'Iceland': 'Europe',
    'Switzerland': 'Europe',
    'Canada': 'North America',
    'Sweden': 'Europe',
    'Netherlands': 'Europe',
    'Japan': 'Asia',
    'Germany': 'Europe',
    'Australia': 'Oceania',
    'New Zealand': 'Oceania',
    'Ireland': 'Europe',
    'Austria': 'Europe',
    'Luxembourg': 'Europe',
    'Belgium': 'Europe',
    'Korea. Republic of': 'Asia',
    'United Kingdom': 'Europe',
    'France': 'Europe',
    'Spain': 'Europe',
    'Estonia': 'Europe',
    'Czechia': 'Europe',
    'Italy': 'Europe',
    'United States': 'North America',
    'Portugal': 'Europe',
    'Slovenia': 'Europe',
    'Lithuania': 'Europe',
    'Malta': 'Europe',
    'Cyprus': 'Asia',
    'Singapore': 'Asia',
    'Greece': 'Europe',
    'Israel': 'Asia',
    'Slovakia': 'Europe',
    'Latvia': 'Europe',
    'Poland': 'Europe',
    'Croatia': 'Europe',
    'Chile': 'South America',
    'Costa Rica': 'North America',
    'Uruguay': 'South America',
    'Barbados': 'North America',
    'Argentina': 'South America',
    'Hungary': 'Europe',
    'Bulgaria': 'Europe',
    'Romania': 'Europe',
    'Mauritius': 'Africa',
    'Trinidad and Tobago': 'North America',
    'Serbia': 'Europe',
    'Ukraine': 'Europe',
    'Georgia': 'Asia',
    'Kuwait': 'Asia',
    'Malaysia': 'Asia',
    'Panama': 'North America',
    'Jamaica': 'North America',
    'Belarus': 'Europe',
    'Armenia': 'Asia',
    'Albania': 'Europe',
    'Montenegro': 'Europe',
    'Tunisia': 'Africa',
    'Ecuador': 'South America',
    'Moldova': 'Europe',
    'Peru': 'South America',
    'Russia': 'Europe',
    'Republic of North Macedonia': 'Europe',
    'Kazakhstan': 'Asia',
    'Brazil': 'South America',
    'Bosnia and Herzegovina': 'Europe',
    'United Arab Emirates': 'Asia',
    'Mexico': 'North America',
    'Paraguay': 'South America',
    'Colombia': 'South America',
    'Thailand': 'Asia',
    'Sri Lanka': 'Asia',
    'Dominican Republic': 'North America',
    'Maldives': 'Asia',
    'Suriname': 'South America',
    'Cuba': 'North America',
    'Cabo Verde': 'Africa',
    'Vietnam': 'Asia',
    'Mongolia': 'Asia',
    'South Africa': 'Africa',
    'Fiji': 'Oceania',
    'Kyrgyzstan': 'Asia',
    'Jordan': 'Asia',
    'Bhutan': 'Asia',
    'Oman': 'Asia',
    'Qatar': 'Asia',
    'Turkey': 'Asia',
    'Lebanon': 'Asia',
    'Bolivia': 'South America',
    'Algeria': 'Africa',
    'Botswana': 'Africa',
    'West Bank and Gaza': 'Asia',
    'Guyana': 'South America',
    'Indonesia': 'Asia',
    'Uzbekistan': 'Asia',
    'Bahrain': 'Asia',
    'Philippines': 'Asia',
    'Iran': 'Asia',
    'Ghana': 'Africa',
    'China': 'Asia',
    'Morocco': 'Africa',
    'Sao Tome and Principe': 'Africa',
    'El Salvador': 'North America',
    'Gabon': 'Africa',
    'Saudi Arabia': 'Asia',
    'Namibia': 'Africa',
    'Azerbaijan': 'Asia',
    'Nicaragua': 'North America',
    'Egypt': 'Africa',
    'Honduras': 'North America',
    'Senegal': 'Africa',
    'Guatemala': 'North America',
    'Nepal': 'Asia',
    'Timor-Leste': 'Asia',
    'India': 'Asia',
    'Kenya': 'Africa',
    'Myanmar': 'Asia',
    'Iraq': 'Asia',
    'Libya': 'Africa',
    'Turkmenistan': 'Asia',
    'Bangladesh': 'Asia',
    'Gambia. The': 'Africa',
    'Tajikistan': 'Asia',
    'Malawi': 'Africa',
    'Benin': 'Africa',
    'Tanzania': 'Africa',
    'Comoros': 'Africa',
    'Cambodia': 'Asia',
    'Solomon Islands': 'Oceania',
    'Lesotho': 'Africa',
    "Côte d'Ivoire": 'Africa',
    'Syria': 'Asia',
    'Togo': 'Africa',
    'Zimbabwe': 'Africa',
    'Zambia': 'Africa',
    'Sierra Leone': 'Africa',
    'Rwanda': 'Africa',
    'Nigeria': 'Africa',
    'Cameroon': 'Africa',
    'Uganda': 'Africa',
    'Eswatini': 'Africa',
    'Liberia': 'Africa',
    'Pakistan': 'Asia',
    'Burkina Faso': 'Africa',
    'Laos': 'Asia',
    'Djibouti': 'Africa',
    'Congo. Republic of': 'Africa',
    'Ethiopia': 'Africa',
    'Madagascar': 'Africa',
    'Mozambique': 'Africa',
    'Mali': 'Africa',
    'Mauritania': 'Africa',
    'Angola': 'Africa',
    'Equatorial Guinea': 'Africa',
    'Sudan': 'Africa',
    'Papua New Guinea': 'Oceania',
    'Haiti': 'North America',
    'Guinea-Bissau': 'Africa',
    'Guinea': 'Africa',
    'Burundi': 'Africa',
    'Congo. Democratic Republic of': 'Africa',
    'Niger': 'Africa',
    'Yemen': 'Asia',
    'Somalia': 'Africa',
    'Eritrea': 'Africa',
    'Chad': 'Africa',
    'Central African Republic': 'Africa',
    'South Sudan': 'Africa',
    'World': 'World',
}


In [12]:
# Map the dictionary to add a new column
df['continent'] = df['country'].map(country_to_continent)
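`Series.map` with a dictionary silently returns `NaN` for any key that is missing, so a misspelled or unusual name (the dataset uses forms such as `'Korea. Republic of'`) would go unnoticed. A minimal sketch of a check for unmapped countries follows; `'Atlantis'` is a made-up entry used only to trigger the case.

```python
import pandas as pd

# Small illustrative frame; 'Atlantis' is hypothetical and deliberately unmapped
sample = pd.DataFrame({'country': ['Norway', 'Japan', 'Atlantis']})
mapping = {'Norway': 'Europe', 'Japan': 'Asia'}

sample['continent'] = sample['country'].map(mapping)

# Countries whose name was not found in the dictionary end up as NaN
unmapped = sample.loc[sample['continent'].isna(), 'country'].tolist()
print(unmapped)
```

An empty list on the real DataFrame would confirm that every country received a continent.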

In [13]:
# Preview
df[['country','continent']].head()

,country,continent
0,Norway,Europe
1,Finland,Europe
2,Denmark,Europe
3,Iceland,Europe
4,Switzerland,Europe


In [14]:
# Check for unique continents
df.continent.unique()

array(['Europe', 'North America', 'Asia', 'Oceania', 'South America',
       'Africa', 'World'], dtype=object)

The last row in the dataset is the `World` aggregate, which is why `World` appears among the continents. **Antarctica** is not on the list because the dataset contains no countries there.

In [15]:
# Remove the last row (the `World` aggregate)
df.drop(df.index[-1], inplace=True)

In [16]:
# Validate the drop
df.shape

(168, 19)
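Dropping by position removes whatever row happens to be last, so running that cell a second time would remove a real country. A minimal sketch of a label-based alternative, assuming the aggregate row is always named `'World'` in the `country` column (the `sample` frame is illustrative):

```python
import pandas as pd

# Illustrative frame with the aggregate row at the end
sample = pd.DataFrame({
    'country': ['Norway', 'South Sudan', 'World'],
    'spi_score': [92.63, 32.50, 65.05],
})

# Filter on the label instead of the position; re-running this is harmless
countries_only = sample[sample['country'] != 'World'].reset_index(drop=True)

print(countries_only.shape)
```

Because the filter is idempotent, the result is the same no matter how many times the cell is executed.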

In [17]:
# Check for unique continents again
df.continent.unique()

array(['Europe', 'North America', 'Asia', 'Oceania', 'South America',
       'Africa'], dtype=object)

In [18]:
# Filter the countries with spi_score above the rounded average (67.4)
filtered_countries = df[df['spi_score'] > 67.4]

In [19]:
# Count the filtered DataFrame
filtered_countries.shape[0]

89
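The threshold above is typed in by hand. As a sketch of an alternative, the mean can be computed from the data and the share derived from the count; with the reported figures the share is 89 / 168 ≈ 0.53. The four scores below are illustrative values from the previews, so the printed numbers apply to the sample only.

```python
import pandas as pd

# Illustrative scores (Norway, Finland, Somalia, South Sudan)
scores = pd.Series([92.63, 92.26, 35.62, 32.50], name='spi_score')

# Compute the threshold instead of hard-coding it
mean_score = scores.mean()

# Count and share of entries strictly above the mean
above = int((scores > mean_score).sum())
share = above / len(scores)

print(round(mean_score, 2), above, share)
```

Note that a computed mean is not rounded, so on the full data the count could differ from the one obtained with `67.4` if any score falls between the two thresholds.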

In [20]:
# Save the DataFrame to a new CSV file
new_csv_path = r'C:/Users/Tanish/Desktop/SPI-Analysis-Project/Data/new_spi.csv'
df.to_csv(new_csv_path, index=False)

In [21]:
# Confirm that the DataFrame has been saved
print(f"DataFrame has been saved to {new_csv_path}")

DataFrame has been saved to C:/Users/Tanish/Desktop/SPI-Analysis-Project/Data/new_spi.csv
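The absolute `C:/Users/...` path ties the notebook to one machine. A minimal sketch of building the output path with `pathlib` and confirming the save by reading the file back, rather than only printing a message. A temporary directory stands in for the project's data folder here, and the `sample` frame is illustrative.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Illustrative frame standing in for the cleaned DataFrame
sample = pd.DataFrame({
    'country': ['Norway', 'South Sudan'],
    'spi_score': [92.63, 32.50],
})

# The temporary directory is a stand-in for the project's data folder
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / 'new_spi.csv'
    sample.to_csv(out_path, index=False)

    # Read the file back to confirm that it was written as expected
    reloaded = pd.read_csv(out_path)

print(reloaded.shape)
```

In the notebook itself, a path relative to the project root (for example `Path('Data') / 'new_spi.csv'`, assuming the notebook runs from that root) would serve the same purpose.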
