## About Dataset 
---
This dataset provides information on the population statistics of various countries for the years 2023 and 2024. It includes details such as the total area of each country, population density, growth rate, percentage of the world population, and world rank by population.

Data source: https://www.kaggle.com/datasets/dataanalyst001/world-population-by-country-2024

Questions I Ask of the Data
___
## Basic Population Insights

1. Find the country with the largest population in 2024.
2. Find the country with the smallest population in 2024.
3. Rank the top 10 most populated countries.
4. Rank the bottom 10 least populated countries.
5. Calculate the average population across all countries.
6. Check the total world population (sum of all countries).
7. Identify the median population (middle country when sorted).

Answers from the Data
---

Loading libraries & a quick look at the data

In [217]:
# load the libraries: pandas, numpy, matplotlib, seaborn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [218]:
# load data set worldpop
df = pd.read_csv('worldpop.csv')

In [219]:
# quick look into data
df.head()

Unnamed: 0,Rank,Country,Population (2024),Yearly Change,Net Change,Density (P/Km²),Land Area (Km²),Migrants (net),Fert. Rate,Med. Age,Urban Pop %
0,1,India,1450935791,0.0089,12866195,488,2973190,-630830,2.0,28,0.37
1,2,China,1419321278,-0.0023,-3263655,151,9388211,-318992,1.0,40,0.66
2,3,United States,345426571,0.0057,1949236,38,9147420,1286132,1.6,38,0.82
3,4,Indonesia,283487931,0.0082,2297864,156,1811570,-38469,2.1,30,0.59
4,5,Pakistan,251269164,0.0152,3764669,326,770880,-1401173,3.5,20,0.34


In [220]:
# quick analysis
df.describe()

Unnamed: 0,Rank,Population (2024),Yearly Change,Net Change,Fert. Rate,Med. Age
count,234.0,234.0,234.0,234.0,234.0,234.0
mean,117.5,34874070.0,0.009424,300229.1,2.332051,31.679487
std,67.694165,138347100.0,0.013671,1064043.0,1.163002,9.810427
min,1.0,496.0,-0.0504,-3263655.0,0.7,14.0
25%,59.25,478260.0,0.0001,75.25,1.5,23.0
50%,117.5,5615064.0,0.0086,18781.5,2.0,32.5
75%,175.75,23465080.0,0.01875,213915.5,2.975,40.0
max,234.0,1450936000.0,0.0507,12866200.0,6.0,59.0


## Basic Population Insights
---

In [221]:
# quick look into data
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 234 entries, 0 to 233
Data columns (total 11 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   Rank               234 non-null    int64  
 1   Country            234 non-null    object 
 2   Population (2024)  234 non-null    int64  
 3   Yearly Change      234 non-null    float64
 4   Net Change         234 non-null    int64  
 5   Density (P/Km²)    234 non-null    object 
 6   Land Area (Km²)    234 non-null    object 
 7   Migrants (net)     234 non-null    object 
 8   Fert. Rate         234 non-null    float64
 9   Med. Age           234 non-null    int64  
 10  Urban Pop %        234 non-null    object 
dtypes: float64(2), int64(4), object(5)
memory usage: 20.2+ KB


In [222]:
# 1. Find the country with the largest population in 2024.
# Select the row with the maximum population
most_populated = df.loc[df['Population (2024)'].idxmax()]

# Extract values
country = most_populated['Country']
population = most_populated['Population (2024)']

# Print sentence
print(f"The most populous country in the world is {country}, with a 2024 population of {population:,}.")

The most populous country in the world is India, with a 2024 population of 1,450,935,791.


In [223]:
# 2. Find the country with the smallest population in 2024.
# Select the row with the minimum population
least_populated = df.loc[df['Population (2024)'].idxmin()]

# Extract values
country = least_populated['Country']
population = least_populated['Population (2024)']

# Print sentence
print(f"The least populous country in the world is {country}, with a 2024 population of {population:,}.")

The least populous country in the world is Holy See, with a 2024 population of 496.


In [224]:
# 3. Rank the most populated countries.
# Note: only the top 5 are requested here; pass 10 to nlargest() for the full top 10.
df.nlargest(5,'Population (2024)')[['Country','Population (2024)']]

Unnamed: 0,Country,Population (2024)
0,India,1450935791
1,China,1419321278
2,United States,345426571
3,Indonesia,283487931
4,Pakistan,251269164


In [225]:
# 4. Rank the least populated countries.
# Note: only the bottom 5 are requested here; pass 10 to nsmallest() for the full bottom 10.
df.nsmallest(5,'Population (2024)')[['Country','Population (2024)']]

Unnamed: 0,Country,Population (2024)
233,Holy See,496
232,Niue,1819
231,Tokelau,2506
230,Falkland Islands,3470
229,Montserrat,4389
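Questions 3 and 4 ask for ten countries each. As a minimal sketch, a small helper can take the number of rows as a parameter so the same call serves both rankings. The helper name `rank_by_population` is hypothetical, and the sample rows are copied from the `df.head()` output; the notebook itself would pass the full `df`.

```python
import pandas as pd

# Sample rows copied from the df.head() output; the notebook itself would
# pass the full DataFrame loaded from worldpop.csv.
sample = pd.DataFrame({
    "Country": ["India", "China", "United States", "Indonesia", "Pakistan"],
    "Population (2024)": [1450935791, 1419321278, 345426571, 283487931, 251269164],
})

def rank_by_population(frame, n=10, largest=True):
    """Hypothetical helper: the n most (or least) populated rows, numbered from 1."""
    pick = frame.nlargest if largest else frame.nsmallest
    ranked = pick(n, "Population (2024)")[["Country", "Population (2024)"]]
    ranked = ranked.reset_index(drop=True)
    ranked.index = ranked.index + 1  # 1-based position in the ranking
    return ranked

top3 = rank_by_population(sample, n=3)
bottom2 = rank_by_population(sample, n=2, largest=False)
print(top3)
print(bottom2)
```

On the full dataset, `rank_by_population(df)` and `rank_by_population(df, largest=False)` would return ten rows each, since `n` defaults to 10.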


In [226]:
# 5. Calculate the average population across all countries.
aveg = df['Population (2024)'].mean()
print('The average population across all countries is',aveg)    

The average population across all countries is 34874074.45726496


In [227]:
# 6. Check the total world population (sum of all countries)
world_pop = df['Population (2024)'].sum()
print('The total world population is',world_pop)

The total world population is 8160533423


In [228]:
# convert the figure to billions
world_pop_billions = world_pop / 1_000_000_000
print('The total world population is', world_pop_billions, 'billion')

The total world population is 8.160533423 billion


In [229]:
# 7.Identify the median population (middle country when sorted).
median = df['Population (2024)'].median()
print('The median population is',median)

The median population is 5615063.5
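The `.5` in the median comes from the dataset having an even number of rows (234): with no single middle country, `median()` averages the two middle values. The sketch below reproduces that by hand on four population figures taken from the `df.head()` output; it illustrates the rule and does not recompute the full-dataset median.

```python
import pandas as pd

# Four population values from the df.head() output (an even count on purpose).
populations = pd.Series([1450935791, 1419321278, 345426571, 283487931])

# Sort, then average the two middle values, as median() does for even counts.
ordered = populations.sort_values().reset_index(drop=True)
mid = len(ordered) // 2
manual_median = (ordered.iloc[mid - 1] + ordered.iloc[mid]) / 2

print(manual_median)
print(populations.median())
```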


## Yearly Change & Growth
---
8. Find the country with the highest yearly growth (%).
9. Find the country with the lowest yearly growth (%) (could even be negative).
10. Compare absolute net change in population across countries.
11. Identify countries with population decline.
12. Calculate the average yearly growth (%) across all countries.

In [243]:
# 8. Find the country with the highest yearly growth (%).
highest_growth = df.loc[df['Yearly Change'].idxmax()]
country = highest_growth['Country']
growth = highest_growth['Yearly Change']
print(f"The country with the highest yearly growth is {country} with a growth rate of {growth*100:.2f}%")


The country with the highest yearly growth is Chad with a growth rate of 5.07%


In [231]:
# 9. Find the country with the lowest yearly growth (%).
lowest_growth = df.loc[df['Yearly Change'].idxmin()]
country = lowest_growth['Country']
growth = lowest_growth['Yearly Change']
print(f"The country with the lowest yearly growth is {country} with a growth rate of {growth*100:.2f}%")

The country with the lowest yearly growth is Saint Martin with a growth rate of -5.04%


In [232]:
# 10. Compare absolute net change in population across countries.
# idxmin() selects the most negative net change, i.e. the largest loss.
lowest_change = df.loc[df['Net Change'].idxmin()]
country = lowest_change['Country']
netchange = lowest_change['Net Change']
print(f"The country with the lowest (most negative) net change in population is {country} with a net change of {netchange:,}")

The country with the lowest (most negative) net change in population is China with a net change of -3,263,655
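Question 10 asks for a comparison of absolute net change, which a single `idxmin()` lookup does not give. One possible approach, sketched here on the five rows shown in the `df.head()` output, is to add a magnitude column with `abs()` and sort on it. The column name `Abs Net Change` is an assumption introduced for illustration.

```python
import pandas as pd

# Rows copied from the df.head() output; the notebook would use the full df.
sample = pd.DataFrame({
    "Country": ["India", "China", "United States", "Indonesia", "Pakistan"],
    "Net Change": [12866195, -3263655, 1949236, 2297864, 3764669],
})

# Magnitude of the change, ignoring whether it is growth or decline.
sample["Abs Net Change"] = sample["Net Change"].abs()
by_magnitude = sample.sort_values("Abs Net Change", ascending=False)

print(by_magnitude[["Country", "Net Change", "Abs Net Change"]])
```

Keeping the signed `Net Change` column next to the magnitude makes it easy to see which of the large movers are shrinking.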


In [233]:
# 11. Identify countries with population decline.
# Note: this returns only the single steepest decline; listing every
# declining country needs a filter such as df[df['Net Change'] < 0].
dec = df.loc[df['Net Change'].idxmin()]
country = dec['Country']
netchange = dec['Net Change']
print(f"The country with the largest population decline is {country} with a net change of {netchange:,}")

The country with the largest population decline is China with a net change of -3,263,655
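Question 11 asks for every country whose population is falling, not just the steepest case. A boolean filter on `Net Change` would list them all. The sketch below uses four rows whose values appear in the outputs of this notebook (India, China, Greenland, Holy See); on the full `df` the same filter would return all declining countries.

```python
import pandas as pd

# Rows assembled from values shown in this notebook's outputs.
sample = pd.DataFrame({
    "Country": ["India", "China", "Greenland", "Holy See"],
    "Net Change": [12866195, -3263655, -82, 0],
})

# Keep only rows with a negative net change, steepest decline first.
declining = sample[sample["Net Change"] < 0].sort_values("Net Change")

print(declining)
print("Number of declining countries in the sample:", len(declining))
```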


In [234]:
# 12. Calculate the average yearly growth (%) across all countries.
aveg = df['Yearly Change'].mean()
print(f"The average growth rate across all countries is {aveg*100:.2f}%")

The average growth rate across all countries is 0.94%


## Density & Land Area
---
13. Find the country with the highest population density (P/km²).
14. Find the country with the lowest population density.
15. Compare countries with large land areas but low density (e.g., Canada, Russia).
16. Compare countries with small land areas but high density (e.g., Singapore, Bangladesh).

In [None]:
# 13. Find the country with the highest population density (P/km²).
# Caution: df.info() reports this column as object (strings such as '1,240'),
# so idxmax() compares text rather than numbers. '99' wins because it sorts
# last as a string; it is not the true numeric maximum.
densest = df.loc[df['Density (P/Km²)'].idxmax()]
country = densest['Country']
density = densest['Density (P/Km²)']
print(f"The country with the highest population density is {country}, at {density} people per km².")

The country with the highest population density is Kenya, at 99 people per km².
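The density, land area, migrants and urban-share columns are listed as `object` in `df.info()`, so they hold text (for example `1,240` and `N.A.`) and compare alphabetically. A minimal sketch of one way to convert them before ranking follows. The helper name `to_number` and the `Density (num)` column are assumptions, and the four sample rows use density values that appear in this notebook's outputs, so the result applies to the sample only.

```python
import pandas as pd

def to_number(column):
    """Hypothetical helper: strip thousands separators, turn non-numbers into NaN."""
    cleaned = column.astype(str).str.replace(",", "", regex=False)
    return pd.to_numeric(cleaned, errors="coerce")

# Density strings as they appear in this notebook's outputs.
sample = pd.DataFrame({
    "Country": ["Egypt", "Kenya", "Holy See", "Greenland"],
    "Density (P/Km²)": ["117", "99", "1,240", "0"],
})

# Text comparison: '99' sorts last, which is why Kenya appears to "win".
string_max = max(sample["Density (P/Km²)"].tolist())

# Numeric comparison after cleaning.
sample["Density (num)"] = to_number(sample["Density (P/Km²)"])
densest = sample.loc[sample["Density (num)"].idxmax(), "Country"]

print("Largest as text:", string_max)
print("Densest in the sample after conversion:", densest)

# 'N.A.' (seen in the Urban Pop % column) becomes NaN instead of raising.
urban = to_number(pd.Series(["0.41", "N.A."]))
print(urban)
```

Applying the same helper to `Land Area (Km²)`, `Migrants (net)` and `Urban Pop %` on the full `df` would let `idxmax()`, `idxmin()` and `describe()` treat those columns as numbers.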


In [None]:
# 14. Find the country with the lowest population density.
# Note: the column is text, so idxmin() picks the string that sorts first ('0').
df.loc[df['Density (P/Km²)'].idxmin()]

Rank                       206
Country              Greenland
Population (2024)        55840
Yearly Change          -0.0015
Net Change                 -82
Density (P/Km²)              0
Land Area (Km²)        410,450
Migrants (net)            -284
Fert. Rate                 1.9
Med. Age                    35
Urban Pop %                0.9
Name: 205, dtype: object

In [237]:
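# 15. Compare countries with large land areas but low density (e.g., Canada, Russia).
# Caution: 'Land Area (Km²)' is also stored as text, so idxmax() returns the
# string that sorts last ('995,450'), not the largest area.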
df.loc[df['Land Area (Km²)'].idxmax()]

Rank                        13
Country                  Egypt
Population (2024)    116538258
Yearly Change           0.0175
Net Change             2002486
Density (P/Km²)            117
Land Area (Km²)        995,450
Migrants (net)         123,884
Fert. Rate                 2.7
Med. Age                    24
Urban Pop %               0.41
Name: 12, dtype: object

In [238]:
# 16. Compare countries with small land areas but high density (e.g., Singapore, Bangladesh).
# Note: this only looks up the smallest 'Land Area (Km²)' value, compared as text.
df.loc[df['Land Area (Km²)'].idxmin()]

Rank                      234
Country              Holy See
Population (2024)         496
Yearly Change             0.0
Net Change                  0
Density (P/Km²)         1,240
Land Area (Km²)             0
Migrants (net)             18
Fert. Rate                1.0
Med. Age                   59
Urban Pop %              N.A.
Name: 233, dtype: object
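Questions 15 and 16 ask for groups of countries rather than a single extreme row. One way to express that, assuming the two columns have already been converted to numbers, is a pair of boolean filters. The thresholds below (1,000,000 km², 50 and 300 people per km²) are arbitrary illustrative choices, and the rows are taken from values shown in this notebook's outputs.

```python
import pandas as pd

# Numeric values taken from the outputs in this notebook (already cleaned).
sample = pd.DataFrame({
    "Country": ["India", "China", "United States", "Indonesia",
                "Pakistan", "Egypt", "Greenland", "Holy See"],
    "Density (P/Km²)": [488, 151, 38, 156, 326, 117, 0, 1240],
    "Land Area (Km²)": [2973190, 9388211, 9147420, 1811570,
                        770880, 995450, 410450, 0],
})

# Illustrative thresholds only; pick values that suit the analysis.
LARGE_AREA = 1_000_000
LOW_DENSITY = 50
HIGH_DENSITY = 300

area = sample["Land Area (Km²)"]
density = sample["Density (P/Km²)"]

large_but_sparse = sample[(area >= LARGE_AREA) & (density < LOW_DENSITY)]
small_but_dense = sample[(area < LARGE_AREA) & (density > HIGH_DENSITY)]

print(large_but_sparse)
print(small_but_dense)
```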

## Fertility & Age Structure
---
17. Find the country with the highest fertility rate.
18. Find the country with the lowest fertility rate.
19. Compare fertility rates of developed vs. developing countries.
20. Find the country with the youngest median age.
21. Find the country with the oldest median age.
22. Check if fertility rate is linked with population growth.

In [239]:
# Find the country with the highest fertility rate.
Hfer = df.loc[df['Fert. Rate'].idxmax()]
Hfer[['Rank', 'Country', 'Fert. Rate']]

Rank                15
Country       DR Congo
Fert. Rate         6.0
Name: 14, dtype: object

In [240]:
# Find the country with the lowest fertility rate.
Lfer = df.loc[df['Fert. Rate'].idxmin()]
Lfer[['Rank', 'Country', 'Fert. Rate']]


Rank                   29
Country       South Korea
Fert. Rate            0.7
Name: 28, dtype: object

In [241]:
# Find the country with the youngest median age.
Youth = df.loc[df['Med. Age'].idxmin()]
Youth[['Rank', 'Country', 'Med. Age']]


Rank                             122
Country     Central African Republic
Med. Age                          14
Name: 121, dtype: object
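The last question in this section, whether fertility rate is linked with population growth, could be approached with a correlation between `Fert. Rate` and `Yearly Change`. The sketch below computes a Pearson correlation on the five rows shown in the `df.head()` output. Five rows are far too few to draw a conclusion, so this only shows the mechanics; the notebook would run the same call on the full `df`.

```python
import pandas as pd

# Rows copied from the df.head() output; not representative of all 234 rows.
sample = pd.DataFrame({
    "Country": ["India", "China", "United States", "Indonesia", "Pakistan"],
    "Fert. Rate": [2.0, 1.0, 1.6, 2.1, 3.5],
    "Yearly Change": [0.0089, -0.0023, 0.0057, 0.0082, 0.0152],
})

# Series.corr() uses the Pearson method by default.
corr = sample["Fert. Rate"].corr(sample["Yearly Change"])

print(f"Pearson correlation on the sample: {corr:.2f}")
```

A value near 1 suggests the two move together, a value near 0 suggests no linear link, and a scatter plot (for example with `sns.scatterplot`) would help check whether a few outliers drive the result.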