## About Dataset 
---
This dataset provides information on the population statistics of various countries for the years 2023 and 2024. It includes details such as the total area of each country, population density, growth rate, percentage of the world population, and world rank by population.

Data source: https://www.kaggle.com/datasets/dataanalyst001/world-population-by-country-2024

Questions I ask of the data
___
## Basic Population Insights

1. Find the country with the largest population in 2024.
2. Find the country with the smallest population in 2024.
3. Rank the top 10 most populated countries.
4. Rank the bottom 10 least populated countries.
5. Calculate the average population across all countries.
6. Check the total world population (sum of all countries).
7. Identify the median population (middle country when sorted).

Answers from the data
---

Loading libraries & a quick look at the data

In [111]:
# load all libraries: pandas, numpy, matplotlib, seaborn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [112]:
# load data set worldpop
df = pd.read_csv('worldpop.csv')

In [113]:
# quick look into data
df.head()

Unnamed: 0,Rank,Country,Population (2024),Yearly Change,Net Change,Density (P/Km²),Land Area (Km²),Migrants (net),Fert. Rate,Med. Age,Urban Pop %
0,1,India,1450935791,0.0089,12866195,488,2973190,-630830,2.0,28,0.37
1,2,China,1419321278,-0.0023,-3263655,151,9388211,-318992,1.0,40,0.66
2,3,United States,345426571,0.0057,1949236,38,9147420,1286132,1.6,38,0.82
3,4,Indonesia,283487931,0.0082,2297864,156,1811570,-38469,2.1,30,0.59
4,5,Pakistan,251269164,0.0152,3764669,326,770880,-1401173,3.5,20,0.34


In [114]:
# quick analysis
df.describe()

Unnamed: 0,Rank,Population (2024),Yearly Change,Net Change,Land Area (Km²),Migrants (net),Fert. Rate,Med. Age
count,231.0,231.0,231.0,231.0,231.0,231.0,231.0,231.0
mean,116.121212,35326730.0,0.009561,304128.5,561347.5,76.78788,2.342424,31.502165
std,67.017084,139188800.0,0.013705,1070404.0,1701208.0,180462.4,1.166124,9.680493
min,1.0,1819.0,-0.0504,-3263655.0,1.0,-1401173.0,0.7,14.0
25%,58.5,527361.0,0.0004,117.5,2710.0,-12237.5,1.5,23.0
50%,116.0,5805962.0,0.0088,24385.0,82200.0,-604.0,2.0,32.0
75%,173.5,24013690.0,0.01885,223595.0,403820.0,3498.5,3.0,40.0
max,233.0,1450936000.0,0.0507,12866200.0,16376870.0,1286132.0,6.0,54.0


## Basic Population Insights
---

In [115]:
# quick look into data
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 231 entries, 0 to 230
Data columns (total 11 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   Rank               231 non-null    int64  
 1   Country            231 non-null    object 
 2   Population (2024)  231 non-null    int64  
 3   Yearly Change      231 non-null    float64
 4   Net Change         231 non-null    int64  
 5   Density (P/Km²)    231 non-null    object 
 6   Land Area (Km²)    231 non-null    int64  
 7   Migrants (net)     231 non-null    int64  
 8   Fert. Rate         231 non-null    float64
 9   Med. Age           231 non-null    int64  
 10  Urban Pop %        231 non-null    object 
dtypes: float64(2), int64(6), object(3)
memory usage: 20.0+ KB
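The `df.info()` output reports `Density (P/Km²)` as `object`, meaning the values are stored as text rather than numbers (some entries carry a thousands separator, such as `1,275`). Comparisons on text are lexicographic, so the column needs converting before any max/min question. A minimal sketch on a toy frame, using three density strings that appear in this notebook's outputs; the `density_num` column name is an illustrative assumption, and the toy result says nothing about the ranking in the full dataset:

```python
import pandas as pd

# Toy frame: density strings copied from outputs shown in this notebook.
sample = pd.DataFrame({
    "Country": ["India", "Kenya", "Sint Maarten"],
    "Density (P/Km²)": ["488", "99", "1,275"],
})

# Text comparison: "99" beats "488" and "1,275" because '9' > '4' > '1'.
text_max = sample["Density (P/Km²)"].max()

# Strip the thousands separator, then parse; unparseable entries become NaN.
sample["density_num"] = pd.to_numeric(
    sample["Density (P/Km²)"].str.replace(",", "", regex=False),
    errors="coerce",
)
numeric_max = sample["density_num"].max()
densest = sample.loc[sample["density_num"].idxmax(), "Country"]

print("max as text:   ", text_max)
print("max as number: ", numeric_max, "->", densest)
```

`Urban Pop %` is also reported as `object`; the reason is not visible in the rows shown, so it would need inspecting before a similar conversion is applied.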


In [116]:
# 1. Find the country with the largest population in 2024.
# Select the row with the maximum population
top_pop_country = df.loc[df['Population (2024)'].idxmax()]

# Extract values
country = top_pop_country['Country']
population = top_pop_country['Population (2024)']

# Print the result as a sentence
print(f"The world's most populous country is {country}, with a 2024 population of {population:,}.")



The world's most populous country is India, with a 2024 population of 1,450,935,791.


In [117]:
# 2. Find the country with the smallest population in 2024.
# Select the row with the minimum population
least_pop_country = df.loc[df['Population (2024)'].idxmin()]

# Extract values
country = least_pop_country['Country']
population = least_pop_country['Population (2024)']

# Print the result as a sentence
print(f"The world's least populous country is {country}, with a 2024 population of {population:,}.")



The world's least populous country is Niue, with a 2024 population of 1,819.


In [118]:
# 3. Rank the most populated countries (top 5 shown here; pass 10 to nlargest for the top 10)
df.nlargest(5,'Population (2024)')[['Country','Population (2024)']]

Unnamed: 0,Country,Population (2024)
0,India,1450935791
1,China,1419321278
2,United States,345426571
3,Indonesia,283487931
4,Pakistan,251269164


In [119]:
# 4. Rank the least populated countries (bottom 5 shown here; pass 10 to nsmallest for the bottom 10)
df.nsmallest(5,'Population (2024)')[['Country','Population (2024)']]

Unnamed: 0,Country,Population (2024)
230,Niue,1819
229,Tokelau,2506
228,Montserrat,4389
227,Saint Helena,5237
226,Saint Pierre & Miquelon,5628
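Questions 3 and 4 differ only in direction and in how many rows are requested, so both can go through one small function. The sketch below is illustrative: `rank_by_population` is a hypothetical helper, and the toy frame holds only the ten rows printed in the two outputs above, not the full dataset.

```python
import pandas as pd

# Toy frame: the five largest and five smallest rows printed above.
pop = pd.DataFrame({
    "Country": ["India", "China", "United States", "Indonesia", "Pakistan",
                "Saint Pierre & Miquelon", "Saint Helena", "Montserrat",
                "Tokelau", "Niue"],
    "Population (2024)": [1450935791, 1419321278, 345426571, 283487931,
                          251269164, 5628, 5237, 4389, 2506, 1819],
})

def rank_by_population(frame, n=10, largest=True):
    """Hypothetical helper: return the n most (or least) populated rows."""
    pick = frame.nlargest if largest else frame.nsmallest
    ranked = pick(n, "Population (2024)")
    return ranked[["Country", "Population (2024)"]].reset_index(drop=True)

top3 = rank_by_population(pop, n=3)
bottom3 = rank_by_population(pop, n=3, largest=False)
print(top3)
print(bottom3)
```

On the real `df`, `rank_by_population(df, n=10)` would return the full top 10 that the question asks for.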


In [120]:
# 5. Calculate the average population across all countries.
aveg = df['Population (2024)'].mean()
print('The average population across all countries is',aveg)    

The average population across all countries is 35326725.614718616


In [121]:
# 6. Check the total world population (sum of all countries)
world_pop = df['Population (2024)'].sum()
print('The total world population is',world_pop)

The total world population is 8160473617


In [122]:
# Convert the figure to billions (kept in a new variable so the cell can be re-run safely)
world_pop_billions = world_pop / 1_000_000_000
print('The total world population is', world_pop_billions, 'billion')


The total world population is 8.160473617 billion


In [123]:
# 7.Identify the median population (middle country when sorted).
median = df['Population (2024)'].median()
print('The median population is',median)

The median population is 5805962.0
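The mean and the median reported above are far apart, which indicates a strongly right-skewed distribution: a few very large countries pull the average up. A quick arithmetic check using only the figures printed in the outputs above (the row count of 231 comes from `df.info()`):

```python
# Figures copied from the notebook outputs above.
mean_pop = 35326725.614718616
median_pop = 5805962.0
total_pop = 8160473617
n_countries = 231  # row count reported by df.info()

# How many times larger the mean is than the median.
ratio = mean_pop / median_pop

# Sanity check: total divided by row count should reproduce the mean.
implied_mean = total_pop / n_countries

print(f"mean / median = {ratio:.2f}")
print(f"total / count = {implied_mean:,.1f}")
```

For a distribution this skewed, the median is usually the better description of a "typical" country.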


## Yearly Change & Growth
---
8. Find the country with the highest yearly growth (%).
9. Find the country with the lowest yearly growth (%) (could even be negative).
10. Compare absolute net change in population across countries.
11. Identify countries with population decline.
12. Calculate the average yearly growth (%) across all countries.

In [124]:
# 8. Find the country with the highest yearly growth (%)
highest_growth = df.loc[df['Yearly Change'].idxmax()]
country = highest_growth['Country']
growth = highest_growth['Yearly Change']
print(f"The country with the highest yearly growth is {country} with a growth rate of {growth*100:.2f}%")


The country with the highest yearly growth is Chad with a growth rate of 5.07%


In [125]:
# 9. Find the country with the lowest yearly growth (%)
lowest_growth = df.loc[df['Yearly Change'].idxmin()]
country = lowest_growth['Country']
growth = lowest_growth['Yearly Change']
print(f"The country with the lowest yearly growth is {country} with a growth rate of {growth*100:.2f}%")


The country with the lowest yearly growth is Saint Martin with a growth rate of -5.04%


In [126]:
# 10. Compare absolute net change in population across countries.
# idxmin() picks the most negative net change, i.e. the largest net decline
largest_decline = df.loc[df['Net Change'].idxmin()]
country = largest_decline['Country']
netchange = largest_decline['Net Change']
print(f"The country with the largest net decline in population is {country} with a net change of {netchange:,}")

The country with the largest net decline in population is China with a net change of -3,263,655
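Question 10 asks about absolute net change, which means ranking countries by the size of the change regardless of its sign. A minimal sketch on the five rows shown by `df.head()`; the `Abs Net Change` column is an illustrative assumption, and the ordering on the full dataset may differ:

```python
import pandas as pd

# Toy frame: the five rows shown by df.head() at the top of the notebook.
net = pd.DataFrame({
    "Country": ["India", "China", "United States", "Indonesia", "Pakistan"],
    "Net Change": [12866195, -3263655, 1949236, 2297864, 3764669],
})

# Magnitude of the change, ignoring direction.
net["Abs Net Change"] = net["Net Change"].abs()

# Largest movers first, whether growing or shrinking.
ranked = net.sort_values("Abs Net Change", ascending=False).reset_index(drop=True)
print(ranked)
```

Keeping the signed `Net Change` column next to the absolute one makes it easy to see which large movers are gains and which are losses.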


In [127]:
# 11. Identify countries with population decline.
# This cell reports only the single steepest decline; filtering on
# df['Net Change'] < 0 would list every country whose population fell.
dec = df.loc[df['Net Change'].idxmin()]
country = dec['Country']
netchange = dec['Net Change']
print(f"The country with the largest population decline is {country} with a net change of {netchange:,}")

The country with the largest population decline is China with a net change of -3,263,655
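Question 11 asks for every country in decline rather than a single one, which is a boolean filter on `Net Change`. A minimal sketch, again on the five `df.head()` rows only, so it finds just one declining country here; the full dataset may well contain more:

```python
import pandas as pd

# Toy frame: the five rows shown by df.head().
changes = pd.DataFrame({
    "Country": ["India", "China", "United States", "Indonesia", "Pakistan"],
    "Yearly Change": [0.0089, -0.0023, 0.0057, 0.0082, 0.0152],
    "Net Change": [12866195, -3263655, 1949236, 2297864, 3764669],
})

# Keep only rows with a negative net change, steepest decline first.
declining = (
    changes.loc[changes["Net Change"] < 0, ["Country", "Yearly Change", "Net Change"]]
    .sort_values("Net Change")
    .reset_index(drop=True)
)

print(f"{len(declining)} declining country(ies) in the sample")
print(declining)
```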


In [128]:
# 12. Calculate the average yearly growth (%) across all countries.
aveg = df['Yearly Change'].mean()
print(f"The average growth rate across all countries is {aveg*100:.2f}%")

The average growth rate across all countries is 0.96%
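The figure above is an unweighted mean, so a territory of a few thousand people counts as much as India. A population-weighted average is closer to how fast the population as a whole is growing. The sketch below compares the two on the five `df.head()` rows; weighting by the 2024 population is a simplifying assumption (the previous year's population would be the strict choice), and the numbers apply to the sample only:

```python
import numpy as np
import pandas as pd

# Toy frame: the five rows shown by df.head().
growth = pd.DataFrame({
    "Country": ["India", "China", "United States", "Indonesia", "Pakistan"],
    "Population (2024)": [1450935791, 1419321278, 345426571, 283487931, 251269164],
    "Yearly Change": [0.0089, -0.0023, 0.0057, 0.0082, 0.0152],
})

# Every country counts equally.
simple_mean = growth["Yearly Change"].mean()

# Each country counts in proportion to its population.
weighted_mean = np.average(growth["Yearly Change"], weights=growth["Population (2024)"])

print(f"Unweighted mean growth:          {simple_mean * 100:.2f}%")
print(f"Population-weighted mean growth: {weighted_mean * 100:.2f}%")
```

In this sample the weighted figure is lower because China, the second-largest country, has negative growth.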


### Density & Land Area
---
13. Find the country with the highest population density (P/km²).
14. Find the country with the lowest population density.
15. Compare countries with large land areas but low density (e.g., Canada, Russia).
16. Compare countries with small land areas but high density (e.g., Singapore, Bangladesh).

In [142]:
# 13. Find the country with the highest population density (P/km²).
# The column is text with thousands separators (e.g. '1,275'), so convert it to numbers first;
# idxmax() on the raw strings compares lexicographically and ranks '99' above '1,275'.
density_num = pd.to_numeric(df['Density (P/Km²)'].astype(str).str.replace(',', ''), errors='coerce')
densest = df.loc[density_num.idxmax()]
print(f"The country with the highest population density is {densest['Country']}, with a density of {densest['Density (P/Km²)']} people per km².")


In [130]:
# 14. Find the country with the lowest population density.
# Convert the text column (e.g. '1,275') to numbers before comparing;
# idxmin() on the raw strings sorts '1,275' below '99'.
density_num = pd.to_numeric(df['Density (P/Km²)'].astype(str).str.replace(',', ''), errors='coerce')
least_dense = df.loc[density_num.idxmin()]
country = least_dense['Country']
density = least_dense['Density (P/Km²)']
print(f"The country with the lowest population density is {country}, with a density of {density} people per km².")


In [131]:
# 15. Compare countries with large land areas but low density (e.g., Canada, Russia).
# Read the density from the same row as the largest land area, so both values describe one country.
largest_area = df.loc[df['Land Area (Km²)'].idxmax()]
print(f"The country with the largest land area is {largest_area['Country']} and its density (P/km²) is {largest_area['Density (P/Km²)']}")


In [132]:
# 16. Compare countries with small land areas but high density (e.g., Singapore, Bangladesh).
# Read the density from the same row as the smallest land area, so both values describe one country.
smallest_area = df.loc[df['Land Area (Km²)'].idxmin()]
print(f"The country with the smallest land area is {smallest_area['Country']} and its density (P/km²) is {smallest_area['Density (P/Km²)']}")
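Questions 15 and 16 ask about combinations of land area and density, which can be expressed as a filter on both columns at once. The sketch below uses the medians as cut-offs, which is an arbitrary illustrative choice, and runs on the five `df.head()` rows only. Density is stored as text in the dataset, so it is converted first; `density_num`, `large_sparse` and `small_dense` are assumed names.

```python
import pandas as pd

# Toy frame: the five rows shown by df.head(); density kept as text, as in the dataset.
geo = pd.DataFrame({
    "Country": ["India", "China", "United States", "Indonesia", "Pakistan"],
    "Density (P/Km²)": ["488", "151", "38", "156", "326"],
    "Land Area (Km²)": [2973190, 9388211, 9147420, 1811570, 770880],
})

# Convert density text to numbers (handles separators such as '1,275').
geo["density_num"] = pd.to_numeric(
    geo["Density (P/Km²)"].str.replace(",", "", regex=False), errors="coerce"
)

area_cut = geo["Land Area (Km²)"].median()
density_cut = geo["density_num"].median()

# Large land area AND low density.
large_sparse = geo[(geo["Land Area (Km²)"] > area_cut) & (geo["density_num"] < density_cut)]

# Small land area AND high density.
small_dense = geo[(geo["Land Area (Km²)"] < area_cut) & (geo["density_num"] > density_cut)]

print(large_sparse[["Country", "Land Area (Km²)", "density_num"]])
print(small_dense[["Country", "Land Area (Km²)", "density_num"]])
```

On the full dataset, quantile cut-offs such as the 75th and 25th percentiles may give more selective groups than the median.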


### Fertility & Age Structure
17. Find the country with the highest fertility rate.
18. Find the country with the lowest fertility rate.
19. Compare fertility rates of developed vs. developing & less developed countries.
20. Find the country with the youngest median age.
21. Find the country with the oldest median age.

In [133]:
# 17. Find the country with the highest fertility rate.
Hfer = df.loc[df['Fert. Rate'].idxmax()]
print(f"The country with the highest fertility rate is {Hfer['Country']} with fertility rate {Hfer['Fert. Rate']}")

The country with the highest fertility rate is DR Congo with fertility rate 6.0


In [134]:
# 18. Find the country with the lowest fertility rate.
Lfer = df.loc[df['Fert. Rate'].idxmin()]
print(f"The lowest fertility rate country is {Lfer['Country']} with fertility rate {Lfer['Fert. Rate']}")


The lowest fertility rate country is South Korea with fertility rate 0.7


In [135]:
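# 19. Compare fertility rates of developed vs. developing countries.
# The dataset has no development-status column, so fertility bands stand in as a rough proxy.
# High-fertility band (rate > 4), top 10
high_fertility = df.loc[df['Fert. Rate'] > 4][['Country','Fert. Rate']].sort_values('Fert. Rate',ascending=False).nlargest(10,'Fert. Rate')
print(high_fertility)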

                      Country  Fert. Rate
14                   DR Congo         6.0
64                       Chad         6.0
121  Central African Republic         6.0
67                    Somalia         6.0
53                      Niger         5.9
57                       Mali         5.5
40                     Angola         5.0
77                    Burundi         4.8
35                Afghanistan         4.8
44                 Mozambique         4.7


In [136]:
# Mid-fertility band (3 < rate < 4), top 10
mid_fertility = df.loc[(df['Fert. Rate'] > 3) & (df['Fert. Rate'] < 4), ['Country', 'Fert. Rate']].sort_values('Fert. Rate', ascending=False).nlargest(10,'Fert. Rate')
print(mid_fertility)

           Country  Fert. Rate
9         Ethiopia         3.9
48      Madagascar         3.9
117        Liberia         3.9
143         Gambia         3.9
68         Senegal         3.8
147  Guinea-Bissau         3.8
80     South Sudan         3.8
188          Samoa         3.8
162        Comoros         3.8
73        Zimbabwe         3.7


In [137]:
# Lower-fertility band (rate < 3), showing the 10 highest rates within the band
low_fertility = df.loc[df['Fert. Rate'] < 3][['Country', 'Fert. Rate']].sort_values('Fert. Rate', ascending=True).nlargest(10,'Fert. Rate')
print(low_fertility)

              Country  Fert. Rate
215  Marshall Islands         2.9
173        Micronesia         2.8
106        Kyrgyzstan         2.8
97             Israel         2.8
12              Egypt         2.7
32            Algeria         2.7
191              Guam         2.7
218      Saint Martin         2.7
146           Lesotho         2.7
103      Turkmenistan         2.7
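The three cells above split countries by fertility rate using separate filters. The same banding can be done in one step with `pd.cut`, which also makes the boundaries explicit. A minimal sketch on six countries whose rates appear in this notebook's outputs; the band labels and the `band` column are illustrative assumptions, and fertility is only a rough proxy for development status:

```python
import numpy as np
import pandas as pd

# Toy frame: fertility rates copied from outputs in this notebook.
fert = pd.DataFrame({
    "Country": ["DR Congo", "Niger", "Ethiopia", "Zimbabwe", "Israel", "South Korea"],
    "Fert. Rate": [6.0, 5.9, 3.9, 3.7, 2.8, 0.7],
})

# Right-inclusive bins: (0, 3] -> low, (3, 4] -> mid, (4, inf) -> high.
fert["band"] = pd.cut(
    fert["Fert. Rate"],
    bins=[0, 3, 4, np.inf],
    labels=["low", "mid", "high"],
)

# Average fertility rate and country count per band.
band_summary = fert.groupby("band", observed=True)["Fert. Rate"].agg(["mean", "count"])
print(fert)
print(band_summary)
```

Unlike strict `<` and `>` filters, these bins assign rates of exactly 3.0 or 4.0 to a band instead of dropping them.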


In [138]:
# 20.  Find the country with the youngest median age.
Youth = df.loc[df['Med. Age'].idxmin()]
print(f"The country with the youngest median age is {Youth['Country']} with median age {Youth['Med. Age']}")


The country with the youngest median age is Central African Republic with median age 14


In [139]:
# 21. Find the country with the oldest median age.
old = df.loc[df['Med. Age'].idxmax()]
print(f"The country with the oldest median age is {old['Country']} with median age {old['Med. Age']}")

The country with the oldest median age is Monaco with median age 54


### Migration & Urbanization
22. Find the countries with the highest positive migration (net inflow).
23. Find the countries with the highest negative migration (net outflow).
24. Compare migration trends in highly populated vs. less populated countries.
25. Identify countries with 100% urban population (if any).
26. Compare countries with lowest urbanization % (mostly rural).
27. Check if urbanization % is related to median age.

In [140]:
# 22. Find the countries with the highest positive migration (net inflow).
inflow = df.loc[df['Migrants (net)'].idxmax()]
print(f"The country with the highest positive migration is {inflow['Country']} with net inflow {inflow['Migrants (net)']}")

The country with the highest positive migration is United States with net inflow 1286132


In [141]:
# 23. Find the countries with the highest negative migration (net outflow).
outflow = df.loc[df['Migrants (net)'].idxmin()]
print(f"The country with the highest negative migration is {outflow['Country']} with net outflow {outflow['Migrants (net)']}")

The country with the highest negative migration is Pakistan with net outflow -1401173
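Questions 22 and 23 ask for countries in the plural, so a ranked list is a natural extension of the single-row lookups above. The sketch below separates net inflows from net outflows before ranking, so that a country with negative migration never appears in the inflow list. It uses only the five `df.head()` rows; `inflows` and `outflows` are assumed names.

```python
import pandas as pd

# Toy frame: the five rows shown by df.head().
mig = pd.DataFrame({
    "Country": ["India", "China", "United States", "Indonesia", "Pakistan"],
    "Migrants (net)": [-630830, -318992, 1286132, -38469, -1401173],
})

n = 3  # how many countries to list in each direction

# Net inflow: positive values only, largest first.
inflows = mig[mig["Migrants (net)"] > 0].nlargest(n, "Migrants (net)")

# Net outflow: negative values only, most negative first.
outflows = mig[mig["Migrants (net)"] < 0].nsmallest(n, "Migrants (net)")

print(inflows)
print(outflows)
```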
