# **World Population Analysis**

Everyone has wondered at one time or another: **how many of us are there in the world**? In daily life we only get to know a very limited number of people, and that is precisely why it is interesting to understand how many others live on this planet. What are the **largest countries** in the world? And does a larger surface area mean that a country is also more populous, or does something else affect the number of people living there?

In this notebook we will try to find out, mainly by analyzing:
- the population,
- the area of the countries,
- the population density,
- the growth rate (%),
- the share (%) of the world's population that each country accounts for

# **Simple data preprocessing**

**Import libraries**

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go

**Read data file**

In [None]:
filepath = '../input/world-population/2021_population.csv'
data = pd.read_csv(filepath)

**Make a copy of the data to work on**

In [None]:
dt = data.copy()

**Check the shape of the dataset**

In [None]:
dt.shape

**Look at the dtypes and check for missing values**

In [None]:
dt.info()
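
`dt.info()` reports non-null counts per column, from which missing values can be inferred. A more direct count is sketched below on a made-up dataframe (the rows and values are assumptions for illustration, not taken from the dataset):

```python
import pandas as pd

# Made-up rows for illustration only; the real dataset has more columns.
toy = pd.DataFrame({
    'country': ['A', 'B', 'C'],
    'area': ['10 sq_km', None, '30 sq_km'],
})

# Number of missing values in each column.
missing_per_column = toy.isna().sum()
print(missing_per_column)
```

The same call on `dt` would give one count per column of the real data.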

**Remove the thousands separators (commas) so the numbers can be converted to int**

In [None]:
dt.replace(',','', regex=True, inplace=True)

**Remove the unit suffixes from the numbers**

In [None]:
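# str.strip removes any of the listed characters from both ends of each value,
# which leaves only the numeric part of these columns
dt['area'] = dt['area'].str.strip(' sq_km')
dt['density_sq_km'] = dt['density_sq_km'].str.strip('/sq_km')
dt['growth_rate'] = dt['growth_rate'].str.strip('%')
dt['world_%'] = dt['world_%'].str.strip('%')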
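
One detail worth knowing about this step: `str.strip` treats its argument as a set of characters rather than as an exact suffix. For values that start and end with digits this makes no difference, but the small sketch below (with made-up values, the second one deliberately contrived) shows where the two behaviours diverge, using `str.replace` with `regex=False` as one possible exact-match alternative:

```python
import pandas as pd

# Made-up values: one in the style of the 'area' column, one contrived
# to show the difference (assumption, not from the dataset).
sample = pd.Series(['9706961 sq_km', 'mask sq_km'])

# Removes any of ' ', 's', 'q', '_', 'k', 'm' from both ends.
stripped = sample.str.strip(' sq_km')

# Removes only the literal text ' sq_km'.
replaced = sample.str.replace(' sq_km', '', regex=False)

print(stripped.tolist())
print(replaced.tolist())
```

For numeric columns like the ones above, both approaches give the same digits.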

**Convert the columns to numeric types (int64 and float64)**

In [None]:
dt['2021_last_updated'] = dt['2021_last_updated'].astype('int64')
dt['2020_population'] = dt['2020_population'].astype('int64')
dt['area'] = dt['area'].astype('int64')
dt['density_sq_km'] = dt['density_sq_km'].astype('int64')
dt['growth_rate'] = dt['growth_rate'].astype('float64')
dt['world_%'] = dt['world_%'].astype('float64')
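
`astype` raises an error as soon as a value cannot be converted. If that happens, one way to locate the offending entries is `pd.to_numeric` with `errors='coerce'`, sketched here on made-up strings (the `'n/a'` entry is a hypothetical bad value, not something observed in the dataset):

```python
import pandas as pd

# Made-up column containing one value that is not a number.
raw = pd.Series(['149', '464', 'n/a'])

# Unparseable values become NaN instead of raising an error.
numeric = pd.to_numeric(raw, errors='coerce')

# Original strings that failed to convert.
bad_values = raw[numeric.isna()]

print(numeric.tolist())
print(bad_values.tolist())
```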

**Have a look at the final dataframe**

In [None]:
dt.head()

# Population now (2021)

Let's try to answer our first question: which are the most populated countries in the world, and which are the least populated?

In [None]:
# Assumption: the rows of the CSV are already sorted by population, largest first
top_10_country_pop = dt[:10]
least_10_country_pop = dt[-10:]
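
Slicing the first and last rows only gives the right answer when the rows are already ordered by population. A sketch of an order-independent alternative, using `nlargest` and `nsmallest` on a made-up dataframe (the country names and numbers are placeholders):

```python
import pandas as pd

# Made-up, deliberately unsorted data.
toy = pd.DataFrame({
    'country': ['B', 'A', 'D', 'C'],
    '2021_last_updated': [20, 50, 5, 30],
})

# Rows with the largest and smallest population, regardless of row order.
top_2 = toy.nlargest(2, '2021_last_updated')
bottom_2 = toy.nsmallest(2, '2021_last_updated')

print(top_2['country'].tolist())
print(bottom_2['country'].tolist())
```

Note that `nsmallest` returns the smallest value first, whereas a slice like `dt[-10:]` on descending data puts it last.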

In [None]:
colors = ['lightslategray',] * 10
colors[0] = 'crimson'

fig = go.Figure(data=[go.Bar(
    x= top_10_country_pop['country'],
    y= top_10_country_pop['2021_last_updated'],
    marker_color=colors # marker color can be a single color value or an iterable
)])
fig.update_layout(title_text='The 10 most populated countries in the world now')

![](https://i.redd.it/gltss558fkp31.png)

As we can see, **China is the most populated country**, but the map above shows that most of its people live in the eastern part of the country. I think this is because the western part of China is cold, mountainous terrain close to the Himalayas, so ancient peoples settled in the east, and China's biggest cities, like the capital Beijing, are close to the coast.

In [None]:
colors = ['blue'] * 10
colors[9] = 'skyblue'

fig = go.Figure(data=[go.Bar(
    x= least_10_country_pop['country'],
    y= least_10_country_pop['2021_last_updated'],
    marker_color=colors # marker color can be a single color value or an iterable
)])
fig.update_layout(title_text='The 10 least populated countries in the world now')

![](https://www.israelandstuff.com/wp-content/uploads/2017/11/Falkland-Islands-Google-maps.png)

As was to be expected, the least populated countries are also those with the smallest territories. According to the data, the least populated country is the **Falkland Islands**, a small group of islands located (as we can see from the image above) in the South Atlantic, **off the southern coast of South America**. You have probably already heard of these islands, as they were the scene of the 1982 Falklands War between Argentina and the United Kingdom.

# Area

Second question: what are the largest countries, and which are the smallest?

In [None]:
areas = dt.sort_values(by = 'area', ascending = False)

In [None]:
top_10_country_areas = areas[:10]
least_10_country_areas = areas[-10:]

In [None]:
colors = ['lightslategray',] * 10
colors[0] = 'crimson'

fig = go.Figure(data=[go.Bar(
    x= top_10_country_areas['country'],
    y= top_10_country_areas['area'],
    marker_color=colors # marker color can be a single color value or an iterable
)])
fig.update_layout(title_text='The 10 biggest countries in the world')

![](https://i.redd.it/kt9jfn4g7t841.jpg)

As you probably already know, **the largest country in the world is Russia**, which still covers a vast territory even after the end of the Soviet Union and the creation of various independent states. But this vast territory is not evenly populated: most of the population lives in the western (European) part of Russia. This is most likely due to the harsh temperatures and living conditions in the central and eastern parts of the country, whose northern edge lies on the Arctic Ocean while the south borders on not very hospitable territories.

In [None]:
colors = ['blue'] * 10
colors[9] = 'skyblue'
fig = go.Figure(data=[go.Bar(
    x= least_10_country_areas['country'],
    y= least_10_country_areas['area'],
    marker_color=colors # marker color can be a single color value or an iterable
)])
fig.update_layout(title_text='The 10 smallest countries in the world')

![](https://th.bing.com/th/id/Re235d3839fe8ff42903495dba390080e?rik=8%2b%2fUj%2b3JNAyc7Q&riu=http%3a%2f%2ffrenchmoments.eu%2fwp-content%2fuploads%2f2014%2f08%2fSituation-Map-Principality-of-Monaco.jpg&ehk=WZwlGje8hnllF5uwFlPBD%2bQxHxb29aVCDXExLZIEBMQ%3d&risl=&pid=ImgRaw)

According to the data, the smallest state, or microstate, is the **Principality of Monaco**. Covering only about 2 km², it borders only France and lies on the Ligurian Sea. Its best known and most prestigious district is Monte Carlo, the most central district of the city-state, whose streets host the famous Formula 1 circuit, one of the most difficult in the world because of its twisting, varied layout.

# Population density

OK, we have looked at the population of the various states and at their area, so now let's analyze the population density. Which country has the largest number of people in relation to its size?

In [None]:
density = dt.sort_values(by = 'density_sq_km', ascending = False)

In [None]:
top_10_country_dens = density[:10]
least_10_country_dens = density[-10:]

In [None]:
colors = ['lightslategray',] * 10
colors[0] = 'crimson'

fig = go.Figure(data=[go.Bar(
    x= top_10_country_dens['country'],
    y= top_10_country_dens['density_sq_km'],
    marker_color=colors # marker color can be a single color value or an iterable
)])
fig.update_layout(title_text='The 10 countries with the highest population density')

As we could have predicted, the smallest state, the **Principality of Monaco**, has the highest population density.

In [None]:
colors = ['blue'] * 10
colors[9] = 'skyblue'

fig = go.Figure(data=[go.Bar(
    x= least_10_country_dens['country'],
    y= least_10_country_dens['density_sq_km'],
    marker_color=colors # marker color can be a single color value or an iterable
)])
fig.update_layout(title_text='The 10 countries with the lowest population density')

The country with the lowest population density is also the country with the lowest population, the **Falkland Islands**. The likely explanation is that, although the territory is not very large, its population is the smallest in the dataset, so the density is also very low.
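
To make the relationship concrete, population density is simply population divided by area. The sketch below uses made-up, rounded numbers (the country labels and figures are placeholders, not values from the dataset) to show how a tiny state can top the ranking while a sparsely populated territory ends up at the bottom:

```python
import pandas as pd

# Made-up figures for illustration only.
toy = pd.DataFrame({
    'country': ['Tiny city-state', 'Remote islands', 'Large country'],
    'population': [40_000, 3_000, 100_000_000],
    'area': [2, 12_000, 10_000_000],  # in square km
})

# Density = people per square km.
toy['density'] = toy['population'] / toy['area']

ranked = toy.sort_values(by='density', ascending=False)
print(ranked[['country', 'density']])
```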

# Growth rate

Okay, we're almost done. But now let's look at the growth rate to understand which countries have a growing population and which ones have fewer and fewer people living there.

In [None]:
growth = dt.sort_values(by = 'growth_rate', ascending = False)

In [None]:
top_10_country_growth = growth[:10]
least_10_country_growth = growth[-10:]

In [None]:
colors = ['lightslategray',] * 10
colors[0] = 'crimson'

fig = go.Figure(data=[go.Bar(
    x= top_10_country_growth['country'],
    y= top_10_country_growth['growth_rate'],
    marker_color=colors # marker color can be a single color value or an iterable
)])
fig.update_layout(title_text='The 10 countries with the highest growth rate')

Unexpectedly (at least for me), **Syria is the country with the highest growth rate**. Despite the constant risk of war and clashes with other countries, the figure suggests that the population is holding up and that many children are being born.

In [None]:
colors = ['blue'] * 10
colors[9] = 'skyblue'

fig = go.Figure(data=[go.Bar(
    x= least_10_country_growth['country'],
    y= least_10_country_growth['growth_rate'],
    marker_color=colors # marker color can be a single color value or an iterable
)])
fig.update_layout(title_text='The 10 countries with the lowest growth rate')

As we can see from the graph, **Lithuania is the country with the lowest growth rate**. It is even negative, which means the population is shrinking: deaths and emigration together outweigh births and immigration. Honestly, I would not have expected it, but I hope that in the coming years this rate, like that of all countries with negative growth, can improve.
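
As a rough illustration of what a positive or negative growth rate means, the sketch below computes a year-over-year percentage change from two population figures. Both the numbers and the formula are assumptions for illustration; the dataset's `growth_rate` column may be defined differently by its source:

```python
import pandas as pd

# Made-up populations for two hypothetical countries.
toy = pd.DataFrame({
    'country': ['Growing', 'Shrinking'],
    'pop_2020': [1000, 1000],
    'pop_2021': [1010, 990],
})

# Percentage change from 2020 to 2021 (assumed definition of growth).
toy['growth_pct'] = (toy['pop_2021'] - toy['pop_2020']) / toy['pop_2020'] * 100

print(toy[['country', 'growth_pct']])
```

A negative value simply means the later figure is smaller than the earlier one.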

# Share (%) of world population by country

For the final analysis we have saved perhaps the most curious question: which states account for the largest share (%) of the world population?

In [None]:
top_30_country_pop = dt[:30]

In [None]:
fig = px.pie(top_30_country_pop, values='world_%',
             names='country', title='% world population by countries')
fig.show()

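As we had previously seen, **China, India and the United States** are the most populous states, and together they make up **52.63% of the pie chart,** or more than half of it. Because the chart only includes the 30 most populous countries and a pie chart rescales its slices to add up to 100%, this figure is their share of that group's population rather than of the whole world; their share of the world population, given by the sum of their `world_%` values, is lower.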
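
When reading percentages off a pie chart, keep in mind that the chart rescales the values it is given so that the slices add up to 100%. The sketch below uses made-up `world_%` values for five hypothetical countries to show the difference between a share of the world and a share of the countries included in the chart:

```python
import pandas as pd

# Made-up shares of world population; they intentionally do not sum to 100
# because most countries are left out, as in a "top N" chart.
toy = pd.DataFrame({
    'country': ['A', 'B', 'C', 'D', 'E'],
    'world_%': [18.0, 17.0, 4.0, 3.0, 2.0],
})

top_3 = toy.nlargest(3, 'world_%')

# Share of the whole world: just add up the column values.
world_share_top3 = top_3['world_%'].sum()

# Share shown in a pie chart of these five countries: rescaled to their total.
chart_share_top3 = world_share_top3 / toy['world_%'].sum() * 100

print(world_share_top3)
print(round(chart_share_top3, 2))
```

The second number is always at least as large as the first whenever the chart omits part of the world.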

**Thank you so much for looking at this notebook. I hope you enjoyed it, and if so, I would invite you to leave an upvote. If you have found any errors or have suggestions for improving the notebook, please let me know in the comments. Thank you very much again, and happy Kaggling!**