# Population of Countries Around the World

We will scrape this [page](https://www.worldometers.info/geography/how-many-countries-are-there-in-the-world/) to read its table of population by country.

This table contains 5 columns:

* **``#``:** Rank of the country by population (1 for the most populous country).
* **``Country``:** Name of the country.
* **``Population(year)``:** Population of each country in the stated year.
* **``World Share``:** Each country's share of the world population, in %.
* **``Land Area (Km2)``:** Land area of each country, in km$^2$.

In [1]:
import pandas as pd
import requests

In [15]:
url = 'https://www.worldometers.info/geography/how-many-countries-are-there-in-the-world/'

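# Send browser-like headers with the request
header = {
  "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
  "X-Requested-With": "XMLHttpRequest"
}

# Download the page HTML
r = requests.get(url, headers=header)

# Parse every <table> element on the page into a list of DataFrames
dfs = pd.read_html(r.text)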

In [16]:
len(dfs)

1

### Top 10 most populous countries

In [29]:
df = dfs[0]
df.columns = ["#", "country","population", "share", "land_area"]
df.head(10)

| | # | country | population | share | land_area |
|---|---|---|---|---|---|
| 0 | 1 | China | 1439323776 | 18.5 % | 9388211 |
| 1 | 2 | India | 1380004385 | 17.7 % | 2973190 |
| 2 | 3 | United States | 331002651 | 4.2 % | 9147420 |
| 3 | 4 | Indonesia | 273523615 | 3.5 % | 1811570 |
| 4 | 5 | Pakistan | 220892340 | 2.8 % | 770880 |
| 5 | 6 | Brazil | 212559417 | 2.7 % | 8358140 |
| 6 | 7 | Nigeria | 206139589 | 2.6 % | 910770 |
| 7 | 8 | Bangladesh | 164689383 | 2.1 % | 130170 |
| 8 | 9 | Russia | 145934462 | 1.9 % | 16376870 |
| 9 | 10 | Mexico | 128932753 | 1.7 % | 1943950 |
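
The `share` values in the output above are text such as `18.5 %`, so they cannot be sorted or summed as numbers directly. The following is a minimal sketch of one way to convert them, assuming every value ends with a space and a percent sign as in the rows shown. The `sample` frame and the `share_pct` column name are hypothetical and only used for illustration.

```python
import pandas as pd

# Three rows copied from the output above; `share` is stored as text.
sample = pd.DataFrame({
    "country": ["China", "India", "United States"],
    "share": ["18.5 %", "17.7 %", "4.2 %"],
})

# Strip the trailing space and percent sign, then convert to float.
# `share_pct` is a hypothetical column name chosen for this sketch.
sample["share_pct"] = sample["share"].str.rstrip(" %").astype(float)

print(sample[["country", "share_pct"]])
```

The same expression could be applied to `df["share"]` if the full table uses the same format.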


**Density of those 10 countries**<br>
Population density measures the number of people per km$^2$ of land area.

Density is defined as:

$$\text{Density} = \frac{\text{Population}}{\text{Land Area}}$$
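
As a quick check of the formula, here is a small worked sketch using the population and land area figures for Bangladesh and Russia from the top 10 table. The `check` frame is a hypothetical stand-in for the scraped data, built by hand so the arithmetic can be followed without downloading the page.

```python
import pandas as pd

# Figures copied from the top 10 table; `check` is a hand-built stand-in.
check = pd.DataFrame({
    "country": ["Bangladesh", "Russia"],
    "population": [164689383, 145934462],
    "land_area": [130170, 16376870],  # km^2
})

# Density = Population / Land Area, in people per km^2
check["density"] = check["population"] / check["land_area"]

print(check[["country", "density"]].round(2))
```

Bangladesh comes out at roughly 1265 people per km$^2$ and Russia at roughly 9, which shows how widely density can vary among countries with similar populations.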

In [32]:
# Take the 10 most populous countries (the table is already ranked by population)
temp = df.head(10).copy()
# Density in people per km^2
temp["density"] = temp["population"] / temp["land_area"]
temp[["country", "density"]].sort_values(by="density", ascending=False).reset_index(drop=True)

| | country | density |
|---|---|---|
| 0 | Bangladesh | 1265.186932 |
| 1 | India | 464.14941 |
| 2 | Pakistan | 286.545688 |
| 3 | Nigeria | 226.335506 |
| 4 | China | 153.311827 |
| 5 | Indonesia | 150.987053 |
| 6 | Mexico | 66.325139 |
| 7 | United States | 36.185356 |
| 8 | Brazil | 25.431426 |
| 9 | Russia | 8.911011 |


### Which countries have the largest land area?

In [33]:
df.sort_values(by="land_area", ascending=False).head(10).reset_index(drop=True)

| | # | country | population | share | land_area |
|---|---|---|---|---|---|
| 0 | 9 | Russia | 145934462 | 1.9 % | 16376870 |
| 1 | 1 | China | 1439323776 | 18.5 % | 9388211 |
| 2 | 3 | United States | 331002651 | 4.2 % | 9147420 |
| 3 | 39 | Canada | 37742154 | 0.5 % | 9093510 |
| 4 | 6 | Brazil | 212559417 | 2.7 % | 8358140 |
| 5 | 55 | Australia | 25499884 | 0.3 % | 7682300 |
| 6 | 2 | India | 1380004385 | 17.7 % | 2973190 |
| 7 | 32 | Argentina | 45195774 | 0.6 % | 2736690 |
| 8 | 63 | Kazakhstan | 18776707 | 0.2 % | 2699700 |
| 9 | 33 | Algeria | 43851044 | 0.6 % | 2381740 |
