# Project: GDP Data extraction and processing

## Scenario
An international firm is looking to expand its business into different countries and has hired you as a Data Analyst to create a script that extracts the top 10 largest economies of the world, in descending order of GDP in billion USD (rounded to 2 decimal places), as reported by the International Monetary Fund (IMF).

The required data appears to be available at the Wikipedia URL below:

URL: https://en.wikipedia.org/wiki/List_of_countries_by_GDP_%28nominal%29

## Objectives
 - Use web scraping to extract the required information from the website.
 - Use Pandas to load and process the tabular data as a DataFrame.
 - Use NumPy to manipulate the information contained in the DataFrame.
 - Save the updated DataFrame to a CSV file.


In [1]:
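# Installing required packages.
# lxml is the HTML parser that pd.read_html() uses by default.
!pip install pandas
!pip install numpy
!pip install lxml
# Importing the libraries used in this notebook.
import numpy as np
import pandas as pd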




In [2]:
URL = "https://en.wikipedia.org/wiki/List_of_countries_by_GDP_%28nominal%29"

# Extracting all tables from the web page using Pandas.
# Retaining the third table (list index 2) as the required DataFrame.
tables = pd.read_html(URL)
df = tables[2]
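
Selecting the table by position (`tables[2]`) depends on the page layout staying the same. As a hedged alternative, the sketch below picks the table by looking for a keyword in its column labels. The helper `find_table` and the `sample_tables` stand-in data are hypothetical and only illustrate the idea; in the notebook, the list would come from `pd.read_html(URL)`.

```python
import pandas as pd

def find_table(tables, keyword):
    """Return the first DataFrame whose column labels mention `keyword`."""
    for table in tables:
        # Column labels may be plain strings or tuples (multi-level headers),
        # so convert each label to text before searching it.
        labels = " ".join(str(label) for label in table.columns)
        if keyword.lower() in labels.lower():
            return table
    raise ValueError(f"no table with a column label containing {keyword!r}")

# Stand-in data with placeholder values, not real GDP figures.
sample_tables = [
    pd.DataFrame({"Note": ["navigation box"]}),
    pd.DataFrame({"Country/Territory": ["Country A"], "IMF": [1000]}),
]

gdp_table = find_table(sample_tables, "IMF")
print(gdp_table.columns.tolist())
```

If no table matches, the helper raises a `ValueError` instead of silently returning the wrong table, which makes a layout change on the page easier to notice.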

In [3]:
# Replaced complex column headers with column numbers
df.columns = range(df.shape[1])

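# Retained columns with index 0 and 1 (name of country and GDP quoted by IMF)
df = df[[0, 1]]
# Retained rows with index 1 to 10, indicating the top 10 economies.
# .copy() makes an independent DataFrame, so the column assignments below
# do not trigger a SettingWithCopyWarning.
df = df.iloc[1:11].copy()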

# Assigned column names as "Country" and "GDP (Billion USD)"
df.columns=["Country","GDP (Billion USD)"]

# Changed the data type of 'GDP (Billion USD)' column to integer using astype()
df[['GDP (Billion USD)']] = df[['GDP (Billion USD)']].astype(int)
# Convert the GDP value in Million USD to Billion USD
df[['GDP (Billion USD)']] = df[['GDP (Billion USD)']]/1000
# Used numpy.round() to round the GDP value to 2 decimal places.
df[['GDP (Billion USD)']] = np.round(df[['GDP (Billion USD)']],2)

print(df)

           Country  GDP (Billion USD)
1    United States           30507.22
2            China           19231.71
3          Germany            4744.80
4            India            4187.02
5            Japan            4186.43
6   United Kingdom            3839.18
7           France            3211.29
8            Italy            2422.86
9           Canada            2225.34
10          Brazil            2125.96
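
The cell above relies on two assumptions: every GDP entry converts cleanly with `astype(int)`, and the source table is already sorted. The sketch below shows one possible way to relax both, using `pd.to_numeric(..., errors="coerce")` for the conversion and `nlargest()` for an explicit descending sort. The frame `raw` and its values are made-up placeholders, not data from the IMF table.

```python
import numpy as np
import pandas as pd

# Made-up placeholder values in Million USD; "N/A" stands in for a missing entry.
raw = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "GDP (Million USD)": ["890123", "1,234,567", "N/A", "45678"],
})

# Strip thousands separators, then coerce anything non-numeric to NaN
# instead of raising an error, as astype(int) would.
cleaned = raw["GDP (Million USD)"].str.replace(",", "", regex=False)
gdp_million = pd.to_numeric(cleaned, errors="coerce")

# Convert Million USD to Billion USD and round to 2 decimal places.
raw["GDP (Billion USD)"] = np.round(gdp_million / 1000, 2)

# Drop rows without a value, then sort explicitly rather than relying on
# the order of the source table. Here n=2; the notebook would use 10.
top = (
    raw.dropna(subset=["GDP (Billion USD)"])
       .nlargest(2, "GDP (Billion USD)")[["Country", "GDP (Billion USD)"]]
       .reset_index(drop=True)
)
print(top)
```

Whether to drop or keep rows with missing values is a design choice; dropping them is reasonable here because only the largest economies are of interest.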


In [5]:
# Saved the DataFrame to a CSV file
df.to_csv('Top10economies.csv')
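
By default, `to_csv()` also writes the DataFrame index as an unnamed first column. If that column is not wanted, passing `index=False` omits it. The sketch below writes a small placeholder frame to a temporary file and reads it back to check the result; the frame `top10_example` and the file name `example.csv` are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Placeholder frame standing in for the processed top-10 DataFrame.
top10_example = pd.DataFrame({
    "Country": ["A", "B"],
    "GDP (Billion USD)": [2.5, 1.25],
})

# Write to a temporary directory so the example leaves no file behind
# in the working directory.
path = os.path.join(tempfile.mkdtemp(), "example.csv")
top10_example.to_csv(path, index=False)  # index=False omits the row labels

# Read the file back to confirm that only the two data columns were saved.
restored = pd.read_csv(path)
print(restored.columns.tolist())
```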