# Petrol Prices Worldwide Analysis
Dillon Constantine - dillon.l.constantine@aib.ie

This notebook is part of an exploratory analysis of petrol prices as of June 2022. The author of the dataset is Zeeshan Usmani. The data was collected from Google sources, such as the IMF, World Bank and United Nations sites.

The dataset is available from Kaggle (https://www.kaggle.com/) and is open source.


### Import the Libraries

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Importing plotly as an interactive visualisation tool
import plotly.express as px
import plotly.graph_objects as go

### Importing the Dataset

In [None]:
#Importing the DataSet

df_ppww = pd.read_csv("Petrol Dataset June 23 2022 -- Version 2.csv", index_col = 'S#', encoding='latin-1') 
df_ppww.head()

### First Look and Data Cleaning
This is where the data is first explored, and any inconsistencies in its structure, data types and null values are found and cleaned up.

In [None]:
# Used to get the data structure and types.
# shape is printed explicitly, as only a cell's last expression is displayed.
print(df_ppww.shape)
df_ppww.info()

In [None]:
# Convert the numeric columns stored as strings (objects) to floats:
# strip the thousands separators and the percent sign before casting.
df_ppww['Daily Oil Consumption (Barrels)'] = df_ppww['Daily Oil Consumption (Barrels)'].str.replace(',', '', regex=False).astype(float)
df_ppww['GDP Per Capita ( USD )'] = df_ppww['GDP Per Capita ( USD )'].str.replace(',', '', regex=False).astype(float)
df_ppww['Gallons GDP Per Capita Can Buy'] = df_ppww['Gallons GDP Per Capita Can Buy'].str.replace(',', '', regex=False).astype(float)
df_ppww['World Share'] = df_ppww['World Share'].str.replace('%', '', regex=False).astype(float)

# Drop the price in PKR, as the column is not used in this analysis.
df_ppww.drop('Price Per Liter (PKR)', axis = 1, inplace = True)

df_ppww.info()
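
The string-to-float conversion can be tried on a small stand-in frame before it is run on the full dataset. The sketch below is only an illustration: the values are made up and `to_float` is a hypothetical helper, not part of the original notebook. Only the column names are taken from the dataset.

```python
import pandas as pd

# Toy data (made-up values) mimicking the string formatting in the raw CSV.
toy = pd.DataFrame({
    'Daily Oil Consumption (Barrels)': ['1,234,567', '89,000'],
    'World Share': ['20%', '1.5%'],
})

def to_float(series, strip_chars):
    """Hypothetical helper: remove each character in strip_chars, then cast to float."""
    cleaned = series.astype(str)
    for ch in strip_chars:
        cleaned = cleaned.str.replace(ch, '', regex=False)
    return cleaned.str.strip().astype(float)

toy['Daily Oil Consumption (Barrels)'] = to_float(toy['Daily Oil Consumption (Barrels)'], ',')
toy['World Share'] = to_float(toy['World Share'], '%')
print(toy.dtypes)
```

Passing `regex=False` treats the separator as a literal character, which avoids surprises with symbols that have a special meaning in regular expressions.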



In [None]:
# Create new columns for the total daily cost in each country.
# One barrel of oil holds roughly 42 US gallons, or about 159 liters.

# Daily cost (USD) from the per-gallon price: barrels -> gallons -> cost.
daily_gallons_cost = (df_ppww['Daily Oil Consumption (Barrels)'] * 42) * df_ppww['Price Per Gallon (USD)']
df_ppww['Daily Cost per Gallon'] = round(daily_gallons_cost, 2)


# Daily cost (USD) from the per-liter price: barrels -> liters -> cost.
daily_liters_cost = (df_ppww['Daily Oil Consumption (Barrels)'] * 159) * df_ppww['Price Per Liter (USD)']
df_ppww['Daily Cost per Liter'] = round(daily_liters_cost, 2)


df_ppww.head()
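
As a sanity check on the units: a consumption figure in barrels is converted to a volume by multiplying by the barrel size (42 US gallons, or about 159 liters), and the cost follows from multiplying that volume by the unit price. The sketch below works through one made-up example. `daily_cost_usd` is a hypothetical helper, and the calculation assumes every barrel consumed is priced at the pump price, which is a simplification.

```python
GALLONS_PER_BARREL = 42   # US gallons in one oil barrel
LITERS_PER_BARREL = 159   # approximate liters in one oil barrel

def daily_cost_usd(barrels_per_day, price_per_unit, units_per_barrel):
    """Hypothetical helper: barrels -> volume units -> cost in USD."""
    return round(barrels_per_day * units_per_barrel * price_per_unit, 2)

# Made-up example: 1,000 barrels a day at 5.00 USD per gallon.
cost_by_gallon = daily_cost_usd(1_000, 5.00, GALLONS_PER_BARREL)

# The same consumption at a made-up price of 1.32 USD per liter.
cost_by_liter = daily_cost_usd(1_000, 1.32, LITERS_PER_BARREL)

print(cost_by_gallon, cost_by_liter)
```

The two figures should land close to each other when the per-gallon and per-liter prices are consistent, which makes this a quick check on both the conversion and the price columns.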

In [None]:
# Check for null values in each field.
df_ppww.isnull().sum()

# As there are no missing values in the dataset, there is no need to drop any rows.

In [None]:
# Stats Summary - Can see the basic stats of all the columns.
round(df_ppww.describe(), 2)

# Top 10 Visualisations
This section of the code depicts the top 10 values for the specified fields.

In [None]:
# Looking at the top 10 countries by the price per gallon (USD)
price_per_gallon_top_10 = df_ppww.sort_values('Price Per Gallon (USD)', ascending = False)[0:10]
price_per_gallon_top_10
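
An alternative way to pick the top rows is pandas' `DataFrame.nlargest`, which returns the `n` rows with the highest values in a column, already sorted in descending order. The sketch below uses made-up countries and prices purely for illustration.

```python
import pandas as pd

# Made-up prices for illustration only.
toy = pd.DataFrame({
    'Country': ['A', 'B', 'C', 'D'],
    'Price Per Gallon (USD)': [4.10, 9.75, 6.30, 2.05],
})

# nlargest keeps the n rows with the highest values, sorted descending.
top_2 = toy.nlargest(2, 'Price Per Gallon (USD)')
print(top_2)
```

For a top 10 this gives the same rows as sorting and slicing, provided there are no ties at the cut-off.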

In [None]:
# Bar chart of the price per gallon (USD) for the top 10 countries
plt.figure(figsize = (20,10))
sns.barplot(data = price_per_gallon_top_10, y = 'Price Per Gallon (USD)', x = 'Country')
plt.title('Top 10 Countries by Price Per Gallon (USD)')
plt.show()
