# Real World Examples - Line Charts
This task on line graphs is based on a Keith Galli video: https://www.youtube.com/watch?v=0P7QnIQDBJY&feature=youtu.be


**Note:** Up until this point, in all of the exercises that you have completed, every DataFrame has been named ***df***. In this scenario the DataFrame has a new name: ***gas***.

#### Load the Libraries

In [None]:
#Load Necessary Libraries
import numpy as np
import pandas as pd

#This is a library for creating graphs - 
    #sometimes additional libraries are also needed
    #& matplotlib is not the only option for creating graphs
import matplotlib.pyplot as plt


## Cleaning the Data

Before creating any of the charts below, you should check the datasets for null values.

#### Cleaning the Gas Prices Dataset

In [None]:
#Read in the Gas_prices file
gas = pd.read_csv('gas_prices.csv')

#Examine the dataframe - it provides prices for 10 countries over an 18 year period
gas

In [None]:
#get the shape of the dataframe. It is quite small
gas.shape

In [None]:
#count the number of null values in the dataframe
gas.isna().sum()

In [None]:
#get the precise location of the null value - in the Australia column
gas[gas['Australia'].isnull()]

In [None]:
#As this is the first row in our dataframe - the simplest thing might be to delete the row
gas = gas.dropna(how='any', subset=['Australia'])
gas['Australia'].isna().sum()
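
The null check and the row removal above can be tried out on a tiny stand-in table. This is only a sketch: the DataFrame `sample` and its prices are made up for illustration and are not taken from gas_prices.csv.

```python
import numpy as np
import pandas as pd

# Made-up prices, for illustration only (not taken from gas_prices.csv)
sample = pd.DataFrame({
    'Year': [1990, 1991, 1992],
    'Australia': [np.nan, 1.96, 1.89],
    'USA': [1.16, 1.14, 1.13],
})

# Count the missing values in each column
null_counts = sample.isna().sum()

# Drop only the rows where the 'Australia' value is missing
cleaned = sample.dropna(subset=['Australia'])

print(null_counts)
print(sample.shape, cleaned.shape)
```

Because `dropna` returns a new DataFrame, `sample` keeps all of its rows unless the result is assigned back to it, which is why the cell above writes `gas = gas.dropna(...)`.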

In [None]:
#Get the shape of the DataFrame again
gas.shape

## Creating Charts  


### Line Graph showing Prices

#### Display the Dataset

In [None]:
#Examine the dataframe 
gas

#### Very Basic Line Graph

In [None]:
#Plot the USA prices against the year (x values first, then y values)
plt.plot(gas.Year, gas.USA)
#Plot the Canada prices against the year
plt.plot(gas.Year, gas.Canada)
#Plot the Australia prices against the year
plt.plot(gas.Year, gas.Australia)

#If your column names have more than one word you will have to use the following format:
plt.plot(gas['Year'], gas['South Korea'])

#This code will show the chart
plt.show()
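
The two ways of selecting a column used above can be compared directly. This is a minimal sketch on a made-up DataFrame called `sample` (the values are invented for illustration).

```python
import pandas as pd

# Made-up values, for illustration only
sample = pd.DataFrame({
    'Year': [2000, 2001],
    'USA': [1.5, 1.6],
    'South Korea': [4.2, 4.1],
})

# Dot notation and bracket notation return the same column
# when the column name is a single word
same_column = sample.USA.equals(sample['USA'])

# A name containing a space needs brackets:
# writing sample.South Korea would be a syntax error
south_korea = sample['South Korea']

print(same_column)
print(south_korea.tolist())
```

Bracket notation works for every column name, so it is the safer habit if you are unsure.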

#### More Detailed Line Graph
  
The graph above is a poor visual representation of the data. It has no title, there are no labels on the axes, and there is no legend to tell us which line is which. The x axis values are also hard to read. We will address these issues below:


In [None]:
#You can control the size of your chart
plt.figure(figsize=(8,5))

#You can add a title with formatting using a font dictionary
plt.title('Gas Prices over Time (in USD)', fontdict={'fontweight':'bold', 'fontsize': 18})

#This code uses shorthand notation to format the lines: in 'b.-' the b is the colour (blue), the . is the marker (point) and the - is a solid line
#We can change the labels on our individual lines if we want to make them more meaningful than the default column names
plt.plot(gas.Year, gas.USA, 'b.-', label='United States')
plt.plot(gas.Year, gas.Canada, 'r.-', label='Canada')
plt.plot(gas.Year, gas['South Korea'], 'g.-', label='South Korea')
plt.plot(gas.Year, gas.Australia, 'y.-',label='Australia' )

#This code controls how the values on the x axis are displayed - the ticks indicate 3 year intervals
plt.xticks(gas.Year[::3].tolist())

#Although our records stop at 2008 - we might want more space at the edge of our chart - so we can add another tick interval
#plt.xticks(gas.Year[::3].tolist()+[2009])

#Add labels to the x and y axis
plt.xlabel('Year')
plt.ylabel('US Dollars')

#Add a legend to the chart - this may not work if you have not added labels to the lines (Like we did above)
plt.legend()

#Save the chart as a separate image
#plt.savefig('Gas_price_figure.png', dpi=300)

#display the chart within the notebook
plt.show()
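
The shorthand strings such as 'b.-' used above can also be spelled out with keyword arguments. The sketch below, on made-up values, draws the same line both ways and then compares the resulting line properties; the variable names are chosen only for this illustration.

```python
import matplotlib.pyplot as plt

# Made-up values, for illustration only
years = [2000, 2001, 2002]
prices = [1.5, 1.6, 1.4]

plt.figure(figsize=(8, 5))

# Shorthand: colour, marker and line style packed into one string
(short_line,) = plt.plot(years, prices, 'b.-', label='shorthand')

# The same settings written out as keyword arguments
(long_line,) = plt.plot(years, prices, color='b', marker='.', linestyle='-', label='keywords')

print(short_line.get_color(), short_line.get_marker(), short_line.get_linestyle())
print(long_line.get_color(), long_line.get_marker(), long_line.get_linestyle())

# Close the figure - this sketch only inspects the lines and does not display them
plt.close()
```

The keyword form is longer but easier to read, and it is the form you need for options that the shorthand cannot express, such as colour names or hex codes.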

####  An Alternative Way to Display Many Values in a Line Chart
Look at the `for` loop section of the code below.

In [None]:
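#This code produces almost the same chart as the code above: the line colours are chosen automatically,
#the legend uses the column names, and an extra tick is added at 2009.
#You may find the code above easier to understand.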

#You can control the size of your chart
plt.figure(figsize=(8,5))

#You can add a title with formatting by using a font dictionary
plt.title('Gas Prices over Time (in USD)', fontdict={'fontweight':'bold', 'fontsize': 18})

# Another way to plot many values!
countries_to_look_at = ['Australia', 'USA', 'Canada', 'South Korea']
for country in gas:                       #Looping over a dataframe visits each column name in turn
    if country in countries_to_look_at:   #If the column name appears in the list countries_to_look_at
        plt.plot(gas.Year, gas[country], marker='.', label=country)   #Plot the line - Set the label equal to country

        
#This code controls how the values on the x axis are displayed - the ticks indicate 3 year intervals
#Although our records stop at 2008 - we might want more space at the edge of our chart - so we can add another tick interval
plt.xticks(gas.Year[::3].tolist()+[2009])

#Add labels to the x and y axis
plt.xlabel('Year')
plt.ylabel('US Dollars')

#Add a legend to the chart - this may not work if you have not added labels to the lines (Like we did above)
plt.legend()

#Save the chart as a separate image
#plt.savefig('Gas_price_figure.png', dpi=300)

#display the chart within the notebook
plt.show()
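
Two details of the cell above are easy to miss: a `for` loop over a DataFrame visits the column names (not the rows), and the tick list is built by slicing the Year column and appending one extra value. The sketch below shows both on a made-up DataFrame called `sample`; the values are invented for illustration.

```python
import pandas as pd

# Made-up values, for illustration only
sample = pd.DataFrame({
    'Year': [1990, 1991, 1992, 1993, 1994, 1995, 1996],
    'USA': [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7],
    'Canada': [1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4],
})

# Looping over a DataFrame visits its column names
column_names = [name for name in sample]

# [::3] keeps every third year; + [1997] adds one extra tick past the data
ticks = sample.Year[::3].tolist() + [1997]

print(column_names)
print(ticks)
```

Since the loop only needs the chosen columns, looping over `countries_to_look_at` directly would also work, and the lines would then be drawn in the order of that list rather than the column order of the DataFrame.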

**Change the reference from gas to df**

Use Find and Replace. If you choose **Edit** from the main menu and **Find** from the dropdown list, you can expand the **Find dialog box** to make it Find and Replace.

In the Find box type **gas**, in the Replace box type **df**, then click on the **Match Case** icon, then click the **Replace All** button at the bottom of the window.

Check the name of your data file (where the dataset is read in at the top of the notebook). Replace All will also have changed it, from 'gas_prices.csv' to 'df_prices.csv', so change it back to the correct file name. Then run through the notebook to check that it still works.

The titles on your charts should not need changing back, because with **Match Case** selected the capitalised word "Gas" is left alone.


## TASK

### Create a Detailed Line Graph to Show the Gas Prices for Germany, Italy, Mexico, UK and Japan
   
* Make the size of the chart 16 x 8 (figsize is measured in inches)
* Provide a meaningful title at font size 16
* Use 2 year intervals
* Use a Circle marker - google 'matplotlib markers' or check out this link: https://matplotlib.org/stable/api/markers_api.html
* Rotate the labels in the x axis 45 degrees - google 'matplotlib rotate x axis labels' or check out this link: https://www.geeksforgeeks.org/how-to-rotate-x-axis-tick-label-text-in-matplotlib/


In [None]:
##Detailed Line Chart
