# Introduction

One of the most popular uses for Python is data analysis and visualisation. You might have created a graph in Microsoft Excel before. In Excel, we have to click around the software to change how the graph looks or which data it uses. This might not take too long if you're only doing it once, but what if you had to make similar graphs another ten times? A hundred times? Or a thousand times? The process would quickly get very boring, take a lot of time, and you'd be more likely to make mistakes.

We can use a Python package called `matplotlib` to plot data. In this case, our program is a series of instructions telling matplotlib what data to use and how to format it. It's very customisable, and you can use the same code over and over again.

This activity is a little more difficult than the ones before it. Don't worry if you don't understand how all the code works. Here you will mostly copy and paste the code and just edit small snippets. The aim is to show you how Python is used for plotting, since this is important in lots of jobs like data science and research!

<br> <br>

# Python plotting

## Women in STEM

In the following activity we are going to create our own customised version of this graph:

![Alt text](vscode-local:/c%3A/Users/Emily/Documents/GitHub/outreach_jupyter/images/12.png)

### Load our packages

In the previous activity we loaded a Python package called `sleep` that contained functions to let Python carry out additional tasks. Python does not know how to process and plot data on its own, so in this activity we need to load two extra packages known as `pandas` and `matplotlib`. `pandas` is a package that lets Python read and edit data, much like you'd use Excel to process raw data. `matplotlib` is the package that then lets us plot the data. We load them by pasting the following code into our first chunk and pressing the play button:

In [1]:
import matplotlib.pyplot as plt
import pandas as pd

### Load the data

In [2]:
# Read the CSV file at this web address into a table called data
data = pd.read_csv("https://raw.githubusercontent.com/ejjohnson93/ejjohnson93.github.io/main/data/women_stem_data.csv")
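
Before plotting, it can help to check what was loaded. The sketch below is a small illustration that runs on its own: it reads a tiny made-up table (the numbers are placeholders, not the real dataset) and shows two common ways to inspect it. The name `example` is only used here. In the notebook you could call `data.head()` and `data.columns` on the `data` variable instead.

```python
import io
import pandas as pd

# Made-up stand-in for the real CSV file: placeholder numbers only.
csv_text = """Date,Comp Sci,Law
1970,10.0,5.0
1971,11.0,6.0
1972,12.0,7.0
"""

# pd.read_csv accepts a file path, a web address, or a file-like object such as StringIO.
example = pd.read_csv(io.StringIO(csv_text))

print(example.head())         # the first rows of the table (up to five)
print(list(example.columns))  # the column names we can choose from when plotting
```

The column names matter because they are what we type inside the square brackets, as in `data["Date"]`.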

### Plot it!

In [None]:
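# The "Date" column holds the years, which go along the x-axis
x = data["Date"]

In [None]:
# The "Comp Sci" column holds the values for the y-axis
y = data["Comp Sci"]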

In [None]:
# Draw a line graph of y against x, then display it
plt.plot(x, y)
plt.show()

In [None]:
plt.plot(x, y)
plt.xlabel('Year')
plt.ylabel('% women majors')
plt.show()

In [None]:
# label gives the line a name, which plt.legend() then displays
plt.plot(x, y, label = "Computer Science")
plt.xlabel('Year')
plt.ylabel('% women majors')
plt.legend()
plt.show()

In [None]:
# Pick out more columns so we can compare several subjects on one graph
y = data["Comp Sci"]
y2 = data["Law"]
y3 = data["Medical school"]
y4 = data["Physical sciences"]

In [None]:
# Plot the data
plt.plot(x, y, label = "Computer Science")
plt.plot(x, y2, label = "Law")
plt.plot(x, y3, label = "Medical School")
plt.plot(x, y4, label = "Physical Sciences")

# Add axis labels and legend
plt.xlabel('Year')
plt.ylabel('% women majors')
plt.legend()
plt.show()
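
Since the four `plt.plot` lines above differ only in the column and the label, the same kind of graph could also be produced with a loop. This is a sketch under assumptions: the small table is made-up placeholder data so the example runs on its own, and `example` and `columns_to_plot` are names introduced here for illustration. In the notebook you would use the `data` variable instead.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up placeholder numbers, only so this example runs on its own.
example = pd.DataFrame({
    "Date": [1970, 1980, 1990],
    "Comp Sci": [10.0, 20.0, 30.0],
    "Law": [5.0, 15.0, 25.0],
})

# Column name in the table -> label to show in the legend (hypothetical helper).
columns_to_plot = {"Comp Sci": "Computer Science", "Law": "Law"}

# One plt.plot call per column, instead of writing each line out by hand.
for column, label in columns_to_plot.items():
    plt.plot(example["Date"], example[column], label=label)

# Add axis labels and legend
plt.xlabel('Year')
plt.ylabel('% women majors')
plt.legend()
plt.show()
```

Adding another subject then only means adding one entry to `columns_to_plot`, which is the kind of reuse that makes code quicker than clicking through a spreadsheet.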

### Make it pretty

In [None]:
# The color argument sets the colour of each line
plt.plot(x, y, label = "Computer Science", color = "red")
plt.plot(x, y2, label = "Law", color = "lightseagreen")
plt.plot(x, y3, label = "Medical School", color = "teal")
plt.plot(x, y4, label = "Physical Sciences", color = "mediumturquoise")

plt.xlabel('Year')
plt.ylabel('% women majors')
plt.legend()
plt.show()

Challenge: can you change the colours of the graph to ones of your choosing?

    Can you change the colours using their names?

    Can you change the colours using hexcodes instead?
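
As a hint for the hexcode question: a hexcode is a `#` followed by six characters that describe how much red, green and blue to mix. The sketch below uses `to_hex` from `matplotlib.colors` to look up the hexcode that matches a colour name, then passes a hexcode straight to `color`. The points being plotted are arbitrary placeholders, not values from the dataset.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

# Look up the hexcode that matches a colour name.
red_hex = to_hex("red")
teal_hex = to_hex("teal")
print(red_hex, teal_hex)

# A hexcode can be used anywhere a colour name can (placeholder points, not real data).
plt.plot([1, 2, 3], [1, 4, 9], color = "#20b2aa", label = "a hexcode colour")
plt.legend()
plt.show()
```

An online colour picker will give you the hexcode for any colour you like.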


In [None]:
# Code goes here

In [None]:
# linewidth sets how thick each line is; linestyle sets its pattern
plt.plot(x, y, label = "Computer Science", color = "red", linewidth = 1.5, linestyle = "solid")
plt.plot(x, y2, label = "Law", color = "lightseagreen", linewidth = 1.4, linestyle = "dashed")
plt.plot(x, y3, label = "Medical School", color = "teal", linewidth = 1.3, linestyle = "dotted")
plt.plot(x, y4, label = "Physical Sciences", color = "mediumturquoise", linewidth = 1.2, linestyle = "dashdot")

plt.xlabel('Year')
plt.ylabel('% women majors')
plt.legend()
plt.show()

Challenge: can you edit the code above so the graph has your own custom colours, line widths and styles? 

In [None]:
# Code goes here

In [None]:
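# Load matplotlib's style module so we can change the overall look of our graphs
import matplotlib.style as style

In [None]:
# List the names of all the built-in styles
style.available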

In [None]:
# Use the 'ggplot' style for every graph drawn from now on
style.use('ggplot')

plt.plot(x, y, label = "Computer Science")
plt.plot(x, y2, label = "Law")
plt.plot(x, y3, label = "Medical School")
plt.plot(x, y4, label = "Physical Sciences")

plt.xlabel('Year')
plt.ylabel('% women majors')
plt.legend()
plt.show()
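
Note that `style.use` keeps applying to every graph drawn afterwards. Below is a minimal sketch of two ways to handle that, assuming you want to try a style on one graph only: `style.context` applies a style just inside a `with` block, and `style.use('default')` switches back to matplotlib's standard look. The plotted points are arbitrary placeholders, and `background_before` and `background_after` are names made up for this example.

```python
import matplotlib.pyplot as plt
import matplotlib.style as style

# Check that the style we want is in the list of available styles.
print('ggplot' in style.available)

# Remember the current background colour setting so we can compare later.
background_before = plt.rcParams['axes.facecolor']

# Inside the with-block the ggplot style is active...
with style.context('ggplot'):
    plt.plot([1, 2, 3], [2, 4, 8], label = "placeholder data")
    plt.legend()
    plt.show()

# ...and afterwards the previous settings are back.
background_after = plt.rcParams['axes.facecolor']
print(background_before == background_after)

# To undo an earlier style.use(...) call, switch back to the default style.
style.use('default')
```

This makes it easier to compare a few styles without one of them sticking around by accident.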

Challenge: pick a style! Apply it to your graph instead of the one used in the example above. Try a few. Which do you like best?

In [None]:
# Code goes here

# Supplementary activities
## Map creation

In [3]:
# Import folium, a package for making interactive maps
import folium

In [4]:
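# location is [latitude, longitude]; the name my_map avoids overwriting Python's built-in map function
my_map = folium.Map(location=[53.410782, -2.977840])

In [5]:
# Display the map
my_map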

In [None]:
# zoom_start sets how zoomed in the map starts; bigger numbers are closer in
map2 = folium.Map(location=[53.4074271, -2.96251923], zoom_start=17)

In [None]:
map2

In [6]:
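# tiles sets the theme of the map. Note: newer versions of folium no longer include the
# Stamen tiles, so if this cell gives an error try tiles="CartoDB positron" instead
map3 = folium.Map(location=[53.4074271, -2.96251923],
  tiles="Stamen Watercolor",
  zoom_start=12)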

In [7]:
map3

Challenge: can you create a map of your school’s location?

    Can you zoom in on it?

    Can you change the theme?


In [None]:
# Code goes here

Challenge: pick another location, preferably somewhere outside the UK. Can you make a map for it?

    If you wanted to you could also just pick random values for latitude and longitude and see what you find! Values for latitude (the horizontal lines) can be between -90 and 90. Values for longitude (the vertical lines) can be between -180 and 180. Be warned, you’re most likely to just end up with open sea if you choose at random…
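
The random approach could be sketched as below, using Python's built-in `random` module to pick a latitude and longitude within the ranges given above. `random_location` is a name made up for this example. The final comments show how the result could be passed to `folium.Map` in the same way as the earlier cells.

```python
import random

def random_location():
    """Return a random [latitude, longitude] pair (hypothetical helper)."""
    latitude = random.uniform(-90, 90)      # horizontal lines: -90 to 90
    longitude = random.uniform(-180, 180)   # vertical lines: -180 to 180
    return [latitude, longitude]

location = random_location()
print(location)

# In the notebook, with folium imported as before:
# random_map = folium.Map(location=location, zoom_start=4)
# random_map
```

Run the cell a few times to get a new location each time.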


In [None]:
# Code goes here