![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

<a href="https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fcallysto%2Fshorts&branch=master&subPath=Plotly.ipynb&depth=1" target="_parent"><img src="https://raw.githubusercontent.com/callysto/curriculum-notebooks/master/open-in-callysto-button.svg?sanitize=true" width="123" height="24" alt="Open in Callysto"/></a>

# Making Charts

This is a short tutorial on how to create charts for use in class. We will cover
- scatter plots
- bar charts
- line graphs
- bubble plots
- pie charts
- histograms
- plotting from a data frame
- writing cleaner code

This is a live notebook, so start by running the code. You can click the "**Run All**" button in the menu above, which looks like this:
<img src="RunButton.png" alt="Run button image" width="100"/>
Or you can select the menu item **Kernel/Restart & Run All.**

To learn all about Plotly for use in Python, please see the documentation online at https://plotly.com/graphing-libraries/


## Getting started

We are programming in Python. If you need help with Python, please see Callysto's "Getting Started" resources on how to start programming.

It should be easy to use the following code directly, editing it as you see fit. Just remember to follow the same syntax. That is, use the same punctuation (periods, commas, brackets, parentheses, spacing) as you see in the code. If something is wrong, Python will give you an error message that can help explain what needs to be fixed.

We start by importing (loading into code) a software library, named Plotly Express.

In [20]:
import plotly.express as px

## Example 1 - a scatter plot

Here is a very simple example. Using the "**scatter**" function from Plotly Express (the px library) we pass x and y values and have them plotted. 


In [21]:
px.scatter(x=[0, 1, 2, 3, 4], y=[0, 1, 4, 9, 16])

Notice the plot above is **live.** You can click on it, scroll on it, even save it to your computer at home. 


### Titles

It is useful to add a title to our plot, which we do by adding the keyword **title** to the function call. 

In [22]:
px.scatter(x=[0, 1, 2, 3, 4], y=[0, 1, 4, 9, 16], title="Our first chart!")

### Labeling the axes

We can also rename the x and y axes by adding the **labels** keyword to the function call. Here we are using a data format called a dictionary,
```
{"x":"Integers", "y":"Squares"}
```
which is a collection of paired items, in this case pairs of text strings. This tells the plot to replace the label **x** with the word **Integers**, and the label **y** with the word **Squares**.

In [23]:
px.scatter(x=[0, 1, 2, 3, 4], y=[0, 1, 4, 9, 16],labels={"x":"Integers", "y":"Squares"})
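
If dictionaries are new to you, the following is a small sketch in plain Python (no plotting involved) of how the pairs work. The variable names `axis_labels` and `same_labels` are made up for this illustration.

```python
# A dictionary pairs each key (left of the colon) with a value (right of the colon).
axis_labels = {"x": "Integers", "y": "Squares"}

# Look up the value stored under each key.
print(axis_labels["x"])
print(axis_labels["y"])

# The same dictionary can also be built one pair at a time.
same_labels = {}
same_labels["x"] = "Integers"
same_labels["y"] = "Squares"
print(same_labels == axis_labels)
```

Passing a dictionary like this as **labels** is how the plotting function knows which word to show for each axis.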

### Titles and axes labels

Of course, we can also include a title and change the x,y axis labels using a single function call, like this.

In [24]:
px.scatter(x=[0, 1, 2, 3, 4], y=[0, 1, 4, 9, 16],labels={"x":"Integers", "y":"Squares"},title="A nicer chart")

## Example 2 - a bar graph

Creating a bar graph is the same process, using the **bar** function in Plotly Express (px). 

In [25]:
px.bar(x=[0,1,2,3,4],y=[0,1,4,9,16])

### Titles and axes labels

As before, we can include a title and change the x,y axis labels using a single function call:

In [26]:
px.bar(x=[0,1,2,3,4],y=[0,1,4,9,16],labels={"x":"Integers", "y":"Squares"},title="A bar chart")

## Example 3 - a line plot

Creating a line plot is the same process, using the **line** function in Plotly Express (px). 

In [27]:
px.line(x=[0,1,2,3,4],y=[0,1,4,9,16],title="My first line chart")

## Example 4 - a bubble chart

A bubble chart is really just a scatter chart where we control the size of the markers. So we need to include both x and y values, as well as a **size** value for each bubble.

In [28]:
px.scatter(x=[0,1,2,3,4],y=[0,1,4,9,16],size=[1,1,5,5,10],title="A simple bubble chart")

## Example 5 - a pie graph

Creating a pie graph is similar, using the **pie** function. Instead of x,y for data, we just include a list of values. The pie chart will show the information as percentages.

In [29]:
px.pie(values=[0,1,2,3,4])
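
To see where the percentages come from, here is a rough sketch of the arithmetic in plain Python. It is only an illustration of the idea, not Plotly's own code: each value is divided by the total of all the values.

```python
values = [0, 1, 2, 3, 4]

# Add up all the values, then express each one as a share of that total.
total = sum(values)
percentages = [100 * v / total for v in values]

print(total)
print(percentages)
```

Here the total is 10, so the value 4 accounts for 40% of the pie.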

### Names and titles

We can add names for our list of values (the two lists must be the same length), and include a title for the chart, like this:

In [31]:
px.pie(values=[0,1,2,3,4],names=["Zero","One","Two","Three","Four"], title="My pie chart")

## Example 6 - a histogram

A histogram looks somewhat like a bar chart, but the information it represents is different. The histogram will take a list of values (the x values) and break them up into a number of equally spaced bins, and count how many items are in each bin.

For instance, you might look at the scores on a test, and see how many students got a score of 90% to 100%, how many got a score of 80% to 90%, how many of 70% to 80%, and so on. Plotting these "counts" gives the histogram of the grade distribution. 
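
The counting step can be sketched by hand in plain Python. The test scores below are made up for illustration, and the bin width of 10 is an assumption chosen to match the grade ranges described above.

```python
# Made-up test scores, for illustration only.
scores = [95, 82, 88, 71, 64, 90, 78, 85, 99, 73]
bin_width = 10

# For each score, find the start of its bin (e.g. 82 falls in the bin starting at 80)
# and add one to that bin's count.
counts = {}
for score in scores:
    bin_start = (score // bin_width) * bin_width
    counts[bin_start] = counts.get(bin_start, 0) + 1

# Print the bins from lowest to highest.
for bin_start in sorted(counts):
    print(bin_start, "to", bin_start + bin_width, ":", counts[bin_start])
```

A histogram draws one bar per bin, with the height of each bar given by its count.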

For our example here, we will create a list of 10000 random numbers, using the **randn** function. The histogram will show something like a bell curve, with most of the numbers centered around zero. 

In [12]:
from numpy.random import randn
myList = randn(10000)
px.histogram(x=myList)

### Number of bins

You can specify the number of bins to control how finely or coarsely the histogram divides up the list of numbers to count. Use the parameter **nbins.** We can also add a title, of course.

In [32]:
px.histogram(x=myList,nbins=40,title="A distribution of random numbers")

If you are curious about the list of random numbers, **myList**, we can print it out. Of course, this is a long list (10,000 numbers), so the printout here shows just a few of them, for simplicity:

In [14]:
myList

array([ 1.59610114,  2.48099261,  0.52261511, ..., -0.20331226,
       -0.46450327, -1.15879721])

## Example 7 - Plotting from a DataFrame

Data scientists often use a data structure known as a DataFrame to hold their data. A DataFrame is structured like a spreadsheet, with rows and columns holding various numbers representing the data.
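
As a small sketch of the idea, a DataFrame can also be built by hand using the pandas library. The city names and visitor numbers here are invented for illustration, and `df_small` is just a name chosen for this example.

```python
import pandas as pd

# Made-up numbers, for illustration only. Each key becomes a column.
df_small = pd.DataFrame({
    "city": ["Avalon", "Brookside", "Avalon", "Brookside"],
    "year": [2006, 2006, 2007, 2007],
    "visitors": [120, 80, 150, 95],
})

print(df_small)               # the whole table, rows and columns
print(df_small["visitors"])   # a single column, selected by name

# Keep only the rows where the year is 2007.
df_2007 = df_small.query("year == 2007")
print(df_2007)
```

The **query** method filters rows using a condition written as text, which is handy for picking out one year or one region from a larger table.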

In the following example, we load in a DataFrame (with a call to GapMinder) representing some information about the population of countries in Europe. We then plot using the pie function, to get a nice pie chart of the various countries. The **values** come from the 'pop' column in the DataFrame and the **names** come from the 'country' column in the DataFrame. 

In [33]:
df = px.data.gapminder().query("year == 2007").query("continent == 'Europe'")
px.pie(df, values='pop', names='country', title='Population of European continent')

In case you are interested, the following shows the information contained in the DataFrame named **df.** Notice that you can see the 'country' and 'pop' columns in the DataFrame that were used in the code above.

In [16]:
df

| | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num |
|---|---|---|---|---|---|---|---|---|
| 23 | Albania | Europe | 2007 | 76.423 | 3600523 | 5937.029526 | ALB | 8 |
| 83 | Austria | Europe | 2007 | 79.829 | 8199783 | 36126.4927 | AUT | 40 |
| 119 | Belgium | Europe | 2007 | 79.441 | 10392226 | 33692.60508 | BEL | 56 |
| 155 | Bosnia and Herzegovina | Europe | 2007 | 74.852 | 4552198 | 7446.298803 | BIH | 70 |
| 191 | Bulgaria | Europe | 2007 | 73.005 | 7322858 | 10680.79282 | BGR | 100 |
| 383 | Croatia | Europe | 2007 | 75.748 | 4493312 | 14619.22272 | HRV | 191 |
| 407 | Czech Republic | Europe | 2007 | 76.486 | 10228744 | 22833.30851 | CZE | 203 |
| 419 | Denmark | Europe | 2007 | 78.332 | 5468120 | 35278.41874 | DNK | 208 |
| 527 | Finland | Europe | 2007 | 79.313 | 5238460 | 33207.0844 | FIN | 246 |
| 539 | France | Europe | 2007 | 80.657 | 61083916 | 30470.0167 | FRA | 250 |


### Another dataframe, with a Bubble chart

We can use the same idea to create a bubble chart, showing the relative population sizes of various countries, and how this relates to their Gross Domestic Product per Capita (GDP), and life expectancy.

The chart uses **color** to identify the continent, and a logarithmic scale on the x-axis (GDP) to make the patterns more clear. 

In [17]:
df2 = px.data.gapminder().query("year==2007")
px.scatter(df2, x="gdpPercap", y="lifeExp",
           size="pop", color="continent",
           hover_name="country", log_x=True, size_max=60)
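
To get a feel for why a logarithmic scale helps, here is a rough sketch using made-up GDP figures (they are not taken from the GapMinder data). On a logarithmic axis, positions follow the logarithm of the value, so each tenfold increase takes up the same distance.

```python
import math

# Made-up GDP per capita figures, each ten times the previous one.
gdp_values = [500, 5000, 50000]

# The base-10 logarithm gives the position of each value on a log axis.
log_positions = [math.log10(g) for g in gdp_values]

for g, position in zip(gdp_values, log_positions):
    print(g, round(position, 3))

# The gap between neighbours is the same, even though the raw gaps
# (4500 and 45000) are very different.
print(round(log_positions[1] - log_positions[0], 3))
print(round(log_positions[2] - log_positions[1], 3))
```

On an ordinary axis the poorer countries would be squeezed together at the left edge; the logarithmic axis spreads them out.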

## Writing cleaner code

In the above examples, we tried to keep the code very short by including x and y values in the function calls. This is not considered the best way to write code. 

Usually it is better to define the x and y lists as variables in Python, then pass those variables to the plotting function. The result of the scatter call can also be stored in a variable (like **fig**) and then told to show itself.

For instance, a scatter plot might be done like this:

In [18]:
x = [0,1,2,3,4]
y = [0,1,4,9,16]
title = "My compressed code makes a nice graph"
fig = px.scatter(x=x,y=y,title=title)
fig.show()

### Less confusion

Of course, you can name the x,y variables anything you like. So if the following example is clearer to you, feel free to write it like this:

In [19]:
x_list = [0,1,2,3,4]
y_list = [0,1,4,9,16]
my_title = "My compressed code makes a nice graph"
fig = px.scatter(x=x_list,y=y_list,title=my_title)
fig.show()

## Summary

There are many, many uses of Plotly for creating interesting graphs. This is only the briefest of introductions. As mentioned above, please go to the Plotly webpages for more details on how to use this tool. 

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)