# Filtering & Plotting: Modifying Data Before Creating a Chart with Plotly (Exercise 2)

## Question. &nbsp;&nbsp; What is the relationship between horsepower and miles per gallon for American cars?

Dataset: Miles-per-gallon performance of various car models from 1970 to 1982.

Columns:
- mpg (miles per gallon)
- cylinders (# cylinders)
- displacement (engine displacement in cubic inches)
- horsepower
- weight (vehicle weight in lbs)
- acceleration (time to accelerate from 0 to 60 mph in seconds)
- model_year (last two digits of the year, i.e. year modulo 100)
- origin (origin of car: 1 = American, 2 = European, 3 = Japanese)
- name (make and model)

## Pre-programming Discussion

#!action

Q1. What column should we use for filtering the data? Why?

#!response

origin. 

We only want to know about American cars, which are the rows where origin is 1.

#!action

Q2. Which columns from the dataset will be used in the plot?

#!response

mpg and horsepower. 

These column names match the terms in the question: mpg is miles per gallon, and horsepower appears by name.

#!action

Q3. Which column will go on the x-axis?

#!response

horsepower. 

By convention, y changes as a function of x. The wording of the question implies that mpg changes as a function of horsepower. Therefore, horsepower is x and mpg is y.

#!action

Q4. Which column will go on the y-axis?

#!response

mpg. 

See the explanation for the previous question.

#!action

Screenshot your answers.

## Programming Activity: Creating a Scatter Plot Graph of MPG vs Horsepower of American Cars

### Step 1. &nbsp;&nbsp;&nbsp; Read CSV Data into Pandas Dataframe

#### Substep. &nbsp;&nbsp; Import Pandas Library (if Needed)

#!blhint

- Create variable &nbsp;`pd`
- `import pandas as pd`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li><code>import .. as ..</code><ul><li>The <code>import .. as ..</code> block is found under IMPORT.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li></ul></li></ul></details>

In [1]:
#!response

import pandas as pd

#### Substep. &nbsp;&nbsp; Read CSV Data and Save in Variable

#!blhint

- Create variable &nbsp;`df`
- `set df to` &nbsp;:&nbsp; `with pd do read_csv using` &nbsp;:&nbsp; `" datasets/mpg.csv "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>with .. do .. using ..</code><ul><li>The <code>with .. do .. using ..</code> block is found under VARIABLES.</li> <li>If the <code>do</code> dropdown says "!Not populated until you execute code", click anywhere in the notebook tab, then click "Run All Above Selected Cell" from the "Run" menu.</li> <li>If the <code>with .. do .. using ..</code> block does not snap together nicely with the <code>set .. to</code> block, try dragging the <code>set .. to</code> block instead.</li> <li>You can use the <code>+ -</code> controls on the block to change the number of notches. Unless specifically instructed, the block should not have any empty notches when you click Blocks to Code. </li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [2]:
#!response

df = pd.read_csv('datasets/mpg.csv')
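
To see what `read_csv` returns without needing `datasets/mpg.csv` on disk, the sketch below reads a few rows from an in-memory string instead. The `sample_csv` text and the `sample_df` name are illustrative stand-ins built from rows shown in this exercise; the real file has many more rows and may be laid out slightly differently.

```python
import io

import pandas as pd

# Illustrative stand-in for datasets/mpg.csv: a handful of rows from this exercise.
sample_csv = """mpg,cylinders,displacement,horsepower,weight,acceleration,model_year,origin,name
18.0,8,307.0,130.0,3504,12.0,70,1,chevrolet chevelle malibu
15.0,8,350.0,165.0,3693,11.5,70,1,buick skylark 320
44.0,4,97.0,52.0,2130,24.6,82,2,vw pickup
32.0,4,135.0,84.0,2295,11.6,82,1,dodge rampage
"""

# read_csv accepts a file path or a file-like object such as StringIO.
sample_df = pd.read_csv(io.StringIO(sample_csv))

print(sample_df.shape)   # (number of rows, number of columns)
print(sample_df.dtypes)  # column types inferred from the text
```

The first line of the CSV becomes the column names, and each later line becomes one row of the dataframe.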

#### Substep. &nbsp;&nbsp; Display Dataframe Contents

#!blhint

- variable &nbsp;`df`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li>variable<ul><li>After it is created, each variable has its own block at the end of the VARIABLES menu.</li></ul></li></ul></li></ul></details>

In [3]:
#!response

df

Unnamed: 0,mpg,cylinders,displacement,horsepower,weight,acceleration,model_year,origin,name
0,18.0,8,307.0,130.0,3504,12.0,70,1,chevrolet chevelle malibu
1,15.0,8,350.0,165.0,3693,11.5,70,1,buick skylark 320
2,18.0,8,318.0,150.0,3436,11.0,70,1,plymouth satellite
3,16.0,8,304.0,150.0,3433,12.0,70,1,amc rebel sst
4,17.0,8,302.0,140.0,3449,10.5,70,1,ford torino
...,...,...,...,...,...,...,...,...,...
393,27.0,4,140.0,86.0,2790,15.6,82,1,ford mustang gl
394,44.0,4,97.0,52.0,2130,24.6,82,2,vw pickup
395,32.0,4,135.0,84.0,2295,11.6,82,1,dodge rampage
396,28.0,4,120.0,79.0,2625,18.6,82,1,ford ranger


### Step 2. &nbsp;&nbsp;&nbsp; Data Modification

#### Substep. &nbsp;&nbsp; Filter Data

#!blhint

- `set df to` &nbsp;:&nbsp; `df [ .. ]` &nbsp;<-&nbsp; ` .. = .. ` &nbsp;<-
  - `from df get origin`
  - `1`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li><li><code>A</code> &nbsp;&lt;-&nbsp; <code>B</code>&nbsp; means insert block B into the hole in block A.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>\{dictVariable\} [ .. ]</code><ul><li>The <code>\{dictVariable\} [ .. ]</code> block is found under LISTS.</li></ul></li><li><code> .. = .. </code><ul><li>The <code> .. = .. </code> block is found under LOGIC.</li> <li>Use the dropdown to get other comparison operators.</li></ul></li><li><code>from .. get ..</code><ul><li>The <code>from .. get ..</code> block is found under VARIABLES.</li></ul></li></ul></li></ul></details>

In [4]:
#!response

df = df[df.origin == 1]
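
The filter works in two stages: the comparison `df.origin == 1` produces one True/False value per row, and `df[ .. ]` keeps only the rows marked True. A minimal sketch on a three-row dataframe follows; the names `sample_df`, `mask`, and `american` are illustrative, and the rows are borrowed from this exercise.

```python
import pandas as pd

# Illustrative sample: two American cars (origin 1) and one European car (origin 2).
sample_df = pd.DataFrame({
    "name": ["ford torino", "vw pickup", "dodge rampage"],
    "horsepower": [140.0, 52.0, 84.0],
    "mpg": [17.0, 44.0, 32.0],
    "origin": [1, 2, 1],
})

# Stage 1: compare every value in the origin column to 1.
mask = sample_df["origin"] == 1
print(mask.tolist())  # [True, False, True]

# Stage 2: keep only the rows where the mask is True.
american = sample_df[mask]
print(american["name"].tolist())  # ['ford torino', 'dodge rampage']
print(american.index.tolist())    # [0, 2]
```

Filtering keeps the original row labels, which is why the filtered dataframe in this exercise skips some numbers (row 394, the vw pickup, is gone).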

#### Substep. &nbsp;&nbsp; Display Dataframe Contents

#!blhint

- variable &nbsp;`df`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li>variable<ul><li>After it is created, each variable has its own block at the end of the VARIABLES menu.</li></ul></li></ul></li></ul></details>

In [5]:
#!response

df

Unnamed: 0,mpg,cylinders,displacement,horsepower,weight,acceleration,model_year,origin,name
0,18.0,8,307.0,130.0,3504,12.0,70,1,chevrolet chevelle malibu
1,15.0,8,350.0,165.0,3693,11.5,70,1,buick skylark 320
2,18.0,8,318.0,150.0,3436,11.0,70,1,plymouth satellite
3,16.0,8,304.0,150.0,3433,12.0,70,1,amc rebel sst
4,17.0,8,302.0,140.0,3449,10.5,70,1,ford torino
...,...,...,...,...,...,...,...,...,...
392,27.0,4,151.0,90.0,2950,17.3,82,1,chevrolet camaro
393,27.0,4,140.0,86.0,2790,15.6,82,1,ford mustang gl
395,32.0,4,135.0,84.0,2295,11.6,82,1,dodge rampage
396,28.0,4,120.0,79.0,2625,18.6,82,1,ford ranger


### Step 3. &nbsp;&nbsp;&nbsp; Generate Plotly Scatter Graph

#### Substep. &nbsp;&nbsp; Import Plotly Express Library


#!blhint

- Create variable &nbsp;`px`
- `import plotly.express as px`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li><code>import .. as ..</code><ul><li>The <code>import .. as ..</code> block is found under IMPORT.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li></ul></li></ul></details>

In [6]:
#!response

import plotly.express as px

#### Substep. &nbsp;&nbsp; Set Columns as x and y


#!blhint

- Create variable &nbsp;`x`
- `set x to` &nbsp;:&nbsp; `" horsepower "`
- Create variable &nbsp;`y`
- `set y to` &nbsp;:&nbsp; `" mpg "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [7]:
#!response

x = 'horsepower'
y = 'mpg'
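
Note that `x` and `y` hold column names as plain text, not the data itself; Plotly Express looks the names up in the dataframe it is given. A quick, optional check that both names really are columns can catch typos early. The sketch below assumes a tiny illustrative dataframe called `sample_df`.

```python
import pandas as pd

# Illustrative two-row sample with the two columns used in the plot.
sample_df = pd.DataFrame({"horsepower": [130.0, 165.0], "mpg": [18.0, 15.0]})

x = "horsepower"
y = "mpg"

# Each name should appear among the dataframe's columns.
for column in (x, y):
    print(column, column in sample_df.columns)
```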

#### Substep. &nbsp;&nbsp; Set Additional Plot Options

#!blhint

- Create variable &nbsp;`title`
- `set title to` &nbsp;:&nbsp; `" MPG vs Horsepower of American Cars "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [8]:
#!response

title = 'MPG vs Horsepower of American Cars'

#### Substep. &nbsp;&nbsp; Generate Scatter Graph

#!blhint

- Grab a freestyle block, type &nbsp;`x=x`
- Grab a freestyle block, type &nbsp;`y=y`
- Grab a freestyle block, type &nbsp;`title=title`
- `with px do scatter using` &nbsp;:
    - variable &nbsp;`df`
    - `x=x`
    - `y=y`
    - `title=title`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li>freestyle<ul><li>Unless specifically instructed, use the first block from the FREESTYLE menu.</li></ul></li><li><code>with .. do .. using ..</code><ul><li>The <code>with .. do .. using ..</code> block is found under VARIABLES.</li> <li>If the <code>do</code> dropdown says "!Not populated until you execute code", click anywhere in the notebook tab, then click "Run All Above Selected Cell" from the "Run" menu.</li> <li>If the <code>with .. do .. using ..</code> block does not snap together nicely with the <code>set .. to</code> block, try dragging the <code>set .. to</code> block instead.</li> <li>You can use the <code>+ -</code> controls on the block to change the number of notches. Unless specifically instructed, the block should not have any empty notches when you click Blocks to Code. </li></ul></li><li>variable<ul><li>After it is created, each variable has its own block at the end of the VARIABLES menu.</li></ul></li></ul></li></ul></details>

In [9]:
#!response

px.scatter(df, x=x, y=y, title=title)

#!action

Screenshot the chart.