# Filtering & Plotting: Modifying Data Before Creating a Chart with Plotly (Exercise 1)

## Question. &nbsp;&nbsp; How have the CO2 emissions from South American countries changed over time?

Dataset: CO2 emissions data for all countries, 2008-2011.

Columns:
- Year
- Country
- Continent
- Emission

## Pre-programming Discussion

#!action

Q1. What column should we use for filtering the data? Why?

#!response

Continent. 

We only want to know about South American countries.

#!action

Q2. Which columns from the dataset will be used in the plot?

#!response

Year, Country, Emission. 

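The column names match the terms in the question: emissions (Emission) of countries (Country) over time (Year). Continent is used only for filtering, not for plotting.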

#!action

Q3. Which column will go on the x-axis?

#!response

Year. 

y changes as a function of x. The wording of the question implies that emissions change as a function of time. Therefore, Emission is y and Year is x.

#!action

Q4. Which column will go on the y-axis?

#!response

Emission. 

See the explanation for Q3.

#!action

Q5. What column makes the most sense to sort by? Why?

#!response

Year. 

The independent variable has an implicit ordering, and a line graph connects points in the order the rows appear, so we should sort on that variable.

#!action

Q6. Which column will be used for coloring the lines?

#!response

Country. 

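The question asks how emissions changed for each South American country, so each country gets its own line, and color tells the lines apart.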

#!action

Screenshot your answers.

## Programming Activity: Creating a Line Graph of CO2 Emissions from South American Countries (2008-2011)

### Step 1. &nbsp;&nbsp;&nbsp; Read CSV Data into Pandas Dataframe

#### Substep. &nbsp;&nbsp; Import Pandas Library (if Needed)

#!blhint

- Create variable &nbsp;`pd`
- `import pandas as pd`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li><code>import .. as ..</code><ul><li>The <code>import .. as ..</code> block is found under IMPORT.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li></ul></li></ul></details>

In [28]:
#!response

import pandas as pd

#### Substep. &nbsp;&nbsp; Read CSV Data and Save in Variable

#!blhint

- Create variable &nbsp;`df`
- `set df to` &nbsp;:&nbsp; `with pd do read_csv using` &nbsp;:&nbsp; `" datasets/emissions.csv "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>with .. do .. using ..</code><ul><li>The <code>with .. do .. using ..</code> block is found under VARIABLES.</li> <li>If the <code>do</code> dropdown says "!Not populated until you execute code", click anywhere in the notebook tab, then click "Run All Above Selected Cell" from the "Run" menu.</li> <li>If the <code>with .. do .. using ..</code> block does not snap together nicely with the <code>set .. to</code> block, try dragging the <code>set .. to</code> block instead.</li> <li>You can use the <code>+ -</code> controls on the block to change the number of notches. Unless specifically instructed, the block should not have any empty notches when you click Blocks to Code. </li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [29]:
#!response

df = pd.read_csv('datasets/emissions.csv')
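
The displayed dataframe includes a column labeled `Unnamed: 0`, which usually means the CSV file stores a row index in an unnamed first column. The sketch below is a possible variation, not part of the exercise. It assumes the file has that layout and uses an in-memory string (`csv_text`, a hypothetical stand-in for `datasets/emissions.csv`) to show how `index_col=0` changes the result.

```python
import io

import pandas as pd

# Hypothetical stand-in for datasets/emissions.csv (three rows taken from
# the output shown in this exercise). The empty first header is an
# assumption about how the real file is laid out.
csv_text = """,Year,Country,Continent,Emission
0,2008,Aruba,South America,24.750133
1,2009,Aruba,South America,24.876706
4,2008,Andorra,Europe,6.296125
"""

# Default read: the unnamed first column becomes a data column
# that pandas names "Unnamed: 0".
default_df = pd.read_csv(io.StringIO(csv_text))

# index_col=0 tells pandas to use the first column as the row index instead.
indexed_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(default_df.columns))
print(list(indexed_df.columns))
```

Either form works for the rest of the exercise, because the filter, sort, and plot steps only use the Year, Country, Continent, and Emission columns.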

#### Substep. &nbsp;&nbsp; Display Dataframe Contents

#!blhint

- variable &nbsp;`df`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li>variable<ul><li>After it is created, each variable has its own block at the end of the VARIABLES menu.</li></ul></li></ul></li></ul></details>

In [30]:
#!response

df

| Unnamed: 0 | Year | Country | Continent | Emission |
|---|---|---|---|---|
| 0 | 2008 | Aruba | South America | 24.750133 |
| 1 | 2009 | Aruba | South America | 24.876706 |
| 2 | 2010 | Aruba | South America | 24.182702 |
| 3 | 2011 | Aruba | South America | 23.922412 |
| 4 | 2008 | Andorra | Europe | 6.296125 |
| ... | ... | ... | ... | ... |
| 783 | 2011 | Zambia | Africa | 0.212450 |
| 784 | 2008 | Zimbabwe | Africa | 0.569255 |
| 785 | 2009 | Zimbabwe | Africa | 0.600521 |
| 786 | 2010 | Zimbabwe | Africa | 0.646073 |


### Step 2. &nbsp;&nbsp;&nbsp; Data Modification

#### Substep. &nbsp;&nbsp; Filter Data

#!blhint

- `set df to` &nbsp;:&nbsp; `df [ .. ]` &nbsp;<-&nbsp; ` .. = .. ` &nbsp;<-
  - `from df get Continent`
  - `" South America "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li><li><code>A</code> &nbsp;&lt;-&nbsp; <code>B</code>&nbsp; means insert block B into the hole in block A.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>\{dictVariable\} [ .. ]</code><ul><li>The <code>\{dictVariable\} [ .. ]</code> block is found under LISTS.</li></ul></li><li><code> .. = .. </code><ul><li>The <code> .. = .. </code> block is found under LOGIC.</li> <li>Use the dropdown to get other comparison operators.</li></ul></li><li><code>from .. get ..</code><ul><li>The <code>from .. get ..</code> block is found under VARIABLES.</li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [31]:
#!response

df = df[df.Continent == 'South America']
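
The filter line does two things at once: the comparison builds a column of True/False values (a boolean mask), and the square brackets keep only the rows where the mask is True. A minimal sketch on a three-row toy dataframe makes the two steps visible. The names `toy`, `mask`, and `south_america` are illustrative only; the rows are taken from the output shown in this exercise.

```python
import pandas as pd

# Toy dataframe with the same column names as the exercise data.
toy = pd.DataFrame({
    "Year": [2008, 2008, 2008],
    "Country": ["Aruba", "Andorra", "Chile"],
    "Continent": ["South America", "Europe", "South America"],
    "Emission": [24.750133, 6.296125, 4.31644],
})

# Step 1: the comparison yields one True/False value per row.
mask = toy.Continent == "South America"

# Step 2: indexing with the mask keeps only the rows marked True.
south_america = toy[mask]

print(mask.tolist())
print(south_america.Country.tolist())
```

Note that the surviving rows keep their original index labels, which is why the filtered dataframe in this exercise shows index values with gaps.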

#### Substep. &nbsp;&nbsp; Sort Data

#!blhint

- Grab a freestyle block, type &nbsp;`by='Year'`
- `set df to` &nbsp;:&nbsp; `with df do sort_values using` &nbsp;:&nbsp; `by='Year'`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li>freestyle<ul><li>Unless specifically instructed, use the first block from the FREESTYLE menu.</li></ul></li><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>with .. do .. using ..</code><ul><li>The <code>with .. do .. using ..</code> block is found under VARIABLES.</li> <li>If the <code>do</code> dropdown says "!Not populated until you execute code", click anywhere in the notebook tab, then click "Run All Above Selected Cell" from the "Run" menu.</li> <li>If the <code>with .. do .. using ..</code> block does not snap together nicely with the <code>set .. to</code> block, try dragging the <code>set .. to</code> block instead.</li> <li>You can use the <code>+ -</code> controls on the block to change the number of notches. Unless specifically instructed, the block should not have any empty notches when you click Blocks to Code. </li></ul></li></ul></li></ul></details>

In [32]:
#!response

df = df.sort_values(by='Year')
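
The result of `sort_values` is assigned back to `df` because, by default, the method returns a new sorted dataframe and leaves the original untouched. The sketch below illustrates this on a toy dataframe; `toy` and `sorted_toy` are illustrative names and the Emission values are made up.

```python
import pandas as pd

# Toy dataframe with the years deliberately out of order
# (Emission values are invented for illustration).
toy = pd.DataFrame({
    "Year": [2010, 2008, 2011, 2009],
    "Emission": [3.0, 1.0, 4.0, 2.0],
})

# sort_values returns a new dataframe ordered by Year.
sorted_toy = toy.sort_values(by="Year")

print(toy.Year.tolist())         # original order is unchanged
print(sorted_toy.Year.tolist())  # ascending by Year
```

Without the assignment, the sorted copy would be displayed and then discarded, and the plot would still use the unsorted rows.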

#### Substep. &nbsp;&nbsp; Display Dataframe Contents

#!blhint

- variable &nbsp;`df`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li>variable<ul><li>After it is created, each variable has its own block at the end of the VARIABLES menu.</li></ul></li></ul></li></ul></details>

In [33]:
#!response

df

| Unnamed: 0 | Year | Country | Continent | Emission |
|---|---|---|---|---|
| 0 | 2008 | Aruba | South America | 24.750133 |
| 208 | 2008 | Ecuador | South America | 2.064521 |
| 132 | 2008 | Chile | South America | 4.31644 |
| 300 | 2008 | Guyana | South America | 2.083255 |
| 556 | 2008 | Peru | South America | 1.441345 |
| 100 | 2008 | Brazil | South America | 1.990429 |
| 584 | 2008 | Paraguay | South America | 0.719801 |
| 96 | 2008 | Bolivia | South America | 1.412189 |
| 152 | 2008 | Colombia | South America | 1.451147 |
| 732 | 2008 | Uruguay | South America | 2.471054 |
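
Because the rows are now sorted by Year, the first rows displayed all belong to 2008, so the display alone does not confirm that the filter and sort worked on every row. One optional way to check, sketched here on a toy dataframe with made-up Emission values, is to look at the unique continents and ask pandas whether Year is in non-decreasing order. The names `toy`, `result`, `continents`, and `years_sorted` are illustrative.

```python
import pandas as pd

# Toy data: mixed continents and unsorted years (Emission values invented).
toy = pd.DataFrame({
    "Year": [2009, 2008, 2008, 2009],
    "Country": ["Chile", "Chile", "Andorra", "Peru"],
    "Continent": ["South America", "South America", "Europe", "South America"],
    "Emission": [4.0, 4.3, 6.3, 1.5],
})

# Same two modifications as in the exercise: filter, then sort.
result = toy[toy.Continent == "South America"].sort_values(by="Year")

# Check 1: only one continent should remain after filtering.
continents = result.Continent.unique().tolist()

# Check 2: Year should never decrease from one row to the next.
years_sorted = result.Year.is_monotonic_increasing

print(continents)
print(years_sorted)
```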


### Step 3. &nbsp;&nbsp;&nbsp; Generate Plotly Line Graph

#### Substep. &nbsp;&nbsp; Import Plotly Express Library


#!blhint

- Create variable &nbsp;`px`
- `import plotly.express as px`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li><code>import .. as ..</code><ul><li>The <code>import .. as ..</code> block is found under IMPORT.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li></ul></li></ul></details>

In [34]:
#!response

import plotly.express as px

#### Substep. &nbsp;&nbsp; Set Columns as x and y


#!blhint

- Create variable &nbsp;`x`
- `set x to` &nbsp;:&nbsp; `" Year "`
- Create variable &nbsp;`y`
- `set y to` &nbsp;:&nbsp; `" Emission "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [35]:
#!response

x = 'Year'
y = 'Emission'

#### Substep. &nbsp;&nbsp; Set Additional Plot Options

#!blhint

- Create variable &nbsp;`color`
- `set color to` &nbsp;:&nbsp; `" Country "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [36]:
#!response

color = 'Country'

#!blhint

- Create variable &nbsp;`title`
- `set title to` &nbsp;:&nbsp; `" CO2 Emissions from South American Countries (2008-2011) "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [37]:
#!response

title = 'CO2 Emissions from South American Countries (2008-2011)'

#!blhint

- Create variable &nbsp;`markers`
- `set markers to` &nbsp;:&nbsp; `true`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>true</code><ul><li>The <code>true</code> block is found under LOGIC.</li> <li>Click the dropdown on the <code>true</code> block to make a <code>false</code> block.</li></ul></li></ul></li></ul></details>

In [38]:
#!response

markers = True

#### Substep. &nbsp;&nbsp; Generate Line Graph

#!blhint

- Grab a freestyle block, type &nbsp;`x=x`
- Grab a freestyle block, type &nbsp;`y=y`
- Grab a freestyle block, type &nbsp;`color=color`
- Grab a freestyle block, type &nbsp;`title=title`
- Grab a freestyle block, type &nbsp;`markers=markers`
- `with px do line using` &nbsp;:
    - variable &nbsp;`df`
    - `x=x`
    - `y=y`
    - `color=color`
    - `title=title`
    - `markers=markers`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li>freestyle<ul><li>Unless specifically instructed, use the first block from the FREESTYLE menu.</li></ul></li><li><code>with .. do .. using ..</code><ul><li>The <code>with .. do .. using ..</code> block is found under VARIABLES.</li> <li>If the <code>do</code> dropdown says "!Not populated until you execute code", click anywhere in the notebook tab, then click "Run All Above Selected Cell" from the "Run" menu.</li> <li>If the <code>with .. do .. using ..</code> block does not snap together nicely with the <code>set .. to</code> block, try dragging the <code>set .. to</code> block instead.</li> <li>You can use the <code>+ -</code> controls on the block to change the number of notches. Unless specifically instructed, the block should not have any empty notches when you click Blocks to Code. </li></ul></li><li>variable<ul><li>After it is created, each variable has its own block at the end of the VARIABLES menu.</li></ul></li></ul></li></ul></details>

In [39]:
#!response

px.line(df, x=x, y=y, color=color, title=title, markers=markers)

#!action

Screenshot the chart.