# Filtering & Plotting: Modifying Data Before Creating a Chart with Plotly (Exercise 3)

## Question. &nbsp;&nbsp; What is the relationship between the number of titles won and win rate for American tennis players?

Dataset: This dataset contains career statistics for former world number 1 tennis players from the last 2 decades.

Columns:
- Player (represents player's name)
- Country (represents player's country)
- Wins (represents # of matches won in player's career)
- Losses (represents # of matches lost in player's career)
- Win_Rate (represents win rate of player's career)
- Titles (represents # of won titles in player's career)
- Prize_Money (represents earned prize money in player's career)

## Pre-programming Discussion

#!action

Q1. What column should we use for filtering the data? Why?

#!response

Country

We only want to know about American players.

#!action

Q2. Which columns from the dataset will be used in the plot?

#!response

Win_Rate and Titles. 

The column names are similar to the terms in the question.

#!action

Q3. Which column will go on the x-axis?

#!response

Titles

y changes as a function of x. The wording of the question implies that Win_Rate changes as a function of the number of Titles won. Therefore, Win_Rate is y and Titles is x.

#!action

Q4. Which column will go on the y-axis?

#!response

Win_Rate 

See the explanation for the previous question.

#!action

Screenshot your answers.

## Programming Activity: Creating a Scatter Plot Graph of Titles vs Win Rate of American Tennis Players

### Step 1. &nbsp;&nbsp;&nbsp; Read CSV Data into Pandas Dataframe

#### Substep. &nbsp;&nbsp; Import Pandas Library

#!blhint

- Create variable &nbsp;`pd`
- `import pandas as pd`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li><code>import .. as ..</code><ul><li>The <code>import .. as ..</code> block is found under IMPORT.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li></ul></li></ul></details>

In [1]:
#!response
import pandas as pd

#<xml xmlns="https://developers.google.com/blockly/xml"><variables><variable id="3|1_VG=6^a*PSJf%[AEJ">pd</variable></variables><block type="importAs_Python" id="HG@Qc0KnvuKPwPn4n2O}" x="16" y="10"><field name="libraryName">pandas</field><field name="libraryAlias" id="3|1_VG=6^a*PSJf%[AEJ">pd</field></block></xml>

#### Substep. &nbsp;&nbsp; Read CSV data and Save in Variable

#!blhint

- Create variable &nbsp;`df`
- `set df to` &nbsp;:&nbsp; `with pd do read_csv using` &nbsp;:&nbsp; `" datasets/Tennis.csv "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>with .. do .. using ..</code><ul><li>The <code>with .. do .. using ..</code> block is found under VARIABLES.</li> <li>If the <code>do</code> dropdown says "!Not populated until you execute code", click anywhere in the notebook tab, then click "Run All Above Selected Cell" from the "Run" menu.</li> <li>If the <code>with .. do .. using ..</code> block does not snap together nicely with the <code>set .. to</code> block, try dragging the <code>set .. to</code> block instead.</li> <li>You can use the <code>+ -</code> controls on the block to change the number of notches. Unless specifically instructed, the block should not have any empty notches when you click Blocks to Code.</li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [2]:
#!response
df = pd.read_csv('datasets/Tennis.csv')

#<xml xmlns="https://developers.google.com/blockly/xml"><variables><variable id="RVIV6};,0{p761]Hxk*Z">df</variable><variable id="3|1_VG=6^a*PSJf%[AEJ">pd</variable></variables><block type="variables_set" id=".{)PKwPoBWM+Yj6%*j_k" x="43" y="245"><field name="VAR" id="RVIV6};,0{p761]Hxk*Z">df</field><value name="VALUE"><block type="varDoMethod_Python" id="wFQ,$4$P}vKNrb1-J:u]"><field name="VAR" id="3|1_VG=6^a*PSJf%[AEJ">pd</field><field name="MEMBER">read_csv</field><data>pd:read_csv</data><value name="INPUT"><block type="text" id="7,]nX=K9Cowrj`4cigPO"><field name="TEXT">datasets/Tennis.csv</field></block></value></block></value></block></xml>

#### Substep. &nbsp;&nbsp; Display Dataframe Contents

#!blhint

- variable &nbsp;`df`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li>variable<ul><li>After it is created, each variable has its own block at the end of the VARIABLES menu.</li></ul></li></ul></li></ul></details>

In [3]:
#!response
df

#<xml xmlns="https://developers.google.com/blockly/xml"><variables><variable id="RVIV6};,0{p761]Hxk*Z">df</variable></variables><block type="variables_get" id=";$!0oKA{,8M:rRzbx6%Q" x="8" y="272"><field name="VAR" id="RVIV6};,0{p761]Hxk*Z">df</field></block></xml>

Unnamed: 0,Player,Country,Wins,Losses,Win_Rate,Titles,Prize_Money
0,Roger Federer,Switzerland,1222,265,82.2,102,126266005
1,Pete Sampras,USA,762,222,77.4,64,43280489
2,Ivan Lendl,USA,1068,242,81.5,94,21262417
3,Jimmy Connors,USA,1274,282,81.9,109,8641040
4,Novak Djokovic,Serbia,871,182,82.7,75,134684000
5,Rafael Nadal,Spain,960,196,83.0,83,111328858
6,John McEnroe,USA,881,198,81.6,77,12552132
7,Bjorn Borg,Sweden,644,135,82.7,64,3655751
8,Andre Agassi,USA,870,274,76.0,60,31152975
9,Lleyton Hewitt,Australia,616,262,70.2,30,20879934
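
The output above starts with an `Unnamed: 0` column. pandas typically produces that name when a CSV stores row labels in a first column with no header. The sketch below assumes that is the case here; `csv_text` is a hypothetical two-row stand-in for `datasets/Tennis.csv`, not the real file. It shows how the `index_col` argument of `pd.read_csv` treats that column as the row index instead of as data.

```python
import io

import pandas as pd

# Hypothetical stand-in for datasets/Tennis.csv. The empty first header
# field mimics a file whose row labels were saved without a column name.
csv_text = """,Player,Country,Wins,Losses,Win_Rate,Titles,Prize_Money
0,Roger Federer,Switzerland,1222,265,82.2,102,126266005
1,Pete Sampras,USA,762,222,77.4,64,43280489
"""

# Default read: the unnamed column is kept as data and named "Unnamed: 0".
default_df = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses the first column as the row index instead.
indexed_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(default_df.columns))
print(list(indexed_df.columns))
```

Either form works for this exercise, because the plot only uses the `Titles`, `Win_Rate`, and `Player` columns.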


### Step 2. &nbsp;&nbsp;&nbsp; Data Modification

#### Substep. &nbsp;&nbsp; Filter Data

#!blhint

- `set df to` &nbsp;:&nbsp; `df [ .. ]` &nbsp;<-&nbsp; ` .. = .. ` &nbsp;<-
  - `from df get Country`
  - `" USA "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li><li><code>A</code> &nbsp;&lt;-&nbsp; <code>B</code>&nbsp; means insert block B into the hole in block A.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>\{dictVariable\} [ .. ]</code><ul><li>The <code>\{dictVariable\} [ .. ]</code> block is found under LISTS.</li></ul></li><li><code> .. = .. </code><ul><li>The <code> .. = .. </code> block is found under LOGIC.</li> <li>Use the dropdown to get other comparison operators.</li></ul></li><li><code>from .. get ..</code><ul><li>The <code>from .. get ..</code> block is found under VARIABLES.</li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [4]:
#!response
df = df[(df.Country == 'USA')]

#<xml xmlns="https://developers.google.com/blockly/xml"><variables><variable id="RVIV6};,0{p761]Hxk*Z">df</variable></variables><block type="variables_set" id="LoP|#aOn]!^c`~CD}ZnQ" x="33" y="172"><field name="VAR" id="RVIV6};,0{p761]Hxk*Z">df</field><value name="VALUE"><block type="indexer_Python" id="c=maxm.ju@%B[H)INXE5"><field name="VAR" id="RVIV6};,0{p761]Hxk*Z">df</field><value name="INDEX"><block type="logic_compare" id="=$x(zv.U9JGkuo#44,6#"><field name="OP">EQ</field><value name="A"><block type="varGetProperty_Python" id="sB0Mv+CKuQgVYy?st}J2"><field name="VAR" id="RVIV6};,0{p761]Hxk*Z">df</field><field name="MEMBER">Country</field><data>df:Country</data></block></value><value name="B"><block type="text" id="W|t+_T/Im90x[]$9TdP}"><field name="TEXT">USA</field></block></value></block></value></block></value></block></xml>
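
The filter above works in two stages: the comparison builds a True/False value for each row, and indexing the dataframe with those values keeps only the True rows. The sketch below separates the two stages on a small hand-typed table. The names `players`, `is_american`, and `usa_players` are illustrative and are not part of the exercise.

```python
import pandas as pd

# Four rows copied by hand from the displayed dataset, for illustration only.
players = pd.DataFrame({
    "Player": ["Roger Federer", "Pete Sampras", "Ivan Lendl", "Novak Djokovic"],
    "Country": ["Switzerland", "USA", "USA", "Serbia"],
    "Titles": [102, 64, 94, 75],
})

# Stage 1: compare every value in the Country column with "USA".
# The result is a boolean Series with one True/False entry per row.
is_american = players["Country"] == "USA"

# Stage 2: index the dataframe with the boolean Series to keep the True rows.
usa_players = players[is_american]

print(is_american.tolist())
print(usa_players)
```

The kept rows retain their original row labels, which is why the filtered output in this exercise is numbered 1, 2, 3, 6, 8, 11, 16 rather than starting again from 0.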

#### Substep. &nbsp;&nbsp; Display Dataframe Contents

#!blhint

- variable &nbsp;`df`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li>variable<ul><li>After it is created, each variable has its own block at the end of the VARIABLES menu.</li></ul></li></ul></li></ul></details>

In [5]:
#!response
df

#<xml xmlns="https://developers.google.com/blockly/xml"><variables><variable id="RVIV6};,0{p761]Hxk*Z">df</variable></variables><block type="variables_get" id="@ETx]YPoRQ@]rTgs]DnQ" x="26" y="239"><field name="VAR" id="RVIV6};,0{p761]Hxk*Z">df</field></block></xml>

Unnamed: 0,Player,Country,Wins,Losses,Win_Rate,Titles,Prize_Money
1,Pete Sampras,USA,762,222,77.4,64,43280489
2,Ivan Lendl,USA,1068,242,81.5,94,21262417
3,Jimmy Connors,USA,1274,282,81.9,109,8641040
6,John McEnroe,USA,881,198,81.6,77,12552132
8,Andre Agassi,USA,870,274,76.0,60,31152975
11,Jim Courier,USA,506,237,68.1,23,14034132
16,Andy Roddick,USA,612,213,74.2,32,20640030


### Step 3. &nbsp;&nbsp;&nbsp; Generate Plotly Scatter Graph

#### Substep. &nbsp;&nbsp; Import Plotly Express Library


#!blhint

- Create variable &nbsp;`px`
- `import plotly.express as px`

<details><summary>Blockly Hints</summary><ul><li>Blocks<ul><li><code>import .. as ..</code><ul><li>The <code>import .. as ..</code> block is found under IMPORT.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li></ul></li></ul></details>

In [6]:
#!response
import plotly.express as px

#<xml xmlns="https://developers.google.com/blockly/xml"><variables><variable id="k~x.[L6_#?Sh9W0z!fdX">px</variable></variables><block type="importAs_Python" id="q_/Ip|8%GRBU|zB9jSSl" x="31" y="126"><field name="libraryName">plotly.express</field><field name="libraryAlias" id="k~x.[L6_#?Sh9W0z!fdX">px</field></block></xml>

#### Substep. &nbsp;&nbsp; Set Columns as x and y


#!blhint

- Create variable &nbsp;`x`
- `set x to` &nbsp;:&nbsp; `" Titles "`
- Create variable &nbsp;`y`
- `set y to` &nbsp;:&nbsp; `" Win_Rate "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [7]:
#!response
x = 'Titles'
y = 'Win_Rate'

#<xml xmlns="https://developers.google.com/blockly/xml"><variables><variable id="+uC_Ek(,Ch[Xfh;~,g[4">x</variable><variable id="#-SQx9`JSJt6W]kxttEp">y</variable></variables><block type="variables_set" id="L0rXiVtcT-|r~Zo{r,b?" x="29" y="170"><field name="VAR" id="+uC_Ek(,Ch[Xfh;~,g[4">x</field><value name="VALUE"><block type="text" id="2Ijz5-q0Il1Z8f[jkv6S"><field name="TEXT">Titles</field></block></value><next><block type="variables_set" id="uu+vPFte`T%b$:3Ry[bw"><field name="VAR" id="#-SQx9`JSJt6W]kxttEp">y</field><value name="VALUE"><block type="text" id="X{bfRPSZc_Cm[M?*KT9h"><field name="TEXT">Win_Rate</field></block></value></block></next></block></xml>
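
Note that `x` and `y` hold column names as text, not the column data. Plotly Express looks the names up in the dataframe it is given. The sketch below, which uses a hypothetical two-row dataframe named `sample`, shows the same kind of lookup done by hand with pandas.

```python
import pandas as pd

# Hypothetical two-row dataframe with the same column names as the exercise.
sample = pd.DataFrame({"Titles": [64, 94], "Win_Rate": [77.4, 81.5]})

x = "Titles"
y = "Win_Rate"

# A column-name string selects the matching column from the dataframe.
x_values = sample[x].tolist()
y_values = sample[y].tolist()

print(x, x_values)
print(y, y_values)
```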

#### Substep. &nbsp;&nbsp; Set Additional Plot Options

#!blhint

- Create variable &nbsp;`color`
- `set color to` &nbsp;:&nbsp; `" Player "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [8]:
#!response
color = 'Player'

#<xml xmlns="https://developers.google.com/blockly/xml"><variables><variable id="4x)bDLxW6|R!sv/Vd}uu">color</variable></variables><block type="variables_set" id="|^32=XY%:{yf4w[P[ORo" x="16" y="227"><field name="VAR" id="4x)bDLxW6|R!sv/Vd}uu">color</field><value name="VALUE"><block type="text" id="OOW|:]c#CC-trDx16;pP"><field name="TEXT">Player</field></block></value></block></xml>

#!blhint

- Create variable &nbsp;`title`
- `set title to` &nbsp;:&nbsp; `" Titles vs Win Rate of American Tennis Players "`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li><code>set .. to</code><ul><li>The <code>set .. to</code> block is found under VARIABLES.</li> <li>A variable must have been created before it will appear in the dropdown.</li></ul></li><li><code>" .. "</code><ul><li>The <code>" .. "</code> block is found under TEXT.</li></ul></li></ul></li></ul></details>

In [9]:
#!response
title = 'Titles vs Win Rate of American Tennis Players'

#<xml xmlns="https://developers.google.com/blockly/xml"><variables><variable id="Ns+To#5[#(:(tTNh[|[h">title</variable></variables><block type="variables_set" id="O(a5Lgg`XX=uKxv+bI4A" x="21" y="185"><field name="VAR" id="Ns+To#5[#(:(tTNh[|[h">title</field><value name="VALUE"><block type="text" id="LnoifPEc,9uDap0_X-i,"><field name="TEXT">Titles vs Win Rate of American Tennis Players</field></block></value></block></xml>

#### Substep. &nbsp;&nbsp; Generate Scatter Graph

#!blhint

- Grab a freestyle block, type &nbsp;`x=x`
- Grab a freestyle block, type &nbsp;`y=y`
- Grab a freestyle block, type &nbsp;`color=color`
- Grab a freestyle block, type &nbsp;`title=title`
- `with px do scatter using` &nbsp;:
    - variable &nbsp;`df`
    - `x=x`
    - `y=y`
    - `color=color`
    - `title=title`

<details><summary>Blockly Hints</summary><ul><li>Notation<ul><li><code>A</code> &nbsp;:&nbsp; <code>B</code>&nbsp; means snap blocks A and B together.</li></ul></li><li>Blocks<ul><li>freestyle<ul><li>Unless specifically instructed, use the first block from the FREESTYLE menu.</li></ul></li><li><code>with .. do .. using ..</code><ul><li>The <code>with .. do .. using ..</code> block is found under VARIABLES.</li> <li>If the <code>do</code> dropdown says "!Not populated until you execute code", click anywhere in the notebook tab, then click "Run All Above Selected Cell" from the "Run" menu.</li> <li>If the <code>with .. do .. using ..</code> block does not snap together nicely with the <code>set .. to</code> block, try dragging the <code>set .. to</code> block instead.</li> <li>You can use the <code>+ -</code> controls on the block to change the number of notches. Unless specifically instructed, the block should not have any empty notches when you click Blocks to Code.</li></ul></li><li>variable<ul><li>After it is created, each variable has its own block at the end of the VARIABLES menu.</li></ul></li></ul></li></ul></details>

In [10]:
#!response
px.scatter(df, x=x, y=y, color=color, title=title)

#<xml xmlns="https://developers.google.com/blockly/xml"><variables><variable id="k~x.[L6_#?Sh9W0z!fdX">px</variable><variable id="RVIV6};,0{p761]Hxk*Z">df</variable></variables><block type="varDoMethod_Python" id="sKc!zD^_=81/5|ulu`S;" x="-44" y="184"><field name="VAR" id="k~x.[L6_#?Sh9W0z!fdX">px</field><field name="MEMBER">scatter</field><data>px:scatter</data><value name="INPUT"><block type="lists_create_with" id="hbq8etOkWSPlZeQXbZ%~"><mutation items="5"></mutation><value name="ADD0"><block type="variables_get" id=",b%S@Lasm#v+G(*;?ph:"><field name="VAR" id="RVIV6};,0{p761]Hxk*Z">df</field></block></value><value name="ADD1"><block type="dummyOutputCodeBlock_Python" id="rc]=?|Jx`7bT!x,{p$Wm"><field name="CODE">x=x</field></block></value><value name="ADD2"><block type="dummyOutputCodeBlock_Python" id="}.3LG6NTw$.-)FOgZ))*"><field name="CODE">y=y</field></block></value><value name="ADD3"><block type="dummyOutputCodeBlock_Python" id="qwyl#:)5ERx2syDH@lz}"><field name="CODE">color=color</field></block></value><value name="ADD4"><block type="dummyOutputCodeBlock_Python" id="*W*zR,diP!gyAG3r@FPJ"><field name="CODE">title=title</field></block></value></block></value></block></xml>

#!action

Screenshot the chart.
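
As an optional check on what the chart shows, the relationship can also be summarized with a single number. The sketch below is not part of the exercise: it retypes the seven filtered rows by hand into a dataframe named `usa` (an illustrative name) and computes the Pearson correlation between the two plotted columns with pandas' `Series.corr`. A value near 1 means win rate tends to rise with titles, and a value near 0 means there is little linear relationship.

```python
import pandas as pd

# The seven American players from the filtered output, retyped by hand.
usa = pd.DataFrame({
    "Player": ["Pete Sampras", "Ivan Lendl", "Jimmy Connors", "John McEnroe",
               "Andre Agassi", "Jim Courier", "Andy Roddick"],
    "Titles": [64, 94, 109, 77, 60, 23, 32],
    "Win_Rate": [77.4, 81.5, 81.9, 81.6, 76.0, 68.1, 74.2],
})

# Pearson correlation coefficient between titles won and win rate.
r = usa["Titles"].corr(usa["Win_Rate"])

print(round(r, 2))
```

With only seven players, the number is a rough summary rather than firm evidence, so it is best read alongside the scatter plot.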