# [Python Reference Link](http://www.data8.org/sp20/python-reference.html)
*Run the cell below so that we can set our modules up*

In [2]:
import numpy as np
from datascience import *

# These lines set up graphing capabilities.
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
import warnings
warnings.simplefilter('ignore', FutureWarning)

from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

# Importing our data

In [3]:
math211_survey = Table.read_table('MATH_211_Survey_Cleaned_GPA.csv')
drinks = Table.read_table('drinks.csv')
discounts = Table.read_table('discounts.csv')
countries = Table.read_table('countries.csv').select('country', 'name', 'world_6region')

# Let's look at a [Demo Request](https://forms.gle/rtF16ch5ABCzM8qn7)

For [Project01](https://skyline.cloudbank.2i2c.cloud/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fb00chan%2Ffall_24_math_211_skyline&urlpath=tree%2Ffall_24_math_211_skyline%2FProjects%2Fproject1%2Fproject1.ipynb&branch=main): Question 7 from 'Global Poverty' Section:

**Question 7.** Now, we'll actually write the function called `poverty_timeline`. Recall that `poverty_timeline` takes **the name of a country** as its argument (not the Alpha-3 country code). It should draw a line plot of the number of people living in poverty in that country with time on the horizontal axis. The line plot should have a point for each row in the `poverty` table for that country. To compute the population living in poverty from a poverty percentage, multiply by the population of the country **in that year**.

*Hint:* This question is long. Feel free to create cells and experiment. You can create cells by going to the toolbar and hitting the `+` button, or by going to the `Insert` tab.

In [4]:
def poverty_timeline(country):
    '''Draw a timeline of people living in extreme poverty in a country.'''
    geo = ...
    # This solution will take multiple lines of code. Use as many as you need
    ...
    # Don't change anything below this line. 
    plots.title(country)
    plots.ylim(bottom=0)
    plots.show() # This should be the last line of your function. 

## The Strategy: Overall

(1) The input is the name of a country, and the corresponding output is a line plot of the number of people living in poverty in that country. We can use both the `poverty` and `population` tables; however, they identify countries by `geo` code, so we need to convert the country name into its corresponding `geo` code using the `countries` table.

(2) While the `poverty` and `population` tables contain the information we need, the `population` table gives counts of people, while the `poverty` table gives percentages. We will use both to calculate the number of people living in poverty (by multiplication). Therefore, we want to join these two tables to combine the information into a single table.

(3) To set up the join, we need to filter both the `poverty` and `population` tables so that they contain only the rows for the country of interest (i.e. the input). Once we perform the appropriate filter, we can join on the `time` variable, since both tables store years in the `time` column. This also ensures that each poverty percentage is matched with the population count of the same year (e.g. we want to match the 25% poverty rate of **1956** to the population of India in **1956**). We already know the join will merge rows of the same country, since we filtered both tables beforehand.

(4) Once we successfully join the tables, we calculate the number of people in poverty in a given country and year by multiplying the population and poverty columns, then append that new column to the table.

(5) We will use the `.plot()` method to plot the resulting poverty counts for the input country (this will not be covered in the demo request).

#### (1) Input Conversion
The input is the name of a country, and the corresponding output is a line plot of the number of people living in poverty in that country. We can use both the `poverty` and `population` tables; however, they identify countries by `geo` code, so we need to convert the country name into its corresponding `geo` code using the `countries` table.

Let's practice with the data from the project below:

In [5]:
country = 'United States' # this line mimics the input of the function in Question 7
geo = countries.where('name',are.equal_to(country)).column('country').item(0)
geo

'usa'
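
The lookup above assumes the country name actually appears in the `countries` table; if it does not, the filtered column is empty and `.item(0)` fails with an error. The sketch below mimics the same name-to-code lookup with plain NumPy arrays and adds an explicit check. The arrays and the helper `name_to_geo` are hypothetical stand-ins for illustration, not part of the project.

```python
import numpy as np

# Hypothetical stand-ins for the 'name' and 'country' columns of `countries`.
names = np.array(['United States', 'India', 'Mexico'])
codes = np.array(['usa', 'ind', 'mex'])

def name_to_geo(country):
    """Return the geo code for a country name (illustrative helper)."""
    matches = codes[names == country]   # keep only codes whose name matches
    if len(matches) == 0:
        raise ValueError('No country named ' + repr(country))
    return matches.item(0)              # first match, as a plain Python string

geo_example = name_to_geo('India')
```

Raising a clear error for an unknown name tends to be easier to debug than an index error from an empty array.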

In [6]:
'ind'

'ind'

#### (2) Filter and Join setup
While the `poverty` and `population` tables contain the information we need, the `population` table gives counts of people, while the `poverty` table gives percentages. We will use both to calculate the number of people living in poverty (by multiplication). Therefore, we want to join these two tables to combine the information into a single table.

To set up the join, we need to filter both the `poverty` and `population` tables so that they contain only the rows for the country of interest (i.e. the input). Once we perform the appropriate filter, we can join on the `time` variable, since both tables store years in the `time` column. This also ensures that each poverty percentage is matched with the population count of the same year (e.g. we want to match the 25% poverty rate of **1956** to the population of India in **1956**). We already know the join will merge rows of the same country, since we filtered both tables beforehand.

**For this step (and beyond), we will practice a parallel process with `drinks` and `discounts` to match the drink prices with their discounts, and calculate their corresponding discounted price.**

In [7]:
asha_drinks = drinks.where('Cafe',are.equal_to('Asha'))
asha_drinks

Drink,Size,Cafe,Price ($)
Milk Tea,Medium,Asha,5.5
Tea,Medium,Asha,2.5
Espresso,Medium,Asha,3.0
Latte,Medium,Asha,4.75
Cappucino,Medium,Asha,4.25
Blended Coffee,Medium,Asha,6.0
Drip Coffee,Medium,Asha,3.5
Milk Tea,Large,Asha,7.0
Tea,Large,Asha,4.0
Espresso,Large,Asha,4.5


In [8]:
asha_discounts = discounts.where('Location',are.equal_to('Asha'))
asha_discounts

Coupon,Location
0.05,Asha
0.1,Asha
0.15,Asha
0.25,Asha


#### (3) Join
To set up the join, we need to filter both the `poverty` and `population` tables so that they contain only the rows for the country of interest (i.e. the input). Once we perform the appropriate filter, we can join on the `time` variable, since both tables store years in the `time` column. This also ensures that each poverty percentage is matched with the population count of the same year (e.g. we want to match the 25% poverty rate of **1956** to the population of India in **1956**). We already know the join will merge rows of the same country, since we filtered both tables beforehand.

In [9]:
combined = asha_drinks.join('Cafe', asha_discounts,'Location')
combined

Cafe,Drink,Size,Price ($),Coupon
Asha,Milk Tea,Medium,5.5,0.05
Asha,Milk Tea,Medium,5.5,0.1
Asha,Milk Tea,Medium,5.5,0.15
Asha,Milk Tea,Medium,5.5,0.25
Asha,Tea,Medium,2.5,0.05
Asha,Tea,Medium,2.5,0.1
Asha,Tea,Medium,2.5,0.15
Asha,Tea,Medium,2.5,0.25
Asha,Espresso,Medium,3.0,0.05
Asha,Espresso,Medium,3.0,0.1
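
Notice that every drink appears once per coupon in the joined table. The sketch below reproduces that matching idea in plain Python on two tiny, hypothetical lists of rows. It illustrates the pairing behavior only and is not how the `datascience` library implements `join`.

```python
# Hypothetical miniature versions of the filtered tables, as lists of dicts.
drinks_rows = [
    {'Drink': 'Milk Tea', 'Cafe': 'Asha', 'Price ($)': 5.5},
    {'Drink': 'Tea', 'Cafe': 'Asha', 'Price ($)': 2.5},
]
discount_rows = [
    {'Coupon': 0.05, 'Location': 'Asha'},
    {'Coupon': 0.1, 'Location': 'Asha'},
    {'Coupon': 0.15, 'Location': 'Asha'},
]

# Pair every drink with every discount whose Location equals the drink's Cafe.
joined = []
for drink in drinks_rows:
    for discount in discount_rows:
        if drink['Cafe'] == discount['Location']:
            joined.append({**drink, 'Coupon': discount['Coupon']})

num_joined_rows = len(joined)   # 2 drinks x 3 matching coupons
```

Because every row here shares the same key value, the row counts multiply. When the key is a year that appears at most once in each filtered table, the same pairing would give at most one row per year.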


#### (4) Calculate and append
Once we successfully join the tables, we calculate the number of people in poverty in a given country and year by multiplying the population and poverty columns, then append that new column to the table.

In [10]:
discount_prices = combined.column('Price ($)') * combined.column('Coupon')
discount_prices

array([ 0.275 ,  0.55  ,  0.825 ,  1.375 ,  0.125 ,  0.25  ,  0.375 ,
        0.625 ,  0.15  ,  0.3   ,  0.45  ,  0.75  ,  0.2375,  0.475 ,
        0.7125,  1.1875,  0.2125,  0.425 ,  0.6375,  1.0625,  0.3   ,
        0.6   ,  0.9   ,  1.5   ,  0.175 ,  0.35  ,  0.525 ,  0.875 ,
        0.35  ,  0.7   ,  1.05  ,  1.75  ,  0.2   ,  0.4   ,  0.6   ,
        1.    ,  0.225 ,  0.45  ,  0.675 ,  1.125 ,  0.3125,  0.625 ,
        0.9375,  1.5625,  0.2875,  0.575 ,  0.8625,  1.4375,  0.375 ,
        0.75  ,  1.125 ,  1.875 ,  0.25  ,  0.5   ,  0.75  ,  1.25  ])

In [11]:
combined.with_column('Discount ($ off)',discount_prices)

Cafe,Drink,Size,Price ($),Coupon,Discount ($ off)
Asha,Milk Tea,Medium,5.5,0.05,0.275
Asha,Milk Tea,Medium,5.5,0.1,0.55
Asha,Milk Tea,Medium,5.5,0.15,0.825
Asha,Milk Tea,Medium,5.5,0.25,1.375
Asha,Tea,Medium,2.5,0.05,0.125
Asha,Tea,Medium,2.5,0.1,0.25
Asha,Tea,Medium,2.5,0.15,0.375
Asha,Tea,Medium,2.5,0.25,0.625
Asha,Espresso,Medium,3.0,0.05,0.15
Asha,Espresso,Medium,3.0,0.1,0.3
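
The new column holds the dollar amount taken off, not the price a customer would pay. Assuming each coupon is a fraction of the listed price, a minimal NumPy sketch of the remaining step is shown below, using a few price and coupon pairs copied from the joined table. The variable names are illustrative.

```python
import numpy as np

# A few (price, coupon) pairs copied from the joined table above.
prices = np.array([5.5, 5.5, 2.5])
coupons = np.array([0.05, 0.1, 0.05])

dollars_off = prices * coupons         # element-by-element multiplication
final_prices = prices - dollars_off    # price after applying the coupon

final_prices
```

Array arithmetic works element by element, so each price is matched with the coupon in the same position, just as each row of the joined table pairs one price with one coupon.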


# Accessing Rows of our data

In [13]:
math211_survey.show(1) # show me the first row of the table

Surveyor,GPA,Introversion/Extraversion,Pets,Campus_Commute,Handed,Sleep,Num_Units,Favorite_Show,Time_Off,pastime/hobby,Motivation,Num_Siblings,Wish_Siblings
Samantha,3.77,6,"Reptiles, Rodents","Car, Walk",Left-handed,On my side,13,HunterxHunter,Travel,Bowling,God ☝🏽,3,0


In [15]:
first_row = math211_survey.row(0) # retrieve the first row of the table, as a Row object
first_row

Row(Surveyor='Samantha', GPA=3.77, Introversion/Extraversion=6, Pets='Reptiles, Rodents', Campus_Commute='Car, Walk', Handed='Left-handed', Sleep='On my side', Num_Units=13.0, Favorite_Show='HunterxHunter', Time_Off='Travel', pastime/hobby='Bowling', Motivation='God ☝🏽', Num_Siblings=3, Wish_Siblings=0)

In [19]:
first_row.item('Favorite_Show')

'HunterxHunter'

In [20]:
first_row.item(9)

'Travel'

# Applying a function across the entire row

In [26]:
math211_survey.apply(sum) # this does not work BECAUSE some columns have strings in them

TypeError: unsupported operand type(s) for +: 'int' and 'numpy.str_'

In [24]:
math211_survey.select('GPA','Introversion/Extraversion', 'Num_Units').apply(sum)
#now it works because we selected columns with ONLY NUMBERS

array([ 22.77 ,  20.   ,  19.4  ,  10.5  ,  22.   ,  14.   ,  23.9  ,
        22.   ,  27.72 ,  12.5  ,  22.   ,  24.4  ,  23.8  ,  21.5  ,
        22.   ,  11.   ,  19.   ,  22.1  ,  22.9  ,  25.7  ,  21.57 ,
        14.2  ,  27.5  ,  19.   ,  21.2  ,  25.8  ,  15.2  ,  26.   ,
        24.   ,  26.88 ,  29.1  ,  14.   ,  18.85 ,  18.   ,  13.   ,
        23.   ,  21.   ,  15.3  ,  11.1  ,  17.9  ,  22.9  ,  25.83 ,
         9.   ,  22.59 ,  28.7  ,  21.9  ,  26.81 ,  18.5  ,  26.5  ,
        26.6  ,  20.75 ,  22.6  ,  23.1  ,  22.4  ,  18.84 ,  14.85 ,
        30.98 ,  19.2  ,  29.   ,  27.5  ,  22.7  ,  30.   ,  16.4  ,
        19.   ,  22.2  ,   7.6  ,  20.66 ,   9.3  ,  22.2  ,  16.2  ,
        22.   ,  21.   ,  18.7  ,  22.6  ,  23.8  ,  26.   ,  14.67 ,
        26.92 ,  23.8  ,  19.6  ,  24.7  ,  21.79 ,  29.5  ,  15.   ,
        22.84 ,  25.95 ,  19.5  ,  24.9  ,  20.   ,  20.5  ,  12.6  ,
        20.01 ,  26.8  ,  20.7  ,  20.7  ,   8.2  ,  22.5  ,  16.   ,
        27.   ,  25.

In [22]:
sum(first_row)

TypeError: unsupported operand type(s) for +: 'int' and 'numpy.str_'

In [27]:
sum(math211_survey.select('GPA','Introversion/Extraversion', 'Num_Units').row(0))

22.77
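
The same row-by-row totals can be sketched with a plain NumPy array, which may help show what "summing across a row" means. The numbers below are the GPA, Introversion/Extraversion, and Num_Units values of the first three survey rows; the variable names are illustrative.

```python
import numpy as np

# GPA, Introversion/Extraversion, Num_Units for the first three survey rows.
numeric_rows = np.array([
    [3.77, 6, 13],
    [3.0, 5, 12],
    [3.4, 4, 12],
])
row_totals = numeric_rows.sum(axis=1)   # axis=1 adds across each row

# Mixing text and numbers reproduces the kind of TypeError seen above.
try:
    sum(['Samantha', 3.77])
    mixed_failed = False
except TypeError:
    mixed_failed = True
```

The totals agree with the first three entries of the array returned by `apply(sum)` above.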

# Comparison Statements

### Assignments vs. Comparisons

In [28]:
this_variable = 7 # assignment line
this_variable

7

In [29]:
this_variable == 7

True

In [30]:
this_variable == 6

False

In [31]:
this_variable = 6 
this_variable

6

In [32]:
this_variable == 6, this_variable == 7 

(True, False)

In [33]:
this_variable > 1

True

In [34]:
this_variable != 5

True

In [35]:
2 < this_variable < 50

True

In [36]:
is_this_variable_between = (2 < this_variable < 50)
is_this_variable_between

True
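
To tie these cells together, here is a short sketch showing that a chained comparison is shorthand for two comparisons joined by `and`, and that `==` only asks a question without changing the variable. The names `chained`, `spelled_out`, and `answer` are illustrative.

```python
this_variable = 6                     # a single = stores a value

chained = 2 < this_variable < 50
spelled_out = (2 < this_variable) and (this_variable < 50)

# A double == compares; it does not change this_variable.
answer = (this_variable == 7)
still_six = (this_variable == 6)
```

Both `chained` and `spelled_out` hold the same Boolean, and `this_variable` is still 6 after the comparison with 7.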

### Aggregating Comparisons

In [37]:
this_array = make_array(1, 2, 4, 8, 16, 32, 64, 67)

In [38]:
this_array > 20

array([False, False, False, False, False,  True,  True,  True], dtype=bool)

In [39]:
this_array % 2 # remainder after dividing by 2: a 0 means the number is divisible by 2

array([1, 0, 0, 0, 0, 0, 0, 1])

In [41]:
sum(this_array>20)

3
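
Summing a Boolean array counts its `True` values because each `True` is treated as 1 and each `False` as 0. The sketch below shows two related aggregations under the same idea. It uses `np.array` in place of `make_array` so that it only needs NumPy, and the variable names are illustrative.

```python
import numpy as np

this_array = np.array([1, 2, 4, 8, 16, 32, 64, 67])

over_20 = this_array > 20                   # Boolean array, one entry per element
count_over_20 = np.count_nonzero(over_20)   # how many entries are True
proportion_over_20 = np.mean(over_20)       # fraction of entries that are True

is_even = (this_array % 2 == 0)             # remainder 0 means divisible by 2
count_even = np.count_nonzero(is_even)
```

Taking the mean of a Boolean array is a common way to get a proportion, since it is the count of `True` values divided by the number of entries.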

# Control Statements

In [42]:
x = 20
s = 'You are 20'

In [43]:
if x >= 21:
    s = 'You may enter this bar/nightclub'
s

'You are 20'

In [44]:
if x >= 18:
    s = 'You can legally vote'
s

'You can legally vote'
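
Each `if` above is checked on its own, and a later `if` can overwrite what an earlier one assigned. The hypothetical helper below wraps the same two checks in a function so they can be tried on several ages.

```python
def message_for_age(x):
    """Illustrative helper: run the same two independent checks for any age."""
    s = 'You are ' + str(x)
    if x >= 21:
        s = 'You may enter this bar/nightclub'
    if x >= 18:                      # checked even if the first test was True
        s = 'You can legally vote'
    return s

under_18 = message_for_age(17)   # neither condition holds, so s is unchanged
age_20 = message_for_age(20)     # only the second condition holds
age_25 = message_for_age(25)     # both hold, and the second assignment wins
```

For an age of 25 the first message is replaced by the second. When only one branch should run, the checks need to be linked with `elif`.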

### Combining Control Statements

In [55]:
# Classify students as 'Full-Time', 'Part-Time', or 'Less Than Part Time'.
# We look at the number of units to determine this classification.

num_units = 24
classification = ''

if num_units >= 12:
    classification = 'Full-Time'
elif num_units >= 6:
    classification = 'Part-Time'
else:
    classification = 'Less Than Part Time'


classification

'Full-Time'
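
In an `if`/`elif`/`else` chain, only the first condition that is true has its body run, so the order of the conditions matters. The sketch below compares the ordering used above with a deliberately wrong ordering; both function names are hypothetical.

```python
def classify(num_units):
    """Same ordering as above: test the largest threshold first."""
    if num_units >= 12:
        return 'Full-Time'
    elif num_units >= 6:
        return 'Part-Time'
    else:
        return 'Less Than Part Time'

def classify_wrong_order(num_units):
    """Deliberately wrong: the smaller threshold is tested first."""
    if num_units >= 6:
        return 'Part-Time'
    elif num_units >= 12:     # never reached: anything >= 12 is also >= 6
        return 'Full-Time'
    else:
        return 'Less Than Part Time'

right = classify(24)
wrong = classify_wrong_order(24)
```

With 24 units the first function returns 'Full-Time', while the second stops at its first true condition and returns 'Part-Time'.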

In [57]:
def full_time_part_time(num_units):
    if num_units >= 12:
        classification = 'Full-Time'
    elif num_units >= 6:
        classification = 'Part-Time'
    else:                                 # If the above two statements are both false
        classification = 'Less Than Part Time'
    return classification
    

In [58]:
math211_survey.apply(full_time_part_time,'Num_Units')

array(['Full-Time', 'Full-Time', 'Full-Time', 'Less Than Part Time',
       'Full-Time', 'Part-Time', 'Full-Time', 'Full-Time', 'Full-Time',
       'Less Than Part Time', 'Full-Time', 'Full-Time', 'Full-Time',
       'Part-Time', 'Full-Time', 'Less Than Part Time', 'Full-Time',
       'Full-Time', 'Full-Time', 'Full-Time', 'Full-Time',
       'Less Than Part Time', 'Full-Time', 'Full-Time', 'Full-Time',
       'Full-Time', 'Less Than Part Time', 'Full-Time', 'Full-Time',
       'Full-Time', 'Full-Time', 'Less Than Part Time', 'Full-Time',
       'Full-Time', 'Part-Time', 'Full-Time', 'Full-Time',
       'Less Than Part Time', 'Less Than Part Time', 'Part-Time',
       'Full-Time', 'Full-Time', 'Less Than Part Time', 'Full-Time',
       'Full-Time', 'Full-Time', 'Full-Time', 'Part-Time', 'Full-Time',
       'Full-Time', 'Full-Time', 'Full-Time', 'Full-Time', 'Full-Time',
       'Full-Time', 'Less Than Part Time', 'Full-Time', 'Part-Time',
       'Full-Time', 'Full-Time', 'Full-Time', 'F
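
Conceptually, applying a function to one column calls that function once per value and collects the results in an array. The sketch below imitates this with a plain loop over the `Num_Units` values of the first six survey rows. It is an illustration of the idea, not the `datascience` implementation of `apply`.

```python
import numpy as np

def full_time_part_time(num_units):
    if num_units >= 12:
        classification = 'Full-Time'
    elif num_units >= 6:
        classification = 'Part-Time'
    else:
        classification = 'Less Than Part Time'
    return classification

# Num_Units for the first six survey rows.
units = np.array([13, 12, 12, 4, 12, 6])

# Call the function once per value and gather the results into an array.
labels = np.array([full_time_part_time(u) for u in units])
```

The resulting labels agree with the first six entries of the output above.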

In [59]:
math211_survey.with_column('Classification',math211_survey.apply(full_time_part_time,'Num_Units'))

Surveyor,GPA,Introversion/Extraversion,Pets,Campus_Commute,Handed,Sleep,Num_Units,Favorite_Show,Time_Off,pastime/hobby,Motivation,Num_Siblings,Wish_Siblings,Classification
Samantha,3.77,6,"Reptiles, Rodents","Car, Walk",Left-handed,On my side,13,HunterxHunter,Travel,Bowling,God ☝🏽,3,0,Full-Time
Samantha,3.0,5,,Car,Right-handed,On my side,12,One Piece,Travel all around Asia,Driving,Family,3,3,Full-Time
Samantha,3.4,4,Dogs,Bus,Right-handed,On my side,12,My Wife and Kids,Rest,Watching TV,Money,2,1,Full-Time
Samantha,3.5,3,,Car,Right-handed,Mountain climber position,4,Spy x Family,Go to Japan,I like making tech projects using microcontrollers,My dreams in life keeps me motivated.,2,0,Less Than Part Time
Samantha,3.0,7,Dogs,Car,Right-handed,On my stomach,12,My Hero Academia,Travel to other countries.,Reading,My dreams and ambitions,2,3,Full-Time
Samantha,3.0,5,,Car,Left-handed,On my stomach,6,Full Metal Alchemist The Dark Brotherhood,"Regain sleep, hiking/trail walking/exercise, maybe a trip.",Video games,Helping those I care for.,2,2,Part-Time
my best friend,2.9,6,Dogs,"Car, Bus, Walk",Right-handed,On my back,15,Maury,go to mexico,working out,becoming a better version of myself,6,8,Full-Time
Samantha,4.0,5,Dogs,Car,Right-handed,On my side,13,Friends,I would go to mexico,Doing arts and crafts,Working hard and being with friends,8,8,Full-Time
Samantha,3.72,8,Dogs,Car,Left-handed,On my side,16,Baddies West,Travel,Sitting at the beach,Money,2,5,Full-Time
Samantha,3.5,5,,Car,Right-handed,On my side,4,Your Name,"Travel, relax and have fun.",Hiking,Family,2,2,Less Than Part Time
