# Introduction

This week we focus on conditional expressions in two settings. First, we will use the comparison operators we talked about in week 2 to slice dataframes and extract relevant subsets. Then, we will explore if/else flow control statements to create more complex scripts. 

The readings for this week are: 

* [Sections 5.1-5.7 of Think Python](https://greenteapress.com/thinkpython2/html/thinkpython2006.html)
* [Chapter 8 of Problem Solving with Python](https://problemsolvingwithpython.com/08-If-Else-Try-Except/08.00-Introduction/)

In our investigation of the college rankings data in the last notebook, we saw some outliers that we might want to investigate more carefully. We will frequently want to select subsets of data that satisfy certain properties. For this, we can combine the comparison operators with the `.loc` syntax we've used before. In fact, we did this several times with the equality (`==`) operator in the previous notebook when we were extracting schools by type or by state. More ways to slice dataframes are described in the [documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html). 
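
As a minimal sketch of the idea, the toy dataframe below (hypothetical data, not the college rankings file) shows that a comparison on a column produces a boolean Series, and that passing that Series to `.loc` keeps only the rows marked `True`.

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Cost": [20000, 65000, 40000, 70000],
})

# Comparing a column to a value gives one True/False entry per row
mask = toy["Cost"] < 60000
print(mask.tolist())

# .loc keeps only the rows where the mask is True
cheaper = toy.loc[mask]
print(cheaper["Name"].tolist())
```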

In [1]:
import pandas as pd
import matplotlib.pyplot as plt

In [4]:
College_Rankings = pd.read_csv("../Week7_Plotting2/Data/College_Rankings.csv")

In [23]:
College_Rankings.loc[College_Rankings["Grad4"]<20]

Unnamed: 0,Rank,Name,State,Type,Admit,Grad4,Cost,NeedAid,NonNeedAid,NNApercent,GradDebt,Salary,Year
128,129,"California State University, Long Beach, Long ...",CA,Public,36,15,25768,6538,2356,17,16579,46900,2019
428,129,"California State University, Long Beach, Long ...",CA,Public,35,15,25030,6549,2356,17,17058,47320,2021


In [27]:
College_Rankings["Cost"]<60000

0       True
1       True
2      False
3      False
4      False
5      False
6       True
7      False
8      False
9       True
10     False
11     False
12     False
13     False
14     False
15     False
16     False
17      True
18     False
19     False
20     False
21      True
22     False
23     False
24     False
25     False
26     False
27     False
28     False
29     False
       ...  
570     True
571     True
572     True
573     True
574     True
575    False
576     True
577     True
578     True
579     True
580    False
581     True
582     True
583     True
584     True
585     True
586     True
587    False
588     True
589     True
590     True
591     True
592     True
593     True
594     True
595     True
596     True
597     True
598     True
599     True
Name: Cost, Length: 600, dtype: bool

In [16]:
College_Rankings.loc[College_Rankings["Admit"] < 10]

Unnamed: 0,Rank,Name,State,Type,Admit,Grad4,Cost,NeedAid,NonNeedAid,NNApercent,GradDebt,Salary,Year
1,2,"Princeton University, Princeton",NJ,Private,7,90,58660,42097,0,0,6600,75100,2019
2,3,"Harvard University, Cambridge",MA,Private,6,86,61659,44430,20827,0,15117,87200,2019
8,9,"Yale University, New Haven",CT,Private,6,87,63250,45710,0,0,14853,66000,2019
14,15,"California Institute of Technology, Pasadena",CA,Private,9,85,60084,37557,0,0,12104,74000,2019
16,17,"Massachusetts Institute of Technology, Cambridge",MA,Private,8,81,61434,37090,0,0,19064,91600,2019
20,21,"Stanford University, Stanford",CA,Private,5,76,61852,41620,8980,0,19230,80900,2019
30,31,"Brown University, Providence",RI,Private,9,85,63496,40917,9452,0,24300,59700,2019
301,2,"Princeton University, Princeton",NJ,Private,7,93,58911,42009,0,0,6511,75090,2021
302,3,"Harvard University, Cambridge",MA,Private,7,89,61019,44521,20827,0,15406,87227,2021
308,9,"Yale University, New Haven",CT,Private,8,83,62838,45808,0,0,15382,65317,2021


In [17]:
College_Rankings.loc[College_Rankings["Grad4"] >= 90]

Unnamed: 0,Rank,Name,State,Type,Admit,Grad4,Cost,NeedAid,NonNeedAid,NNApercent,GradDebt,Salary,Year
1,2,"Princeton University, Princeton",NJ,Private,7,90,58660,42097,0,0,6600,75100,2019
3,4,"Davidson College, Davidson",NC,Liberal Arts,22,90,61119,37170,23834,15,22000,58500,2019
7,8,"Pomona College, Claremont",CA,Liberal Arts,12,90,63670,41443,0,0,16273,52600,2019
18,19,"Colgate University, Hamilton",NY,Liberal Arts,26,90,63580,41428,0,0,21405,61500,2019
29,30,"Carleton College, Northfield",MN,Liberal Arts,23,91,62846,34050,3095,15,18302,46100,2019
45,46,"University of Notre Dame, Notre Dame",IN,Private,21,90,62825,33025,15804,12,26674,69400,2019
46,47,"Georgetown University, Washington",DC,Private,17,91,64385,36878,0,0,22464,83300,2019
47,48,"College of the Holy Cross, Worcester",MA,Liberal Arts,43,90,60624,32207,33808,4,28354,63700,2019
49,50,"Harvey Mudd College, Claremont",CA,Liberal Arts,14,90,68055,35637,10546,46,24503,78600,2019
56,57,"Washington University in St. Louis, St. Louis",MO,Private,17,90,64350,35555,8665,18,23858,62300,2019


In [20]:
College_Rankings.loc[(College_Rankings["Grad4"] >= 90) & (College_Rankings["Admit"] < 10)]

Unnamed: 0,Rank,Name,State,Type,Admit,Grad4,Cost,NeedAid,NonNeedAid,NNApercent,GradDebt,Salary,Year
1,2,"Princeton University, Princeton",NJ,Private,7,90,58660,42097,0,0,6600,75100,2019
301,2,"Princeton University, Princeton",NJ,Private,7,93,58911,42009,0,0,6511,75090,2021
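
The filter above joins two conditions with `&`. A small sketch on hypothetical toy data shows the element-wise operators `&` (and), `|` (or), and `~` (not). When the comparisons are written inline, each one needs its own parentheses, because these operators bind more tightly than `<` or `>=`.

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Admit": [7, 22, 9, 43],
    "Grad4": [90, 91, 85, 60],
})

selective = toy["Admit"] < 10
high_grad = toy["Grad4"] >= 90

both = toy.loc[selective & high_grad]    # rows meeting both conditions
either = toy.loc[selective | high_grad]  # rows meeting at least one condition
not_selective = toy.loc[~selective]      # rows where the first condition is False

# Written inline, each comparison is wrapped in parentheses
inline = toy.loc[(toy["Admit"] < 10) & (toy["Grad4"] >= 90)]

print(both["Name"].tolist())
print(either["Name"].tolist())
print(not_selective["Name"].tolist())
```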


In [21]:
College_Rankings.loc[(College_Rankings["State"] == "CA") & (College_Rankings["Salary"] < 20000)]

Unnamed: 0,Rank,Name,State,Type,Admit,Grad4,Cost,NeedAid,NonNeedAid,NNApercent,GradDebt,Salary,Year
21,22,"Thomas Aquinas College, Santa Paula",CA,Liberal Arts,83,63,32500,15755,0,0,16263,0,2019
321,22,"Thomas Aquinas College, Santa Paula",CA,Liberal Arts,83,66,31760,15841,0,0,16024,-390,2021


In [32]:
College_Rankings.loc[(College_Rankings["Year"] == 2021) & (College_Rankings["State"].isin(["WA","OR","ID"])) & (College_Rankings["Salary"] > 50000)]

Unnamed: 0,Rank,Name,State,Type,Admit,Grad4,Cost,NeedAid,NonNeedAid,NNApercent,GradDebt,Salary,Year
404,105,"University of Washington, Seattle",WA,Public,56,60,46295,14956,5500,5,21308,52997,2021
467,168,"University of Portland, Portland",OR,Private,61,71,52210,21962,14965,83,25889,50827,2021


Conditional expressions can also be used to ask Python to perform different operations depending on the data it is evaluating. This is our first example of flow control in a Python script, where we will ask Python to perform different tasks depending on whether a specific statement is True or False. 

The main syntax for selection statements consists of the `if` keyword, an expression that evaluates to True or False followed by a colon, and an indented code block that is executed only if the expression is True. Try running the cell below and then changing the value to make the expression true. How does the output change?
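
To make the pieces concrete, here is a small sketch (the function name `describe` is hypothetical) showing that the condition is an ordinary expression with a `True` or `False` value, and that only the indented lines belong to the `if` block.

```python
def describe(number):
    """Return the messages produced for a given number."""
    messages = []

    # The condition is an ordinary expression with a True/False value
    is_single_digit = number < 10

    if is_single_digit:
        # Indented: runs only when the condition is True
        messages.append("This is a single digit number!")

    # Not indented: runs regardless of the condition
    messages.append("Done checking.")
    return messages

print(describe(12))
print(describe(7))
```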

In [3]:
a_number = 12

if a_number < 10:
    print('This is a single digit number!')

In [10]:
your_name = "Margaret"

if len(your_name) > 5:
    print("You have a long name!")
    
else:
    print("You have a short name!")

You have a long name!


In [9]:
a_new_number = -7

if a_new_number > 0: 
    print("This is a positive number!")
    
elif a_new_number == 0:
    print("Zero is boring!")
    
else: 
    print("This is a negative number!")
    

This is a negative number!


In [34]:
your_number = 3.6

if your_number - int(your_number) >= 0.5:
    your_number = int(your_number) + 1
    
else:
    your_number = int(your_number)
    
print(f"I rounded your number to: {your_number}")

I rounded your number to: 4
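
The cell above rounds halves up. As a side note, Python's built-in `round` rounds ties to the nearest even integer, so the two approaches can disagree on values such as 2.5. The sketch below wraps the same if/else logic in a hypothetical helper, `round_half_up`, for comparison. Like the cell above, it is only meant for non-negative numbers.

```python
def round_half_up(number):
    """Round a non-negative number, sending halves upward."""
    whole = int(number)  # int() drops the fractional part

    if number - whole >= 0.5:
        return whole + 1
    else:
        return whole

print(round_half_up(3.6))
print(round_half_up(2.5), round(2.5))
```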


In [41]:
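row_number = 100

# .iloc selects by position and .at by label; with the default integer index both refer to the same row
if College_Rankings.iloc[row_number]["Name"] == "Washington State University":
    print("Go Cougs!")
elif College_Rankings.iloc[row_number]["Name"] == "Somewhere Less Exciting":
    print("We already checked that one!")
else:
    print("meh")
    # Overwrite the name, so re-running this cell takes the elif branch instead
    College_Rankings.at[row_number,"Name"] = "Somewhere Less Exciting"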


We already checked that one!
