# Module: Review of Python
After working through this notebook, students will be able to write functions that:

- manipulate data
- apply conditional logic
- iterate through list-type structures
- leverage Python libraries
- practice abstraction / modularity
- use list comprehensions

# [5] Review of Data Types

Everything we do in code boils down to working with data, so let's get familiar with Python's data types and start building our comfort with the language.

Let's also get comfortable with using the Jupyter Notebook environment!


**Let's start with integers and floats.**

In [5]:
2 + 2

4

In [6]:
7.25 / 5

1.45

**Try any mathematical operation and check that it does what you expect.**

**What about strings? What are strings?**

In [2]:
'Hello'

'Hello'

**Python has another built-in primitive data type: Booleans.**

In [3]:
True

True

Let's try `False`

There's also `None`

In [4]:
None

## Casting

- Let's play with `int()` and `str()`

- With `bool()`, let's reverse-engineer what it does by trying it on different inputs and seeing what the outputs are.
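
Here is a small sketch of what these casts do on a few inputs. The variable names are illustrative and not part of the original notebook.

```python
# Casting between built-in types; the variable names are illustrative.
as_int = int("42")      # string of digits -> integer
as_str = str(3.5)       # float -> string
truncated = int(7.9)    # int() drops the fractional part rather than rounding

# bool() maps zero-like and empty values to False.
print(bool(0), bool(""), bool(None))

# Everything else maps to True, including the non-empty string "False".
print(bool(1), bool("False"), bool([0]))
```

Note that `bool("False")` is `True`: any non-empty string is truthy, whatever it spells.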

# [10] String Manipulation
A significant chunk of data work in Python involves unstructured data such as text.

Here are some examples:
- Analyzing restaurant reviews
- Detecting spam
- Classifying sentiment

How are these related to manipulating string data?

**There are many different things we can do with strings. Let's look at the docs:**

### https://docs.python.org/3/library/stdtypes.html#string-methods

**Let's see how we define variables.**

In [8]:
description = "Python is a great language to learn!"

In [9]:
description.lower()

'python is a great language to learn!'

**Now try uppercasing**

**Some other methods to try**

- `.capitalize()`
- `.endswith()`

In [12]:
messy_string = '    Hello how are you? There are a lot of $pace$ ' 

## Exercise
Strip the extraneous whitespace from the above variable.

## Exercise
Replace the dollar sign `$` above with an `s`.
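
One possible approach to both exercises, sketched with `.strip()` and `.replace()`. The names `stripped` and `cleaned` are illustrative.

```python
messy_string = '    Hello how are you? There are a lot of $pace$ '

# .strip() removes whitespace from both ends only, not from the middle.
stripped = messy_string.strip()

# .replace() swaps every occurrence of the first argument for the second.
cleaned = stripped.replace('$', 's')

print(cleaned)
```

Because each method returns a new string, the two calls could also be chained as `messy_string.strip().replace('$', 's')`.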

# [10] If / Else
Now that we've built a foundation in data, the next layer we add is Boolean logic.

Ultimately, even the most complex program can usually be phrased as a series or tree of Boolean statements.


Suppose we are dealing with an e-Commerce site.


```
If the user has items that they have added to their shopping cart and they haven't taken any action to checkout in the last hour, send them a reminder email.
```

Let's look at the syntax. Execute the following code:

In [15]:
x = 5

if x > 10:
    print("Wow! x is greater than 10!")
else:
    print("x is less than or equal to 10 :(")
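
Coming back to the shopping-cart rule above, here is a rough sketch of how it could look as an `if`/`else`. The variable names and values are made up for illustration; a real site would pull them from its own data.

```python
# Hypothetical values standing in for real user data.
items_in_cart = 3
has_checked_out = False
minutes_since_last_action = 75

# Combine the three conditions from the rule with `and` / `not`.
should_send_reminder = (
    items_in_cart > 0
    and not has_checked_out
    and minutes_since_last_action >= 60
)

if should_send_reminder:
    print("Send reminder email")
else:
    print("No reminder needed")
```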


### Exercise
Write an if/else statement that captures the following logic:

- If `person_type` is equal to 'Foe', print 'Beware'
- If `person_type` is equal to 'Neutral', print 'Approach with Caution'
- If `person_type` is equal to 'Friendly', print 'Shake hands'
- If `person_type` is equal to 'Close Friend', print 'Give trust'
- Otherwise, print 'Need more info...'

**hint**: Think of a way to make this really terse.
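
The example earlier only had two branches. For more than two, Python offers `elif`. Below is a minimal sketch of the pattern, wrapped in a hypothetical function `response_for` so it is easy to try on different inputs. It shows only the first two cases and the fallback; the rest follow the same shape.

```python
def response_for(person_type):
    """Return the response for a person type (hypothetical helper, partial on purpose)."""
    if person_type == 'Foe':
        return 'Beware'
    elif person_type == 'Neutral':
        return 'Approach with Caution'
    else:
        # Fallback when no earlier branch matched.
        return 'Need more info...'

print(response_for('Foe'))
print(response_for('Stranger'))
```

Branches are checked from top to bottom, and only the first matching one runs.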

# [10] Review Lists
Lists are a fundamental data structure in Python. Let's review the syntax.

In [16]:
restaurants = ['Chipotle', 'TGIF', 'Nonna Maria', 'AA Sushi']

Let's go through the common operations together:

- add an element
- remove an element
- check if an element is in the list
- check the length of a list
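
A short sketch of those four operations on the `restaurants` list. The added restaurant and the result names are illustrative.

```python
restaurants = ['Chipotle', 'TGIF', 'Nonna Maria', 'AA Sushi']

restaurants.append('Thai Palace')           # add an element to the end
restaurants.remove('TGIF')                  # remove the first matching element
has_chipotle = 'Chipotle' in restaurants    # membership check
count = len(restaurants)                    # number of elements

print(restaurants)
print(has_chipotle, count)
```

`.remove()` raises a `ValueError` if the element is not in the list, so it can be worth checking with `in` first.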

## Exercise
Suppose we have the following list

In [17]:
artists = ['Van Gogh', 'Michelangelo', 'Jackson Pollack']

Do the following operations:

- Add "Andy Warhol" to the list
- Remove "Jackson Pollack" from the list
- Check if "Andy Warhol" is in the list using the `in` operator

# [20] For Loops

A for loop is a form of iteration: it lets us step through a list (or any list-like structure) and run the same logic on each element.

In [1]:
books = ["Harry Potter", "Catcher in the Rye", "The Great Gatsby", "Just Mercy", "Das Kapital"]

for item_name in books:
    print(item_name)

Harry Potter
Catcher in the Rye
The Great Gatsby
Just Mercy
Das Kapital


## Modify with if/else
Change the above for loop to print "Expelliarmus" if the book is Harry Potter; otherwise keep the functionality as is.

## Let's add some more complexity

In [1]:
books_with_ratings = [
    ["Harry Potter", 7.8],
    ["Catcher in the Rye", 9.2],
    ["The Great Gatsby", 8.7],
    ["Just Mercy", 7.1],
    ["Das Kapital", 7.0],
    ["The Hardy Boys", 6.5],
]

best_books = []

for item in books_with_ratings:
    # If rating is greater than 7.5, then add to best_books list
    pass

print(best_books)

[]


## Exercise Part 1
Given a list of courses, return only the courses that the student hasn't taken.

In [29]:
courses_already_taken = ['Mathematics 101', 'Astronomy 203', 'English 405']

In [32]:
all_courses = [
    ['Mathematics 101', 5.1],
    ['Mathematics 102', 6.7],
    ['Mathematics 103', 8.8],
    ['Mathematics 201', 7.5],
    ['Mathematics 202', 6.1],
    ['Spanish 407', 7.2],
    ['Astronomy 203', 6.7],
    ['Astronomy 204', 8.8],
    ['English 101', 5.5],
    ['English 102', 6.5],
    ['English 405', 6.6],
    ['Physics 101', 7.2],
    ['Physics 102', 7.8],
]

## Exercise Part 2
Now return only the courses that the student hasn't taken AND whose rating is higher than 7.0.
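
One way both parts might be approached, sketched on a shortened copy of the course list. `sample_courses` and the two result lists are illustrative names, not part of the original notebook.

```python
courses_already_taken = ['Mathematics 101', 'Astronomy 203', 'English 405']

# A shortened, illustrative copy of all_courses to keep the sketch small.
sample_courses = [
    ['Mathematics 101', 5.1],
    ['Mathematics 103', 8.8],
    ['Spanish 407', 7.2],
    ['Astronomy 203', 6.7],
    ['English 102', 6.5],
]

not_taken = []
not_taken_and_well_rated = []

# Unpack each [name, rating] pair as we loop.
for name, rating in sample_courses:
    if name not in courses_already_taken:
        not_taken.append(name)              # Part 1
        if rating > 7.0:
            not_taken_and_well_rated.append(name)   # Part 2

print(not_taken)
print(not_taken_and_well_rated)
```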

# [10] Review of functions
Functions are the heart and soul of programming. Much of what we will do in data munging and preparation is create functions that we can apply to columns in our dataset to transform them into what we need.

**Let's look at the syntax.**

In [33]:
FAVORITES = ['Chipotle', 'California Pizza Kitchen', "Paris Bistro"]

def is_in_favorites(restaurant):
    return restaurant in FAVORITES

**Key things to notice**

- colon
- indentation
- return statement
- name of variables
- `def`

**Let's call the function.**
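
A minimal sketch of calling the function, with the definition repeated so the cell runs on its own. `'Subway'` is just an example input.

```python
FAVORITES = ['Chipotle', 'California Pizza Kitchen', "Paris Bistro"]

def is_in_favorites(restaurant):
    return restaurant in FAVORITES

# Call the function by name and pass the argument in parentheses.
print(is_in_favorites('Chipotle'))
print(is_in_favorites('Subway'))
```

The check is case-sensitive, so `is_in_favorites('chipotle')` would return `False`.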

## Exercise
A lot of the time in data munging, we are cleaning up dirty data. Data can be dirty for many reasons:

- outliers
- typos
- wrong data types
- etc

Create a function that does the following:

```
Turns $12,305,200.22 into 12305200.22
```
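
One possible shape for such a function, as a sketch. The name `clean_currency` is hypothetical, and it assumes the input is always a string with a leading `$` and comma separators, as in the example above.

```python
def clean_currency(raw_amount):
    """Convert a string like '$12,305,200.22' into a float."""
    # Drop the dollar sign and the thousands separators, then cast.
    without_symbols = raw_amount.replace('$', '').replace(',', '')
    return float(without_symbols)

print(clean_currency('$12,305,200.22'))
```

Anything that does not look like a number after the cleanup (for example `'N/A'`) would make `float()` raise a `ValueError`.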

**Let's take a look at the `random` library in the Python docs.**

## https://docs.python.org/3/library/random.html

## Exercise
Go ahead and use the `random` library to generate a random number.

In [34]:
import random
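
A few functions from `random` that are handy here, as a quick sketch. The variable names are illustrative, and the call to `random.seed` is optional.

```python
import random

random.seed(42)  # optional: fix the seed so reruns give the same values

fraction = random.random()               # float in the range [0.0, 1.0)
die_roll = random.randint(1, 6)          # integer from 1 to 6, both ends included
pick = random.choice(['a', 'b', 'c'])    # one element chosen at random

print(fraction, die_roll, pick)
```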

## [20] Lab
Create a function that randomly returns one of your favorite restaurants, given that it has a rating of 6.5 or higher.

**Bonus** This function should not return the same restaurant twice in a row.

In [35]:
FAVORITE_RESTAURANTS = [
    ['Chipotle', 7.6],
    ['Subway', 6.2],
    ['Boston Market', 6.4],
    ['Pizza Hut', 7.1],
    ['Don Ramon Cuban Eatery', 8.3],
    ['Paris Bistro', 7.7],
    ['AA Sushi', 8.5],
    ['Thai Palace', 8.3]
]
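
A rough sketch of one way to approach the lab, including the bonus. The function name `pick_restaurant`, its `previous` parameter, and the shortened `SAMPLE_RESTAURANTS` list are assumptions made for illustration; the caller is expected to pass in the last pick.

```python
import random

# A shortened, illustrative copy of the restaurant list.
SAMPLE_RESTAURANTS = [
    ['Chipotle', 7.6],
    ['Subway', 6.2],
    ['Pizza Hut', 7.1],
    ['AA Sushi', 8.5],
]

def pick_restaurant(restaurants, min_rating=6.5, previous=None):
    """Return a random restaurant name rated at least min_rating, skipping previous."""
    candidates = []
    for name, rating in restaurants:
        if rating >= min_rating and name != previous:
            candidates.append(name)
    # random.choice raises IndexError if no candidates are left.
    return random.choice(candidates)

first = pick_restaurant(SAMPLE_RESTAURANTS)
second = pick_restaurant(SAMPLE_RESTAURANTS, previous=first)
print(first, second)
```

Passing `previous` explicitly keeps the function free of hidden state, at the cost of asking the caller to remember the last result.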

## [10] Abstraction and Modularity

In [3]:
def get_odds():
    numbers = range(1, 10000)
    results = []
    for num in numbers:
        if num % 2 == 1:
            results.append(num)
    return results

## Exercise
Modify the above function to take `start`, `end`, and `divisor` arguments and return all of the numbers between `start` and `end` that are divisible by `divisor`.

For example, if the function is called `get_divisible_by`, it would work like the following:

```python
get_divisible_by(1, 5000, 5)
>>> [5, 10, 15, 20, 25, ..., 5000]
```
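
A possible for-loop version, as a sketch. It assumes `end` is inclusive, since the example output above includes 5000.

```python
def get_divisible_by(start, end, divisor):
    """Return the numbers from start to end (inclusive) that are divisible by divisor."""
    results = []
    # range() stops before its second argument, so add 1 to include end.
    for num in range(start, end + 1):
        if num % divisor == 0:
            results.append(num)
    return results

print(get_divisible_by(1, 20, 5))
```

The hard-coded `range(1, 10000)` and `% 2 == 1` from `get_odds` have become parameters, which is the abstraction step this section is about.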

**NOTE**: We may come back to this function once we understand list comprehensions.

# [20] List comprehensions
What is the point of comprehensions?

In [4]:
desired_foods = ['banana', 'apple', 'coffee', 'pasta', 'mushrooms']
blacklisted_foods = ['coffee']

## Method 1: For Loop

In [5]:
allowed_foods = []
for food in desired_foods:
    if food not in blacklisted_foods:
        allowed_foods.append(food)

allowed_foods

['banana', 'apple', 'pasta', 'mushrooms']

## Method 2: List Comprehension
Here let's talk about terseness and abstraction.

In [6]:
allowed_foods = [food for food in desired_foods if food not in blacklisted_foods]
allowed_foods

['banana', 'apple', 'pasta', 'mushrooms']

## Exercise 2.1
Rewrite `get_divisible_by` using a list comprehension.
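
One way the rewrite might look, as a sketch that again treats `end` as inclusive.

```python
def get_divisible_by(start, end, divisor):
    """Return the numbers from start to end (inclusive) that are divisible by divisor."""
    # Same filter as a for loop with an if, written as a single expression.
    return [num for num in range(start, end + 1) if num % divisor == 0]

print(get_divisible_by(1, 20, 5))
```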

## Exercise 2.2 (List Comprehension + Abstraction/Generalization)
Create a new function that returns the squares of the numbers between `start` and `end` that are divisible by `divisor`, rather than the numbers themselves. For example,

```python
get_square_divisible_by(1, 50, 5)
>>> [25, 100, 225, ..., 2500]
```

Note: 25 is 5^2, 100 is 10^2, ...
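
A sketch of how this could be written. The only change from a filtering comprehension is the expression at the front, which transforms each number before it is collected. `end` is again assumed to be inclusive.

```python
def get_square_divisible_by(start, end, divisor):
    """Return the squares of the numbers from start to end (inclusive) divisible by divisor."""
    # In Python, squaring is written num ** 2; the ^ operator is bitwise XOR.
    return [num ** 2 for num in range(start, end + 1) if num % divisor == 0]

print(get_square_divisible_by(1, 50, 5))
```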