# Python for Data Science, Level I
### *Session \#4*
---

### Helpful shortcuts
---

**SHIFT** + **ENTER** ----> Execute Cell

**TAB** ----> See autocomplete options

**ESC** then **b** ----> Create Cell 

**ESC** then **dd** ----> Delete Cell

**\[python expression\]?** ----> Explanation of that Python expression

**ESC** then **m** then __ENTER__ ----> Switch to Markdown mode

## I. Array Basics


### Warm Ups
---

**Import numpy:** `import numpy as np`

**Create an array from a list:** `int_array = np.array([8, 6, 7, 5, 3, 0, 9])`

**Broadcasting:** `int_array * 3`

**Element-wise arithmetic:** `np.array([1, 2, 3]) + np.array([3, 2, 1])`

**Create copy of array:** `array_copy = int_array.copy()`
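As a rough illustration of the warm-ups above, the sketch below runs each one and shows that a copy can be changed without touching the original. The names `tripled` and `pair_sum` are introduced here only for illustration.

```python
import numpy as np

int_array = np.array([8, 6, 7, 5, 3, 0, 9])

# Broadcasting: the single number 3 is applied to every element.
tripled = int_array * 3
print(tripled)        # [24 18 21 15  9  0 27]

# Element-wise arithmetic: elements are paired up by position.
pair_sum = np.array([1, 2, 3]) + np.array([3, 2, 1])
print(pair_sum)       # [4 4 4]

# A copy is independent of the original array.
array_copy = int_array.copy()
array_copy[0] = 100
print(int_array[0])   # 8
print(array_copy[0])  # 100
```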

### Exercises
---
**1. The array** `temps` **represents a sequence of temperature readings in Fahrenheit. Use broadcasting to convert them all to Celsius.**

Hint: To convert from Fahrenheit to Celsius, subtract 32 degrees and multiply by 5/9.

In [141]:
temps = np.array([92, 103, 85, 94, 90, 92, 87, 82, 80])

**2. Use broadcasting to convert your new temperature array to Kelvin.**

Hint: To convert Celsius to Kelvin, add 273.15 (273 is a close approximation).

**3. Each of the** `game` **arrays represents how many points each player on the team scored in that game.**

**Create an array of each player's average number of points scored, using element-wise arithmetic and broadcasting.**

In [159]:
game1 = np.array([6, 26, 18, 10])
game2 = np.array([12, 24, 28, 8])
game3 = np.array([14, 30, 22, 14])

totals = game1 + game2 + game3
average = totals / 3

**4. What would the averages look like if each player had scored twice as many points during the last game?**

**5. The array** `book_pages` **represents how many pages each book has, and** `book_hours` **represents how long it took to finish the book.**

**Which book was most difficult to read, based on its pages-per-hour reading rate?**

In [None]:
book_pages = np.array([320, 173, 532, 801, 275])
book_hours = np.array([  3,   2,   6,  19,   4])

### Extra Credit
---
**1. NumPy also provides handy functions for creating arrays, such as** `np.random.randint()`. **The function takes three inputs: the lower bound (inclusive), the upper bound (exclusive), and the number of random numbers you want. For example:**

`np.random.randint(1, 10, 3)` ---> `array([1, 3, 9])` (one possible result)

**How would you use** `randint()` **to simulate a turn in the game Yahtzee, which is played with 5 standard dice?** 
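One detail worth checking before you answer: `np.random.randint()` never returns its upper bound. The sketch below uses a different game (coin flips, chosen here only as an illustration) to show this.

```python
import numpy as np

# Ten simulated coin flips: 0 = tails, 1 = heads.
# The upper bound (2) is excluded, so only 0 and 1 can appear.
flips = np.random.randint(0, 2, 10)
print(flips)
```

The output changes on every run because the values are random.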

## II. Boolean Masks


### Warm Ups
---

**1. Create a boolean mask:** `int_array < 3`

**2. Filtering using a boolean mask:** `int_array[int_array < 3]`

array([0])

**3. Combining boolean masks with an OR operator:** `int_array[(int_array < 3) | (int_array > 7)]`

**4. Combining boolean masks with an AND operator:** `int_array[(3 < int_array) & (int_array < 7)]`

**5. Assignment using a boolean mask:** `int_array[int_array % 2 == 0] = 0`
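The warm-ups above can be sketched end to end as follows. The names `mask`, `between`, and `evens_zeroed` are introduced here for illustration, and the assignment is done on a copy so that `int_array` stays unchanged.

```python
import numpy as np

int_array = np.array([8, 6, 7, 5, 3, 0, 9])

# A mask holds one True/False value per element.
mask = int_array < 3
print(int_array[mask])   # [0]

# True counts as 1, so summing a mask counts the matching elements.
print(mask.sum())        # 1

# AND: each condition needs its own parentheses.
between = int_array[(3 < int_array) & (int_array < 7)]
print(between)           # [6 5]

# Assignment: overwrite only the elements where the mask is True.
evens_zeroed = int_array.copy()
evens_zeroed[evens_zeroed % 2 == 0] = 0
print(evens_zeroed)      # [0 0 7 5 3 0 9]
```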

### Exercises
---

**1. The array** `test_scores` **represents the final test scores for a class. How many students got at least 80 points on the test?**

In [132]:
test_scores = np.array([68, 95, 92, 73, 84, 90, 69, 75, 80, 82, 90, 85, 86, 64, 98])

**2. Filter the array to just students who got between 65 and 75 points on the test.**

**3. How many students got above 95 or below 65 points?**

**4. How many students got exactly 90 points?**

**5. The array** `golf_scores` **represents how many strokes a golfer took on each hole during a round.**

**There's a 6-stroke maximum, so overwrite any higher score with a 6.** 

Hint: Use a boolean mask to assign the new value.

In [None]:
golf_scores = np.random.randint(1, 10, 18)

### Extra Credit
---

**1. Par, or the expected number of strokes, for each hole is given by the array** `par`

**Which holes did the golfer shoot at or under par?**

Hint: First find a boolean mask for the holes at or under par, then use it on the array `[1, 2, 3, ..., 18]`.

In [236]:
par = np.array([3, 4, 3, 3, 5, 3, 4, 3, 3, 4, 5, 3, 3, 3, 5, 3, 3, 3])
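A mask built from one array can be used to filter a different array of the same length. The sketch below shows the idea on made-up data; `readings`, `limits`, and `days` are hypothetical names and values, not part of the exercise.

```python
import numpy as np

# Hypothetical data: one reading and one limit per day.
readings = np.array([71, 68, 75, 80, 66])
limits = np.array([70, 70, 70, 78, 70])
days = np.arange(1, 6)        # day numbers 1 through 5

# Compare the two arrays element by element to build the mask...
within_limit = readings <= limits

# ...then apply that mask to the array of day numbers.
print(days[within_limit])     # [2 5]
```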

## III. Array Methods

*Note: Most array methods can also be called as functions of the NumPy module.* `int_array.sum()` <=> `np.sum(int_array)`

### Warm Ups
---

**Sum of an array:** `int_array.sum()`

**Product of an array:** `int_array.prod()`

**Find the max or min element:** `int_array.max()` **or** `int_array.min()`

**Find the average:** `int_array.mean()`

**Sort an array in place:** `int_array.sort()`
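Here is a short sketch of these methods on the warm-up array. It also shows one place where the method and the module function differ: `int_array.sort()` reorders the array itself and returns `None`, while `np.sort(int_array)` returns a sorted copy. The variable names are introduced here for illustration.

```python
import numpy as np

int_array = np.array([8, 6, 7, 5, 3, 0, 9])

total = int_array.sum()
print(total)                 # 38
print(np.sum(int_array))     # 38, same result as the method
print(int_array.prod())      # 0, because one element is 0
print(int_array.max())       # 9
print(int_array.min())       # 0
print(int_array.mean())      # 38 / 7, roughly 5.43

# np.sort() returns a new sorted array and leaves int_array alone.
sorted_copy = np.sort(int_array)
print(int_array)             # [8 6 7 5 3 0 9]

# The .sort() method sorts in place and returns None.
result = int_array.sort()
print(result)                # None
print(int_array)             # [0 3 5 6 7 8 9]
```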

### Exercises
---
**1. Find the average test score from** `test_scores`

**2. Create an array showing how many points above/below average each test score was.**

**3. Find the final number of strokes over par, by summing** `golf_scores` **and** `par` **separately and subtracting the totals.**

**4. Find the number of degrees between the coldest and hottest temperature readings from** `temps` 

### Extra Credit
---

**1. Among students who scored higher than 70 points in** `test_scores`, **what was the average score?** 

**2. If you removed the worst three scores from** `test_scores`, **what would be the average score?**

## IV. Functions for Creating Arrays

### Warm Ups
---

**Create a ten-element array of zeros or ones:** `np.zeros(10)` **or** `np.ones(10)`

**Create an array ranging from 5 to 10:** `np.arange(5, 11)`

**Create an array of 20 random integers from 0 to 9:** `np.random.randint(0, 10, 20)`

**Create an array by randomly choosing from a list:** `np.random.choice([1, 2, 3], 5)`

**Create an evenly spaced array of 5 values between 0 and 10:** `np.linspace(0, 10, 5)`
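The sketch below runs smaller versions of these warm-ups so the output is easy to read. Two details to notice: `np.arange()` stops before its second argument, while `np.linspace()` includes both endpoints. The optional third argument to `np.arange()` (the step size) is not used in the warm-ups and is shown here as an extra.

```python
import numpy as np

print(np.zeros(3))            # [0. 0. 0.]
print(np.ones(3))             # [1. 1. 1.]

# arange stops before the second argument, so 11 is needed to reach 10.
five_to_ten = np.arange(5, 11)
print(five_to_ten)            # [ 5  6  7  8  9 10]

# An optional third argument sets the step between values.
evens = np.arange(0, 10, 2)
print(evens)                  # [0 2 4 6 8]

# linspace includes both endpoints and takes a count, not a step.
spaced = np.linspace(0, 10, 5)
print(spaced)                 # [ 0.   2.5  5.   7.5 10. ]

# Random picks from a list; the output changes on every run.
picks = np.random.choice([1, 2, 3], 5)
print(picks)
```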

### Exercises
---
**1. Create an array that gives you every eighth of a rotation between 0 and 360 degrees.** 

**For example, it would start:** `[0, 45, 90...]`

**2. The factorial of a number n,** `n!`, **is defined as** `n * (n-1) * (n-2) ... * 2 * 1`. 

**So for example,** `3!` **is equal to** `3 * 2 * 1 = 6`

**Create a vectorized version of the function below, using NumPy functions/methods instead of a loop.**

In [235]:
def factorial(num):
    answer = 1
    for i in range(1, num+1):
        answer = answer * i
    return answer
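To show what "vectorized" means without giving away the answer, here is a sketch that converts a different loop (a sum of squares, chosen only as an illustration) into array operations. Both function names are hypothetical.

```python
import numpy as np

def sum_of_squares_loop(num):
    # Loop version: visit each integer from 1 to num in turn.
    total = 0
    for i in range(1, num + 1):
        total = total + i * i
    return total

def sum_of_squares_vectorized(num):
    # Vectorized version: build all the integers at once,
    # square them element-wise, then reduce with an array method.
    values = np.arange(1, num + 1)
    return (values * values).sum()

print(sum_of_squares_loop(4))        # 30
print(sum_of_squares_vectorized(4))  # 30
```

The same pattern applies to the factorial: create the range of numbers as an array, then pick the array method that combines them.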

**3. Complete the function** `infinite_monkeys()` **that will construct random sentences from a word bank.**

In [247]:
def infinite_monkeys():
    word_bank = ["the", "hotdog", "squirrel", "went", "swam", "flew", "cloud", "to", "land"]
    # Insert your code here

### Extra Credit
---

**1. (EXTRA HARD) Estimate the area under the curve of the sine function from 0 to pi.**

Hint: Use rectangles! Use `np.linspace()` to create even slices along the x-axis, and `np.sin()` to find the height of the rectangles at those points. As you increase the number of slices, your estimate should converge toward 2.

![image of sine function](http://www.visumath.be/helpfile/images/integralexample1.png "Area under sine function")
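The rectangle method can be sketched on a different curve so the exercise itself stays open. The example below estimates the area under y = x² from 0 to 1, whose exact value is 1/3. It is one possible approach (rectangles measured at the left edge of each slice), and all variable names are assumptions made for this illustration.

```python
import numpy as np

slices = 1000                             # number of rectangles
edges = np.linspace(0, 1, slices + 1)     # slice boundaries along the x-axis
width = edges[1] - edges[0]               # every slice has the same width

left_x = edges[:-1]                       # left edge of each slice
heights = left_x ** 2                     # curve height at each left edge

# Area of each rectangle is height * width; add them all up.
area = (heights * width).sum()
print(area)                               # close to 1/3
```

Raising `slices` makes the rectangles narrower and brings the estimate closer to the exact area.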

In [198]:
from math import pi

slice_count = 50
start = 0
stop = pi