In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("week-3.ipynb")

# Week 3 Lecture Notebook

### `NumPy`

####  `np.arange`

Arrays are provided by a package called [NumPy](http://www.numpy.org/) (pronounced "NUM-pie"). The package is called `numpy`, but it's standard to rename it `np` for brevity.  You can do that with:

    import numpy as np

NumPy provides a function called `arange` for creating arrays of evenly spaced numbers.  The line of code `np.arange(start, stop, step)` evaluates to an array with all the numbers starting at `start` and counting up by `step`, stopping **before** `stop` is reached.

Run the following cells to see some examples.

In [None]:
import numpy as np

In [None]:
# Create an array that starts at 1, counts up by 2,
# and stops before 20
np.arange(1, 20, 2)

This array doesn't contain 20 because `np.arange` stops before the stop value is reached.
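
The example above cannot show the exclusive stop very clearly, because 20 is even and would never appear in a sequence of odd numbers. A small sketch where the stop value would otherwise be the next number in the sequence (the variable names are only for illustration):

```python
import numpy as np

# 20 is a multiple of 5, so it would be the next value in the sequence,
# but np.arange stops before reaching the stop value.
multiples_of_five = np.arange(0, 20, 5)
print(multiples_of_five)        # [ 0  5 10 15]

# To include 20, choose any stop value just past it.
through_twenty = np.arange(0, 21, 5)
print(through_twenty)           # [ 0  5 10 15 20]
```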

We've done elementwise arithmetic with a `NumPy` array. We can also use `NumPy` functions on a `pandas` `Series`. This is because a `pandas` `Series` is a generalized array. The essential difference is the presence of the index: while the `NumPy` array has an implicitly defined integer index used to access the values, the `pandas` `Series` has an explicitly defined index associated with the values.
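
A minimal sketch of that difference, using made-up values and hypothetical index labels (`"a"`, `"b"`, `"c"`): the array is accessed by position, the `Series` by label, and a `NumPy` function applied to the `Series` keeps its index.

```python
import numpy as np
import pandas as pd

# A NumPy array is accessed by its implicit integer positions 0, 1, 2, ...
pay_array = np.array([4.0, 9.0, 16.0])
print(pay_array[1])             # 9.0

# A Series carries an explicit index; these labels are made up for illustration.
pay_series = pd.Series([4.0, 9.0, 16.0], index=["a", "b", "c"])
print(pay_series["b"])          # 9.0

# NumPy functions work elementwise on a Series and keep its index.
roots = np.sqrt(pay_series)
print(roots["c"])               # 4.0
```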

In [None]:
# The first 100 odd numbers run from 1 through 199
first_100_odd_numbers = np.arange(1, 200, 2)
first_100_odd_numbers

In [None]:
first_100_odd_numbers[1]

### `Pandas`

####  Slicing

In [None]:
import pandas as pd
nc_ceo_totalpay = pd.read_csv('data/nc-ceo-total-pay.csv')
nc_ceo_totalpay

If we select the column name `Total Pay` using `[ ]` notation, we get a `Series`.

In [None]:
nc_ceo_totalpay['Total Pay']
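
A brief sketch of how `[ ]` behaves with different arguments, using a small made-up `DataFrame` (the column names and values are hypothetical and are not taken from `nc-ceo-total-pay.csv`). A single column name gives a `Series`, a list of names gives a `DataFrame`, and a Boolean `Series` keeps only the rows where it is `True`.

```python
import pandas as pd

# Hypothetical toy data for illustration only.
toy = pd.DataFrame({
    "City": ["Raleigh", "Durham", "Raleigh", "Cary"],
    "Pay": [1.5, 0.8, 2.1, 0.4],
})

one_column = toy["Pay"]              # a Series
sub_frame = toy[["City", "Pay"]]     # a DataFrame

# A comparison produces a Boolean Series, which selects matching rows.
in_raleigh = toy[toy["City"] == "Raleigh"]

# isin() matches any value in a list; & combines two conditions.
raleigh_or_cary = toy[toy["City"].isin(["Raleigh", "Cary"])]
raleigh_high_pay = toy[(toy["City"] == "Raleigh") & (toy["Pay"] > 2)]

print(type(one_column).__name__, type(sub_frame).__name__)
print(len(in_raleigh), len(raleigh_or_cary), len(raleigh_high_pay))
```

Each condition combined with `&` needs its own parentheses, because `&` binds more tightly than `==` and `>`.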

We can use `numpy` functions on a `Series`.

In [None]:
mean_totalpay = np.mean(nc_ceo_totalpay['Total Pay'])
mean_totalpay

We can use the `np.round` function on a float.

In [None]:
np.round(mean_totalpay)
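
`np.round` also accepts a second argument giving the number of decimals; a negative value rounds to tens, hundreds, and so on. A short sketch with a made-up value (not the actual mean from the data):

```python
import numpy as np

value = 1234567.891                  # hypothetical number for illustration

nearest_integer = np.round(value)         # 1234568.0
two_decimals = np.round(value, 2)         # 1234567.89
nearest_thousand = np.round(value, -3)    # 1235000.0

print(nearest_integer, two_decimals, nearest_thousand)
```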

**Example 1.** Select all the companies that are located in Charlotte. Return a `DataFrame`.

In [None]:
char_ceo_totalpay = ...
char_ceo_totalpay

In [None]:
grader.check("e1")

**Example 2.** Select all the companies that are located in either Charlotte or Greensboro. Return a `DataFrame`.

In [None]:
char_gso_ceo_totalpay = ...
char_gso_ceo_totalpay

In [None]:
grader.check("e2")

**Example 3.** Select all the companies that pay the CEO more than 1 million in total pay.

In [None]:
nc_ceo_million = ...
nc_ceo_million

In [None]:
grader.check("e3")

### Loops

####  `for`

In [None]:
for number in range(5):
    print(number)

In [None]:
for number in np.arange(5):
    print(number)

In [None]:
my_string = "N. C. State Wolfpack"

for character in my_string:
    print(character)

Suppose we wanted to count the number of characters in `my_string`.

In [None]:
character_count = 0

for character in my_string:
    character_count += 1

print('There are', character_count, 'characters in my_string.')
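
The loop above does by hand what the built-in `len` function does directly, so `len` is a quick way to check the result. A minimal sketch:

```python
my_string = "N. C. State Wolfpack"

# Count characters one at a time, as in the loop above.
character_count = 0
for character in my_string:
    character_count += 1

# len() counts every character, including spaces and periods.
print(character_count == len(my_string))   # True
print(len(my_string))                      # 20
```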

Suppose we only wanted to count the letters in `my_string`.

To do this we would need a way to

1. remove the spaces

2. check to see if the character is a letter of the alphabet

In other words we need to use a conditional statement.

### Conditional Statements

#### `if`, `else`, and `elif`

In [None]:
gpa = 3.26

if gpa > 3.25:
    print("Cum Laude")
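
The cell above uses only `if`, so nothing is printed when the condition is false. A minimal sketch of a full `if`/`elif`/`else` chain, using a hypothetical `temperature` variable and made-up thresholds: Python checks the conditions from top to bottom and runs only the first branch whose condition is true.

```python
temperature = 68   # hypothetical value, in degrees Fahrenheit

if temperature >= 85:
    description = "hot"
elif temperature >= 60:
    # Reached only when the first condition was False.
    description = "mild"
else:
    # Reached only when every condition above was False.
    description = "cold"

print(description)   # mild
```

Because only the first true branch runs, ordering the conditions from the highest threshold down avoids having to write both a lower and an upper bound in each test.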

According to the [NC State undergraduate catalog](http://catalog.ncsu.edu/undergraduate/academic-policies-procedures/student-status-honors/academic-honors/):

    Students with exceptional academic performance may be recognized in the following ways at the university level.
    
**Graduation with Honors**

Undergraduate degree honor designations are:

* **Cum Laude** for GPA 3.25 through 3.499

* **Magna Cum Laude** for GPA 3.5 through 3.749

* **Summa Cum Laude** for GPA 3.75 and above
    
To be eligible for degree honor designations students must have completed at least two semesters and at least 30 credit hours at NC State.

**Example 4.** Write a conditional statement that takes the `gpa` and prints the appropriate degree honor designation. If the GPA does not qualify for academic honors, print None.

In [None]:
gpa = 2.73

...

Now let's get back to our original question. How many letters are in `my_string`?

**Example 5.** Use a `for` loop and a conditional statement to count the number of letters in `my_string`.

**Hints:** 

* There are several string methods in Python. They can be found [here](https://www.w3schools.com/python/python_ref_string.asp). For example, the `.replace()` string method can be used to remove spaces.

* Import the string library (`import string`) and use the object `string.punctuation`. You can read more about this object by clicking [here](https://www.geeksforgeeks.org/string-punctuation-in-python/#:~:text=punctuation%20is%20a%20pre%2Dinitialized,the%20all%20sets%20of%20punctuation.&text=Parameters%20%3A%20Doesn't%20take%20any,since%20it's%20not%20a%20function.).

* The `in` and `not in` membership operators can be used to test the membership of a value in a sequence, such as a string, list, or tuple. Click [here](https://www.geeksforgeeks.org/python-membership-identity-operators-not-not/) to check out an example.

* If you do it correctly your result should be 15.
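
The hints above can be tried out on a different string before tackling `my_string`. The sketch below uses a made-up string, `cheer`, and only shows what each tool returns; it is not a solution to Example 5.

```python
import string

cheer = "Go Pack!"   # hypothetical string for illustration

# .replace() returns a new string; the original is unchanged.
no_spaces = cheer.replace(" ", "")
print(no_spaces)                        # GoPack!

# string.punctuation is a plain string of ASCII punctuation characters.
print("!" in string.punctuation)        # True
print("G" not in string.punctuation)    # True

# Membership tests work inside a conditional, one character at a time.
for character in no_spaces:
    if character in string.punctuation:
        print(character, "is punctuation")
```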

In [None]:
# Import the string module
import string

# Initialize letter count
letter_count = 0

# for loop

...

letter_count

### Rapper's Delight

["Rapper's Delight"](https://en.wikipedia.org/wiki/Rapper%27s_Delight) is a 1979 hip hop track by the Sugarhill Gang, produced by [Sylvia Robinson](https://en.wikipedia.org/wiki/Sylvia_Robinson) (aka "The Mother of Hip Hop"). In 2000 NPR did a story about the song.

In [None]:
from IPython import display

# `display` is the name imported above; the bare name `IPython` is not defined here
display.Audio('audio/rappersdelight.mp3')

There were three rappers in the Sugarhill Gang. The members, all from Englewood, New Jersey, were Michael "Wonder Mike" Wright, Henry "Big Bank Hank" Jackson (January 11, 1956 - November 11, 2014), and Guy "Master Gee" O'Brien.

We will take a look at the lyrics of Wonder Mike.

Run the cell below to read in the text file.

In [None]:
with open('data/wonder-mike.txt', 'r') as f:
    wonder_mike_lyrics = f.read()
type(wonder_mike_lyrics)

Since this is a string, we can slice it by character position. Let's take a look at the first few lines.

In [None]:
print(wonder_mike_lyrics[:789])

Where does Wonder Mike's second verse begin?

In [None]:
wonder_mike_lyrics.find("Wonder Mike Verse 2")

Does he have a third verse?

In [None]:
wonder_mike_lyrics.find("Wonder Mike Verse 3")
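
`str.find` returns the index where the first match begins, or `-1` when the text is not found, which is how the two cells above answer their questions. A small sketch on a made-up string (not the actual lyrics file):

```python
sample_text = "Verse 1: hello. Verse 2: goodbye."   # hypothetical text

start_of_second = sample_text.find("Verse 2")
missing = sample_text.find("Verse 3")

print(start_of_second)   # 16
print(missing)           # -1

# The returned index can be used to slice from that point on.
print(sample_text[start_of_second:])   # Verse 2: goodbye.
```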

Wonder Mike seems to use the words **boogie**, **hip**, and **hop** a lot.
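
One way to count occurrences of a word is the `str.count` method, sketched here on a made-up line rather than the lyrics file. Note that `count` matches substrings and is case-sensitive, so lowercasing the text first is an assumption about what should count as a match.

```python
sample_line = "Hip hop, the hippie to the hip hip hop"   # hypothetical text

lowered = sample_line.lower()

# count() tallies non-overlapping substring matches, so "hippie" adds to "hip".
print(lowered.count("hip"))        # 4
print(lowered.count("hop"))        # 2
print(lowered.count("hip hop"))    # 2

# Splitting into words first counts whole words only.
# (Punctuation stays attached to words, e.g. "hop," is not equal to "hop".)
words = lowered.split()
print(words.count("hip"))          # 3
```

Whether a substring count or a whole-word count is the right answer depends on how the question is read, so it is worth stating which one was used.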

**Example 6.**  How many times did he say **boogie**, **hip**, and **hop** in his three verses?

In [None]:
...

**Example 7.**  How many times did he say **hip hop** in his three verses?

In [None]:
...

## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**

When done exporting, download the .zip file by finding it in the file browser on the left side of the screen, then right-click it and select **Download**. You'll submit this .zip file for the assignment in Canvas to Gradescope for grading.

In [None]:
# Save your notebook first, then run this cell to export your submission.
grader.export(pdf=False)