### PEP8 Style Guidelines, Linters, and Magic Commands

One of the key insights of Guido van Rossum, the creator of Python, is that code is read much more often than it is written. The PEP 8 style guidelines are intended to improve the readability of code and make it consistent across the wide spectrum of Python code. The full list of rules is located here:

https://www.python.org/dev/peps/pep-0008/

Some main things to remember are:

    1. Indentation should be four spaces.
    2. Put spaces around operators like +, -, ==, etc.
    3. Don't put spaces around the = sign for keyword arguments or default parameter values (ex: def function(greeting='hello')).

Here is a well-styled cell of code:

In [None]:
def next_birthday(name='John Doe', age=18):
    age = age + 1
    return f'Hi my name is {name} and I will be {age} next year.'

next_birthday('Sally', 6)

To check that our code is clean and satisfies the PEP 8 guidelines, we can use a linter. A linter (or "lint") is a tool that analyzes source code to flag programming errors, bugs, stylistic errors, and suspicious constructs.
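
To make the idea concrete, here is a toy sketch of a single lint check. It is not how flake8 or pycodestyle are implemented; `find_indentation_issues` is a hypothetical helper that only flags lines whose leading spaces are not a multiple of four, and it would wrongly flag some valid continuation lines.

```python
def find_indentation_issues(source):
    """Return (line_number, message) pairs for suspicious indentation.

    Toy illustration only: real linters tokenize the code and apply
    many more checks than this one.
    """
    issues = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        stripped = line.lstrip(' ')
        if not stripped:
            continue  # skip blank lines
        indent = len(line) - len(stripped)
        if indent % 4 != 0:
            issues.append((line_number, f'indentation of {indent} spaces'))
    return issues


sample = "def f(x):\n  y = x + 1\n  return y\n"
for line_number, message in find_indentation_issues(sample):
    print(f'line {line_number}: {message}')
```

A real linter reports a rule code and a column as well, which is what you will see from flake8 later in this notebook.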

Now we can use magic commands for linting. IPython (the kernel that runs Python code in Jupyter notebooks) has a set of predefined "magic functions" that you can call with a command-line style syntax. There are two kinds of magics: line-oriented and cell-oriented. Line magics are prefixed with the % character and work much like OS command-line calls: they take the rest of the line as their argument, and arguments are passed without parentheses or quotes. Cell magics are prefixed with a double %%, and they take as arguments not only the rest of the line but also, as a separate argument, the lines below it.

You have already used a line magic command for plotting. The line %matplotlib inline displays plots directly below the cell, so you do not have to call plt.show():


In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

plt.plot([1,2,3], [1,2,3], '.')

In order to get a linter working, we'll first need to install it. Run the following cell:

In [None]:
! pip install pycodestyle flake8 --user
! pip install pycodestyle_magic --user

## Now go to Kernel - Restart so that you can use the packages that you just installed. You should not need to repeat any of the previous steps; just start fresh from the cell below.

**Using a linter:**

Now we can use line magics to help us with linting. First, run the cell below:

In [None]:
%load_ext pycodestyle_magic
%flake8_on

Now, run this cell:

In [None]:

def next_birthday(name = 'John Doe', age = 18):
  age=age+1
  return f'Hi my name is {name} and I will be {age} next year.'

next_birthday('Sally', 6)





Uh oh, we've got a bunch of issues to clear up. First, it reports "E251 unexpected spaces around keyword / parameter equals" four times (on line 1, at characters 23, 25, 41, and 43). Let's remove the spaces around the equals signs in the default arguments to fix these:

In [None]:

def next_birthday(name='John Doe', age=18):
  age=age+1
  return f'Hi my name is {name} and I will be {age} next year.'

next_birthday('Sally', 6)

Now we should fix our indentation:

In [None]:
def next_birthday(name='John Doe', age=18):
    age=age+1
    return f'Hi my name is {name} and I will be {age} next year.'

next_birthday('Sally', 6)

Next, let's fix the spacing around our operators:

In [None]:
def next_birthday(name='John Doe', age=18):
    age = age + 1
    return f'Hi my name is {name} and I will be {age} next year.'

next_birthday('Sally', 6)

Finally, put two blank lines after the end of your function:

In [None]:
def next_birthday(name='John Doe', age=18):
    age = age + 1
    return f'Hi my name is {name} and I will be {age} next year.'


next_birthday('Sally', 6)

Note: if you want to see which line number you are on (this really helps with finding linting errors), go to View - Toggle Line Numbers.

### Part 1: Linting Practice: Clean up the code below using a linter

In [None]:
# exercise 1.1
def insult(name = "John Doe", age = 18):
  if age>25:
    return f"You're old, {name}!"
  else:
    return f"You're going to be old soon enough, {name}!" 

insult('Kanye',40)

### 1.2 `timeit`
Another helpful magic command is `timeit`. `timeit` sometimes won't work when linting is turned on. If that is the case for you, try `%flake8_off`, or go to Kernel - Restart Kernel to turn off linting completely.

`timeit` measures how long your code takes to run, so you can compare which algorithm is more efficient.

Here's one example where we check all of the numbers between 2 and 16785407 before deciding if 16785408 is prime. It runs on a scale of seconds:

In [None]:
%%timeit


def is_prime(n):
    prime = True
    for i in range(2, n):
        if n % i == 0:
            prime = False
    return prime

is_prime(16785408)

We notice that if we break out of the loop with a return statement as soon as we find a divisor, we can determine that 16785408 is not prime much faster, on the scale of nanoseconds:

In [None]:
%%timeit


def is_prime(n):
    for i in range(2, n):
        if n % i == 0:
            return False
    return True


is_prime(16785408)
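
The `%%timeit` magic only works inside IPython or Jupyter. As a rough sketch, a similar comparison can be made in plain Python with the standard-library `timeit` module. The names `is_prime_slow` and `is_prime_fast`, the smaller test number, and the choice of five repetitions are assumptions made here so that the example runs quickly.

```python
import timeit


def is_prime_slow(n):
    """Check every candidate divisor, even after one is found."""
    prime = True
    for i in range(2, n):
        if n % i == 0:
            prime = False
    return prime


def is_prime_fast(n):
    """Return as soon as the first divisor is found."""
    for i in range(2, n):
        if n % i == 0:
            return False
    return True


n = 100_000  # even, so the fast version stops at i = 2
slow_time = timeit.timeit(lambda: is_prime_slow(n), number=5)
fast_time = timeit.timeit(lambda: is_prime_fast(n), number=5)
print(f'slow: {slow_time:.6f} s for 5 calls')
print(f'fast: {fast_time:.6f} s for 5 calls')
```

Unlike the cells above, this version defines the functions outside the timed statement, so only the calls themselves are measured. The exact times depend on your machine.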

### Directions: For the remainder of the exercises below, make sure to correct all linting errors.

Run this cell to turn flake8 back on in order to lint all the functions and code you write below:

In [None]:
%flake8_on
# you can use %flake8_off to turn linting OFF

### Part 2: Functions Review
Before we start our Scrabble problems, we want to make sure you get a quick review of functions. Recall that in your last assignment, you created frequency lists of how often items occurred. Let's write more general functions for this that you can reuse in the future.

#### 2.1. Write a function called `item_frequency` that takes in a list of items and returns a list of tuples containing (item, frequency) in descending order. DO NOT use the collections package. For example, 

```python
item_frequency(['hi', 'my', 'name', 'kanye', 'and', 'i', 'love', 'kanye', 'and', 'if', 'my', 'name', 'wasnt', 'kanye', 'then', 'i', 'would', 'change', 'it', 'to', 'kanye']) 
```

should return:

```python
[('kanye', 4),
 ('my', 2),
 ('name', 2),
 ('and', 2),
 ('i', 2),
 ('hi', 1),
 ('love', 1),
 ('if', 1),
 ('wasnt', 1),
 ('then', 1),
 ('would', 1),
 ('change', 1),
 ('it', 1),
 ('to', 1)]
 ```
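
Notice that in the expected output above, items with the same frequency appear in the order in which they first occur in the input. One way to get that ordering, offered here as a sketch rather than as the required approach, is to rely on the fact that Python's `sorted` is stable. The `pairs` list below is made-up data; the counting step is left to the exercise.

```python
# Hypothetical (item, count) pairs, listed in order of first appearance.
pairs = [('hi', 1), ('my', 2), ('kanye', 4), ('and', 2), ('love', 1)]

# sorted() is stable, even with reverse=True, so pairs with equal
# counts keep their original relative order.
ordered = sorted(pairs, key=lambda pair: pair[1], reverse=True)
print(ordered)
# [('kanye', 4), ('my', 2), ('and', 2), ('hi', 1), ('love', 1)]
```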

In [None]:
# insert exercise 2.1

def item_frequency(items: list):
    """Docstring here"""
    pass


To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.item_frequency_unit_test(item_frequency)

### 2.2. Redo the previous exercise but now use the Counter tool in the collections package as well as its most_common method. You can read its documentation here:

https://docs.python.org/3/library/collections.html

Note: Doing it this way should reduce your code to only two or three lines!

Call this new function **item_frequency_counter**.
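
As a quick illustration of the two tools named above, here is a small sketch on made-up data (the `colors` list is an assumption, not part of the assignment). Wrapping this in a function with the required name is left to you.

```python
from collections import Counter

colors = ['red', 'blue', 'red', 'green', 'blue', 'red']
counts = Counter(colors)          # maps each item to how often it occurs

print(counts['red'])              # 3
print(counts.most_common(2))      # [('red', 3), ('blue', 2)]
```

Calling `most_common()` with no argument returns every item, ordered from most to least frequent.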

In [None]:
# insert exercise 2.2

To test your function, run the cell below:

In [None]:
import Unit6UnitTests as tests
tests.item_frequency_counter_unit_test(item_frequency_counter)

### 2.3. Now, make a function called `word_frequency` that takes in a string and returns a list of tuples containing (word, frequency) in descending order. 

Your function word_frequency should call your `item_frequency` function that you already wrote. By doing this, your new function should only be two or three lines long.
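
To pass a string to a function that expects a list, you first need to break the string into pieces. The sketch below shows two built-in ways to do that on made-up strings. Whether the unit tests expect punctuation or spaces to be handled specially is not stated here, so treat this only as a starting point.

```python
sentence = 'the cat saw the other cat'

words_in_sentence = sentence.split()   # split on whitespace
letters_in_word = list('kanye')        # one element per character

print(words_in_sentence)
print(letters_in_word)
```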

In [None]:
# insert exercise 2.3

To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.word_frequency_unit_test(word_frequency)

### 2.4. Now, make a function called `letter_frequency` that takes in a string and returns a list of tuples containing (letter, frequency) in descending order.

Your function letter_frequency should call your `item_frequency` function that you already wrote. By doing this, your new function should only be two or three lines long.

In [None]:
# insert exercise 2.4

To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.letter_frequency_unit_test(letter_frequency)

### Part 3: Scrabble

Thanks to Scrabble$^{TM}$, we have easy access to a list of almost all the possible words in the English language. In this problem, you will use this list of over 260,000 words to find some very unusual words.

First, your program will read the entire SOWPODS list of acceptable Scrabble$^{TM}$ words. 

We'll learn about file input/output more later, but for now, you can use the following code to read in all of the words in the Scrabble dictionary and save them in a list called words.

In [96]:
# A with block closes the file automatically once the words are read
with open('sowpods.txt', 'r') as word_file:
    words = []
    for word in word_file:
        words.append(word.strip())

print(f'There are {len(words):,} words in the Scrabble Dictionary')
print(f"Some of these words are: {words[84231:84242]}")
print(f"Could I please have an {words[77610]}?")

There are 267,751 words in the Scrabble Dictionary
Some of these words are: ['fifi', 'fifing', 'fifteen', 'fifteener', 'fifteeners', 'fifteens', 'fifteenth', 'fifteenthly', 'fifteenths', 'fifth', 'fifthly']
Could I please have an espresso?


### 3.1. Write a function called `ith_word` that takes in an integer, `i`,  and returns the `i`th word in the words list.

What is the 10th word in the list of words?

In [None]:
# insert 3.1


To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.ith_word_unit_test(ith_word)

### 3.2. Write a function called `ith_to_last` that takes in an integer, `i`, and returns the `i`th to last word in the list.

What is the fifth to last word in the list of words? What does it mean? Write your answer about this word inside comment hashtags.

In [None]:
# insert 3.2
def ith_to_last(i):
    pass

To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.ith_to_last_unit_test(ith_to_last)

### 3.3. Write a function called `word_to_index` that takes in a `word` and returns the index of the word in the words list.

If that word does not appear in the list, return -1. What index is the word zymurgies in the list? What does the word mean? Write your answer about this word inside comment hashtags.
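
Keep in mind that the built-in `list.index` method raises a `ValueError` when the item is missing, rather than returning -1. The sketch below shows that behavior on a made-up list (`animals` is an assumption, not part of the assignment), along with a membership test that avoids the error.

```python
animals = ['ant', 'bee', 'cat']

print(animals.index('bee'))   # position of an item that is present
print('dog' in animals)       # membership test, no error raised

try:
    animals.index('dog')
except ValueError as error:
    print(f'ValueError: {error}')
```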

In [None]:
# insert 3.3

To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.word_to_index_unit_test(word_to_index)

### 3.4. Create a dictionary called `letter_value` for each letter's values according to the following values.

A=26, B=25, C=24, ..., Z=1


In [None]:
# insert 3.4

### 3.5. The dollar value of the word poo is 35, since p is worth 11 and each o is worth 12, so 35 = 11+12+12.

Create a function called `dollar_value` that takes in a `word` and returns the dollar value. You can assume that each letter of the word will appear in the letter_value dictionary.

Then calculate the dollar value of the word breatharian. What does the word mean? Write your answer about the word inside hashtags.
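
The arithmetic in the "poo" example can be double-checked with character codes, since A=26 down to Z=1 means a letter's value is 27 minus its position in the alphabet. This is only a sanity check: `letter_score` is a hypothetical helper, and the exercise still asks you to use your `letter_value` dictionary.

```python
def letter_score(letter):
    """Hypothetical helper: a -> 26, b -> 25, ..., z -> 1."""
    return ord('z') - ord(letter.lower()) + 1


print(letter_score('p'), letter_score('o'))            # 11 12
print(sum(letter_score(letter) for letter in 'poo'))   # 35
```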

In [None]:
# insert 3.5


To test your function, run the following cell. The test should pass:

In [None]:
tests.dollar_value_unit_test(dollar_value)

### 3.6. Write a function called `find_max` that takes in no input and returns the maximum dollar amount and the word corresponding to that dollar amount in the form of a tuple ```(maxdollar, maxword)```. 

DO NOT use any new dictionaries to write this function; rather, just call your dollar_value function repeatedly and use a for loop to keep track of the biggest dollar value and corresponding word that you have found so far. Which word has the highest dollar value, and what is that value? What does the word mean? Write these answers inside a comment.
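
The "keep track of the biggest so far" pattern looks like the sketch below. It uses made-up data and `len` as a stand-in scoring function; in your solution the score would come from `dollar_value` and the loop would run over `words`.

```python
fruits = ['fig', 'banana', 'kiwi', 'cherry']

best_length = 0
best_fruit = None
for fruit in fruits:
    # Strictly greater: on a tie, the first winner is kept.
    if len(fruit) > best_length:
        best_length = len(fruit)
        best_fruit = fruit

print((best_length, best_fruit))   # (6, 'banana')
```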

In [None]:
# insert 3.6

To test your function, run the following cell. The test should pass:

In [None]:
tests.find_max_unit_test(find_max)

### 3.7. Write a function called `word_dictionary` that takes in no input and returns a dictionary called `word_value` in which the keys are the dollar values and the dictionary values are the list of words with that dollar value. 
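
Grouping items under a shared key is a common dictionary pattern. Here is a minimal sketch on made-up data, grouping by word length instead of dollar value. Using `setdefault` is one of several possible approaches, not necessarily the intended one.

```python
fruits = ['fig', 'kiwi', 'plum', 'pear', 'apple']

by_length = {}
for fruit in fruits:
    # setdefault creates an empty list the first time a key is seen
    by_length.setdefault(len(fruit), []).append(fruit)

print(by_length)
# {3: ['fig'], 4: ['kiwi', 'plum', 'pear'], 5: ['apple']}
```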

In [None]:
# insert 3.7

To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.word_dictionary_unit_test(word_dictionary)

### 3.8. Create a function called `n_smallest` that takes in an integer, `n`, and returns a list of n tuples containing the n smallest dollar values and the n lists of words associated with them. 

Use your function to find the 10 smallest dollar values and the words associated with them. Use your word_dictionary function that you created above.
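
One possible way to pick the smallest (or largest) keys of a dictionary is to sort its items and slice the result. The sketch below uses a made-up dictionary named `stock`. The exact ordering that the unit tests expect is not stated here, so check your output against the test.

```python
stock = {30: ['pear'], 10: ['fig', 'kiwi'], 20: ['plum']}

# Tuples sort by their first element, here the integer key.
smallest_two = sorted(stock.items())[:2]
largest_two = sorted(stock.items(), reverse=True)[:2]

print(smallest_two)   # [(10, ['fig', 'kiwi']), (20, ['plum'])]
print(largest_two)    # [(30, ['pear']), (20, ['plum'])]
```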

In [None]:
# insert 3.8

To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.n_smallest_unit_test(n_smallest)

### 3.9. Create a function called `n_largest` that takes in an integer, `n`, and returns a list of n tuples containing the n largest dollar values and the n lists of words associated with them.

Use your function to find the 10 largest dollar values and the words associated with them. Use your word_dictionary function that you created above.

In [None]:
# insert 3.9

To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.n_largest_unit_test(n_largest)

### 3.10. Write a function called `max_words` that takes in nothing and finds the dollar value that has the most words associated with it.

The output of the function should be returned in the form of a tuple containing the dollar amount, followed by the number of words associated with that dollar amount. Hint: You may want to use the word_value dictionary you already have and use it to create a sorted list of tuples.
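
As an alternative to building a sorted list, the built-in `max` accepts a `key` function, which can pick the dictionary entry with the longest list directly. This is a sketch on a made-up dictionary named `groups`, not the approach the hint describes.

```python
groups = {10: ['fig', 'kiwi'], 20: ['plum'], 30: ['pear', 'lime', 'date']}

# Compare entries by the length of their list of words.
key, members = max(groups.items(), key=lambda item: len(item[1]))

print((key, len(members)))   # (30, 3)
```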

In [None]:
# insert 3.10

To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.max_words_unit_test(max_words)

### 3.11. Use matplotlib to create a histogram of the distribution of all of the dollar values for each of the words. Use 100 bins for your histogram. You should see a high peak around 129, since there are 2542 words with that dollar value.

In [None]:
# insert 3.11

### 3.12. Write a function called `is_prime` that takes in a positive integer and returns whether or not it is prime.

Use return commands in the appropriate places in order to increase efficiency and decrease the lines of code.

In [None]:
# insert 3.12

To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.is_prime_unit_test(is_prime)

### 3.13. Minimum Prime Dollar Words

Interestingly, Kanye's dollar value ($79) is also a prime number.  Write a function called `min_prime` that takes in two integers, m and n, and searches the entire SOWPODS list of words to find the prime number value between m dollars and n dollars that has the fewest words associated with it. 

You can assume that m and n are positive integers and that n is greater than or equal to m. If multiple prime numbers tie for the fewest words, return the numerically smallest of them.

Call the `is_prime` function that you have already written within this new function.

The output should be returned in the form of a tuple containing the prime number followed by the list of words associated with that prime number.

In [None]:
# insert 3.13

To test your function, run the following cell. The test should pass:

In [None]:
import Unit6UnitTests as tests
tests.min_prime_unit_test(min_prime)