## Python challenges

This notebook includes a series of challenges to test your Python coding skills. If you get stuck, try googling for answers. If you don't understand *why* a particular answer works, try searching for the answer to that question. Revisit old tutorials from this class as needed, and finally, turn to the course chatroom for help. Best of luck.

In [None]:
# required software
# conda install numpy pandas toyplot requests -c conda-forge

In [3]:
import requests
import numpy as np
import pandas as pd
import toyplot

### Challenge 1: 
Execute the code cell below to see an example of how it works. 
Use markdown in the cell after the code-block to describe the function `random_words_api`. Try to be descriptive about what each step of code in this function does, and why it works. 

In [4]:
def random_words_api(nwords=10):
    "no docstring"
    URL = "https://random-word-api.herokuapp.com/word"
    response = requests.get(url=URL, params={"number": nwords})
    return response.json()

# demonstration
random_words_api(5)

['helpers', 'lodestones', 'escalloping', 'paroxysmal', 'percuss']

**Description:**
1. Define a new function `random_words_api` whose argument `nwords` defaults to 10.
2. The function has only a placeholder docstring ("no docstring").
3. Assign the string https://random-word-api.herokuapp.com/word to the variable `URL`. This URL points to a web API hosted on the Heroku platform that returns random words when called.
4. Assign the result of `requests.get` to the variable `response`. `requests.get` sends a single HTTP GET request to the given url. The `params` dictionary is encoded into the URL's query string, so the request goes to `URL` with `?number=<nwords>` appended. The API reads the `number` parameter and returns that many random words in one response; the function does not call the URL `nwords` times.
5. Return `response.json()`, which parses the JSON body of the response into a Python object, here a list of strings.
6. Run the function to test it. In this test `nwords` is set to 5, so the function returns a list of 5 random words.
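The two steps that `requests` handles in this function can be sketched with the standard library alone, without touching the network. This is only an illustration: the JSON text in `body` is a made-up stand-in for what the server might send back.

```python
import json
from urllib.parse import urlencode

URL = "https://random-word-api.herokuapp.com/word"
params = {"number": 5}

# build the query string by hand; requests does the equivalent for `params`
full_url = URL + "?" + urlencode(params)
print(full_url)  # https://random-word-api.herokuapp.com/word?number=5

# stand-in for the text of a response body (not real API output)
body = '["helpers", "lodestones"]'

# response.json() parses JSON text like this into Python objects
words = json.loads(body)
print(words)  # ['helpers', 'lodestones']
```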

### Challenge 2: 
Use the `random_words_api` function to get 50 random words and store the result as a variable. Write a function that takes the list of words as input and returns a dictionary with the longest word as the key and the length of the longest word as a value. If there is a tie in the length of words then have it return additional words as keys with their lengths as values.

In [73]:
fiftywords = random_words_api(50) #generate list of 50 words

def longest_word(wordlist):
    "Returns longest word(s) from a list of words as a dictionary with key = longest word and value = length"
    wordict = {word: len(word) for word in wordlist} #convert word list to dictionary with lengths
    length = max(wordict.values()) #store the length of the longest word as integer
    #keep every word that ties for the longest length
    longestdict = {k: v for k, v in wordict.items() if v == length}
    return longestdict

#test the function
longest_word(fiftywords)

{'transferential': 14}
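The random draw above happened to produce a single longest word, so the tie case was not exercised. A minimal check, using a compact variant of the function (named `longest_words` here only for illustration) and a hand-picked word list:

```python
def longest_words(wordlist):
    "Return {word: length} for every word that ties for the longest length."
    longest = max(len(word) for word in wordlist)
    return {word: len(word) for word in wordlist if len(word) == longest}

# hand-picked list with two 5-letter words tied for longest
sample = ["video", "kytes", "oms", "lwei"]
result = longest_words(sample)
print(result)  # {'video': 5, 'kytes': 5}
```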

### Challenge 3: 
Write a function to take the list of words as input and trim all words to be at most 5 characters in length, and return as a list.

In [89]:
def word_trim(wordlist):
    "Trims all words in a list to maximum of 5 characters and returns them as a list"
    trimmed = [word[:5] for word in wordlist]
    return trimmed

#test the code
word_trim(fiftywords)

['signe',
 'lousi',
 'rhyth',
 'reech',
 'codec',
 'coxit',
 'humid',
 'lwei',
 'muchn',
 'proce',
 'disem',
 'praet',
 'edibl',
 'infus',
 'unbre',
 'postc',
 'flatl',
 'reuni',
 'longb',
 'multi',
 'kytes',
 'agiot',
 'alude',
 'busti',
 'canta',
 'athan',
 'video',
 'kalia',
 'fondl',
 'anabl',
 'ocell',
 'peddl',
 'modul',
 'narra',
 'wades',
 'resaw',
 'posad',
 'whapp',
 'laird',
 'brigh',
 'gayer',
 'trans',
 'tooll',
 'slopp',
 'fluen',
 'panto',
 'clept',
 'oosph',
 'obitu',
 'evagi']
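Slicing past the end of a string does not raise an error, which is why a word shorter than five characters (such as 'lwei' in the output above) passes through unchanged. A small illustration with three sample words:

```python
# sample words: shorter than, exactly, and longer than 5 characters
words = ["lwei", "video", "transferential"]

trimmed = [word[:5] for word in words]
print(trimmed)  # ['lwei', 'video', 'trans']
```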

### Challenge 4: 
Write a function to take a list of words as input and to count the occurrence of all letters in every word and return as a dictionary mapping letters to integers, e.g., {'a': 10, 'b': 3, 'c': 5, ...}.

In [208]:
import string

def letter_count(wordlist):
    "Counts letter occurrences across all words in a list of words"
    fullstring = "".join(wordlist) #combine all words into one string
    alphadict = {char:0 for char in string.ascii_lowercase} #start every letter at zero
    for letter in fullstring:
        if letter in alphadict:
            alphadict[letter] += 1
    return alphadict

#test function
letterdict = letter_count(fiftywords)
letterdict

{'a': 38,
 'b': 9,
 'c': 14,
 'd': 17,
 'e': 51,
 'f': 6,
 'g': 9,
 'h': 10,
 'i': 36,
 'j': 0,
 'k': 3,
 'l': 28,
 'm': 8,
 'n': 26,
 'o': 26,
 'p': 15,
 'q': 0,
 'r': 26,
 's': 39,
 't': 29,
 'u': 12,
 'v': 3,
 'w': 6,
 'x': 1,
 'y': 7,
 'z': 0}
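An alternative sketch uses `collections.Counter` from the standard library. It behaves differently from the function above in two ways: letters that never occur are left out rather than listed with a count of 0, and any character is counted, not only lowercase ASCII letters. The function name and sample input are chosen for illustration.

```python
from collections import Counter

def letter_count_counter(wordlist):
    "Count characters across all words; characters that never occur are omitted."
    return dict(Counter("".join(wordlist)))

sample = ["codec", "video"]  # small sample input
counts = letter_count_counter(sample)
print(counts)
```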

### Challenge 5:
Use [toyplot](https://toyplot.readthedocs.io/en/stable/tutorial.html) to create a barplot of the occurrences of each letter in your dictionary from the previous challenge. This will represent a histogram of the letters. Play with the size and color of the figure to try to make it look nice.

In [211]:
#create a list of integers, 1-26 for plotting
def alpha_list(r1, r2): 
    return list(range(r1, r2+1))
r1, r2 = 1, 26
nums = alpha_list(r1, r2)

In [205]:
letters = list(letterdict.keys()) #store keys as a list
lettercounts = list(letterdict.values()) #store values as a list

In [206]:
canvas = toyplot.Canvas(width=700, height=500)
axes = canvas.cartesian()
axes.label.text = "Letter Frequency"
axes.x.label.text = "letter"
axes.y.label.text = "count"
axes.x.ticks.show = True
axes.x.ticks.locator = toyplot.locator.Explicit(nums, letters) #set tick markings
colormap = toyplot.color.brewer.map("Spectral") #retrieve the "Spectral" Color Brewer palette
color = (nums, colormap) #map each bar's position (1-26) to a color in the palette
mark = axes.bars(nums, lettercounts, color=color)

In [None]:
#shortcut version: color each bar by its count, using the colormap defined above
toyplot.bars(lettercounts, color=(lettercounts, colormap), width=600, height=200);

### Challenge 6: 
Using numpy create a new variable called `arr` with 1000 random samples from a normal distribution. Use the numpy `.histogram` function to bin these values into 20 bins, and then plot the histogram using a barplot from toyplot. Color the bars of the histogram orange.

In [144]:
np.random.seed(2345) #numpy was imported as `np` at the top of the notebook
arr = np.random.normal(size=1000)

In [149]:
canvas = toyplot.Canvas(width=500, height=500)
axes = canvas.cartesian()
bars = axes.bars(np.histogram(arr, 20), color="orange") #histogram returns (counts, bin edges)
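To see what binning into 20 bins produces, here is a rough pure-Python sketch of an equal-width histogram. It is not NumPy's implementation, and results could differ from `np.histogram` for values that fall exactly on a bin edge because of floating-point rounding. The sample values come from `random.gauss`, so they are not the same numbers as `arr` above.

```python
import random

def histogram(values, nbins=20):
    "Count values in `nbins` equal-width bins spanning min(values) to max(values)."
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbins
    edges = [lo + i * width for i in range(nbins + 1)]
    counts = [0] * nbins
    for v in values:
        # cap the index so the maximum value lands in the last bin
        idx = min(int((v - lo) / width), nbins - 1)
        counts[idx] += 1
    return counts, edges

random.seed(2345)
samples = [random.gauss(0, 1) for _ in range(1000)]
counts, edges = histogram(samples, 20)
print(len(counts), len(edges), sum(counts))  # 20 21 1000
```

There is always one more edge than there are bins, and the counts add up to the number of samples.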

### Challenge 7: 
Write a `while` statement to continue running code in a loop until a condition is met, and then call `break` to end the loop. Inside of the loop, randomly draw a single value from a uniform distribution between 0 and 100. If the value is less than 25 and greater than 22 then break the loop, otherwise, continue the loop until a value meeting this condition is sampled. Use a variable to keep track of how many iterations of the loop are run, and print this value after calling `break`. 


In [189]:
import random

loops = 0
while True:
    loops += 1 #count every iteration, including the one that breaks
    i = random.uniform(0, 100)
    if 22 < i < 25: #value is greater than 22 and less than 25
        break
print(loops)

12
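Each draw has a 3-in-100 chance of landing between 22 and 25, so on average the loop should take about 100 / 3 ≈ 33 iterations, although any single run can be much shorter or longer. A small simulation sketch to check this; the helper name is made up, and it uses `return` in place of `break` so that the loop can be wrapped in a function.

```python
import random

def draws_until_hit(rng, low=22, high=25):
    "Count draws from uniform(0, 100) until one lands strictly between low and high."
    loops = 0
    while True:
        loops += 1
        if low < rng.uniform(0, 100) < high:
            return loops

rng = random.Random(0)  # seeded generator so the run is repeatable
trials = [draws_until_hit(rng) for _ in range(2000)]
mean_loops = sum(trials) / len(trials)
print(round(mean_loops, 1))  # expected to be close to 100 / 3
```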


### Challenge 8: 
Use pandas to load a CSV file from https://eaton-lab.org/data/iris-data-dirty.csv and save as a dataframe. Add custom names to the columns in the dataframe, based on the type of values in them (e.g., numeric versus strings). You can come up with any column names you want for these.

In [191]:
#load the data into a pandas dataframe; header=None because the file has no header row
iris = pd.read_csv("https://eaton-lab.org/data/iris-data-dirty.csv", header=None)

iris.columns = ["trait1", "trait2", "trait3", "trait4", "species"] #assign column names
iris.head() #check the column names

|   | trait1 | trait2 | trait3 | trait4 | species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |


### Challenge 9: 
Calculate the mean value of the data in the left-most column for all data where the right-most column matches the value "Iris-setosa". 

In [196]:
setosa = iris[iris['species'] == 'Iris-setosa'] #define new dataframe with setosa data only
setosa["trait1"].mean() #calculate mean of leftmost column

5.010204081632653

### Challenge 10:
Create a copy of your iris data dataframe and name it `df2`. Sort the rows of this dataframe based on the values in the first (leftmost) column so that the lowest values are first, and the highest values at the bottom. After sorting, reset the index of the dataframe so that the index is once again ordered. Once you get this to work, try to rewrite it in a simpler form by chaining multiple function calls together to accomplish the goal in one line, by calling code that looks a bit like this, but with the correct function calls: `df.function().function().function()`

In [201]:
#copy, sort, and reindex in one chained line; plain `df2 = iris` would not make a copy
df2 = iris.copy().sort_values(by='trait1', ascending=True).reset_index(drop=True)
df2

|   | trait1 | trait2 | trait3 | trait4 | species |
|---|---|---|---|---|---|
| 0 | 4.3 | 3.0 | 1.1 | 0.1 | Iris-setosa |
| 1 | 4.4 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 2 | 4.4 | 3.0 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.4 | 2.9 | 1.4 | 0.2 | Iris-setosa |
| 4 | 4.5 | 2.3 | 1.3 | 0.3 | Iris-setosa |
| ... | ... | ... | ... | ... | ... |
| 145 | 7.7 | 2.8 | 6.7 | 2.0 | Iris-virginica |
| 146 | 7.7 | 2.6 | 6.9 | 2.3 | Iris-virginica |
| 147 | 7.7 | 3.8 | 6.7 | 2.2 | Iris-virginica |
| 148 | 7.7 | 3.0 | 6.1 | 2.3 | Iris-virginica |
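Two details matter for this challenge: assigning a dataframe to a new name does not copy it, and `sort_values` and `reset_index` return new dataframes instead of changing the original in place. A minimal sketch on a small made-up frame (the values below are placeholders, not the iris data):

```python
import pandas as pd

# small made-up frame standing in for the iris data
df = pd.DataFrame({"trait1": [5.1, 4.3, 7.7], "species": ["a", "b", "c"]})

alias = df          # second name for the same object
copied = df.copy()  # independent copy

alias.loc[0, "trait1"] = 0.0    # also changes df
print(df.loc[0, "trait1"])      # 0.0
print(copied.loc[0, "trait1"])  # 5.1

# the chained calls return a new frame, so the result must be assigned
df2 = copied.sort_values(by="trait1").reset_index(drop=True)
print(df2["trait1"].tolist())   # [4.3, 5.1, 7.7]
```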


### Challenge 11:
Write a function that uses string formatting (curly braces) to create a [mad lib](https://en.wikipedia.org/wiki/Mad_Libs) containing at least 4 words that will be filled in. The returned object of your function should be a string where the missing words are filled by randomly sampled words from the `random_words_api()` function. The sentence or paragraph of your mad lib can be anything you wish, be creative. 

In [260]:
def mad_lib():
    "madlib of Shakespeare's Sonnet 18, filled using random_words_api()"
    madlist = random_words_api(23)
    
    print("Sonnet 18\n")
    print("Shall I compare thee to a", madlist[0]+"?")
    print("Thou art more", madlist[1], "and more", madlist[2]+":")
    print("Rough", madlist[15],"do shake the darling", madlist[3],"of May,")
    print("And", madlist[4]+"’s lease hath all too short a", madlist[16]+";")
    print("Sometime too hot the", madlist[5],"of", madlist[17],"shines,")
    print("And often is his gold", madlist[6],"dimm'd;")
    print("And every fair from fair sometime", madlist[7]+",")
    print("By chance or", madlist[8]+"’s changing", madlist[18],"untrimm'd;")
    print("But thy eternal", madlist[9],"shall not fade,")
    print("Nor lose possession of that", madlist[10],"thou ow’st;")
    print("Nor shall", madlist[19],"brag thou wander’st in his", madlist[11]+",")
    print("When in eternal", madlist[12],"to", madlist[20],"thou grow’st:")
    print("   So long as", madlist[21],"can breathe or", madlist[13],"can see,")
    print("   So long", madlist[22],"this, and this gives", madlist[14],"to thee.")
    print("\n\n~William Shakespeare..?")
    
#test code
mad_lib()

Sonnet 18

Shall I compare thee to a murex?
Thou art more excrescent and more tunability:
Rough softbounds do shake the darling chillness of May,
And pewees’s lease hath all too short a winterish;
Sometime too hot the oms of imbrown shines,
And often is his gold snick dimm'd;
And every fair from fair sometime ectozoan,
By chance or rachilla’s changing burrier untrimm'd;
But thy eternal hebetic shall not fade,
Nor lose possession of that flamines thou ow’st;
Nor shall capsulate brag thou wander’st in his delimitation,
When in eternal orphically to swords thou grow’st:
   So long as bakshish can breathe or mistrysts can see,
   So long parasexual this, and this gives sylvinite to thee.


~William Shakespeare..?
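The challenge asks for curly-brace string formatting and a returned string, whereas the function above prints each line. One possible sketch of that approach is shown here on a shortened template. The function name and the `sample_words` list are stand-ins for illustration; in the notebook the words would come from `random_words_api(4)`.

```python
def mad_lib_format(words):
    "Fill a four-blank template with the first four entries of `words` and return the string."
    template = (
        "Shall I compare thee to a {}?\n"
        "Thou art more {} and more {}:\n"
        "Rough {} do shake the darling buds of May,"
    )
    return template.format(*words[:4])

# stand-in list so the example runs without a network request
sample_words = ["murex", "excrescent", "tunability", "softbounds"]
poem = mad_lib_format(sample_words)
print(poem)
```

Returning the string leaves it to the caller to print it, save it, or test it.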


<div class="alert alert-success">
After completing all challenges in this notebook, save and download the .ipynb file to your computer. Move the file to your hack-program repo and put it in a folder called notebooks. Add/stage this file and folder and commit the change, and push to GitHub. The assignment is due by end of day on 3/7/2021.  
</div>