# Case 1. Python basics

In this case, we'll apply what you've learned about programming so far and start using **functions**. Functions are building blocks you can use and re-use in your code. At this stage in your programming journey we encourage you to use mostly loops and if-else statements for your calculations. This will force you to build up your code in chunks that will ultimately connect together to produce the desired result. It will also give you a better mental model of what Python is doing "under the hood" when using functions and methods from tools like `pandas` and `numpy` to solve complex problems.
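
As a small illustration of what "under the hood" means here, the sketch below adds up a list of numbers with an explicit loop and then with Python's built-in `sum`. The list `numbers` and the variable names are made up for this example.

```python
numbers = [3, 5, 6]

# manual approach: accumulate the total one element at a time
total = 0
for n in numbers:
    total = total + n

# built-in approach: `sum` gives the same result in a single call
builtin_total = sum(numbers)

print(total, builtin_total)
```

Both approaches give the same answer. Writing the loop yourself makes each step of the calculation visible.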

## For local analysis

If you prefer to develop your code in a Jupyter notebook in VS Code, you can download <a href="cases/mini_case_1.ipynb">mini_case_1.ipynb</a>. This notebook file has the same text and code snippets that you see on this webpage. After writing the required code, copy and paste it into the code inputs on this webpage for testing.

## Calculate word length

In [3]:
# Question 1.01: Create a function that returns a dictionary of word lengths,
# with the words themselves as the keys. Use a loop to evaluate each word
# separately using the `len` function. For example, `len("something")` would
# return 9. Initialize a dictionary `wl` with one entry per word in the list
# passed to your function and fill it with the number of characters in each
# word. Note that your function should work with any list of one or more words.

def word_length(words):
    # initialize a dictionary that will hold the word length information
    # recall that Python doesn't have named vectors, so we use a dictionary
    # keyed by word and create it with a dictionary comprehension
    wl = {w: 0 for w in words}

    # fill in the number of characters for each word
    for w in words:
        wl[w] = len(w)

    # return the dictionary of word lengths
    return wl


word_length(["mouse", "king"])  # should return 5 and 4 in a dictionary
word_length(["tax", "house", "purple"])  # should return 3, 5, and 6 in a dictionary

{'tax': 3, 'house': 5, 'purple': 6}
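
One detail to keep in mind when storing word lengths in a dictionary: keys are unique, so a word that appears more than once in the list ends up as a single entry. The sketch below uses a made-up list and builds the dictionary directly, so it runs on its own.

```python
words = ["tax", "house", "tax"]  # "tax" appears twice

lengths = {}
for w in words:
    lengths[w] = len(w)  # the second "tax" overwrites the first entry

print(lengths)
print(len(words), len(lengths))
```

The list has three elements but the dictionary has only two entries. This matters whenever a later calculation assumes one entry per input word.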

## Longest word (Python)

You have now developed a function `word_length` that works similarly to `nchar` in R. That is, `word_length(["a", "we", "are"])` will return a dictionary with the values 1, 2, and 3, keyed by the corresponding words. We will use this function in the remaining exercises below.
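
The exercises below pull information out of such a dictionary in different ways. As a quick reference, this sketch shows the three standard dictionary views on a hand-written dictionary `wl` that mirrors the example above.

```python
wl = {"a": 1, "we": 2, "are": 3}

keys = list(wl.keys())      # the words
values = list(wl.values())  # the word lengths only

# .items() yields (key, value) pairs, handy when you need both at once
pairs = []
for word, n in wl.items():
    pairs.append(f"{word} has {n} characters")

print(keys)
print(values)
print(pairs)
```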


In [4]:
# Question 1.02: Use your `word_length` function to calculate the length of two
# words in a list. Then compare the length of the words and `return` a string
# that describes (1) if one word is longer than the other and (2) how long both
# words are if they are equally long. The examples shown below indicate
# **exactly** what text you should generate. This includes the number of
# spaces! In this function you will need to use `if-else` statements and
# combine strings and numbers using f-strings or concatenation (`+`)

def longest_word(words):
    # the comparison below only makes sense for exactly two words
    if len(words) != 2:
        mess = "Sorry. Please provide exactly two words to compare"
    else:
        w_l = word_length(words)
        if w_l[words[0]] == w_l[words[1]]:
            mess = "Both words are " + str(w_l[words[0]]) + " characters long"
        elif w_l[words[0]] > w_l[words[1]]:
            mess = "The first word is longest, " + str(w_l[words[0]]) + " > " + str(w_l[words[1]])
        else:
            mess = "The second word is longest, " + str(w_l[words[0]]) + " < " + str(w_l[words[1]])

    return mess  # return message as a string


print(longest_word(["mouse", "king"])) # should return "The first word is longest, 5 > 4"
print(longest_word(["tax", "house"]))  # should return "The second word is longest, 3 < 5"
print(longest_word(["purple", "orange"]))  # should return "Both words are 6 characters long"
print(longest_word(
    ["purple", "orange", "red"]
))  # should return "Sorry. Please provide exactly two words to compare"

The first word is longest, 5 > 4
The second word is longest, 3 < 5
Both words are 6 characters long
Sorry. Please provide exactly two words to compare
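
The question above mentions two ways to combine strings and numbers. As a brief comparison, the sketch below builds the same message with concatenation and with an f-string. The variables `first` and `second` are placeholders for two word lengths.

```python
first = 5
second = 4

# concatenation: numbers must be converted with str() before using `+`
concat = "The first word is longest, " + str(first) + " > " + str(second)

# f-string: values inside {} are converted to text automatically
fstring = f"The first word is longest, {first} > {second}"

print(concat)
print(concat == fstring)
```

Both versions produce identical text, including the spaces, so either can be used where exact output is required.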


## Shortest word

Our `longest_word` function works fine if we only have two words to compare. What if we want to find the length of the shortest word in a list of any length?


In [5]:
# Question 1.03: Again, use your `word_length` function to calculate the number
# of characters in each word passed to the `shortest_word` function. Then use a
# loop to determine the length of the shortest word. You should only use the
# `word_length` function, a for loop, and an if-statement to complete this
# exercise. You may also want to iterate over only the word-length values by
# using the `.values()` method for a dictionary

from math import inf

def shortest_word(words):
    wl = word_length(words)
    s = inf
    for v in wl.values():
        if v < s:
            s = v

    return s  # length of the shortest word


print(shortest_word(["mouse", "king"]))  # should return 4
print(shortest_word(["tax", "house"]))  # should return 3
print(shortest_word(["purple", "orange"]))  # should return 6


4
3
6
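
The reason for starting from `inf` is that it compares greater than any finite number, so the first value in the loop always replaces it. The sketch below shows this on a plain list of made-up lengths and uses the built-in `min` only as an independent check.

```python
from math import inf

lengths = [5, 4, 7]

smallest = inf  # larger than any finite number
for n in lengths:
    if n < smallest:
        smallest = n

print(smallest)
print(smallest == min(lengths))

# with no values to loop over, the starting value is returned unchanged
empty_result = inf
for n in []:
    if n < empty_result:
        empty_result = n
print(empty_result)
```

Note the last case: for an empty input the loop body never runs, so the result stays at `inf`.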


## Shortest words

The `shortest_word` function works if we only want the length of the shortest word. Things become a bit more complex if we want the lengths of both the shortest and the next shortest word.


In [6]:
# Question 1.04: Again, use your `word_length` function to calculate the number
# of characters in each word passed to the `shortest_words` function. Then use
# a loop to determine the length of the shortest word. You will also need to
# update the length of the next shortest word as you iterate through the word
# lengths. You should only use the `word_length` function, a for loop, and an
# if-else-if statement to complete this exercise

from math import inf


def shortest_words(words):
    wl = word_length(words)
    s = inf  # initial value for shortest word length
    ns = inf  # initial value for next shortest word length

    for w in wl.values():
        if w < s:
            # new shortest: the old shortest becomes the next shortest
            ns = s
            s = w
        elif w < ns:
            ns = w

    return [s, ns]  # length of the shortest and next shortest word


print(shortest_words(["mouse", "king", "on"]))  # should return [2, 4]
print(shortest_words(["tax", "house", "blue"]))  # should return [3, 4]
print(shortest_words(["purple", "red", "turquoise"]))  # should return [3, 6]
print(shortest_words(["blue", "house", "king"]))  # should return [4, 4]

[2, 4]
[3, 4]
[3, 6]
[4, 4]
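
To see that the two-variable update handles ties and arbitrary orderings, the sketch below applies the same rule to plain lists of lengths and compares each result with `sorted(...)[:2]`. The helper name `two_smallest` and the test lists are made up for this check. Sorting is used here only for verification and is not part of the exercise.

```python
from math import inf


def two_smallest(lengths):
    # hypothetical helper: same update rule as above, applied to a plain list
    s = inf
    ns = inf
    for w in lengths:
        if w < s:
            ns = s  # the old shortest becomes the next shortest
            s = w
        elif w < ns:
            ns = w
    return [s, ns]


checks = []
for lengths in [[5, 4, 2], [3, 5, 4], [4, 5, 4]]:
    checks.append(two_smallest(lengths) == sorted(lengths)[:2])

print(checks)
```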


## Shortest actual words (Python)

The `shortest_words` function provides the lengths of the two shortest words in a list. Now write a new function called `shortest_actual_words` that returns the shortest and next shortest words.


In [7]:
# Question 1.05: Create a function called `shortest_actual_words` that returns
# the actual shortest and next shortest words rather than their word lengths.
# Note: Your function should use `for` and `if-elif` statements but no other
# functions from Python.

from math import inf


def shortest_actual_words(words):
    wl = word_length(words)
    s = inf  # initial value for shortest word length
    ns = inf  # initial value for next shortest word length
    sw = ""  # initial value for shortest actual word
    nsw = ""  # initial value for next shortest actual word
    # iterate over (word, length) pairs to track the words as well as lengths
    for x, w in wl.items():
        if w < s:
            ns = s
            s = w
            nsw = sw
            sw = x
        elif w < ns:
            ns = w
            nsw = x

    return [sw, nsw]


print(shortest_actual_words(["mouse", "king", "on"]))  # should return ["on", "king"]
print(shortest_actual_words(["tax", "house", "blue"]))  # should return ["tax", "blue"]
print(shortest_actual_words(["purple", "red", "turquoise"]))  # should return ["red", "purple"]
print(shortest_actual_words(["purple", "red", "tax"]))  # should return ["red", "tax"]

['on', 'king']
['tax', 'blue']
['red', 'purple']
['red', 'tax']
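
One way to sanity-check these results is to sort the words by length and keep the first two. This is an independent check rather than a solution that satisfies the exercise constraints, and the helper name `two_shortest_by_sorting` is made up for this sketch. Python's sort is stable, so words of equal length keep their original order.

```python
def two_shortest_by_sorting(words):
    # hypothetical helper: sort a copy by length and keep the first two words
    return sorted(words, key=len)[:2]


print(two_shortest_by_sorting(["mouse", "king", "on"]))
print(two_shortest_by_sorting(["tax", "house", "blue"]))
print(two_shortest_by_sorting(["purple", "red", "turquoise"]))
print(two_shortest_by_sorting(["purple", "red", "tax"]))
```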


## Mean and median word length (Python)

To wrap up this section on Python coding, you will calculate the mean and median word lengths.


In [8]:
# Question 1.06: Create a function called `mean_word_length` that returns the
# average length from a list of words. Note: Your function should use a for
# loop, `word_length`, and `len` but no other functions from Python. Since you
# only need the word length values here you can use the `.values()` method for
# a dictionary

def mean_word_length(words):
    wl = word_length(words)
    wl_sum = 0
    for l in wl.values():
        wl_sum = wl_sum + l

    wl_mean = wl_sum/len(words)

    # return an integer when the mean is a whole number (e.g., 4 rather than 4.0)
    if wl_mean % 1 == 0:
        return int(wl_mean)
    else:
        return round(wl_mean, 6)


print(mean_word_length(["mouse", "king", "on"]))  # should return 3.666667
print(mean_word_length(["tax", "house", "blue"]))  # should return 4
print(mean_word_length(["purple", "red", "turquoise"]))  # should return 6


3.666667
4
6
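
When deciding whether a mean such as `4.0` should be shown as `4`, the test needs to detect whole numbers rather than even numbers. The sketch below compares `% 2`, `% 1`, and the float method `is_integer()` on three made-up values.

```python
values = [4.0, 5.0, 11 / 3]

even_checks = []
whole_checks = []
method_checks = []
for value in values:
    even_checks.append(value % 2 == 0)    # True only for even whole numbers
    whole_checks.append(value % 1 == 0)   # True for any whole number
    method_checks.append(value.is_integer())

print(even_checks)
print(whole_checks)
print(method_checks)
```

Only the `% 1` test and `is_integer()` recognize `5.0` as a whole number.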


In case you are not yet familiar with them, you may want to review the documentation for <a href="https://docs.python.org/3/library/math.html" target="_blank">floor</a> and the <a href="https://docs.python.org/3/howto/sorting.html" target="_blank">sort</a> method for lists when you try to calculate the median word length. You can also apply the `sorted` function to a list in Python.
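
As a quick orientation, the sketch below shows how these tools behave on a small made-up list: `sorted` returns a new list, `sort` changes the list in place and returns `None`, and `floor` rounds down to the nearest integer.

```python
from math import floor

lengths = [5, 2, 4]

ordered = sorted(lengths)  # new sorted list; `lengths` is unchanged here
print(ordered, lengths)

result = lengths.sort()    # sorts `lengths` itself and returns None
print(result, lengths)

# floor is useful for turning a list length into a middle index
mid_odd = floor(3 / 2)
mid_even = floor(4 / 2)
print(mid_odd, mid_even)
```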


In [9]:
# Question 1.07: Create a function called `median_word_length` that returns the
# median length from a list of words. Note: Your function should use an
# `if-else` statement, `word_length`, `len`, `sort`, and `floor` but no other
# functions from Python

from math import floor


def median_word_length(words):
    wl = list(word_length(words).values())
    wl.sort()  # recall that lists are mutable and so are sorted in-place
    n = len(wl)

    mid = floor(n / 2)  # index of the middle (odd n) or upper-middle (even n) value
    if n % 2 == 1:
        wls_median = wl[mid]
    else:
        wls_median = (wl[mid - 1] + wl[mid]) / 2

    return wls_median


print(median_word_length(["mouse", "on"]))  # should return 3.5
print(median_word_length(["tax", "house", "blue"]))  # should return 4
print(median_word_length(["purple", "red", "turquoise"]))  # should return 6

3.5
4
6
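
Once your own functions work, you can cross-check them against Python's standard `statistics` module. The sketch below works on hand-written lists of word lengths that correspond to the examples above. It is meant as a verification aid, not as a replacement for the loop-based solutions.

```python
from statistics import mean, median

# word lengths for ["mouse", "king", "on"] and ["mouse", "on"]
mean_check = round(mean([5, 4, 2]), 6)
median_even = median([5, 2])

# word lengths for ["tax", "house", "blue"]
median_odd = median([3, 5, 4])

print(mean_check)
print(median_even)
print(median_odd)
```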
