# Intro to Python 2

In [None]:
#my favorite coffee shop#
toaster_pastries = [("strawberry", 2.50), ("blueberry", 2.23), ("cherry", 2.97), ("campfire smore", 1.98), ("oreo supreme", 4.22), ("ice cream sundae", 2.49)]
coffee = [("small", 1.39), ("medium", 1.59), ("large", 1.79)]

#my state's sales tax#
tax = .045

## Answering Questions with Programming 

Imagine you want to quickly find out which combination of toaster pastries and coffee is the most economical if you are allergic to fruit. While some quick mental math will tell us that the "smore" flavor and a small coffee is the right option, creating code that will lead us to the same answer will give us a good starting point in case the data ever becomes complex. In this tutorial, we will create **loops** that use conditionals, define **functions** that will perform a series of tasks, and create **dictionaries** and **tuples** to help order complex data.

## Loops

Returning to the problem of which non-fruit flavor and coffee combination is the most economical, let's consider the steps we need to solve this problem.
1.) We need to **distinguish** "fruit-flavored" varieties of toaster pastries from non-fruit-flavored ones.
2.) We need to **find** the lowest price among the remaining toaster pastries.
3.) We need to **add** that price to the price of the lowest-priced coffee.
4.) We need to **multiply** that sum by the state tax, add the result back to the sum, and report the total.
If we look at the actions bolded above, it becomes clear that our last two actions are arithmetic, meaning that they're relatively easy to code. While starting in reverse order is not always advisable, it is good to have a clear idea of where your code will end.

In [None]:
total = nf_toaster_pastry + lp_coffee
total_with_tax = (total * tax) + total
print(total_with_tax)

This code won't run yet, but we know that, at some point, we will need to define variables called nf_toaster_pastry and lp_coffee. In the first line, we suggest that the cost of the non-fruit toaster pastry and the lowest-priced coffee can be added together to create our total. In the second line, we suggest that multiplying the total by the tax and then adding the total to that number will produce the total with tax. In the final line, we print total_with_tax.
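To check the arithmetic before the real variables exist, here is a minimal sketch that plugs in placeholder prices. The names pastry_price and coffee_price and their values are hypothetical stand-ins, not items from the menu.

```python
# Hypothetical placeholder prices, used only to check the arithmetic.
pastry_price = 2.00
coffee_price = 1.00
tax = 0.045  # same 4.5% sales tax defined at the top of the notebook

total = pastry_price + coffee_price

# Multiply the total by the tax rate, then add the total back on.
total_with_tax = (total * tax) + total

# Multiplying by (1 + tax) is an equivalent, shorter way to write it.
total_with_tax_short = total * (1 + tax)

print(total_with_tax)
print(total_with_tax_short)
```

Both forms should agree up to floating-point rounding, so either one can be used for the final calculation.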

Now, all we need is to perform the processes that will give us nf_toaster_pastry and lp_coffee.

Defining lp_coffee is the easier of the two. There are many approaches we can take, but some are certainly better than others.

In [None]:
min(coffee, key = lambda entry: entry[1])

Using Python's **min** function allows us to find the lowest-priced coffee without resorting to an especially long loop. What may be confusing, however, is the "key" section of the code. What would happen if we just asked for min(coffee)?

In [None]:
min(coffee)

As you can see, ("large", 1.79) is returned as an answer. Python compares tuples element by element, starting with the first, so min(coffee) picks the tuple whose name comes first alphabetically ("large" comes before "medium" and "small"). What we actually want is the minimum of the price attached to each tuple (1.39, 1.59, and 1.79). Setting a key, as in the first example, tells min to compare entry[1], the second element of each tuple, which gives us the minimum we want.
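A short sketch can make the difference between the two calls concrete. It reuses the coffee list from the top of the notebook; the variable names alphabetical_min and cheapest are only for illustration.

```python
coffee = [("small", 1.39), ("medium", 1.59), ("large", 1.79)]

# Tuples are compared element by element, starting with the first.
# "large" sorts before "small" alphabetically, so this comparison is True
# even though 1.79 is greater than 1.39.
print(("large", 1.79) < ("small", 1.39))

# Without a key, min() relies on that same tuple comparison.
alphabetical_min = min(coffee)

# With a key, min() compares only the price stored in position 1.
cheapest = min(coffee, key=lambda entry: entry[1])

print(alphabetical_min)
print(cheapest)
```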

To illustrate this more, please modify the code below so that we can determine which toaster pastry flavor is the least expensive and, using the min function's cousin **max**, which is the most calorically dense.

In [None]:
test_flavors_and_cals = [("galactic grape", 2.56, 433), ("persuasive plum", 3.45, 412), ("mindful apple", 3.98, 398), ("sullen cranberry", 1.23, 432), ("artful linzer tart", 5.99, 349)]
least_expensive = min()
most_cal = max()
print(least_expensive)
print(most_cal)

[ANSWER BELOW, DON'T SCROLL UNLESS YOU'VE TRIED THIS]

In [None]:
test_flavors_and_cals = [("galactic grape", 2.56, 433), ("persuasive plum", 3.45, 412), ("mindful apple", 3.98, 398), ("sullen cranberry", 1.23, 432), ("artful linzer tart", 5.99, 349)]
least_expensive = min(test_flavors_and_cals, key = lambda entry: entry[1])
most_cal = max(test_flavors_and_cals, key = lambda entry: entry[2])
print(least_expensive)
print(most_cal)

So, returning to our code so far, we can confidently define lp_coffee as the result of the min expression we used for coffee earlier. As we might note, however, lp_coffee is a tuple. Attempting an arithmetic operation on a tuple will not go well (what is ("small", 1.39) + nf_toaster_pastry?). We'll want to revise our code slightly to avoid future errors.
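To see what "will not go well" looks like in practice, here is a small sketch. The value of pastry_price is a hypothetical stand-in for the pastry price we have not computed yet.

```python
lp_coffee = ("small", 1.39)
pastry_price = 1.98  # hypothetical stand-in for the pastry price

try:
    # Adding a whole tuple to a number raises a TypeError.
    total = lp_coffee + pastry_price
except TypeError as error:
    total = None
    print("Adding a tuple and a float fails:", error)

# Indexing pulls out just the price, which can be added normally.
fixed_total = lp_coffee[1] + pastry_price
print(fixed_total)
```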

In [None]:
lp_coffee = min(coffee, key = lambda entry: entry[1])
#by adding a [1] below, we clarify that we only want to perform an arithmetic operation on the second element (the price).
total = nf_toaster_pastry + lp_coffee[1]
total_with_tax = (total * tax) + total
print(total_with_tax)

Okay, we have only one task left: distinguishing fruit-flavored varieties from non-fruit-flavored ones. While specific forms of entity recognition or word embeddings may help in this task, Python does not have a function built to categorize food. We may be stuck with making lists.

Intuitively, it is easier to list items we consider to be fruits instead of listing all of the items that are not fruits. We can also make the assumption that toaster pastry fruit flavors will generally cater more towards fruits that are commonly eaten in their specific markets; accounting for "durians" may not be necessary. While string-matching isn't the most expensive process out there, using a list of 340 fruits to detect toaster pastry flavors is excessive.

Taking about four minutes for each list, try making: 1.) a list of fruits from memory 2.) a list of fruits from Wikipedia or a grocery website.

In [None]:
list1 = []
list2 = []
list3 = ["Apples", "Apricots", "Banana", "Blueberry", "Cantaloupe", "Cherry", "Date", "Palm", "Australian Native Citrus", "Avocado",
         "Carissa", "Carob", "Cattleya Guava", "Ceriman", "Cherry of the Rio Grande", "Citron", "Clementines", "Cordia", "Crabapple", 
         "Green Grapes", "Grapefruit", "Honeydew Melon", "Lemons", "Limes", "Oranges", "Mandarin", "Mangos", "Papayas", "Peaches", 
         "Pears", "Pineapple", "Plantains", "Plums", "Pomelo", "Red Grapes", "Strawberry", "Tangarines", "Watermelon"]

Judging by list3 alone, there are a few processing steps we'll need to take when trying to match these strings to those in the toaster_pastries list. First, all of these entries begin with capital letters. Second, many are pluralized, which goes against flavor naming conventions ("Apples" is not in "atomic apple").

Python's **lower** string method will let us take on this first issue. Calling **lower** on a string returns a copy of that string with every letter in lowercase. A method called **upper** does the opposite.
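As a quick sketch of both methods, the snippet below calls them on a single term. The variable names lowered and uppered are only for illustration.

```python
term = "Apples"

# Each method returns a new string; neither changes the original.
lowered = term.lower()
uppered = term.upper()

print(lowered)
print(uppered)
print(term)  # still "Apples", because strings are immutable
```

Because the original string is left untouched, the result has to be stored in a variable (or used right away) for the change to have any effect.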

In [None]:
test_string = "Look at those apples"
term = "Apples"

term in test_string

In [None]:
term.lower() in test_string

Next, we want to remove the "s" from many of the entries in list3. While this will work for most cases, "Peaches" and "Citrus" each present unique problems. Stripping "es" from every term that ends with it would give us "Peach", but it would also give us "Appl", "Clementin", "Grap", "Lim", "Orang", "Red Grap", and "Tangarin". While these stems may seem good enough for string matching, a flavor called "Applied Tiramisu" or "Limitless Fudge Cake" would be misidentified as a fruit flavor.
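The false matches described above are easy to reproduce. In this sketch the lists are hypothetical test data: the stems are what stripping "es" would leave behind, and the singulars are what we would actually want.

```python
# Hypothetical test data for comparing stems against proper singulars.
non_fruit_flavors = ["applied tiramisu", "limitless fudge cake"]
stems = ["appl", "lim"]        # left over after stripping "es"
singulars = ["apple", "lime"]  # the forms we actually want

stem_hits = []
singular_hits = []
for flavor in non_fruit_flavors:
    for stem in stems:
        if stem in flavor:
            stem_hits.append(flavor)
    for singular in singulars:
        if singular in flavor:
            singular_hits.append(flavor)

print(stem_hits)      # flavors wrongly flagged as fruit by the stems
print(singular_hits)  # flavors flagged by the proper singular forms
```

The stems flag both non-fruit flavors, while the proper singular forms flag neither.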

We have a few options to proceed, all of which have their drawbacks:
1.) Hand-correct the data. This will give us accuracy, but will take the longest amount of time.
2.) Implement an imperfect rule that will remove the "s" from every string that ends with "s", even though we know that it will leave "Australian Native Citru" and "Peache".
2a.) Implement this rule with the condition that it ignore words that end in "us", giving us "Peache" as our only error.
2b.) Implement this rule with the conditions that it ignore words that end in "us" and that, if it encounters the word "Peaches", it should just return "peach".
3.) Import a word tree or dictionary that will return the actual singular version of each pluralized term. We would, however, likely run into an error with "Australian Native Citrus".
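Option 3 would normally rely on an outside resource. As a rough stand-in, the sketch below uses a tiny hand-built lookup dictionary and falls back on the remove-the-"s" rule for everything else. Both irregular_plurals and singularize are hypothetical names made up for this illustration, and a real word list would be far larger.

```python
# Hypothetical hand-built lookup of terms the simple rule gets wrong.
irregular_plurals = {
    "peaches": "peach",
    "australian native citrus": "australian native citrus",
}

def singularize(term):
    """Return a lowercase singular form using the lookup, then the fallback rule."""
    term = term.lower()
    if term in irregular_plurals:
        return irregular_plurals[term]
    if term.endswith("s"):
        return term[:-1]
    return term

print(singularize("Peaches"))
print(singularize("Apples"))
print(singularize("Australian Native Citrus"))
```

The trade-off is the same one noted in option 1: every entry in the lookup has to be added by hand or supplied by someone else's word list.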

Since there isn't necessarily a right answer, we're going to see how some of these are implemented in code. You'll notice that **if**, **else**, and **elif** (else if) are used to set the various conditions mentioned above.

In [None]:
#2#
for fruit in list3:
    fruit = fruit.lower()
    if fruit[-1] == "s":
        print(fruit[:-1])
    else:
        print(fruit)


In [None]:
#2a#
for fruit in list3:
    fruit = fruit.lower()
    if fruit[-1] == "s":
        if fruit[-2:] == "us":
            print(fruit)
        else:
            print(fruit[:-1])
    else:
        print(fruit)

In [None]:
#2b#
for fruit in list3:
    fruit = fruit.lower()
    if fruit[-1] == "s":
        if fruit[-2:] == "us":
            print(fruit)
        elif fruit == "peaches":
            print("peach")
        else:
            print(fruit[:-1])
    else:
        print(fruit)

For the time being, let's say that option 2b is what we want to use. Generally, it's not the best practice to write code that will correct specific instances and nuances in your dataset. For one, it's computationally inefficient to check, in a list of a million words, let's say, whether each word is "peaches" before moving on to the next part of a loop. Second, it shows a level of involvement and familiarity with the dataset that makes the code less generalizable than it could be.

## Defining Functions

Since copying and repeatedly pasting our code would be tedious, we're going to define its operations as a **function** called fruit_singulizer. This will allow us to call the function at any time and have the same operation performed.

In [None]:
def fruit_singulizer(fruit):
    fruit = fruit.lower()
    if fruit[-1] == "s":
        if fruit[-2:] == "us":
            return(fruit)
        elif fruit == "peaches":
            return("peach")
        else:
            return(fruit[:-1])
    else:
        return(fruit)

Starting a line with "def" tells Python that you're beginning to define a new function. Rather than give that function a specific variable to work with, you want to give it a generic input. That is, rather than having this function only work with list3, I'm having it take any string as an input. For convenience's sake, I've used "fruit" as an input here. Let's see how this works in practice:
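Before running the function over a whole list, it can help to call it on single strings. The sketch below repeats the definition so that it runs on its own; the variable name single is only for illustration. Note that the function returns its result rather than printing it, so the value has to be stored or passed to print to be seen.

```python
def fruit_singulizer(fruit):
    fruit = fruit.lower()
    if fruit[-1] == "s":
        if fruit[-2:] == "us":
            return(fruit)
        elif fruit == "peaches":
            return("peach")
        else:
            return(fruit[:-1])
    else:
        return(fruit)

# The returned value can be stored in a variable and reused later.
single = fruit_singulizer("Plums")
print(single)

# Or it can be passed straight to print.
print(fruit_singulizer("Peaches"))
print(fruit_singulizer("Australian Native Citrus"))
```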

In [None]:
for fruit in list3:
    print(fruit_singulizer(fruit))

As an exercise, try modifying the existing fruit_singulizer function to be compatible with your entries for list1 and list2.

Now that we have standardized our lists, we can begin string matching in earnest. We can create a new function called is_fruit that will take two inputs, an item and a collection of names to check it against, and return whether or not the item is found there.

In [None]:
def is_fruit(fruit, flavors):
    return(fruit in flavors)

With this last component, we can start putting our code together to sort fruit and non-fruit toaster pastry flavors. In the first line of the code below, we use Python's **map** function to create a new list. Much like when we used lambda to find the lowest coffee price, we're using it here to create a new list that contains the first part of every tuple in toaster_pastries.

For the list fruits, we use a **list comprehension**, a condensed version of a for-loop that creates a new list. In this short command, we say that for every item in list3, we want the fruit_singulizer version of that item.
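To show how these tools relate, here is a sketch that builds the same list of flavor names three ways: with map, with a list comprehension, and with a plain for-loop. It uses a shortened copy of toaster_pastries, and the three result names are only for illustration.

```python
# Shortened copy of the menu, just for this comparison.
toaster_pastries = [("strawberry", 2.50), ("blueberry", 2.23), ("cherry", 2.97)]

# 1. map applies the lambda to every tuple; list() collects the results.
flavors_from_map = list(map(lambda x: x[0], toaster_pastries))

# 2. A list comprehension says the same thing in one expression.
flavors_from_comprehension = [entry[0] for entry in toaster_pastries]

# 3. The long-hand for-loop that both of the above condense.
flavors_from_loop = []
for entry in toaster_pastries:
    flavors_from_loop.append(entry[0])

print(flavors_from_map)
print(flavors_from_comprehension)
print(flavors_from_loop)
```

All three produce the same list, so the choice between them is mostly a matter of readability.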

In [None]:
flavors = list(map(lambda x: x[0], toaster_pastries))
fruits = [fruit_singulizer(item) for item in list3]

for flavor in flavors:
    if flavor not in fruits:
        print(flavor)

Now that we have several working components, it's time to return to the original prompt for this tutorial: what's the most economical combination of a non-fruit toaster pastry and coffee? Below, the snippets of code we've used so far will be modified to answer this question.

In [None]:
fruits = [fruit_singulizer(item) for item in list3]

nf_toaster_pastries = []
for entry in toaster_pastries:
    flavor = entry[0]
    if flavor not in fruits:
        nf_toaster_pastries.append(entry)
nf_toaster_pastry = min(nf_toaster_pastries, key = lambda entry: entry[1])
lp_coffee = min(coffee, key = lambda entry: entry[1])
total = nf_toaster_pastry[1] + lp_coffee[1]
total_with_tax = (total * tax) + total
print("You ordered a " + nf_toaster_pastry[0] +" toaster pastry with a " + lp_coffee[0] + " coffee for $" +str(round(total_with_tax,2)))

Some modifications were made from the earlier code. Since keeping the price of each non-fruit toaster pastry was important, the mapping function was removed, and flavor was instead defined as the first element of each entry in toaster_pastries.
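One limitation is worth noting: checking whether a flavor is in the fruits list only matches when the whole flavor name is a fruit, so a multi-word flavor such as "atomic apple" would slip through. A possible variation, sketched below with hypothetical test data and a hypothetical helper called contains_fruit, checks whether any fruit name appears anywhere inside the flavor.

```python
# Hypothetical test data: already-singularized fruits and a few flavors.
fruits = ["apple", "cherry", "plum"]
flavors = ["atomic apple", "cherry", "campfire smore"]

def contains_fruit(flavor, fruits):
    """Return True if any fruit name appears anywhere in the flavor string."""
    return any(fruit in flavor for fruit in fruits)

# The exact-match test misses the multi-word flavor.
print("atomic apple" in fruits)

# The substring test catches it.
non_fruit = [flavor for flavor in flavors if not contains_fruit(flavor, fruits)]
print(non_fruit)
```

Substring matching brings its own false positives (a fruit like "date" would match a flavor containing "update"), so this is a trade-off rather than a strict improvement.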

Like the lp_coffee variable, nf_toaster_pastry was determined by taking the entry with the smallest value from a list.

Additionally, a print statement was added. The **round** function was used to turn total_with_tax into a recognizable dollar amount with only two decimal places.
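One caveat with round is that it drops trailing zeros, so a total of 3.50 would display as 3.5. As an optional alternative, the sketch below uses an f-string format specifier (available in Python 3.6 and later) to always show two decimal places. The total here is a hypothetical value chosen to show the difference.

```python
total_with_tax = 3.5  # hypothetical total chosen to show the difference

# round() returns a number, so the trailing zero is lost when converted to text.
rounded = str(round(total_with_tax, 2))

# The .2f format specifier always produces exactly two decimal places.
formatted = f"{total_with_tax:.2f}"

print("$" + rounded)    # prints $3.5
print("$" + formatted)  # prints $3.50
```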