# Adding nutritional labels to foods

**Learning goal:** In this case, you will learn what Python functions are and how to create and execute them.

You and your classmates are conducting a research project to determine how accurate food manufacturers' claims about the nutritional contents of their products are. You have already gathered data for about 500 products in your local supermarket, and now you want to label each one according to its calorie density and sugar and fat content.

For this, you want to use labeling criteria that the FDA (Food and Drug Administration) has defined for the country:

| Category 	| Label        	| Criteria                                     	|
|----------	|--------------	|----------------------------------------------	|
| Calories 	| Calorie free 	| Less than 5 calories per serving             	|
| Calories 	| Low calorie  	| Less than 40 calories per serving            	|
| Fat      	| Fat free     	| Less than 0.5 grams of total fat per serving 	|
| Fat      	| Low fat      	| 3 grams or less of total fat per serving     	|
| Sugar    	| Sugar free   	| Less than 0.5 grams of sugar per serving     	|

Source: [American Diabetes Association](https://www.diabetes.org/healthy-living/recipes-nutrition/reading-food-labels) and [Institute of Medicine](https://www.ncbi.nlm.nih.gov/books/NBK209851/).


## Adding calorie labels

Let's start easy. When writing long pieces of code, it is always a good idea to split the task into smaller units to make things more manageable. This is the decision tree for the calorie labels:

![Calories tree](data/images/calories_tree.png)

Let's write code that implements this decision tree and prints the resulting label. We will use this sample food to test our code (nutritional facts taken from the [US Department of Agriculture](https://fdc.nal.usda.gov/fdc-app.html#/food-details/362759/nutrients)):

~~~python
banana = {
    "serving_size":28, # In grams
    "calories":94.1, # In Kcal
    "fat":300, # In milligrams
    "sodium":1.96, # In milligrams
    "sugar":16, # In grams
    "fiber":0.504 # In grams
}
~~~

In [None]:
banana = {
    "serving_size":28,
    "calories":94.1,
    "fat":300,
    "sodium":1.96,
    "sugar":16,
    "fiber":0.504
}

if banana["calories"] < 5:
    calories_label = "CALORIE FREE"
elif banana["calories"] < 40:
    calories_label = "LOW CALORIE"
else:
    calories_label = None

print(calories_label)

The banana doesn't get a label because it has too many calories to qualify as `CALORIE FREE` or `LOW CALORIE`. Its label is Python's `None` object.

### Exercise 1

Do the same for this [food](https://fdc.nal.usda.gov/fdc-app.html#/food-details/1103276/nutrients):

~~~python
tomato = {
    "serving_size":125,
    "calories":22.5,
    "fat":250,
    "sodium":6.25,
    "sugar":3.29,
    "fiber":1.5
}
~~~

**Answer.**

Since a whole tomato (~125g) has more than 5 calories and less than 40 calories, it should be labeled as `LOW CALORIE`.

Did you notice that you had to rewrite parts of the code to match the new food? If you were to do this for a watermelon, then you would have to modify those parts of the code again. Imagine having to do that for the whole supermarket!

This is where **functions** can be useful. When you have a bit of code that you find yourself using again and again, with only small adjustments each time, you might be better off just writing a function. Functions are great if you want to generalize your code so that it can be used in a variety of situations. For instance, we can turn this code:

~~~python
if banana["calories"] < 5:
    calories_label = "CALORIE FREE"
elif banana["calories"] < 40:
    calories_label = "LOW CALORIE"
else:
    calories_label = None

print(calories_label)
~~~

into a function so that you don't have to replace `banana` with the name of a new food every time you run it. You can use a placeholder instead, like this:

In [None]:
def assign_calories_label(food):
    if food["calories"] < 5: # We changed "banana" to "food"
        calories_label = "CALORIE FREE"
    elif food["calories"] < 40: # We changed "banana" to "food"
        calories_label = "LOW CALORIE"
    else:
        calories_label = None
    print(calories_label)

Let's talk about some of the code here. You'll notice that much of the code is the same as before, but "banana" has been replaced with "food", and there is a new line `def assign_calories_label(food):` that precedes everything else. This line is known as the **function declaration statement**. It always starts with the reserved keyword **`def`**, followed by the name we want to give the function (in this case, `assign_calories_label`), then a collection of **arguments** enclosed in parentheses, and finally a colon. In this case, `food` is the only argument, though later you will see functions with multiple arguments, as well as some with none.

You can think of arguments as if they're part of a conversation. For example, the function name `assign_calories_label` is the topic of conversation, and the argument `food` is the discussion item. So if we were talking about assigning a calorie label and you said, "what is the label for a banana?" then "banana" becomes the argument you **passed** into our conversation (you *pass* arguments to functions).

If you run the cell that defines `assign_calories_label`, you won't see anything get printed. Why is that? Well, a function is meant to generalize our code, but until we give it information about the specific situation we want to use it for, it won't do anything for us! That's the purpose of the argument(s) (in this case, `food`): they are placeholders that we replace with information specific to our current situation so that the function can give us the relevant output. In this case, the specific food we are dealing with is a tomato, so let's pass `tomato` in place of `food` and run:

In [None]:
assign_calories_label(tomato)

### Exercise 2

Test the `assign_calories_label()` function using `banana` instead.

**Answer.**

So you see that when you call a function, you don't have to modify the code of the function itself, but only the arguments you pass (the standard way of saying that you executed a function is that you **called** it).
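The same calling pattern works for any number of arguments. As a rough sketch, the two helper functions below are hypothetical and not part of the case: `show_supermarket()` takes no arguments, and `show_calories_per_gram()` takes two.

```python
# Hypothetical helpers, for illustration only.
def show_supermarket():
    # No arguments: the parentheses stay empty, but they are still required.
    print("Local supermarket")

def show_calories_per_gram(calories, serving_size):
    # Two arguments, separated by a comma.
    print(calories / serving_size)

show_supermarket()                 # called with no arguments
show_calories_per_gram(22.5, 125)  # tomato: calories, then serving size in grams
```

When a function has several arguments, the values are matched to the names by position, so the order in which you pass them matters.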

It is important to note that the code inside the function **must be indented** in order for it to work correctly.

It is also a good idea to add a **[`docstring`](https://www.programiz.com/python-programming/docstrings)** to your function. Docstrings are useful because if you come back to your code in the future, you will be able to easily see in plain English what each function was meant to do. Docstrings are placed inside triple quotes (`"""`). `assign_calories_label` with a suitable docstring could look like this:

In [None]:
def assign_calories_label(food):
    """
    Assign a calorie label according to FDA rules.
    
    Arguments:
    food: A Python dictionary that has at least a "calories" key.
    
    Outputs:
    No outputs. This function simply prints the label.
    """
    if food["calories"] < 5:
        calories_label = "CALORIE FREE"
    elif food["calories"] < 40:
        calories_label = "LOW CALORIE"
    else:
        calories_label = None
    print(calories_label)

### Exercise 3

Run `help(assign_calories_label)`. What do you see?

**Answer.**

So far, our function prints something to the screen, but it doesn't have any **outputs**; i.e., it does not hand any value back to the rest of the program. (The computer does not consider something displayed on the screen to be an output, even though that definition may seem natural to us since we are visual creatures. This is a very important distinction to remember!)
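A small sketch of this distinction, using a hypothetical function `print_label` that is not part of the case: the text appears on the screen, but the value the function hands back to the program is `None`.

```python
# Hypothetical function for illustration: it prints, but has no output.
def print_label(label):
    print(label)

result = print_label("LOW CALORIE")  # the label is displayed on the screen here
print(result is None)                # nothing was handed back, so result is None
```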

To have our function return an output, we can use the `return` keyword. This new version of our function does not immediately print the label but returns it as a string object instead (or `None` if no label was applicable):

In [None]:
def assign_calories_label(food):
    """
    Assign a calorie label according to FDA rules.
    
    Arguments:
    food: A Python dictionary that has at least a "calories" key.
    
    Outputs:
    calories_label: Either the string "CALORIE FREE" or "LOW CALORIE", or None
    """
    if food["calories"] < 5:
        calories_label = "CALORIE FREE"
    elif food["calories"] < 40:
        calories_label = "LOW CALORIE"
    else:
        calories_label = None
    return calories_label

So now we can save the label values to variables to print them later on:

In [None]:
tomato_calorie_label = assign_calories_label(tomato)
banana_calorie_label = assign_calories_label(banana)

In [None]:
print("The label for tomato is", tomato_calorie_label)
print("And the label for banana is", banana_calorie_label)

This diagram summarizes the different parts of a user-defined function definition in Python:

![Def](data/images/def_anatomy.png)
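Because the function now returns its label, labeling many foods becomes a matter of calling it in a loop. The sketch below repeats the function and the calorie values of the two sample foods so that it runs on its own; the `foods` and `labels` dictionaries are assumptions made for illustration.

```python
def assign_calories_label(food):
    """Return "CALORIE FREE", "LOW CALORIE", or None."""
    if food["calories"] < 5:
        calories_label = "CALORIE FREE"
    elif food["calories"] < 40:
        calories_label = "LOW CALORIE"
    else:
        calories_label = None
    return calories_label

# Only the "calories" key is needed here; values come from the sample foods.
foods = {
    "banana": {"calories": 94.1},
    "tomato": {"calories": 22.5},
}

# Call the same function once per food and store each returned label.
labels = {}
for name, food in foods.items():
    labels[name] = assign_calories_label(food)

print(labels)
```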

## Adding fat labels

Now our function is ready to add calorie labels. Let's do the same for fat content. This is the decision tree:

![Fat decision tree](data/images/fat_tree.png)

### Exercise 4

Create the function `assign_fat_label()` to implement this decision tree.

**Answer.**

Let's test this out with `tomato`:

In [None]:
print(assign_fat_label(tomato))

Well, this is surprising! Tomatoes are supposed to be `FAT FREE`, aren't they? So, what happened here?

Let's inspect our data:

In [None]:
tomato

We know that a whole tomato weighs about 125 grams, but here we see that it has 250 units of fat! These certainly can't be grams!

It turns out our dataset has fat content *in milligrams*, not grams. Thus, 250mg is equivalent to 0.25 grams. That's more like it!

For our conditionals to work properly, we need to convert the fat content to grams. Since 1 gram is equal to 1,000 milligrams, we need to divide the number by 1,000.
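As a quick check of this arithmetic, the sketch below converts the fat values of the two sample foods; the variable names are illustrative only.

```python
MG_PER_GRAM = 1000  # 1 gram = 1,000 milligrams

tomato_fat_mg = 250
banana_fat_mg = 300

# Divide by 1,000 to go from milligrams to grams.
tomato_fat_g = tomato_fat_mg / MG_PER_GRAM
banana_fat_g = banana_fat_mg / MG_PER_GRAM

print(tomato_fat_g, banana_fat_g)
```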

### Exercise 5

Modify the `assign_fat_label()` function to use grams instead of milligrams in the conditionals.

**Answer.**

Let's test our modified function:

In [None]:
print(assign_fat_label(tomato))

This is the expected output! We're done here.

## Adding sugar labels

For sugar, our task is more straightforward: if there are less than 0.5 grams of sugar per serving, the food is `SUGAR FREE`. If there are 0.5 grams or more, the food receives no label. In code, this is:

In [None]:
def assign_sugar_label(food):
    """
    Assign a sugar label according to FDA rules.
    
    Arguments:
    food: A Python dictionary that has at least a "sugar" key.
    
    Outputs:
    sugar_label: Either the string "SUGAR FREE" or None
    """
    if food["sugar"] < 0.5:
        sugar_label = "SUGAR FREE"
    else:
        sugar_label = None
    return sugar_label

Let's test a banana, which is definitely *not* sugar free (its label should be `None`):

In [None]:
print(assign_sugar_label(banana))

## Putting it all together

### Exercise 6

Think of a strategy to make these three functions into a single function while writing the least amount of redundant code. Don't write any code yet - just note what course of action you would follow.

**Answer.**

## Anonymous functions

We generally don't want to define named functions that we will use only once, so we can use **anonymous functions** instead. These are just like ordinary `def` functions, except that you don't have to give them a name (that's why they're anonymous) and their body is limited to a single expression. To tell Python that some piece of code is going to be an anonymous function, we replace `def` with the keyword **`lambda`** and use the following syntax:

~~~python
lambda my_input: <do_something_with_my_input> # You can have more than 1 input, just separate them with commas
~~~

And then, to evaluate our function, we use

~~~python
(lambda my_input: <do_something_with_my_input>)(the_actual_input)
~~~
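For a concrete instance of this syntax, the sketch below uses two made-up anonymous functions: one with a single input that converts milligrams to grams, and one with two inputs that computes calories per gram.

```python
# One input: convert milligrams to grams (tomato fat content).
fat_in_grams = (lambda milligrams: milligrams / 1000)(250)

# Two inputs, separated by a comma: calories per gram of a serving (banana).
density = (lambda calories, serving_size: calories / serving_size)(94.1, 28)

print(fat_in_grams)
print(density)
```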

So an anonymous version of a combined function, say `assign_fda_labels(food)`, that returns all three labels could be:

~~~python
lambda food: [assign_calories_label(food), assign_fat_label(food), assign_sugar_label(food)]
~~~

and to actually run it, we would do this:

In [None]:
(lambda food: [assign_calories_label(food), assign_fat_label(food), assign_sugar_label(food)])(banana)

Generally speaking, you want to use `lambda` functions when you have extremely readable code that you intend to use only once. You should not use `lambda` if you plan to copy and paste your code again and again in different parts of your script, especially if that code is not immediately readable.
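One place where a short, single-use function like this often appears is the `key` argument of Python's built-in `sorted()`. The sketch below is only an illustration; the `foods` list is an assumption built from the calorie values of the two sample foods.

```python
# Hypothetical list of foods, using the calorie values of the sample foods.
foods = [
    {"name": "banana", "calories": 94.1},
    {"name": "tomato", "calories": 22.5},
]

# The lambda tells sorted() which value to compare; it is used once, right here.
by_calories = sorted(foods, key=lambda food: food["calories"])

names = [food["name"] for food in by_calories]
print(names)
```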