Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel -> Restart) and then **run all cells** (in the menubar, select Cell -> Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [None]:
NAME = "Nathan Schaefer"
COLLABORATORS = "Nick Hageman"

---

# Introduction to Python (and Technical Background): Part I

First, let's import numpy (a package widely used in scientific computing).

In [3]:
import numpy as np
# This gets rid of the "display" error from VSCode -- ugh.
from IPython.display import display

## Arrays: Example comparisons of Python with C++ and Matlab

### Example 1: summation of elements in an array

Below is a piece of C++ code similar to what you would have written in IEC many times for computing the summation of all of the elements in an array:

```c++
    int data[5] = {2, 15, -4, 6, 9};
    int s = 0; // will store summation of all elements
    for (int i = 0; i < 5; i++)
    {
        s = s + data[i];
    }
    cout << s << endl;
```

Below is a similar piece of Matlab code (using more C++-style syntax rather than the Matlab sum function):
```matlab
    data = [2, 15, -4, 6, 9];
    s = 0; % will store summation of all elements
    for i = 1:5
        s = s + data(i);
    end
    disp(s);
```

Below is another piece of Matlab code that does not use an explicit index i (still using more C++-style syntax):
```matlab
    data = [2, 15, -4, 6, 9];
    s = 0; % will store summation of all elements
    for d = data
        s = s + d;
    end
    disp(s);
```

However, below is the more common way to do this in Matlab:
```matlab
    data = [2, 15, -4, 6, 9];
    s = sum(data); % compute summation of all elements
    disp(s);
```

Let's do something similar in Python (using numpy arrays rather than Python lists). 

One starting option (looking like a mixture of the C++-style and the first Matlab option):

In [None]:
data = np.array([2, 15, -4, 6, 9])
s = 0 # will store summation of all elements
for i in range(5): # range(5) is equivalent to range(0, 5) and provides the sequence of numbers 0, 1, 2, 3, and 4
    s += data[i]
print(s)

Here is another option (looking like a mixture of the C++-style and the second Matlab option):

In [None]:
data = np.array([2, 15, -4, 6, 9])
s = 0 # will store summation of all elements
for d in data: 
    s += d
print(s)

Below is another option using the sum function (looking like the more-common Matlab option):

In [None]:
data = np.array([2, 15, -4, 6, 9])
s = sum(data) # compute summation of all elements
print(s)
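
The built-in `sum` used above walks through the array one element at a time. NumPy also provides its own summation, `np.sum` (or the equivalent array method `data.sum()`), which is usually the more natural choice for numpy arrays. A minimal sketch, reusing the same data:

```python
import numpy as np

data = np.array([2, 15, -4, 6, 9])

s_builtin = sum(data)    # Python's built-in sum
s_numpy = np.sum(data)   # NumPy's summation function
s_method = data.sum()    # the same operation as an array method

print(s_builtin, s_numpy, s_method)  # all three give 28
```

For small arrays like this one the choice makes no practical difference; all three agree.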

### Example 2: Summation of squared differences from a constant value

Now suppose we want to compute the following:
$$ 
\sum_{i=0}^4 (data[i] - 10)^2
$$

In C++, this might look like:
```c++
    int data[5] = {2, 15, -4, 6, 9};
    int ssd = 0;
    for (int i = 0; i < 5; i++)
    {
        ssd += (data[i] - 10)*(data[i] - 10);
    }
    cout << ssd << endl;
```

In Matlab, this might look like:
```matlab
    data = [2, 15, -4, 6, 9];
    ssd = sum((data - 10).^2); % recall use of elementwise exponentiation
    disp(ssd);
```

One option for computing this in Python (most like Matlab):

In [None]:
data = np.array([2, 15, -4, 6, 9])
ssd = sum((data-10)**2) # note use of ** rather than ^ (in Python, ^ is the bitwise XOR operator)
print(ssd)
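
To see why `**` is needed, it may help to compare the two operators on plain integers. In Python, `^` is bitwise XOR, so using it by mistake does not raise an error; it quietly produces a different number:

```python
squared = 3 ** 2   # exponentiation: 3 squared
xored = 3 ^ 2      # bitwise XOR of 0b11 and 0b10

print(squared)  # 9
print(xored)    # 1
```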

If you prefer to have it most look like the C++-style, below is another Python option:

In [None]:
data = np.array([2, 15, -4, 6, 9])
ssd = 0
for i in range(5):
    ssd += (data[i]-10)*(data[i]-10)
print(ssd)

### Example 3: Weighted sum and dot product

Given a data array and an array of weights (each of size 6), suppose we wish to compute the following:
$$
\sum_{i=0}^5 data[i]*weights[i]
$$

Below is how we might do this in C++:
```c++
    double data[6] = {2.3, 7.8, 4.2, 8.6, 9.2, 7.2};
    double weights[6] = {0.1, 0.2, 0.3, 0.5, 0.2, 0.1};
    double weighted_sum = 0;
    for (int i = 0; i < 6; i++)
    {
        weighted_sum += data[i]*weights[i];
    }
    cout << weighted_sum << endl;
```

Below is how we might do this in Matlab:
```matlab
    data = [2.3, 7.8, 4.2, 8.6, 9.2, 7.2];
    weights = [0.1, 0.2, 0.3, 0.5, 0.2, 0.1];
    weighted_sum = sum(data.*weights); % recall use of elementwise multiplication
    disp(weighted_sum);
```

Below is another option in Matlab (recall that elementwise multiplication followed by summation can be computed with a dot product):
```matlab
    data = [2.3, 7.8, 4.2, 8.6, 9.2, 7.2];
    weights = [0.1, 0.2, 0.3, 0.5, 0.2, 0.1];
    weighted_sum = dot(data,weights); 
    disp(weighted_sum);
```

Below are two Matlab-like options in Python (first using the sum function and then using the dot product):

In [None]:
data = np.array([2.3, 7.8, 4.2, 8.6, 9.2, 7.2])
weights = np.array([0.1, 0.2, 0.3, 0.5, 0.2, 0.1])
weighted_sum = sum(data*weights) # note that Python does elementwise multiplication by default for numpy arrays
print(weighted_sum)

In [None]:
data = np.array([2.3, 7.8, 4.2, 8.6, 9.2, 7.2])
weights = np.array([0.1, 0.2, 0.3, 0.5, 0.2, 0.1])
weighted_sum = np.dot(data,weights) 
print(weighted_sum)

If you prefer, you may also write this using a loop:

In [None]:
data = np.array([2.3, 7.8, 4.2, 8.6, 9.2, 7.2])
weights = np.array([0.1, 0.2, 0.3, 0.5, 0.2, 0.1])
weighted_sum = 0
for i in range(len(data)): # note: now using len function rather than hard-coding the 6
    weighted_sum += data[i]*weights[i]
print(weighted_sum)

If you still wanted to use a loop, but not use an explicit index, you can use zip to iterate over the two arrays simultaneously:

In [None]:
data = np.array([2.3, 7.8, 4.2, 8.6, 9.2, 7.2])
weights = np.array([0.1, 0.2, 0.3, 0.5, 0.2, 0.1])
weighted_sum = 0
for data_item,weight in zip(data,weights): # zip yields one (data_item, weight) tuple per iteration
    weighted_sum += data_item*weight
print(weighted_sum)
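
To inspect what `zip` produces, one can wrap it in `list`. This small sketch uses short plain lists (hypothetical values, not the arrays above) to keep the output readable:

```python
names = ['a', 'b', 'c']
values = [1, 2, 3]

pairs = list(zip(names, values))  # materialize the iterator to inspect it
print(pairs)  # [('a', 1), ('b', 2), ('c', 3)]
```

Each tuple is unpacked into the two loop variables when `zip` is used directly in a `for` statement.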

### Example 4: Summation involving indices and data values

Now, given an array of $n$ data values, suppose you wish to compute the following:

$$
\sum_{i=0}^{n-1} i*data[i]
$$

In C++, the code might look something like the following:
```c++
    double data[6] = {2.3, 7.8, 4.2, 8.6, 9.2, 7.2};
    int n = 6;
    double total = 0;
    for (int i = 0; i < n; i++)
    {
        total += i*data[i];
    }
    cout << total << endl;
```

In Matlab, the code might look something like the following:
```matlab
    data = [2.3, 7.8, 4.2, 8.6, 9.2, 7.2];
    indices = 0:5;
    total = sum(indices.*data);
    disp(total);
```

In Python, one option for computing this in a loop would be to use enumerate (which allows you to iterate over items and indices):

In [None]:
data = np.array([2.3, 7.8, 4.2, 8.6, 9.2, 7.2])
total = 0
for i, data_item in enumerate(data): # enumerate gives an index value, and the data value
    total += i*data_item             # does i start at 1, or 0?
    print("i = ",i)
print(total)

In this case, you could still use the range option (but the above is often preferred):

In [None]:
data = np.array([2.3, 7.8, 4.2, 8.6, 9.2, 7.2])
total = 0
for i in range(len(data)): 
    total += i*data[i]
    print("i=",i)
print(total)
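
A loop-free version, closer to the Matlab code above, is also possible. As a sketch, `np.arange(len(data))` builds the index array 0, 1, ..., n-1, which is then multiplied elementwise with the data:

```python
import numpy as np

data = np.array([2.3, 7.8, 4.2, 8.6, 9.2, 7.2])
indices = np.arange(len(data))   # array([0, 1, 2, 3, 4, 5])
total = np.sum(indices * data)   # elementwise product, then summation
print(total)
```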

## Assignment

For this problem, we will assume we have a list of items we want to consider purchasing at Fareway or Costco (with their names stored in a Python list) and a list of the names of the units (e.g., 'oz') in which each item is sold. We also have four numpy arrays: the prices at Fareway, the unit values at Fareway, the prices at Costco, and the unit values at Costco:
* items[i]: name of item i
* unit_names[i]: unit name of item i (e.g., 'oz')
* prices_fareway[i]: price of item i at Fareway for the units specified in unit_values_fareway[i] and unit_names[i]
* unit_values_fareway[i]: number of units of item i at Fareway (e.g., the 10.8 in 10.8 oz)
* prices_costco[i]: price of item i at Costco for the units specified in unit_values_costco[i] and unit_names[i]
* unit_values_costco[i]: number of units of item i at Costco (e.g., the 55 in 55 oz)

Below is the data we will assume to have available for the assignment that follows. We have also provided an example of printing all of the items and their costs/quantities in a table.

In [8]:
# list of items and unit names
items = ['applesauce_cup', 
         'bananas', 
         'cheerios', 
         'cheese_shredded', 
         'cookies_pepperidge_farm_milano', 
         'crackers_graham', 
         'eggs_organic', 
         'oatmeal']
unit_names = ['ct','lbs','oz','oz','oz','oz','ct','oz']

# arrays of prices and unit values
prices_fareway = np.array([2.79, # applesauce_cup
                         1.77, # bananas
                         2.67, # cheerios
                         2.00, # cheese_shredded
                         1.88, # cookies_pepperidge_farm_milano
                         3.29, # crackers_graham
                         4.89, # eggs_organic
                         4.79  # oatmeal
                              ])
unit_values_fareway = np.array([4, # applesauce_cup
                              3, # bananas
                              10.8, # cheerios
                              8, # cheese_shredded
                              6, # cookies_pepperidge_farm_milano
                              14.4, # crackers_graham
                              12, # eggs_organic
                              42  # oatmeal
                            ])

prices_costco = np.array([9.89, # applesauce_cup
                         1.49, # bananas
                         7.99, # cheerios
                         10.99, # cheese_shredded
                         8.99, # cookies_pepperidge_farm_milano
                         7.99, # crackers_graham
                         6.39, # eggs_organic
                         8.29  # oatmeal
                              ])
unit_values_costco = np.array([24, # applesauce_cup
                              3, # bananas
                              55, # cheerios
                              48, # cheese_shredded
                              22.5, # cookies_pepperidge_farm_milano
                              57.6, # crackers_graham
                              24, # eggs_organic
                              160  # oatmeal
                            ])

# import pandas to be able to display in nice-looking table
import pandas as pd
# my_columns = ['Item', 'Unit Name', 'Fareway Price', 'Units', 'Costco Price', 'Units']
my_columns = ['items', 'unit_names', 'prices_fareway', 'unit_values_fareway', 'prices_costco', 'unit_values_costco']
display(pd.DataFrame(np.column_stack([items, unit_names, prices_fareway, unit_values_fareway, prices_costco, unit_values_costco]),
                     columns=my_columns))

| | items | unit_names | prices_fareway | unit_values_fareway | prices_costco | unit_values_costco |
|---|---|---|---|---|---|---|
| 0 | applesauce_cup | ct | 2.79 | 4.0 | 9.89 | 24.0 |
| 1 | bananas | lbs | 1.77 | 3.0 | 1.49 | 3.0 |
| 2 | cheerios | oz | 2.67 | 10.8 | 7.99 | 55.0 |
| 3 | cheese_shredded | oz | 2.0 | 8.0 | 10.99 | 48.0 |
| 4 | cookies_pepperidge_farm_milano | oz | 1.88 | 6.0 | 8.99 | 22.5 |
| 5 | crackers_graham | oz | 3.29 | 14.4 | 7.99 | 57.6 |
| 6 | eggs_organic | ct | 4.89 | 12.0 | 6.39 | 24.0 |
| 7 | oatmeal | oz | 4.79 | 42.0 | 8.29 | 160.0 |


### Step 1.

Compute the total cost at each store (Fareway and Costco) if the items are purchased in the unit values given (you should get \\$24.08 for Fareway and \\$62.02 for Costco).

In [13]:
# replace the 0 in the following lines with your solution (keep the variable names the same)
fareway_total_cost = sum(prices_fareway)
costco_total_cost = sum(prices_costco)
# YOUR CODE HERE
# raise NotImplementedError()
# compute summation of all elements
print(fareway_total_cost)
print(costco_total_cost)

24.08
62.02


In [12]:
# check your answer:
import pytest
assert fareway_total_cost == pytest.approx(24.08)
assert costco_total_cost == pytest.approx(62.02)
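
The check above uses `pytest.approx` rather than a plain `==` because floating-point sums are rarely exact. As a small illustration of the same idea, this sketch uses the standard library's `math.isclose` (chosen here only so the example runs without pytest):

```python
import math

total = 0.1 + 0.2
print(total == 0.3)              # False: the sum is not exactly 0.3
print(math.isclose(total, 0.3))  # True: equal within a small tolerance
```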

### Step 2.

Compute two new arrays: a Fareway price/unit array and a Costco price/unit array.

In [16]:
# replace the following two lines with your solution (keep the same variable names)
fareway_price_per_unit = np.array(prices_fareway / unit_values_fareway)
costco_price_per_unit = np.array(prices_costco / unit_values_costco)


# YOUR CODE HERE
# raise NotImplementedError()
print(fareway_price_per_unit)
print(costco_price_per_unit)
# code to update table
columns = ['Item', 'Unit Name', 'Fareway Price', 'Units', 'Fareway Price/Unit', 'Costco Price', 'Units', 'Costco Price/Unit']
display(pd.DataFrame(np.column_stack([items, unit_names, prices_fareway, unit_values_fareway, np.round(fareway_price_per_unit, decimals = 2), 
                                                         prices_costco, unit_values_costco, np.round(costco_price_per_unit, decimals = 2)]),
                     columns=columns))

[0.6975     0.59       0.24722222 0.25       0.31333333 0.22847222
 0.4075     0.11404762]
[0.41208333 0.49666667 0.14527273 0.22895833 0.39955556 0.13871528
 0.26625    0.0518125 ]


| | Item | Unit Name | Fareway Price | Units | Fareway Price/Unit | Costco Price | Units | Costco Price/Unit |
|---|---|---|---|---|---|---|---|---|
| 0 | applesauce_cup | ct | 2.79 | 4.0 | 0.7 | 9.89 | 24.0 | 0.41 |
| 1 | bananas | lbs | 1.77 | 3.0 | 0.59 | 1.49 | 3.0 | 0.5 |
| 2 | cheerios | oz | 2.67 | 10.8 | 0.25 | 7.99 | 55.0 | 0.15 |
| 3 | cheese_shredded | oz | 2.0 | 8.0 | 0.25 | 10.99 | 48.0 | 0.23 |
| 4 | cookies_pepperidge_farm_milano | oz | 1.88 | 6.0 | 0.31 | 8.99 | 22.5 | 0.4 |
| 5 | crackers_graham | oz | 3.29 | 14.4 | 0.23 | 7.99 | 57.6 | 0.14 |
| 6 | eggs_organic | ct | 4.89 | 12.0 | 0.41 | 6.39 | 24.0 | 0.27 |
| 7 | oatmeal | oz | 4.79 | 42.0 | 0.11 | 8.29 | 160.0 | 0.05 |


### Step 3.

Display the names of the item(s) for which the Fareway value is better (i.e., the price per unit is lower). (You will actually only find one item.) Hint: Consider using a for loop to compare the price-per-unit values (with range or zip). In any iteration where the Fareway value is lower, display the item. (If you are experienced with Python, there are other options as well, such as use of list comprehensions: see, for example, https://docs.python.org/3/tutorial/datastructures.html.)

Aside: Example syntax for an if statement (also refer to the Python documentation: https://docs.python.org/3/tutorial/controlflow.html):
```python
    a = 5
    b = 2
    if a > b:
        print('a is larger')
    else:
        print('a is not larger')
```


In [24]:
print("Item(s) with a better value at Fareway: ")
# YOUR CODE HERE
# raise NotImplementedError()
for i in range(len(fareway_price_per_unit)):
    if fareway_price_per_unit[i] < costco_price_per_unit[i]:
        print(items[i])

Item(s) with a better value at Fareway: 
cookies_pepperidge_farm_milano
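
As the hint mentions, a list comprehension is another option. The sketch below is self-contained, so it repeats the data and recomputes the price-per-unit arrays under shorter, illustrative names (`fareway_ppu`, `costco_ppu`); `zip` pairs each item name with its two per-unit prices:

```python
import numpy as np

items = ['applesauce_cup', 'bananas', 'cheerios', 'cheese_shredded',
         'cookies_pepperidge_farm_milano', 'crackers_graham',
         'eggs_organic', 'oatmeal']
prices_fareway = np.array([2.79, 1.77, 2.67, 2.00, 1.88, 3.29, 4.89, 4.79])
unit_values_fareway = np.array([4, 3, 10.8, 8, 6, 14.4, 12, 42])
prices_costco = np.array([9.89, 1.49, 7.99, 10.99, 8.99, 7.99, 6.39, 8.29])
unit_values_costco = np.array([24, 3, 55, 48, 22.5, 57.6, 24, 160])

# elementwise division gives the price per unit for every item
fareway_ppu = prices_fareway / unit_values_fareway
costco_ppu = prices_costco / unit_values_costco

# keep only the items whose Fareway price per unit is lower
better_at_fareway = [item
                     for item, f, c in zip(items, fareway_ppu, costco_ppu)
                     if f < c]
print(better_at_fareway)  # ['cookies_pepperidge_farm_milano']
```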


### Step 4.

While Costco often has unit values that are larger than those at Fareway (because you buy more of an item), for purposes of comparison, let us pretend that we could purchase items at Costco with the same unit values as those at Fareway. For example, instead of purchasing 24 applesauce cups for \\$9.89, let us suppose that we can purchase 4 applesauce cups for \\$1.65 (the price per unit (\\$9.89/24) times the 4 cups). We will call this the "costco-equivalent" price for the item (Fareway unit values; Costco prices). In this step, you should compute the costco-equivalent prices for all of the items and compute/display the total (you should get \\$16.30). You should then display the amount you would save by purchasing the items at Costco (you should get \\$7.78). (You may use any of the arrays already computed above.)

In [29]:
print(f'Recall Fareway total: ${fareway_total_cost:.2f}')
# YOUR CODE HERE

# raise NotImplementedError()
costco_as_fareway_total_cost = 0
for i in range(len(costco_price_per_unit)): 
    costco_as_fareway_total_cost += unit_values_fareway[i]*costco_price_per_unit[i]
print(costco_as_fareway_total_cost)

print(fareway_total_cost - costco_as_fareway_total_cost)


Recall Fareway total: $24.08
16.30490378787879
7.775096212121209
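
The same total can also be computed without a loop, since it is a weighted sum like the one in Example 3. A sketch using `np.dot` (the data is repeated so the block runs on its own, and `costco_equivalent_total` and `savings` are illustrative names), with results formatted to two decimal places:

```python
import numpy as np

prices_fareway = np.array([2.79, 1.77, 2.67, 2.00, 1.88, 3.29, 4.89, 4.79])
unit_values_fareway = np.array([4, 3, 10.8, 8, 6, 14.4, 12, 42])
prices_costco = np.array([9.89, 1.49, 7.99, 10.99, 8.99, 7.99, 6.39, 8.29])
unit_values_costco = np.array([24, 3, 55, 48, 22.5, 57.6, 24, 160])

costco_price_per_unit = prices_costco / unit_values_costco

# Fareway unit values weighted by Costco per-unit prices
costco_equivalent_total = np.dot(unit_values_fareway, costco_price_per_unit)
savings = np.sum(prices_fareway) - costco_equivalent_total

print(f'Costco-equivalent total: ${costco_equivalent_total:.2f}')  # $16.30
print(f'Savings at Costco: ${savings:.2f}')                        # $7.78
```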
