# Lists in Python

Previously, we learned that [variables](python_variables.ipynb) in Python can take on values that are either numbers (e.g., integers or floating-point numbers) or strings.  However, we only considered variables that hold a single value.  Lists are variables that hold collections of values.


## Creating and modifying lists

We can create a list in Python by entering a sequence of values or variables, separated by commas, between two square brackets.  We can enter any number of values, and the values do not even need to have the same type.  In the following, we create a list consisting of the values `2.0`, `1.3e-2`, `'hello'`, and the letter `'B'` and assign it to the variable `x`.

In [None]:
x = [2.0, 1.3e-2, 'hello', 'B']

print(x)

Lists can also be empty.  Below we create an empty list and assign it to the variable `a_list`.

In [None]:
a_list = []

print(a_list)

We can append a value to the end of a list by using the `append` method (member function) of the list.  For example, to append the value `4.8` to the list `x` that we created above, we write:

In [None]:
x.append(4.8)

print(x)

We can perform this process repeatedly to successively add items to a list.  In the example below, we build upon the empty list `a_list` that we created above:

In [None]:
a_list.append('a')
a_list.append('b')
a_list.append('c')
a_list.append('d')
a_list.append('e')

print(a_list)

Finally, we note that there are functions in Python that automatically generate sequences of values for us.  An important one is `range`.  This function creates a `range` object, an iterable sequence of integers, which we can easily convert to a list using the function `list`:

In [None]:
b_list = range(10)
print(b_list)
print( list(b_list) )

b_list = list( range(3, 9) )
print(b_list)

b_list = list( range(2, 20, 5) )
print(b_list)
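
The calls above follow the pattern `range(start, stop, step)`. As a small sketch of how the three arguments behave (the variable names `evens` and `countdown` are illustrative only and do not appear elsewhere in this notebook), note that the start value is included while the stop value is not:

```python
# range(start, stop, step): start is included, stop is excluded.
evens = list(range(0, 10, 2))
print(evens)        # [0, 2, 4, 6, 8]

# A negative step counts downwards; the stop value is still excluded.
countdown = list(range(5, 0, -1))
print(countdown)    # [5, 4, 3, 2, 1]
```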

## Accessing lists

The items in a list maintain their order.  Each item is assigned an integer index, starting from zero.  We can access any item in a list using its index:

In [None]:
print(x)

print(x[0])
print(x[1])

Note that we can conveniently access elements from the back of the list using negative numbers.  The last element of a list is associated with the integer `-1`, the second to last element is associated with `-2`, etc.

In [None]:
print(x)

print(x[-1])
print(x[-2])
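
One way to see how negative indices relate to ordinary ones is the short sketch below. It assumes a separate example list called `letters` (an illustrative name, not used elsewhere in this notebook) and uses the built-in function `len`, which returns the number of items in a list:

```python
letters = ['a', 'b', 'c', 'd', 'e']
n = len(letters)   # number of items in the list (5 here)

# Index -k refers to the same item as index n - k.
print(letters[-1], letters[n - 1])   # e e
print(letters[-2], letters[n - 2])   # d d
print(letters[-n], letters[0])       # a a
```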

## Unpacking lists

Sometimes holding values in a list is not convenient, and we want to "unpack" a list into a set of individual variables that each contain a single value.  We can do this as shown below:

In [None]:
print(x)


A, B, C, D, E = x
print(A)
print(B)
print(C)
print(D)
print(E)

Note that the number of variables on the left of the equals sign needs to be the same as the number of items in the list on the right; otherwise, an error (a `ValueError`) will occur:

In [None]:
A, B = x


We can remedy this by adding an asterisk `*` in front of the final variable on the left.  This collects the remaining items of the list into the final variable, as a list.

In [None]:
A, B, *C = x

print(A)
print(B)
print(C)
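
As a brief sketch of how the starred variable behaves (the names `values`, `first`, `rest`, `p`, `q`, and `leftover` are illustrative only), the starred variable always receives a list, and that list is empty when there are no items left over:

```python
values = [2.0, 1.3e-2, 'hello', 'B', 4.8]

# The starred variable collects everything after the first item.
first, *rest = values
print(first)      # 2.0
print(rest)       # [0.013, 'hello', 'B', 4.8]

# If nothing is left over, the starred variable is an empty list.
p, q, *leftover = [1, 2]
print(leftover)   # []
```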

## List functions

There are some built-in functions for lists that we can use to perform operations that we frequently need.  The function `len` returns the number of items in a list, while the function `sum` returns the sum of all the items in a list of numbers.

In [None]:
x_list = list(range(10))
print(x_list)

length = len(x_list)
print(length)

N = sum(x_list)
print(N)
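
These two functions are often combined. For instance, a minimal sketch of computing the average of a list of numbers (the variable name `mean` is illustrative) divides the sum by the length:

```python
x_list = list(range(10))

# The average is the sum of the items divided by the number of items.
mean = sum(x_list) / len(x_list)
print(mean)   # 4.5
```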

## List comprehensions

We can construct a list from another list by using something called a list comprehension.  In the example below, we create a list `x_list`, which contains the integers from 0 to 9.  We construct a list `y_list` using a list comprehension, which iterates, in order, over every element in `x_list`, and puts the square of that element in the corresponding position in the list `y_list`.

In [None]:
x_list = list(range(10))
print(x_list)

y_list = [x**2  for x in x_list]
print(y_list)

## Looping

Probably one of the most important aspects of lists is that they allow us to perform repetitive operations for each item in the list.  This is known as a loop.

As an example, let us assume that we have a four-component system, and the mass of each component is kept in the list `mass_list`. We can access each component in the list sequentially by using a loop. The `for` statement means that we iterate over every element of `mass_list`, placing each element in turn in the variable `mass`.

In [None]:
mass_list = [2.3, 6.2, 2.8, 5.6]
element = 0

for mass in mass_list:
    print(element, mass)
    element += 1
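
The manual counter `element` used above can also be supplied by Python's built-in `enumerate` function, which yields the position and the item together. The following is a sketch of that alternative; the name `pairs` is illustrative only:

```python
mass_list = [2.3, 6.2, 2.8, 5.6]

# enumerate() yields (position, item) pairs, starting from position 0.
for element, mass in enumerate(mass_list):
    print(element, mass)

# Converting to a list shows the pairs explicitly.
pairs = list(enumerate(mass_list))
print(pairs)
```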

Alternatively, we can loop through a list using an index variable. We can do so by combining two previously discussed functions: `range()` and `len()`.

In [None]:
mass_list = [2.3, 6.2, 2.8, 5.6]

for i in range(len(mass_list)):
    print(i, mass_list[i])

This method has the advantage of allowing you to access not just one element in a list, but also the elements next to it. For example, if we want to find the difference between consecutive elements in `mass_list`, we can do so as follows.

In [None]:
mass_list = [2.3, 6.2, 2.8, 5.6]

for i in range(len(mass_list) - 1):
    diff = mass_list[i+1]-mass_list[i] 
    print(diff)

As demonstrated above, we can also perform mathematical operations inside a loop. If we want to know the total mass of the system, we just need to sum over all the elements of `mass_list`.

In [None]:
mass_list = [2.3, 6.2, 2.8, 5.6]

total_mass = 0.0
for mass in mass_list:
    total_mass += mass
    print(f'Cumulative mass: {total_mass}')
    
print(f'Total mass: {total_mass}')

We initialize the variable `total_mass` to zero, and in the loop, we increment its value by the value of each item in the list (i.e., the value of `mass`).  Note that the symbol `+=` means we increment the value of the variable on the left of the symbol by the value of the variable on the right; that is, `total_mass += mass` is equivalent to `total_mass = total_mass + mass`.

We can also use loops to construct lists.  In the code below, we construct a list of the mass fractions of each of the components.

In [None]:
mass_frac = []
for x in mass_list:
    mass_frac.append(x/total_mass)
print(mass_frac)

`mass_frac` is initialised as an empty list. With each iteration of the loop, an item of `mass_list` is assigned to the variable `x`, which is then used to calculate the corresponding mass fraction. The calculated result is then appended to the end of the list.

We can confirm the result is correct by checking that the mass fractions in `mass_frac` sum to one.

In [None]:
print(sum(mass_frac))
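
The same list of mass fractions can also be built with a list comprehension, as introduced earlier, and the total mass can be obtained with the built-in `sum`. The sketch below is one possible equivalent of the loops above. Because of floating-point rounding, the printed sum may differ very slightly from exactly one.

```python
mass_list = [2.3, 6.2, 2.8, 5.6]

# Total mass via the built-in sum() instead of an explicit loop.
total_mass = sum(mass_list)

# One mass fraction per item, in the same order as mass_list.
mass_frac = [mass / total_mass for mass in mass_list]

print(mass_frac)
print(sum(mass_frac))
```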

## Worked Example: Rectangle/Mid-point Rule Integration

To combine what we have learned so far, consider the following integral:
\begin{equation}
  I = \int_0^{10} dx\, (x^2 + 2x + 1)
\end{equation}
We can approximate the integral by using the midpoint rule and 10 rectangles of equal width $\Delta x=1$.  If we number the rectangles from 0 to 9, the center of rectangle $i$ is $x_i=(i+1/2)\Delta x$.  Then the approximation is
\begin{align*}
I &\approx  \sum_{i=0}^{9} \Delta x\, (x_i^2 + 2x_i + 1)
\\
&= \sum_{i=0}^9 \Delta x\,f_i
\end{align*}
where $f_i = x_i^2 + 2x_i + 1$.

To do so, we must first create a list that corresponds to the center point of each rectangle, from which we will find the height of the rectangle.

In [None]:
N = 10 # number of rectangles
a = 0 # lower limit
b = 10 # upper limit

dx = (b-a)/N

x_list = [(i+0.5)*dx for i in range(N)] # list with the center of each rectangle
f_list = [x**2 + 2*x + 1 for x in x_list]

print(x_list)
print(f_list)



Note that we could have also constructed the required lists using a loop:

In [None]:
x_list = []
f_list = []
for i in range(N):
    x = dx*(i+0.5)
    x_list.append(x)
    f_list.append(x**2+2*x+1)
    
print(x_list)
print(f_list)

Finally, we can multiply the heights by the bar width to get the area of each bar, then sum them up to find the approximate area under the curve.  This is just the physical content of the approximation:
\begin{align*}
I &\approx  \sum_{i=0}^{9} \Delta x\, (x_i^2 + 2x_i + 1)
= \sum_{i=0}^9 \Delta x\,f_i.
\end{align*}
We will perform this sum with a `for` loop.  Note that the index $i$ runs over the number of the rectangle.  The height $f_i$ of each rectangle is sequentially added to the variable `I`, and the result is multiplied by the common width $\Delta x$ at the end, which is equivalent to summing the areas $\Delta x\,f_i$ shown in the equation above.

The answer below should be compared to the exact value $I=443.333$.

In [None]:
I = 0
for i in range(N):
    I += f_list[i]
I *= dx    
print(I)
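
Since every rectangle has the same width, the loop above is equivalent to multiplying the sum of the heights by `dx`. As a sketch, the same result can be obtained with the built-in `sum` function discussed earlier. The lists are rebuilt here so that the cell runs on its own:

```python
N = 10   # number of rectangles
a = 0    # lower limit
b = 10   # upper limit
dx = (b - a) / N

# Mid-point of each rectangle and the height of the function there.
x_list = [(i + 0.5) * dx for i in range(N)]
f_list = [x**2 + 2*x + 1 for x in x_list]

# Sum of the heights times the common width.
I = dx * sum(f_list)
print(I)   # 442.5
```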

As we increase the number of rectangles, the accuracy of the approximation should increase. 
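
A simple way to check this statement is to repeat the calculation for several values of `N` and compare against the exact value $443.333\ldots = 1330/3$. The following is a sketch only; the list `errors` and the chosen values of `N` are illustrative:

```python
a = 0             # lower limit
b = 10            # upper limit
exact = 1330 / 3  # exact value of the integral

errors = []
for N in [10, 100, 1000]:
    dx = (b - a) / N
    I = 0.0
    for i in range(N):
        x = a + (i + 0.5) * dx          # center of rectangle i
        I += (x**2 + 2*x + 1) * dx      # area of rectangle i
    errors.append(abs(I - exact))
    print(N, I, errors[-1])
```

In this example, the error shrinks by roughly a factor of 100 each time the number of rectangles is increased tenfold.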

The following code block can be run to visualize the process we just completed. Don't worry if the code looks a bit complicated; it is purely for demonstration purposes.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# Finely sampled curve for plotting the integrand
x_exact = np.arange(0.0, 10.0, 0.01)
f_exact = [x**2 + 2*x + 1 for x in x_exact]
plt.plot(x_exact, f_exact, color='black')

# Mark the mid-point of each rectangle on the x-axis
plt.plot(x_list, [0 for x in x_list], 'o', color='black', label='Mid-points')

# Draw one rectangle per mid-point, with width dx and height f
currentAxis = plt.gca()
for x, f in zip(x_list, f_list):
    xleft = x - 0.5*dx
    currentAxis.add_patch(Rectangle((xleft, 0), dx, f, alpha=0.5, edgecolor='blue'))

plt.xlabel(r'$x$')
plt.ylabel(r'$f(x)=x^2+2x+1$')
plt.legend()
plt.show()




## Looping with multiple lists
Python's `zip()` function allows us to iterate over multiple lists simultaneously. `zip()` takes two or more lists as arguments and returns an iterator of tuples, where each tuple groups together the items at the same position in the input lists. Converting the result with `list()` lets us print it. For example:

In [None]:
list_1 = [1, 3, 5]
list_2 = [2, 4, 6]

print(list(zip(list_1, list_2)))

Note that if the input arguments are lists of different sizes, the result will have as many items as the shortest input list; the 'extra' elements of the longer lists are ignored. In practice, `zip()` is very useful when multiple lists are used within the same loop.
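
The behaviour with lists of different sizes can be seen in the short sketch below, where `short_list`, `long_list`, and `pairs` are illustrative names:

```python
short_list = [1, 2, 3]
long_list = ['a', 'b', 'c', 'd', 'e']

# zip() stops as soon as the shortest list runs out of items.
pairs = list(zip(short_list, long_list))
print(pairs)        # [(1, 'a'), (2, 'b'), (3, 'c')]
print(len(pairs))   # 3
```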

Consider the following table of materials. If we want to calculate the density of each material, we can do so with a loop.

| Material | Mass (kg) | Volume (m$^3$) |
|:--- | --- | --- |
| Water     | 837 |       0.84 |
| Ethanol  | 205 |       0.26 |
| Steel     | 629 |       0.08 |

In [None]:
mass_list = [837, 205, 629]
volume_list = [0.84, 0.26, 0.08]

density_list = []
for mass, volume in zip(mass_list, volume_list):
    density_list.append(mass/volume)

print(density_list)

With each iteration of the loop, an item of `mass_list` and the corresponding item of `volume_list` are assigned to `mass` and `volume`, respectively, from which the density is calculated and appended to the end of `density_list`.

Alternatively, an index that runs over the integer positions in a list can be used to access elements sequentially. The previous problem can also be solved as follows:

In [None]:
mass_list = [837, 205, 629]
volume_list = [0.84, 0.26, 0.08]

density_list = []
for index in range(len(mass_list)):
    density_list.append(mass_list[index] / volume_list[index])

print(density_list)

`len()` is a function that returns the size of a list, while `range()` returns a sequence of integers (starting at 0) whose length is equal to its argument. In this scenario, `len(mass_list)` gives the integer `3`; consequently, `range(len(mass_list))` gives the sequence `0, 1, 2`. Each of these integers is assigned in turn to the variable `index` and used to access elements in `mass_list` and `volume_list`.

### Practice Exercises
**Q1**. Calculate the sum of the integers from 1 to 100, i.e.
\begin{equation}
  \sum_{i=1}^{100} i = 1 + 2 + 3 + \cdots + 99 + 100
\end{equation}

**Q2**. Returning to the exercise from the previous notebook, complete it by utilising lists and loops.

Calculate and print the heat of combustion for 100.0g of propanol, given the following heats of formation:

| Compound | $\Delta H^\circ_f$ (kJ mol$^{-1}$) |
|:---|---|
| CO$_{2(g)}$ | -393.5 |
| H$_2$O$_{(l)}$ | -285.86 |
| C$_3$H$_7$OH$_{(g)}$ | -303.0 |

**Hint**: create a list for both the heats of formation, and the coefficients of each compound.

**Solutions to practice exercises.**

In [None]:
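total = 0  # avoid the name `sum`, which would hide the built-in function
for i in range(100):
    total += i + 1  # i runs from 0 to 99, so i + 1 runs from 1 to 100
print(total)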
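
The answer to Q1 can be cross-checked in two ways, sketched below: with the built-in `sum` applied to a `range`, and with the well-known closed-form expression $n(n+1)/2$ for the sum of the first $n$ integers. The variable names `n` and `total` are illustrative.

```python
n = 100

# range(1, n + 1) runs from 1 to n inclusive.
total = sum(range(1, n + 1))
print(total)               # 5050

# Closed-form result n(n+1)/2 for comparison.
print(n * (n + 1) // 2)    # 5050
```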

In [None]:
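prop_mass = 100  # mass of propanol, in g
prop_mw = 60     # approximate molar mass of propanol, in g/mol

# Heats of formation in kJ/mol, in the order CO2, H2O, propanol
heats = []
heats.append(-393.5)
heats.append(-285.86)
heats.append(-303.0)

# Stoichiometric coefficients for C3H7OH + 9/2 O2 -> 3 CO2 + 4 H2O
# (products positive, reactant negative; O2 contributes zero)
coeffs = [3, 4, -1]

combust_heat_permole = 0
for heat, coeff in zip(heats, coeffs):
    combust_heat_permole += heat*coeff

# Scale from one mole to the number of moles in prop_mass
combust_heat = combust_heat_permole * prop_mass / prop_mw

print(f'The heat of combustion for {prop_mass} g of propanol is {combust_heat:.2f} kJ')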

### Conclusion

[Next up](python_dictionaries.ipynb), we'll look at another Python datatype, dictionaries.