# Lesson 3: Avoiding repetition with Lists and Loops

To this point, we have focused on variables holding a single value or operations performed only once. But what if we have many values that we want to store or many operations that we want to perform? This is where lists and loops come in. In this module, we will learn how to harness the power of lists and loops to make our code more efficient and powerful.

Learning objectives of this module:
1. Learn the purpose of lists, how to create one, and how to manipulate them.
2. Learn how to use loops to repeat operations, including how to use loops to iterate through lists. There are two main types of loops we will discuss:
    - For loops
    - While loops
3. Learn how to combine loops with conditionals to create more complex programs.
4. __Briefly introduce the idea of vectorized operations and numpy arrays: a more efficient way to perform operations on lists of numbers.__

## Lesson 3.3: Python Packages, Vectorization, and the NumPy Library

Base Python provides everything you need to write complex programs, and we have now covered most of the tools required to write your own code. But you don't want to write everything yourself, even something as simple as calculating the mean of a list. Luckily, there are many packages that you can install on top of base Python to make your work easier and quicker (see the table in the [UVA Computational Resources repo](https://github.com/UVaBME/UVA_Computing_Resources/blob/main/Python_Resources/Python_Software_and_Packages.md) for examples of commonly used packages at UVA). In this module, we will briefly discuss one of the foundational Python packages, NumPy, and how it implements *vectorization* to make your code faster and easier to read.

### NumPy arrays

So far we have discussed lists and how they can be used to store lots of data. However, lists are not the most efficient way to store numerical data, and performing an operation on every element of a list requires a loop. NumPy arrays store data more efficiently and let us perform operations on all elements without writing a loop (a process called __vectorization__), which makes our code faster and easier to read. For now, it's not important to understand the reason for this, but you can think of numpy arrays as lists with added functionality.

Let's see how to create a numpy array, in this case from a list:

In [None]:
import numpy as np

measurements = np.array([100, 10, 20, 15])
measurements

array([100,  10,  20,  15])

First, because numpy is not a part of base python, we need to load the numpy library. To do this, we use the `import` statement. We can also give the library a 'nickname' (an alias) with the `as` keyword, which we then use to reference the numpy library in our code. For numpy, it's standard to use the nickname 'np'.

Next, we can use the `array()` function (more on what a function is soon) to create a numpy array. Because this function is part of the numpy package, we need to tell python to look for it in the numpy library, which is why we use the syntax `np.array(myList)`.
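As a small sketch of the two steps above, the snippet below converts a list to an array and inspects the result. The names `my_list` and `my_array` are illustrative and not part of the lesson, and the snippet assumes numpy is installed in your environment.

```python
import numpy as np  # 'np' is the conventional nickname (alias) for numpy

# Hypothetical example data; any list of numbers works
my_list = [100, 10, 20, 15]

# The 'np.' prefix tells python to look for array() inside the numpy library
my_array = np.array(my_list)

print(type(my_list))    # a built-in python list
print(type(my_array))   # a numpy ndarray
print(my_array.shape)   # one dimension holding 4 elements
print(len(my_array))    # len() works on arrays just as it does on lists
```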

We can access elements of the array in the same way as we did with lists:

In [None]:
print(measurements[0])
print(measurements[0:2])
print(measurements[-1])

100
[100  10]
15
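Two further indexing features carry over from lists: slices accept a step, and indexing can be used to assign new values. Here is a brief sketch, using a hypothetical array named `example` so that the `measurements` array is left unchanged:

```python
import numpy as np

example = np.array([100, 10, 20, 15])  # hypothetical copy of the data above

print(example[::2])   # every second element: [100  20]
print(example[1:])    # everything after the first element: [10 20 15]

example[0] = 50       # assignment by index works as it does for lists
print(example)        # [50 10 20 15]
```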


Now that we have a numpy array, we can quickly perform operations that previously required a for loop. For example, to convert the values from micrometers to meters (dividing by 1e6), as before:

In [None]:
measurements = measurements/1e6
measurements

array([1.0e-04, 1.0e-05, 2.0e-05, 1.5e-05])

That was easy! You can also perform element-wise operations with two arrays of the same length.

In [None]:
mass = np.array([1, 2, 3])
volume = np.array([1, 4, 9])
density = mass / volume
density

array([1.        , 0.5       , 0.33333333])

*Brief aside: notice that while our arrays contain integers, after performing division we end up with floats, even for our value of 1. In Python 3, the `/` operator always returns a float.*
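To see the type change directly, you can inspect an array's `dtype` attribute. The sketch below repeats the `mass` and `volume` arrays and also shows floor division (`//`), which is one way to keep integer results when that is what you want. The exact integer type name printed (for example `int64`) depends on your platform.

```python
import numpy as np

mass = np.array([1, 2, 3])
volume = np.array([1, 4, 9])

print(mass.dtype)             # an integer type, e.g. int64 (platform dependent)
print((mass / volume).dtype)  # float64: true division produces floats
print(volume // mass)         # floor division keeps integers: [1 2 3]
```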

In addition to numerical operations, you can also apply a conditional test to every element of a numpy array simultaneously. The result is an array of `True`/`False` values, one per element.

In [None]:
measurements < 1e-4

array([False,  True,  True,  True])

This is especially helpful because we can use boolean arrays to select specific items within numpy arrays. Let's say we only want the items in our array that are less than 1e-4. We can do that quickly using the above techniques:

In [None]:
measurements[measurements < 1e-4]

array([1.0e-05, 2.0e-05, 1.5e-05])
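Conditions can also be combined. With numpy arrays this is done with `&` (and) and `|` (or) rather than the `and`/`or` keywords, and each condition needs its own parentheses. Here is a minimal sketch that recreates the converted measurements under the hypothetical name `values`:

```python
import numpy as np

values = np.array([1.0e-04, 1.0e-05, 2.0e-05, 1.5e-05])  # same numbers as above

# Keep values that are greater than 1e-5 AND less than 1e-4
in_range = (values > 1e-5) & (values < 1e-4)
print(in_range)           # [False False  True  True]
print(values[in_range])   # the two values that satisfy both conditions

# True counts as 1, so summing a boolean array counts the matches
print(in_range.sum())     # 2
```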

Like lists, numpy arrays come with methods that perform specific operations on the array. Here are a few examples:

In [None]:
print('The sum of our measurements is', measurements.sum())
print('The mean of our measurements is', measurements.mean())
print('The standard deviation of our measurements is', measurements.std())
print('The max of our measurements is', measurements.max())

The sum of our measurements is 0.00014500000000000003
The mean of our measurements is 3.625000000000001e-05
The standard deviation of our measurements is 3.69754986443726e-05
The max of our measurements is 0.0001
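One detail worth knowing: by default, `.std()` computes the population standard deviation (dividing by N). If you want the sample standard deviation (dividing by N - 1), numpy's `ddof` argument handles that. The same statistics are also available as functions such as `np.mean()`. A short sketch with a hypothetical array named `data`:

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # illustrative values

print(data.mean())        # 5.0
print(np.mean(data))      # same result using the function form
print(data.std())         # population standard deviation: 2.0
print(data.std(ddof=1))   # sample standard deviation, slightly larger
```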


### So why numpy arrays?

Because of the way numpy arrays are stored, operations on them are much more efficient than the same operations on lists. NumPy employs a concept known as __vectorization__: you write a single expression, and the operation is applied to every element of the array in optimized, precompiled code rather than in an explicit python loop. This is much faster than iterating over a list and operating on each element individually. Used effectively, vectorization with numpy arrays can speed up numerical code substantially, in some cases by orders of magnitude.
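To get a feel for the difference, the sketch below times the same unit conversion done with a list and a for loop versus a numpy array. The array size and variable names are arbitrary choices for illustration, and the exact timings depend on your machine, so treat the numbers as a rough indication only.

```python
import time
import numpy as np

n = 1_000_000
values_list = list(range(n))     # plain python list
values_array = np.arange(n)      # numpy array holding the same numbers

# Loop version: convert each element one at a time
start = time.perf_counter()
converted_list = []
for value in values_list:
    converted_list.append(value / 1e6)
loop_seconds = time.perf_counter() - start

# Vectorized version: one expression acts on the whole array
start = time.perf_counter()
converted_array = values_array / 1e6
vectorized_seconds = time.perf_counter() - start

print('Loop:      ', loop_seconds, 'seconds')
print('Vectorized:', vectorized_seconds, 'seconds')
print('Same results:', np.allclose(converted_list, converted_array))
```

Both versions produce the same numbers; only the time taken differs.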

You'll get a little more experience with numpy arrays (and pandas dataframes) on day 2, so for the remainder of the first day, we'll continue to focus on coding basics.