# Lesson 5a. Lists

We have been working with individual data types like integers, strings, Booleans and floats for a few weeks, but now it's time to introduce *data structures*, which act as containers for other types of values in Python.

One of the most widely used and versatile data structures is the *list*, which contains multiple values in an *ordered* sequence. Lists are defined by *square brackets*, and the individual elements in a list are separated by commas. Lists are the basic *ordered* and *mutable* data collection type in Python (a *mutable* data structure is one whose contents can be changed after it has been created).

Lists can contain any combination of data types or structures; it's perfectly reasonable to mix numerical, string, and Boolean values in the same list. The start and end of a list are marked by opening and closing square brackets, which is what distinguishes a list from other data structures in Python.

In [5]:
# These are all examples of lists.

L1 = [50, 86, 3, 100, 987, 65]
L2 = ['I', 'LOVE', 'Python']
L3 = [True, True, False, True, False]
L4 = [50, 86, 'Python', True, 23.05]

# we can use the built-in python function type() to show that these variables are all lists
type(L1)

list
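
Because lists can hold other data structures as well as individual values, a list can even contain other lists. The short sketch below illustrates this; the variable names `nested` and `mixed` are illustrative and are not used elsewhere in the lesson, and the built-in `len()` function it relies on is covered later in this lesson.

```python
# A list can hold other lists as its elements (names are illustrative only).
nested = [[1, 2, 3], ['a', 'b'], [True, False]]

# A single list can also mix individual values with another list.
mixed = [50, 'Python', [1.5, 2.5]]

# Each inner list counts as ONE element of the outer list.
print(len(nested))   # 3
print(len(mixed))    # 3

# The outer variable is still an ordinary list.
print(type(nested))  # <class 'list'>
```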

The key benefits of lists are that they are very flexible (they can contain any combination of data types and structures) and they are *ordered*: this means that we can access the individual items stored in a list quickly and accurately.

Say we have some of the Barclays office locations stored in a list called ```offices```:
```
offices = ['Glasgow', 'Pune', 'Whippany', 'London']
```
Lists are great as they allow us to collect similar items together instead of creating individual variables for each individual item. Having one list which we can update and change is much more convenient as we can add and remove the items in the list and apply the same logic to each item without having to duplicate much code.

The list above is a lot neater and easier to work with than the separate variables below. If we wanted to apply the same logic to every office below, we would have to write out the same code four times. Adding or deleting an office would also mean adding or deleting the logic that handles that particular variable.

```
office1 = 'Glasgow' 
office2 = 'Pune'
office3 = 'Whippany'
office4 = 'London'
```
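
To make the comparison concrete, the sketch below prints a message for each office twice: once using the separate variables and once using the list. The for loop over a list is covered later in this lesson, and the message text is purely illustrative.

```python
# With separate variables, the same line has to be repeated for every office.
office1 = 'Glasgow'
office2 = 'Pune'
office3 = 'Whippany'
office4 = 'London'

print('Welcome to the ' + office1 + ' office')
print('Welcome to the ' + office2 + ' office')
print('Welcome to the ' + office3 + ' office')
print('Welcome to the ' + office4 + ' office')

# With a list, the logic is written once and applied to every item.
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

for office in offices:
    print('Welcome to the ' + office + ' office')
```

Adding a fifth office to the list requires no change to the loop, whereas the first version would need a new variable and a new `print()` line.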

## Indexing

How would we access an element of the list, say Glasgow for example? We can use the item's *index*: an integer, written in square brackets after the list name, that corresponds to the position of the item in the list. It's important to note that Python uses *zero-based* indexing, which means that the counting of the elements in a list starts from 0, so ```offices[0]``` will correspond to ```'Glasgow'``` in this case. The last index is always the number of items in the list minus 1, as we start counting from 0.

Zero-based indexing is bound to catch all programmers out at some point, as we naturally tend to start counting from 1; don't worry at all, it will become second nature over time.

In [6]:
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

# We can access all of the four items in the list by using their corresponding indexes (remembering to count from 0).
print(offices[0])
print(offices[1])
print(offices[2])
print(offices[3])

# If you try to use an index that doesn't correspond to an item in the list,
# i.e. an index greater than or equal to the length of the list, you will receive an error.
# (Even though there are 4 items in this list, we start counting from 0, so the last valid index is 3 and index 4 is out of range.)

print(offices[4])

Glasgow
Pune
Whippany
London


IndexError: list index out of range
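
The rule that the last index is the number of items minus 1 can be checked with a small sketch. It uses the built-in `len()` function, which is covered later in this lesson; the variable names `number_of_offices` and `last_index` are illustrative.

```python
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

# len() returns the number of items in the list (4 here).
number_of_offices = len(offices)

# The last valid index is always one less than the length.
last_index = number_of_offices - 1

print(last_index)           # 3
print(offices[last_index])  # London
```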

## Slicing a List

Using an index (single integer in square brackets) will return a single value from the list. If we want to return a subset of the values in the list, we can use two integers separated by a colon (still contained in square brackets), which is known as a *slice*.

- ```offices[3]``` uses an index (one integer in square brackets) and returns a single element.
- ```offices[1:3]``` uses a slice (two integers separated by a colon in square brackets) and returns a list.

In a slice, the first integer is the index where the slice begins, and the second integer is the index where the slice ends. A slice will go up to but not include the value at the second index.

In [7]:
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

# We can use two integers separated by a colon to create sub-lists of the original offices list.
# This slice starts at 0 so includes 'Glasgow' and stops at 1, so it will not include the value found at the second index ('Pune').
print(offices[0:1])
# This slice starts at 1 so includes 'Pune' and stops at 3, so will go up to but not include 'London'.
print(offices[1:3])

['Glasgow']
['Pune', 'Whippany']
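
The difference between what an index returns and what a slice returns can be seen by checking the types of the results. This is a minimal sketch; the variable names `single` and `subset` are illustrative, and `len()` is covered later in this lesson.

```python
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

single = offices[3]    # an index returns the element itself
subset = offices[1:3]  # a slice returns a new list

print(type(single))  # <class 'str'>
print(type(subset))  # <class 'list'>

# Even a slice that covers a single position is still a list.
print(offices[3:4])  # ['London']

# When both indexes are within the list, the number of items in the slice
# is the second index minus the first (3 - 1 = 2 here).
print(len(subset))   # 2
```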


### Indexing and Slicing Strings

The indexing and slicing syntax we've learned can also be applied to strings to create smaller portions of text (substrings).

We can think of a string like ```'Barclays!'``` as a sequence of characters, similar to a list. Each character in the string has a corresponding index, starting from 0. This allows us to access or extract specific parts of a string easily, for example:

In [7]:
# initialise a string variable storing the string 'Barclays!'
string = 'Barclays!'

# We can now index and slice this string using the same square-bracket syntax that we use with lists!

# print the first character of the string (remember that we start indexing from 0)
print(string[0])

# print the third to fifth characters of the string (indexes 2, 3 and 4; the slice stops before index 5)
print(string[2:5])

B
rcl
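
As a further illustration, the sketch below uses a slice to drop the exclamation mark and an index to pick out the final character. The variable names `name` and `last_character` are illustrative; `len()` is covered later in this lesson and works on strings as well as lists.

```python
string = 'Barclays!'

# Take the first eight characters (indexes 0 to 7), leaving out the '!'.
name = string[0:8]
print(name)            # Barclays

# The string contains 9 characters, so the last index is 8.
print(len(string))     # 9

last_character = string[len(string) - 1]
print(last_character)  # !
```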


## Changing Values in a List

Similarly to reassigning a variable, we can reassign the value stored at a position in a list by using its index and the assignment operator (=). 

In [8]:
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

# if we wanted to change 'Glasgow' to 'Northampton', we would need the index of 'Glasgow'.
# as it is the first value in the list, we know it has index 0.
# if we now use the assignment operator to assign the value to offices[0]
# the offices[0] value will be updated from Glasgow to Northampton.

# print original list
print(offices)

# assign 'Northampton' to index 0
offices[0] = 'Northampton'

# print updated list
print(offices)

['Glasgow', 'Pune', 'Whippany', 'London']
['Northampton', 'Pune', 'Whippany', 'London']
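
Note that assigning to an index replaces the existing value rather than adding a new one, so the length of the list stays the same. A minimal sketch of this, using illustrative variable names and the `len()` function covered later in this lesson:

```python
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

length_before = len(offices)

# Replace the value stored at the last index (3).
offices[3] = 'Northampton'

length_after = len(offices)

print(offices)                        # ['Glasgow', 'Pune', 'Whippany', 'Northampton']
print(length_before == length_after)  # True: an item was replaced, not added

# Note: offices[4] = 'New York' would raise an IndexError here,
# because index 4 does not exist in a list of 4 items.
```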


## Using Lists with For Loops

In lesson three, we used for loops with Python's built-in ```range()``` function to execute a block of code a pre-defined number of times. So far, we've only iterated through integer values, because ```range()``` outputs a sequence of integers based on the parameters passed into it.

In [9]:
# This for loop will execute the following block of code 
# for each value in the range from 0 up to (but not including) 4.

for value in range(4):
    print(value)

0
1
2
3


We can use the sequence of integers generated by the ```range()``` function as indices to access each element in a list in turn.

In [10]:
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

# As we know that the offices list is 4 items long, we know that range(4) will generate a sequence of values from 0 to 3.
# Every time the code block is executed, the value will increase by 1, starting from 0 and ending at 3.
# The first time this block is executed, value is 0, so offices[value] is 'Glasgow'
# ..
# The final time this block is executed, value is 3, so offices[value] is 'London'

for value in range(4):
    print("Index " + str(value) + " in offices is " + offices[value])

Index 0 in offices is Glasgow
Index 1 in offices is Pune
Index 2 in offices is Whippany
Index 3 in offices is London


For the code above to run without failing and to access every element in the list, we need to know exactly how many elements the list contains so that we can pass that number into the ```range()``` function. If the value passed in is too small, we won't access every value in the list; if it's too big, we will cause an error, as we will be trying to access an index that doesn't exist in the list. As it's not always possible to know the length of a list in advance, we can use Python's built-in ```len()``` function to determine the length of the list and pass it into the ```range()``` function.

In [11]:
# we can update this list to include more or fewer offices, and the code will still be able to
# access every element without failing or missing values.

offices = ['Glasgow', 'Pune', 'Whippany', 'London', 'Northampton']

# print the length of offices
print("There are " + str(len(offices)) + " total offices in the list.")

# instead of coding in the length of the list, we can use the len() function to pass into the range()
# so we will always iterate through the correct number of indexes.

for value in range(len(offices)):
    print("Index " + str(value) + " in offices is " + offices[value])

There are 5 total offices in the list.
Index 0 in offices is Glasgow
Index 1 in offices is Pune
Index 2 in offices is Whippany
Index 3 in offices is London
Index 4 in offices is Northampton
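
Looping over indexes is particularly useful when we want to change the values stored in the list as we go, by combining the loop with the index assignment covered earlier. A minimal sketch, where adding the text `' Office'` to each value is just an illustrative choice:

```python
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

for value in range(len(offices)):
    # Reassign each position to its current value with ' Office' added on the end.
    offices[value] = offices[value] + ' Office'

print(offices)
# ['Glasgow Office', 'Pune Office', 'Whippany Office', 'London Office']
```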


It's also possible to iterate through the values in a list by simply passing the name of the list into the for loop. We can replace the ```range()``` function with the name of the list. If we aren't concerned with the index of each item in the list, this is the simplest way to iterate through the values in a list.

In [12]:
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

# This for loop will iterate through all items in the list.
for value in offices:    
    # We can use .upper() with a string value to change the string to uppercase.
    # Similarly, we can use .lower() with a string value to change the string to lower case.
    
    print("Upper Case: " + value.upper() + ". Lower Case: " + value.lower() + '.')

Upper Case: GLASGOW. Lower Case: glasgow.
Upper Case: PUNE. Lower Case: pune.
Upper Case: WHIPPANY. Lower Case: whippany.
Upper Case: LONDON. Lower Case: london.
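
The same pattern works for lists of any data type. As a small sketch, the loop below adds up a list of numbers; the values and the variable names `numbers` and `total` are made up for illustration.

```python
numbers = [50, 86, 3, 100]

# Start a running total at 0 and add each number from the list to it.
total = 0

for number in numbers:
    total = total + number

print(total)  # 239
```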


## Adding & Removing Values from a List

A method is similar to a function, except that it is "called on" a specific data type or structure. The main difference in how they are written is that a method comes after the value it operates on, joined to it by a dot. In the example above, ```.upper()``` and ```.lower()``` are methods that can only be called on string values and come after the string variable name. In this section, we will introduce several commonly used, list-specific methods.
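
The difference in syntax can be seen side by side in the short sketch below, which uses only functions and methods that have already appeared in this lesson. The variable names `number_of_offices` and `shouted` are illustrative.

```python
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

# Function: the value goes inside the brackets after the function name.
number_of_offices = len(offices)
print(number_of_offices)  # 4

# Method: the method name comes after the value, joined by a dot.
shouted = offices[0].upper()
print(shouted)            # GLASGOW
```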

### Append & Insert

If we have already defined a list and want to add a new value called 'value' to that list, we can use ```.append('value')``` to add 'value' to the end of the list. Alternatively, we can use ```.insert(index, 'value')``` with an index value to add the value at a specific location in the list.

In [14]:
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

# if we want to add 'Northampton' to the end of the list we can use .append('Northampton')

print(offices)

offices.append('Northampton')

print(offices)

['Glasgow', 'Pune', 'Whippany', 'London']
['Glasgow', 'Pune', 'Whippany', 'London', 'Northampton']


In [15]:
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

# if we want to add 'Northampton' to a specific location in the list we can use .insert(index, 'Northampton')
# where index is the location in the list that the value will be inserted

print(offices)

offices.insert(2, 'Northampton')

print(offices)

['Glasgow', 'Pune', 'Whippany', 'London']
['Glasgow', 'Pune', 'Northampton', 'Whippany', 'London']
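
Two common uses of these methods are sketched below: inserting at index 0 to place a value at the very start of a list, and calling `.append()` inside a for loop to build up a new list one item at a time. The list `short_names` and the choice of taking the first three letters are illustrative only.

```python
offices = ['Glasgow', 'Pune']

# Index 0 places the new value at the very start of the list.
offices.insert(0, 'London')
print(offices)      # ['London', 'Glasgow', 'Pune']

# .append() is often used inside a loop to build up a new list.
short_names = []

for office in offices:
    # Slice the first three characters of each office name.
    short_names.append(office[0:3])

print(short_names)  # ['Lon', 'Gla', 'Pun']
```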


### Remove

If we'd like to remove a specific value from a list, we can use the ```.remove('value')``` method, where 'value' is the item we want to remove.

In [16]:
offices = ['Glasgow', 'Pune', 'Whippany', 'London']

# if we want to remove 'Glasgow' from the list we can use .remove('Glasgow')

print(offices)

offices.remove('Glasgow')

print(offices)

['Glasgow', 'Pune', 'Whippany', 'London']
['Pune', 'Whippany', 'London']
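
Two details of `.remove()` are worth knowing: it only removes the first matching value, and it raises a `ValueError` if the value is not in the list. The sketch below shows both. It uses the `in` operator, which is not otherwise covered in this lesson, to check whether a value is present before removing it; the duplicated `'Glasgow'` and the `'Paris'` value are made up for illustration.

```python
offices = ['Glasgow', 'Pune', 'Glasgow', 'London']

# .remove() deletes only the first matching value.
offices.remove('Glasgow')
print(offices)  # ['Pune', 'Glasgow', 'London']

# Removing a value that is not in the list would raise a ValueError,
# so we can check that the value is present first.
if 'Paris' in offices:
    offices.remove('Paris')
else:
    print('Paris is not in the list')
```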


## Working with Lists

Lists are perfect when you have a collection of similar values to which you'd like to apply a similar logical process. For this reason, they pair well with functions, which we looked at in the previous lesson, to create simple, flexible, and powerful logical processes that we can use to automate repetitive tasks. 

In [17]:
inputNumbers = [100, 45, 78, 99, 456, 55, 432, 66, 190, 85, 35, 876, 9057, 79, 83, 89, 97, 900, 937, 631]
factors = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

# Above we define a list of 20 integers (when working with data, this is a small number of input values).
# We also define a list of the first 10 prime numbers, which we can use to approximate whether the input numbers are prime.
# If an input number is prime (and is not itself one of the factors), it will not be divisible by any of the factors
# (a number is prime if it is greater than 1 and is only divisible by 1 and itself).

# define the Prime_Checker function which takes in an input number and outputs whether the input number is prime or not.
# set default argument of Prime_Checker to be list of factors defined above.

def Prime_Checker(num, factors = factors):
    # Initialise a Boolean variable that will output if the input num is prime or not.
    isPrime = True
    
    # for every value in the list of factors, check if the input number is divisible by the value.
    for value in factors:
        # if the input is divisible by any of the values in the factors list, change the isPrime Boolean to False.
        if num % value == 0:
            isPrime = False
    
    # return Boolean value indicating if input number is prime
    return isPrime

# create an empty list which we can append any primeNumbers to
primeNumbers = []

# use a for loop to iterate through the inputNumbers to see if they are divisible by any of the factors.
for num in inputNumbers:
    
    # store the Boolean value which is outputted for the num by the Prime_Checker function (defined above).
    inputPrime = Prime_Checker(num, factors)
    
    # if num is a Prime then we append the num from inputNumbers to the primeNumbers list.
    if inputPrime:
        primeNumbers.append(num)
        
# once we've applied the Prime_Checker function to all of the num values in inputNumbers, 
# we can print the primeNumbers list to display which of the input numbers are primes.

print(primeNumbers)


[79, 83, 89, 97, 937, 631]


This is a great example of how programming can automate repetitive tasks to create results almost instantaneously. If I were to work out manually if 20 integers were prime by checking if each of those factors divided each input number, it would take me a significant amount of time. It would also introduce a large element of human error into the processing, as it's likely that at some point in the large number of calculations I would make a mistake.

Furthermore, this example shows the utility of lists. If I wanted to check if any more numbers are prime, I would simply have to add them to the ```inputNumbers``` list. Similarly, I could quickly include more or fewer values in the list of factors to make our prime number approximator more or less accurate.
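
To see how the list of factors affects accuracy, consider 961, which is 31 × 31 and therefore not prime. The sketch below uses a hypothetical helper, `looks_prime`, which applies the same divisibility test as the example above but returns as soon as it finds a factor; it is an illustration, not a replacement for the function defined earlier.

```python
def looks_prime(num, factors):
    # Return False as soon as any factor divides the number exactly.
    for value in factors:
        if num % value == 0:
            return False
    # No factor in the list divided the number, so it passes the check.
    return True

factors = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

# None of the first 10 primes divides 961, so it is wrongly reported as prime.
print(looks_prime(961, factors))  # True

# Adding the next prime, 31, to the list of factors corrects the result.
factors.append(31)
print(looks_prime(961, factors))  # False
```

Like the original function, this check also reports a number as not prime if it appears in the list of factors itself (for example 2), so it is only suitable for inputs larger than every factor.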

It would be quite common when working with data to have datasets with thousands or millions of values in a column. To manually do calculations on this scale quickly becomes infeasible; however, with programming, we can do this quickly, easily, and accurately with relatively simple code.