# Lists

**A Python list is a sequence datatype, i.e. an ordered collection of values written in square brackets. Since items are ordered, lists can be indexed and sliced, like strings. However, unlike strings, lists are *mutable*, meaning the contents can be changed at any time.**

**Lists are also *iterables*, which means you can iterate over a list with `for` and `while` loops.**

**REMEMBER: Indexing starts from 0.**

**There are in fact many different ways to create or build a list in Python (literals, appending, concatenation, inserting, indexing, slicing etc.)! But not all are readable, or viable in every situation.**
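
**A short sketch of a few of these ways to build a list. The values and variable names are made up purely for illustration:**

```python
# Literal: write the items directly in square brackets
parts = ['monitor', 'keyboard']

# list() constructor: build a list from any iterable, e.g. a range
digits = list(range(5))

# Concatenation: join two lists into a new list
combined = parts + ['mouse']

# Slicing: copy part of an existing list into a new list
first_two = combined[:2]

# Appending: start with an empty list and add items one at a time
squares = []
for n in digits:
    squares.append(n * n)

print(parts)
print(digits)
print(combined)
print(first_two)
print(squares)
```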

In [1]:
list("9102473658")

['9', '1', '0', '2', '4', '7', '3', '6', '5', '8']

In [2]:
# Empty list

computer_parts = []

In [3]:
# Iterating over a list

computer_parts = ['monitor', 'keyboard', 'mouse', 'mat', 'screen']

for part in computer_parts:
    print(part)

monitor
keyboard
mouse
mat
screen


In [25]:
type(computer_parts)

list

In [4]:
# Indexing a list

print("I need to buy a computer", computer_parts[4])

I need to buy a computer screen


In [5]:
computer_parts[-1]

'screen'

In [6]:
# Negative slicing

computer_parts[-1:-4:-1]

['screen', 'mat', 'mouse']

**Immutables include integer, float and Boolean datatypes, and strings and tuples, e.g.**

    x = 1 
    x = 5
    
**You are not changing the integer; you are re-binding the variable to a new integer. It is also why string operations such as concatenation (`+`) and repetition (`*`) always produce a new string rather than changing the existing one.**
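
**The difference between re-binding and mutating can be seen by giving one object two names. This is only a small sketch; the names `alias` and `same_list` are illustrative:**

```python
# Strings are immutable: += builds a NEW string and re-binds the name
greeting = "hello"
alias = greeting            # both names refer to the same string object
greeting += " world"        # greeting now refers to a different string
print(alias)                # the original string is unchanged
print(greeting)

# Lists are mutable: append() changes the one list object in place
parts = ['monitor']
same_list = parts           # both names refer to the same list object
parts.append('keyboard')    # mutates the shared list
print(same_list)            # the change is visible through both names
print(parts is same_list)   # still the very same object
```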

## Adding to the list

In [7]:
# Add at the end of list

computer_parts += ['harddrive']

print(computer_parts)

['monitor', 'keyboard', 'mouse', 'mat', 'screen', 'harddrive']


In [8]:
# Add at the end of list

computer_parts.append('motherboard')

print(computer_parts)

['monitor', 'keyboard', 'mouse', 'mat', 'screen', 'harddrive', 'motherboard']
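
**Two other list methods add items in place: `extend()` and `insert()`. The sketch below uses a fresh list called `parts` so that it does not disturb `computer_parts`:**

```python
parts = ['monitor', 'keyboard']

# extend() adds every item from another iterable to the end of the list
parts.extend(['mouse', 'mat'])

# insert() places one item before the given index; later items shift right
parts.insert(1, 'screen')

print(parts)

# By contrast, append() with a list adds that list as ONE nested item
nested = ['monitor']
nested.append(['mouse', 'mat'])

print(nested)
```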


In [9]:
# Build list of computer parts using user input

choice = '-'

# Empty list to be populated
computer_parts = []

while choice != '0':
    if choice in '123456':
        print("Adding {}...".format(choice))
        if choice == '1':
            computer_parts.append('monitor')
        elif choice == '2':
            computer_parts.append('keyboard')
        elif choice == '3':
            computer_parts.append('mouse')
        elif choice == '4':
            computer_parts.append('mat')
        elif choice == '5':
            computer_parts.append('screen')
        elif choice == '6':
            computer_parts.append('hdmi cable')
    else:
        print('''Please add options from below:
        1. monitor
        2. keyboard
        3. mouse
        4. mat
        5. screen
        6. hdmi cable
        0. EXIT''')
        
    choice = input()
    
print(computer_parts)

Please add options from below:
        1. monitor
        2. keyboard
        3. mouse
        4. mat
        5. screen
        6. hdmi cable
        0. EXIT
1
Adding 1...
2
Adding 2...
3
Adding 3...
4
Adding 4...
5
Adding 5...
6
Adding 6...
7
Please add options from below:
        1. monitor
        2. keyboard
        3. mouse
        4. mat
        5. screen
        6. hdmi cable
        0. EXIT
0
['monitor', 'keyboard', 'mouse', 'mat', 'screen', 'hdmi cable']


**That is a lot of code. Instead, you can iterate over a pre-defined list with a `for` loop to print the options, and use the chosen number as an index to look up the part to add.**

**This means that if you need to add or delete items from the list, you only need to update the pre-defined list `available_parts`, making the code much easier to maintain. This is known as refactoring the code.**

In [1]:
available_parts = ['monitor', 'keyboard', 'mouse', 'mat', 'screen', 'hdmi cable']

# List of valid number choices (must be strings)
valid_choices = []

for i in range(1, len(available_parts) + 1):
    valid_choices.append(str(i))


#print(valid_choices) # Remove after testing

choice = '-'

computer_parts = []

available_parts.sort()

while choice != '0':
    if choice in valid_choices:
        print("Adding {}...".format(choice))
        index = int(choice) - 1
        chosen_part = available_parts[index]
        computer_parts.append(chosen_part)
    else:
        print("Choose from options below:")
        # Iterate over full list of parts
        for number, part in enumerate(available_parts):
            print("{0}: {1}".format(number + 1, part))
            
    choice = input()
    
print(computer_parts)

Choose from options below:
1: hdmi cable
2: keyboard
3: mat
4: monitor
5: mouse
6: screen
1
Adding 1...
2
Adding 2...
3
Adding 3...
4
Adding 4...
5
Adding 5...
6
Adding 6...
7
Choose from options below:
1: hdmi cable
2: keyboard
3: mat
4: monitor
5: mouse
6: screen
0
['hdmi cable', 'keyboard', 'mat', 'monitor', 'mouse', 'screen']


**However, the '0: EXIT' option is no longer displayed.**
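
**One possible way to bring the exit option back is to add it after looping over the parts. The helper name `build_menu` is hypothetical and not part of the code above:**

```python
def build_menu(available_parts):
    """Return the numbered menu lines for the parts, plus a final exit line."""
    lines = []
    # start=1 makes enumerate() count from 1 instead of 0
    for number, part in enumerate(available_parts, start=1):
        lines.append("{0}: {1}".format(number, part))
    lines.append("0: EXIT")
    return lines


for line in build_menu(['hdmi cable', 'keyboard', 'mat']):
    print(line)
```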

**FYI, see below for an example of a list comprehension used instead of the `for` loop to build the list of valid choices, which is the more compact form that experienced programmers tend to use:**

            valid_choices = [str(i) for i in range(1, len(available_parts) + 1)]

## Creating list from list

**If you want to filter or extract items from one list to another list, a simple `for` loop with a condition can achieve this.**

In [12]:
# 11 flowers & 8 shrubs - separate into two lists

data = [
    "Andromeda - Shrub",
    "Bellflower - Flower",
    "China Pink - Flower",
    "Daffodil - Flower",
    "Evening Primrose - Flower",
    "French Marigold - Flower",
    "Hydrangea - Shrub",
    "Iris - Flower",
    "Japanese Camellia - Shrub",
    "Lavender - Shrub",
    "Lilac - Shrub",
    "Magnolia - Shrub",
    "Peony - Shrub",
    "Queen Anne's Lace - Flower",
    "Red Hot Poker - Flower",
    "Snapdragon - Flower",
    "Sunflower - Flower",
    "Tiger Lily - Flower",
    "Witch Hazel - Shrub"
]

flowers = []
shrubs = []

for plant in data:
    if "shrub" in plant.casefold():
        shrubs.append(plant)
    else:
        flowers.append(plant)
        
print("FLOWERS:")
print(flowers)
print()
print("SHRUBS:")
print(shrubs)

FLOWERS:
['Bellflower - Flower', 'China Pink - Flower', 'Daffodil - Flower', 'Evening Primrose - Flower', 'French Marigold - Flower', 'Iris - Flower', "Queen Anne's Lace - Flower", 'Red Hot Poker - Flower', 'Snapdragon - Flower', 'Sunflower - Flower', 'Tiger Lily - Flower']

SHRUBS:
['Andromeda - Shrub', 'Hydrangea - Shrub', 'Japanese Camellia - Shrub', 'Lavender - Shrub', 'Lilac - Shrub', 'Magnolia - Shrub', 'Peony - Shrub', 'Witch Hazel - Shrub']
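
**The same kind of filtering can also be written with list comprehensions. This is a sketch on a shortened list called `sample`, not a replacement for the loop above:**

```python
sample = [
    "Andromeda - Shrub",
    "Bellflower - Flower",
    "Iris - Flower",
    "Lilac - Shrub",
]

# Each comprehension walks the list once and keeps the items that pass its test
shrubs = [plant for plant in sample if "shrub" in plant.casefold()]
flowers = [plant for plant in sample if "shrub" not in plant.casefold()]

print(flowers)
print(shrubs)
```

**Note that this walks the list twice, once per comprehension, whereas the `for` loop with `if`/`else` only walks it once.**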


## Adding list to list

**You can concatenate lists (`+`)**

In [13]:
evens = [2, 4, 6, 8, 10, 12, 14]

odds = [1, 3, 5, 7, 9, 11, 13]

numbers = evens + odds

In [14]:
print(numbers)

[2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13]


**If you want to copy a whole list, it is best to use the `copy()` method, but if you want to create a new sub-list from a list, simply slice it to copy the selected values to a new list:**

In [15]:
digits = numbers[1::2]

print(digits)

[4, 8, 12, 1, 5, 9, 13]
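
**A small sketch of why copying matters: plain assignment does not copy a list, it only gives the same list a second name. The variable names here are illustrative:**

```python
values = [2, 4, 6, 8]

alias = values              # NOT a copy: a second name for the same list
duplicate = values.copy()   # a new list containing the same items
sliced = values[:]          # a full slice also creates a new list

values.append(10)

print(alias)       # follows the change, because it is the same list
print(duplicate)   # unaffected
print(sliced)      # unaffected
```

**Both `copy()` and slicing make shallow copies, so any nested lists inside are still shared between the original and the copy.**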


## Deleting from the list

**You can remove an individual item, or a range of items, from a list using the `del` statement with an index or a slice. However, remember that every deletion shifts the remaining items down, so if you want to remove further items, their index positions will have changed.**
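
**A minimal sketch of how the positions shift after each deletion, using a made-up list of letters:**

```python
letters = ['a', 'b', 'c', 'd', 'e']

del letters[1]       # removes 'b'; 'c' now sits at index 1
print(letters)

del letters[1]       # same index, but this time it removes 'c'
print(letters)

del letters[1:3]     # a slice removes a range: here 'd' and 'e'
print(letters)
```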

**A useful example is deleting outliers from a distribution, which requires a condition in a `for` loop.**

In [16]:
# See outliers at start and end of list

data = [4, 5, 104, 105, 110, 120, 130, 130, 150, 160, 170, 183, 185, 187, 188, 191, 350, 360]

In [17]:
del data[0:2]

# Note 4 & 5 have been deleted from list
print(data)

[104, 105, 110, 120, 130, 130, 150, 160, 170, 183, 185, 187, 188, 191, 350, 360]


**With a small dataset, you can simply slice the list to extract the outliers. However, if you have a much larger dataset and can't easily locate the indices to remove a range of outliers, i.e. values below or above specific values, you need to iterate over the list to determine the indices for the low outliers, then another loop for the high outliers, using the newly-indexed list after removing the low outliers.**

**NOTE: This method only works if the list values are sorted in ascending order. Fortunately, the values in a distribution are usually already sorted.**

In [18]:
# See outliers at start and end of list

data = [4, 5, 104, 105, 110, 120, 130, 130, 150, 160, 170, 183, 185, 187, 188, 191, 350, 360]

min_outlier = 100
max_outlier = 200

In [19]:
# Process low outliers - stop loop when value gets to 100

stop = 0

for index, value in enumerate(data):
    if value >= min_outlier:
        stop = index
        break

# Index value when loop is forcibly stopped
print(stop)

del data[:stop]

print(data)

2
[104, 105, 110, 120, 130, 130, 150, 160, 170, 183, 185, 187, 188, 191, 350, 360]


In [20]:
# Process high outliers - stop loop when value gets to 200

start = len(data)  # default: delete nothing if no high outlier is found

for index, value in enumerate(data):
    if value >= max_outlier:
        start = index
        break
        
# Index value when loop is forcibly stopped
print(start)

del data[start:]

print(data)

14
[104, 105, 110, 120, 130, 130, 150, 160, 170, 183, 185, 187, 188, 191]
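
**Because the data is sorted, the two cut-off positions can also be found with the standard library's `bisect` module instead of a loop. This swaps in a different technique (binary search) and is only a sketch using the same thresholds as above:**

```python
from bisect import bisect_left

data = [4, 5, 104, 105, 110, 120, 130, 130, 150, 160, 170, 183, 185, 187, 188, 191, 350, 360]

min_outlier = 100
max_outlier = 200

# Index of the first value >= max_outlier; everything from there on is a high outlier
start = bisect_left(data, max_outlier)
del data[start:]

# Index of the first value >= min_outlier; everything before it is a low outlier
stop = bisect_left(data, min_outlier)
del data[:stop]

print(data)
```

**Deleting the high outliers first means the low-outlier index does not need to be recalculated for a shifted list.**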


**Returning to the earlier example of adding computer parts to a list: what if the user adds a part by accident? You need a way to remove it. You could add a 'delete item' option to the menu, or, as below, remove an item when it is selected a second time.**

In [21]:
available_parts = ['monitor', 'keyboard', 'mouse', 'mat', 'screen', 'hdmi cable']

# List of valid input choices
valid_choices = []

for i in range(1, len(available_parts) + 1):
    valid_choices.append(str(i))

# Initial choice
choice = '-'

# Final list of parts
computer_parts = []

while choice != '0':
    if choice in valid_choices:
        index = int(choice) - 1
        chosen_part = available_parts[index]
        if chosen_part in computer_parts:
            print("Removing {} from list".format(choice))
            computer_parts.remove(chosen_part)
        else:
            print("Adding {} to list".format(choice))
            computer_parts.append(chosen_part)
        print("Your list now contains: {}".format(computer_parts))
    else:
        print("Choose from options below:")
        # Iterate over full list of parts
        for number, part in enumerate(available_parts):
            print("{0}: {1}".format(number + 1, part))
            
    choice = input()
    
print(computer_parts)

Choose from options below:
1: monitor
2: keyboard
3: mouse
4: mat
5: screen
6: hdmi cable
key
Choose from options below:
1: monitor
2: keyboard
3: mouse
4: mat
5: screen
6: hdmi cable
1
Adding 1 to list
Your list now contains: ['monitor']
2
Adding 2 to list
Your list now contains: ['monitor', 'keyboard']
3
Adding 3 to list
Your list now contains: ['monitor', 'keyboard', 'mouse']
4
Adding 4 to list
Your list now contains: ['monitor', 'keyboard', 'mouse', 'mat']
5
Adding 5 to list
Your list now contains: ['monitor', 'keyboard', 'mouse', 'mat', 'screen']
6
Adding 6 to list
Your list now contains: ['monitor', 'keyboard', 'mouse', 'mat', 'screen', 'hdmi cable']
1
Removing 1 from list
Your list now contains: ['keyboard', 'mouse', 'mat', 'screen', 'hdmi cable']
0
['keyboard', 'mouse', 'mat', 'screen', 'hdmi cable']
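
**The add-or-remove logic can be tried without typing any input by moving it into a small function. The name `toggle_part` is hypothetical and only used for this sketch:**

```python
def toggle_part(selected, part):
    """Remove part from selected if it is already there, otherwise add it.

    The list passed in as selected is mutated in place.
    """
    if part in selected:
        selected.remove(part)
    else:
        selected.append(part)


chosen = []

# 'monitor' is picked twice, so it is added and then removed again
for part in ['monitor', 'keyboard', 'monitor']:
    toggle_part(chosen, part)

print(chosen)
```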


## Sort vs Sorted

**The `sort()` method works on lists only. The `sorted()` function works on any iterable, including lists, strings and tuples.**

* **The `sort()` method changes the list in place, i.e. mutates it without changing its ID, and returns `None`.**

* **The `sorted()` function leaves the original untouched and returns a new sorted list.**
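
**A brief sketch of the difference, using a short made-up list called `sample`:**

```python
sample = [6, 2, 4]

result = sorted(sample)      # returns a new list; sample is left as it was
print(result, sample)

returned = sample.sort()     # sorts sample in place and returns None
print(returned, sample)

# sorted() accepts other iterables too, and always gives back a list
print(sorted((3, 1, 2)))
print(sorted("cab"))
```

**A common mistake is writing `sample = sample.sort()`, which replaces the list with `None`.**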

In [22]:
even = [2, 6, 4, 8, 20, 18, 12, 10, 14, 16]

even.sort()

print(even)

[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]


In [23]:
even.sort(reverse=True)

print(even)

[20, 18, 16, 14, 12, 10, 8, 6, 4, 2]


In [26]:
# Pangram is a phrase that contains all the letters in the alphabet

pangram = "The quick brown fox jumps over the lazy dog"

letters = sorted(pangram)

print(letters)

[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'T', 'a', 'b', 'c', 'd', 'e', 'e', 'e', 'f', 'g', 'h', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'o', 'o', 'o', 'p', 'q', 'r', 'r', 's', 't', 'u', 'u', 'v', 'w', 'x', 'y', 'z']


**Note that the capital 'T' is sorted before all the lowercase letters, because uppercase letters have lower Unicode code points than lowercase ones. You can perform case-insensitive sorting by passing the string method `str.casefold` or `str.lower` as the `key` argument.**

In [27]:
letters = sorted(pangram, key=str.casefold)

print(letters)

[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'a', 'b', 'c', 'd', 'e', 'e', 'e', 'f', 'g', 'h', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'o', 'o', 'o', 'p', 'q', 'r', 'r', 's', 'T', 't', 'u', 'u', 'v', 'w', 'x', 'y', 'z']


In [28]:
sorted("SuperCaliFragilisticExpialidocious", key=str.casefold)

['a',
 'a',
 'a',
 'C',
 'c',
 'c',
 'd',
 'e',
 'E',
 'F',
 'g',
 'i',
 'i',
 'i',
 'i',
 'i',
 'i',
 'i',
 'l',
 'l',
 'l',
 'o',
 'o',
 'p',
 'p',
 'r',
 'r',
 'S',
 's',
 's',
 't',
 'u',
 'u',
 'x']

**Remember that the `sort()` method only works on lists, so here it sorts the strings in the list relative to each other, rather than sorting the characters within each string.**

In [29]:
texts = ['The', '40s', 'was', 'the', 'BEST', 'decade', 'EVER', '!']

texts.sort()

print(texts)

['!', '40s', 'BEST', 'EVER', 'The', 'decade', 'the', 'was']


In [30]:
texts.sort(key=str.casefold)

print(texts)

['!', '40s', 'BEST', 'decade', 'EVER', 'The', 'the', 'was']


## Replacing items in list

**You can replace an individual item in a list using indexing and re-assigning that position to a new value:**

In [31]:
print(computer_parts)

['keyboard', 'mouse', 'mat', 'screen', 'hdmi cable']


In [32]:
computer_parts[0] = 'monitor'

print(computer_parts)

['monitor', 'mouse', 'mat', 'screen', 'hdmi cable']


**You can replace an individual item with a nested list:**

In [33]:
computer_parts[4] = ['trackball', 'monitor']

print(computer_parts)

['monitor', 'mouse', 'mat', 'screen', ['trackball', 'monitor']]


**Or if you do not want the list to be nested, make sure to assign to a slice of the list rather than to an index:**

In [34]:
computer_parts[4:] = ['trackball', 'monitor']

print(computer_parts)

['monitor', 'mouse', 'mat', 'screen', 'trackball', 'monitor']


In [35]:
computer_parts[0] = ['screen']

print(computer_parts)

[['screen'], 'mouse', 'mat', 'screen', 'trackball', 'monitor']


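**As you can see, the difference is subtle: assigning to an index stores the right-hand side as a single item (so a list becomes a nested list), whereas assigning to a slice replaces that slice with the individual items from the right-hand side.**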

In [36]:
computer_parts[0:1] = ['screen']

print(computer_parts)

['screen', 'mouse', 'mat', 'screen', 'trackball', 'monitor']
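
**Slice assignment can also change the length of a list, because the slice and its replacement do not have to be the same size. A small sketch with a separate list called `parts`:**

```python
parts = ['screen', 'mouse', 'mat']

# Replace one item with two: the list grows
parts[1:2] = ['keyboard', 'trackball']
print(parts)

# Replace two items with an empty list: the list shrinks
parts[1:3] = []
print(parts)

# Assign to an empty slice: items are inserted and nothing is replaced
parts[1:1] = ['monitor']
print(parts)
```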


## Iterating backwards over a list

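**Iterating backwards lets you delete items from a list while you loop over it. Deleting an item only shifts the items after it, and when you move from the end towards the start, those items have already been checked.**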

In [1]:
# Unsorted numbers

data = [104, 101, 4, 105, 308, 103, 5, 107, 100, 306, 106, 102, 108]

min_valid = 100
max_valid = 200

In [2]:
for index in range(len(data) - 1, -1, -1):
    print(index)

12
11
10
9
8
7
6
5
4
3
2
1
0


In [3]:
for index in range(len(data) - 1, -1, -1):
    if data[index] < min_valid or data[index] > max_valid:
        print(index, data) # remove after testing
        del data[index]

9 [104, 101, 4, 105, 308, 103, 5, 107, 100, 306, 106, 102, 108]
6 [104, 101, 4, 105, 308, 103, 5, 107, 100, 106, 102, 108]
4 [104, 101, 4, 105, 308, 103, 107, 100, 106, 102, 108]
2 [104, 101, 4, 105, 103, 107, 100, 106, 102, 108]


In [4]:
print(data)

[104, 101, 105, 103, 107, 100, 106, 102, 108]
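
**To see why the direction matters, compare removing items while iterating forwards with deleting while iterating backwards. This is a sketch on a tiny made-up list:**

```python
# Forwards: removing 4 shifts 5 into the slot just visited, so 5 is skipped
forwards = [4, 5, 104]

for value in forwards:
    if value < 100:
        forwards.remove(value)

print(forwards)

# Backwards: each deletion only shifts items that were already checked
backwards = [4, 5, 104]

for index in range(len(backwards) - 1, -1, -1):
    if backwards[index] < 100:
        del backwards[index]

print(backwards)
```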


**You can also iterate backwards over a sequence, like a list, using the `reversed()` function. The advantage of using this function is that you can pass its result to the `enumerate()` function.**

In [5]:
data = [104, 101, 4, 105, 308, 103, 5, 107, 100, 306, 106, 102, 108]

min_valid = 100
max_valid = 200

for index, value in enumerate(reversed(data)):
    print(index, value)

0 108
1 102
2 106
3 306
4 100
5 107
6 5
7 103
8 308
9 105
10 4
11 101
12 104


**As you can see, the numbers are reversed but the index still counts from 0 to 12. To reverse the index as well, subtract one from the length of the list to get the top index position, then on each loop subtract the current `enumerate()` count from that to get the real index, counting backwards.**

In [6]:
top_index = len(data) - 1

for index, value in enumerate(reversed(data)):
    print(top_index - index, value)

12 108
11 102
10 106
9 306
8 100
7 107
6 5
5 103
4 308
3 105
2 4
1 101
0 104


In [8]:
top_index = len(data) - 1

for index, value in enumerate(reversed(data)):
    if value < min_valid or value > max_valid:
        print(top_index - index, value)
        del data[top_index - index]
        
print(data)

9 306
6 5
4 308
2 4
[104, 101, 105, 103, 107, 100, 106, 102, 108]


**NOTE: Compared to the first example of removing outliers from dataset, using backwards iteration means the data does not need to be sorted in order.**
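
**If the original list does not have to be changed in place, another option is to build a new list that keeps only the valid values. This is a sketch using a list comprehension with the same thresholds:**

```python
data = [104, 101, 4, 105, 308, 103, 5, 107, 100, 306, 106, 102, 108]

min_valid = 100
max_valid = 200

# Keep a value only if it lies between the two limits (inclusive)
valid = [value for value in data if min_valid <= value <= max_valid]

print(valid)
print(len(data))   # the original list still has all of its items
```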

## Nested lists

**Nested lists can be a bit of a headache, especially to iterate over. Earlier, you replaced a list item with a nested list by indexing. You can also nest lists literally, within square brackets.**

In [9]:
even = [2, 4, 6, 8, 10]

odd = [1, 3, 5, 7, 9, 11]

numbers = [even, odd]

print(numbers)

[[2, 4, 6, 8, 10], [1, 3, 5, 7, 9, 11]]


In [10]:
numbers[0]

[2, 4, 6, 8, 10]

In [11]:
numbers[0][0]

2

In [12]:
# The outer loop iterates over the nested lists

for number_list in numbers:
    print(number_list)

[2, 4, 6, 8, 10]
[1, 3, 5, 7, 9, 11]


In [13]:
# The inner loop iterates over the values in each nested list

for number_list in numbers:
    print(number_list)
    
    for number in number_list:
        print(number)

[2, 4, 6, 8, 10]
2
4
6
8
10
[1, 3, 5, 7, 9, 11]
1
3
5
7
9
11


**Not many people know that the word 'spam', meaning junk email, comes from a Monty Python sketch:** 

In [14]:
# Literal nested lists should be formatted this way

menu = [
    ['egg', 'bacon'], 
    ['egg', 'sausage', 'bacon'], 
    ['egg', 'spam'], 
    ['egg', 'bacon', 'spam'], 
    ['egg', 'bacon', 'sausage', 'spam'], 
    ['spam', 'bacon', 'sausage', 'spam'], 
    ['spam', 'egg', 'spam', 'spam', 'bacon', 'spam'], 
    ['spam', 'sausage', 'spam', 'bacon', 'spam', 'tomato', 'spam']
]

In [15]:
for meal in menu:
    if 'spam' not in meal:
        print(meal)

['egg', 'bacon']
['egg', 'sausage', 'bacon']


In [18]:
for meal in menu:
    if 'spam' not in meal:
        print(meal)
        
        for food in meal:
            print(food)
    else:
        print("{0} has {1} spam".format(meal, meal.count('spam')))

['egg', 'bacon']
egg
bacon
['egg', 'sausage', 'bacon']
egg
sausage
bacon
['egg', 'spam'] has 1 spam
['egg', 'bacon', 'spam'] has 1 spam
['egg', 'bacon', 'sausage', 'spam'] has 1 spam
['spam', 'bacon', 'sausage', 'spam'] has 2 spam
['spam', 'egg', 'spam', 'spam', 'bacon', 'spam'] has 4 spam
['spam', 'sausage', 'spam', 'bacon', 'spam', 'tomato', 'spam'] has 4 spam


In [40]:
# Print meals without 'spam' - deleting items so backwards iteration

menu = [
    ['egg', 'bacon'], 
    ['egg', 'sausage', 'bacon'], 
    ['egg', 'spam'], 
    ['egg', 'bacon', 'spam'], 
    ['egg', 'bacon', 'sausage', 'spam'], 
    ['spam', 'bacon', 'sausage', 'spam'], 
    ['spam', 'egg', 'spam', 'spam', 'bacon', 'spam'], 
    ['spam', 'sausage', 'spam', 'bacon', 'spam', 'tomato', 'spam']
]

for meal in menu:
    for index in range(len(meal) - 1, -1, -1):
        if meal[index] == 'spam':
            del meal[index]
            
    print(meal)

['egg', 'bacon']
['egg', 'sausage', 'bacon']
['egg']
['egg', 'bacon']
['egg', 'bacon', 'sausage']
['bacon', 'sausage']
['egg', 'bacon']
['sausage', 'bacon', 'tomato']


In [5]:
# No mutating original menu, just print items that are not 'spam'
# Note the use of end argument in print function

menu = [
    ['egg', 'bacon'], 
    ['egg', 'sausage', 'bacon'], 
    ['egg', 'spam'], 
    ['egg', 'bacon', 'spam'], 
    ['egg', 'bacon', 'sausage', 'spam'], 
    ['spam', 'bacon', 'sausage', 'spam'], 
    ['spam', 'egg', 'spam', 'spam', 'bacon', 'spam'], 
    ['spam', 'sausage', 'spam', 'bacon', 'spam', 'tomato', 'spam']
]


for meal in menu:
    for item in meal:
        if item != 'spam':
            print(item, end=" ")
            
    print()

egg bacon 
egg sausage bacon 
egg 
egg bacon 
egg bacon sausage 
bacon sausage 
egg bacon 
sausage bacon tomato 


**You can replace the `if` statement with a generator expression, which is more advanced Python. Passing it to `join()` adds commas between the items without leaving a trailing comma at the end.**

In [3]:
for meal in menu:
    items = ", ".join(item for item in meal if item != 'spam')
    print(items)

egg, bacon
egg, sausage, bacon
egg
egg, bacon
egg, bacon, sausage
bacon, sausage
egg, bacon
sausage, bacon, tomato
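
**If you need the filtered meals as lists rather than printed text, a nested list comprehension can build a new menu and leave the original untouched. This sketch uses a shortened menu called `short_menu`:**

```python
short_menu = [
    ['egg', 'bacon'],
    ['egg', 'spam'],
    ['spam', 'egg', 'spam', 'spam', 'bacon', 'spam'],
]

# Inner comprehension filters one meal; outer comprehension repeats it per meal
spam_free = [[item for item in meal if item != 'spam'] for meal in short_menu]

print(spam_free)
print(short_menu[1])   # the original meal still contains 'spam'
```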
