# Lists

In [1]:
import project

## Sequence

- Definition: a sequence is a collection of numbered/ordered values
- String: a sequence of one-character strings

Operations:
- `len`
- indexing
- slicing
- `for` loop

### Indexing

- enables you to extract one item in your sequence, that is one character in a string
- Syntax: `string_var[index]`
    - index needs to be in range, that is from `0` to `len(string_var) - 1`
    - other index values will produce `IndexError`

In [2]:
day = "Monday"
print(day)
print(day[1])  # 2nd character
print(day[5])  # last

print(day[-1]) # last
print(day[-2]) # 2nd last
print(day[50]) # this won't work

Monday
o
y
y
a


IndexError: string index out of range

### Slicing
- enables you to extract a sub-sequence
- sub-sequence will be of same type as original sequence
- Syntax: `string_var[start_index:end_index]`
    - start_index is inclusive
    - end_index is exclusive
    - indices need not be in range: slicing clamps out-of-range indices to the ends of the string instead of raising `IndexError`

In [3]:
print(day)
print(day[1:3])    # include 1, exclude 3
print(day[1:100])  # slicing is forgiving
print(day[1:])     # can skip 2nd number
print(day[:3])     # can skip 1st number
print(day[:])      # this, too!
print(day[-3:-1])  # can use negative indices

Monday
on
onday
onday
Mon
Monday
da


### for loops

- can iterate over every item in a sequence

In [4]:
# print each letter of the string using a while loop
index = 0
while index < len(day):
    print(day[index])
    index += 1

M
o
n
d
a
y


In [5]:
# print each letter of the string using a for loop
# letter is a new variable that holds the next character on each iteration

for letter in day:
    print(letter)

M
o
n
d
a
y


In [6]:
# the sequence after `in` must already be defined
# here b was never defined, so the loop raises NameError
for a in b: 
    print(a)

NameError: name 'b' is not defined

In [7]:
# print each letter of the string using a for loop over range(len(day))
# range lets us iterate over every index in the string

for idx in range(len(day)):
    print(day[idx])

M
o
n
d
a
y


In [8]:
# range built-in function: an optional 3rd number is the increment
# let's print every other character in the string
for idx in range(0, len(day), 2):  
    print(day[idx])

M
n
a


In [9]:
# Practice: Write a for loop to generate a string that makes an acronym

phrase = "National Collegiate Athletic Association 2022"
acro = ""
for letter in phrase:
    if letter.upper() == letter and letter.isalpha():
        #print(letter)
        # How can we make sure you don't consider spaces and numbers?
        # TODO: try isalpha method (update if condition)
        # TODO: now instead of printing the letter, concatenate the letter to acro
        acro += letter

print(acro)

NCAA


Other string methods: https://www.w3schools.com/python/python_ref_string.asp. Most Python string methods have intuitive names, so there is no need to memorize them; look them up when you need them.

## Learning Objectives

- Create a list and use sequence operations on a list.
- Write loops that process lists
- Explain key differences between strings and lists: type flexibility, mutability
- Mutate a list using:
    - indexing
    - methods such as append, extend, sort, and pop
- split() a string into a list
- join() list elements into a string

### Motivation for lists:

- what if we want to store a sequence of numbers or a sequence of numbers and strings?
- lists enable us to create flexible sequences: we can store anything inside a list as an item, including:
    - int
    - float
    - str
    - bool
    - other lists, etc.

In [10]:
day = "Monday"

# first quotation mark indicates beginning of string sequence
# second quotation mark indicates end of string sequence

In [11]:
empty_list = []
some_nums = [11, 22, 33, 44]

# [ indicates beginning of sequence
# ] indicates end of sequence

In [12]:
grocery_list = ["apples", "milk", "broccoli", "spinach", "oranges"] 

# TODO: compute length of grocery_list
print(len(grocery_list))

# TODO: use indexing to display broccoli
print(grocery_list[2])

# TODO: use indexing to display last item
print(grocery_list[-1])

# TODO: use slicing to extract only the vegetables
print(grocery_list[2:4])

# TODO: discuss why the following gives IndexError
print(grocery_list[len(grocery_list)])
# the last valid index is always one less than the length

5
broccoli
oranges
['broccoli', 'spinach']


IndexError: list index out of range

In [13]:
# Iterate over every item in grocery_list to display:
# <Item> purchased!

for item in grocery_list:
    print(item + " purchased!")

apples purchased!
milk purchased!
broccoli purchased!
spinach purchased!
oranges purchased!


In [14]:
# Compute sum of numbers in some_nums
# Let's do this example using PythonTutor

total = 0

for num in some_nums:
    total += num
    
print(total)

110


In [15]:
# A list literal can be used directly in place of a variable, here for indexing
[10, 20, 30][1]

20

In [16]:
# A list literal can be used directly in place of a variable, here for slicing
[10, 20, 30][1:]

[20, 30]

How does Python differentiate between two different usages of `[]`?
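
One way to approach the question above: Python decides from position. Square brackets that begin an expression build a new list, while square brackets that directly follow a value (a variable, a literal, or another indexing expression) index or slice that value. A minimal sketch; the variable names are illustrative only:

```python
# Brackets at the start of an expression create a list.
nums = [10, 20, 30]

# Brackets right after a value index (or slice) that value.
second = nums[1]               # indexing a variable
also_second = [10, 20, 30][1]  # first [] builds the list, second [] indexes it
tail = [10, 20, 30][1:]        # first [] builds the list, second [] slices it

print(second, also_second, tail)
```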

### Other sequence operations

- concatenation using `+`
- `in`
- multiply by an `int`

In [17]:
msg = "Happy "
print(msg + "Monday :)")

some_nums = [11, 22, 33, 44]
print(some_nums + [1, 2, 3]) # `+` concatenates two lists

Happy Monday :)
[11, 22, 33, 44, 1, 2, 3]


In [18]:
msg = "Happy"
print("H" in msg)
print("h" in msg)

some_nums = [11, 22, 33, 44]
print(33 in some_nums)
print(55 in some_nums)

True
False
True
False


In [19]:
msg = "Happy :)"
print(msg * 3)

print(["Go", "Badgers!"] * 3)

Happy :)Happy :)Happy :)
['Go', 'Badgers!', 'Go', 'Badgers!', 'Go', 'Badgers!']


## Strings versus lists

<div>
<img src="attachment:Mutation.png" width="600"/>
</div>

### Difference 1: lists are flexible

In [20]:
# String sequence can only contain characters as items
# List sequence can contain anything as an item

l = ["hello", 14, True, 5.678, None, ["list", "inside", "a", "list"]]

# TODO: fix the bug in this loop
for i in range(len(l)):
    print(l[i])
    
# TODO: use indexing to extract and print the inner list
print(l[-1])

# TODO: print type of last item in l
print(type(l[-1]))

# TODO: use double indexing to extract "inside"
print(l[-1][1])

hello
14
True
5.678
None
['list', 'inside', 'a', 'list']
['list', 'inside', 'a', 'list']
<class 'list'>
inside


In [21]:
# List of lists usecase example

game_grid = [
[".", ".", ".", ".", ".", "S"],
[".", "S", "S", "S", ".", "S"],
[".", ".", ".", ".", ".", "S"],
[".", ".", ".", ".", ".", "."],
[".", ".", ".", ".", "S", "."],
[".", ".", ".", ".", "S", "."]
]

for row in game_grid:
    for position in row:
        print(position, end = "")
    print()

.....S
.SSS.S
.....S
......
....S.
....S.
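
Because `game_grid` is a list of lists, double indexing reads a single cell: the first index picks the row and the second picks the position within that row. The sketch below repeats the grid so it can run on its own, and counts the `"S"` cells with the same nested loop pattern; `ship_count` is a name introduced here purely for illustration.

```python
game_grid = [
    [".", ".", ".", ".", ".", "S"],
    [".", "S", "S", "S", ".", "S"],
    [".", ".", ".", ".", ".", "S"],
    [".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", "S", "."],
    [".", ".", ".", ".", "S", "."],
]

# First index selects the row (an inner list), second selects the item in it.
cell = game_grid[1][2]
print(cell)

# Count every "S" by visiting each position in each row.
ship_count = 0
for row in game_grid:
    for position in row:
        if position == "S":
            ship_count += 1
print(ship_count)
```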


### Difference 2: lists are mutable

<div>
<img src="attachment:Mutability.png" width="600"/>
</div>

- Mutability has nothing to do with variable assignments / re-assignments
- Mutability has to do with changing values inside a sequence

In [22]:
# Variables can always be re-assigned

s = "AB"
s = "CD"
s += "E"
print(s)

nums = [1, 2]
nums = [3, 4]
print(nums)

CDE
[3, 4]


In [23]:
name = "Andrew"

# TODO: let's try to change "Andrew" to "Andrea"
name[-1] = "a" 

# doesn't work because strings are immutable: you cannot change a value inside an existing string

TypeError: 'str' object does not support item assignment

In [24]:
nums = [2, 2, 9]

# TODO: change 9 to 0
nums[2] = 0
print(nums)

# works because lists are mutable: you can change a value inside an existing list

[2, 2, 0]
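
The difference between re-assignment and mutation is easier to see when two names refer to the same list. The following is a small sketch rather than part of the lecture code; `a` and `b` are placeholder names.

```python
a = [1, 2]
b = a          # b refers to the same list object as a

a[0] = 99      # mutation: changes the list itself
print(b)       # the change is visible through b as well

a = [3, 4]     # re-assignment: a now refers to a different list
print(b)       # b still refers to the original (mutated) list
```

After the re-assignment, `a` and `b` no longer refer to the same list, which is why the last `print` is unaffected.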


### Mutating a list

- update using index (works only for existing index)
- `append` method
- `extend` method
- `pop` method
- `sort` method

Unlike string methods, the list methods above mutate the original list in place. String methods return a new string because strings are immutable.
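
A small sketch of that contrast, using `upper` and `sort` as representative methods (the variable names are illustrative):

```python
word = "badgers"
shout = word.upper()   # returns a new string; word is unchanged
print(word, shout)

nums = [3, 1, 2]
result = nums.sort()   # sorts nums in place and returns None
print(nums)
print(result)
```

A common slip is writing `nums = nums.sort()`, which replaces the list with `None`.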

In [1]:
grocery_list = ["apples", "milk", "broccoli", "spinach", "oranges"] 

# TODO: try to add "blueberries" at index 5
grocery_list[5] = "blueberries" 
# doesn't work because grocery_list does not yet have an item at index 5

IndexError: list assignment index out of range

So, how do we add a new item to a list?

In [2]:
grocery_list.append("blueberries")
print(grocery_list)

['apples', 'milk', 'broccoli', 'spinach', 'oranges', 'blueberries']


Can we add multiple items using `append`?

In [3]:
grocery_list.append("peanut butter", "jelly", "bread")  # doesn't work because append only accepts a single argument

TypeError: list.append() takes exactly one argument (3 given)

In [4]:
grocery_list.append(["peanut butter", "jelly", "bread"])  # adds the whole list as a single item
print(grocery_list)

['apples', 'milk', 'broccoli', 'spinach', 'oranges', 'blueberries', ['peanut butter', 'jelly', 'bread']]


The `extend` method adds every item of the argument list to the original list as an individual item.

In [5]:
grocery_list.extend(["falafel", "pita bread", "hummus"])
print(grocery_list)

['apples', 'milk', 'broccoli', 'spinach', 'oranges', 'blueberries', ['peanut butter', 'jelly', 'bread'], 'falafel', 'pita bread', 'hummus']


`pop` enables us to remove an item from the list:

- by default, `pop()` removes the last item from the list
- you can override that by passing an index as an argument
- `pop` removes the item from the original list and also returns it

In [8]:
grocery_list.pop()

'hummus'

In [31]:
grocery_list

['apples',
 'milk',
 'broccoli',
 'spinach',
 'oranges',
 'blueberries',
 ['peanut butter', 'jelly', 'bread'],
 'falafel',
 'pita bread']

In [32]:
# TODO: remove "oranges" from grocery_list and store it into a variable
some_fruit = grocery_list.pop(4)
print(some_fruit)
print(grocery_list)

oranges
['apples', 'milk', 'broccoli', 'spinach', 'blueberries', ['peanut butter', 'jelly', 'bread'], 'falafel', 'pita bread']


The `sort` method sorts the list in place in ascending order: numbers by value, strings alphabetically (compared character by character).

TODO: Try sorting each of the following lists.

In [33]:
L = [3, 9, 2, 0, 34, 90] # TODO: initialize list with some unordered numbers
L.sort()
L

[0, 2, 3, 9, 34, 90]

In [34]:
L = ["hi", "apple", "hello", "world", "orange"] # TODO: initialize list with some unordered strings
L.sort()
L

['apple', 'hello', 'hi', 'orange', 'world']

In [35]:
L = [False, True, False, True, True]
L.sort()
L

[False, False, True, True, True]

In [36]:
L = ["str", 1, 2.0, False]
L.sort() # Doesn't work: a str cannot be compared with numbers!

TypeError: '<' not supported between instances of 'int' and 'str'
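
Two details of the ordering are worth a quick sketch. String comparison is case-sensitive, so uppercase letters sort before lowercase ones, and numeric types (`int`, `float`, `bool`) can be compared with each other, so a list mixing only those still sorts. The lists below are made up for illustration.

```python
words = ["banana", "Zebra", "apple"]
words.sort()
print(words)            # uppercase "Z" sorts before lowercase "a"

mixed_numbers = [3, 2.5, True, 0]
mixed_numbers.sort()    # True behaves like 1 in comparisons
print(mixed_numbers)
```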

The `split` method splits a string into a list of strings, using a separator (the argument)
- Syntax: `some_string.split(separator_string)`

In [9]:
sentence = "a,quick,brown,fox"
words = sentence.split(",")
words

['a', 'quick', 'brown', 'fox']
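
The choice of separator matters when the text contains repeated spaces. As a short sketch (the string `spaced` is invented for this example): splitting on `" "` produces an empty string between two adjacent spaces, while calling `split()` with no argument splits on any run of whitespace.

```python
spaced = "a  quick brown fox"   # note the two spaces after "a"

by_space = spaced.split(" ")    # every single space is a separator
by_whitespace = spaced.split()  # runs of whitespace count as one separator

print(by_space)
print(by_whitespace)
```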

The `join` method joins a list of strings into a single string using a separator (the string the method is called on)
- Syntax: `separator_string.join(some_list)`

In [11]:
characters = ["M", "SS", "SS", "PP", ""]
place = "I".join(characters)
place

# TODO: remove the last item in characters list and see what place you get (re-run cell)

'MISSISSIPPI'
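
`join` and `split` undo each other when they use the same separator, and `join` only accepts strings: calling it on a list of numbers raises `TypeError`. The sketch below shows one way to handle that, converting each number with `str` first; the variable names are illustrative.

```python
words = ["a", "quick", "brown", "fox"]
sentence = ",".join(words)
print(sentence)
print(sentence.split(",") == words)   # splitting gives back the same list

nums = [11, 22, 33]
num_strings = []
for num in nums:
    num_strings.append(str(num))      # join needs strings, so convert first
joined_nums = "-".join(num_strings)
print(joined_nums)
```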

### List all Engineering majors (primary major) among current lecture (example: LEC001) students

In [39]:
engineering_major_list = []

for idx in range(project.count()):
    lecture = project.get_lecture(idx)
    major = project.get_major(idx)
    
    if lecture == "LEC001":
        if major.startswith("Engineering"):
            engineering_major_list.append(major)
            
engineering_major_list

['Engineering: Biomedical',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Industrial',
 'Engineering: Other',
 'Engineering: Industrial',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Biomedical',
 'Engineering: Mechanical',
 'Engineering: Other',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Industrial',
 'Engineering: Industrial',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Biomedical',
 'Engineering: Mechanical',
 'Engineering: Industrial',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Biomedical',
 'Engineering: Biomedical',
 'Engineering: Biomedical',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Biomedical',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Mechanical',
 'Engineering: Biomedical',
 '

### Profanity filtering

In [40]:
bad_words = ["omg", "midterm", "exam"]

def censor(input_string):
    """
    replaces every bad word in input string with that word's first
    letter and then * for the remaining characters
    """
    # TODO: use split to extract every word 
    words = input_string.split(" ")

    # Iterate over every word: 1. check if word is in bad_words 2. compute replacement 3. replace word
    for index in range(len(words)):
        curr_word = words[index]
        if curr_word.lower() in bad_words:
            words[index] = curr_word[0] + "*" * (len(curr_word) - 1)
    
    # TODO: join the words back using the correct separator and return the joined string
    return " ".join(words)
    
censor("omg the midterm was so awesome!")

'o** the m****** was so awesome!'
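
One limitation of splitting on spaces is that punctuation stays attached to the word, so `"exam!"` is not found in `bad_words`. The following is a hedged sketch of one possible workaround, not the lecture's solution: `censor_with_punctuation` is a hypothetical variant that sets trailing punctuation aside before the lookup and puts it back afterwards.

```python
bad_words = ["omg", "midterm", "exam"]

def censor_with_punctuation(input_string):
    """Hypothetical variant of censor that ignores trailing punctuation."""
    words = input_string.split(" ")
    for index in range(len(words)):
        curr_word = words[index]
        core = curr_word.rstrip(".,!?")   # word without trailing punctuation
        trailing = curr_word[len(core):]  # the punctuation that was stripped
        if core.lower() in bad_words:
            words[index] = core[0] + "*" * (len(core) - 1) + trailing
    return " ".join(words)

censored = censor_with_punctuation("omg, that exam!")
print(censored)
```

Only the four punctuation characters passed to `rstrip` are handled here; leading punctuation and other symbols would need similar treatment.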

### Wordle (self-study example)

In [41]:
def get_wordle_results(guess):
    wordle_result = ""
    for i in range(len(guess)):
        if guess[i] == word_of_the_day[i]:
            wordle_result += "O"
        elif guess[i] in word_of_the_day:
            wordle_result += "_"
        else:
            wordle_result += "X"
    return wordle_result

max_num_guesses = 6
current_num_guesses = 1
word_of_the_day = "CRANE"

print("Welcome to PyWordle!")
print("You have 6 guesses to guess a 5 character word.")
print("X\tThe letter is not in the word.")
print("_\tThe letter is in the word, but in the wrong place.")
print("O\tThe letter is in the correct place!")

while current_num_guesses <= max_num_guesses:
    guess = input("Guess the word: ")
    guess = guess.upper()

    wordle_results = get_wordle_results(guess)
    print("{}\t{}".format(guess, wordle_results))
    if guess == word_of_the_day:
        break
    current_num_guesses += 1
    
if current_num_guesses > max_num_guesses:
    print("Better luck next time!")
    print("The word was: {}".format(word_of_the_day))
else:
    print("You won in {} guesses!".format(current_num_guesses))

Welcome to PyWordle!
You have 6 guesses to guess a 5 character word.
X	The letter is not in the word.
_	The letter is in the word, but in the wrong place.
O	The letter is in the correct place!
Guess the word: rance
RANCE	____O
Guess the word: nacre
NACRE	____O
Guess the word: crane
CRANE	OOOOO
You won in 3 guesses!
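
`get_wordle_results` indexes `word_of_the_day` once for every character of the guess, so a guess longer than the word raises `IndexError`, and a shorter guess is scored without complaint. A minimal sketch of a check that could run before scoring; `is_valid_guess` is a hypothetical helper, not part of the original game.

```python
word_of_the_day = "CRANE"

def is_valid_guess(guess):
    """Hypothetical helper: True if guess is all letters and the right length."""
    return len(guess) == len(word_of_the_day) and guess.isalpha()

print(is_valid_guess("CRANE"))
print(is_valid_guess("CRANES"))
print(is_valid_guess("CR4NE"))
```

In the game loop, an invalid guess could be rejected with a message and `continue`, so that it does not count against the player's guesses.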
