<font color='blue'> First of all, please “Copy to Drive” to get your own copy for editing. </font>

<font color='red'> Run all the cells. For places marked "Complete the code below", replace the "XXX" placeholder with your own code.</font>

# Ch 3.1 Data Structures and Sequences

Python’s data structures are simple but powerful. We start with tuples, lists, and dictionaries, which are among the most frequently used built-in data structures.

## Built-In Sequence Functions

Python has a handful of useful sequence functions that you should familiarize yourself with and use at any opportunity.

### <font color='blue'> **enumerate( )** </font>

The enumerate function returns an iterator that yields **(i, value) tuples**.

It is common when iterating over a sequence to want to keep track of the index of the current item. A do-it-yourself approach would look like:


```
index = 0
for value in collection:
   # do something with value
   index += 1
```



In [1]:
index = 0
for value in [3, 4, 'hello', '!!!']:
   print(f"{index}. {value}")
   index += 1

0. 3
1. 4
2. hello
3. !!!


Because this pattern is so common, Python has a built-in function, enumerate, which yields **(i, value) tuples**:



```
for index, value in enumerate(collection):
   # do something with value
```



In [2]:
mylist = [3, 4, 'hello', '!!!']
for i, item in enumerate(mylist):
    print(f"{i}. {item}")

0. 3
1. 4
2. hello
3. !!!


In [3]:
# a list of lists

movie_list1 = [["Monty Python and the Holy Grail", 1975],
               ["On the Waterfront", 1954],
               ["Cat on a Hot Tin Roof", 1958]]

# using the enumerate function, counting from 1 instead of 0
for i, movie in enumerate(movie_list1, start=1):
    print(f"{i}. {movie[0]} ({movie[1]})")
print()

1. Monty Python and the Holy Grail (1975)
2. On the Waterfront (1954)
3. Cat on a Hot Tin Roof (1958)
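
As a small aside, `enumerate` does not build its pairs up front: it returns a lazy iterator, so wrapping it in `list` is a quick way to inspect what it yields. The sketch below uses a hypothetical `colors` list that is not part of the examples above:

```python
# `colors` is an illustrative list, not part of the notebook's data
colors = ["red", "green", "blue"]

pairs = enumerate(colors, start=1)   # an enumerate object (a lazy iterator)
pairs_list = list(pairs)             # materialize the (i, value) tuples
print(pairs_list)                    # [(1, 'red'), (2, 'green'), (3, 'blue')]

# The iterator is now exhausted, so a second pass yields nothing
leftover = list(pairs)
print(leftover)                      # []
```

If the pairs are needed more than once, it is usually simpler to keep the materialized list or to call `enumerate` again.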



### <font color='blue'> **sorted( )** </font>

The sorted function **returns a new sorted list** from the elements of any sequence:

In [4]:
sorted([7, 1, 2, 6, 0, 3, 2])    # [0, 1, 2, 2, 3, 6, 7]

[0, 1, 2, 2, 3, 6, 7]

In [5]:
sorted("horse race")  # [' ', 'a', 'c', 'e', 'e', 'h', 'o', 'r', 'r', 's']

[' ', 'a', 'c', 'e', 'e', 'h', 'o', 'r', 'r', 's']
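
`sorted` also accepts the optional `key` and `reverse` arguments, and it leaves the original sequence untouched. A minimal sketch, using a hypothetical `words` list for the `key` example:

```python
numbers = [7, 1, 2, 6, 0, 3, 2]

# reverse=True sorts in descending order
descending = sorted(numbers, reverse=True)
print(descending)   # [7, 6, 3, 2, 2, 1, 0]
print(numbers)      # [7, 1, 2, 6, 0, 3, 2]  (the original list is unchanged)

# key=len sorts by the length of each string (`words` is illustrative)
words = ["python", "a", "dove", "as", "bat"]
by_length = sorted(words, key=len)
print(by_length)    # ['a', 'as', 'bat', 'dove', 'python']
```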

### <font color='blue'> **zip( )** </font>

zip “pairs” up the elements of a number of lists, tuples, or other sequences to **create an iterator of tuples**, which you can pass to `list` to see all the pairs at once:

In [6]:
seq1 = ["foo", "bar", "baz"]
seq2 = ["one", "two", "three"]

zipped = zip(seq1, seq2)
list(zipped)

[('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

zip can take an arbitrary number of sequences, and the number of elements it produces is determined by the shortest sequence:

In [7]:
seq3 = [False, True]

list(zip(seq1, seq2, seq3))  #[('foo', 'one', False), ('bar', 'two', True)]

[('foo', 'one', False), ('bar', 'two', True)]
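
If dropping the unmatched elements is not what you want, one option is `itertools.zip_longest` from the standard library, which pads the shorter sequences instead. A short sketch, reusing `seq1` and `seq3` from above; the `"missing"` fill value is an arbitrary choice for illustration:

```python
from itertools import zip_longest

seq1 = ["foo", "bar", "baz"]
seq3 = [False, True]

# By default, missing positions are filled with None
padded = list(zip_longest(seq1, seq3))
print(padded)          # [('foo', False), ('bar', True), ('baz', None)]

# fillvalue replaces None with a value of your choosing
padded_custom = list(zip_longest(seq1, seq3, fillvalue="missing"))
print(padded_custom)   # [('foo', False), ('bar', True), ('baz', 'missing')]
```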

A common use of zip is simultaneously iterating over multiple sequences, possibly also combined with enumerate:

In [8]:
for index, (a, b) in enumerate(zip(seq1, seq2)):
  print(f"{index}: {a}, {b}")

0: foo, one
1: bar, two
2: baz, three


In [9]:
# If you have two sequences that you want to pair up element-wise in a dictionary, you might try to do this:
mapping = {}
key_list = [1, 2, 3]
value_list = ['a', 'b', 'c']

for key, value in zip(key_list, value_list):
    mapping[key] = value

mapping

{1: 'a', 2: 'b', 3: 'c'}

In [10]:
keys = [1, 2, 3]
values = ['a', 'b', 'c']

# Pair up the keys and values (zip returns an iterator of tuples)
tuples = zip(keys, values)

# Create a dictionary from the (key, value) pairs
mapping = dict(tuples)

mapping

{1: 'a', 2: 'b', 3: 'c'}
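
Going the other way, `zip` combined with the `*` unpacking operator can "unzip" a list of pairs back into separate sequences. A minimal sketch, where `pairs` is a hypothetical list mirroring the key/value data above:

```python
# `pairs` is illustrative data shaped like the (key, value) tuples above
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]

# zip(*pairs) is the same as zip((1, 'a'), (2, 'b'), (3, 'c'))
keys, values = zip(*pairs)
print(keys)     # (1, 2, 3)
print(values)   # ('a', 'b', 'c')
```

Note that the results are tuples, not lists; wrap them in `list` if you need to modify them.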

### <font color='blue'> **reversed( )** </font>

reversed iterates over the elements of a sequence in reverse order:

In [11]:
list(reversed(range(10)))  # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
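
Like `enumerate` and `zip`, `reversed` returns a lazy iterator rather than a list, and it does not modify the original sequence. A small sketch with a hypothetical `letters` list:

```python
# `letters` is an illustrative list, not part of the notebook's data
letters = ["a", "b", "c"]

rev = reversed(letters)        # a lazy iterator, nothing is copied yet
rev_list = list(rev)           # materialize the reversed elements
print(rev_list)                # ['c', 'b', 'a']
print(letters)                 # ['a', 'b', 'c']  (original is unchanged)

# reversed also works on strings; join the characters back together
backwards = "".join(reversed("horse"))
print(backwards)               # esroh
```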

## List, Set, and Dictionary Comprehensions

### List Comprehensions

List comprehensions allow you to concisely form a new list by filtering the elements of a collection and transforming the elements that pass the filter, all in one concise expression. Consider the following `for` loop:

    result = []
    for value in collection:
        if condition:
            result.append(expr)

In [12]:
strings = ["a", "as", "bat", "car", "dove", "python"]
result = []

for word in strings:
    if len(word) > 2:
        result.append(word.upper())

result

['BAT', 'CAR', 'DOVE', 'PYTHON']

Equivalently, a list comprehension takes the following form:

    [expr for value in collection if condition]

In [13]:
strings = ["a", "as", "bat", "car", "dove", "python"]
wordslist = [item.upper() for item in strings if len(item) > 2]
wordslist

['BAT', 'CAR', 'DOVE', 'PYTHON']
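
The filter condition is optional, and the expression itself can contain a conditional. The sketch below shows both variations on the same `strings` list; the names `lengths` and `labels` are introduced here only for illustration:

```python
strings = ["a", "as", "bat", "car", "dove", "python"]

# No filter: every element is transformed
lengths = [len(s) for s in strings]
print(lengths)   # [1, 2, 3, 3, 4, 6]

# Conditional expression: every element is kept, but only long words are uppercased
labels = [s.upper() if len(s) > 2 else s for s in strings]
print(labels)    # ['a', 'as', 'BAT', 'CAR', 'DOVE', 'PYTHON']
```

The difference is where the `if` sits: a trailing `if` drops elements, while `x if cond else y` in the expression position chooses what to emit for each element.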

### Dictionary and Set Comprehensions



Set and dictionary comprehensions are a natural extension, producing sets and dictionaries in an idiomatically similar way instead of lists.

    dict_comp = {key-expr: value-expr for value in collection if condition}   
    
    set_comp = {expr for value in collection if condition}

Consider the list of strings from before. Suppose we wanted a set containing just the lengths of the strings contained in the collection; we could easily compute this using a set comprehension:

In [14]:
strings = ["a", "as", "bat", "car", "dove", "python"]

unique_lengths = {len(x) for x in strings}
unique_lengths

{1, 2, 3, 4, 6}

In [15]:
set(map(len, strings)) # We could also use the map function

{1, 2, 3, 4, 6}

As a simple dictionary comprehension example, we could create a lookup map of these strings for their locations in the list:

In [16]:
loc_mapping = {value: index for index, value in enumerate(strings)}
loc_mapping

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}
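
A dictionary comprehension can also carry a filter, just like a list comprehension. As a sketch, the hypothetical `long_word_lengths` below maps each string longer than two characters to its length:

```python
strings = ["a", "as", "bat", "car", "dove", "python"]

# key-expr is the string, value-expr is its length, and the condition filters short words
long_word_lengths = {s: len(s) for s in strings if len(s) > 2}
print(long_word_lengths)   # {'bat': 3, 'car': 3, 'dove': 4, 'python': 6}
```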

### Nested list comprehensions

Suppose we have a list of lists containing some English and Spanish names:

In [17]:
all_data = [["John", "Emily", "Michael", "Mary", "Steven"],
            ["Maria", "Juan", "Javier", "Natalia", "Pilar"]]

Suppose we wanted to get a single list containing all names with two or more a’s in them. We could certainly do this with a simple `for` loop:

In [18]:
names_of_interest = []

for names in all_data:
    enough_as = [name for name in names if name.count("a") >= 2]
    names_of_interest.extend(enough_as)

names_of_interest

['Maria', 'Natalia']

You can actually wrap this whole operation up in a single nested list comprehension, which will look like:

In [19]:
result = [name for names in all_data for name in names if name.count("a") >= 2]

result

['Maria', 'Natalia']

Reading the comprehension piece by piece:

*  **for names in all_data**: This is the outer loop, iterating over each sublist of names in the all_data list.

*  **for name in names**: This is the inner loop, iterating over each individual name in the current sublist.

*  **if name.count("a") >= 2**: This is a condition that checks if the count of the letter 'a' in the current name is greater than or equal to 2.

*  **name**: This is the expression that gets included in the new list if the condition is met.
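
One way to read a nested comprehension is to write out the fully expanded loops: the `for` clauses appear in the same order, top to bottom, as they do left to right in the comprehension. A sketch of that expansion for the names example:

```python
all_data = [["John", "Emily", "Michael", "Mary", "Steven"],
            ["Maria", "Juan", "Javier", "Natalia", "Pilar"]]

result = []
for names in all_data:                 # outer loop (first `for` clause)
    for name in names:                 # inner loop (second `for` clause)
        if name.count("a") >= 2:       # condition (note: count is case-sensitive)
            result.append(name)        # expression at the front of the comprehension

print(result)   # ['Maria', 'Natalia']
```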

Another example:

In [20]:
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
flattened = []

for tup in some_tuples:
    for x in tup:
        flattened.append(x)

flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]

Equivalently:

In [21]:
flattened = [x for tup in some_tuples for x in tup]
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]

*  **for tup in some_tuples**: This is the outer loop that iterates over each tuple (tup) in the list of tuples (some_tuples).

*  **for x in tup**: This is the inner loop that iterates over each element (x) within the current tuple.

*  **x**: This is the expression that gets included in the new list (flattened). In this case, it represents each individual element in each tuple.

It’s important to distinguish the syntax just shown from a list comprehension inside a list comprehension, which is also perfectly valid:

In [22]:
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
[[x for x in tup] for tup in some_tuples]  # This produces a list of lists

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

**Outer Loop (for tup in some_tuples)**:
*  The outer loop iterates over each tuple (tup) in the some_tuples list.

**Inner Loop (for x in tup)**:
*  The inner loop iterates over each element (x) within the current tuple.

**Expression (x)**:
*  The expression represents each individual element in the tuple.

**List Comprehension**:
*  The inner list comprehension `[x for x in tup]` creates a list of elements by iterating through each element in the tuple.

**Outer List**:
*  The outer list comprehension `[[x for x in tup] for tup in some_tuples]` creates a list of lists.
*  Each inner list is generated by the inner comprehension for a specific tuple.

<font color='red'>Complete the code in the cell below. Replace the "XXX" placeholder with your own code. </font>

In [23]:
words = ["ap1p4le", "ba5na8na", "or9an0ge", "g3rap6e", "s1tra1wberr2y"]

filtered_words_1 = []
for word in words: #  Outer loop
    if 'a' in word and len(word) <= 7: # Outer Condition

        letters = []
        for letter in word:  # Inner loop
            if letter.isalpha(): # Inner condition
                letters.append(letter)  # Inner expression: letter

        filtered_words_1.append(letters)

In [24]:
filtered_words_1

[['a', 'p', 'p', 'l', 'e'], ['g', 'r', 'a', 'p', 'e']]

In [25]:
# Example data
words = ["ap1p4le", "ba5na8na", "or9an0ge", "g3rap6e", "s1tra1wberr2y"]

filtered_words_1 = []
for word in words: #  Outer loop
    if 'a' in word and len(word) <= 7: # Outer Condition

        letters = []
        for letter in word:  # Inner loop
            if letter.isalpha(): # Inner condition
                letters.append(letter)  # Inner expression: letter

        filtered_words_1.append(letters)


# Make an equivalent list comprehension
filtered_words_2 = [[letter for letter in word if letter.isalpha()] for word in words if "a" in word and len(word) <= 7]
                    #[[Inner expression  Inner loop  Inner condition] Outer loop   Outer condition]
# Print the result
print(filtered_words_1)
print(filtered_words_2)

[['a', 'p', 'p', 'l', 'e'], ['g', 'r', 'a', 'p', 'e']]
[['a', 'p', 'p', 'l', 'e'], ['g', 'r', 'a', 'p', 'e']]


Example of Word Count:

In [26]:
def build_dictionary(words):
    # The frequencies dictionary will be built with your code below.
    # Each key is a word string and the corresponding value is an integer
    # indicating that word's frequency.

    frequencies = {}
    for word in words:
        if word in frequencies:
            frequencies[word] += 1
        else:
            frequencies[word] = 1
    return frequencies

In [27]:
lyrics = '''I I I music, IT IT IT Tuesday night.
I I music she she she.
skirts. I wear T-shirts.'''

In [28]:
build_dictionary(lyrics.split())

{'I': 6,
 'music,': 1,
 'IT': 3,
 'Tuesday': 1,
 'night.': 1,
 'music': 1,
 'she': 2,
 'she.': 1,
 'skirts.': 1,
 'wear': 1,
 'T-shirts.': 1}

<font color='red'>Complete the code in the cell below. Replace the "XXX" placeholder with your own code. </font>

In [29]:
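# Rewrite the function with a dictionary comprehension to get the same counts
def build_dict(words):
    # Dictionary comprehension over the unique words; iterating over a set
    # means the key order is arbitrary and may differ from build_dictionary
    freq = {word: words.count(word) for word in set(words)}
    return freq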

build_dict(lyrics.split())

{'music,': 1,
 'T-shirts.': 1,
 'Tuesday': 1,
 'she': 2,
 'IT': 3,
 'she.': 1,
 'music': 1,
 'I': 6,
 'night.': 1,
 'skirts.': 1,
 'wear': 1}

In [30]:
def build_dict(words):
    # Use a dictionary comprehension. Iterating over words (rather than set(words))
    # keeps the keys in order of first appearance, because dicts preserve insertion order.
    freq = {word: words.count(word) for word in words}
    return freq

build_dict(lyrics.split())

{'I': 6,
 'music,': 1,
 'IT': 3,
 'Tuesday': 1,
 'night.': 1,
 'music': 1,
 'she': 2,
 'she.': 1,
 'skirts.': 1,
 'wear': 1,
 'T-shirts.': 1}
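
The comprehension versions call `words.count` once per word, which rescans the whole list each time. For longer texts, one alternative is `collections.Counter` from the standard library, which tallies everything in a single pass. A minimal sketch on the same lyrics; the name `counts` is introduced here for illustration:

```python
from collections import Counter

lyrics = '''I I I music, IT IT IT Tuesday night.
I I music she she she.
skirts. I wear T-shirts.'''

# Counter is a dict subclass that maps each item to the number of times it appears
counts = Counter(lyrics.split())

print(counts["I"])     # 6
print(counts["IT"])    # 3
print(counts["she"])   # 2  ('she.' with the period is counted separately)

# Convert to a plain dict if you need the same type as before
freq = dict(counts)
```

As in the earlier versions, punctuation stays attached to the words because `split` only breaks on whitespace.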