# Lecture 9 - Tuples, Lists and Dictionaries (https://bit.ly/intro_python_09)

Today:
* Tuples (recap many things introduced with respect to strings) 
* Lists 
* List comprehensions
* More Dictionaries
* Sets

# Tuples

Tuples are immutable (can't be changed) sequences. Tuples are *everywhere* in Python, and are often defined somewhat implicitly.

A lot of this is going to feel like revision from covering strings, because Python is very consistent in handling sequences.

In [15]:
# A tuple 

x = ("Julia", "Roberts", 1967) # I'm using the examples from the open book, btw

type(x)

tuple

Tuples are written like lists, except you switch square [] brackets for rounded brackets (), aka parentheses. 

Like lists, they can have arbitrary length.

The rounded brackets are actually optional (but we'll mostly include them for clarity):

In [3]:
x = "Julia", "Roberts", 1967

type(x)

tuple

In [4]:
x = ("Julia", "Roberts", 1967) # I'm using the examples from the open book, btw

x[0] # You can address the members of a tuple using indices, like in
# strings and lists

'Julia'

In [5]:
x[1]

'Roberts'

In [14]:
# A tuple of length one is specified like this

x = (1,)

type(x)

tuple

In [20]:
# or just

x = 1,

type(x)

tuple

In [21]:
# Note x is not a tuple here, as the interpreter just simplifies out the brackets

x = (1)

type(x)

int

In [3]:
x = () ## This is a tuple of length 0. Writing (,) or just ',' does not work - both are syntax errors

type(x)

tuple

In [22]:
# Slicing works just like with strings:

x = ("a", "sequence", "of", "strings")

x[1:] # If you don't get this, go back to the previous lecture on strings. All the
# same slicing things you can do with strings work with tuples

('sequence', 'of', 'strings')

**Immutable**

The key difference between lists and tuples is that tuples are immutable - they can't be edited.

In [23]:
# Like strings, tuples can't be edited

x = ("a", "sequence", "of", "strings")

x[0] = "the"

TypeError: 'tuple' object does not support item assignment

In [3]:
# To make edited tuples from existing tuples you therefore slice and dice
# them, using the '+' operator, which (as with strings), represents concatenation:

x = ("a", "sequence", "of", "strings")

("the",) + x[1:] # This produces the effect of replacing the first member of x, 
# creating a new tuple



('the', 'sequence', 'of', 'strings')

Again, like with strings, having tuples be immutable is a design choice that
makes them easy to share between different parts of a program without
worry that they will change. 
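
To make the sharing point concrete, here is a small sketch contrasting a list and a tuple handed to the same function. The function name `add_default_tag` and the data are made up for illustration.

```python
def add_default_tag(tags):
    """ Try to add a tag in place; this only works on mutable sequences """
    tags.append("untagged")

shared_list = ["a", "b"]
add_default_tag(shared_list)  # The caller's list is changed by the function
print(shared_list)            # ['a', 'b', 'untagged']

shared_tuple = ("a", "b")
try:
    add_default_tag(shared_tuple)
except AttributeError:        # Tuples have no append method
    print("The tuple is unchanged:", shared_tuple)
```

With the list, the caller has to trust that the function won't modify it; with the tuple, Python enforces that.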

**Length**

In [25]:
julia = ("Julia", "Roberts", 1967, "Duplicity", 2009, "Actress", "Atlanta, Georgia")

len(julia) # The length of the tuple

7

**In operator revisited**

In [26]:
# As with strings, we can search a tuple using the in operator:

5 in (1, 2, 3, 4, 5, 6)

True

In [27]:
# And its negation (not in):

5 not in (1, 2, 3, 4, 5, 6)

False

**Tuple assignment**

You can unpack the values in a tuple into multiple variables on one line (more nice syntactic sugar):

In [5]:
julia = ("Julia", "Roberts", 1967, "Duplicity", 2009, "Actress", "Atlanta, Georgia")

(name, surname, b_year, movie, m_year, profession, b_place) = julia

print(surname)

Roberts


In [16]:
# This allows you to do neat variable value swaps on one line:

x = 10
y = 5

print(x, y)

# Instead of:
z = x
x = y
y = z

print(x, y)

# You can do:

(x, y) = (y, x) # Which is clearer, less code and doesn't make a needless extra
# variable

print(x, y)

10 5
5 10
10 5


In [30]:
# You actually don't even need the brackets, the tuples are implicit:

x, y = 5, 10 # This is the same as (x, y) = (5, 10)

print(x,  y)

x, y = y, x # Here the parentheses are again implicit

print(x,  y)

5 10
10 5


**Multiple return values**

In [3]:
# Tuples allow you to return multiple values from a function

import math # don't worry about this, it's a module import, we'll
# cover this shortly

def f(r):
    """ Return (circumference, area) of a circle of radius r """
    
    c = 2 * math.pi * r # Circumference 
    a = math.pi * r * r # Area 
    
    return (c, a)
  
r = float(input("Enter a radius: "))

c, a = f(r)

print("Radius is", r, "Circumference is", c, " area is ", a)
  

Enter a radius: 5
Radius is 5.0 Circumference is 31.41592653589793  area is  78.53981633974483


**Nested / Composed tuples**

In [32]:
# The values in a tuple can be any legit Python object.
# So you can make complex nested (tree like objects)

julia_more_info = ( ("Julia", "Roberts"), (8, "October", 1967),
                     "Actress", ("Atlanta", "Georgia"),
                     ( ("Duplicity", 2009 ),
                       ("Notting Hill", 1999),
                       ("Pretty Woman", 1990),
                       ("Erin Brockovich", 2000),
                       ("Eat Pray Love", 2010),
                       ("Mona Lisa Smile", 2003),
                       ("Oceans Twelve", 2004) ))

#print(len(julia_more_info[0]))

name, dob, profession, birth_place, bibliography = julia_more_info

print(bibliography)

(('Duplicity', 2009), ('Notting Hill', 1999), ('Pretty Woman', 1990), ('Erin Brockovich', 2000), ('Eat Pray Love', 2010), ('Mona Lisa Smile', 2003), ('Oceans Twelve', 2004))


The above example is somewhat overcomplicated; we'll see better ways
to make complex data like this when we cover Python objects.

**Tuples and Nested Sets**

As a quick aside, nested tuples can represent any nested set of sets, e.g.:

<img src="https://raw.githubusercontent.com/benedictpaten/intro_python/main/lecture_notebooks/figures/graffles/tree%20nested%20sets.jpg" width=600 height=300 />

In [33]:
# A tuple representation 
fruit = ("Fruit", ("Berries", ("Bananas",), ("Strawberries",), ("Raspberries",)), ("Citrus", ("Lemons",), ("Oranges",)), ("Apples",))

print(fruit[1][1])

('Bananas',)


Nested sets are equivalent to "trees", another important type of graph structure used commonly in computer science:

<img src="https://raw.githubusercontent.com/benedictpaten/intro_python/main/lecture_notebooks/figures/graffles/tree%20structure.jpg" width=600 height=300 />

**This kind of nested representation pops up in many places, for example JSON and XML are both similar nested set structures.**
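
As a rough sketch of treating a nested tuple as a tree, the function below walks the fruit tuple and collects the names at the leaves. It uses recursion (a function calling itself), and it assumes every node has the form `(name, child1, child2, ...)`; the function name `leaves` is just for illustration.

```python
fruit = ("Fruit",
         ("Berries", ("Bananas",), ("Strawberries",), ("Raspberries",)),
         ("Citrus", ("Lemons",), ("Oranges",)),
         ("Apples",))

def leaves(tree):
    """ Return a list of the names at the leaves of a nested-tuple tree """
    name = tree[0]       # The first member is the name of this node
    children = tree[1:]  # Any remaining members are child trees
    if len(children) == 0:
        return [name]    # No children, so this node is a leaf
    result = []
    for child in children:
        result.extend(leaves(child))
    return result

print(leaves(fruit))
# ['Bananas', 'Strawberries', 'Raspberries', 'Lemons', 'Oranges', 'Apples']
```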

**Tuple comparison**

In [34]:
# Tuples are by default compared lexicographically: https://en.wikipedia.org/wiki/Lexicographical_order

a = (5, 7)
b = (5, 10)

a < b # Yes, because a[0] == b[0] but a[1] < b[1]

True

In [35]:
# Note this works too, just like word order in a (paper) dictionary: a prefix sorts first

a = (5,)
b = (5, 10)

a < b # Yes, because a[0] == b[0] and a is shorter than b

True

In [8]:
# Tuples can only be compared if their corresponding elements are comparable

a = ("5", 7)
b = (5, 10)

#a = (int(a[0]), a[1])  ## if we want to make a and b comparable we need
# to change them to be comparable by making a new tuple

a < b

TypeError: '<' not supported between instances of 'str' and 'int'

In [9]:
# Comparisons can be nested

a = ((1, 2), 3) 
b = ((1, 3), 3)  

a < b # True because a[0] < b[0], i.e. (1, 2) < (1, 3)


True
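
One place lexicographic comparison is handy is sorting: `sorted()` on a list of tuples orders by the first member, then uses the second member to break ties. A small sketch with made-up (score, name) pairs:

```python
results = [(90, "Bo"), (85, "Cy"), (90, "Al")]

ranked = sorted(results)  # Compares scores first, then names when scores are equal

print(ranked)
# [(85, 'Cy'), (90, 'Al'), (90, 'Bo')]
```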

# Challenge 1

In [19]:
# Write a function min_max that takes a list or tuple and 
# returns the minimum and maximum values in the sequence as a tuple

#### Write your code here

l = (4, 7, 2, 0, 10, 8)
x, y = min_max(l) # should return (0, 10)
print(x, y)

0 10


# Lists (again)

We started covering lists earlier, let's now fill in the missing details.

Given that we've covered strings and tuples, many of the details will feel familiar (and somewhat repetitive), which is why we'll only briefly review them.

**Slicing and length**

In [38]:
x = ["hello", 2.0, 5, [10, 20]] # A list, containing 4 elements. The last 
# element is itself a list with two elements

x[0] # First element

'hello'

In [39]:
x[-1] # Last element

[10, 20]

In [40]:
len(x) # Length of a list

4

In [41]:
x[len(x)-1] # Same as x[-1] (which is clearer)

[10, 20]

In [42]:
x[1:] # Yup, slicing works the same as with strings and tuples. 

# This consistency among basic data types is part of why Python
# is so nice to use. Again, if this isn't clear to you play with the slicing
# examples on strings and try them with lists and tuples - they work the same.

[2.0, 5, [10, 20]]

**Concatenate operator**

In [43]:
[ 1, 2, 3 ] + [ 4, 5 ] # Here + concatenates the two lists, creating a new list, just like 
# with strings and tuples

[1, 2, 3, 4, 5]

**In operator**

In [44]:
# Yup, the "in" operator works on lists the way you'd expect

x = ["hello", 2.0, 5, [10, 20]]

5 in x

True

In [45]:
"monkey" not in x

True

**Comparison**

In [46]:
# This works using the same lexicographic method

x = [1, 2, 3]
y = [2, 5]

x < y

True

In [47]:
# Again, it only works if the items themselves can be compared

x = [1, 2, 3]
y = [1, 2, []]

x < y

TypeError: '<' not supported between instances of 'int' and 'list'

# Lists are mutable

In [48]:
# Lists, unlike strings, ints, floats and tuples, are mutable

x = [1, 2, 3]

x[0] = 3

print(x)

[3, 2, 3]


**Lists can be edited using slices**

In [7]:
# We can use slices to insert, replace and remove elements

x = [1, 2, 3, 4, 5, 6, 7]

x[1:3] = [8, 9] # Replace the second and third elements in the list (2 and 3)
# with 8 and 9

print(x)

[1, 8, 9, 4, 5, 6, 7]


In [50]:
x = [1, 2, 3, 4, 5, 6, 7]

x[1:3] = [] # Replace the second and third elements with an empty list, so removing them

print(x)

[1, 4, 5, 6, 7]


In [8]:
x = [1, 2, 3, 4, 5, 6, 7]

x[1:3] = [8, 9, 10] # Replace the second and third elements with a larger 
# list, so inserting elements

print(x)

[1, 8, 9, 10, 4, 5, 6, 7]


**Methods to add elements to the end of a list**

In [12]:
# Append method lets us add an element to the end of a list

x = [1, 8, 9, 10, 4, 5, 6, 7] 

x.append(5)

print(x)

[1, 8, 9, 10, 4, 5, 6, 7, 5]


In [10]:
# You can add multiple elements to the end of a list using extend()

x.extend([4, 5, 6]) # This takes the input list and adds its elements
# to x in order at the end

print(x)

[1, 8, 9, 10, 4, 5, 6, 7, 5, 4, 5, 6]


**List methods to remove elements**

In [17]:
x = [1, 8, 9, 10, 4, 5, 6, 7, 5, 4, 5, 6, 10]

# Remove

x.remove(10) # Removes the first instance of 10

print(x)


[1, 8, 9, 4, 5, 6, 7, 5, 4, 5, 6, 10]


In [18]:
# You can also use "del"

del x[0] # Remove the first element of the list

print(x)

[8, 9, 4, 5, 6, 7, 5, 4, 5, 6, 10]


In [19]:
del x[1:3] # Can delete ranges (here removing the 2nd and 3rd elements)

print(x)

[8, 5, 6, 7, 5, 4, 5, 6, 10]


In [20]:
# Remove the last element of the list:

x.pop()

print(x)

[8, 5, 6, 7, 5, 4, 5, 6]


**Inserting into a list**

In [57]:
# Insert 

x = [1, 2, 3, 4, 5, 6, 7]

x.insert(2, "boo") # Insert an element at the 3rd position, shifting the existing
# list up

print(x)

[1, 2, 'boo', 3, 4, 5, 6, 7]


**Other useful list functions**

In [58]:
# Python provides lots of useful list functions, non-exhaustively:

x = [1, 2, 3, 4]

# Type conversion from list to tuple
tuple(x)

(1, 2, 3, 4)

In [59]:
x = (1, 2, 3, 4)

# Type conversion from tuple to list 
list(x)

[1, 2, 3, 4]

In [60]:
x = [1, 2, 3, 4]

x.reverse() # Reverses the elements of the list in place (this doesn't make a new
# list, it just reverses the sequence of the elements in x)

print(x)

[4, 3, 2, 1]


**Nesting and Matrices**

In [61]:
# We have seen examples of nested lists/tuples, e.g.

x = [ 1, 2, [ 4, 5 ]]

# A matrix is just a 2D 'list of lists', a special case of this

x = [ [ 1, 2], [3, 4]]

# You can access it using nested array accesses
x[0][1] 

2

Matrices and arrays are important data types for many kinds of scientific and mathematical computing. At the end of the course (time permitting), we'll look at specialized packages, like NumPy, for representing this kind of data more efficiently than "lists of lists", and which can be used for doing math stuff, e.g. linear algebra, machine learning, etc.
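
Until then, a "list of lists" matrix can be handled with nested for loops. The sketch below (with a made-up 2 x 3 matrix) finds the dimensions, sums every entry, and pulls out a row and a column:

```python
matrix = [[1, 2, 3],
          [4, 5, 6]]

n_rows = len(matrix)     # Number of inner lists
n_cols = len(matrix[0])  # Length of the first row (assumes all rows are the same length)

# Visit every entry, row by row
total = 0
for i in range(n_rows):
    for j in range(n_cols):
        total += matrix[i][j]

second_row = matrix[1]   # A row is just one of the inner lists

second_col = []          # A column takes one entry from each row
for row in matrix:
    second_col.append(row[1])

print(n_rows, n_cols, total)   # 2 3 21
print(second_row, second_col)  # [4, 5, 6] [2, 5]
```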

# Challenge 2

In [20]:
x = [1, 2, 3, 4, 5, 6, 7]

# Use a list slice operation to insert the sequence 8, 9, 10
# between the third and fourth element in the list, i.e so that x = [ 1, 2, 3, 8, 9, 10, 4, 5, 6, 7 ]



print(x)

[1, 2, 3, 8, 9, 10, 4, 5, 6, 7]


# List comprehensions

These are a super useful mashup of a for loop, a list and conditionals. I use these *all* the time.

In [16]:
x = [ 1, 2, 3, 4, 5 ]

y = [ i * 2 for i in x ]

## Is the same as:
# y = []
# for i in x:
#     y.append(i*2)

print(y)

[2, 4, 6, 8, 10]


In [None]:
# The basic structure of the simplest form is:
[ EXPRESSION1 for x in ITERABLE ]

# it is equivalent to writing:
# l = []
# for x in ITERABLE:
#   l.append(EXPRESSION1)

Such expressions are really useful for systematically modifying the elements in a list. Here are a couple more examples:

In [64]:
# Prepend a prefix string to each string in a list:

x = [ "essay.doc", "draft.xls", "funny.jpg"] # Suppose we have a list of file names

y = [ "/my_dir/" + i for i in x ]

print(y)

['/my_dir/essay.doc', '/my_dir/draft.xls', '/my_dir/funny.jpg']


In [65]:
import random # this is a module for making random numbers, etc.

y = [ "head" if random.random() > 0.5 else "tail" for i in range(10) ]

print(y) # A sequence of random coin tosses

['tail', 'head', 'tail', 'tail', 'tail', 'tail', 'head', 'head', 'head', 'tail']


The syntax also allows for a conditional in a list comprehension:

In [66]:
x = [ "a", 1, 2, "list"]

l = [ i for i in x if type(i) == str ] # Makes a new list, l, containing only the strings
# in x

print(l)

['a', 'list']


In [None]:
# The basic structure is:
[ EXPRESSION1 for x in ITERABLE (optionally) if EXPRESSION2 ]

# it is equivalent to writing:
# l = []
# for x in ITERABLE:
#   if EXPRESSION2:
#       l.append(EXPRESSION1)

You can always write this out as an explicit loop, but these shorthands are more succinct.
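
As a small illustration of the full structure, here the expression transforms each element and the condition filters at the same time. The word list is made up for the example.

```python
words = ["tuple", "of", "many", "a", "strings"]

# EXPRESSION1 is len(w), EXPRESSION2 is len(w) > 2
long_lengths = [ len(w) for w in words if len(w) > 2 ]

# The equivalent loop:
loop_lengths = []
for w in words:
    if len(w) > 2:
        loop_lengths.append(len(w))

print(long_lengths)  # [5, 4, 7]
print(loop_lengths)  # [5, 4, 7]
```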

Here's one more example:

In [6]:
def how_many_strings(x):
  """ Returns the number of strings in x, where x is a list or equivalent. """
  j = 0
  for i in x:
    if type(i) == str:
      j += 1
  return j

# This can be accomplished equivalently:

def how_many_strings2(x):
  """ Returns the number of strings in x, where x is a list or equivalent. """
  return len([ None for i in x if type(i) == str ])


how_many_strings2([ 1, "a", "string", 6, (7,), [ 5, 6] ])


2

Here's another interesting example:

In [12]:
def flat_list_elements(x):
  """ Returns the flattened version of one level nested list. """
  new_list = []
  for i in x:
    if type(i) == list:
      new_list.extend(i)
    else:
      new_list.append(i)
  return new_list

# This can be accomplished equivalently:

def flat_list_elements2(x):
  """ Returns the flatted version of one level nested list. """
  new_list = []
  [new_list.extend(i) if type(i) == list else new_list.append(i) for i in x] # In this case the list created
  # is a side effect and not used
  return new_list


print(flat_list_elements2([[1], [2, 3], 4, [5, 6, 7]]))


[1, 2, 3, 4, 5, 6, 7]
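
If you'd rather not build a list purely for its side effects, one alternative (not the approach used above) is a comprehension with two `for` clauses, which reads left to right like nested loops. This is only a sketch; the name `flat_list_elements3` is made up here.

```python
def flat_list_elements3(x):
  """ Returns the flattened version of one level nested list. """
  # Wrap non-list items in a one-element list so that every item can be looped over
  return [ j for i in x for j in (i if type(i) == list else [i]) ]

print(flat_list_elements3([[1], [2, 3], 4, [5, 6, 7]]))
# [1, 2, 3, 4, 5, 6, 7]
```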


# Challenge 3

In [21]:
# (Q1) Write a list comprehension to calculate the first 10 square numbers.

print(x)

# (Q2) Write a list comprehension that produces a modified list excluding any odd numbers.

print(y)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 4, 16, 36, 64]


# More Dictionaries

Reminder: Dictionaries are maps from a set of keys to a set of values

* Dictionaries are defined by curly braces {}

* They are a collection of key:value pairs, each pair separated by a comma.

* Values can be any Python object; keys can be any hashable object (in practice, immutable types such as strings, numbers and tuples), e.g: { 1:"hello", "two":True }

* You look up elements in a dictionary using square bracket notation:

In [68]:
english_to_spanish = {"two": "dos", "one": "uno", "three":"tres"}

fruit_costs = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}
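
For example, looking up values in the two dictionaries above with square brackets (a quick sketch; asking for a key that isn't there raises a `KeyError`):

```python
english_to_spanish = {"two": "dos", "one": "uno", "three": "tres"}

fruit_costs = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

print(english_to_spanish["two"])  # dos
print(fruit_costs["pears"])       # 217

try:
    english_to_spanish["four"]    # There is no key "four"
except KeyError:
    print("No entry for 'four'")
```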

As a reminder:

* **No duplicate keys**: You can't have duplicate keys in a dictionary

* **Duplicate values are allowed**

* **Dictionaries are mutable**
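
A quick sketch of those three points, using a throwaway dictionary `d`:

```python
d = {"a": 1, "a": 2}  # The same key written twice: the last value wins
print(d)              # {'a': 2}

d["b"] = 2            # Duplicate values are fine: "a" and "b" both map to 2
print(d)              # {'a': 2, 'b': 2}

d["a"] = 10           # Dictionaries are mutable: this updates the value for "a"
print(d)              # {'a': 10, 'b': 2}
```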

# Why use dictionaries?

* Dictionaries are convenient - it is surprising how often we want to create maps from one set to another.

* Dictionaries are fast: 
    * Looking up a key in a dictionary is, on average, a constant time operation (denoted O(1) - we'll get into this a bit more later). In contrast, checking if something is in a list costs up to N operations, where N is the length of the list (denoted O(N)). 
    * Addition, removal and update operations are all O(1) on dictionaries.
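
You can get a rough feel for the difference by timing membership tests yourself. This is only a sketch: the sizes and repeat counts are arbitrary, and the exact timings depend on your machine, so no output is shown.

```python
import time

n = 100000
numbers_list = list(range(n))
numbers_dict = {}
for i in range(n):
    numbers_dict[i] = True

target = n - 1  # The last element: the worst case for searching the list

start = time.perf_counter()
for _ in range(100):
    found_in_list = target in numbers_list  # Scans the list from the start
list_seconds = time.perf_counter() - start

start = time.perf_counter()
for _ in range(100):
    found_in_dict = target in numbers_dict  # A single hash lookup
dict_seconds = time.perf_counter() - start

print("list:", list_seconds, "seconds; dict:", dict_seconds, "seconds")
```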







# How do dictionaries work?

* In CPython (the standard Python interpreter), dictionaries are implemented as hash tables. Hash tables are a fundamental data structure built around a "hash function".

* A hash function is a function that maps from a set of keys to some finite number of addresses (think a discrete interval of memory)

* In a hash table each address of the codomain of the hash function has an associated bucket

* To look up an element in a hash table, the value of the hash function for a given key is computed and the resulting value is used to find the corresponding bucket where the value is stored. Collisions (where two keys map to the same bucket) are dealt with by comparing entries for equality. 


<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/7/7d/Hash_table_3_1_1_0_1_0_0_SP.svg/1920px-Hash_table_3_1_1_0_1_0_0_SP.svg.png" width=400 height=200 />

* Assuming collisions are rare, hash tables have O(1) performance. 

* For details: https://en.wikipedia.org/wiki/Hash_table
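
To make the bucket idea concrete, here is a toy hash table built from a list of lists. It is a teaching sketch only, not how Python's real dictionaries are written: the names `toy_put` and `toy_get` and the bucket count are made up, and each bucket is just a list of (key, value) tuples.

```python
N_BUCKETS = 8

buckets = []
for _ in range(N_BUCKETS):
    buckets.append([])  # Each bucket holds a list of (key, value) tuples

def toy_put(key, value):
    """ Store value under key, replacing any existing entry for key """
    bucket = buckets[hash(key) % N_BUCKETS]  # The hash function picks the bucket
    for i in range(len(bucket)):
        if bucket[i][0] == key:              # Key already present, so update it
            bucket[i] = (key, value)
            return
    bucket.append((key, value))

def toy_get(key):
    """ Return the value stored under key, or raise a KeyError """
    bucket = buckets[hash(key) % N_BUCKETS]
    for k, v in bucket:
        if k == key:                         # Compare for equality to handle collisions
            return v
    raise KeyError(key)

toy_put("apples", 430)
toy_put("pears", 217)
toy_put("apples", 450)    # Updates the existing entry

print(toy_get("apples"))  # 450
print(toy_get("pears"))   # 217
```

Only the one bucket chosen by the hash function is ever searched, which is why lookups stay fast as long as the buckets stay short.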
  


# More Dictionary Odds and Ends

**Some types can't be keys**:

In [3]:
{ [ 1, ]:"hello"} # Lists can't be keys because they are mutable

TypeError: unhashable type: 'list'

In [4]:
{ (1,):"hello"} # However, tuples, being immutable, can - another reason to use tuples

{(1,): 'hello'}
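
Tuple keys are useful whenever a key naturally has more than one part. A small sketch using made-up (row, column) coordinates on a game board:

```python
board = {(0, 0): "X", (1, 1): "O"}  # Keys are (row, column) tuples

board[(2, 0)] = "X"     # Add another square

print(board[(1, 1)])    # O
print((0, 1) in board)  # False, nothing has been placed there

# A tuple that contains a list still can't be a key
try:
    board[([0, 0],)] = "X"
except TypeError as e:
    print("TypeError:", e)
```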

**Get the Keys**: To get the set of keys in a dictionary do this:

In [5]:
fruit_costs = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

fruit_costs.keys() # This will contain no duplicates

dict_keys(['apples', 'bananas', 'oranges', 'pears'])

**Get the Values**:

In [2]:
fruit_costs = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

fruit_costs.values() # This could contain duplicates

dict_values([430, 312, 525, 217])

**Dictionary syntax is consistent with other Python types**: Much of the same stuff that works with lists works with dictionaries:

(I'm not covering every function and keyword here - but use your knowledge and intuition to test what works)

In [7]:
fruit_costs = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

len(fruit_costs) # The number of key-value pairs in the dictionary

4

In [8]:
del fruit_costs["apples"] # Removes the key-value pair "apples": 430

print(fruit_costs)

{'bananas': 312, 'oranges': 525, 'pears': 217}


In [9]:
fruit_costs.pop("oranges") # This is like del, but it also returns the removed value (525)

print(fruit_costs)

{'bananas': 312, 'pears': 217}


**For loops on dictionaries**

In [10]:
fruit_costs = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

for i in fruit_costs: # A for loop over the keys in the dictionary
   print(i, fruit_costs[i])

apples 430
bananas 312
oranges 525
pears 217
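
You can also loop over the key-value pairs directly with the `items()` method, which gives each pair as a tuple that can be unpacked using tuple assignment. A short sketch (the running total is just for illustration):

```python
fruit_costs = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

total = 0
for fruit, cost in fruit_costs.items():  # Each item is a (key, value) tuple
    print(fruit, cost)
    total += cost

print("Total:", total)  # Total: 1484
```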


**Testing key membership**

In [11]:
fruit_costs = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

# Test for key membership using "in" and "not in"

"apples" in fruit_costs

True

In [12]:
"passion fruit" not in fruit_costs # Not in

True

In [13]:
217 in fruit_costs # This is membership of the set of keys, not the values!

False
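
If you do want to search the values, or look up a key that might be missing, a couple of built-in dictionary methods help. A brief sketch:

```python
fruit_costs = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

print(217 in fruit_costs.values())  # True, this searches the values instead

# get() looks up a key but returns a default instead of raising a KeyError
print(fruit_costs.get("pears"))     # 217
print(fruit_costs.get("kiwis"))     # None
print(fruit_costs.get("kiwis", 0))  # 0
```

Note that searching the values is an O(N) scan, unlike the O(1) key lookup.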

**Dictionary comprehensions**

Lastly, like list comprehensions, Python provides a convenient syntax for mashing up loops, conditionals and dictionaries - dictionary comprehensions.

In [18]:
# Dictionary comprehension

{ i : i*2 for i in range(10) } # Like a list comprehension, 
# but instead you iterate to produce a sequence of key:value pairs

{0: 0, 1: 2, 2: 4, 3: 6, 4: 8, 5: 10, 6: 12, 7: 14, 8: 16, 9: 18}

In [None]:
# The basic structure is:
{ KEY:VALUE for x in ITERABLE (optionally) if EXPRESSION2 }

# it is equivalent to writing:
# d = {}
# for x in ITERABLE:
#   if EXPRESSION2:
#       d[KEY] = VALUE
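
Here is a sketch of the optional condition in action, keeping only the fruit that cost more than 400 (the threshold is arbitrary):

```python
fruit_costs = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

# KEY is fruit, VALUE is fruit_costs[fruit], EXPRESSION2 is the cost test
expensive = { fruit : fruit_costs[fruit] for fruit in fruit_costs if fruit_costs[fruit] > 400 }

print(expensive)
# {'apples': 430, 'oranges': 525}
```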

# Challenge 4

In [24]:
x = [ ("Dave", "Davies", "831-123-4567"), ("Mobin", "Shafin", "821-678-1234"), ("Marina", "Chang", "805-789-3456")]

# Q1: Use a for loop to build a dictionary, d, of first names to 
# phone numbers from x, note each tuple in x is (first name, last name, phone number)


# Q2: Use a dictionary comprehension to do the same thing as in Q1.


{'Dave': '831-123-4567', 'Mobin': '821-678-1234', 'Marina': '805-789-3456'}
{'Dave': '831-123-4567', 'Mobin': '821-678-1234', 'Marina': '805-789-3456'}


# Sets

* In Python, sets are collections of **unique, unordered** elements. 

* Like dictionaries, sets are hash-based, so membership tests cost constant time (O(1)) on average.

* If you've done any discrete math, you'll appreciate Python provides awesome set functionality.

In [5]:
alice_text = """Alice was beginning to get very tired of sitting by her sister on the bank, 
and of having nothing to do: once or twice she had peeped into the book her sister was reading, 
but it had no pictures or conversations in it, `and what is the use of a book,' 
thought Alice `without pictures or conversation?'"""

alice_words = alice_text.split() # This breaks up alice into a sequence of words, removing
# the white space

print("The tokenized alice text", alice_words)

s = set(alice_words)

print(s)

print("In Alice there are {} words and {} unique words".format(len(alice_words), len(s)))


The tokenized alice text ['Alice', 'was', 'beginning', 'to', 'get', 'very', 'tired', 'of', 'sitting', 'by', 'her', 'sister', 'on', 'the', 'bank,', 'and', 'of', 'having', 'nothing', 'to', 'do:', 'once', 'or', 'twice', 'she', 'had', 'peeped', 'into', 'the', 'book', 'her', 'sister', 'was', 'reading,', 'but', 'it', 'had', 'no', 'pictures', 'or', 'conversations', 'in', 'it,', '`and', 'what', 'is', 'the', 'use', 'of', 'a', "book,'", 'thought', 'Alice', '`without', 'pictures', 'or', "conversation?'"]
{'twice', 'what', 'of', 'bank,', 'is', 'tired', '`and', 'book', 'but', 'thought', 'pictures', 'it,', 'sitting', 'sister', 'once', 'on', 'do:', 'very', 'by', 'and', 'conversations', 'in', 'peeped', "book,'", 'beginning', "conversation?'", 'having', 'was', 'Alice', 'reading,', 'get', '`without', 'or', 'had', 'to', 'no', 'her', 'into', 'a', 'nothing', 'she', 'use', 'the', 'it'}
In Alice there are 57 words and 44 unique words


**Sets can be created using curly brackets**

Like dictionaries, you can create a set with curly brackets; the difference is that with sets you list only the elements, without the colons that mark key-value pairs.

In [3]:
s2 = { "a", "lot", "of", "good", "that", "did", "a", "lot"}

print(s2) # Note, sets are unordered

{'did', 'a', 'lot', 'good', 'of', 'that'}


In [11]:
# Note, if you want to create an empty set do this:

x = set()  

# If you write:

x = {}

type(x) # You make an empty dictionary instead

dict

**All the basic Python operations work with sets:**

In [74]:
"Alice" in s

True

In [75]:
for i in s: #Yup, for loops
  print(i)

nothing
Alice
pictures
get
tired
`and
no
her
into
on
very
conversations
by
book,'
or
and
bank,
in
but
was
the
`without
it
she
what
beginning
it,
is
book
twice
to
do:
use
of
once
a
had
sitting
peeped
conversation?'
reading,
having
sister
thought


**Don't expect the ordering to be meaningful:**

In [5]:
x = { "a", "list", "of", "strings" }

print(x) # Sets are unordered

{'list', 'of', 'strings', 'a'}


**Python offers set theoretic ops:**

In [7]:
x = { 1, 2, 3, 4 }
y = { 3, 4, 5, 6 }

print(x.union(y)) # Set union


{1, 2, 3, 4, 5, 6}


In [78]:
print(x.intersection(y)) # Set intersection

{3, 4}


In [79]:
print(x.difference(y)) # This is x - y

{1, 2}
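
The same operations are also available as operators, which can be more compact. A short sketch using the same two sets:

```python
x = { 1, 2, 3, 4 }
y = { 3, 4, 5, 6 }

print(x | y)        # Union, same as x.union(y)
print(x & y)        # Intersection, same as x.intersection(y)
print(x - y)        # Difference, same as x.difference(y)
print(x ^ y)        # Symmetric difference: elements in exactly one of the sets
print({3, 4} <= x)  # Subset test: True, as both 3 and 4 are in x
```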


**You can also do set comprehensions**

In [10]:
{ a.upper() for a in alice_words } # Alice, but shouting

{'A',
 'ALICE',
 'AND',
 'BANK,',
 'BEGINNING',
 'BOOK',
 "BOOK,'",
 'BUT',
 'BY',
 "CONVERSATION?'",
 'CONVERSATIONS',
 'DO:',
 'GET',
 'HAD',
 'HAVING',
 'HER',
 'IN',
 'INTO',
 'IS',
 'IT',
 'IT,',
 'NO',
 'NOTHING',
 'OF',
 'ON',
 'ONCE',
 'OR',
 'PEEPED',
 'PICTURES',
 'READING,',
 'SHE',
 'SISTER',
 'SITTING',
 'THE',
 'THOUGHT',
 'TIRED',
 'TO',
 'TWICE',
 'USE',
 'VERY',
 'WAS',
 'WHAT',
 '`AND',
 '`WITHOUT'}

In [None]:
# The general format is like a list comprehension, but instead with the curly brackets

# The basic structure is:
{ EXPRESSION1 for x in ITERABLE (optionally) if EXPRESSION2 }

# it is equivalent to writing:
# l = set()
# for x in ITERABLE:
#   if EXPRESSION2:
#       l.add(EXPRESSION1)
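
As a sketch of the full structure, the set comprehension below tidies up a few tokens (a made-up sample in the style of the Alice text) by stripping punctuation and lowercasing, keeping only tokens longer than 3 characters:

```python
words = ["Alice", "was", "reading,", "the", "book,'", "`and", "the", "book"]

# strip() removes any of the listed punctuation characters from both ends of the string
cleaned = { w.strip("`',:?.").lower() for w in words if len(w) > 3 }

print(cleaned)  # Contains 'alice', 'reading', 'book' and 'and' (in some order)
```

Because the result is a set, "book,'" and "book" collapse into the single element 'book'.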

# Challenge 5

In [7]:
dickens_text = """My father’s family name being Pirrip, and my Christian name Philip, 
my infant tongue could make of both names nothing longer or more explicit than Pip. 
So, I called myself Pip, and came to be called Pip.I give Pirrip as my father’s family name, 
on the authority of his tombstone and my sister,—Mrs. Joe Gargery, who married the blacksmith. 
As I never saw my father or my mother, and never saw any likeness of either of them 
(for their days were long before the days of photographs), my first fancies regarding what they 
were like were unreasonably derived from their tombstones. 
The shape of the letters on my father’s, gave me an odd idea that he was a square, stout, 
dark man, with curly black hair. From the character and turn of the inscription, 
“Also Georgiana Wife of the Above,” I drew a childish conclusion that my mother was freckled and sickly. 
To five little stone lozenges, each about a foot and a half long, which were arranged in a neat 
row beside their grave, and were sacred to the memory of five little brothers of mine,—who gave up trying to 
get a living, exceedingly early in that universal struggle,—I am indebted for a belief I religiously 
entertained that they had all been born on their backs with their hands in their trousers-pockets, 
and had never taken them out in this state of existence."""

# Q1: Calculate how many unique words are shared (common to both) in dickens_text and alice_text.

# Q2: Calculate how many unique words are present (found in either) in dickens_text and/or alice_text.


There are 13 words in both alice and dickens
There are 187 words in either alice or dickens


# Reading

* Read Chapter 9 (tuples): http://openbookproject.net/thinkcs/python/english3e/tuples.html
* Read Chapter 11 (lists): http://openbookproject.net/thinkcs/python/english3e/lists.html

# Homework

* Go to Canvas and complete the lecture quiz, which involves completing each challenge problem
* ZyBook Reading 9


# Practice Problems

In [None]:
"""
Problem 1: Tuple Manipulation
Write a function that takes a tuple t and returns a new tuple with the first 
and last elements swapped. If the tuple has length less than 2, return the original tuple.
"""

def swap_ends(t):
    # Your code here
    pass

# Test cases
assert swap_ends((1, 2, 3, 4)) == (4, 2, 3, 1)
assert swap_ends((1,)) == (1,)
assert swap_ends(('a', 'b', 'c')) == ('c', 'b', 'a')
print("All test cases passed for Problem 1!")

In [None]:
"""
Problem 2: List Filtering
Write a function that takes a list and returns a new list with duplicate elements 
removed while preserving the original order of elements (first occurrence is kept).
"""
def remove_duplicates(lst):
    # Your code here
    pass

# Test cases
assert remove_duplicates([1, 2, 3, 2, 4, 1, 5]) == [1, 2, 3, 4, 5]
assert remove_duplicates(['a', 'b', 'a', 'c', 'b']) == ['a', 'b', 'c']
assert remove_duplicates([]) == []
print("All test cases passed for Problem 2!")

In [None]:
"""
Problem 3: Dictionary Merging
Write a function that takes two dictionaries and returns a new dictionary containing 
all key-value pairs from both dictionaries. If a key exists in both dictionaries, 
the value from dict2 should override the value from dict1.
"""
def merge_dicts(dict1, dict2):
    # Your code here
    pass

# Test cases
d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'b': 4, 'd': 5}
assert merge_dicts(d1, d2) == {'a': 1, 'b': 4, 'c': 3, 'd': 5}
assert merge_dicts({'x': 1}, {}) == {'x': 1}
assert merge_dicts({}, {'y': 2}) == {'y': 2}
print("All test cases passed for Problem 3!")

In [None]:
"""
Problem 4: Lists and Dictionaries
Write a function that takes a list of strings and returns a dictionary where 
the keys are word lengths and the values are lists of words of that length.
"""
def group_by_length(words):
    # Your code here
    pass

# Test cases
words = ['cat', 'dog', 'apple', 'fish', 'elephant', 'ant']
result = group_by_length(words)
assert result[3] == ['cat', 'dog', 'ant']
assert result[4] == ['fish']
assert result[5] == ['apple']
assert result[8] == ['elephant']
assert group_by_length([]) == {}
print("All test cases passed for Problem 4!")

In [None]:
"""
Problem 5: Nested List Manipulation
Write a function that calculates the sum of all numbers in a nested list 
structure of arbitrary depth. The list may contain numbers and other lists.
"""
def nested_sum(lst):
    # Your code here
    pass

# Test cases
assert nested_sum([1, [2, 3], [4, [5, 6]], 7]) == 28
assert nested_sum([1, 2, [3, 4, [5]]]) == 15
assert nested_sum([]) == 0
assert nested_sum([1, [], [2, []], [], 3]) == 6
print("All test cases passed for Problem 5!")