# Introduction to Python 3: Lists, Dictionaries, and Sets
## Luca de Alfaro
Copyright Luca de Alfaro, 2018-21.  CC-BY-NC License.



Prepared on: Tue Aug  3 11:57:34 2021

This is a book chapter; it is not a homework assignment.  
Do not submit it as a solution to a homework assignment; you would receive no credit.


## Lists and tuples

### Lists

Lists are one of the basic data types in Python.


In [1]:
l = ['a', 'b', 'c']
print(l)


['a', 'b', 'c']


In [2]:
l2 = ['cat', 'dog', 'bird']
print(l2[0])
print(l2[1])


cat
dog


In Python, it's perfectly all right to have list elements of different types. 

In [3]:
[3, "cat", 4.5]


[3, 'cat', 4.5]

### List operations

You append an element to a list like so:


In [4]:
l.append('spider')
l


['a', 'b', 'c', 'spider']

You can concatenate two lists with the `+` operator.  


In [5]:
l + [1, 2, 3]


['a', 'b', 'c', 'spider', 1, 2, 3]

As you can see from the above, in Python list elements don't all have to be
of the same type. Of course, you had better know what you are doing if you
mix types: for instance, if you try to increment all list elements by 
1, forgetting that some of them are not numbers, you will get
an error.
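
As a minimal sketch of this pitfall, the snippet below adds 1 to a numeric element and then to a string element of the same list. The names `mixed` and `error_message` are made up for this illustration.

```python
mixed = [3, "cat", 4.5]  # hypothetical list mixing numbers and a string

print(mixed[0] + 1)  # 4: adding 1 to an integer works

error_message = None
try:
    print(mixed[1] + 1)  # adding 1 to a string does not work
except TypeError as e:
    error_message = str(e)
    print("TypeError:", error_message)
```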

There are many more list operations.  Among them,
you can 'pop' (retrieve and remove) an element at any position:


In [6]:
x = l.pop(3)
print(x)
print(l)


spider
['a', 'b', 'c']


You can obtain the reverse of a list:


In [7]:
l.reverse()
l


['c', 'b', 'a']

And you can sort the list in place (the `sort` method has options; see the Python docs).


In [8]:
l = l + ['cat']
l.sort()
l


['a', 'b', 'c', 'cat']
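
To give an idea of the sorting options mentioned above, here is a small sketch using the `reverse` and `key` parameters of `list.sort()`, along with the built-in `sorted()`, which returns a new list instead of sorting in place. The list `animals` is made up for this example.

```python
animals = ['cat', 'a', 'bird', 'c', 'b']  # hypothetical example list

# sorted() returns a new sorted list and leaves the original untouched.
alphabetical = sorted(animals)
print(alphabetical)  # ['a', 'b', 'bird', 'c', 'cat']

# reverse=True sorts in place, in descending order.
animals.sort(reverse=True)
print(animals)  # ['cat', 'c', 'bird', 'b', 'a']

# key=len sorts in place by string length; elements of equal length
# keep their previous relative order, because the sort is stable.
animals.sort(key=len)
print(animals)  # ['c', 'b', 'a', 'cat', 'bird']
```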

You can also apply one operation to all elements of a list. 
First, let's see how to capitalize a string.


In [9]:
"dog".upper()


'DOG'

Well, that's kind of too much. What I meant was:


In [10]:
"dog".capitalize()


'Dog'

OK, that's better.  Now, I want to get a list like `l`, except with each string capitalized.


In [11]:
l_capitalized = [s.capitalize() for s in l]
l_capitalized


['A', 'B', 'C', 'Cat']

In [12]:
[s.capitalize() for s in l if s.startswith("c")]


['C', 'Cat']

What's going on?  Basically, in the `[` ... `]` we are building another list, 
and we give the instructions on how to create each element. 
And how do we create each element?  We iterate over the list `l`, with `for s in l`,
and for each of its elements `s`, we do `s.capitalize()`, which capitalizes the string.

The `[f(x) for x in l]` syntax is called a _list comprehension_.
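
One way to read a list comprehension is as shorthand for a loop that appends to an initially empty list. The sketch below builds the same list both ways; the variable names are chosen for this illustration only.

```python
words = ['a', 'b', 'c', 'cat']

# Explicit loop: start from an empty list and append matching elements.
capitalized_loop = []
for s in words:
    if s.startswith("c"):
        capitalized_loop.append(s.capitalize())

# Equivalent list comprehension.
capitalized_comprehension = [s.capitalize() for s in words if s.startswith("c")]

print(capitalized_loop)  # ['C', 'Cat']
print(capitalized_loop == capitalized_comprehension)  # True
```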

You can get the length of a string, or a list, with the `len()` function.


In [13]:
print(len(l))
print(len(l_capitalized))
print([len(s) for s in l])


4
4
[1, 1, 1, 3]


See https://docs.python.org/3.7/tutorial/datastructures.html for more list functions.

### Tuples

Tuples are like lists, except that they are immutable: once created, they cannot be changed.
Here are two points in 2-D.


In [14]:
p1 = (1., 2.)
p2 = (3.1, 3.2)
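
To illustrate what immutability means in practice, the sketch below tries to assign to a component of a tuple, which raises a `TypeError`, and then builds a new tuple instead. The names `point` and `moved` are made up for this example.

```python
point = (1.0, 2.0)  # hypothetical 2-D point

try:
    point[0] = 5.0  # tuples do not support item assignment
except TypeError as e:
    print("TypeError:", e)

# To "change" a component, build a new tuple.
moved = (5.0, point[1])
print(moved)  # (5.0, 2.0)
```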



The useful thing with tuples is that they are easy to take apart.
Whereas a beginner would write


In [15]:
x = p1[0]
y = p1[1]
print(x, y)


1.0 2.0


anyone with a bit of Python experience would instead write: 


In [16]:
x, y = p1
print(x, y)


1.0 2.0


Of course, the above works only if the tuple of variables on the left-hand side
has the same length as the tuple on the right-hand side!


In [17]:
import traceback
try:
    x, y, z = p2
except ValueError:
    print(traceback.format_exc())


Traceback (most recent call last):
  File "<ipython-input-17-023e370181fb>", line 3, in <module>
    x, y, z = p2
ValueError: not enough values to unpack (expected 3, got 2)



If you don't care about a component, you can just assign it to `_`:


In [18]:
x, _ = p1
x


1.0

### List slicing

You can 'slice' (yeah, that's a technical term) the beginning and end of a list:


In [19]:
l = ['cat', 'dog', 'bird', 'fish', 'ant', 'fly']
l[:3] # Till element 3, excluded


['cat', 'dog', 'bird']

In [20]:
l[3:] # From element 3 onwards


['fish', 'ant', 'fly']

In [21]:
print(l[1:3]) # From element 1 included, to element 3 excluded


['dog', 'bird']


If you use negative numbers, they count backwards from the end
of the list.  It's weird, but very useful.


In [22]:
l[-1] # This is the last element


'fly'

In [23]:
l[-2:] # From the penultimate onwards, so the last two.


['ant', 'fly']

One particularly nice thing about slicing is that out-of-range bounds
never generate errors.  If there's not enough of the list to slice it the way you
want, you will simply get a smaller slice (of course, this means that
the size of the resulting slice is not guaranteed).


In [24]:
l = ['a', 'b', 'c', 'd', 'e', 'f']
l


['a', 'b', 'c', 'd', 'e', 'f']

On the other hand, indexing a single element that is out of range does not work; it raises an `IndexError`.

    print(l[-10])
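
The difference between slicing and indexing past the ends of a list can be sketched as follows. The list `letters` is a stand-in chosen for this illustration.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

# Slices are clipped to the list boundaries.
print(letters[4:10])   # ['e', 'f']
print(letters[10:])    # []
print(letters[-10:2])  # ['a', 'b']

# Indexing a single out-of-range element raises IndexError.
index_error_raised = False
try:
    print(letters[-10])
except IndexError as e:
    index_error_raised = True
    print("IndexError:", e)
```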

## Dictionaries


Dictionaries in Python are essentially maps from a set of keys to values, that is, functions: each key maps to exactly one value. 
Or, if you are in CS, they are like hash tables.  In fact, it turns out they are hash tables,
except that you don't need to worry about their implementation. 
Enough said; let's define one.


In [25]:
n_of_paws = {'cat': 4, 'fish': 0, 'bird': 2, 'snake': 0}


Dictionaries can be indexed with [] notation like list indexing, 
except that they are indexed by their "keys", not by integer positions.


In [26]:
n_of_paws['fish']


0

You can also build a dictionary like this:


In [27]:
d = dict(dog=4, cat=4, bird=2, fish=0)
d


{'bird': 2, 'cat': 4, 'dog': 4, 'fish': 0}

Of course, to use the above syntax, the keys must be valid Python identifiers, like variable names (for example, they cannot be numbers or contain spaces).
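
When the keys are not valid identifiers, the curly-brace syntax still works, and so does passing `dict()` a list of key-value pairs. A minimal sketch, with made-up keys:

```python
# Keyword syntax: keys must look like variable names.
by_keywords = dict(dog=4, cat=4)

# Literal syntax: keys can be numbers, or strings with spaces.
by_literal = {1: 'one', 'sea lion': 0}

# dict() also accepts a list of (key, value) pairs.
by_pairs = dict([(1, 'one'), ('sea lion', 0)])

print(by_keywords)             # {'dog': 4, 'cat': 4}
print(by_literal == by_pairs)  # True
```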


If you are not sure whether a key is in the dictionary, you can use `.get()` 
rather than `[]`; it returns `None` instead of raising an error when the key is missing:


In [28]:
print(n_of_paws.get('fish'))
print(n_of_paws.get('elephant'))


0
None
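
For comparison, the sketch below shows what happens with `[]` on a missing key (a `KeyError`), and how `.get()` accepts an optional second argument to return instead of `None`. The default value `-1` is an arbitrary choice for this illustration.

```python
n_of_paws = {'cat': 4, 'fish': 0, 'bird': 2, 'snake': 0}

# Indexing with [] raises KeyError when the key is missing.
key_error_raised = False
try:
    n_of_paws['elephant']
except KeyError as e:
    key_error_raised = True
    print("KeyError:", e)

# .get() takes an optional default, returned only if the key is missing.
print(n_of_paws.get('elephant', -1))  # -1
print(n_of_paws.get('fish', -1))      # 0
```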


You can check whether a key is in a dictionary with the `in` operator:


In [29]:
print("elephant" in n_of_paws)
print("cat" in n_of_paws)


False
True


Suppose you are given a list of animals.


In [30]:
animals = ['pig', 'donkey', 'chicken', 'cat', 'dog', 'snake']


Now you want to build a second list, containing the number of paws of each.


In [31]:
my_paws = [n_of_paws.get(a) for a in animals]
my_paws


[None, None, None, 4, None, 0]

#### Dictionary keys, values, and key-value pairs

You can ask for the collection of keys of a dictionary:


In [32]:
n_of_paws.keys()


dict_keys(['cat', 'fish', 'bird', 'snake'])

and you can iterate over it: 

In [33]:
for k in n_of_paws.keys():
    print(k)


cat
fish
bird
snake


Similarly for values: 

In [34]:
for v in n_of_paws.values():
    print(v)


4
0
2
0


You can also iterate over key-value pairs: 

In [35]:
for k, v in n_of_paws.items():
    print(k, ":", v)


cat : 4
fish : 0
bird : 2
snake : 0


The _very unfortunate_ thing is that the methods `.keys()`, `.values()`, and `.items()` do not give you lists of keys, values, and items.   Rather, they give you _views_ of the dictionary.  This means that, if you change the dictionary, the contents of the view change too!  This is _very unfortunate_!

In [36]:
d = {'luca': 4, 'anna': 5}
s = d.keys()
d['helen'] = 8
for k in s:
    print(k)


luca
anna
helen


To remedy this, you need to explicitly build a list, like so: 

In [37]:
d = {'luca': 4, 'anna': 5}
s = list(d.keys()) # Notice!
d['helen'] = 8
for sname in s:
    print(sname)


luca
anna


This is a clear example where, in the development of Python 3, developers erroneously thought that efficiency (just returning a view) had priority over... sanity of mind. 

In practice, whenever you need a snapshot that will not change along with the dictionary, remember to use: 

    list(d.keys())

rather than 

    d.keys()

and similarly for `.values()` and `.items()`. 
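
A situation where the difference matters is removing entries while looping. The sketch below, with made-up data, shows that changing the size of a dictionary while iterating over what `.keys()` returns raises a `RuntimeError`, whereas iterating over a list built from the keys works.

```python
scores = {'luca': 4, 'anna': 5, 'helen': 8}  # hypothetical data

# Work on a copy first, so that scores stays intact for the second part.
live = dict(scores)
failed = False
try:
    for name in live.keys():
        if live[name] > 4:
            del live[name]  # changes the dictionary size mid-iteration
except RuntimeError as e:
    failed = True
    print("RuntimeError:", e)

# Iterating over a list of the keys is safe.
for name in list(scores.keys()):
    if scores[name] > 4:
        del scores[name]
print(scores)  # {'luca': 4}
```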

We can create a dictionary from another dictionary: 


In [38]:
my_paws = {a : n_of_paws.get(a) for a in animals}
my_paws


{'cat': 4,
 'chicken': None,
 'dog': None,
 'donkey': None,
 'pig': None,
 'snake': 0}

What we did above is a dictionary comprehension.
It works similarly to a list comprehension, but uses the syntax `{k: v for` ...`}`
to build the dictionary.  

What if you want in the dictionary only things that do not map to None?
You can add an if clause to the comprehension.


In [39]:
my_known_paws = {a : n_of_paws.get(a) for a in animals if a in n_of_paws}
my_known_paws


{'cat': 4, 'snake': 0}
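
As with lists, a dictionary comprehension can be read as shorthand for a loop that fills an initially empty dictionary. The sketch below builds the same dictionary both ways; the names `known_loop` and `known_comprehension` are made up for this illustration.

```python
n_of_paws = {'cat': 4, 'fish': 0, 'bird': 2, 'snake': 0}
animals = ['pig', 'donkey', 'chicken', 'cat', 'dog', 'snake']

# Explicit loop: add a key-value pair for each known animal.
known_loop = {}
for a in animals:
    if a in n_of_paws:
        known_loop[a] = n_of_paws[a]

# Equivalent dictionary comprehension.
known_comprehension = {a: n_of_paws[a] for a in animals if a in n_of_paws}

print(known_comprehension)                # {'cat': 4, 'snake': 0}
print(known_loop == known_comprehension)  # True
```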

As you might start to guess, the national sport in Python consists in doing
as much as possible in a single line.  Concise code leads to concise thinking
and understanding.

## Sets

Sets are data structures that represent... sets.  Sets are superficially similar to lists, 
except that they cannot have repeated elements, and their elements are not stored in any particular order -- it does not make sense to ask for _"the third element of the set"_. 


In [40]:
s = set() # {} would be a dictionary...
print(s)


set()


In [41]:
set1 = {'cat', 'dog'}
set2 = {'bird', 'mouse', 'cat'}
set3 = {'dog', 'cat'}


We can take the union, intersection, and difference of sets:


In [42]:
print(set1 | set2) # union
print(set1 & set2) # intersection
print(set1 - set2) # difference


{'dog', 'cat', 'mouse', 'bird'}
{'cat'}
{'dog'}
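
The same operations are also available as set methods. One difference worth knowing is that the methods accept any iterable, such as a list, while the operators require both sides to be sets. A short sketch:

```python
set1 = {'cat', 'dog'}
set2 = {'bird', 'mouse', 'cat'}

# Method forms of the three operators.
print(set1.union(set2) == set1 | set2)         # True
print(set1.intersection(set2) == set1 & set2)  # True
print(set1.difference(set2) == set1 - set2)    # True

# Methods also accept other iterables, here a list.
with_list = set1.union(['bird', 'cat'])
print(with_list == {'cat', 'dog', 'bird'})     # True
```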


Two sets are equal when they contain the same elements
(order does not matter):


In [43]:
set1 == set3


True

We can add elements to a set... 


In [44]:
set1.add('duck')
print(set1)
set1.add('dog')
print(set1)


{'dog', 'cat', 'duck'}
{'dog', 'cat', 'duck'}


... and as you can see, sets really have no repeated elements,
so if you add a dog to a set that already contains a dog, 
nothing changes.

We can test membership using "in", just like for lists.


In [45]:
print('cat' in set1)
print('opossum' in set1)


True
False


A quick way to remove duplicates from a list is to turn it 
into a set, then back into a list.  This loses the ordering though,
as sets do not preserve the order of the elements of the lists
from which they were created:


In [46]:
l = ['a', 'b', 'c', 'g', 'c', 'd', 'f', 'g']
l_uniq = list(set(l))
l_uniq


['b', 'g', 'a', 'f', 'c', 'd']

In [47]:
if len(l) == len(set(l)):
    print("The elements are unique")
else:
    print("There is a repeated element")


There is a repeated element


If you want to preserve the ordering, then you can use iteration
(covered later) and do as follows:


In [48]:
l_uniq = [] # list
occurrences = set() # set
for s in l:
    if s not in occurrences:
        l_uniq.append(s) # list append
        occurrences.add(s) # set add
l_uniq


['a', 'b', 'c', 'g', 'd', 'f']
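
A shorter alternative to the loop above, not the technique used in this chapter, relies on the fact that dictionaries preserve insertion order. The sketch below assumes Python 3.7 or later, where this ordering is guaranteed by the language.

```python
l = ['a', 'b', 'c', 'g', 'c', 'd', 'f', 'g']

# dict.fromkeys() creates a dictionary whose keys are the list elements;
# repeated elements collapse into one key, kept at its first position.
l_uniq = list(dict.fromkeys(l))
print(l_uniq)  # ['a', 'b', 'c', 'g', 'd', 'f']
```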