## Python Tuples


Python provides another type that is an ordered collection of objects, called a tuple.

Tuples are identical to lists in all respects, except for the following properties:

- Tuples are defined by enclosing the elements in parentheses (()) instead of square brackets ([]).
- Tuples are immutable.


In [55]:
t = ('foo', 'bar', 'baz', 'qux', 'quux', 'corge')
t

('foo', 'bar', 'baz', 'qux', 'quux', 'corge')

In [56]:
print(t[0])
print(t[-1])
print(t[1::2])

foo
corge
('bar', 'qux', 'corge')


In [57]:
t[::-1]

('corge', 'quux', 'qux', 'baz', 'bar', 'foo')

In [58]:
t = ('foo', 'bar', 'baz', 'qux', 'quux', 'corge')
t[2] = 'Bark!'

TypeError: 'tuple' object does not support item assignment

Why use a tuple instead of a list?

- Program execution can be somewhat faster when manipulating a tuple than it is for the equivalent list. (This is unlikely to be noticeable when the list or tuple is small.)

- Sometimes you don’t want data to be modified. If the values in the collection are meant to remain constant for the life of the program, using a tuple instead of a list guards against accidental modification.

- There is another Python data type that you will encounter shortly called a dictionary, which requires as one of its components a value that is of an immutable type. A tuple can be used for this purpose, whereas a list can’t be.

In [59]:
a = 'foo'
b = 42
a, 3.14159, b

('foo', 3.14159, 42)

In [61]:
t = ()
type(t)

tuple

In [63]:
t = (2,)
print(t[0])
print(t[-1])

2
2
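
The trailing comma in `(2,)` is what makes it a tuple; parentheses alone only group an expression. A minimal sketch of the difference (the variable names here are illustrative, not taken from the cells above):

```python
not_a_tuple = (2)     # parentheses only group; this is the int 2
one_tuple = (2,)      # the trailing comma makes a one-element tuple
also_tuple = 2,       # the parentheses themselves are optional

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)
print(one_tuple == also_tuple)
```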


### Tuple Assignment, Packing, and Unpacking


In [65]:
t = ('foo', 'bar', 'baz', 'qux')

<img src="resources/24.png" width=500></img>

If that “packed” object is subsequently assigned to a new tuple, the individual items are “unpacked” into the objects in the tuple:

<img src="resources/25.png" width=500></img>

When unpacking, the number of variables on the left must match the number of values in the tuple:

In [None]:
(s1, s2, s3) = t

In [66]:
(s1, s2, s3, s4) = t

In [68]:
(s1, s2, s3, s4, s5) = t

ValueError: not enough values to unpack (expected 5, got 4)
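
A mismatch in the other direction fails as well. The sketch below, which reuses the four-element tuple `t`, catches both errors so they can be compared side by side; the exact wording of the messages may vary between Python versions:

```python
t = ('foo', 'bar', 'baz', 'qux')

errors = []

# Too few variables on the left for the four values in t
try:
    (s1, s2, s3) = t
except ValueError as exc:
    errors.append(str(exc))

# Too many variables on the left
try:
    (s1, s2, s3, s4, s5) = t
except ValueError as exc:
    errors.append(str(exc))

for message in errors:
    print('ValueError:', message)
```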

In [1]:
(s1, s2, s3, s4) = ('foo', 'bar', 'baz', 'qux')

## Dictionary

Dictionaries and lists share the following characteristics:

- Both are mutable.
- Both are dynamic. They can grow and shrink as needed.
- Both can be nested. A list can contain another list. A dictionary can contain another dictionary. A dictionary can also contain a list, and vice versa.

Dictionaries differ from lists primarily in how elements are accessed:

- List elements are accessed by their position in the list, via indexing.
- Dictionary elements are accessed via keys.

### Defining a Dictionary

```
d = {
    <key>: <value>,
    <key>: <value>,
      .
      .
      .
    <key>: <value>
}
```

```
MLB_team = dict([
    ('Colorado', 'Rockies'),
    ('Boston', 'Red Sox'),
    ('Minnesota', 'Twins'),
    ('Milwaukee', 'Brewers'),
    ('Seattle', 'Mariners')
])
```

```
MLB_team = dict(
    Colorado='Rockies',
    Boston='Red Sox',
    Minnesota='Twins',
    Milwaukee='Brewers',
    Seattle='Mariners'
)
```

In [71]:
MLB_team = dict(
     Colorado='Rockies',
     Boston='Red Sox',
     Minnesota='Twins',
     Milwaukee='Brewers',
     Seattle='Mariners'
 )

In [72]:
type(MLB_team)

dict

In [73]:
MLB_team

{'Colorado': 'Rockies',
 'Boston': 'Red Sox',
 'Minnesota': 'Twins',
 'Milwaukee': 'Brewers',
 'Seattle': 'Mariners'}

The entries in the dictionary display in the order they were defined. But that is irrelevant when it comes to retrieving them. Dictionary elements are not accessed by numerical index:



In [75]:
# This raises a KeyError
MLB_team[1]

KeyError: 1

### Accessing Dictionary Values


In [76]:
MLB_team['Minnesota']

'Twins'

In [77]:
MLB_team['Colorado']

'Rockies'

In [78]:
MLB_team['Kansas City'] = 'Royals'

In [79]:
MLB_team['Seattle'] = 'Seahawks'
MLB_team

{'Colorado': 'Rockies',
 'Boston': 'Red Sox',
 'Minnesota': 'Twins',
 'Milwaukee': 'Brewers',
 'Seattle': 'Seahawks',
 'Kansas City': 'Royals'}

### Dictionary Keys vs. List Indices


You may have noticed that the interpreter raises the same exception, **KeyError**, when a dictionary is accessed with either an undefined key or by a numeric index:

In [80]:
MLB_team['Toronto']

KeyError: 'Toronto'

In [81]:
MLB_team[1]

KeyError: 1
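
The distinction is easiest to see with a dictionary whose keys happen to be integers. In the sketch below (the sample data is made up for illustration), `d[0]` looks like indexing but is an ordinary key lookup, so list-style tricks such as negative indices do not apply:

```python
d = {0: 'a', 1: 'b', 2: 'c'}
lst = ['a', 'b', 'c']

print(d[0])       # 0 is a key here, not a position
print(lst[-1])    # lists support negative indices

try:
    d[-1]         # no key -1 exists, so this is a KeyError
except KeyError as exc:
    missing_key = exc.args[0]
    print('KeyError:', missing_key)
```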

### Building a Dictionary Incrementally


In [85]:
person = {}
person['fname'] = 'Joe'
person['lname'] = 'Fonebone'
person['age'] = 51
person['spouse'] = 'Edna'
person['children'] = ['Ralph', 'Betty', 'Joey']
person['pets'] = {'dog': 'Fido', 'cat': 'Sox'}

person

{'fname': 'Joe',
 'lname': 'Fonebone',
 'age': 51,
 'spouse': 'Edna',
 'children': ['Ralph', 'Betty', 'Joey'],
 'pets': {'dog': 'Fido', 'cat': 'Sox'}}

In [86]:
person['children'][-1]

'Joey'

In [87]:
person['pets']['cat']

'Sox'

In [88]:
foo = {42: 'aaa', 2.78: 'bbb', True: 'ccc'}

In [89]:
foo[42]

'aaa'

### Restrictions on Dictionary Keys


In [90]:
foo = {42: 'aaa', 2.78: 'bbb', True: 'ccc'}
foo

{42: 'aaa', 2.78: 'bbb', True: 'ccc'}

In [91]:
d = {int: 1, float: 2, bool: 3}
d

{int: 1, float: 2, bool: 3}

In [92]:
d[float]

2

In [93]:
MLB_team = {
     'Colorado' : 'Rockies',
     'Boston'   : 'Red Sox',
     'Minnesota': 'Twins',
     'Milwaukee': 'Brewers',
     'Seattle'  : 'Mariners'
 }

MLB_team['Minnesota'] = 'Timberwolves'
MLB_team

{'Colorado': 'Rockies',
 'Boston': 'Red Sox',
 'Minnesota': 'Timberwolves',
 'Milwaukee': 'Brewers',
 'Seattle': 'Mariners'}

Similarly, if you specify a key a second time during the initial creation of a dictionary, **the second occurrence will override the first:**

In [1]:
MLB_team = {
     'Colorado' : 'Rockies',
     'Boston'   : 'Red Sox',
     'Minnesota': 'Timberwolves',
     'Milwaukee': 'Brewers',
     'Seattle'  : 'Mariners',
     'Minnesota': 'Twins'
 }
MLB_team

{'Colorado': 'Rockies',
 'Boston': 'Red Sox',
 'Minnesota': 'Twins',
 'Milwaukee': 'Brewers',
 'Seattle': 'Mariners'}

Secondly, a dictionary key must be of a type that is immutable. You have already seen examples where several of the immutable types you are familiar with—integer, float, string, and Boolean—have served as dictionary keys.

**A tuple can also be a dictionary key, because tuples are immutable:**

In [2]:
d = {(1, 1): 'a', (1, 2): 'b', (2, 1): 'c', (2, 2): 'd'}
d[(1,1)]

'a'

In [3]:
d[(2,1)]

'c'

However, **neither a list nor another dictionary can serve as a dictionary key, because lists and dictionaries are mutable:**


In [5]:
# This raises a TypeError
d = {[1, 1]: 'a', [1, 2]: 'b', [2, 1]: 'c', [2, 2]: 'd'}

TypeError: unhashable type: 'list'

Technical Note: Why does the error message say “unhashable”?

Technically, it is not quite correct to say an object must be immutable to be used as a dictionary key. More precisely, an object must be hashable, which means it can be passed to a hash function. A hash function takes data of arbitrary size and maps it to a relatively simpler fixed-size value called a hash value (or simply hash), which is used for table lookup and comparison.

Python’s built-in hash() function returns the hash value for an object which is hashable, and raises an exception for an object which isn’t:

In [7]:
hash('foo')

-4910035530008762825

In [8]:
hash([1, 2, 3])

TypeError: unhashable type: 'list'
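
The difference between immutable and hashable shows up with a tuple that contains a list: the tuple itself cannot be changed, yet it is not hashable because one of its elements is not. A small sketch, using a hypothetical helper `is_hashable()` that is not part of Python:

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (illustrative helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable((1, 2, 3)))     # tuple of ints
print(is_hashable((1, [2, 3])))   # tuple containing a list
print(is_hashable([1, 2, 3]))     # plain list
```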

### Restrictions on Dictionary Values

By contrast, there are no restrictions on dictionary values.

A dictionary value can be any type of object Python supports, including mutable types like lists and dictionaries, and user-defined objects. The same value may also appear in a dictionary more than once:

In [9]:
d = {0: 'a', 1: 'a', 2: 'a', 3: 'a'}
d[0] == d[1] == d[2]

True

### Operators and Built-in Functions

In [10]:
MLB_team = {
     'Colorado' : 'Rockies',
     'Boston'   : 'Red Sox',
     'Minnesota': 'Twins',
     'Milwaukee': 'Brewers',
     'Seattle'  : 'Mariners'
 }

'Milwaukee' in MLB_team


True

In [11]:
'Toronto' in MLB_team

False

In [12]:
'Toronto' not in MLB_team


True

In [13]:
MLB_team['Toronto']

KeyError: 'Toronto'

In [14]:
'Toronto' in MLB_team and MLB_team['Toronto']

False
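
The last expression relies on short-circuit evaluation: because `'Toronto' in MLB_team` is False, the lookup on the right of `and` is never attempted and no KeyError is raised. A sketch of the same guard next to the `.get()` method, which serves a similar purpose (the shortened dictionary is only for illustration):

```python
MLB_team = {'Colorado': 'Rockies', 'Boston': 'Red Sox', 'Minnesota': 'Twins'}

# Guarded lookup: the right-hand side runs only if the key exists
guarded_missing = 'Toronto' in MLB_team and MLB_team['Toronto']
guarded_present = 'Boston' in MLB_team and MLB_team['Boston']

# .get() returns None for a missing key instead of raising KeyError
via_get = MLB_team.get('Toronto')

print(guarded_missing, guarded_present, via_get)
```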

The **len()** function returns the number of key-value pairs in a dictionary:



In [15]:
MLB_team = {
     'Colorado' : 'Rockies',
     'Boston'   : 'Red Sox',
     'Minnesota': 'Twins',
     'Milwaukee': 'Brewers',
     'Seattle'  : 'Mariners'
 }

print(len(MLB_team))

5


### Built-in Dictionary Methods


```d.clear()``` : Clears a dictionary.

In [16]:
d = {'a': 10, 'b': 20, 'c': 30}
d

{'a': 10, 'b': 20, 'c': 30}

In [17]:
d.clear()
d

{}

```d.get(<key>[, <default>])``` : Returns the value for a key if it exists in the dictionary.

d.get(<key>) searches dictionary d for <key> and returns the associated value if it is found. If <key> is not found, it returns None:

In [18]:
d = {'a': 10, 'b': 20, 'c': 30}
print(d.get('b'))

20


If <key> is not found and the optional <default> argument is specified, that value is returned instead of None:

In [19]:
print(d.get('z', -1))

-1
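
One common use of the default argument is accumulating counts without first checking whether a key exists. A minimal sketch with made-up sample data:

```python
words = ['foo', 'bar', 'foo', 'baz', 'foo', 'bar']

counts = {}
for word in words:
    # Start from 0 when the word has not been seen yet
    counts[word] = counts.get(word, 0) + 1

print(counts)
```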


```d.items()``` : Returns the key-value pairs in a dictionary.

d.items() returns a view object containing the key-value pairs in d as tuples. The first item in each tuple is the key, and the second item is the key’s value. Passing the view to list() produces a list that can be indexed:



In [20]:
d = {'a': 10, 'b': 20, 'c': 30}

d

{'a': 10, 'b': 20, 'c': 30}

In [21]:
list(d.items())

[('a', 10), ('b', 20), ('c', 30)]

In [23]:
print(list(d.items())[1][0])
print(list(d.items())[1][1])

b
20


```d.keys()``` : Returns the keys in a dictionary.

d.keys() returns a view object containing all keys in d. Pass it to list() to get a list:

In [24]:
d = {'a': 10, 'b': 20, 'c': 30}
d

{'a': 10, 'b': 20, 'c': 30}

In [25]:
list(d.keys())

['a', 'b', 'c']

```d.values()``` : Returns the values in a dictionary.

d.values() returns a view object containing all values in d. Pass it to list() to get a list:



In [27]:
d = {'a': 10, 'b': 20, 'c': 30}
d

{'a': 10, 'b': 20, 'c': 30}

In [28]:
list(d.values())

[10, 20, 30]

Any duplicate values in d will be returned as many times as they occur:


In [30]:
d = {'a': 10, 'b': 10, 'c': 10}
list(d.values())

[10, 10, 10]
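
In Python 3, the objects returned by .keys(), .values(), and .items() are views that track later changes to the dictionary, which is why the examples above wrap them in list() to take a snapshot. A short sketch of that behavior:

```python
d = {'a': 10, 'b': 20}

keys_view = d.keys()          # live view of the keys
keys_snapshot = list(d)       # independent copy taken now

d['c'] = 30                   # modify the dictionary afterwards

print(list(keys_view))
print(keys_snapshot)
```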

```d.pop(<key>[, <default>])```: Removes a key from a dictionary, if it is present, and returns its value.

If <key> is present in d, d.pop(<key>) removes <key> and returns its associated value:



In [32]:
d = {'a': 10, 'b': 20, 'c': 30}
d.pop('b')

20

In [33]:
d

{'a': 10, 'c': 30}

d.pop(<key>) raises a KeyError exception if <key> is not in d:

In [34]:
d = {'a': 10, 'b': 20, 'c': 30}
d.pop('z')

KeyError: 'z'

If <key> is not in d, and the optional <default> argument is specified, then that value is returned, and no exception is raised:



In [35]:
d = {'a': 10, 'b': 20, 'c': 30}
d.pop('z', -1)

-1

In [36]:
d

{'a': 10, 'b': 20, 'c': 30}

```d.popitem()``` : Removes a key-value pair from a dictionary.

d.popitem() removes the last key-value pair added from d and returns it as a tuple:



In [37]:
d = {'a': 10, 'b': 20, 'c': 30}
d.popitem()
d

{'a': 10, 'b': 20}

In [38]:
d.popitem()

('b', 20)

In [39]:
d

{'a': 10}

If d is empty, d.popitem() raises a KeyError exception:


In [40]:
# This raises a KeyError
d = {}
d.popitem()

KeyError: 'popitem(): dictionary is empty'
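
Because .popitem() removes entries in last-in, first-out order (guaranteed since Python 3.7), a loop can drain a dictionary without hitting the KeyError. A minimal sketch:

```python
d = {'a': 10, 'b': 20, 'c': 30}

removed = []
while d:                      # an empty dict is falsy, so the loop stops in time
    removed.append(d.popitem())

print(removed)
print(d)
```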

```d.update(<obj>)``` : Merges a dictionary with another dictionary or with an iterable of key-value pairs.

- If <obj> is a dictionary, d.update(<obj>) merges the entries from <obj> into d. For each key in <obj>:
    - If the key is not present in d, the key-value pair from <obj> is added to d.
    - If the key is already present in d, the corresponding value in d for that key is updated to the value from <obj>.

In [41]:
d1 = {'a': 10, 'b': 20, 'c': 30}
d2 = {'b': 200, 'd': 400}
d1.update(d2)
d1

{'a': 10, 'b': 200, 'c': 30, 'd': 400}

In this example, key 'b' already exists in d1, so its value is updated to 200, the value for that key from d2. However, there is no key 'd' in d1, so that key-value pair is added from d2.

```<obj>``` may also be a sequence of key-value pairs, similar to when the dict() function is used to define a dictionary. For example, ```<obj>``` can be specified as a list of tuples:

In [43]:
d1 = {'a': 10, 'b': 20, 'c': 30}
d1.update([('b', 200), ('d', 400)])

In [44]:
d1

{'a': 10, 'b': 200, 'c': 30, 'd': 400}

The values to merge can also be specified as keyword arguments:

In [45]:
d1 = {'a': 10, 'b': 20, 'c': 30}

In [46]:
d1.update(b=200, d=400)

In [47]:
d1

{'a': 10, 'b': 200, 'c': 30, 'd': 400}


## Sets

### Defining a Set
Python’s built-in set type has the following characteristics:

- Sets are unordered.
- Set elements are unique. Duplicate elements are not allowed.
- A set itself may be modified, but the elements contained in the set must be of an immutable type.

In [49]:
x = set(['foo', 'bar', 'baz', 'foo', 'qux'])
x

{'bar', 'baz', 'foo', 'qux'}

In [50]:
x = set(('foo', 'bar', 'baz', 'foo', 'qux'))
x

{'bar', 'baz', 'foo', 'qux'}

Strings are also iterable, so a string can be passed to set() as well.

In [51]:
s = 'quux'
list(s)

['q', 'u', 'u', 'x']

In [52]:
set(s)

{'q', 'u', 'x'}

To recap:

- The argument to set() is an iterable. It generates a list of elements to be placed into the set.
- The objects in curly braces are placed into the set intact, even if they are iterable.

In [53]:
{'foo'}

{'foo'}

In [54]:
set('foo')

{'f', 'o'}

A set can be empty. However, recall that Python interprets empty curly braces ({}) as an empty dictionary, so the only way to define an empty set is with the set() function:

In [55]:
x = set()
type(x)

set

In [56]:
x

set()

In [57]:
x = {}
type(x)

dict

In [59]:
x = set()
bool(x)

False

In [62]:
s1 = {2, 4, 6, 8, 10}
s2 = {'Smith', 'McArthur', 'Wilson', 'Johansson'}

The elements in a set can be objects of different types:

In [63]:
x = {42, 'foo', 3.14159, None}
x

{3.14159, 42, None, 'foo'}

Set elements must be immutable (more precisely, hashable). For example, a tuple may be included in a set:

In [64]:
x = {42, 'foo', (1, 2, 3), 3.14159}
x

{(1, 2, 3), 3.14159, 42, 'foo'}

But lists and dictionaries are mutable, so they can’t be set elements:

In [65]:
a = [1, 2, 3]
{a}

TypeError: unhashable type: 'list'

In [66]:
d = {'a': 1, 'b': 2}
{d}

TypeError: unhashable type: 'dict'

### Set Size and Membership

The len() function returns the number of elements in a set, and the in and not in operators can be used to test for membership:


In [67]:
x = {'foo', 'bar', 'baz'}
len(x)

3

In [68]:
'bar' in x

True

In [69]:
'qux' in x

False

### Operating on a Set


#### Operators vs. Methods


In [71]:
## union
x1 = {'foo', 'bar', 'baz'}
x2 = {'baz', 'qux', 'quux'}

x1 | x2

{'bar', 'baz', 'foo', 'quux', 'qux'}

Set **union** can also be obtained with the **.union()** method. The method is invoked on one of the sets, and the other is passed as an argument:



In [72]:
x1.union(x2)

{'bar', 'baz', 'foo', 'quux', 'qux'}

In [73]:
x1 | ('baz', 'qux', 'quux')

TypeError: unsupported operand type(s) for |: 'set' and 'tuple'

In [74]:
x1.union(('baz', 'qux', 'quux'))

{'bar', 'baz', 'foo', 'quux', 'qux'}

#### Available Operators and Methods


##### Union

<img src="resources/26.webp" width=500></img>

x1.union(x2) and x1 | x2 both return the set of all elements in either x1 or x2:

In [76]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'baz', 'qux', 'quux'}
x1.union(x2)

{'bar', 'baz', 'foo', 'quux', 'qux'}

In [77]:
x1|x2

{'bar', 'baz', 'foo', 'quux', 'qux'}

More than two sets may be specified with either the operator or the method:



In [78]:
a = {1, 2, 3, 4}
b = {2, 3, 4, 5}
c = {3, 4, 5, 6}
d = {4, 5, 6, 7}

a.union(b, c, d)

{1, 2, 3, 4, 5, 6, 7}

In [79]:
a | b | c | d

{1, 2, 3, 4, 5, 6, 7}

##### intersection

<img src="resources/27.webp" width=300></img>

x1.intersection(x2) and x1 & x2 return the set of elements common to both x1 and x2:

In [80]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'baz', 'qux', 'quux'}

x1.intersection(x2)

{'baz'}

In [81]:
x1 & x2

{'baz'}

In [82]:
a = {1, 2, 3, 4}
b = {2, 3, 4, 5}
c = {3, 4, 5, 6}
d = {4, 5, 6, 7}

a.intersection(b, c, d)

{4}

##### difference

<img src="resources/28.webp" width=300></img>

x1.difference(x2) and x1 - x2 return the set of all elements that are in x1 but not in x2:



In [83]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'baz', 'qux', 'quux'}

x1.difference(x2)

{'bar', 'foo'}

In [84]:
x1 - x2

{'bar', 'foo'}

In [86]:
a = {1, 2, 3, 30, 300}
b = {10, 20, 30, 40}
c = {100, 200, 300, 400}

a.difference(b, c)


{1, 2, 3}

In [87]:
a - b - c

{1, 2, 3}

When multiple sets are specified, the operation is performed from left to right. In the example above, a - b is computed first, resulting in {1, 2, 3, 300}. Then c is subtracted from that set, leaving {1, 2, 3}:

<img src="resources/29.webp"></img>

##### symmetric_difference

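<img src="resources/30.webp" width=300></img>

x1.symmetric_difference(x2) and x1 ^ x2 return the set of all elements that are in either x1 or x2, but not in both: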

In [88]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'baz', 'qux', 'quux'}

x1.symmetric_difference(x2)

{'bar', 'foo', 'quux', 'qux'}

In [89]:
x1 ^ x2

{'bar', 'foo', 'quux', 'qux'}

The ^ operator also allows more than two sets:

In [90]:
a = {1, 2, 3, 4, 5}
b = {10, 2, 3, 4, 50}
c = {1, 50, 100}

a ^ b ^ c

{5, 10, 100}

Curiously, although the ```^``` operator allows multiple sets, the ```.symmetric_difference()``` method doesn’t:

In [91]:
a = {1, 2, 3, 4, 5}
b = {10, 2, 3, 4, 50}
c = {1, 50, 100}

a.symmetric_difference(b, c)

TypeError: symmetric_difference() takes exactly one argument (2 given)
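
If the method form is needed for more than two sets, one possible workaround is to apply it pairwise with functools.reduce(). This is only a sketch of that idea, not something the set API provides directly:

```python
from functools import reduce

a = {1, 2, 3, 4, 5}
b = {10, 2, 3, 4, 50}
c = {1, 50, 100}

# Apply .symmetric_difference() pairwise, left to right, like a ^ b ^ c
result = reduce(lambda acc, s: acc.symmetric_difference(s), [b, c], a)

print(result == a ^ b ^ c)
```

The result contains the elements that occur in an odd number of the sets, which matches the chained `^` operator.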

##### isdisjoint


Determines whether or not two sets have any elements in common.





In [92]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'baz', 'qux', 'quux'}

x1.isdisjoint(x2)

False

In [93]:
x2 - {'baz'}

{'quux', 'qux'}

In [94]:
x1.isdisjoint(x2 - {'baz'})

True

If x1.isdisjoint(x2) is True, then x1 & x2 is the empty set:

In [95]:
x1 = {1, 3, 5}
x2 = {2, 4, 6}

x1.isdisjoint(x2)

True

In [96]:
x1 & x2

set()

**Note:** There is no operator that corresponds to the .isdisjoint() method.
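
Although no operator exists, the same test can be expressed with `&`, since an empty intersection is falsy. A sketch comparing the two forms (the third set is added here for illustration):

```python
x1 = {1, 3, 5}
x2 = {2, 4, 6}
x3 = {5, 6, 7}

for other in (x2, x3):
    by_method = x1.isdisjoint(other)
    by_operator = not (x1 & other)    # an empty intersection gives True
    print(by_method, by_operator)
```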

##### issubset


Determines whether one set is a subset of the other.



In [97]:
x1 = {'foo', 'bar', 'baz'}
x1.issubset({'foo', 'bar', 'baz', 'qux', 'quux'})

True

In [98]:
x2 = {'baz', 'qux', 'quux'}

x1 <= x2

False

A set is considered to be a subset of itself:

In [99]:
x = {1, 2, 3, 4, 5}
x.issubset(x)

True

##### x1 < x2

Determines whether one set is a proper subset of the other.


A proper subset is the same as a subset, except that the sets can’t be identical. A set x1 is considered a proper subset of another set x2 if every element of x1 is in x2, and x1 and x2 are not equal.





In [100]:
x1 = {'foo', 'bar'}
x2 = {'foo', 'bar', 'baz'}


x1 < x2

True

In [101]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'bar', 'baz'}
x1 < x2

False
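
Put differently, `x1 < x2` is equivalent to `x1 <= x2 and x1 != x2`. A quick sketch checking that equivalence on a few pairs of sets (the third pair is added for illustration):

```python
pairs = [
    ({'foo', 'bar'}, {'foo', 'bar', 'baz'}),          # proper subset
    ({'foo', 'bar', 'baz'}, {'foo', 'bar', 'baz'}),   # equal sets
    ({'foo', 'qux'}, {'foo', 'bar', 'baz'}),          # not a subset at all
]

results = []
for x1, x2 in pairs:
    proper = x1 < x2
    spelled_out = x1 <= x2 and x1 != x2
    results.append((proper, spelled_out))

print(results)
```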

##### issuperset

Determines whether one set is a superset of the other.



In [102]:
x1 = {'foo', 'bar', 'baz'}
x1.issuperset({'foo', 'bar'})

True

In [103]:
x2 = {'baz', 'qux', 'quux'}

x1 >= x2

False

##### x1 > x2

Determines whether one set is a proper superset of the other.




In [104]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'bar'}
x1 > x2

True

<font color="red">A set is not a proper superset of itself:</font>

In [105]:
x = {1, 2, 3, 4, 5}
x > x

False

#### Modifying a Set

##### update

Modify a set by union.

In [107]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'baz', 'qux'}

x1 |= x2

x1

{'bar', 'baz', 'foo', 'qux'}

##### intersection_update 

Modify a set by intersection.



In [109]:
x1 = {'foo', 'bar', 'baz'}

x2 = {'foo', 'baz', 'qux'}

x1 &= x2

x1

{'baz', 'foo'}

In [110]:
x1.intersection_update(['baz', 'qux'])
x1

{'baz'}

##### difference_update

Modify a set by difference.



In [111]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'baz', 'qux'}
x1 -= x2
x1

{'bar'}

In [112]:
x1.difference_update(['foo', 'bar', 'qux'])
x1

set()

##### symmetric_difference_update

Modify a set by symmetric difference.



In [114]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'baz', 'qux'}
x1 ^= x2
x1

{'bar', 'qux'}

In [115]:
x1.symmetric_difference_update(['qux', 'corge'])
x1

{'bar', 'corge'}

#### Other Methods For Modifying Sets


##### Add

Adds an element to a set.



In [116]:
x = {'foo', 'bar', 'baz'}
x.add('qux')
x

{'bar', 'baz', 'foo', 'qux'}

##### Remove

Removes an element from a set. If the element is not in the set, .remove() raises a KeyError.

In [118]:
x = {'foo', 'bar', 'baz'}
x.remove('baz')
x

{'bar', 'foo'}

In [119]:
x.remove('qux')

KeyError: 'qux'

##### discard

Removes an element from a set. Unlike .remove(), .discard() does nothing if the element is not in the set.



In [120]:
x = {'foo', 'bar', 'baz'}
x.discard('baz')
x

{'bar', 'foo'}

In [122]:
x.discard('qux')
x

{'bar', 'foo'}

##### Pop


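Removes and returns an arbitrary element from a set. Which element is removed is not specified, so do not rely on the order shown here. If the set is empty, .pop() raises a KeyError.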



In [123]:
x = {'foo', 'bar', 'baz'}
x.pop()

'bar'

In [124]:
x

{'baz', 'foo'}

In [125]:
x.pop()

'baz'

In [126]:
x

{'foo'}

In [127]:
x.pop()
x

set()

In [128]:
x.pop()
x

KeyError: 'pop from an empty set'

##### Clear

Clears a set.




In [129]:
x = {'foo', 'bar', 'baz'}
x

{'bar', 'baz', 'foo'}

In [130]:
x.clear()

In [131]:
x

set()

#### Frozen Sets


Python provides another built-in type called **a frozenset, which is in all respects exactly like a set, except that a frozenset is immutable.** 

You can perform non-modifying operations on a frozenset:


In [132]:
x = frozenset(['foo', 'bar', 'baz'])

x

frozenset({'bar', 'baz', 'foo'})

In [133]:
x & {'baz', 'qux', 'quux'}

frozenset({'baz'})

In [134]:
x = frozenset(['foo', 'bar', 'baz'])
x.add('qux')

AttributeError: 'frozenset' object has no attribute 'add'

In [135]:
x.pop()

AttributeError: 'frozenset' object has no attribute 'pop'

In [138]:
x.clear()

AttributeError: 'frozenset' object has no attribute 'clear'

In [139]:
x

frozenset({'bar', 'baz', 'foo'})

Like normal sets, frozenset can also perform different operations like copy, difference, intersection, symmetric_difference, and union.



In [140]:
# Frozensets
# initialize A and B
A = frozenset([1, 2, 3, 4])
B = frozenset([3, 4, 5, 6])

# copying a frozenset
C = A.copy()  # Output: frozenset({1, 2, 3, 4})
print(C)

# union
print(A.union(B))  # Output: frozenset({1, 2, 3, 4, 5, 6})

# intersection
print(A.intersection(B))  # Output: frozenset({3, 4})

# difference
print(A.difference(B))  # Output: frozenset({1, 2})

# symmetric_difference
print(A.symmetric_difference(B))  # Output: frozenset({1, 2, 5, 6})

frozenset({1, 2, 3, 4})
frozenset({1, 2, 3, 4, 5, 6})
frozenset({3, 4})
frozenset({1, 2})
frozenset({1, 2, 5, 6})
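
One practical consequence of immutability is that a frozenset is hashable, so it can be used as a dictionary key or as an element of another set, where a regular set cannot. A small sketch with illustrative data:

```python
pair = frozenset(['foo', 'bar'])

# A frozenset works as a dictionary key
labels = {pair: 'first pair'}
print(labels[frozenset(['bar', 'foo'])])   # element order does not matter

# ... and as a member of another set
nested = {pair, frozenset(['baz'])}
print(len(nested))

# A regular set is unhashable, so the same attempt fails
try:
    {set(['foo', 'bar']): 'first pair'}
except TypeError as exc:
    error = str(exc)
    print('TypeError:', error)
```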



## File Handling

Outline 

- Retrieve file properties
- Create directories
- Match patterns in filenames
- Traverse directory trees
- Make temporary files and directories
- Delete files and directories
- Copy, move, or rename files and directories
- Create and extract ZIP and TAR archives
- Open multiple files using the fileinput module

The basic pattern for reading a text file looks like this:
```
with open('readme.txt') as f:
    lines = f.readlines()
```

To read a text file in Python, you follow these steps:

- First, open a text file for reading by using the **open()** function.
- Second, read text from the file using the **read(), readline(), or readlines()** method of the file object.
- Third, close the file using the **close()** method.


1) <font color="red">open() </font>function

```
open(path_to_file, mode)
```
The **path_to_file** parameter specifies the path to the text file. For example, if the file readme.txt is stored in the sample folder, you specify the path to the file as c:/sample/readme.txt. If the file is in the current working directory, the file name alone is enough.

The **mode** is an optional parameter. It’s a string that specifies the mode in which you want to open the file.

<img src="resources/file_mode.png" width=800>



2) <font color="red">Reading text methods</font>

The file object provides you with three methods for reading text from a text file:

- **read()** – reads all text from a file into a string. This method is useful if you have a small file and you want to manipulate the whole text of that file.
- **readline()** – reads a single line from the file and returns it as a string. Each call returns the next line.
- **readlines()** – reads all the lines of the text file and returns them as a list of strings.

3) <font color= "red"> close() </font> method

f.close()
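
The `with` statement used in the examples of this section calls close() automatically, even if an error occurs inside the block. The sketch below shows the explicit equivalent with try/finally; it writes its own throwaway file (a hypothetical `readme.txt` in a temporary directory) so that it can run anywhere:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'readme.txt')   # throwaway sample file
    with open(path, 'w') as f:
        f.write('first line\nsecond line\n')

    # Explicit open/close: finally guarantees the file is closed
    f = open(path)
    try:
        first = f.readline()
    finally:
        f.close()
    closed_explicitly = f.closed

    # Equivalent using a context manager
    with open(path) as f:
        lines = f.readlines()
    closed_by_with = f.closed

print(repr(first), lines, closed_explicitly, closed_by_with)
```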


In [14]:
with open('resources/file_reader.txt') as f:
    contents = f.read()
    print(contents)

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


In [13]:
lines = []
with open('resources/file_reader.txt') as f:
    lines = f.readlines()

count = 0
for line in lines:
    count += 1
    print(f'line {count}: {line}')  

line 1: Beautiful is better than ugly.

line 2: Explicit is better than implicit.

line 3: Simple is better than complex.

line 4: Complex is better than complicated.

line 5: Flat is better than nested.

line 6: Sparse is better than dense.

line 7: Readability counts.

line 8: Special cases aren't special enough to break the rules.

line 9: Although practicality beats purity.

line 10: Errors should never pass silently.

line 11: Unless explicitly silenced.

line 12: In the face of ambiguity, refuse the temptation to guess.

line 13: There should be one-- and preferably only one --obvious way to do it.

line 14: Although that way may not be obvious at first unless you're Dutch.

line 15: Now is better than never.

line 16: Although never is often better than *right* now.

line 17: If the implementation is hard to explain, it's a bad idea.

line 18: If the implementation is easy to explain, it may be a good idea.

line 19: Namespaces are one honking great idea -- let's do more of those!
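
The blank lines in the output above appear because each line keeps its trailing newline and print() adds another. A sketch of the same loop using enumerate() and rstrip(); it creates a small sample file in a temporary directory (an assumption made so the example is self-contained) instead of reading resources/file_reader.txt:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'sample.txt')
    with open(path, 'w') as f:
        f.write('Beautiful is better than ugly.\nExplicit is better than implicit.\n')

    numbered = []
    with open(path) as f:
        # enumerate() supplies the counter; start=1 gives 1-based numbering
        for count, line in enumerate(f, start=1):
            numbered.append(f'line {count}: {line.rstrip()}')

for text in numbered:
    print(text)
```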

### A more concise way to read a text file line by line


In [12]:
with open('resources/file_reader.txt') as f:
    for line in f:
        print(line)

Beautiful is better than ugly.

Explicit is better than implicit.

Simple is better than complex.

Complex is better than complicated.

Flat is better than nested.

Sparse is better than dense.

Readability counts.

Special cases aren't special enough to break the rules.

Although practicality beats purity.

Errors should never pass silently.

Unless explicitly silenced.

In the face of ambiguity, refuse the temptation to guess.

There should be one-- and preferably only one --obvious way to do it.

Although that way may not be obvious at first unless you're Dutch.

Now is better than never.

Although never is often better than *right* now.

If the implementation is hard to explain, it's a bad idea.

If the implementation is easy to explain, it may be a good idea.

Namespaces are one honking great idea -- let's do more of those!


### Read UTF-8 text files


In [11]:
with open('resources/quotes.txt', encoding='utf8') as f:
    for line in f:
        print(line.strip())

人生で何度も何度も失敗を繰り返してきました。だからこそ、私は成功を収めることができたのです。
どれだけ高く登れたかで人を評価しません。尻餅をついたあと、どれだけ変わったかで評価をするのです。
成功を収める人とは人が投げてきたレンガでしっかりした基盤を築くことができる人のことである。
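
Passing encoding explicitly matters because the default encoding depends on the platform. The sketch below writes and reads back a short non-ASCII string through a temporary file; the sample text and file name are made up for illustration:

```python
import os
import tempfile

text = 'naïve café ☕\n'

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'quotes.txt')

    # Use the same encoding for writing and reading
    with open(path, 'w', encoding='utf8') as f:
        f.write(text)
    with open(path, encoding='utf8') as f:
        round_trip = f.read()

    # On disk, non-ASCII characters take more than one byte each
    size_in_bytes = os.path.getsize(path)

print(round_trip == text, len(text), size_in_bytes)
```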


### Getting a Directory Listing

#### In Older Python Versions

In [16]:
import os
entries = os.listdir('resources/')
for entry in entries:
    print(entry)

.ipynb_checkpoints
24.png
25.png
26.webp
27.webp
28.webp
29.webp
30.webp
break_continue.webp
fibonacci_number.png
file_mode.PNG
file_reader.txt
image1.PNG
quotes.txt
temperature.png


#### In Modern Python Versions


In [22]:
import os

with os.scandir('resources/') as entries:
    for entry in entries:
        print(entry.name)

.ipynb_checkpoints
24.png
25.png
26.webp
27.webp
28.webp
29.webp
30.webp
break_continue.webp
fibonacci_number.png
file_mode.PNG
file_reader.txt
image1.PNG
quotes.txt
temperature.png


In [23]:
from pathlib import Path

entries = Path('resources/')
for entry in entries.iterdir():
    print(entry.name)

.ipynb_checkpoints
24.png
25.png
26.webp
27.webp
28.webp
29.webp
30.webp
break_continue.webp
fibonacci_number.png
file_mode.PNG
file_reader.txt
image1.PNG
quotes.txt
temperature.png


<img src="resources/file_directory.PNG">

### Listing All Files in a Directory


To filter out directories and only list files from a directory listing produced by os.listdir(), use os.path.isfile():

In [25]:
import os

# List all files in a directory using os.listdir
basepath = 'resources/'
for entry in os.listdir(basepath):
    if os.path.isfile(os.path.join(basepath, entry)):
        print(entry)

24.png
25.png
26.webp
27.webp
28.webp
29.webp
30.webp
break_continue.webp
fibonacci_number.png
file_directory.PNG
file_mode.PNG
file_reader.txt
image1.PNG
quotes.txt
temperature.png


An easier way to list files in a directory is to use os.scandir() or pathlib.Path():



In [26]:
# List all files in a directory using scandir()
basepath = 'resources/'
with os.scandir(basepath) as entries:
    for entry in entries:
        if entry.is_file():
            print(entry.name)

24.png
25.png
26.webp
27.webp
28.webp
29.webp
30.webp
break_continue.webp
fibonacci_number.png
file_directory.PNG
file_mode.PNG
file_reader.txt
image1.PNG
quotes.txt
temperature.png


Using os.scandir() has the advantage of looking cleaner and being easier to understand than using os.listdir(), even though it is one line of code longer. Calling entry.is_file() on each item in the ScandirIterator returns True if the object is a file.

Using pathlib.Path():

In [27]:
from pathlib import Path

basepath = Path('resources/')
files_in_basepath = basepath.iterdir()
for item in files_in_basepath:
    if item.is_file():
        print(item.name)

24.png
25.png
26.webp
27.webp
28.webp
29.webp
30.webp
break_continue.webp
fibonacci_number.png
file_directory.PNG
file_mode.PNG
file_reader.txt
image1.PNG
quotes.txt
temperature.png


### Listing Subdirectories

Use os.listdir() and os.path.isdir():

In [28]:
import os

# List all subdirectories using os.listdir
basepath = 'resources/'
for entry in os.listdir(basepath):
    if os.path.isdir(os.path.join(basepath, entry)):
        print(entry)

.ipynb_checkpoints


Use os.scandir():

In [30]:
import os

# List all subdirectories using scandir()
basepath = 'resources/'
with os.scandir(basepath) as entries:
    for entry in entries:
        if entry.is_dir():
            print(entry.name)

.ipynb_checkpoints


Use pathlib.Path():

In [31]:
from pathlib import Path

# List all subdirectories using pathlib
basepath = Path('resources/')
for entry in basepath.iterdir():
    if entry.is_dir():
        print(entry.name)

.ipynb_checkpoints


### Getting File Attributes


In [32]:
import os
with os.scandir('resources/') as dir_contents:
    for entry in dir_contents:
        info = entry.stat()
        print(info.st_mtime)

1657015361.2817025
1655891008.0
1655891117.0
1655953086.0
1655953290.0
1655953436.0
1655953543.0
1655953624.0
1656998637.29793
1657002855.411434
1657019650.1380072
1657015987.7004168
1657019320.5655973
1656988770.8618238
1657016958.9656863
1657002395.9705684


In [33]:
from datetime import datetime, timezone
from os import scandir

def convert_date(timestamp):
    # Convert a POSIX timestamp to a date string such as '05 Jul 2022' (UTC)
    d = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    formatted_date = d.strftime('%d %b %Y')
    return formatted_date

def get_files():
    # Print each regular file in resources/ with its last-modified date
    with scandir('resources/') as dir_entries:
        for entry in dir_entries:
            if entry.is_file():
                info = entry.stat()
                print(f'{entry.name}\t Last Modified: {convert_date(info.st_mtime)}')

In [34]:
get_files()

24.png	 Last Modified: 22 Jun 2022
25.png	 Last Modified: 22 Jun 2022
26.webp	 Last Modified: 23 Jun 2022
27.webp	 Last Modified: 23 Jun 2022
28.webp	 Last Modified: 23 Jun 2022
29.webp	 Last Modified: 23 Jun 2022
30.webp	 Last Modified: 23 Jun 2022
break_continue.webp	 Last Modified: 05 Jul 2022
fibonacci_number.png	 Last Modified: 05 Jul 2022
file_directory.PNG	 Last Modified: 05 Jul 2022
file_mode.PNG	 Last Modified: 05 Jul 2022
file_reader.txt	 Last Modified: 05 Jul 2022
image1.PNG	 Last Modified: 05 Jul 2022
quotes.txt	 Last Modified: 05 Jul 2022
temperature.png	 Last Modified: 05 Jul 2022


The format string passed to .strftime() uses the following directives:

- %d: the day of the month
- %b: the month, in abbreviated form
- %Y: the year
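As a quick check of these directives, the sketch below formats one of the st_mtime values printed earlier. It assumes the timestamp should be read as UTC and that the default locale is in effect (so %b gives an English month abbreviation); the variable names are illustrative only.

```python
from datetime import datetime, timezone

# One of the st_mtime values printed above, in seconds since the epoch
timestamp = 1655891008.0

# Build a timezone-aware datetime in UTC, then format it
d = datetime.fromtimestamp(timestamp, tz=timezone.utc)
formatted = d.strftime('%d %b %Y')
print(formatted)
```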

### Making Directories

<img src="resources/make_directories.png" width=600>


#### Creating a Single Directory

In [1]:
# using os.mkdir():

import os

os.mkdir('example_directory/')

FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'example_directory/'

**If the path already exists, mkdir() raises a FileExistsError:**

In [2]:
from pathlib import Path

p = Path('example_directory')
try:
    p.mkdir()
except FileExistsError as exc:
    print(exc)

[WinError 183] Cannot create a file when that file already exists: 'example_directory'


In [3]:
from pathlib import Path

p = Path('example_directory')
p.mkdir(exist_ok=True)

#### Creating Multiple Directories

In [4]:
import os
os.makedirs('2018/10/05')

In [5]:
import os

os.makedirs('2018/10/05', mode=0o770)

FileExistsError: [WinError 183] Cannot create a file when that file already exists: '2018/10/05'
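The FileExistsError above can be avoided with exist_ok=True, which os.makedirs() accepts just like Path.mkdir(). The following is a minimal sketch that works inside a temporary directory so it leaves nothing behind; the directory names are hypothetical.

```python
import os
import tempfile
from pathlib import Path

# Work inside a throwaway directory so the example cleans up after itself
with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, '2018', '10', '05')

    os.makedirs(nested)                 # creates 2018/, 2018/10/ and 2018/10/05/
    os.makedirs(nested, exist_ok=True)  # second call does nothing instead of raising
    created_with_os = os.path.isdir(nested)

    # pathlib equivalent: parents=True creates the intermediate directories
    other = Path(base) / '2019' / '01' / '15'
    other.mkdir(parents=True, exist_ok=True)
    created_with_pathlib = other.is_dir()

print(created_with_os, created_with_pathlib)
```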

### Making Temporary Files and Directories


The **tempfile** module creates files and directories that hold data temporarily while your program is running, and it handles their deletion when your program is done with them.

In [8]:
from tempfile import TemporaryFile

# Create a temporary file and write some data to it
fp = TemporaryFile('w+t')
fp.write('Hello universe!')

# Go back to the beginning and read data from file
fp.seek(0)
data = fp.read()

# Close the file, after which it will be removed
fp.close()
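The same steps can also be written with a with statement, so the file is closed (and therefore removed) even if an exception occurs in between. This is a sketch of that variant, not a replacement for the cell above.

```python
from tempfile import TemporaryFile

# The with statement closes the temporary file on exit, which also removes it
with TemporaryFile('w+t') as fp:
    fp.write('Hello universe!')
    fp.seek(0)        # rewind to the beginning before reading
    data = fp.read()

print(data)
```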

In [9]:
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    print('Created temporary directory ', tmpdir)
    os.path.exists(tmpdir)  # True while the with block is active

Created temporary directory  C:\Users\Dell\AppData\Local\Temp\tmplej3cmfm


In [13]:
os.path.exists(tmpdir)

False

After the with block exits, the temporary directory is deleted, and a call to os.path.exists(tmpdir) returns False, which confirms that the directory was successfully removed.

### Deleting Files and Directories


#### Deleting Files in Python

To delete a single file, use pathlib.Path.unlink(), os.remove(), or os.unlink().

os.remove() and os.unlink() are semantically identical. To delete a file using os.remove(), do the following:

In [None]:
import os

data_file = 'C:\\Users\\vuyisile\\Desktop\\Test\\data.txt'
os.remove(data_file)

Deleting a file using os.unlink() is similar to how you do it using os.remove():


In [None]:
import os

data_file = 'C:\\Users\\vuyisile\\Desktop\\Test\\data.txt'
os.unlink(data_file)

Calling .unlink() or .remove() on a file deletes the file from the filesystem. 

These two functions raise an OSError if the path passed to them points to a directory instead of a file.

To avoid this, you can either check that what you’re trying to delete is actually a file and only delete it if it is, or you can use exception handling to handle the OSError:

In [None]:
import os

data_file = 'home/data.txt'

# If the file exists, delete it
if os.path.isfile(data_file):
    os.remove(data_file)
else:
    print(f'Error: {data_file} not a valid filename')

In [None]:
import os

data_file = 'home/data.txt'

# Use exception handling
try:
    os.remove(data_file)
except OSError as e:
    print(f'Error: {data_file} : {e.strerror}')
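pathlib.Path.unlink() was mentioned at the start of this section but not shown. Below is a minimal sketch using a hypothetical file inside a temporary directory; note that the missing_ok parameter is only available from Python 3.8 onward.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    data_file = Path(base) / 'data.txt'   # hypothetical file name
    data_file.write_text('some data')

    data_file.unlink()                    # delete the file
    exists_after = data_file.exists()

    # A plain second unlink() would raise FileNotFoundError;
    # missing_ok=True (Python 3.8+) suppresses that error
    data_file.unlink(missing_ok=True)

print(exists_after)
```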

#### Deleting Entire Directory Trees

To delete non-empty directories and entire directory trees, Python offers shutil.rmtree():


In [None]:
import shutil

trash_dir = 'my_documents/bad_dir'

try:
    shutil.rmtree(trash_dir)
except OSError as e:
    print(f'Error: {trash_dir} : {e.strerror}')
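To see why shutil.rmtree() is needed for non-empty directories, the sketch below builds a small hypothetical tree in a scratch location and compares it with os.rmdir(), which only removes empty directories.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                    # scratch directory for the demo
trash_dir = os.path.join(base, 'bad_dir')    # hypothetical directory name
os.makedirs(os.path.join(trash_dir, 'sub'))
with open(os.path.join(trash_dir, 'sub', 'note.txt'), 'w') as f:
    f.write('x')

# os.rmdir() refuses to remove a directory that still has contents
try:
    os.rmdir(trash_dir)
    rmdir_failed = False
except OSError:
    rmdir_failed = True

# shutil.rmtree() removes the directory and everything below it
shutil.rmtree(trash_dir)
tree_gone = not os.path.exists(trash_dir)

os.rmdir(base)  # base is empty now, so os.rmdir() succeeds
print(rmdir_failed, tree_gone)
```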

<img src="resources/delete_directories.PNG">

### Copying, Moving, and Renaming Files and Directories

#### Copying Files in Python


In [None]:
import shutil

src = 'path/to/file.txt'
dst = 'path/to/dest_dir'
shutil.copy(src, dst)

shutil.copy() is comparable to the cp command on UNIX-based systems.

shutil.copy(src, dst) will copy the file src to the location specified in dst. 

If dst is a file, the contents of that file are replaced with the contents of src. 

If dst is a directory, then src will be copied into that directory. 

**shutil.copy() only copies the file’s contents and the file’s permissions. Other metadata like the file’s creation and modification times are not preserved.**

To preserve that metadata as far as the platform allows, use shutil.copy2() instead:

In [None]:
import shutil

src = 'path/to/file.txt'
dst = 'path/to/dest_dir'
shutil.copy2(src, dst)
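The difference between the two functions can be observed by backdating a file's modification time and copying it both ways. This is a sketch with hypothetical file names in a temporary directory; the backdated timestamp is an arbitrary value chosen for the demonstration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as base:
    src = os.path.join(base, 'file.txt')
    with open(src, 'w') as f:
        f.write('hello')

    # Backdate the source's access and modification times
    old_time = 1_000_000_000
    os.utime(src, (old_time, old_time))

    plain = shutil.copy(src, os.path.join(base, 'copy.txt'))    # contents + permissions
    full = shutil.copy2(src, os.path.join(base, 'copy2.txt'))   # also copies timestamps

    plain_mtime = int(os.stat(plain).st_mtime)
    full_mtime = int(os.stat(full).st_mtime)

print('copy() kept mtime: ', plain_mtime == old_time)
print('copy2() kept mtime:', full_mtime == old_time)
```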

#### Copying Directories


In [None]:
import shutil
shutil.copytree('data_1', 'data1_backup')

In this example, .copytree() copies the contents of data_1 to a new location data1_backup and returns the destination directory. 

The destination directory must not already exist. It is created along with any missing parent directories. shutil.copytree() is a good way to back up your files.
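The behaviour described above can be tried out safely in a temporary directory. In this sketch the directory and file names are hypothetical, and dirs_exist_ok requires Python 3.8 or later.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    src = Path(base) / 'data_1'
    (src / 'nested').mkdir(parents=True)
    (src / 'nested' / 'a.txt').write_text('alpha')

    # 'backups' does not exist yet; copytree() creates it along the way
    dst = Path(base) / 'backups' / 'data1_backup'
    returned = shutil.copytree(src, dst)
    copied_text = (dst / 'nested' / 'a.txt').read_text()

    # Copying into an existing destination fails by default ...
    try:
        shutil.copytree(src, dst)
        second_copy_failed = False
    except FileExistsError:
        second_copy_failed = True

    # ... unless dirs_exist_ok=True is passed (Python 3.8+)
    shutil.copytree(src, dst, dirs_exist_ok=True)

print(copied_text, second_copy_failed)
```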



#### Moving Files and Directories

To move a file or directory, use shutil.move(src, dst), where src is the file or directory to be moved and dst is the destination:

In [None]:
import shutil
shutil.move('dir_1/', 'backup/')

**shutil.move('dir_1/', 'backup/') moves dir_1/ into backup/ if backup/ exists. If backup/ does not exist, dir_1/ will be renamed to backup.**
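Both cases can be reproduced in a temporary directory. In the sketch below, dir_2 is a hypothetical second directory added only to show the "destination exists" case.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as base:
    dir_1 = os.path.join(base, 'dir_1')
    dir_2 = os.path.join(base, 'dir_2')
    backup = os.path.join(base, 'backup')
    os.mkdir(dir_1)
    os.mkdir(dir_2)

    # backup does not exist yet: dir_1 is renamed to backup
    shutil.move(dir_1, backup)
    renamed = os.path.isdir(backup) and not os.path.exists(dir_1)

    # backup exists now: dir_2 is moved inside it
    shutil.move(dir_2, backup)
    moved_inside = os.path.isdir(os.path.join(backup, 'dir_2'))

print(renamed, moved_inside)
```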



#### Renaming Files and Directories


In [15]:
os.rename('first.zip', 'first_01.zip')

FileNotFoundError: [WinError 2] The system cannot find the file specified: 'first.zip' -> 'first_01.zip'

In [None]:
from pathlib import Path
data_file = Path('data_01.txt')
data_file.rename('data.txt')
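A self-contained version of both renaming approaches might look like the following sketch. It creates its own file in a temporary directory and catches the FileNotFoundError seen above rather than letting it propagate. Path.rename() returns the new path from Python 3.8 onward.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    old = Path(base) / 'data_01.txt'
    old.write_text('payload')

    # Path.rename() returns a Path pointing at the new name (Python 3.8+)
    new = old.rename(Path(base) / 'data.txt')
    old_exists, new_exists = old.exists(), new.exists()

    # Renaming a file that does not exist raises FileNotFoundError
    try:
        os.rename(Path(base) / 'first.zip', Path(base) / 'first_01.zip')
        error_name = None
    except FileNotFoundError as exc:
        error_name = type(exc).__name__

print(old_exists, new_exists, error_name)
```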