# Agenda

1. Dictionary
2. Set
3. Combinations of data structures
4. Files
5. Functions

# Dictionary

`dict` is also known as:

- Hash table
- Hash
- Hashmap
- Map
- Key-value store
- Name-value store
- Associative array

We define both the key and the value for each pair. Some rules:

- Keys can be any *immutable* value (strictly speaking, any *hashable* value).
- Keys are unique within any given dict.
- Values can be anything at all.
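
A short sketch of these rules. The dict `locations` and its contents are invented for illustration; the point is only which kinds of keys Python accepts.

```python
# Strings, ints, and tuples can all be used as keys
locations = {'home': 1, 42: 'answer', (32.1, 34.8): 'office'}

# Assigning to an existing key replaces its value, so keys stay unique
locations['home'] = 2

# A list is mutable, so Python refuses to use it as a key
try:
    locations[[1, 2]] = 'oops'
except TypeError as e:
    error_message = str(e)

print(len(locations))      # 3
print(locations['home'])   # 2
```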

## Syntax

- We use `{}` to define a dict
- We separate a key and value with `:`
- We separate pairs with `,`

In [1]:
d = {'a':10, 'b':20, 'c':30}
type(d)

dict

In [2]:
len(d)

3

In [3]:
# we use `[]` to retrieve values via the key

d['a']

10

In [4]:
d['x']

KeyError: 'x'

In [5]:
# to find if a key is in the dict, we use "in"

'a' in d

True

In [6]:
10 in d

False

In [7]:
'A' in d

False

In [8]:
' a' in d

False

In [9]:
d.keys()

dict_keys(['a', 'b', 'c'])

# Exercise: Restaurant

1. Define a dict, `menu`, with strings (menu items) as keys and ints (prices) as values.
2. Define `total` to be 0.
3. Ask the user repeatedly to order something
    - If they enter an empty string, stop asking
    - If they enter something in the menu, print the price + the updated total.
    - If they enter something *not* on the menu, then scold them
4. Give the total

Example:

    Order: sandwich
    sandwich is 20, total is 20
    Order: tea
    tea is 10, total is 30
    Order: elephant
    We're out of elephant today!
    Order: [ENTER]
    Total is 30

In [10]:
menu = {'sandwich': 20, 'tea':10, 'cake':12, 'apple':4}
total = 0

while True:
    order = input('Order: ').strip()

    if not order:
        break

    if order in menu:
        price = menu[order]
        total += price
        print(f'{order} costs {price}, total is now {total}')
    else:
        print(f'Sorry, we are out of {order} today!')

print(f'{total=}')

Order:  sandwich


sandwich costs 20, total is now 20


Order:  tea


tea costs 10, total is now 30


Order:  asdfafadfa


Sorry, we are out of asdfafadfa today!


Order:  


total=30


# Dictionaries are mutable!

1. We can change the values
2. We can add new pairs
3. We can remove pairs

In [11]:
d

{'a': 10, 'b': 20, 'c': 30}

In [12]:
d['b'] = 999
d

{'a': 10, 'b': 999, 'c': 30}

In [13]:
# add a new pair
d['x'] = 876

In [14]:
d

{'a': 10, 'b': 999, 'c': 30, 'x': 876}

In [15]:
mylist = [10, 20, 30, 40]

d[mylist] = 123

TypeError: cannot use 'list' as a dict key (unhashable type: 'list')

In [16]:
# remove a pair from the dict with dict.pop
# removes the pair, and returns the value

d.pop('x')

876

In [17]:
d

{'a': 10, 'b': 999, 'c': 30}

In [18]:
d.pop('x')

KeyError: 'x'
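
If a missing key should not be an error, `dict.pop` accepts an optional second argument that it returns instead of raising `KeyError`. A minimal sketch, starting from the same dict contents as above:

```python
d = {'a': 10, 'b': 999, 'c': 30}

# With a second argument, pop returns that default when the key is missing
missing = d.pop('x', None)
removed = d.pop('c', None)

print(missing)   # None
print(removed)   # 30
print(d)         # {'a': 10, 'b': 999}
```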

# 3 paradigms for dicts

1. Read-only dict defined at start of program
2. Define keys and default values at the start, modify values but don't add/remove keys
3. Start with an empty dict, add keys/values as needed
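
The first paradigm, a lookup table that is defined once and then only read, might look like the sketch below. The name `month_days` and its sample data are invented for illustration.

```python
# Paradigm 1: defined once at the top of the program, then only read
month_days = {'jan': 31, 'feb': 28, 'mar': 31}   # hypothetical sample data

lookups = []
for month in ['feb', 'mar', 'xyz']:
    if month in month_days:
        lookups.append(f'{month} has {month_days[month]} days')
    else:
        lookups.append(f'{month} is not a known month')

for line in lookups:
    print(line)
```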

In [19]:
counts = {'vowels':0,
         'digits':0,
         'others':0}

text = input('Enter text: ').strip()

for one_character in text:
    if one_character in 'aeiou':
        counts['vowels'] += 1
    elif one_character.isdigit():
        counts['digits'] += 1
    else:
        counts['others'] += 1

print(counts)        

Enter text:  this is a test!!! 123


{'vowels': 4, 'digits': 3, 'others': 14}


In [20]:
# iterating on a dict

for one_item in counts:
    print(one_item)

vowels
digits
others


In [21]:
for one_key in counts:
    print(f'{one_key}: {counts[one_key]}')

vowels: 4
digits: 3
others: 14


In [22]:
# dict.items -- returns a (key, value) tuple with each iteration

for one_item in counts.items():
    print(one_item)

('vowels', 4)
('digits', 3)
('others', 14)


In [23]:
# tuple unpacking!

for key, value in counts.items():
    print(f'{key}: {value}')

vowels: 4
digits: 3
others: 14


In [24]:
counts.values()

dict_values([4, 3, 14])
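
Because `dict.values` gives us something we can iterate over, it works with functions such as `sum` and `max`. A small sketch, reusing the counts from the run above:

```python
counts = {'vowels': 4, 'digits': 3, 'others': 14}

total_characters = sum(counts.values())
largest = max(counts.values())

print(total_characters)        # 21
print(largest)                 # 14
print(4 in counts.values())    # True: this searches the values, not the keys
print(4 in counts)             # False: "in" on the dict itself checks keys
```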

# Exercise: Odds and evens

1. Define a dict with two keys, `odds` and `evens`. The values will be empty lists, `[]`.
2. Ask the user to enter a string. The string should contain numbers separated by spaces.
3. If the user gives an empty string, exit.
4. Go through each "word" in the string that the user gave:
    - If we cannot turn it into a number, scold the user
    - If we can turn it into a number, we do so -- and add to the list of either odds or evens
5. Go back to step 2, and ask the user again for a string of numbers.
6. Print all odd numbers and all even numbers.

Example:

    Enter numbers: 10 15 hello 12
    10 is even
    15 is odd
    hello is not a number; ignoring
    12 is even
    Enter numbers: 25 72 goodbye
    25 is odd
    72 is even
    goodbye is not a number; ignoring
    Enter numbers: [ENTER]
    odds: 15 25
    evens: 10 12 72
    

In [26]:
counts = {'odds': [],
         'evens': []}

while True:
    text = input('Enter numbers: ').strip()

    if not text:    # got an empty string? exit the loop
        break
    
    for one_word in text.split():    # run over each word that we got

        if not one_word.isdigit():   # check that we got a number; if not, go to the next
            print(f'{one_word} is not numeric; ignoring')
            continue
    
        n = int(one_word)            # convert to an int
    
        if n % 2:   # odd
            print(f'{n} is odd')
            counts['odds'].append(n)
        else:
            print(f'{n} is even')
            counts['evens'].append(n)

for key, value in counts.items():
    print(f'{key}: {value}')

Enter numbers:  10 15 20 25 hello 31 32


10 is even
15 is odd
20 is even
25 is odd
hello is not numeric; ignoring
31 is odd
32 is even
odds: [15, 25, 31]
evens: [10, 20, 32]
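
One limitation of `str.isdigit` is that it rejects negative numbers such as `-15`. An alternative sketch, not the solution above, is to attempt the conversion and catch the `ValueError`. The variable `text` stands in for what `input` would return, and `ignored` is a name added here for illustration.

```python
counts = {'odds': [], 'evens': []}
ignored = []

text = '10 -15 hello 12'    # stands in for what input() would return

for one_word in text.split():
    try:
        n = int(one_word)          # raises ValueError if this isn't a number
    except ValueError:
        ignored.append(one_word)
        continue

    if n % 2:                      # -15 % 2 is 1 in Python, so -15 is odd
        counts['odds'].append(n)
    else:
        counts['evens'].append(n)

print(counts)    # {'odds': [-15], 'evens': [10, 12]}
print(ignored)   # ['hello']
```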


In [28]:
# Paradigm 3: Start with nothing

counts = {}

text = input('Enter text: ').strip()

for one_character in text:
    if one_character in counts:
        counts[one_character] += 1    # subsequent times
    else:
        counts[one_character] = 1     # first time we get one_character

print(counts)    

Enter text:  abca


{'a': 2, 'b': 1, 'c': 1}


In [32]:
d = {'a':10, 'b':20, 'c':30}

while True:
    k = input('Enter a key: ').strip()

    if not k:
        break

#     value = d.get(k)   # get None if the key doesn't exist!
    value = d.get(k, 'No such key')
    print(f'd[{k}] = {value}')

Enter a key:  x


d[x] = No such key


Enter a key:  a


d[a] = 10


Enter a key:  b


d[b] = 20


Enter a key:  


In [33]:
# Paradigm 3: Start with nothing

counts = {}

text = input('Enter text: ').strip()

for one_character in text:
    counts[one_character] = counts.get(one_character, 0) + 1

print(counts)    

Enter text:  abca


{'a': 2, 'b': 1, 'c': 1}
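
The standard library's `collections.Counter` packages up this same counting pattern. This is an alternative to the approach above, shown only for comparison:

```python
from collections import Counter

counts = Counter('abca')

print(counts)         # Counter({'a': 2, 'b': 1, 'c': 1})
print(counts['a'])    # 2
print(counts['z'])    # 0 -- a missing key counts as zero, with no KeyError
```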


# Exercise: Rainfall

1. Define an empty dict, `rainfall`. The keys will be city names (string), and the values will be lists of integers (mm rain in each city).
2. Ask the user repeatedly to enter the name of a city
    - If they give an empty string, exit the loop
3. Ask the user how much rain fell
4. Add the key-value pair if needed, or add to the list as needed.
5. Print the dict, all cities and values, including mean (average) rain for each city.

Example:

    city: a
    rain: 5
    city: b
    rain: 4
    city: a
    rain: 3
    city: [ENTER]
    a: [5, 3], mean 4.0
    b: [4], mean 4.0

In [36]:
rainfall = {}

while True:
    city_name = input('City: ').strip()

    if not city_name:
        break

    mm_rain = input('Rain: ').strip()
    mm_rain = int(mm_rain)

    if city_name not in rainfall:
        rainfall[city_name] = []
    
    rainfall[city_name].append(mm_rain) 

for city_name, mm_rain in rainfall.items():
    print(f'{city_name}: {mm_rain}, mean {sum(mm_rain)/len(mm_rain)}')

City:  a
Rain:  10
City:  b
Rain:  8
City:  a
Rain:  6
City:  b
Rain:  4
City:  


a: [10, 6], mean 8.0
b: [8, 4], mean 6.0
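
The "add the key if needed, then append" step can also be written with `dict.setdefault`. A sketch, in which the list `readings` stands in for the answers the user would type:

```python
rainfall = {}

# stands in for the (city, rain) answers the user would type
readings = [('a', 10), ('b', 8), ('a', 6), ('b', 4)]

for city_name, mm_rain in readings:
    # setdefault returns the existing list, or stores and returns a new []
    rainfall.setdefault(city_name, []).append(mm_rain)

for city_name, all_rain in rainfall.items():
    print(f'{city_name}: {all_rain}, mean {sum(all_rain) / len(all_rain)}')
```

With these readings, the output matches the run above: `a: [10, 6], mean 8.0` and `b: [8, 4], mean 6.0`.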


# Dict (hash table)

Originally, Python created each new dict with room for 8 key-value pairs ("slots").

If I store `d['a'] = 10`:

- Python would calculate `hash('a')`
- Python would then take that hash `% 8` to choose the slot
- When we used 2/3 of the slots in the dict's memory, it would double in size
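
To make the arithmetic concrete, here is a sketch using small ints as keys. In CPython, the hash of a small int is the int itself, whereas string hashes differ from one Python run to the next. The table size of 8 comes from the description above; collisions and resizing are left out.

```python
table_size = 8

for key in [10, 20, 35]:
    slot = hash(key) % table_size
    print(f'hash({key}) = {hash(key)}, slot {slot}')

slots = [hash(key) % table_size for key in [10, 20, 35]]
print(slots)    # [2, 4, 3]
```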

In [37]:
hash('a')

4878313245049174264

In [38]:
'a' in d   # found via the hash, without scanning every key

True

# Dict (hash table)

Now, Python uses *two* structures behind the scenes:

- 3-column table with index (starting with 0), key, and value
- a C array of ints, the indexes into the table

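If I store `d['a'] = 10`:

- Python adds a new row to the table with `'a'` and `10`
- Python calculates `hash('a')`
- Python then uses `% 8` to find the location in the *array*. There, Python stores the row's index in the table

If I request `d['a']`, Python again calculates `hash('a') % 8`, reads the index stored at that location in the array, and returns the value from that row of the table.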
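
The two structures can be modeled in a few lines of Python. This is a toy sketch, not CPython's actual code: the names `entries`, `indices`, `store`, and `fetch` are invented here, collisions are handled with simple linear probing, and resizing is left out.

```python
# Toy model only: real CPython also stores each hash, resizes, and probes differently
SIZE = 8
entries = []              # the table: the row number is the index; each row is [key, value]
indices = [None] * SIZE   # the array: each used slot holds a row number

def find_slot(key):
    slot = hash(key) % SIZE
    # on a collision, step to the next slot (simple linear probing)
    while indices[slot] is not None and entries[indices[slot]][0] != key:
        slot = (slot + 1) % SIZE
    return slot

def store(key, value):
    slot = find_slot(key)
    if indices[slot] is None:        # new key: add a row, remember its number
        entries.append([key, value])
        indices[slot] = len(entries) - 1
    else:                            # existing key: overwrite the value
        entries[indices[slot]][1] = value

def fetch(key):
    slot = find_slot(key)
    if indices[slot] is None:
        raise KeyError(key)
    return entries[indices[slot]][1]

store('a', 10)
store('b', 20)
store('a', 11)

print(fetch('a'))    # 11
print(entries)       # [['a', 11], ['b', 20]]
```

Because new rows are always appended to `entries`, iterating over that list gives the keys in insertion order, which is consistent with the ordering that `d.keys()` shows.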

In [39]:
d = {'a':10, 'b':20, 'c':30}

In [40]:
d.keys()

dict_keys(['a', 'b', 'c'])

In [41]:
d.pop('b')
d['b'] = 999

d.keys()

dict_keys(['a', 'c', 'b'])

In [42]:
x = 10

In [43]:
globals()

{'__name__': '__main__',
 '__doc__': 'Automatically created module for IPython interactive environment',
 '__package__': None,
 '__loader__': None,
 '__spec__': None,
 '__builtin__': <module 'builtins' (built-in)>,
 '__builtins__': <module 'builtins' (built-in)>,
 '_ih': ['', "d = {'a':10, 'b':20, 'c':30}\ntype(d)", 'len(d)', ...],
 ...}

In [45]:
globals()['x']

10

In [46]:
d = {'a':10, 'b':20, 'c':30}

'a' in d

True

In [47]:
features = {'admin': True, 'defined':True}

'admin' in features

True

# set -- just the keys from the dict

- Elements must be immutable (hashable), just like dict keys
- Elements are unique
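
A quick sketch of both rules, using invented sample data: duplicates disappear when a list is turned into a set, and a mutable list cannot be an element.

```python
numbers = [10, 20, 10, 30, 20, 10]

unique_numbers = set(numbers)
print(len(unique_numbers))       # 3
print(sorted(unique_numbers))    # [10, 20, 30]

# a tuple is immutable, so it can be an element; a list cannot
points = {(1, 2), (3, 4)}
try:
    points.add([5, 6])
    list_allowed = True
except TypeError:
    list_allowed = False

print(list_allowed)              # False
```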

In [48]:
s = set([10, 20, 30, 40, 50])
s

{10, 20, 30, 40, 50}

In [49]:
# this looks like an empty set, but {} is actually an empty dict
s = {}

type(s)

dict

In [50]:
# for an empty set, we must use set()

s = set()
type(s)

set

In [51]:
s = {10, 20, 30}
s.add(40)
s

{10, 20, 30, 40}

In [52]:
s.add(40)
s.add(40)
s.add(40)
s

{10, 20, 30, 40}

In [53]:
40 in s

True

In [54]:
s.remove(40)
s

{10, 20, 30}

In [55]:
s[0]

TypeError: 'set' object is not subscriptable

In [56]:
{10, 20, 30} == {30, 20, 10}

True

In [57]:
# set.update adds a number of items

s.update([20, 30, 40, 50, 60])
s

{10, 20, 30, 40, 50, 60}

In [58]:
s1 = {10, 20, 30, 40}
s2 = {30, 40, 50, 60}
s3 = {10, 30}



In [60]:
s3 < s1   # s3 is a proper (strict) subset of s1!

True

In [61]:
s3 < s3   # a set is not a *proper* subset of itself

False

In [62]:
s3 <= s3   # <= allows the two sets to be equal

True

In [64]:
s1 & s2  # intersection

{30, 40}

In [66]:
s1  | s2  # union

{10, 20, 30, 40, 50, 60}

In [67]:
s1 - s2    # what's in s1 and not in s2?

{10, 20}

In [68]:
s2 - s1   # what's in s2 and not s1?

{50, 60}

In [69]:
s1.symmetric_difference(s2)   # xor

{10, 20, 50, 60}

In [70]:
s1 ^ s2  # also xor!

{10, 20, 50, 60}
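
Each operator also has a method form. One practical difference, sketched here, is that the methods accept any iterable, while the operators require both sides to be sets.

```python
s1 = {10, 20, 30, 40}
s2 = {30, 40, 50, 60}

print(s1.intersection(s2) == s1 & s2)    # True
print(s1.union(s2) == s1 | s2)           # True
print(s1.difference(s2) == s1 - s2)      # True
print({10, 30}.issubset(s1))             # True

# methods accept any iterable, such as a list
from_list = s1.intersection([30, 99])
print(from_list)                         # {30}
```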

# Combinations of data structures

1. List of lists
2. List of tuples
3. List of dicts
4. Dict of lists
5. Dict of dicts (tree)
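
A few of these combinations, sketched with invented sample data. In each case, we retrieve from the outer structure first and then from the inner one.

```python
# list of dicts: each dict is one record
people = [{'first': 'Ada', 'age': 36},
          {'first': 'Alan', 'age': 41}]
print(people[1]['first'])          # Alan

# dict of lists
rainfall = {'a': [10, 6], 'b': [8, 4]}
print(rainfall['a'][0])            # 10

# dict of dicts
menu = {'tea': {'price': 10, 'hot': True},
        'cake': {'price': 12, 'hot': False}}
print(menu['cake']['price'])       # 12

# list of tuples, with unpacking in the for loop
pairs = [('a', 10), ('b', 20)]
for key, value in pairs:
    print(f'{key}: {value}')
```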

# Files

We open a file with `open`. The first argument is a string, the filename. The second argument is optional, defaulting to `'r'`, for reading from the file.

We get "file-like objects" back from `open`.

In [71]:
open('/etc/passwd') 

<_io.TextIOWrapper name='/etc/passwd' mode='r' encoding='UTF-8'>

In [72]:
%pwd

'/Users/reuven/Courses/Current/Intel-2025-10October-26'

In [73]:
open('myfile.txt')

FileNotFoundError: [Errno 2] No such file or directory: 'myfile.txt'
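
A minimal sketch of writing a file, reading it back, and coping with a missing file. It uses a temporary directory so that it can run anywhere; the filenames are invented, and the `with` statement (which closes the file automatically) is a choice made for this sketch.

```python
import os
import tempfile

# write a small file into a temporary directory so there is something to read
folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'myfile.txt')

with open(filename, 'w') as f:      # 'w' creates (or overwrites) the file
    f.write('first line\nsecond line\n')

with open(filename) as f:           # the mode defaults to 'r'
    lines = [one_line.strip() for one_line in f]

print(lines)    # ['first line', 'second line']

# a missing file raises FileNotFoundError, which we can catch
try:
    open(os.path.join(folder, 'no_such_file.txt'))
except FileNotFoundError:
    found = False
else:
    found = True

print(found)    # False
```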