# Agenda: Day 3

- `str.split` and `str.join`
- Tuples
    - What are they?
    - Tuple unpacking
- Dictionaries
    - How are they different from strings/lists/tuples?
    - Defining dicts
    - Retrieving from dicts
    - Iterating over dicts
    - Using them in general
- Files
    - Reading from (text) files
    - A little bit about writing to files, as well

# `str.split` and `str.join`

If I have a string, and want to get a list from it, I can use the `str.split` method.

- This is a string method, meaning that we will typically invoke `.split` on a string
- Don't forget the `()` after `.split`; without them, the method isn't actually invoked
- If you put an argument in the `()`, then that is what will be used as a delimiter/separator
- If you don't put an argument in the `()`, then any whitespace characters (space, newline, carriage return, and a few others), in any combination and in any number, will be used as delimiters

In [1]:
s = 'abcd:ef:ghi'

s.split(':')  # str.split always returns a list of strings

['abcd', 'ef', 'ghi']

In [2]:
s.split('d')  # a little weird, but it'll work

['abc', ':ef:ghi']

In [4]:
# much of the time that we want to use str.split, we actually want to split on whitespace
# this is especially true when we get input from the user

words = input('Enter some text: ').split()   # any/all combination of whitespace is used to cut

words

Enter some text:  this      is          a            test


['this', 'is', 'a', 'test']
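One detail that's worth seeing side by side: invoking `str.split` with no argument is not the same as passing `' '` as the argument. Here's a small sketch, using a made-up string (`text`) rather than user input:

```python
text = 'this   is  a test'    # made-up sample, with runs of several spaces

no_arg = text.split()         # any run of whitespace counts as one separator
one_space = text.split(' ')   # every single space is a separator

print(no_arg)      # ['this', 'is', 'a', 'test']
print(one_space)   # ['this', '', '', 'is', '', 'a', 'test']
```

The empty strings in the second result come from the places where two spaces were next to one another. That's why, for user input, I almost always use `split()` without any argument.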

# The opposite of `str.split`: `str.join`

- We invoke `str.join` on a string, and pass it a list of strings
- If you pass `str.join` something whose elements aren't all strings, you'll get an error. (Any iterable of strings will work, such as a tuple, but a list is the most common.)
- Notice that `str.join` is a *string* method! Meaning, we invoke it on the string that we'll want to see between the elements of the argument

In [5]:
mylist = ['abcd', 'ef', 'ghij']

'*'.join(mylist)    # I'm invoking str.join on the '*', but I'm passing mylist

'abcd*ef*ghij'

In [6]:
' '.join(mylist)   # glue is ' ', and the list is mylist... we get back a single string based on mylist's elements, connected by the glue

'abcd ef ghij'

In [7]:
'\n'.join(mylist)

'abcd\nef\nghij'

In [8]:
print('\n'.join(mylist))

abcd
ef
ghij


# Where do we use these?

All over the place. 

Whenever we read from a file, database, network, or even the user, the odds are good that we'll have to break the data apart, and `str.split` is a standard, classic way to do that.

If we have a list of strings, then `str.join` is a great way to get one string back from them. When you need to build up a long string, it's generally more efficient to create an empty list, `append` pieces to it as the program runs, and then invoke `str.join` once on the result. This is better than using `+=` on a string many times, because strings are immutable, so each `+=` creates a new string.
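Here's a minimal sketch of that "append, then join" pattern. The data (`numbers`) and the variable names `pieces` and `report` are made up for the example:

```python
numbers = [10, 20, 30]       # hypothetical sample data

pieces = []                  # we accumulate strings here
for one_number in numbers:
    pieces.append(f'{one_number} squared is {one_number ** 2}')

report = '\n'.join(pieces)   # build the final string just once, at the end
print(report)
```

With only three elements you won't notice any difference in speed; the pattern matters more as the number of pieces grows.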

# `str.join` only works with lists of strings!

If you have a list of non-strings that you want to get a single string from, you cannot use `str.join`, at least not directly.

In [9]:
mylist = [10, 20, 30]

'*'.join(mylist)

TypeError: sequence item 0: expected str instance, int found
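One way around this, using only what we've seen so far, is to build a second list containing strings, and then join that. This is just a sketch; the name `as_strings` is my own:

```python
mylist = [10, 20, 30]

as_strings = []
for one_item in mylist:
    as_strings.append(str(one_item))   # str() gives us a string based on each integer

result = '*'.join(as_strings)          # now every element is a string, so join works
print(result)   # 10*20*30
```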

# Exercise: Vowels, digits, and others (list edition)

1. Define three empty lists -- `vowels`, `digits`, and `others`.
2. Ask the user to enter some text.
3. Go through that text, one character at a time:
    - If it's a vowel, append to `vowels`
    - If it's a digit, append to `digits`
    - If it's neither, append to `others`
4. When you're done, print each of the lists -- but first, `join` them together, to get a string, in which the characters are separated by commas and spaces.

Example:

      Enter text: hi! 123
      vowels: i
      digits: 1, 2, 3
      others: h, !,  

In [14]:
vowels = []
digits = []
others = []

text = input('Enter some text: ').strip()  # get input from the user, remove leading/trailing spaces, and assign the result to "text"

for one_character in text:
    if one_character in 'aeiou':         # if one_character is a vowel...
        vowels.append(one_character)     # ... append it to the end of "vowels"
    elif one_character.isdigit():        # if one_character is a digit...
        digits.append(one_character)     # ... append it to the end of "digits"
    else:
        others.append(one_character)     # otherwise, just append to the end of "others"

print(', '.join(vowels))  # ', ' is our "glue," and "vowels" contains the elements to glue together
print(', '.join(digits))
print(', '.join(others))

Enter some text:  hello out there! 12345


e, o, o, u, e, e
1, 2, 3, 4, 5
h, l, l,  , t,  , t, h, r, !,  


In [19]:
# this means: give me a string based on the list "others", putting
# | between every two elements.

'|'.join(others)

'h|l|l| |t| |t|h|r|!| '

In [21]:
sep = '*'

# the glue here isn't sep itself, but an f-string containing sep with a space on either side
f' {sep} '.join(others)

'h * l * l *   * t *   * t * h * r * ! *  '

# Functions vs. methods

Both functions and methods are the verbs in Python; they both tell the language to do something.

The difference is:

- Functions are floating around in Python's memory, unconnected to any particular data structure. You can tell that something is a function because there isn't any `.` before its name.
- By contrast, *methods* are functions that are attached to a particular type of data. We indicate this when we write their names as `str.join` and `list.append`. When we invoke them, we always have to use a `.`, and we often invoke the method on a particular instance of a data type, as in `'*'.join(mylist)` or `mylist.append('a')`.

Methods are far more common than functions in Python, because they help to keep our code organized. This way, we know exactly what methods are defined on a given type, and how to invoke them. Functions aren't explicitly and clearly connected to any type.

There are functions in Python, and they tend to be the most common and the simplest verbs. As time goes on, you'll see more and more methods.
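To make the distinction concrete, here's a short sketch that uses one function and two methods we've already seen. The values of `s` and `mylist` are made up:

```python
s = 'hello'
mylist = [10, 20, 30]

# functions: no dot before the name, and the data is passed as an argument
print(len(s))          # 5
print(len(mylist))     # 3

# methods: invoked on a value, with a dot between the value and the method name
print(s.upper())       # HELLO
mylist.append(40)
print(mylist)          # [10, 20, 30, 40]
```

Notice that `len` works on both a string and a list, whereas `upper` exists only on strings and `append` exists only on lists.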

In [23]:
# CC

vowels = []
digits = []
others = []

text = input("enter a text: ").strip()

for character in text:
  if character in "aeiou":
    vowels.append(character)
  elif character.isdigit():
    digits.append(character)
  else:
    others.append(character)

print(f"Vowels: {', '.join(vowels)}")   # single quotes inside the {}, so this also works before Python 3.12

enter a text:  hello out there! 12345


Vowels: e, o, o, u, e, e


# Tuples and unpacking

Python has three core "sequence" types:

- Strings, which contain characters and are immutable, meaning that we cannot change them
- Lists, which contain *anything* at all, and are mutable, meaning that we *can* change them
- Tuples, which contain *anything* (like lists) but are immutable (like strings)

It's very tempting to say that tuples are just immutable lists, or "locked lists." But really, in practice, there are two different uses for lists and tuples:

- Lists are meant to be for sequences of data in which every value has the same type
- Tuples are meant for sequences of data in which values have different types

How much will you use tuples? Probably not much, especially in your first year of using Python. But it's important to know what they are, and how they work.

#### To define a tuple

Use `()`, as in

```python
t = (10, 20, 30, 40, 50)
```

#### To retrieve from a tuple

Just use `[]` and an index or a slice (`start:end`), in the square brackets:

```python
t[2]
t[2:4]
```

You can also:
- Search in a tuple with `in`
- Iterate over a tuple with `for`
- Get the number of values in a tuple with `len`

In other words, tuples work just like lists in many ways. *BUT* if you try to change a tuple, for example by assigning to `t[0]`, you'll get an error, because tuples are immutable.
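Here's a small sketch of all of that, using the tuple from above. It uses `try`/`except`, which we haven't covered; here it just keeps the error from stopping the program, so that we can print it:

```python
t = (10, 20, 30, 40, 50)

print(t[2])        # 30
print(t[2:4])      # (30, 40)
print(30 in t)     # True
print(len(t))      # 5

try:
    t[0] = 999     # assigning to an element of a tuple is not allowed
except TypeError as e:
    message = str(e)
    print(f'TypeError: {message}')
```

After this runs, `t` is unchanged.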

# The most common way to use tuples

The most common way that newcomers to Python use tuples is in "tuple unpacking." The basic idea is:

- Have an iterable value (i.e., something that knows how to behave in a `for` loop) on the right of assignment
- Have a tuple of variables on the left of the assignment
- Make sure that the number of values on the right and the number of variables on the left matches

In [24]:
mylist = [10, 20, 30]

(x,y,z) = mylist    # tuple of variables on the left, iterable of values on the right, same number in both

In [25]:
# the result is that we've assigned each value to its parallel variable

x

10

In [26]:
y

20

In [27]:
z

30

In [28]:
# you actually don't need () when writing tuples! So... we can just write:

x,y,z = mylist

In [29]:
x

10

In [30]:
y

20

In [31]:
z

30
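What if the number of variables doesn't match the number of values? Python raises a `ValueError`. Here's a sketch; the `try`/`except` is only there so that we can print the error rather than have the program stop:

```python
mylist = [10, 20, 30]

try:
    x, y = mylist      # two variables on the left, three values on the right
except ValueError as e:
    error_message = str(e)
    print(f'ValueError: {error_message}')
```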

# Remember `enumerate`?

Last time, we saw that if we want to number the elements of a string (or any sequence) when we invoke a `for` loop, we could do so with `enumerate`:


In [34]:
# this works via tuple unpacking!
# with each iteration, enumerate('abcd') gives us a 2-element tuple, (index, character)
# we break that apart in the "for" loop with two variables -- a kind of internal unpacking

for index, one_item in enumerate('abcd'):
    print(f'{index}: {one_item}')

0: a
1: b
2: c
3: d
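If you want to see those 2-element tuples for yourself, you can ask `list` to collect everything that `enumerate` produces. This is just a sketch; the name `pairs` is my own:

```python
pairs = list(enumerate('abcd'))   # list() collects everything enumerate gives us
print(pairs)       # [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]

index, one_item = pairs[0]        # the same unpacking that the "for" loop does for us
print(index)       # 0
print(one_item)    # a
```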


In [35]:
t = (10, 'a', [1,2,3])
t

(10, 'a', [1, 2, 3])

In [36]:
t[0]

10

In [37]:
t[1]

'a'

In [38]:
t[2]

[1, 2, 3]

In [39]:
person = ('Reuven', 'Lerner', 54, 46)

first_name, last_name, age, shoe_size = person

In [40]:
first_name

'Reuven'

In [41]:
last_name

'Lerner'

In [42]:
age

54

In [43]:
shoe_size

46

In [44]:
# here, the tuple on the left contains variables, not values
# we're not storing a tuple anywhere! Rather, each variable gets the corresponding value from mylist
# it's totally OK for us to have a list, unpack its values into variables, and then use those variables

(x,y,z) = mylist 

# Exercise: First and last names

1. Ask the user, repeatedly, to enter their first + last names, separated by a space.
    - If the user enters an empty string, then use `break` to get out of the `while True` loop.
2. Assume that the user did enter just two names, separated by a space.
3. Use tuple unpacking to define `first_name` and `last_name`, based on the user's input
4. Greet the person using their first and last names (but not together).

Example:

    Enter your first + last names: Reuven Lerner
    Hello, Reuven!
    We have not had any Lerners here for a while. Welcome.


In [46]:
while True:
    name = input('Enter your first and last names: ').strip()

    if name == '':
        break

    if name.count(' ') != 1:   # this counts how many spaces there are in name...
        print('Enter both first and last names, separated by a space')
        continue

    first_name, last_name = name.split()  # str.split returns a list of strings

    print(f'Hey there, {first_name}!')
    print(f'Do a lot of people have the last name {last_name} where you come from?')

Enter your first and last names:  Reuven Lerner


Hey there, Reuven!
Do a lot of people have the last name Lerner where you come from?


Enter your first and last names:  hello


Enter both first and last names, separated by a space


Enter your first and last names:  hello out there


Enter both first and last names, separated by a space


Enter your first and last names:  a b


Hey there, a!
Do a lot of people have the last name b where you come from?


Enter your first and last names:  


# Next up

1. Dictionaries
    - What are they, and why do we need them?
    - Defining them
    - Retrieving from them
    - Searching in them

# Dictionaries

These are the most important data structure in Python!

- They are easy to use
- They are super fast
- They are extremely flexible

If you have any programming experience, then you might have heard of dicts under other names:

- Hash tables
- Hashes
- Hash maps
- Associative arrays
- Key-value stores
- Name-value stores

The basic idea is: We don't store individual values. Rather, we store pairs of values, a key (kind of like the index) and the value.

Another way to think about dicts is that they are like lists, except that we get to determine the key (i.e., the index), rather than having it defined for us based on the location in the list.

# Some rules for dicts

- Keys can be anything at all, so long as they are immutable. This typically means integers and strings, but tuples can also be dict keys, so long as they contain only immutable values.
- Values can be anything at all, with no exceptions or restrictions whatsoever.
- Keys must be unique in a dict. There is no repetition of keys.
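Here's a sketch of these rules in action. It uses the `{}` syntax for defining dicts, which is described in the next section, and `try`/`except` only so that we can print the error instead of stopping. All of the names and values are made up:

```python
# keys can be integers, strings, or tuples
d = {5: 'integer key', 'abc': 'string key', (10, 20): 'tuple key'}
print(d[(10, 20)])    # tuple key

# a list is mutable, so it cannot be a key
try:
    bad = {[10, 20]: 'list key'}
except TypeError as e:
    list_key_error = str(e)
    print(f'TypeError: {list_key_error}')

# repeating a key doesn't give us two entries; the last value wins
repeated = {'a': 1, 'a': 2}
print(repeated)       # {'a': 2}
```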

# Defining and using dicts

We define a dict with `{}`.

- You can have an empty dict, `{}`
- Every key-value pair is written as `key:value`, with a `:` between them
- The pairs are separated by `,`

Some examples:

```python
d = {'a':10, 'b':20, 'c':30}   # keys are 'a', 'b', and 'c', and their values are 10, 20, and 30
d = {'first':'Reuven', 'last':'Lerner'}   # keys are 'first' and 'last', and values are 'Reuven' and 'Lerner'
d = {1:'January', 2:'February', 3:'March'}  # keys are 1, 2, and 3; values are 'January', 'February', and 'March'
```




In [47]:
d = {'a':10, 'b':20, 'c':30}

type(d)  # what kind of data is stored in d?

dict

In [48]:
# how can I retrieve from a dict?
# I put the key in `[]`, just like a string/list/tuple -- but now we can use non-integers

d['a']

10

In [49]:
# what if I ask for a key that doesn't exist?

d['x']

KeyError: 'x'

In [50]:
# I can search in the keys with the "in" operator

if 'x' in d:        # if 'x' is a key in d...
    print(d['x'])   #   ... then print d['x']
else:
    print(f'No such key "x" in d')

No such key "x" in d


# Why do we care about dicts so much?

There are many, many programming problems that can be solved using a dictionary:

- Month names to month numbers
- Month numbers to month names
- ID numbers to database records
- Usernames to ID numbers
- Usernames to passwords
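As a quick sketch, here are the first two ideas from that list, with only three months to keep it short. The names `month_numbers` and `month_names` are my own:

```python
month_numbers = {'January': 1, 'February': 2, 'March': 3}   # names to numbers
month_names = {1: 'January', 2: 'February', 3: 'March'}     # numbers to names

print(month_numbers['February'])   # 2
print(month_names[3])              # March
```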

# Using variables with dicts

If you want, you can put a variable in the `[]` instead of writing the key literally. Python uses the variable's value as the key. This is very useful (and common)!

In [51]:
d = {'a':10, 'b':20, 'c':30}

k = 'b'

d[k]  # what will be returned?

20

# Exercise: Restaurant

1. Define `total` to be 0.
2. Define a dict, `menu`, in which the keys are restaurant menu items (strings) and the values are the prices for those items (integers).
3. Ask the user, repeatedly, to order something on the menu. (Don't show them the menu!)
    - If the user enters an empty string, stop asking
    - If the user enters something that *is* on the menu, print the item and its price, and the new total.
    - If the user enters something that is *not* on the menu, scold them and let them try again
4. In the end, print the final total.

Example:

    Order: sandwich
    sandwich is 10, total is 10
    Order: tea
    tea is 5, total is 15
    Order: elephant
    Sorry, we're fresh out of elephant today!
    Order: [ENTER]
    Total is 15

In [52]:
total = 0

menu = { 'sandwich':10,   'tea':5,    'apple':3,     'cake':7   }
menu

{'sandwich': 10, 'tea': 5, 'apple': 3, 'cake': 7}

In [53]:
len(menu)

4

In [54]:
menu = {'sandwich': 10, 'tea': 5, 'apple': 3, 'cake': 7}

while True:   # infinite loop! We must have a "break" for it to end sometime
    order = input('Order: ').strip()

    if order == '':   # empty string?
        break

    elif order in menu:   # is the user's input string a key in the "menu" dict?
        price = menu[order]  # get the value for the "order" key from the "menu" dict, and assign to "price"
        total += price       # add the price to the total
        print(f'{order} is {price}; total is now {total}')

    else:
        print(f'We are out of {order} today!')

print(f'Total = {total}')        

Order:  sandwich


sandwich is 10; total is now 10


Order:  apple


apple is 3; total is now 13


Order:  tea


tea is 5; total is now 18


Order:  tea


tea is 5; total is now 23


Order:  asdfafafafafaf


We are out of asdfafafafafaf today!


Order:  


Total = 23


In [57]:
# IS

total = 0
menu = {'cola':12, 'orange': 10, 'wine': 44}
while True:
    item = input('order from the menu').strip()
    if item == '':
        break
    elif item in menu:
        price = menu[item]
        print(f'{item} costs {price}')
    else:
        print('done')

order from the menu cola


cola costs 12


order from the menu wine


wine costs 44


order from the menu asdfsafafaasd


done


order from the menu 


# Are dictionaries mutable?

Meaning: Can we change them? Yes!

There are several ways to change a mutable data structure:

- Modify an existing value
- Add new elements
- Remove existing elements

We can do all of these with a dict.

In [58]:
# how do we modify an existing value?
# we just assign to it!

d = {'a':10, 'b':20, 'c':30}

d['a'] = 20   # this replaces the existing value associated with the key 'a'
d

{'a': 20, 'b': 20, 'c': 30}

In [59]:
# I can also say:

d['a'] += 5    # this changes the value to be 5 more than before
d

{'a': 25, 'b': 20, 'c': 30}

In [60]:
# how can I add new key-value pairs to a dict?
# with a list, I have to use the .append method
# with dicts... it's not that complicated -- I just assign!

d['x'] = 123   # this adds a key-value pair if it's new, or updates the value if it isn't
d

{'a': 25, 'b': 20, 'c': 30, 'x': 123}

In [61]:
# how can I remove existing key-value pairs?
# it's actually kind of rare, in my experience, to do this, but it does work
# use the dict.pop method -- give it a key, and it'll return the value, removing the pair

d.pop('x')   # removes the key-value pair with 'x' as a key
d

{'a': 25, 'b': 20, 'c': 30}

In [62]:
# if we want to change a key...
# remember that keys must be IMMUTABLE
# which means that they cannot ever be changed
# because if they could, then they wouldn't be legal dict keys

d = {'abcd':10, 'efghi':20, 'jklmno':30}

In [63]:
# instead, I could remove an existing pair and add a new key with the old value

d['ABCD'] = d.pop('abcd')
d

{'efghi': 20, 'jklmno': 30, 'ABCD': 10}

In [64]:
# U1

menu = {'hamburger':10, 'fries':5, 'pizza':15, 'salad':10}

menu['soda'] = 3
menu.pop('soda')

3

In [65]:
# SS : Can we join two dicts?

d1 = {'a':10, 'b':20, 'c':30}
d2 = {'c':300, 'd':400, 'e':500}

# in Python 3.9 and later, we can use the | symbol to get a new dict, the union of the previous two
# if the same key exists in both, the right side gets priority

d1 | d2

{'a': 10, 'b': 20, 'c': 300, 'd': 400, 'e': 500}
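The `|` operator on dicts requires Python 3.9 or later. If you're stuck with an older version, one alternative is to copy the first dict and then invoke `dict.update` on the copy. This is a sketch; the name `combined` is my own:

```python
d1 = {'a':10, 'b':20, 'c':30}
d2 = {'c':300, 'd':400, 'e':500}

combined = d1.copy()     # start with a copy, so that d1 itself isn't changed
combined.update(d2)      # add d2's pairs; d2 wins where the keys overlap

print(combined)   # {'a': 10, 'b': 20, 'c': 300, 'd': 400, 'e': 500}
print(d1)         # {'a': 10, 'b': 20, 'c': 30}
```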

# How are dicts implemented?

Dictionaries are super fast at finding or retrieving keys, and also (by extension) retrieving values based on keys.

How do they do it?

When we add a new value to a list, it goes at the end. This means that if we want to know whether a value is in a list, we have to run a `for` loop over the list, from the start. We might get to the end of the list, and find that the value wasn't there. Or it might have been there, just earlier.

So searching for an item in a list takes longer, the longer the list is. 

By contrast, Python stores key-value pairs in a dict based on something known as the "hash function." This function takes a key and returns an integer -- one that looks random, and which has a huge range, but which isn't random: the same key always gives the same result during a given run of Python. For strings, though, it's effectively impossible for a human to predict the outcome of the hash function.

When we store a key-value pair, Python runs `hash(key)` and uses the result to choose a location in the dict's internal table. The key-value pair is stored in that location. When you say `key in d` or `d[key]`, Python runs `hash(key)` again, jumps to that location, and retrieves the value. It doesn't matter how many key-value pairs are in the dict; it still runs and retrieves super fast. No `for` loops are needed.
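You can invoke `hash` yourself to get a feel for this. The following is a toy sketch, not how Python's dicts are really laid out: I'm assuming a made-up table with 8 slots (`number_of_slots`) and picking a slot from the remainder after dividing the hash by 8. I use integer keys because, in CPython, small integers hash to themselves, which makes the arithmetic easy to follow; strings give the random-looking numbers, and those change from one run of Python to the next.

```python
print(hash(10))                      # 10 -- in CPython, small integers hash to themselves
print(hash('abc') == hash('abc'))    # True -- same key, same hash, within one run of Python

# toy model: pretend our dict has 8 slots, and choose a slot from the hash
number_of_slots = 8                  # assumption, for illustration only

for key in [10, 25, 1000]:
    slot = hash(key) % number_of_slots
    print(f'key {key} -> slot {slot}')   # slots 2, 1, and 0, respectively
```

The point is that finding the slot takes one calculation, no matter how many keys have been stored.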



# Next up

- Using dicts as accumulators for known keys
- Using dicts as accumulators for unknown keys
- Using dicts in `for` loops

# Dict paradigm #2: Accumulate known keys

In this paradigm for using dicts:

- Define a dictionary with all of the keys that we'll use. We won't ever add/remove keys.
- The values for these keys are all starting points -- 0 or `[]` or the like
- Over the course of the program's run, we'll update the values, but never add/remove keys



# Example: Odds and evens



In [66]:
counts = {'odds':0, 'evens':0}

numbers = [10, 15, 20, 23, 28]

for one_number in numbers:
    if one_number % 2 == 0:   # if the number is even -- because %2 means "give me the remainder after dividing by 2"
        counts['evens'] += 1
    else:
        counts['odds'] += 1

print(counts)        

{'odds': 2, 'evens': 3}


In [67]:
# slightly different, using lists for the values rather than 0

counts = {'odds':[], 'evens':[]}

numbers = [10, 15, 20, 23, 28]

for one_number in numbers:
    if one_number % 2 == 0:   # if the number is even -- because %2 means "give me the remainder after dividing by 2"
        counts['evens'].append(one_number) 
    else:
        counts['odds'].append(one_number)

print(counts)        

{'odds': [15, 23], 'evens': [10, 20, 28]}


# Exercise: Vowels, digits, and others -- dict edition!

1. Define a dict `counts` in which the keys are `vowels`, `digits`, and `others`. The values should all be 0.
2. Ask the user to enter text.
3. Go through that text, one character at a time.
    - If it's a vowel, add 1 to the `vowels` key in `counts`
    - If it's a digit, add 1 to the `digits` key in `counts`
    - In other cases, add 1 to the `others` key in `counts`
4. Print `counts`

In [68]:
counts = {'vowels':0,
          'digits':0,
          'others':0}

text = input('Enter text: ').strip()

for one_character in text:
    if one_character in 'aeiou':     # if the character is a vowel...
        counts['vowels'] += 1        # ... add 1 to the value for 'vowels'
    elif one_character.isdigit():    # if the character is a digit...
        counts['digits'] += 1        # ... add 1 to the value for 'digits'
    else:
        counts['others'] += 1        # otherwise, add 1 to the count of 'others'

print(counts)        

Enter text:  hello!! 123


{'vowels': 2, 'digits': 3, 'others': 6}


# Iterating over dicts

We've seen that we can use a `for` loop on a bunch of different data structures:

- Looping on a string gives us its characters
- Looping on a list (or tuple) gives us its elements

Can we loop on a dict? What will we get?

In [70]:
d = {'a':10, 'b':20, 'c':30}
d

{'a': 10, 'b': 20, 'c': 30}

In [72]:
# iterating over a dict gives you its keys

for one_item in d:
    print(one_item)

a
b
c


In [73]:
# one way to get the values is by retrieving them

for one_key in d:
    print(f'{one_key}: {d[one_key]}')

a: 10
b: 20
c: 30


In [74]:
# My preferred way is to use the dict.items method
# this returns a 2-element tuple (key, value) with each iteration

# here's how this looks:

for key, value in d.items():  # we know that we'll get (key, value) with each iteration
    print(f'{key}: {value}')

a: 10
b: 20
c: 30


# Don't use `dict.keys`!

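You might see people talk about the `dict.keys` method, which returns a dict's keys. I see *no reason* to use this method. You should be invoking `in` on your dict to search, and `for` on your dict to iterate. If you use `dict.keys`, then you will get the right answer! But it adds nothing: in Python 3, searching and iterating on `d.keys()` works the same as on `d` itself, just with more typing. (In Python 2, `dict.keys` returned a list, and searching in it was much, much slower.)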
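Here's a short sketch comparing the two spellings. Both give the same answers, which is why I prefer the shorter one:

```python
d = {'a':10, 'b':20, 'c':30}

print('a' in d)           # True
print('a' in d.keys())    # True -- same answer, more typing

for one_key in d:         # iterating over d already gives us the keys
    print(one_key)
```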

In [75]:
counts = {'vowels':0,
          'digits':0,
          'others':0}

text = input('Enter text: ').strip()

for one_character in text:
    if one_character in 'aeiou':     # if the character is a vowel...
        counts['vowels'] += 1        # ... add 1 to the value for 'vowels'
    elif one_character.isdigit():    # if the character is a digit...
        counts['digits'] += 1        # ... add 1 to the value for 'digits'
    else:
        counts['others'] += 1        # otherwise, add 1 to the count of 'others'

for key, value in counts.items():
    print(f'{key}: {value}')

Enter text:  hello!! 123


vowels: 2
digits: 3
others: 6


In [76]:
# if you want to search in the values of a dict, you can
# with the dict.values method, which returns the values. You can use "in" to search on there.
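A minimal sketch of searching in the values, using the same small dict as before:

```python
d = {'a':10, 'b':20, 'c':30}

print(20 in d.values())    # True
print(99 in d.values())    # False
print(20 in d)             # False -- "in" on the dict itself searches only in the keys
```

Keep in mind that searching in the values doesn't use the hash function; Python has to go through the values one at a time, so it's slower than searching in the keys.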

# Dict para