# Agenda

1. Recap + Q&A + exercise
2. Dictionaries
    - What are they?
    - Creating dicts
    - Retrieving from and updating dicts
    - How do they work?
    - The three different paradigms of dict usage
3. Files (text files)
    - Reading from files
    - Looping over files (and why we can/should do that)
    - Writing to files
4. Set up Python + PyCharm on your computer


# Recap from yesterday

- Loops
    - `for` loops over strings -- we get one character at a time
    - `for` loops over numbers -- we use the `range` function
    - `for` loop with indexes -- using the `enumerate` function
    - `while` loops -- like an `if` that repeats running its block until the condition is `False`
    - `break` -- stops the running of a loop immediately
    - `continue` -- stops the running of the current iteration, but goes to the next one in the loop
    - If we use `while True` for a loop, then we can use `break` to get out when we get certain input from the user, or hit a particular condition.
- Lists
    - Mutable
        - We can modify the contents of a list via assignment
        - We can add items to the end of a list with `list.append`
        - We can remove items from the end of a list with `list.pop`
    - Ordered
        - Just as with strings, we can retrieve with a numeric index or get multiple items with a slice
        - We can iterate over the elements of a list, starting with index 0 and going through the end
    - Containers -- a list can contain any number of any other objects of any type -- including a list of lists!
    - Many (most) of the things that work on strings also work on lists, because both are in the "sequence" family in Python
        - Indexes
        - Slices
        - `in` for searching
        - `for` loops
- Turning strings into lists, and vice versa
    - We can get a list of strings based on a string with the `str.split` method
        - If we give a delimiter as an argument, then that is used to break apart the original string
        - If we don't pass any argument to `str.split`, then any whitespace of any length and any combination is used as the delimiter
        - When we use `str.split`, the original string isn't changed. We get back a new list of strings based on it.
    - We can take a list of strings, and produce one new string based on it, using `str.join`
        - We invoke the method on a string, the "glue" that'll go between elements of the list
        - We get back a new string, without modifying the list that we passed to `str.join`
- Tuples
    - You can think of tuples as immutable lists, even though the Python world thinks about them as structs/records, containing different types of values
    - Most of the things that work on strings and lists also work on tuples
    - Python uses a lot of tuples behind the scenes, but how much you'll want to use them is up to you.
- Tuple unpacking
    - If you have an iterable on the right side of assignment, and a tuple of variables on the left side of assignment, the values are assigned in parallel to the variables
    - This means that you can retrieve/extract elements of a sequence into variables pretty easily
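
As a quick refresher, here is a minimal sketch that ties together `enumerate`, `for` loops, and tuple unpacking. The `colors` list and the other variable names are made up for illustration.

```python
# hypothetical sample data, just for this recap
colors = ['red', 'green', 'blue']

# enumerate gives us (index, value) pairs; unpacking assigns both at once
labeled = []
for index, one_color in enumerate(colors):
    labeled.append(f'{index}: {one_color}')

print(labeled)

# unpacking works with any iterable on the right side,
# as long as the number of variables matches the number of elements
first, second, third = colors
print(first, second, third)
```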

# Splitting and joining

If I have a string, and I want to treat it as a bunch of fields (in a record) or words (in a sentence), then I can use `str.split` to get back a list of strings based on that string.

In [1]:
# CSV -- comma-separated values

s = 'Reuven,Lerner,reuven@lerner.co.il,46'

s.split(',')    # this returns a new list of strings based on s -- we'll get 4 elements

['Reuven', 'Lerner', 'reuven@lerner.co.il', '46']

In [2]:
fields = s.split(',')  # now the list is assigned to the "fields" variable

fields[0]

'Reuven'

In [4]:
fields[1]

'Lerner'

In [5]:
# I can even, using unpacking, say:

first_name, last_name, email, shoe_size = s.split(',')

In [7]:
# split is always about taking a string and breaking it into pieces, using
# some small string as a delimiter

# You can use any character or string as a delimiter
# here's a line from the Unix /etc/passwd file, containing user info:

s = '_postfix:*:27:27:Postfix Mail Server:/var/spool/postfix:/usr/bin/false'

# the fields are separated with : characters

In [8]:
s.split(':')

['_postfix',
 '*',
 '27',
 '27',
 'Postfix Mail Server',
 '/var/spool/postfix',
 '/usr/bin/false']

In [9]:
# what if I split on something that isn't there?
s.split('~')

['_postfix:*:27:27:Postfix Mail Server:/var/spool/postfix:/usr/bin/false']

In [11]:
# the most common use of split is on user-entered data
# when we want to break a string apart into words, we can split without mentioning the delimiter
# in such a case, any/all whitespace is used

s = 'This    is a bunch of    words for my Python course'

s.split()  # split on nothing == split on whitespace

['This', 'is', 'a', 'bunch', 'of', 'words', 'for', 'my', 'Python', 'course']
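
To see why leaving out the delimiter matters, here is a small comparison between `s.split()` and `s.split(' ')`. The shortened string is only for illustration.

```python
s = 'This    is a bunch'

no_arg = s.split()         # any run of whitespace counts as one delimiter
one_space = s.split(' ')   # every single space character is a delimiter

print(no_arg)      # ['This', 'is', 'a', 'bunch']
print(one_space)   # ['This', '', '', '', 'is', 'a', 'bunch']
```

With an explicit `' '`, the four spaces in a row produce three empty strings, which is rarely what we want when breaking user input into words.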

In [12]:
# Joining is a bit trickier.  We need two pieces:
# 1. The "glue" string, typically one character, that'll go between the list elements
# 2. A list of strings that'll be joined together

words = s.split()   # now I have a list of strings!
words

['This', 'is', 'a', 'bunch', 'of', 'words', 'for', 'my', 'Python', 'course']

In [13]:
# I want to get a new string back
# based on words
# with spaces between the words 

' '.join(words)

'This is a bunch of words for my Python course'

In [14]:
# what if I want two spaces between each word?

'  '.join(words)

'This  is  a  bunch  of  words  for  my  Python  course'

In [15]:
# what if I want underscores and asterisks between words?

'*_*'.join(words)

'This*_*is*_*a*_*bunch*_*of*_*words*_*for*_*my*_*Python*_*course'

In [16]:
# notice that the glue goes between elements, not at the start and finish
# also, remember -- the original list isn't changed in the slightest
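
Both of those points can be checked with a short sketch. The sample list and the `'-'` glue are arbitrary choices for illustration.

```python
words = ['This', 'is', 'a', 'test']   # sample list, just for this check

joined = '-'.join(words)

print(joined)   # This-is-a-test  (no glue before 'This' or after 'test')
print(words)    # ['This', 'is', 'a', 'test']  (the list is unchanged)

# splitting on the same glue gets us back an equal list
print(joined.split('-') == words)   # True
```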

# Exercise: Higher and lower

1. Define two empty lists, `higher` and `lower`.
2. Ask the user to enter an integer, which we'll call `threshold`.
3. Repeatedly ask the user to enter a string with numbers separated by whitespace.
    - If the user enters an empty string, stop asking
4. Go through each "word" in the string, one at a time.
    - If it's not a number, then scold the user and go on to the next word
    - If it's a number and lower than the threshold, append it to `lower`.
    - If it's a number and higher than (or equal to) the threshold, append it to `higher`.
5. At the end print the elements of both `higher` and `lower`.

Example:

    Enter a threshold: 10
    Enter numbers: 5 15 20 30 7
    Enter numbers: 2 10 hello
    hello is not a number
    Enter numbers: [ENTER]
    higher: [15, 20, 30, 10]
    lower: [5, 7, 2]

What are the things to keep in mind here?
- `while` loop that goes forever
- Ask the user for input, and check for an empty string -- if it's empty, then `break`
- Split the user's input string into a bunch of words
- Iterate over those words, one at a time
- If the word cannot be turned into an integer, continue on to the next word
- If the word *can* be turned into an integer, then check whether it's higher/lower than the threshold
- Append to the appropriate list
- At the end of everything, print both `higher` and `lower`.

In [22]:
# setup
higher = []
lower = []

# calculations
s = input('Enter threshold: ').strip()
threshold = int(s)   # here, we assume we got a number

while True:
    s = input('Enter numbers: ').strip()

    # if the user gave us an empty string, then break out of this loop
    if s == '':
        break

    # if I'm here, then I know that the string is non-empty
    # break it apart into individual words/numbers, and go through each one to see if it's
    # higher or lower
    for one_word in s.split():

        if one_word.isdigit():
            n = int(one_word)    # get an int from one_word, and assign to n
            if n < threshold:
                lower.append(n)
            else:
                higher.append(n)

        else:   # not numeric? scold the user!
            print(f'{one_word} is not numeric; ignoring')

# report
print(f'higher = {higher}')
print(f'lower = {lower}')

Enter threshold:  10
Enter numbers:  2 3 4 5 hello 80 90 100


hello is not numeric; ignoring


Enter numbers:  20 30 50 2 3
Enter numbers:  


higher = [80, 90, 100, 20, 30, 50]
lower = [2, 3, 4, 5, 2, 3]
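
One thing to be aware of in this solution: `str.isdigit` returns `True` only if every character is a digit, so a word with a minus sign, such as `-3`, is treated as non-numeric. The sketch below runs the same inner loop on a fixed string instead of calling `input`. The `line` and `rejected` variables are additions for illustration and are not part of the solution above.

```python
threshold = 10
line = '5 15 -3 hello 10'   # stands in for what the user would type

higher = []
lower = []
rejected = []               # illustration only: collect the words we scold about

for one_word in line.split():
    if one_word.isdigit():
        n = int(one_word)
        if n < threshold:
            lower.append(n)
        else:
            higher.append(n)
    else:
        rejected.append(one_word)

print(f'higher = {higher}')      # higher = [15, 10]
print(f'lower = {lower}')        # lower = [5]
print(f'rejected = {rejected}')  # rejected = ['-3', 'hello']
```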


# Solution in the Python Tutor:

https://pythontutor.com/render.html#code=%23%20setup%0Ahigher%20%3D%20%5B%5D%0Alower%20%3D%20%5B%5D%0A%0A%23%20calculations%0As%20%3D%20input%28'Enter%20threshold%3A%20'%29.strip%28%29%0Athreshold%20%3D%20int%28s%29%20%20%20%23%20here,%20we%20assume%20we%20got%20a%20number%0A%0Awhile%20True%3A%0A%20%20%20%20s%20%3D%20input%28'Enter%20numbers%3A%20'%29.strip%28%29%0A%0A%20%20%20%20%23%20if%20the%20user%20gave%20us%20an%20empty%20string,%20then%20break%20out%20of%20this%20loop%0A%20%20%20%20if%20s%20%3D%3D%20''%3A%0A%20%20%20%20%20%20%20%20break%0A%0A%20%20%20%20%23%20if%20I'm%20here,%20then%20I%20know%20that%20the%20string%20is%20non-empty%0A%20%20%20%20%23%20break%20it%20apart%20into%20individual%20words/numbers,%20and%20go%20through%20each%20one%20to%20see%20if%20it's%0A%20%20%20%20%23%20higher%20or%20lower%0A%20%20%20%20for%20one_word%20in%20s.split%28%29%3A%0A%0A%20%20%20%20%20%20%20%20if%20one_word.isdigit%28%29%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20n%20%3D%20int%28one_word%29%20%20%20%20%23%20get%20an%20int%20from%20one_word,%20and%20assign%20to%20n%0A%20%20%20%20%20%20%20%20%20%20%20%20if%20n%20%3C%20threshold%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20lower.append%28n%29%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%0A%20%20%20%20%20%20%20%20%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20higher.append%28n%29%0A%0A%20%20%20%20%20%20%20%20else%3A%20%20%20%23%20not%20numeric%3F%20scold%20the%20user!%0A%20%20%20%20%20%20%20%20%20%20%20%20print%28f'%7Bone_word%7D%20is%20not%20numeric%3B%20ignoring'%29%0A%0A%23%20report%0Aprint%28f'higher%20%3D%20%7Bhigher%7D'%29%0Aprint%28f'lower%20%3D%20%7Blower%7D'%29&cumulative=false&curInstr=41&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%2210%22,%222%203%2020%2030%20hello%2050%22,%22%22%5D&textReferences=false

# Dictionaries (aka "dicts")

Dictionaries are not unique to Python! They exist in other languages, too, often with names like:

- Hash table
- Hash map
- Hash
- Map
- Key-value store
- Name-value store
- Associative array

All of these things describe the same sort of data structure.

A dictionary is something like a list. In a list, though, the indexes are determined by position and are always integers: if there are 5 elements in a list, then they have the indexes 0-4.

In a dict, we choose the indexes as well as the values. The keys (which is what we call the indexes) can be of any **immutable** type, which in practice usually means integers and strings. (Strictly speaking, Python requires that keys be *hashable*, which the built-in immutable types are.)
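
Here is a small sketch of which types work as keys. The dict contents are invented for illustration, and the `try`/`except` is there only so that the example keeps running after the error.

```python
# ints, strings, and tuples are all immutable, so they can be keys
d = {10: 'an int key',
     'ten': 'a str key',
     (1, 0): 'a tuple key'}

print(d[(1, 0)])   # a tuple key

# a list is mutable, so Python refuses to use it as a key
try:
    bad = {[1, 0]: 'a list key'}
    list_key_worked = True
except TypeError as e:
    list_key_worked = False
    print(f'TypeError: {e}')   # TypeError: unhashable type: 'list'
```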

This means that our code can be much clearer with a dict, because we aren't using numeric indexes. Rather, we're able to use something closer to our own language.
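
To make that concrete, here is the same record stored both ways. The key names in `person_dict` are my own choices for illustration.

```python
# as a list, we have to remember which position holds which field
person_list = ['Reuven', 'Lerner', 'reuven@lerner.co.il', 46]

# as a dict, each value is labeled with a key of our choosing
person_dict = {'first_name': 'Reuven',
               'last_name': 'Lerner',
               'email': 'reuven@lerner.co.il',
               'shoe_size': 46}

print(person_list[2])         # reuven@lerner.co.il
print(person_dict['email'])   # reuven@lerner.co.il
```

Both lines retrieve the same value, but only the second one tells the reader what is being retrieved.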

# Defining a dict

We define a dictionary with `{}`

- Each key-value pair has a colon between the key and the value
- The pairs are separated by commas
- Every key has a value, every value has a key
- Keys are guaranteed to be unique! If you repeat a key, the last one wins
- Values don't need to be unique
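
A quick sketch of the last two points, using a throwaway dict called `repeated` (a name chosen just for this example):

```python
# 'a' appears twice as a key; 200 appears twice as a value
repeated = {'a': 100, 'b': 200, 'a': 300, 'c': 200}

print(repeated)        # {'a': 300, 'b': 200, 'c': 200}
print(len(repeated))   # 3
```

The second `'a'` replaced the value of the first one, so the dict has three pairs, while the duplicate value `200` is stored under both `'b'` and `'c'` without any problem.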

In [23]:
# defining a simple dict

d = {'a':100, 'b':200, 'c':300}

type(d)

dict

In [24]:
# when we talk about dictionaries, we always talk about them in terms of pairs,
# not individual keys or values.

len(d)   # how many key-value pairs are in d?

3

In [25]:
# what if I want to retrieve from a dict?
# I use []

d['a']   # put the key in the square brackets

100

In [26]:
d['x']  # if it doesn't exist...

KeyError: 'x'
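
One way to avoid the `KeyError` is to check for the key with `in` before retrieving; on a dict, `in` looks only at the keys. This is a minimal sketch, and the `messages` list is there only to collect the results for display.

```python
d = {'a':100, 'b':200, 'c':300}

messages = []
for k in ['a', 'x']:
    if k in d:                 # True only if k is one of the keys
        messages.append(f'{k}: {d[k]}')
    else:
        messages.append(f'{k} is not a key in d')

for one_message in messages:
    print(one_message)
```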

In [27]:
# can I use a variable? yes

k = 'b'
d[k]  # should return d['b'], which is 200

200

In [None]:
# Here, we see that we can retrieve a value based on the key
# can we retrieve a key based on the value?

# no. Dicts are one-way streets. You always use the key to do things; the value
# is dragged