# Introduction to Programming with Python
## Day 4 Notebook 
## Booleans, Control Flow, Reading Text Files
## Fall 2021 (c) Jeff Parker 

In [2]:
with open('sample.txt', 'r') as my_file:
    for line in my_file:
        print(line)

Sample.txt



Sample text file to read





# Topics

- If statement
- Boolean conditions 
- equals and is
- or and and
- not and in
- Reading and writing text files

# Lessons from Homework

- Functions can simplify your work
- Factor out the parts you do more than once
- Functions are our unit of design 
- Do one thing to one type of thing

## Good Function Names: Verb, VerbNoun or IsAdjective
```python
    printGreeting()
    strip()
    lower()
    isalpha()
```

# Some elements of Code Style

From Python Enhancement Proposal 8 (PEP 8)

https://www.python.org/dev/peps/pep-0008/

### Doc Strings for functions

```python
def block(num_cols: int) -> str:
    """Return a string one block high."""
    ...
```
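A small sketch of what a docstring buys you: Python keeps the string and hands it back through `help()` and the `__doc__` attribute. The body of `block` and the second function `greeting` are assumptions made up for this illustration; the notebook leaves the body out.

```python
def block(num_cols: int) -> str:
    """Return a string one block high."""
    # Assumed body, for illustration only: one row of '#' characters
    return '#' * num_cols


def greeting(name: str) -> str:
    """Return a greeting for name."""
    return f"Hello, {name}!"


# The docstring travels with the function
print(block.__doc__)
print(greeting.__doc__)
```

Note the two blank lines that separate the top-level functions.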
### Two blank lines between functions

### Keep lines short: 79 characters max
Trey Hunner discusses this: https://treyhunner.com/2017/07/craft-your-python-like-poetry/

### ... *many more ...*

# Control Flow
## We may want to do this or to do that: 
- Add a number or forget it
- Finish a calculation or keep going
- Draw a line or draw a circle

## To decide what to do, we test a Boolean Condition

### Boolean Conditions are named after George Boole

Since they are named after someone, the term should be capitalized: 'Boolean expression', not 'boolean expression'

Alas, the Python type hint is bool, not Boole

## Predict what this will print

In [None]:
def example(name):
    if len(name) > 0:
        return f"Hello, {name}!"

In [None]:
print(example('Sam'))

## Predict what this will print

In [None]:
print(example(''))

## Predict what this will print

In [None]:
def example_two(name):
    if name:
        return f"Hello, {name}!"
    
print(example_two('Sam'))

In [None]:
print(example_two(''))

### '' isn't False, but it is Falsy

https://www.freecodecamp.org/news/truthy-and-falsy-values-in-python/

Things that aren't Falsy are Truthy.  

*No relation to Colbert's Truthiness*
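One way to explore this is to ask `bool()` how Python treats a value. The list of sample values below is chosen just for this sketch.

```python
# bool() reports whether Python treats a value as Truthy or Falsy
values = ['', 'Sam', 0, 7, [], ['cat'], None]

for value in values:
    print(f"{value!r:10} -> {bool(value)}")
```

Empty strings, zero, empty lists, and `None` are Falsy; the other values here are Truthy.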

# Boolean Tests

In [None]:
# Boolean 
4 == 4

In [None]:
4 == 'cat'

# Boolean tests: ==

In [None]:
s1 = 'cat'
s2 = 'cat'

s1 == s2

In [None]:
L1 = ['cat']
L2 = ['cat']

L1 == L2

# Boolean tests: is

In [None]:
s1 = 'cat'
s2 = 'cat'

s1 is s2

In [None]:
L1 = ['cat']
L2 = ['cat']

L1 is L2

## The strings are the same
## The lists are different
### Look at their addresses in memory

In [None]:
help(id)

```python
id(object)
    Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.

    CPython implementation detail: This is the address of the object in memory.
```

In [None]:
## Addresses of the two strings

print(id(s1))
print(id(s2))

In [None]:
## Addresses of the two lists

print(id(L1))
print(id(L2))

## Why the difference?

In [None]:
print('L1', L1)
print('L2', L2)

In [None]:
## Modify one list
L2.pop()

print('L1', L1)
print('L2', L2)

## When I paint L2 black, L1 doesn't change color

## L2 has changed, but is still in the same location

In [None]:
print(id(L1))
print(id(L2))

# My sons Max und Moritz are identical twins 

They look the same, but they live in different apartments 

The lists are different, but they hold the same values
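A third case is worth a look: giving a list a second name does not make a second list. In this sketch `L3` is a name introduced for illustration.

```python
L1 = ['cat']
L2 = ['cat']     # a second list with equal contents
L3 = L1          # a second name for the first list

print(L1 == L2, L1 is L2)   # equal values, different objects
print(L1 == L3, L1 is L3)   # equal values, same object

L3.append('dog')            # changing L3 changes L1 too
print('L1', L1)
print('L2', L2)
```

`L2` is the twin in another apartment; `L3` is a nickname for `L1`.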

# Twin cats

In [None]:
s1 = 'cat'
s2 = 'cat'

print(id(s1))
print(id(s2))

In [None]:
s1 is s2

## Why is there only one copy of the string?

Since strings are immutable, Python can safely reuse one copy of 'cat' 

# Franken-cat

In [None]:
## What if we ask Igor for some spare parts and we assemble a cat?
ch1 = 'c'
ch2 = 'a'
ch3 = 't'

frankenCat = ch1 + ch2 + ch3
print(frankenCat)

## Predict what will happen

In [None]:
s1 == frankenCat

In [None]:
s1 is frankenCat

In [None]:
print(id(s1))
print(id(frankenCat))
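Whether two equal strings share one object is an implementation detail, so `is` is not a reliable way to compare string values. For the curious, `sys.intern()` from the standard library asks Python for the single shared copy. A minimal sketch, where `tamedCat` is a name made up for the example:

```python
import sys

ch1, ch2, ch3 = 'c', 'a', 't'
frankenCat = ch1 + ch2 + ch3

s1 = sys.intern('cat')
tamedCat = sys.intern(frankenCat)   # ask for the shared copy of this value

print(s1 == tamedCat)   # compares values
print(s1 is tamedCat)   # compares identity; interned strings are shared
```

In everyday code, use `==` for strings and keep `is` for identity questions such as `x is None`.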

# Boolean condition 'in'

### *Is this object in that collection?*

## Strings support 'in'

We use 'in' in two ways: as part of a for loop, and as a Boolean test

These are different meanings for the same word, but are used in different contexts.  Python will not be confused.

In [None]:
import string

print(string.ascii_lowercase)

In [None]:
import string

s = 'Peanut'

for ch in string.ascii_lowercase:
    if ch in s:
        print(f"String {s} includes {ch}")

## Lists support 'in'

In [None]:
lst = ['c', 'a', 't']

for ch in string.ascii_lowercase:
    if ch in lst:
        print(f"List includes {ch}")
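Strings and lists answer `in` a little differently, and `not in` asks the opposite question. A short sketch using the same values as the cells above:

```python
s = 'Peanut'
lst = ['c', 'a', 't']

print('nut' in s)        # strings can match a longer substring
print('at' in lst)       # lists match whole elements only
print('z' not in s)      # 'not in' is the opposite test
```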

# One Boolean condition with two outcomes

### *Is you is or is you ain't my baby?*

https://www.youtube.com/watch?v=VP1Bcx1D3c8&vl=ja

In [None]:
x = -1

if x > 0:
    print(f'{x} is positive')
else:
    print(f'{x} is negative')

## *What kind of bug is that?*

In [None]:
x = 0

if x > 0:
    print(f'{x} is positive')
else:
    print(f'{x} is negative')

# Use 'elif' for more than two outcomes

'elif' takes a Boolean condition, just as 'if' does.  

In [None]:
x = 121

if x > 0: 
    print(f'{x} is positive')
elif x == 0:                       # Note '==', not '='
    print(f'{x} is zero')
else: 
    print(f'{x} is negative')

## Python checks the conditions in order

- If the first condition is False, check the next one
- If a condition is True, run that branch and skip the remaining tests
- *If multiple conditions are True, only the first True branch runs.*
- The final, unguarded else runs if none of the conditions are True

```python
    if x == ...:
        print(...)
    elif x == ...:
        print(...)
    elif x == ...:
        print(...)
    elif x == ...:
        print(...)
    elif x == ...:
        print(...)
    elif x == ...:
        print(...)
    else:
        print(...)
```
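To see the "first True branch wins" rule in action, here is a small sketch. The function `describe` and its cutoffs are invented for the illustration.

```python
def describe(x: int) -> str:
    """Return the label of the first condition that x satisfies."""
    if x > 100:
        return 'big'
    elif x > 10:        # also True when x > 100, but never reached then
        return 'medium'
    elif x > 0:
        return 'small'
    else:
        return 'not positive'


for x in [121, 50, 3, 0]:
    print(x, describe(x))
```

The order of the tests matters: if `x > 0` came first, every positive number would be labeled 'small'.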

# Parallel construction

In [None]:
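# Example values so the cell runs: change them and rerun
hasHammer = True
hasBell = False
hasSong = True

if hasHammer: 
    print('Hammer in Morning') 

if hasBell: 
    print('Ring in Morning')

if hasSong: 
    print('All over this land') 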

## In that example, we always check each condition
### Any, all, or none of the conditions can fire
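A sketch that makes this visible: collect a message for each test that passes and look at how many we get. The function `morning_plan` is a hypothetical helper for this example.

```python
def morning_plan(hasHammer: bool, hasBell: bool, hasSong: bool) -> list:
    """Collect one message for every condition that is True."""
    plan = []
    if hasHammer:
        plan.append('Hammer in Morning')
    if hasBell:
        plan.append('Ring in Morning')
    if hasSong:
        plan.append('All over this land')
    return plan


print(morning_plan(True, True, True))
print(morning_plan(True, False, True))
print(morning_plan(False, False, False))
```

With an if/elif chain we would get at most one message; with separate if statements we can get anywhere from zero to three.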

# Compound examples

In [None]:
time = 9

if (time < 8): 
    print('Please take a seat.')
elif (time > 12):
    print('Please take a seat.')
else: 
    print('Please come in.')

## Combine two clauses that have the same result with 'or'
### If this is True OR if that is True

In [None]:
if (time < 8) or (time > 12):
    print('Please take a seat.')
else: 
    print('Please come in.')
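The same decision can also be phrased with `not`. This sketch assumes whole-number hours; `greeting_or` and `greeting_not` are names made up to compare the two phrasings.

```python
def greeting_or(time: int) -> str:
    """Too early OR too late: wait."""
    if (time < 8) or (time > 12):
        return 'Please take a seat.'
    return 'Please come in.'


def greeting_not(time: int) -> str:
    """Same decision, phrased as NOT during open hours."""
    if not (8 <= time <= 12):
        return 'Please take a seat.'
    return 'Please come in.'


for time in [7, 8, 9, 12, 13]:
    print(time, greeting_or(time), greeting_or(time) == greeting_not(time))
```

Pick whichever version reads more naturally for the problem at hand.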

# What if we must pass two tests?

In [None]:
x = 13

if 0 < x: 
    if x < 10: 
        print('x is a positive digit')

## Both conditions must be True: this AND that.

In [None]:
if (0 < x) and (x < 10): 
    print('x is a positive digit')

In [None]:
if (0 < x < 10):                   # Python lets us chain comparisons
    print('x is a positive digit')

## Nesting If Statements

In [None]:
x = 3
y = 5

if x == y: 
    print('x and y equal') 
else: 
    if x < y: 
        print('x is less')
    else:
        print('x is greater')

##  I find it clearer to rewrite this one with elif

In [None]:
if x == y: 
    print('x and y equal') 
elif x < y: 
    print('x is less')
else:
    print('x is greater')

## Boolean Inequalities

In [None]:
## Assignment
x = 3
y = 5

## Equality
x == y      # Not x = y

## Inequalities
x != y
x > y
x < y
x >= y
x <= y

print('All Done')

# Application: Leap Year 
## Write Boolean function to tell if a year is a leap year

## Why is this hard?
## Digression: Unix calendar command cal
### *DOS doesn't have an equivalent*

In [None]:
! cal

```python
     June 2021        
Su Mo Tu We Th Fr Sa  
       1  2  3  4  5  
 6  7  8  9 10 11 12  
13 14 15 16 17 18 19  
20 21 22 23 24 25 26  
27 28 29 30        
```

## And one more

In [None]:
## 9th month of 1752

! cal 9 1752

```python
   September 1752
Su Mo Tu We Th Fr Sa  
       1  2 14 15 16  
17 18 19 20 21 22 23  
24 25 26 27 28 29 30  
```

## *Why, that's fantastic!*

## Julian Calendar  
Imposed in 46 BC by Julius Caesar.

Every fourth year is a Leap Year.

By 1500 Easter was moving into Summer: reform was needed.

Pope Gregory changed the rules in 1582.

The new rules are more complex, but keep the moveable feasts in place.

## Gregorian Calendar was not adopted in UK until 1752

In [None]:
! cal 1752

## *Give us back our 11 days!*

https://www.historic-uk.com/HistoryUK/HistoryofBritain/Give-us-our-eleven-days/


"William Willett of Endon. 
    
"Always keen on a joke, he apparently wagered that he could dance non-stop for 12 days and 12 nights. On the evening of September 2nd 1752, he started to jig around the village and continued all through the night. The next morning, September 14th by the new calendar, he stopped dancing and claimed his bets!"

## Which are the leap years?

"These extra days occur in years which are multiples of four (with the exception of centennial years not divisible by 400)" - Wikipedia

### Algorithm
We have a leap year
- On every year that is evenly divisible by 4
- Except every year that is evenly divisible by 100
- Unless the year is also evenly divisible by 400

### *We are ignoring the years before 1752*

# Test for our leap year function

Always a good idea to write the tests before you write the code.

Once you have written the code, you tend to share its bugs.

You can be much clearer **before** you have a dog in the fight.

In [None]:
# Which should be True and which should be False?

def test_leap_year() -> None:
    assert not is_leap_year(2021), "2021 Not divisible by 4"
    assert is_leap_year(2020), "2020 is divisible by 4 and not by 100"
    assert not is_leap_year(1900), "1900 is divisible by 100, but not by 400"
    assert is_leap_year(2000), "2000 divisible by 400"
    
    print("Passed 4 tests")

## Our first Boolean Function
### *Make it run, make it right, make it fast* - Kent Beck

In [None]:
# Signature is right, but the logic is wrong
# This says no year is a leap year: right 75% of the time
#
def is_leap_year(year: int) -> bool:
    "Is this a leap year?"
    return False

print(is_leap_year(2021))

## Let's try it

In [None]:
test_leap_year()

## Detour: Testing if a value is divisible by 2

We use '%', which returns the remainder
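A few remainders, as a quick sketch. `divmod()` is a built-in that returns the quotient and the remainder together.

```python
for n in [3, 4, 2020, 1900]:
    print(f"{n} % 2 = {n % 2}    {n} % 4 = {n % 4}    {n} % 400 = {n % 400}")

# divmod() returns the quotient and the remainder as a pair
quotient, remainder = divmod(2021, 4)
print(quotient, remainder)
```

A remainder of 0 means the first number is evenly divisible by the second.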

In [None]:
x = 3

def isOdd(x):
    if x % 2 == 0:
        return False
    else:
        return True
    
print(f"{x} is odd? {isOdd(x)}")

# You often see this as a test for non-zero

3 % 2 is 1, which is 'Truthy'

4 % 2 is 0, which is 'Falsy'

Any non-zero integer is 'Truthy'

In [None]:
def isOdd(x):
    return x % 2

In [None]:
print(f"{x} is odd? {isOdd(x)}")

if isOdd(x):
    print("That's odd")

## That works, and you will see that IRL

You may prefer to return a Boolean

In [None]:
def isOdd(x):
    return (x % 2 != 0)

In [None]:
print(f"{x} is odd? {isOdd(x)}")

if isOdd(x):
    print("That's odd")

## Take our test for divisibility to is_leap_year()

In [None]:
# Grow a solution

def is_leap_year(year: int) -> bool:
    if (year % 4 == 0):
        return True
    else:
        return False

In [None]:
test_leap_year()

## Rather than "if true, return true", just return condition

Replace
```python
    if <Condition is True>:
        return True
    else:
        return False
``` 
with
```python
    return Condition
```

In [None]:
# The same function, but more compact
# Still has bug
#
def is_leap_year(year: int) -> bool:
    return (year % 4 == 0)

In [None]:
test_leap_year()

## *Passes test 2 now, but fails on test 3*

# What do we think of this solution?

In [None]:
def is_leap_year(year):
    if year % 4 == 0:
        if year % 100 == 0 and year % 400 != 0:
            return False
        elif year % 100 != 0:
            return True
        elif year % 400 == 0:
            return True
        else:
            return False
    else:
        return False

In [None]:
test_leap_year()

### Two issues:
- Looks complex
- Runs lots of tests

In general, this
```python
    if year % 4 != 0:
        return False

    if year % 100 == 0 and year % 400 != 0:
        return False
    ...
```
is clearer than
```python
    if year % 4 == 0:
        if year % 100 == 0 and year % 400 != 0:
            return False
        ...
    else:
        return False
 ```
The condition (not a multiple of 4) is too far from the result (return False)

The name for this simplification is "Left Align the Happy Path"

## Can we simplify?

In [None]:
def is_leap_year(year: int) -> bool:
    if year % 4:
        return False
    
    # So we know that year is divisible by 4
    if year % 400 == 0:
        return True
    
    return year % 100 != 0

In [None]:
test_leap_year()

# What about this one?

In [None]:
def is_leap_year(year: int) -> bool:
    if year % 100 == 0:
        return year % 400 == 0
    return year % 4 == 0

In [None]:
test_leap_year()
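Four test cases are a start. For more confidence, one option is to compare our function with `calendar.isleap()` from the standard library over a range of years. The range below (1752 to 2400) is just a choice for this sketch.

```python
import calendar


def is_leap_year(year: int) -> bool:
    if year % 100 == 0:
        return year % 400 == 0
    return year % 4 == 0


# Collect every year where the two functions disagree
mismatches = [year for year in range(1752, 2401)
              if is_leap_year(year) != calendar.isleap(year)]
print("Mismatches:", mismatches)
```

An empty list means the two functions agree on every year we tried.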

# What makes a good function?
## a) Short
## b) Fast
## c) Understandable

## Stop and Think

Does it meet all 3 conditions?

```python
def is_leap_year(year: int) -> bool:
    if year % 100 == 0:
        return year % 400 == 0
    return year % 4 == 0
```

Short function, but always makes 2 tests

Can we do better?  *What should we test first?*
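A hint, without giving away the answer: count how many years each test applies to. This sketch looks at one full 400-year cycle.

```python
years = range(1, 401)   # one full 400-year cycle

not_by_4 = sum(1 for y in years if y % 4 != 0)
by_100 = sum(1 for y in years if y % 100 == 0)
by_400 = sum(1 for y in years if y % 400 == 0)

print(f"Not divisible by 4: {not_by_4} of {len(years)}")
print(f"Divisible by 100:   {by_100} of {len(years)}")
print(f"Divisible by 400:   {by_400} of {len(years)}")
```

A test that settles the question for most years is a good candidate to run first.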

In [None]:
# Write your function and test it
def is_leap_year(year: int) -> bool:
    # Insert your definition
    pass

In [None]:
test_leap_year()

# Reading Text Files

Text files have the following structure:

- A sequence of lines

- Each line of a Unix text file is a sequence of characters ending with '\n'

- Each line of a DOS text file is a sequence of characters ending with '\r\n'

However, in text mode Python translates both endings, so each line we read ends with '\n'

If we need to see '\r', we open the file in binary mode ('rb') and read it as bytes
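A minimal sketch of the difference between text mode and binary mode. It writes a small DOS-style file to a temporary directory; the file name `dos_sample.txt` and its contents are made up for the example.

```python
import os
import tempfile

# Build a small DOS-style file in a throwaway directory
path = os.path.join(tempfile.mkdtemp(), 'dos_sample.txt')
with open(path, 'wb') as f:
    f.write(b'first\r\nsecond\r\n')

with open(path, 'r') as f:           # text mode: line endings are translated
    text_lines = [line for line in f]

with open(path, 'rb') as f:          # binary mode: we see the raw bytes
    byte_lines = [line for line in f]

print(text_lines)
print(byte_lines)
```

The text-mode lines end with '\n'; the binary-mode lines still carry the '\r\n' pair.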

In [None]:
! more sample.txt

```python
   Sample.txt

   Sample text file to read

```

In [None]:
## Unix Word Count program

! wc sample.txt

## WordCount Program tells us text file sample.txt has:

```python
    ! wc sample.txt
        4       6      38 sample.txt
```

4 lines    6 words   38 characters
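We can reproduce these counts in Python. This is a rough sketch, not a full clone of wc: the function name `wc` is ours, and the contents of sample.txt are an assumption reconstructed from the listing above. (wc reports bytes, which match characters for plain ASCII text.)

```python
def wc(text: str) -> tuple:
    """Return (lines, words, characters) for a string, roughly like wc."""
    lines = text.count('\n')        # wc counts newline characters
    words = len(text.split())       # split() breaks on any whitespace
    return lines, words, len(text)


# Assumed contents of sample.txt, reconstructed from the listing above
sample = 'Sample.txt\n\nSample text file to read\n\n'
print(wc(sample))
```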

## Reading a text file in Python

In this example, we

- Open the file
- Read a line at a time
- Print the line

In [None]:
with open('sample.txt', 'r') as my_file:
    for line in my_file:
        print(line)

### *The 'with' clause closes the file when you exit the block*
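We can check this with the file object's `closed` attribute. The sketch below writes a throwaway file in a temporary directory so that it runs anywhere; the file name is made up.

```python
import os
import tempfile

# A throwaway file so the sketch does not depend on sample.txt
path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')

with open(path, 'w') as my_file:
    my_file.write('Sample text file to read\n')
    print('Inside the block, closed?', my_file.closed)

print('After the block, closed?', my_file.closed)
```

The name `my_file` still exists after the block, but the file behind it has been closed.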

# Crossword Puzzle Example

## Great!  Let's write a program to help with crosswords!

## Downey gives us words.txt with 113K English Words

## I want all 4-letter words that end in 't'

- Open the file
- Read a line at a time
- Print the lines that match the pattern: 4 letter words ending in 't'

In [None]:
## To run this cell, you need words.txt in the parent directory
## I put it there so I can see it from each week's subdirectory

print("Four letter words ending in t")

with open('../words.txt', 'r') as words:
    for line in words:
        if (len(line) == 4) and (line[-1] == 't'):
            print(line)
            
print("Done")

### What did we expect?  
### What did we get?

## Debug the issue with print 
repr() shows representation

In [None]:
help(repr)

In [None]:
print("Find four letter words ending in t\n")

with open('../words.txt', 'r') as words:
    for line in words:
        print(repr(line))                       # <<<<<<< Debug
        if (len(line) == 4) and (line[-1] == 't'):
            print(line)

```python
Find four letter words ending in t

'aa\n'
'aah\n'
'aahed\n'
...
```
## Which test is failing?  Rework test and debug print

In [None]:
print("Find four letter words ending in t\n")

with open('../words.txt', 'r') as words:
    for line in words:
        if (len(line) == 4):
            print(repr(line))      # <<<<<<< Debug
            if (line[-1] == 't'):
                print(line)

```python
Find four letter words ending in t

'aah\n'
'aal\n'
'aas\n'
...
```

## We are including the trailing \n
No line ends with 't'.  They all end with '\n'

We will use the string method strip()

https://docs.python.org/3/library/stdtypes.html
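A quick sketch of what `strip()` and its cousin `rstrip()` do to one line from the word list:

```python
line = 'abet\n'

print(repr(line), len(line))                    # the newline counts as a character
print(repr(line.strip()), len(line.strip()))    # whitespace removed from both ends
print(repr(line.rstrip('\n')))                  # remove only trailing newlines
print(repr('  a b  '.strip()))                  # inner spaces are kept
```

Strings are immutable, so `strip()` returns a new string; we have to save the result.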

In [None]:
help(strip)

In [None]:
help(str.strip)

In [None]:
print("Find four letter words ending in t\n")
with open('../words.txt') as words:
    for line in words:
        
        line = line.strip()                 # <<<<<<<
        
        if (len(line) == 4) and (line[-1] == 't'):
            print(line)

```python
    Find four letter words ending in t

    abet
    abut
    adit
    airt
    ...
```

# Command Line Parameters - mycopy.py
## *Will not work in Notebook - it uses argv parameters*
## Our first program with an interesting usage line

```python
        python mycopy.py <original> <copy>
```

### *The cell below will not complete the task from notebook*

```python
FileNotFoundError: [Errno 2] No such file or directory: '-f'
```
Save the contents as the file mycopy.py and run that

In [None]:
import sys

print(sys.argv)

In [None]:
# mycopy.py
#
# Copy file
# Usage:
#      % python mycopy.py <original> <copy>
#
# Jeff Parker, September, 2018

import sys


def copy(source: str, dest: str):
    "Copy one file to another"

    # Open the files
    with open(source, 'r') as fin:
        with open(dest, 'w') as fout:

            # Iterate over the input, write to output
            for line in fin:
                fout.write(line)


if (len(sys.argv) == 3):
    copy(sys.argv[1], sys.argv[2])
else:
    print("Usage: mycopy <from> <to>")
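One possible variation, not the version used in this notebook: put the argument handling in a function that takes the argument list as a parameter. Then we can try it from a notebook cell by passing our own list instead of `sys.argv`. The name `main` and the return codes are choices made for this sketch.

```python
def copy(source: str, dest: str) -> None:
    "Copy one text file to another"
    with open(source, 'r') as fin, open(dest, 'w') as fout:
        for line in fin:
            fout.write(line)


def main(argv: list) -> int:
    "Copy if we were given exactly two file names; return 0 on success"
    if len(argv) == 3:
        copy(argv[1], argv[2])
        return 0
    print("Usage: mycopy <from> <to>")
    return 1


# From the command line we would call main(sys.argv).
# In a notebook we can pass the list ourselves:
print(main(['mycopy.py']))     # too few arguments, so we get the usage line
```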

In [None]:
! cat mycopy.py

In [None]:
! ls *leap*

```python
    leapyear.py      test_leapYear.py
```

In [None]:
! python mycopy.py

## We didn't include parameters, so we get the final line
```python
    Usage: mycopy <from> <to>
```
## Give it parameters: file *from* and *to*

In [None]:
! python mycopy.py leapyear.py newleap.py

## Did we create a new file newleap.py?

In [None]:
! ls *leap*

```python
    leapyear.py      newleap.py       test_leapYear.py
```

## We have created a file.  Is it the same as leapyear.py?

In [None]:
## Unix command diff looks for differences 

! diff leapyear.py newleap.py

In [None]:
! wc leapyear.py
! wc newleap.py

In [None]:
! diff leapyear.py test_leapYear.py

## leapyear.py and newleap.py are the same
We made a copy of the text file leapyear.py

We can use mycopy.py to copy any text file

## Stop and Think

A) Will this work correctly on a Windows text file?

Try it and see.  

B) Write a program that decides if two files are the same.

Print the first pair of lines that differ

Implementing the full diff program is 'diff occult'.  I hope to discuss the algorithm later

# Recursion in action - traverse directory tree

We will learn how to do this in Python soon

In [None]:
# DOS version

! dir /s

In [None]:
# Unix version

! find .

## My directory holds something like this:
```python
.
./hanoi2.py
./cross.py
./hanoi.py
./isvowel.py
./Koch.py
./reverse.py
./__pycache__
./__pycache__/isvowel.cpython-37.pyc
./fakecopy.py
./traverse.py
./read2.py
./cli.py
./mycopy.py
./Day4.ipynb
./crossword.py
./find.py
./.ipynb_checkpoints
./.ipynb_checkpoints/Day4-checkpoint.ipynb
./double.py
./square.py
./day4.py
./test_vowel.py
./sample.txt
./read.py
./match.py
```
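As a preview, here is a rough sketch of a recursive traversal using `pathlib` from the standard library. The function name `traverse` and the tiny throwaway tree (`sub`, `a.txt`, `b.txt`) are made up for the example; this is one possible approach, not necessarily the one we will build in class.

```python
import tempfile
from pathlib import Path


def traverse(folder: Path) -> list:
    """Return folder and everything below it, as a list of paths."""
    found = [folder]
    for child in sorted(folder.iterdir()):
        if child.is_dir():
            found.extend(traverse(child))   # recursion: same job, smaller tree
        else:
            found.append(child)
    return found


# Build a tiny throwaway tree so the sketch runs anywhere
root = Path(tempfile.mkdtemp())
(root / 'sub').mkdir()
(root / 'a.txt').touch()
(root / 'sub' / 'b.txt').touch()

for path in traverse(root):
    print(path.relative_to(root))
```

To list your own directory, call `traverse(Path('.'))`.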

# Summary
Now we have loops, conditions, and functions

We know how to read text files

We can start to write interesting programs