<a href="https://colab.research.google.com/github/Princeton-CDH/python4poets/blob/main/4_Conditionals_Functions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 4. Conditionals: The Whens and Ifs of Things

## 4.1 `if` Statements

As we've learned before in `2.4.1`, conditions return a Boolean value (`True` or `False`). For example,



> 34 < 57 --> `True`
>
> 60 > 92 --> `False`
> 
> 9 <= 10 --> `True`
>
> 7 != 10 --> `True`


You can use conditions with `if` statements to check *if* a certain condition is met. For example:




In [None]:
x = 5
y = 7

if x < y:
  print(str(x) + " is smaller than " + str(y))

5 is smaller than 7


You'll notice that the indented second line only executes if the condition in the first line is `True`. In the example above, since `5 < 7`, we print the specified `str`.
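
To make the role of the condition and of the indentation a little more tangible, here is a minimal sketch. The variable names `a`, `b`, and `is_smaller` are made up for this illustration; the point is only that a condition is an ordinary Boolean value, and that indented lines belong to the `if` statement while unindented lines do not.

```python
a = 34
b = 57

# a condition is just a value: True or False
is_smaller = a < b
print(is_smaller)

if is_smaller:
    # both indented lines belong to the if statement
    print(str(a) + " is smaller than " + str(b))
    print("This line is indented, too.")

# this line is not indented, so it is not part of the if statement
print("Done checking.")
```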

∇ *What happens if you change* `y` *to* `3`*?*

### 4.1.1 Architecture of `if` statements

In sum, then, an `if` statement is -- minimally -- constructed the following way: 

1.   a line starting with `if` and the condition that should be met, closed with a colon `:`
2.   a line that executes *if* the condition specified in line 1 is met.


In [None]:
name = "Toni Morrison"

if name == "Toni Morrison":
  print(name + " was awarded the Nobel Prize in Literature in 1993")

Toni Morrison was awarded the Nobel Prize in Literature in 1993


### 4.1.2 `if` and `else`
You can use `else` to specify what should be executed if the condition of the `if` statement is `False`. 

At the same time, you can also check a condition against the contents of a dictionary. In the following example, we are interested in whether the user-provided `str` -- obtained through the function `input()` -- is *in* the `dict`, that is, among its keys. Try out a couple of names:

In [None]:
laureates_selection = {
    "Gao Xingjian": 2000, "Sir Vidiadhar Surajprasad Naipaul": 2001, "Imre Kertész": 2002, "John M. Coetzee": 2003, 
    "Elfriede Jelinek": 2004, "Harold Pinter": 2005, "Orhan Pamuk": 2006, "Doris Lessing": 2007, 
    "Jean-Marie Gustave Le Clézio": 2008, "Herta Müller": 2009, "Mario Vargas Llosa": 2010, "Tomas Tranströmer": 2011, 
    "Mo Yan": 2012, "Alice Munro": 2013, "Patrick Modiano": 2014, "Svetlana Alexievich": 2015, 
    "Bob Dylan": 2016, "Kazuo Ishiguro": 2017, "Olga Tokarczuk": 2018, "Peter Handke": 2019,
    "Louise Glück": 2020, "Abdulrazak Gurnah": 2021, "Annie Ernaux": 2022}


print("Enter a name")
user_input = input()
if user_input in laureates_selection.keys():
  print(user_input + " was awarded the Nobel Prize in Literature in " + str(laureates_selection.get(user_input)))
else: 
  print(user_input + " is not among the list of people who were awarded a Nobel Prize in Literature since 2000")

Enter a name
Herta Muller
Herta Muller is not among the list of people who were awarded a Nobel Prize in Literature since 2000


∇  *What happens if you misspell a name? Can you imagine why that happens?*
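
As a side note on the lookup used above, here is a minimal sketch with a smaller dictionary. The name `prize_years` is made up for this illustration; its two entries are taken from this notebook. It suggests that `in` checks a dictionary's keys with or without `.keys()`, and that `.get()` can take a fallback value for keys that are not in the dictionary.

```python
prize_years = {"Toni Morrison": 1993, "Doris Lessing": 2007}

# `in` on a dictionary checks its keys, so .keys() is optional
print("Doris Lessing" in prize_years)
print("Doris Lessing" in prize_years.keys())

# square brackets and .get() both look up the value stored under a key
print(prize_years["Toni Morrison"])
print(prize_years.get("Toni Morrison"))

# .get() can return a fallback value when the key is missing
fallback = prize_years.get("Ishmael", "no prize on record")
print(fallback)
```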






### 4.1.3 `elif`

Should there be more than two possible outcomes (so not just `True` and `False`), you can use `elif`. You use this clause to check if a *second* (or third, or fourth...) statement is `True`. Compare the following:

In [None]:
test_string = "We die. That may be the meaning of life. But we do language. That may be the measure of our lives."


overall_counter = 0
e_counter = 0
rest_counter = 0
punctuation_counter = 0



for char in test_string: 
  overall_counter += 1
  if char == "e":
    e_counter += 1
  elif (char != ".") and (char != "\n"):
    rest_counter += 1
  elif char == ".":
    punctuation_counter += 1


print(e_counter)
print(rest_counter)
print(punctuation_counter)
print(overall_counter)

13
81
4
98


∇  *Can you read what gets counted in the lines of code above?*

You *can* specify a final `else`, but you *do not have* to. 

∇  *What* `char` *would fall under a final* `else` *in the lines of code above?*
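
For comparison, here is a separate sketch of a chain that does end in a final `else`. The string and the counter names are chosen only for this illustration. Whatever is caught by neither the `if` nor the `elif` lands in the `else` branch.

```python
line = "Call me Ishmael."

vowels = 0
spaces = 0
others = 0

for char in line:
    if char in "aeiouAEIOU":
        vowels += 1
    elif char == " ":
        spaces += 1
    else:
        # everything that is neither a vowel nor a space ends up here
        others += 1

print(vowels)
print(spaces)
print(others)
```

Because of the final `else`, the three counters always add up to the length of the string.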


## 4.2 Functions


### 4.2.1 Introduction to Functions, Second Act
Instead of writing the same lines of code repeatedly, you can also store them as **functions**. A function only executes when called. As you might recall, we've talked about functions already in lesson one, as (metaphorically speaking) **verbs** to our variables. They do the heavy lifting of transforming things! 

We had introduced functions already at the beginning of this course -- along mathematical lines, for example in the function `y = f(x)`, wherein the input `x` is transformed by the function `f` into the output `y`. 

But throughout the past notebooks, you've encountered other functions repeatedly. For example, at this point, you're comfortable writing the following:

In [None]:
print("Hello, World!")

This is a function, and while you may not know *what specifically* is happening behind the scenes (yet), by now you should know how you can use this function. Specifically, in `print()`, you are providing the parameter or argument of *what* you want to print (in the example above, specifically, `Hello, World!`). 

Now imagine you wanted to print `Hello, World!` repeatedly in your code -- you'd have to type that specific line multiple times. So let's just imagine you wanted to write *another* function just to print the specific string `Hello, World!`, whenever you call the function. You'd first have to *define* the function, using the `def` keyword:

In [None]:
def greet_world():
  print("Hello, World!")

After defining the function, you still have to call that same function to execute it. Let's do that in the following `if` statement:

In [None]:
x = "worldlings"
y = x[:5]

if y == "world":
  greet_world()

Hello, World!


∇  *Can you explain why the lines of code above print* `Hello, World!`*?*

Now, imagine that you greeted the world throughout writing code. At some point you may forget how you actually defined the function you use constantly -- and that's ok. Similarly, using other functions that you don't fully "get" yet is ok (and you'll certainly do it all the time). 

But if you need to get help, well... another function comes to your rescue! Specifically, Python has a built-in `help()` function, so let's try to get help for `print()`:



In [None]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.



That might be a little... confusing? But let's try to read that documentation. 

∇  *Can you understand some of the things in the documentation above?* 
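
One way to read such documentation is to try out the listed keyword arguments one at a time. The following is a small sketch along those lines; the variable names `buffer` and `captured` are made up for this illustration, and `io.StringIO` (from Python's standard library) serves here as an in-memory stand-in for a file.

```python
import io

# sep: the string placed between the values (default: a space)
print("Toni", "Morrison", sep="_")

# end: the string added after the last value (default: a newline)
print("no line break after this", end=" | ")
print("so this appears on the same line")

# file: where the text is sent; here a text buffer instead of the screen
buffer = io.StringIO()
print("a", "b", "c", sep="-", end="!", file=buffer)
captured = buffer.getvalue()
print(captured)
```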

### 4.2.2 Ascending the Ladder of Abstraction

One thing to keep in mind though is that you will encounter functions of increasing levels of abstraction. Take our example of the act of greeting someone again: while we have, above, defined `greet_world()` as a distinct function, we could also abstract it further -- greeting *someone* or *something* does much of the same thing as greeting the world, only that the input argument or parameter differs.


#### Level 1: Concrete ("hard coded") greetings

So in a sense, by defining our function `greet_world()` we actually hard-coded the parameter of *what* is greeted. Whoops!

In [None]:
greet_world()

Hello, World!


Now, we could again hard-code that you wanted to greet something else than the world. You could, again, define and call a specific function to that end:

In [None]:
def greet_ishmael():
  print("Hello, Ishmael!")

greet_ishmael()

Hello, Ishmael!


#### Level 2: Abstracting the person greeted

It's clear that there's a considerable overlap between `greet_world()` and `greet_ishmael()`, not just in the output (`Hello, World!` and `Hello, Ishmael!`), but also in the structure of how we defined these functions. Now... realistically speaking, how often would you `greet_world()` and `greet_ishmael()`, and how often would you just greet *someone or something*? So... why not just write a function that allows you to do the latter, in other words one that takes a more flexible parameter?

You can do so the following way:

In [None]:
def greet(x):
  print(f'Hello, {x}!')

Here we are *again* defining a function, `greet()`, but this time it takes an argument or parameter `x`. Crucially, in the second line, we specify that the variable `x` should take the place of what or who is greeted. So let's try that: 

In [None]:
greet('World')

Hello, World!


In [None]:
greet('Ishmael')

Hello, Ishmael!


Because `x` can stand in for anything, you can also greet others without defining a new function:

In [None]:
greet("'Python for Poets' group")

Hello, 'Python for Poets' group!


Also, we didn't define any limits for `x`, so now you can also do the amazing thing of greeting a `bool`: 

In [None]:
greet(True)

Hello, True!
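
The reason this works for any type is that an f-string converts whatever sits inside the curly braces to a `str`. A brief sketch of the difference from joining strings with `+` (the variable names are made up for this illustration):

```python
x = 7

# joining with + needs an explicit conversion to str
with_plus = "Hello, " + str(x) + "!"

# an f-string converts the value inside the braces automatically
with_fstring = f"Hello, {x}!"

print(with_plus)
print(with_fstring)
```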


#### Level 3: Abstracting the greeting used

Now, let's go one step further. What if the greeting itself (`Hello` thus far) is also just another variable in our process of greeting? 

So let's redefine `greet()` accordingly:

In [None]:
def greet(person, greeting = 'Hello'):
    print(f'{greeting}, {person}!')

You'll notice that `person` is a necessary argument of the function, while `greeting` is optional. This means that `greeting = 'Hello'` specifies the default greeting as `Hello`. Try it out:

In [None]:
greet("Ishmael")

Hello, Ishmael!


We can specify the `greeting` in a second argument. When the function is called subsequently, however, the `greeting` reverts back to the default value `Hello`:

In [None]:
greet("all of you", "Good afternoon")
greet("you there")

It's also ok to clarify the arguments as follows:

In [None]:
greet(person="Ishmael")

Hello, Ishmael!


What doesn't work, however, is any attempt to skip the necessary `person` argument:

In [None]:
greet(greeting="Good night")

TypeError: greet() missing 1 required positional argument: 'person'

Remember? We defined `greet()` as greeting *someone* or *something*. So make sure to include the required argument(s) when calling your functions:

In [None]:
greet(person="cow jumping over the moon", greeting="Goodnight")

Goodnight, cow jumping over the moon!
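
The different ways of passing arguments can be compared side by side. In the sketch below, `make_greeting()` is a hypothetical variant of `greet()` that returns the greeting as a `str` instead of printing it, which makes the results easy to compare.

```python
def make_greeting(person, greeting="Hello"):
    # build the greeting and hand it back instead of printing it
    return f"{greeting}, {person}!"

# positional arguments: the order matters
by_position = make_greeting("Ishmael", "Good evening")

# keyword arguments: the order does not matter
by_keyword = make_greeting(greeting="Good evening", person="Ishmael")

# leaving out the optional argument falls back to the default
with_default = make_greeting("Ishmael")

print(by_position)
print(by_keyword)
print(with_default)
```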


Of course, you can write much more elaborate functions, including functions that call other functions. In that regard, compare the following example from the introductory notebook:

In [None]:
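def greet_by_the_hour(person, utc_offset=-5):
    # get the current hour in UTC and shift it by the offset
    # (the modulo keeps the result between 0 and 23)
    import datetime
    utc_hour = datetime.datetime.now(datetime.timezone.utc).hour
    hour = (utc_hour + utc_offset) % 24

    # pick a greeting depending on the time of day
    if hour < 12:
        greeting = "Good morning"
    elif hour < 17:
        greeting = "Good afternoon"
    elif hour < 21:
        greeting = "Good evening"
    else:
        greeting = "Good night"

    return greet(person, greeting)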


In [None]:
greet_by_the_hour('Ishmael')

Good morning, Ishmael!


∇  *Can you read the function above?*
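
One detail worth noting when reading such a chain: Python tests the branches from top to bottom and runs only the first one whose condition is `True`. The sketch below isolates that logic; `time_of_day()` and its labels are made up for this illustration and are not part of the course package.

```python
def time_of_day(hour):
    # branches are tested from top to bottom; only the first match runs
    if hour < 12:
        return "morning"
    elif hour < 17:
        # reached only if hour < 12 was False, so hour is at least 12 here
        return "afternoon"
    elif hour < 21:
        return "evening"
    else:
        return "night"

for hour in [9, 12, 18, 23]:
    print(hour, time_of_day(hour))
```

Because of this ordering, the second check does not need to spell out `hour >= 12`.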

### 4.2.3 More on Functions in Functions

Embedding functions *into* functions can hence be very powerful, but may require some practice at first. You may also struggle to read a function you're using -- but that's ok; remember, you do not need to fully grasp every bit of code that you're using. More important is understanding the process behind the code, and how to get the information you need. 

In the following, we'll walk you through this process, taking again functions we defined in notebook 1:

In [None]:
# Install a "python4poets" package of code (-q means quietly and -U means update)
!pip install -qU git+https://github.com/Princeton-CDH/python4poets

# import everything
from python4poets import *

# define debinarize() and problematize()
def debinarize(text):
    swap={
        'he':'they',
        'she':'they',
        'him':'them',
        'his':'their',
        'her':'their',
        'man':'person',
        'woman':'person',
        'husband':'spouse',
        'wife':'spouse'
    }
    return swap_words(text, swap)
  
def problematize(text):
    swap = {
        'universally':'in some circumstances',
        'truth':'position',
        'must':'might'
    }
    return swap_words(text, swap)

  Preparing metadata (setup.py) ... done


∇ *What are things you notice about the two functions* `debinarize()` *and* `problematize()`*?*

It's also a good idea to test out whether the function fulfills its intended purpose:

In [None]:
# problematize the austen sentence
austen = 'It is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife.'
problematize(debinarize(austen))

'It is a position in some circumstances acknowledged, that a single person in possession of a good fortune, might be in want of a spouse.'

You should be equipped by now to understand the full syntax of the functions listed above; the only thing you wouldn't know is the embedded function `swap_words()`.



∇ *How could you find more information on that function?*

In [None]:
# maybe a line of code?


Once you have managed to track down `swap_words()`, you'll find it declared as follows:





In [None]:
def swap_words(text, swap):
    orig_words = tokenize(text)
    swap = {
        **dict((v,k) for k,v in swap.items()),
        **dict((k.title(), v.title()) for k,v in swap.items()),
        **dict((v.title(), k.title()) for k,v in swap.items()),
        **swap, 
    }

    new_words = [
        swap.get(orig_word, orig_word)
        for orig_word in orig_words
    ]
    
    return untokenize(new_words)

∇ *Can you understand parts of this?* 

∇ *Let's go back to* `debinarize()` *and* `problematize()` *-- do those functions make a little more sense to you now?* 

You may have to go through the [Python documentation](https://docs.python.org/3/tutorial/controlflow.html#more-on-defining-functions) or [Stack Overflow](https://stackoverflow.com/questions/36901/what-does-double-star-asterisk-and-star-asterisk-do-for-parameters) to understand parts of this function. Again, that's ok -- looking things up is a normal part of learning Python. 
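
To give a sense of what the double asterisks do inside the curly braces, here is a reduced sketch with a two-entry dictionary. The variable names `reverse`, `combined`, and `new_words` are chosen for this illustration; the real function additionally handles capitalized words and tokenizing.

```python
swap = {"man": "person", "wife": "spouse"}

# swap.items() yields (key, value) pairs; flipping them builds the reverse mapping
reverse = dict((v, k) for k, v in swap.items())

# ** unpacks each dictionary into the new one;
# if a key occurs more than once, the later entry wins
combined = {**reverse, **swap}
print(combined)

# .get(key, default) returns the default when the key is missing,
# so words without a replacement are kept as they are
words = ["a", "single", "man"]
new_words = [combined.get(word, word) for word in words]
print(new_words)
```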

You will *again* encounter functions in there -- actually two: `tokenize()` and `untokenize()`.

∇ *Judging by the names of these functions, what should they do?* 

(Additionally, you may have also noticed the dictionary method `dict.items()` and the string method `str.title()`, but let's not go down those routes right now).

So if you were to again look up `tokenize()` and `untokenize()`, you'd find the following:

In [None]:
# `re` and `nltk` need to be imported for these functions to run
import re
import nltk

def tokenize(text):
    """
    Split a text into tokens (words, morphemes we can separate such as
    "n't", and punctuation).
    """
    return list(_tokenize_gen(text))
    
def _tokenize_gen(text):
    for sent in nltk.sent_tokenize(text):
        for word in nltk.word_tokenize(sent):
            yield word

def untokenize(words):
    """
    Untokenizing a text undoes the tokenizing operation, restoring
    punctuation and spaces to the places that people expect them to be.
    Ideally, `untokenize(tokenize(text))` should be identical to `text`,
    except for line breaks.
    """
    text = ' '.join(words)
    step1 = text.replace("`` ", '"').replace(" ''", '"').replace('. . .', '...')
    step2 = step1.replace(" ( ", " (").replace(" ) ", ") ")
    step3 = re.sub(r' ([.,:;?!%]+)([ \'"`])', r"\1\2", step2)
    step4 = re.sub(r' ([.,:;?!%]+)$', r"\1", step3)
    step5 = step4.replace(" '", "'").replace(" n't", "n't").replace(
        "can not", "cannot")
    step6 = step5.replace(" ` ", " '")
    return step6.strip()

∇ *Can you understand parts of this?* 
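
The keyword `yield` may be new here: it turns a function into a *generator*, which hands back its values one at a time. The sketch below imitates the structure of `_tokenize_gen()` under a simplifying assumption: plain `str.split()` stands in for the `nltk` tokenizers, so punctuation stays attached to the words. The name `words_in()` is made up for this illustration.

```python
def words_in(text):
    # a generator function: each `yield` hands back one value
    for sentence in text.split(". "):
        for word in sentence.split():
            yield word

generator = words_in("We die. That may be the meaning of life.")

# next() asks the generator for a single value
first_word = next(generator)
print(first_word)

# list() collects all values a generator produces
all_words = list(words_in("We die. That may be the meaning of life."))
print(all_words)
```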

Now, you'll notice that `tokenize()` again depends on another function (`_tokenize_gen()`), which, in turn, has its own dependencies, drawing in particular on the external library `nltk`, so we might want to stop this game now. So let's instead try to use these functions as the docstring of `untokenize()` suggests, on a `str` by Ursula Le Guin:

In [None]:
test_str = "I talk about the gods, I am an atheist. But I am an artist too, and therefore a liar. Distrust everything I say. I am telling the truth."
untokenize(tokenize(test_str))

'I talk about the gods, I am an atheist. But I am an artist too, and therefore a liar. Distrust everything I say. I am telling the truth.'

Well, that worked (whatever happened behind the scenes)!
Now, let's just skip the `untokenize()` part to see what actually happened:

In [None]:
tokenize(test_str)

['I',
 'talk',
 'about',
 'the',
 'gods',
 ',',
 'I',
 'am',
 'an',
 'atheist',
 '.',
 'But',
 'I',
 'am',
 'an',
 'artist',
 'too',
 ',',
 'and',
 'therefore',
 'a',
 'liar',
 '.',
 'Distrust',
 'everything',
 'I',
 'say',
 '.',
 'I',
 'am',
 'telling',
 'the',
 'truth',
 '.']
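
With the token list in view, the job of `untokenize()` becomes easier to picture: joining the tokens with spaces leaves a stray space in front of each punctuation mark, which then has to be removed. The sketch below applies only two of the substitutions from `untokenize()` (its `step3` and `step4` patterns) to a handful of tokens; the variable names are chosen for this illustration.

```python
import re

tokens = ['But', 'I', 'am', 'an', 'artist', 'too', ',',
          'and', 'therefore', 'a', 'liar', '.']

# joining with spaces puts a space in front of every token, punctuation included
joined = ' '.join(tokens)
print(joined)

# remove the space in front of punctuation that is followed by a space or quote
inner_fixed = re.sub(r' ([.,:;?!%]+)([ \'"`])', r"\1\2", joined)

# remove the space in front of punctuation at the very end of the string
fixed = re.sub(r' ([.,:;?!%]+)$', r"\1", inner_fixed)
print(fixed)
```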