<div style="text-align: right">
    <i>
        Python for Programmers <br>
        Spring 2023 <br>
        Anuar Assamidanov
    </i>
</div>

# Notebook 1: Basic IO, variables, boolean expressions

This notebook introduces basic programming terminology such as *functions*, *arguments*, *variables*, and *IO*. It shows how to implement basic input-output operations using `print` and `input`. Then it explains Python's fundamental *data types*, such as `str`, `int`, `float` and `bool`. Finally, it demonstrates how to create and evaluate Boolean expressions.

## 1. The Jupyter environment

In this class we are going to use Jupyter notebooks for coding. They are minimalistic, easy to read, and they make it easy to write explanations in Markdown. Moreover, they allow us to work with the code in each cell separately. By now, you should be familiar with this environment.

In order to avoid the hassle of installing Jupyter notebooks locally, we will be using [Colaboratory](https://colab.research.google.com/notebooks/welcome.ipynb). If you decide to work locally, get in touch with me **and** make sure you have all the required Python dependencies installed prior to beginning each homework.

We will use Python 3 exclusively.
If you are **not using Colab**, make sure that your notebook is running the **correct kernel**, by checking that Python 3 is listed in the top right corner.

## 2. Functions and their arguments

The notion of functions and arguments is probably not new to you, especially if you took formal semantics classes. Roughly, _John sleeps_ can be expressed as a function `sleep` getting `John` as an argument and therefore yielding `sleep(John)`.

In Python, and in programming in general, **functions** can be thought of as descriptions of actions. They always return some value. For example,
 * a function that reverses strings returns another string;
 * a function that adds two numbers together returns their sum;
 * a function that calculates the number of symbols in a sentence returns that number.
 
However, sometimes it might seem that a function is not returning anything. Spoiler: it returns something, and this something is *nothing*, or `None`.
We will come back to this later in the course.

In order to perform the intended chain of actions, a function needs zero or more **arguments**, or objects that are required in advance by that function. For example:
 * a function that reverses strings needs to have 1 argument: a string to reverse;
 * a function that removes the first _n_ words from a sentence needs 2 arguments: the number of words to remove, and the sentence itself;
 * a function that prints "Hello world!" needs 0 arguments: we know exactly what we are printing.
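
As a rough illustration of functions that take different numbers of arguments, here is a small sketch using three of Python's built-in functions. These were picked only for illustration; they are not the hypothetical functions described in the list above.

```python
# print() can be called with zero arguments: it just prints an empty line
print()

# abs() takes one argument: the number whose absolute value we want
print(abs(-7))    # 7

# pow() takes two arguments: the base and the exponent
print(pow(2, 5))  # 32
```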


**Question:** what arguments can a function that draws a circle have?



### Print

The simplest function in Python is `print`. It displays its argument(s) on the screen:

In [1]:
print(3.14) 

3.14


In other words, the `print` function displays on the screen whatever it has in the parentheses. When `print` receives arguments of different kinds, such as a piece of text and a number, the notebook highlights them in different colors. This is because Python's syntax highlighting shows that they belong to _different data types_.
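
A minimal sketch of `print` with two arguments of different kinds, with the values chosen arbitrarily for illustration. In a notebook, the text in quotes and the number should appear in different colors.

```python
# A piece of text (in quotes) and a number are different kinds of values.
# print separates its arguments with a space when displaying them.
print("Hello world!", 3.14)
```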

### Comments
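
A small cell for trying out comments could look like the sketch below. The wording of the first and last lines is made up for illustration.

```python
# This line is just a note for human readers
print("Hello world!")
# print("This line is commented out, so it never runs")
```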

If you tried running the cell above (if you didn't, try it now!), you should have noticed that:
- only "Hello world!" gets printed
- it looks like the first line is in plain English, has a different color than standard code, and it doesn't do anything.

Note that the first line starts with a `#`. This is the Python symbol that marks the start of a **comment**.
The `#` tells Python to ignore the rest of that line.
Comments can be used to leave yourself and your fellow programmers short descriptions of the code or reminders of what needs to be fixed.
They can also be used to make some code invisible to Python (maybe because you are not sure it is needed, or because you came up with multiple alternative implementations).

Good commenting is very important in programming, as it makes your code more readable and easier to understand (both for you and others).


## 3. Basic data types: int, float, str, bool

Data types are the fundamental building blocks of programming.

We will work on this concept and work through Python's so-called *primitive* data types throughout this notebook. But you can get a head start by running the cell below and watching the (optional) video:

In [1]:
from IPython.display import IFrame
IFrame('https://www.youtube.com/embed/A37-3lflh8I', width=700, height=350)

We will start exploring Python data types by looking at integers, floats, strings, and booleans.

**Integers** (`int`) are numbers without a fractional component. For example, `8`, `0`, `-1` and `-9248` are integers, whereas `3.14` or `-1.333` are not. 


**Floating point numbers** (`float`) are numbers with a fractional component, e.g. `9.8`, `4.3958` or `-8.000001`.

This is a very important distinction, since integers and floating point numbers are stored differently in the memory of the computer. A function that conveniently shows the type of its argument is `type`. With it, we can verify that `8` is an integer, whereas `8.5` is not.

Note that when `8` is written as `8.0`, it is a `float` and not an integer!
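
The checks described above can be sketched as follows, using the same numbers as in the text:

```python
print(type(8))    # <class 'int'>
print(type(8.5))  # <class 'float'>
print(type(8.0))  # <class 'float'>: the decimal point makes it a float
```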

We can perform arithmetic operations with integers and floats.

In [None]:
# addition


In [None]:
# subtraction


In [None]:
# multiplication


In [None]:
# division that returns a floating point number ("classic division")


In [None]:
# division that rounds the result down to the nearest integer ("floor division")


In [None]:
# exponentiation
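
As a quick reference, the six operations listed above could be tried out like this. The numbers `7` and `2` are arbitrary, and `print` is used so that every result is shown.

```python
print(7 + 2)   # addition: 9
print(7 - 2)   # subtraction: 5
print(7 * 2)   # multiplication: 14
print(7 / 2)   # classic division, always returns a float: 3.5
print(7 // 2)  # floor division, rounds down: 3
print(7 ** 2)  # exponentiation: 49
```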


**Practice:** how can you calculate the square root of `1024` using the Python knowledge that we already have?

On a separate note, notice two things:
 * there is an orange _Out\[number\]_ right next to the outputs of every cell, and
 * we didn't use `print`, and still saw the results of the operations!
 
This happens because when we run a cell, the output of the last operation is displayed on the screen. So what do you think we will see when we run the next cell? Try to answer **before** running it.
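
A cell for this experiment could look like the sketch below. The three expressions are arbitrary and only meant for illustration.

```python
1 + 1
2 * 10
3 - 5
```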


Did you see what you were expecting? This shows us that only the last operation is displayed as the output of a cell.
In general, if we want to make sure that something is shown (e.g. when we want to display the output of every operation), we should use the `print` function.

Back to data types! **Strings** (`str`) are sequences of characters: `"apple"`, `'Hello world!'` or `"My phone number is 123."`. **Strings are always surrounded by quotes!** These quotes can be either single or double, as long as you use them consistently.

**Question:** What happens if you type a string with mismatched quotes? Try it out in the next cell.

Note that when a number is surrounded by quotes, it is neither an integer nor a float: it is a string!
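
A minimal sketch of this difference, with the values chosen for illustration:

```python
print(type(5))      # <class 'int'>
print(type("5"))    # <class 'str'>: the quotes make it a string
print(type("5.0"))  # <class 'str'>: still a string, not a float
```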

In the previous code cell, we see the following line: `print(type(5))`. It simply means that the output of `type(5)` is passed to the `print` function as an argument, i.e. `type(5)` reports that `5` is an integer, and the `print` function takes that output and displays it on the screen.

For strings, the `+` operator defines concatenation.
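
A short sketch of string concatenation, with the strings chosen for illustration:

```python
# The + operator glues two strings together; it does not add them as numbers
print("15" + "1")           # 151
print("Hello " + "world!")  # Hello world!
```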

In [None]:
# "15" and "1" are strings, not integers!


**Practice:** what will happen if we add the string "15" to the integer 1? And can we use other integer operators with strings?

In [5]:
print("15" + 1)

TypeError: can only concatenate str (not "int") to str

In [None]:
print("15" * 2)

1515

A frequent task is to convert a value from one type to another, that is, to perform _typecasting_. If we want to change the type of a string to a number, for example to be able to perform arithmetic with a number that was represented as a string, we can use the `int` or `float` functions.

A value can be converted from another type to a string by using the `str` function.
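
The three conversion functions mentioned above can be sketched as follows, with the values chosen for illustration:

```python
print(int("15") + 1)     # 16: the string "15" becomes the integer 15
print(float("3.5") * 2)  # 7.0: the string "3.5" becomes the float 3.5
print(str(15) + "1")     # 151: the integer 15 becomes the string "15"
```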

Finally, **booleans** (`bool`) are `True` and `False`, or simply `1` and `0`. ([Why are they called Booleans though?](https://www.webopedia.com/TERM/B/Boolean_logic.html#:~:text=Named%20after%20the%20nineteenth%2Dcentury,of%20either%201%20or%200.))

Booleans are the "answers" to questions such as the following.
 * Does this phrase contain the word "linguistics"?
 * Is the sum of these two numbers bigger than 17?
 * Have we already seen this sentence before?
 
We will see very soon how extremely useful booleans are.
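
The correspondence between `True`/`False` and `1`/`0` mentioned above can be checked with a small sketch that uses the `type` and `int` functions from earlier in this section:

```python
print(type(True))  # <class 'bool'>
print(int(True))   # 1
print(int(False))  # 0
```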

## 4. Variables

Up to this point, we have worked with operations unrelated to each other. But what if we want to use the output of a recent operation as the input/argument for a new one? What we need is a system to store and retrieve data from memory.

The way to store some value in the memory of the computer is to define a _variable_ that refers to that value. In some sense, a variable is a name of the value, shared between us and the computer. As soon as we _declare_ that variable, we can use it to refer to its value.

For example, we can define a variable `name` and then use it if we want to greet someone:
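
A minimal sketch of such a greeting, with the value `"Maria"` chosen arbitrarily:

```python
# Store a string under the name `name`...
name = "Maria"

# ...and use the variable instead of typing the string again
print("Hello, " + name + "!")  # Hello, Maria!
```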

The value of the variable can be of any data type.

Note that `type()` now gives you the type of the value stored in your variable. If there are several lines where the same variable name is defined, only the last definition matters.
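
Both points can be seen in the sketch below, where the variable name `age` and its values are made up for illustration:

```python
age = 25
print(type(age))  # <class 'int'>

# Defining the same name again replaces the old value
age = "twenty-five"
print(type(age))  # <class 'str'>
print(age)        # twenty-five
```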

**Laws of variable names**
 * Variable names are not strings: they are not surrounded by quotes!
 * They cannot start with a digit.
 * They cannot contain spaces or special symbols such as $, !, ~, etc. (The underscore is fine though!)

**Warning:** never (unless you are doing it on purpose!) define a variable using a name that already means something for Python (`print`, `int`, `type`, etc.). It is possible, but it will break _a lot_ of things. Can you think of why that might be a problem? What would happen if you define a variable named `print`?

Now that we have introduced variables, we can easily store the result of an operation.

If the variable value needs to be updated with respect to its old value, we can use the following operators:
 * `var += some_value` (same as `var = var + some_value`);
 * `var -= some_value` (same as `var = var - some_value`);
 * `var *= some_value` (same as `var = var * some_value`);
 * `var /= some_value` (same as `var = var / some_value`).
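
The following sketch stores the result of an operation in a variable and then applies each of the update operators above in turn. The variable name `total` and the numbers are arbitrary.

```python
total = 5 + 3  # store the result of an operation: total is 8
total += 2     # total is now 10
total -= 4     # total is now 6
total *= 3     # total is now 18
total /= 9     # total is now 2.0 (classic division always gives a float)
print(total)   # 2.0
```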

### Note: Cells and order of execution

Now that we are working with variables, it will be common for you to write code in one cell that depends on the results computed in a previous cell.
Note that cells in a notebook are executed **independently**: running one cell does not automatically run the cells above it. So you have to make sure to execute them in the correct order for the dependencies between them to work.
Consider the two cells below. Try to execute the second cell without executing the first. What happens? Why?
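
For instance, the two cells could look like the sketch below. It is shown as a single block with each cell marked by a comment, and the variable name `greeting` is made up for illustration.

```python
# Cell 1: define a variable
greeting = "Hello"

# Cell 2: use the variable defined in Cell 1
print(greeting + ", world!")
```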

Now execute the first cell, and then run the second cell again. What happens? Why?

## 5. Basic IO

We already know that the way to display values on the screen is to `print` them. However, many tasks (e.g. chatbots!) rely on the input from a user. In Python, `input` takes care of it!

The `input` function allows users to enter some information, and makes that information available to the program as **a string**.
Thus, it is important to save the results of the input into some variable.

If `input` is called without any arguments (i.e. as `input()`, don't forget the parentheses!), it simply waits for the user to type in some information.

However, if `input` is called with an argument, this argument is displayed next to the input window. Compare it with having a `print()` right before the empty `input()`.

## 6. Boolean expressions

Boolean expressions are expressions that can be evaluated to `True` or `False` (hence, they return a boolean value). There are multiple logical operators that help us to form them.

The operator `==` checks whether its left and right sides are equal.

The opposite of `==` is `!=`, which checks for inequality.

Operators `>`, `>=`, `<` and `<=` are defined as well.
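
A few comparisons using the operators introduced so far, with the values chosen for illustration:

```python
print(3 == 3)          # True
print("cat" == "Cat")  # False: capitalization matters
print(3 != 4)          # True
print(5 > 7)           # False
print(5 <= 5)          # True
```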

The operator `in` checks if the left-hand side object is contained within the right-hand side one.

The operator `not` reverses the truth value to the opposite one.
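
A short sketch of `in` and `not`, with the strings chosen for illustration:

```python
print("cat" in "concatenate")          # True: "cat" appears inside the word
print("dog" in "concatenate")          # False
print(not True)                        # False
print(not ("dog" in "concatenate"))    # True: not reverses the False above
```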

Apart from the above listed operators, there are _complex operators_ `and` and `or`. Boolean expressions can be combined using these operators.
* **`and`** combines two expressions and returns `True` only if both of them evaluate to `True`;
* **`or`** returns `True` if at least one of the expressions it combines evaluates to `True`.

_Beware of the scope_: `(A and B) or C` is not the same thing as `A and (B or C)`!
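
The difference in scope can be seen in the sketch below. The variables `a`, `b` and `c` and their values are picked so that the two groupings give different results.

```python
a = False
b = True
c = True

print(a and b)         # False: a is False
print(a or b)          # True: b is True
print((a and b) or c)  # True: (False) or True
print(a and (b or c))  # False: False and (True)
```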

**Practice**: Try to guess the value each cell will return *before* running it!

Still confused? Boolean operators are used a lot in search engines (like those administering library databases!). You might want to check out this video to learn about the intuition behind Boolean operators.

In [2]:
IFrame('https://www.youtube.com/embed/sdx9dACkvyI', width=700, height=350)

A good way to familiarize yourself with Boolean operators is to think of them in terms of Truth tables. This requires doing a bit of math, but if you are interested in learning more, check out this video:

In [3]:
IFrame('https://www.youtube.com/embed/jbete3iXbdM', width=700, height=350)

## 7. More magic with strings (aka String methods)

If you keep working in computational linguistics, you are going to end up hearing about **Bag-of-words** models. These are models of meaning, which assume that the meaning of the text can be represented by all the words found in the text and their frequency. Intuitively, if a text is about pets, we expect words such as "cat" and "dog" to be more frequent in it, and if a text is about politics, words such as "president", "market", and "GDP" will occur more often.

However, there are words that are frequent in all types of texts: "and", "of", "the", "a(n)", "there", and so on. These words are called **stop words**, and since they are not informative for modeling the meaning of the text, they are frequently removed from it.

Similarly, for many linguistic tasks capitalization does not matter. For example, when the task is to get rid of stop words, we want to get rid of them independently of capitalization ("THE", "the", "The", etc.) However, for Python, "the" and "The" are completely different words.

There is a way to map all the versions of "the" with different capitalizations to "the": `str.lower("ThE")`.

Similarly, there are functions `upper` and `title` that convert a string to uppercase or to title case (where every word starts with a capital letter).
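
The three conversions can be sketched as follows, with the strings chosen for illustration. The last line uses the equivalent method-call spelling, which you will often see in other people's code.

```python
print(str.lower("ThE"))          # the
print(str.upper("ThE"))          # THE
print(str.title("hello world"))  # Hello World

# The same conversion written as a method call on the string itself
print("ThE".lower())             # the
```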

A string can contain quotation marks if we alternate the quote types (i.e. if double quotes mark the string, single quotes are used inside, or vice versa).

Another way to do it is to use the special _escape symbol_ `\` before the quotation mark that we want to have as a part of the string.

Other important special symbols are `\n` (new line) and `\t` (tab).
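
A minimal sketch of quotes inside strings and of the special symbols, with the sentences made up for illustration:

```python
# Alternating quote types
print("She said 'hello' to me.")
print('She said "hello" to me.')

# Escaping the inner quotes with a backslash
print("She said \"hello\" to me.")

# \n starts a new line, \t inserts a tab
print("first line\nsecond line")
print("name:\tAda")
```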

There are also functions that allow us to check whether a string is in uppercase, lowercase, or title case:
 * `str.isupper("Hello")` checks if a string is uppercase;
 * `str.islower("Hello")` checks if a string is lowercase;
 * `str.istitle("Hello")` checks if a string is in title case.
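
A few sample calls of these checks, with the strings chosen for illustration:

```python
print(str.isupper("HELLO"))        # True
print(str.isupper("Hello"))        # False: only the first letter is uppercase
print(str.islower("hello"))        # True
print(str.istitle("Hello"))        # True
print(str.istitle("hello World"))  # False: the first word is not capitalized
```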

Another very useful function is `len`. Can you guess what it does based on the outputs of the following cells?
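
Here are a few calls to try, with the strings chosen arbitrarily. The outputs are noted in the comments so that you can test your guess.

```python
print(len("cat"))           # 3
print(len("Hello world!"))  # 12
print(len(""))              # 0
```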

## Tips for learning Python Syntax

At this point, you might feel overwhelmed by the amount of information we cover. My tip here is to distinguish between things that you need to fully know and understand, and things you need to memorize.

For instance, you need to understand what data types are, and how to assign a value to a variable. You also need to know that there are string methods to convert lowercase letters to uppercase letters.
BUT, do you need to remember that the exact command for that is `str.upper`? Not at first!

My advice is to create your own "dictionary" for this course (as if you were learning a new language!), and list in it all the new commands that we encounter.
You can also crowdsource it as a resource for the class, maybe coordinating the work via Slack!

# Practice Problems


**Problem 1.** You are given the following paragraph.

In [49]:
text = "A glance around her studio reveals some of the complexity. The place is packed chockablock " \
       "with clusters of objects grouped by type: alarm clocks (maybe two dozen), antique books, model " \
       "clipper ships, African masks, birdcages, globes, painted wood watermelon slices, the Mexican " \
       "healing charms known as milagros and so-called mammy dolls piled on a chair."

Write code that will ask the user to enter a word. Then check if this word is contained in the text given above.

In [1]:
user_input = input("test")

**Problem 2.** Write a condition that checks if the word "banana" is present in the user input. Make sure that it works independently of the capitalization!

**Problem 3.** Ask the user for the year in which they were born, and print the age of the user. (Assume that the user's birthday is always January 1 so that the calculation is simple.)

_Hint:_ the `int` function might be useful here.

**Problem 4.** With string concatenation you can also play a round of *Mad Libs*.
If you aren't familiar with the game, here's how it works: you have a predetermined text where certain words have been replaced by their part of speech, for example *verb*, *noun*, *adjective*.
For each gap, you ask a friend to say a word of that part of speech.
You then put those words in the gaps and read the text aloud.
Ideally, hilarity ensues.

Here's an example adapted from the very first Mad Libs book:

~~~
"[exclamation]! He said [adverb] as he jumped into his convertible
[noun] and drove off with his [adjective] friend."

"Ron! He said better as he jumped into his convertible
Tesla and drove off with his irritating friend."
~~~

Write a program that allows the user to play a single round of Mad Libs with the computer (e.g. like the round above).

**Problem 5.** Ask the user for the minimum number of characters a word must have to be considered long. Then ask for another input: a word. Afterwards, write a boolean expression that checks whether the word provided by the user is long or not.