## Introduction
Welcome to the workshop **Python Fundamentals**, hosted by ITS - Research Services at the University of Iowa.

Instructor: Giang Rudderham

This workshop focuses on the **basic** concepts and features of Python that will serve as building blocks for data analysis tasks.

Today is **day 1** of the workshop. This notebook will cover day 1 and part of day 2 of the workshop.

In this notebook we will cover:
* Using Jupyter Notebook - the basics
* Numbers and arithmetic
* Built-in functions
* Variable assignment
* Strings
* Booleans
* Comparison statements
* Technical notes
* Download notebooks

## Jupyter Notebook

This is a Jupyter Notebook. Jupyter Notebooks contain **text cells** and **code cells**.

To run a code cell, click in the gray cell and press **Shift + Enter** (Windows) or **Shift + Return** (Mac).

For example, run the code cell below. *Don't worry about understanding it for now.*

In [1]:
# Remark: comments are written using the pound key (#)
# I have a dog named Bonnie
# Bonnie hasn't seen any squirrels yet at the start of the walk
squirrels_seen = 0
print(squirrels_seen)

# 20 minutes into the walk, she has seen 4 people, 1 dog, 1 bunny, and 3 squirrels
squirrels_seen = squirrels_seen + 3

if squirrels_seen > 0:
    print("But I want to run after the squirrels!")
    
bonnie_thoughts = "squirrels! " * squirrels_seen
print(bonnie_thoughts)

0
But I want to run after the squirrels!
squirrels! squirrels! squirrels! 


***
**Text cells**, like this one, are written in a language called Markdown. Double click on a text cell to edit it. You can run a text cell similarly to running a code cell. Running a text cell will return nicely formatted text.

### Your turn

Practice editing a text cell. Double click on the text cell below to edit it. When you are done, run the cell by pressing **Shift + Enter** (Windows) or **Shift + Return** (Mac).

**How do you like Jupyter Notebook so far?**

*Your answer here:* Pretty nice, is this the one people commonly use?

***
As you work through a notebook, make sure to save it from time to time. Press **Ctrl + s** (Windows) or **Command + s** (Mac).

There are many more things we can do with a notebook. For a beginner-friendly tutorial, see https://towardsdatascience.com/jypyter-notebook-shortcuts-bf0101a98330.

***
Two types of data that you will often run into are numbers and text (strings). We will look at each of them below.

## Numbers and arithmetic

Python can act as a simple calculator. Run the code cells below to see the outputs.

To run a code cell, click in the gray cell and press **Shift + Enter** (Windows) or **Shift + Return** (Mac).

In [2]:
2 + 2

4

In [3]:
20 - 10

10

Here are a few other operations that Python can also do.

| Operator | Name | Description |
| --- | --- | --- |
| a + b | Addition | Sum of a and b |
| a - b | Subtraction | Difference of a and b |
| a * b | Multiplication | Product of a and b |
| a / b | True division | Quotient of a and b |
| a // b | Floor division | Quotient of a and b, rounded down to a whole number |
| a % b | Modulus | Remainder after division of a by b |
| a ** b | Exponentiation | a raised to the power of b |
| -a | Negation | The negative of a |

In that table, notice that there are 2 types of division. "True division" is what most of us are used to.

In [None]:
31 / 5

"Floor division" gives us the result rounded down to the nearest integer.

In [None]:
31 // 5
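
Floor division pairs naturally with the modulus operator `%` from the table above: one gives the whole-number quotient, the other gives the remainder. The short sketch below illustrates this, and also shows that "rounded down" matters for negative numbers. The variable names are chosen here only for illustration.

```python
# Floor division and modulus split a number into a quotient and a remainder
quotient = 31 // 5
remainder = 31 % 5

# Putting the pieces back together gives the original number
rebuilt = quotient * 5 + remainder
print(quotient, remainder, rebuilt)   # 6 1 31

# Floor division rounds down, not toward zero,
# so a negative result is one lower than you might expect
negative_quotient = -31 // 5
print(negative_quotient)              # -7
```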

### Order of operations

In school you probably learned the order of arithmetic operations: PEMDAS, which stands for **P**arentheses, **E**xponents,  **M**ultiplication/**D**ivision, **A**ddition/**S**ubtraction.

The rules are similar in Python.

In [None]:
# 5 * 6 gets computed first
50 - 5 * 6

Using `()` changes the order of operations and gives us a different result.

In [None]:
# (50 - 5) gets computed first, due to the parentheses
(50 - 5) * 6
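
The same idea applies to exponents, which rank above multiplication and division. The sketch below is a small illustration of that ordering; the variable names are made up for this example.

```python
# Exponents are computed before multiplication
exponent_first = 2 * 3 ** 2        # 3 ** 2 is 9, then 2 * 9
print(exponent_first)              # 18

# Parentheses force the multiplication to happen first
product_first = (2 * 3) ** 2       # 2 * 3 is 6, then 6 ** 2
print(product_first)               # 36

# Multiplication and division have the same rank,
# so they are evaluated from left to right
left_to_right = 20 / 5 * 2         # 20 / 5 is 4.0, then 4.0 * 2
print(left_to_right)               # 8.0
```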

In the examples above, lines that start with `#` are **comments**. Comments are not executed but are important to document code.

### Types of numbers

Numbers such as `6` and `6.2` are recognized by Python differently. We can use the built-in function `type()` to see how Python would describe `6` and `6.2`.

In [None]:
type(6)

In [None]:
type(6.2)

`int` is short for integer, which is a whole number. A `float` (short for floating-point number) is a number with a decimal point.

### Built-in functions

**Built-in functions** are functions that come with Python, ready for us to use without additional installation. As the examples above show, we call a function by writing its name (here, `type`), followed by parentheses, with the inputs inside the parentheses.

Inputs to a function, such as `6` and `6.2` in the examples above, are called **arguments**.

For a list of Python built-in functions, see https://docs.python.org/3/library/functions.html

`int()` and `float()` are built-in functions that convert numbers to the corresponding types.

In [None]:
int(6.2)

In [None]:
float(6)
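
One detail worth knowing: `int()` does not round. It simply drops the fractional part. The sketch below illustrates this with a few values picked for the example; the variable names are not part of the workshop material.

```python
# int() drops the fractional part; it does not round up
whole_part = int(6.9)
print(whole_part)            # 6

# For negative numbers, dropping the fractional part moves toward zero
negative_whole = int(-6.9)
print(negative_whole)        # -6

# float() adds a decimal point to a whole number
as_float = float(6)
print(as_float)              # 6.0
```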

`min()` and `max()` return the minimum and maximum, respectively.

In [None]:
min(2, 3, 4)

In [None]:
max(-2, -3, -4)

### Your turn

1. In Iowa, we started social distancing due to COVID-19 in March 2020. Suppose it has been **18 months** since the start of social distancing.

Use Python to calculate the number of **years** it has been since the start of social distancing. Use true division.

In [10]:
# Your answer here
18/12

1.5

2. What kind of numbers do we get when we use true division? Test this by typing `type(18 / 12)` in the code cell below and running the cell.

In [7]:
# Your answer here
type(18/12)

float

3. What kind of numbers do we get when we use *floor* division? Test this with `type(18 // 12)`.

In [8]:
# Your answer here
type(18//12)

int

4. `abs()` is another useful built-in function. Look up the documentation for it at https://docs.python.org/3/library/functions.html#abs.

Use `abs()` to return the absolute value of `-5`.

In [9]:
# Your answer here
abs(-5)

5

## Strings

![image.png](attachment:image.png)

Strings can be enclosed in either single quotes or double quotes.

In [11]:
"hi Bonnie!"

'hi Bonnie!'

In [None]:
'I want to chase squirrels!'

Use `\` to escape quotes.

In [12]:
'Sorry, you can\'t.'

"Sorry, you can't."

The built-in function `print()` omits the enclosing quotes and gives a more readable output.

In [13]:
print('Sorry, you can\'t.')

Sorry, you can't.


Use `+` to **concatenate** strings. Concatenating 2 strings means linking them end-to-end.

In the example below, we start with the string "chas" and create different forms of the verb "chase".

In [18]:
stem = "chas"
print(stem + "ed")
print(stem + "ing")
print(stem + "es")
print(stem + "es3")
print(3)

chased
chasing
chases
chases3
3


In the example above, we created a variable called `stem` and assigned to it the value "chas". The `=` is called the **assignment operator**.

We can reassign a variable and change its value or even its type.

In [17]:
stem = "chas"
print(stem + 3)

TypeError: can only concatenate str (not "int") to str

Note that the cell above fails: `print(stem + 3)` raises a `TypeError` because `+` cannot join a string and a number. Reassigning `stem` to a new value, as in the cells below, is a different operation and works fine.
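
One way around the `TypeError` above is to convert the number to a string first with the built-in function `str()`. This is a minimal sketch; the variable names `number` and `combined` are made up for illustration.

```python
stem = "chas"
number = 3

# str() turns the number 3 into the string "3",
# so + can concatenate two strings
combined = stem + str(number)
print(combined)   # chas3
```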

In [None]:
# initially assign stem the string "chas"
stem = "chas"
type(stem)

In [None]:
# now change stem to a different type
stem = 2
type(stem)

We can also modify `stem` and overwrite its previous value. In this example, Python calculates the expression on the right-hand side of `=`, which is `stem + 3`, and then assigns the result to the variable on the left-hand side.

In [None]:
stem = 2
# overwrites stem with a new value
stem = stem + 3
stem

***
Back to strings, we can use `*` to repeat strings.

In [None]:
thoughts = "go! "
print(thoughts * 3)

There is plenty to learn about strings. We will come back to strings in a later lesson.

### Your turn

1. Use `\` to print the string
`She said, "I can't wait to visit Cairo!"`.

*Hint: recall that `\'` will print as `'`*.

In [None]:
# Your answer here


2. The **newline character** `\n` will cause Python to start a new line in a string. For example,

`print("Roses are red\nViolets are blue")`

will print:

`Roses are red
Violets are blue`.

Use `\n` to print the string:

`Is it time for a walk yet?
No, just a minute.
How about now?
Not yet.
How about now??
Okay, okay.
`

In [None]:
# Your answer here


**Aside**: We can use triple quotes to enter a new line without using `\n`.

In [None]:
print('''Is it time for a walk yet?
No, just a minute.
How about now?
Not yet.
How about now??
Okay, okay.''')

3. Bonnie is currently 2.5 years old. 
* Create a variable called `bonnie_age` and assign to it the value `2.5`.
* What's Bonnie's age in 2 years? Add `2` to `bonnie_age` and reassign it back to `bonnie_age`, overwriting the original value of `bonnie_age`.

In [None]:
# Your answer here

# Check your answer
bonnie_age

## Booleans

Another type that is very helpful for cleaning and analyzing data is `bool`, short for boolean. A boolean has one of two values: `True` or `False`.

In [None]:
x = True
type(x)

In [None]:
# False is also a boolean
x = False
type(x)

### Comparison operations

We typically get a boolean value by running comparisons. Below are some comparison operators.

| Operation | Description | Operation | Description |
| --- | --- | --- | --- |
| a == b | a equal to b | a != b | a not equal to b |
| a < b | a less than b | a > b | a greater than b |
| a <= b | a less than or equal to b | a >= b | a greater than or equal to b |

Here are some examples.

In [None]:
2 < 3

In [4]:
-2 > -1

False

In [None]:
# Python is able to compare floats and integers and returns result as expected.
3.0 == 3

In [None]:
# be careful if you are trying to compare strings and numbers!
"3" == 3

What's the difference between `=` and `==`? We use `=` to assign a value to a variable. We use `==` to compare 2 values.

In [None]:
# assign 2 to a and 3 to b
a = 2
b = 3

# compare a and b
a == b
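
Returning to the earlier comparison of the string `"3"` with the number `3`: the two are never equal, because they have different types. If you want to compare them by value, one option is to convert one side first with `int()` or `str()`. The sketch below assumes the string really contains a whole number; the variable names are chosen for illustration.

```python
text_number = "3"

# A string and a number are different types, so == gives False
direct_comparison = text_number == 3
print(direct_comparison)          # False

# Convert the string to an integer before comparing
converted_comparison = int(text_number) == 3
print(converted_comparison)       # True

# Or convert the number to a string instead
string_comparison = text_number == str(3)
print(string_comparison)          # True
```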

### Combining comparisons: `and`, `or`, `not`

We can combine multiple comparisons by using `and`, `or`, and `not`.

In [None]:
(9 > 1) and (7 > 2)

In the example above, both conditions in the parentheses are true, so combining them with `and` returns `True`.

There is much more to learn about logic, but here are some simple summary tables for `and`, `or`, and `not`.

**Table for `and`:**

| P | Q | P and Q | 
| --- | --- | --- | 
| True | True | True | 
| True | False | False | 
| False | True | False | 
| False | False | False | 

In words, `P and Q` is true only when both `P` and `Q` are true.

Our example using `and` above illustrates the first row of the table above.

Here's another example, this time illustrating the *third* row of the table. Notice the comparison operator `==`, which was used to check an exact match of a string.

In [None]:
# define Bonnie's age and breed
bonnie_age = 2.5
bonnie_breed = "Great Pyrenees"

# check if Bonnie is a Great Pyr puppy
(bonnie_age < 1) and (bonnie_breed == "Great Pyrenees")

***

**Table for `or`:**

| P | Q | P or Q | 
| --- | --- | --- | 
| True | True | True | 
| True | False | True | 
| False | True | True | 
| False | False | False | 

In words, `P or Q` is true when at least one of `P` and `Q` is true.

This example with `or` illustrates the second row of the `or` table above.

In [None]:
# using "or", only 1 condition needs to be true in order for the whole statement to be true.
(2 < 3) or (4 < 1)

Here's the example previously used with `and`, this time using `or`. It illustrates the *third* row of the `or` table. Notice the comparison operator `==`, which was used to check an exact match of a string. 

In [None]:
# define Bonnie's age and breed
bonnie_age = 2.5
bonnie_breed = "Great Pyrenees"

# check if Bonnie is a Great Pyr OR a puppy
(bonnie_age < 1) or (bonnie_breed == "Great Pyrenees")

***

**Table for `not`:**

| P | not P | 
| --- | --- | 
| True | False | 
| False | True | 

In words, if `P` is true, then `not P` is false. 

This example using `not` illustrates the first row of the `not` table.

In [None]:
# x < 10 is True so not(x < 10) is False
x = 5
not(x < 10)

Here's the example previously used with `and`, this time using `not`. It illustrates the *second* row of the `not` table. Only Bonnie's age is checked here, so the `==` comparison is not needed.

In [None]:
# define Bonnie's age and breed
bonnie_age = 2.5
bonnie_breed = "Great Pyrenees"

# check if Bonnie is NOT a puppy
not (bonnie_age < 1) 

***

Spend some time on your own to make sure you understand the tables above. It's helpful to be able to write conditional statements when you write code.
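
One way to practice is to check a row of each table directly in a code cell. The sketch below checks the second row of the `and` and `or` tables and the first row of the `not` table. The names `p` and `q` are placeholders for any two conditions; try changing their values and rerunning the cell.

```python
# Pick values for the two conditions
p = True
q = False

# Second row of the "and" table: True and False
and_result = p and q
print(and_result)   # False

# Second row of the "or" table: True or False
or_result = p or q
print(or_result)    # True

# First row of the "not" table: not True
not_result = not p
print(not_result)   # False
```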

### Your turn

One common task in data cleaning is to filter the dataset to keep only certain rows that meet certain criteria. These exercises are simple examples of that task.

1. Suppose you have a patient whose age is 25 years old. Create a variable called `patient_age` and assign to it the value `25`. Then write a comparison statement to check if the patient is in the age group "less than 50 years old".

In [None]:
# Your answer here


2. In many datasets, it's common to code sex assigned at birth as either 0 (male) or 1 (female).

Suppose you have a *male* patient whose age is 25. 
* Create a variable called `patient_age` and assign to it the value `25`. 
* Create a variable called `patient_sex` and assign to it the value `0` (for male).
* Then write a comparison statement to check if the patient is a **female** who is less than 50 years old.

*Note that in many datasets, sex assigned at birth is a different variable than gender.*

In [None]:
# Your answer here


3. Now suppose you want to look at any patient who is either female or less than 50 years old. The patient doesn't have to meet both criteria, just either one.

Continue with our example of a *male* patient whose age is 25. 
* Create a variable called `patient_age` and assign to it the value `25`. 
* Create a variable called `patient_sex` and assign to it the value `0` (for male).
* Then write a comparison statement to check if the patient is either a *female* **or** is less than 50 years old.

In [None]:
# Your answer here


4. Now suppose you want to look at any patient who is **not** a female that is less than 50 years old. 

Continue with our example of a *male* patient whose age is 25. 
* Create a variable called `patient_age` and assign to it the value `25`. 
* Create a variable called `patient_sex` and assign to it the value `0` (for male).
* Then write a comparison statement to check if the patient is **not** a female that is less than 50 years old. 

In [None]:
# Your answer here


***
Just for fun, here is the example that was shown at the beginning of this notebook. Now read through it. How much of the code do you feel you can understand now?

What concept haven't we talked about yet?
*Hint: it's the `if` statement, which we will talk about in the next lesson.*

In [None]:
# I have a dog named Bonnie
# Bonnie hasn't seen any squirrels yet at the start of the walk
squirrels_seen = 0
print(squirrels_seen)

# 20 minutes into the walk, she has seen 4 people, 1 dog, 1 bunny, and 3 squirrels
squirrels_seen = squirrels_seen + 3

if squirrels_seen > 0:
    print("But I want to run after the squirrels!")
    
bonnie_thoughts = "squirrels! " * squirrels_seen
print(bonnie_thoughts)

## Technical notes

This workshop is taught using Python and Jupyter Notebook on the Interactive Data Analytics Service (IDAS) from ITS - Research Services at the University of Iowa.

Access to this workshop instance (computing environment) will end at **5 p.m. on Monday, October 31, 2022**.

**Please make sure to download all notebooks and files from this workshop instance to your personal computer**. Data are not backed up on this workshop instance. See instructions below to download a notebook from IDAS to your computer.

Current University of Iowa members can request a **free** IDAS account for research use. For more information, see https://wiki.uiowa.edu/display/hpcdocs/Requesting+An+IDAS+Account.

## Download notebooks to your computer

* Save any notebook that is running. Press **Ctrl + s** (Windows) or **Command + s** (Mac).
* In the menu bar in JupyterLab, click **File**, then click **Download** in the drop-down menu.