## Introduction
Welcome to the workshop **Python Fundamentals**, hosted by ITS - Research Services at the University of Iowa.

Instructor: Giang Rudderham

This workshop focuses on the **basic** concepts and features of Python that will serve as building blocks for data analysis tasks.

This notebook will cover **day 2 and part of day 3** of the workshop.

In this notebook we will cover:
* Conditional statements
* Defining functions
* More about strings
* String methods
* Lists
* Technical notes
* Download notebooks


## Conditional statements

**Booleans** are very useful when used together with **conditional statements**: `if`, `elif`, and `else`.

For example, suppose you are working with survey data that include a variable about the respondents' sex assigned at birth. *(Note that sex assigned at birth is often a different variable than gender.)*

It's common for sex to be coded as numeric values following a scheme such as:

    Male -> coded as 0
    Female -> coded as 1
    Respondents refused to disclose sex during the survey -> coded as 9


In the example below, we create a variable called `sex_str` to correspond with the numeric variable `sex`. 

In [None]:
sex = 0

if sex == 0:
    sex_str = "Male"
    
# see what sex_str is, given that sex = 0 
print(sex_str)

Some observations about an `if` statement:
* Notice the `:` after the `if` statement. This is required syntax.
* The body of the code inside the `if` statement is indented. It's standard to use 4 spaces. Jupyter Notebook indents automatically for you.

Use `elif` and `else` to check all scenarios:

In [None]:
sex = 0

if sex == 0:
    sex_str = "Male"
elif sex == 1:
    sex_str = "Female"
elif sex == 9:
    sex_str = "Refused"
else:
    sex_str = "Missing"

# see what sex_str is, given that sex = 0 
print(sex_str)

In the block of conditional statements above, Python checks each `if`, `elif`, and `else` statement in order, starting from the top `if` statement. If a condition is true, the code inside that statement is executed and the remaining statements are skipped. Otherwise, Python moves down to the next statement.

At the end, all remaining cases get put in the `else` category.

`elif` is short for "else if". You can include as many `elif` statements as you want. It's optional to include `elif` and `else` in a block of conditional statements. Only the first statement, `if`, is required.

Similarly to the `if` statement:
* Notice the `:` after each `elif` and `else` statement. This is required syntax.
* The code inside each `elif` and `else` statement is indented. It's standard to use 4 spaces. Jupyter Notebook indents automatically for you.
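
Because the checks run from top to bottom and stop at the first true condition, the order of the conditions can matter. The sketch below illustrates this with a hypothetical variable `age` and made-up cutoffs (these are not part of the survey example above).

```python
# hypothetical example: age and the cutoffs are made up for illustration
age = 70

if age >= 18:
    age_group = "Adult"
elif age >= 65:
    # never reached: any age of 65 or more already passed the check above
    age_group = "Senior"
else:
    age_group = "Minor"

print(age_group)

# put the stricter condition first so that it can be reached
if age >= 65:
    age_group_fixed = "Senior"
elif age >= 18:
    age_group_fixed = "Adult"
else:
    age_group_fixed = "Minor"

print(age_group_fixed)
```

The first block prints `Adult` even though `age` is 70, because the first condition is already true. The second block prints `Senior`.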

## Defining functions

In the last lesson, we learned several built-in functions such as `type()`, `int()`, `float()`, `min()`, `max()`, and `abs()`.

We can also write our own functions in Python. 

Conditional statements come in handy when we write our own functions. Let's continue with the example from the previous section. Suppose you are working with survey data where sex is coded as numeric values following a scheme such as:

    Male -> coded as 0
    Female -> coded as 1
    Respondents refused to disclose sex during the survey -> coded as 9
    
*(Note that sex assigned at birth is often a different variable than gender.)*

In the example below, we create a function called `sex_to_str`, which returns the string that corresponds with a numeric value of sex. 

In [None]:
def sex_to_str(x):
    if x == 0:
        return "Male"
    elif x == 1:
        return "Female"
    elif x == 9:
        return "Refused"
    else:
        return "Missing"
 
# Test the function with a few examples
print(sex_to_str(0))
print(sex_to_str(1))
print(sex_to_str(9))
print(sex_to_str(-12))

The function takes an argument, `x`, which is a numeric code for sex. It returns a string for the numeric code.

Note the following about the syntax:
* To define a function, we start with the keyword `def`, give the function name, and then specify arguments (inputs to the functions) inside the parentheses.
* The `def` statement ends with a `:`.
* The body of code that follows the `:` is indented.
* If we want the function to return a value, we use the `return` keyword.

When we tested the function, we nested the `sex_to_str()` function inside the `print()` function. Python evaluates from the inside out, so `sex_to_str(0)` gets evaluated first, which returns `"Male"`. This value then gets passed to the `print()` function and gets printed out.
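
The inside-out evaluation can also be written as two separate steps. This is a small sketch that repeats the `sex_to_str` definition so the cell runs on its own; the variable name `label` is just an illustrative choice.

```python
def sex_to_str(x):
    if x == 0:
        return "Male"
    elif x == 1:
        return "Female"
    elif x == 9:
        return "Refused"
    else:
        return "Missing"

# step 1: call the function and store the returned value
label = sex_to_str(0)

# step 2: pass the stored value to print()
print(label)

# the nested version does both steps in one line
print(sex_to_str(0))
```

Both `print()` calls show the same result. Storing the returned value in a variable is useful when we want to reuse it later.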

There is a lot more to learn about writing your own functions. For more information, see this section in the official Python tutorial: https://docs.python.org/3/tutorial/controlflow.html#defining-functions

## More about strings

In the last lesson, we learned to create a string in Python, use `\` to escape quotes in strings, use `+` to concatenate strings, and use `*` to repeat strings.

Today we will learn a few more things we can do with strings.

**Count the number of characters in a string**

The built-in function `len()` returns the length of an object. It is very useful and used frequently.

In [None]:
word = "Python"
len(word)

**Indexing** 

We can extract an individual character from strings.

In [None]:
word = "Python"
word[0]

In [None]:
word = "Python"
print(word[0])
print(word[1])
print(word[5])

Note that the first character has index 0.

Index -1 means the last character, -2 means the second-to-last character, and so on.

In [None]:
print(word[-1])
print(word[-2])
print(word[-6])
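
To see how negative and non-negative indices relate, we can combine indexing with `len()`. The sketch below is only an illustration; the variable name `last_matches` is made up for this example.

```python
word = "Python"

# the last character: counting from the end vs. counting from the start
print(word[-1])
print(word[len(word) - 1])

# the first character: index 0 or index -len(word)
print(word[0])
print(word[-len(word)])

# both ways of getting the last character give the same result
last_matches = word[-1] == word[len(word) - 1]
print(last_matches)
```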

**Slicing**

While indexing is used to obtain individual characters, slicing allows you to obtain substrings.

In [None]:
# get letters from position 0 (included) to 2 (excluded)
word = "Python"
word[0:2]

In [None]:
# get letters from position 2 (included) to 6 (excluded)
word = "Python"
word[2:6]

We can omit either the starting point or the ending point. Omitting the starting point means starting at index 0. Omitting the ending point means going to the end of the string.

In [None]:
# from the beginning of the string to position 2 (excluded)
print(word[:2])

# from position 2 (included) to the end of the string
print(word[2:])

We can also use negative indices.

In [None]:
# from the third-to-last character to the end of the string
print(word[-3:])
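
Because the starting point is included and the ending point is excluded, two slices that share a cut point fit back together with no overlap and no gap. A small sketch (the names `first_part`, `second_part`, and `rejoined` are illustrative):

```python
word = "Python"

# split the string at position 2
first_part = word[:2]
second_part = word[2:]
print(first_part)
print(second_part)

# concatenating the two parts gives back the original string
rejoined = first_part + second_part
print(rejoined == word)

# the length of a slice is the ending point minus the starting point
print(len(word[2:6]))
```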

**Strings in Python are immutable**. We can't modify them in place. For example, **the code below will raise an error** (a `TypeError`).

In [None]:
word = "Python"
word[5] = "j"
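
Since a string cannot be changed in place, one way to get a modified version is to build a new string from slices of the old one. The sketch below shows one possible approach; the variable name `new_word` is only for illustration.

```python
word = "Python"

# keep the first five characters and add a new last character
new_word = word[:5] + "j"
print(new_word)

# the original string is unchanged
print(word)
```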

### String methods

Strings in Python have built-in **methods** (functions) that return different transformations of the string. Here are a few common and useful string methods.

* Convert a string to all upper case.

In [None]:
mystr = "Hackberry trees are native to Iowa."
mystr.upper()

Note the following about syntax:
1. We first write the variable `mystr`, then a `.`, then the `upper()` method.
2. We don't need to give any argument (input) to the `upper()` method. Each string method might require 0 or more arguments. **For a list of string methods and the required syntax, see https://docs.python.org/3/library/stdtypes.html#string-methods**

* Convert a string to all lower case.

In [None]:
mystr = "HACKBERRY TREES ARE NATIVE TO IOWA."
mystr.lower()

* Search the string for a specified value and return the position where it was found. If there are multiple matches, `find()` returns only the position of the first one. If there is no match, it returns `-1`.

In [None]:
mystr = "Hackberry trees are native to Iowa."
mystr.find("berry")

In [None]:
mystr = "Hackberry trees are native to Iowa."
mystr.find("blueberry")
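
The value `-1` can be combined with a conditional statement to handle the "not found" case. The sketch below is one way to do this; the variable names `position` and `message` are made up for the example, and `str()` is the built-in function that converts a number to a string so it can be concatenated with `+`.

```python
mystr = "Hackberry trees are native to Iowa."

# find() returns the position of the first match, or -1 if there is no match
position = mystr.find("trees")

if position == -1:
    message = "Not found"
else:
    message = "Found at position " + str(position)

print(message)
```

Try changing `"trees"` to `"blueberry"` to see the other branch run.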

**Note**: String methods return new values but do *not* change the original string. You have to overwrite (assign the new value back to the original variable) if you want to make changes to your string.

In [None]:
mystr = "Hackberry trees are native to Iowa."
print(mystr.upper())

# mystr hasn't been changed
print(mystr)

In [None]:
mystr = "Hackberry trees are native to Iowa."

# overwrite mystr if we want to change it
mystr = mystr.upper()
print(mystr)

### Your turn

1. The string method `index()` is similar to the method `find()`, but it raises an error if a substring is not found. Run the code below.

In [None]:
mystr = "Hackberry trees are native to Iowa."
mystr.index("berry")

Now use the `index()` method to find the substring `blueberry` in `mystr`. **The code should raise an error.**

In [None]:
# Your answer here


2. The method `startswith()` returns `True` if the string starts with a specified substring. Otherwise it returns `False`.
Run the following code.

In [None]:
mystr = "Hackberry trees are native to Iowa."
mystr.startswith("berry")

Modify the code with the `startswith()` method so that it returns `True`.

In [None]:
# Your answer here


3. A similar method is `endswith()`. The method `endswith()` returns `True` if the string ends with a specified substring. Otherwise it returns `False`.

* Create a variable called `mystr` and assign to it the string `Harry - yer a wizard.`
* Write code with `endswith()` to check if `mystr` ends in `wizard` (no period).

In [None]:
# Your answer here


Modify the code with `endswith()` so that it returns `True`.

In [None]:
# Your answer here


4. The string method `split()` splits a string into a **list**. (We will learn about lists in the next section.) Run the code below.

In [None]:
mystr = "Harry - yer a wizard."
mystr.split()

As you can see, whitespace is the separator by default. We can also specify a different separator. Run the code below.

In [None]:
mystr = "Harry - yer a wizard."
mystr.split("-")

Now the returned list has only 2 strings, instead of 5 as before.

Your turn:
* Create a variable called `my_date_str` and assign to it the string `2021-09-16`.
* Use the `split()` method to split `my_date_str` at `-`.

In [None]:
# Your answer here


We can join the elements in a list into a string with the `join()` method. For example:

In [None]:
"/".join(["2021", "09", "16"])

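When the same separator is used, `split()` and `join()` undo each other. The sketch below uses a made-up string of names to illustrate this; the variable names `names_str`, `names_list`, and `rejoined` are only for illustration.

```python
names_str = "Emma,Elinor,Anne"

# split the string into a list at each comma
names_list = names_str.split(",")
print(names_list)

# join the list back into a string with the same separator
rejoined = ",".join(names_list)
print(rejoined)
print(rejoined == names_str)

# join with a different separator
print(" and ".join(names_list))
```

Note that `join()` is called on the separator string, and the list is passed as the argument.
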
## Lists

Lists are used to group together several values. The items in the list can be of the same or different types.

In [None]:
odds = [1, 3, 5, 7]
odds

In [None]:
characters = ["Elizabeth", "Emma", "Elinor", "Marianne", "Anne", "Catherine", "Fanny"]
characters

In [None]:
mixed = [1, "Elizabeth", 5, "Fanny"]
mixed

**Find the length of a list**

In [None]:
odds = [1, 3, 5, 7]
len(odds)

Like strings, lists can be **indexed** and **sliced**. What we learned about indexing and slicing strings can be applied to lists as well.

In [None]:
# indexing returns an item
print(odds[0])

# slicing returns a list
print(odds[0:2])
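
Negative indices and omitted starting or ending points work on lists the same way they do on strings. A short sketch (the variable names `last_item`, `last_two`, and `first_three` are illustrative):

```python
odds = [1, 3, 5, 7]

# negative index: the last item
last_item = odds[-1]
print(last_item)

# from the second-to-last item to the end of the list
last_two = odds[-2:]
print(last_two)

# from the beginning of the list to position 3 (excluded)
first_three = odds[:3]
print(first_three)
```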

**Concatenate lists**

In [None]:
odds = odds + [9, 11, 13]
odds

Lists are **mutable**, which means we can change their content.

In [None]:
odds = [1, 3, 5, 7]
odds[2] = 6
odds

In [None]:
odds = [1, 3, 5, 7]
odds[2:4] = [15, 17]
odds

In [None]:
characters = ["Elizabeth", "Emma", "Elinor", "Marianne", "Anne", "Catherine", "Fanny"]

# replace items in position 0 (included) to 2 (excluded)
characters[0:2] = ["Darcy", "Knightley"]
print(characters)

In [None]:
characters = ["Elizabeth", "Emma", "Elinor", "Marianne", "Anne", "Catherine", "Fanny"]

# remove items in position 3 (included) to 5 (excluded)
characters[3:5] = []
print(characters)

In [None]:
characters = ["Elizabeth", "Emma", "Elinor", "Marianne", "Anne", "Catherine", "Fanny"]

# clear the list by replacing all the elements with an empty list
characters[:] = []
print(characters)
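
The replacement list does not need to have the same length as the slice it replaces, so the list can grow as well as shrink. A small sketch using a shorter, made-up version of the `characters` list:

```python
characters = ["Elizabeth", "Emma", "Elinor"]

# replace the one item in position 1 with two items
characters[1:2] = ["Knightley", "Harriet"]
print(characters)

# the list now has 4 items instead of 3
print(len(characters))
```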

**Nested lists**

We can create a list that contains other lists.

In [None]:
odds = [1, 3, 5, 7]
characters = ["Elizabeth", "Emma", "Fanny"]
mixed = [odds, characters]
print(mixed)

In [None]:
# get the second item in the list mixed
print(mixed[1])

In [None]:
# get the second item in the second item in the list mixed
print(mixed[1][1])
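
Everything we learned about `len()` and indexing can be chained on nested lists, one level at a time. The sketch below repeats the lists from above so the cell runs on its own.

```python
odds = [1, 3, 5, 7]
characters = ["Elizabeth", "Emma", "Fanny"]
mixed = [odds, characters]

# mixed has 2 items, because each inner list counts as one item
print(len(mixed))

# the first inner list has 4 items
print(len(mixed[0]))

# the last item of the first inner list
print(mixed[0][-1])

# index into a string inside the nested list: the first letter of "Emma"
print(mixed[1][1][0])
```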

### List methods

Like strings, lists also have **methods**.

Add a value to the end of the list with `append()`:

In [None]:
odds = [1, 3, 5, 7]
odds.append(15)
odds
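
Unlike string methods, which return a new string, `append()` changes the list itself and returns `None`. The sketch below shows why we should not assign the result of `append()` back to the list; the variable name `result` is only for illustration.

```python
odds = [1, 3, 5, 7]

# append() changes odds itself, so no reassignment is needed
odds.append(9)
print(odds)

# append() returns None, not the updated list
result = odds.append(11)
print(result)

# odds was still updated by the second append()
print(odds)
```

Writing `odds = odds.append(9)` would therefore replace the list with `None`.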

We will learn more in the next lesson!

## Technical notes

This workshop is taught using Python and Jupyter Notebook on the Interactive Data Analytics Service (IDAS) from ITS - Research Services at the University of Iowa.

Access to this workshop instance (computing environment) will end at **5 p.m. on Monday, October 31, 2022**.

**Please make sure to download all notebooks and files from this workshop instance to your personal computer**. Data are not backed up on this workshop instance. See instructions below to download a notebook from IDAS to your computer.

Current University of Iowa members can request a **free** IDAS account for research use. For more information, see https://wiki.uiowa.edu/display/hpcdocs/Requesting+An+IDAS+Account.

## Download notebooks to your computer

* Save any notebook that is running. Press **Ctrl + s** (Windows) or **Command + s** (Mac).
* In the menu bar in JupyterLab, click **File**, then click **Download** in the drop-down menu.