# Introduction to Python
MiCM Workshop - February 20, 2024

Benjamin Z. Rudski, PhD Candidate, Quantitative Life Sciences, McGill University

Dear `Reader | Workshop Attendee`,  
Welcome! In this interactive Jupyter Notebook, I will introduce you to the [Python](https://www.python.org) programming language. In this journey, we'll go from understanding the basics of computers to creating variables, using functions and working with packages.

This notebook is the **solution version**, which contains the answers. There is also **student version** in the [`code`](../code/) folder. The **student version** contains several blanks where I will write code during the live workshop and where you can fill out exercises. I recommend trying the exercises yourself before looking at the solutions, There is often more than one way to answer a programming question, so you should focus more on understanding the code that you are writing, instead of just copying my answers. You may come up with an answer better than the one I've provided!

Here's the outline of this workshop:

1.	Module 1 – Introduction to Programming (30 minutes)
    1.	Basic Concepts and Definitions
        1.	What is a Computer?
        2.	What is a Program?
        3.	What are Programming Languages?
    2.	Welcome to Python
        1.	What is Python?
        2.	How to Install Python
        3.	Tools for Using Python
2.	Module 2 – Python Basics (1 hour, 15 minutes)
    1.	Foundations of Python - A Brief Overview of Types and Variables
        1.	Primitive Data Types (int, float, bool, string)
        2.	Variables
        3.	Collection Data Types (tuples, lists, dictionaries)
        4.	Introduction to Functions (Function as a Machine)
    2.	Numbers and Comparisons
        1.	Mathematical Operations
        2.	Integers and Floating-Point Numbers
        3.	Booleans
    3.	Intro to Control Flow and Loops (if, while and for)
        1.	Control Flow: the if Statement
        2.	while Loops
        3.	Iteration with for Loops
    4.	Exercise: Numbers and Loops for Unit Conversion
3.	Module 3 – Strings and Collections: An Object Primer (1 hour)
    1.	Introducing the String!
        1.	String Slicing
        2.	String Methods (concatenation and string formatting, converting strings to numbers, find and replace)
        3.	String Exercise: DNA Processing
    2.	Introduction to Tuples, Lists and Dictionaries
        1.	Tuples and Tuple Unpacking
        2.	Lists and List Methods (adding, removing, slicing)
        3.	Dictionaries (Key-Value storage, accessing, adding, removing)
    3.	Exercise: Working with Strings and Collections for DNA and Protein Processing
4.	Module 4 – Modules and Packages (40 minutes)
    1.	Using Modules
        1.	What is a Module?
        2.	Importing a Module
        3.	Importing Specific Functions
    2.	Package Management
        1.	What is a Package?
        2.	Installing Packages using conda 
        3.	Installing Packages using pip
        4.	Other Installation Tips
        5.	Using Packages and Reading Documentation
    3.	Exercise: importing a module from the standard library and using its functions.
5.	Module 5 – Where to go from here (10 minutes)
    1.	What to learn next? How?
    2.	How to get help and how not to get help?
        1.	Your code editor
        2.	Documentation
        3.	Books
        4.	Tutorials
        5.	Stack Overflow (and pitfalls)
        6.	ChatGPT (and pitfalls)
    3.	Other cool programming topics (to mention, not to cover)
        1.	Writing packages
        2.	Object-Oriented Programming
        3.	Developing Graphical User Interfaces
        4.	Hosting projects on GitHub

When this workshop is over, you should be able to write simple Python scripts. More importantly, I am hoping to give you *the tools* so that you can learn new Python skills and read package documentation to find what you need. **In my opinion, the most important part of programming is knowing how to get help when you need it.**

# Module 1 - Introduction to Programming

Welcome! In this first module, we won't be writing any code, but we'll see a bit about what computers are and how we can tell them to perform different tasks. Here's the outline for this module:

1.	Basic Concepts and Definitions
    1.	What is a Computer?
    2.	What is a Program?
    3.	What are Programming Languages?
2.	Welcome to Python
    1.	What is Python?
    2.	How to Install Python
    3.	Tools for Using Python

## Basic Concepts and Definitions

This section deals with the big questions in life! What's a computer? What are we doing here? How are we doing it? What is the meaning of life??? Ok, maybe not that one... But still, here, we'll see the basic motivation for what we're doing.

### What is a Computer? What is Programming?
In this section, we'll briefly see what a computer is and how programming helps us do what we want to do with it.

This is the inside of a computer:

![Computer inside](../assets/Dell_G5_5000_motherboard.jpg)

**Image credit:** [Dell G5 5000 motherboard.jpg](https://commons.wikimedia.org/wiki/File:Dell_G5_5000_motherboard.jpg), by [Project Kei](https://commons.wikimedia.org/wiki/User:Keita.Honda), licensed under the Creative Commons [Attribution-Share Alike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/deed.en) license.

For our intents and purposes, a computer is a machine that has **two parts**:
1. RAM (memory)
2. CPU (processor)

These parts each do very specific tasks:
* The **memory** stores information that we want to process.
* The **processor** performs operations on data, using inputs to produce an output.

In reality, computers are much more complicated that these two parts, but these are the most important for us. 

### What is a Program?

We have this hardware... but what can we do with it? We must provide it with a set of **instructions** to tell the hardware what to do. These instructions are a *program*. The job of *programming* involves writing instructions that tell the computer what *operations* the CPU should perform, and which *data* it should operate on.

A program is a **text file**. That's it. Well, actually, not exactly, but that's part of it. We'll discuss more about this later.

But... how are these instructions actually written? Introducing...

### What are Programming Languages?

**Programming languages** provide the rules for writing programs.

Who are programming languages designed to help: you (the programmer) or the computer?

Let's pause to think about it...

The answer: **you**

A program contains code that **you** write using a programming language. The computer only sees this as text. That's it. There's absolutely nothing special about the file. The computer doesn't run this text file. The computer has to turn this file into something that it can understand.

Let's make a biology analogy. Let's think about DNA, RNA and Proteins. Let's ignore non-coding RNA and ribozymes and let's just focus on the classical paradigm of the central dogma. DNA encodes instructions to build proteins. But, the DNA itself doesn't do the same function as the protein. It only tells the cell how to make the protein. The DNA must be transcribed to RNA and then **translated** using the ribosome to form a functional protein. 

Well, computers are similar! The program is a text file containing instructions. **You** write code using a programming language, like Python, C++, Java, Kotlin, Swift, etc. All the computer sees is a text file. The computer must process it before it can run it.  The computer then needs to turn these instruction that are *human-readable* into instructions that are *machine-readable*. This is done either using an **interpreter** that runs code line-by-line, for languages like Python, or a **compiler** that process also the code, for languages like C++. 

## Welcome to Python

And now, for our major focus: Introducing Python!

### What is Python?
For more on the history, see: https://en.wikipedia.org/wiki/History_of_Python

Python was introduced by Guido Van Rossum in 1991. It has a number of features:
* Free and open source
* Interpreted language
* Object-oriented language

#### Free and Open Source
Being a free and open source language, anyone can download and use Python, but also more! Anyone can distribute Python and even modify Python, or contribute to its development. Python is developed by the **community** that uses it.

#### Interpreted
Python is an *interpreted* language, **not** a *compiled* language. This means that the entire program file is not translated into machine code before it is run. Instead, only parts of the program are translated when they need to be. This means that when you make a change to one part of a program, you don't need to rebuild your entire project. You often just need to restart your interpreter. This also means that you can open Python in your terminal, type in a single line, and it will work!

#### Object-Oriented
In Python, we use **objects** to represent things. Yeah... I know that doesn't really help. So, objects are a way of representing something, and they group together **data** about that thing (in the form of *attributes*) and **manipulations** that you can do with that thing (in the form of *methods*). Lots of things in Python are represented using these objects from strings of text to lists and you can even make your own (we won't cover that today).

### How to install Python

You've probably done this in the setup for the workshop. There are a number of ways to download Python. If you're on macOS or Linux, lucky you! You might already have Python installed. To see if you do, just open up a terminal window (on macOS, it should be under `Applications > Utilities > Terminal` and on Linux you may be able to just press `Ctrl+Alt+T` to open a terminal window). Once you're there, type `which python`. If you see something that isn't an error message, you already have python. You may also have to try `which python3`.

Now, if you don't have Python installed, you can install it from Python's website at https://www.python.org or get it from the Windows Store or your local software repository on Linux. But, there's another way that is quite helpful: using a Python distribution such as **Anaconda** or **miniconda**. 

Why use Anaconda? Well, it comes with **many** pre-installed packages which are very helpful in science, such as NumPy, SciPy, Matplotlib, Pandas, and more! It's a bit of a big download and the install takes a bit of time, but it is definitely worth it. To get Anaconda, just go to https://www.anaconda.com/ and hit the big green `Download` button. There is a graphical installer for Windows and macOS and a text-based installer for Windows, macOS and Linux. If you don't want to perform a 600 MB download, you can opt for miniconda instead. Go https://docs.conda.io/en/latest/miniconda.html and click on one of the download links. 

Unlike Anaconda, miniconda doesn't come with the packages pre-installed, but it provides you with the `conda` tool to help you install them. We'll discuss Packages more later.

Finally, if you don't want to install Python, then good news! If you have a Google Account, then you can use Python on the web. I'll discuss this more very soon.

### Tools for Programming in Python

There are many tools out there for programming in Python:
* **Jupyter Notebooks:** Let's start with the tool that we're using now! This tool lets us combine code, explanations and figures. This is really good if you want to share your code with extra details. With a Google Account, you can use Jupyter notebooks remotely via **Google Colab**.
* **`python` shell:** This is the most basic way of running a script in the command line or using an interpreter to run one line at a time.
* **`ipython` shell:** Similar to the regular Python shell, but with better auto-complete and syntax highlighting.
* **Microsoft Visual Studio Code:** code editor that can also be used for debugging and running Jupyter notebooks. Python extension necessary.
* **Spyder:** Fully-fledged integrated development environment (IDE). Write and debug code, view figures.
* **PyCharm:** Fully-fledged IDE developed by JetBrains. This tool is great for working on larger Python projects. Community edition is open-source.

![vscode](../assets/vscode.png)  
A familiar Jupyter notebook opened in Microsoft Visual Studio Code.

![pycharm](../assets/pycharm.png)  
Working with a Python file in PyCharm Community Edition.

![spyder](../assets/spyder.png)
A sample Spyder window. This may look familiar if you're used to working with MATLAB or RStudio.

## Module Summary

We've made it to the end of this first module! Here are the main points that we covered:

* **Computers** perform a variety of tasks that involve **performing operations** on **data**.
* **Programs** give the computer a set of instructions to follow to perform this data processing.
* We write programs using a **programming language** that defines the syntax for writing these instructions.
* **Python** is a programming language that is **open source, interpreted and object-oriented**.
* We can use a variety of **tools** to program in Python.


That wasn't so bad! Well, now the coding begins.

# Module 2 - Python Basics

In this section, we'll see the basic, foundational concepts of programming in Python. We'll start with the basics of mathematical operations and we'll see variables for storing data. Then, we'll also start seeing how to get things done in Python. Along the way, I'll also point out possible places where users of different programming languages need to pay special attention.

**Topics:**

1.	Foundations of Python - A Brief Overview of Types and Variables
    1.	Primitive Data Types (int, float, bool, string)
    2.	Variables
    3.	Collection Data Types (tuples, lists, dictionaries)
    4.	Introduction to Functions (Function as a Machine)
2.	Numbers and Comparisons
    1.	Mathematical Operations
    2.	Integers and Floating-Point Numbers
    3.	Booleans
3.	Intro to Control Flow and Loops (if, while and for)
    1.	Control Flow: the if Statement
    2.	while Loops
    3.	Iteration with for Loops
4.	Exercise: Numbers and Loops for Unit Conversion

## Foundations of Python

This section explores the most basic ways that we store small amounts of data. We'll see what types of data we can store and how we can combine pieces of data.

But first! It's conventional that the first program we write in a new language is the "Hello, World!" program. This is a simple program that writes the text "Hello, World!" to the screen. In Python, it's quite easy to do:

In [99]:
# Your exciting first line of Python code here!
# In this line of code, we'll display, or print, the text string "Hello, World!"

print("Hello, World!")

Hello, World!


This very simple program introduces a few important ideas. You'll notice that the first line doesn't really look like code. Actually, it's not! The first line is a **comment**. We (and the computer) know this because the line starts with the symbol `#`. Python ignores that symbol and everything that comes after it, letting you write notes about what your code is doing. It's very important to put comments in your code, especially if you're going to need to come back to it after a few weeks or if you're going to share it with other people.

On the second line, we have two things:
* the `print` *function*
* the *string* "Hello, World!"

The `print` function displays output to the screen or to the console. While it's not necessary in this Jupyter notebook (which automatically outputs the result of the last line of code), it's very helpful if you're ever writing code in a different program, like *PyCharm* or *Spyder*. We'll discuss functions in more detail later, but the idea is that functions take inputs, known as **arguments**, do operations on them and optionally return some sort of modified result. Here, the `print` function takes in the **string** of text "Hello, World!", writes it onto the screen and doesn't return any new data.

The **argument** that we pass to this function is the text **string** `"Hello, World!"`. We'll discuss strings in more detail later. The important thing is that a string is a group of characters surrounded by quotation marks (either single quotes `'Hello'` or double quotes `"Hello"`).

Don't worry if this doesn't make sense! We'll explain each part of it as we go along. By the end of this workshop, you'll understand what this line does!

Now that we've passed this important milestone, let's dive into basic Python!

### Primitive Data Types

Working with the computer is all about processing information. Yay! That's great! Except... what does this information look like? In Python, information can be stored in many different **types**. These types represent numbers, text, and more! Let's start by discussing the **building blocks of everything**, known as the **primitive types**.

In Python, there are four different **primitive types** (see [here](https://realpython.com/python-data-types/)):

1. Integers - `int`
2. Decimal numbers - `float`
3. True or false Boolean values - `bool`
4. Text strings - `str`

Let's take a look at each of these in a bit more detail. We can get the type of a value using the `type` function.

#### Integers

No big surprise, integers are whole numbers without a decimal. We write them... well, as a whole number without a decimal. For example, to represent the number 4 in Python, we'd write:


In [100]:
# Your code here to represent the number 4
4

4

You'll notice some placeholder text that I've written in the box above. Remember, this is a **comment**. Python ignores everything on a line that comes after the `#` symbol. This sort of text is designed to help anyone reading your code understand what it's doing.

There's an extra trick if you're working with really big numbers. To avoid confusion, you can add underscores to make the number easier to read:

In [101]:
# Your code here for a really big number
5_192_435_235

5192435235

We'll see what we can do with these numbers soon.

#### Decimal Numbers

In programming, a decimal number is known as a **floating-point number** or a **`float`**. These numbers are, unsurprisingly, written as numbers with decimals:

In [102]:
# Your code here to write 4 as a float
4.0

4.0

Another way of writing floating-point numbers is to use **scientific notation**. Let's see an example:

In [103]:
# Your code here to write 5 * 10^3 in scientific notation
5e3

5000.0

Note that even if we don't have a decimal here, our value is a floating point number. We can check this using the `type` function:

In [104]:
# Your code here to check the type of what we wrote before.
type(5e3)

float

We'll see more operations on these values very soon.

**Note:** For anyone who has used C, Java or Swift (or many other languages), Python does ***not*** have a separate `double` type. Python also doesn't have type modifiers like `long` or `unsigned`. If you wind up working with NumPy, then you may have to think about different types of integers and floating point numbers (but, we won't talk about that today).

#### Boolean Values

A **Boolean** represents one of two states: True or False. In Python, these values are represented using the names `True` and `False`.

In [105]:
# Your code here to show a Boolean value
True

True

These may seem very simple, but we'll soon see that Booleans are **extremely important**.

#### Strings

In programming, we refer to text as a **string**. When writing a string, we **must** put it inside quotation marks. In Python, these can be single quotation marks (''), or double quotation marks (""). The important thing is to use the same quotation mark to open and close your string. Let's see some examples:

In [106]:
# Your code here for working with strings.
"Hello, Python!"

'Hello, Python!'

We'll see that there's a lot more that we can do with these strings. Let's say you want to have a long string that spans multiple lines and you want to keep the line breaks. Well, you can use a *triple-quoted* string to do just this.

In [107]:
# Your code here for example of triple-quoted string.
"""
This is my long
triple-quoted
string over
many, many,
many, many,
many, many,
lines!
"""

'\nThis is my long\ntriple-quoted\nstring over\nmany, many,\nmany, many,\nmany, many,\nlines!\n'

We have a whole section on strings coming up! So, stay tuned!

**Note:** It is ***super important*** to remember to put quotation marks around your string. Otherwise, Python will get mad at you.

### Variables

So, we've seen basic data types, but they're not really useful if we can't store the values. To store data, we use **variables**. A **variable** gives a *name* to a piece of data stored in memory so that you can easily access it later. The information stored in a variable can change (or **vary**).

**Note:** Python has no constants. Only variables. If you come from a language that has constants, I'm sorry.

#### Variable Names
There are rules for naming a variable:
* Variable names are **case-sensitive**.
* A variable name must contain only letters, numbers and underscores.
* A variable name cannot start with a number.
* A variable name cannot be the same as a reserved word in Python (see [here](https://docs.python.org/3/reference/lexical_analysis.html#keywords) for list).

A variable name may consist of multiple words combined. There are a few different conventions for putting words together. Two common ones are known as `snake_case` and `camelCase`:
* In `snake_case`, all letters are lowercase and words are separated by underscores.
* In `camelCase`, different words are combined with no spaces, and the first letter of a new word is put as a capital.

Different people use different conventions. Your code editor may suggest one over the other (for example, PyCharm prefers `snake_case`). The choice depends on your project setup and any existing code you may be adding to.

**Notes:** 

* Although you can combine words together, try to keep variable names reasonably concise.
* Although Python has no constants, `ALL_CAPS_NAMES` are sometimes used to denote variables that shouldn't change.
* Variable names *can* start with underscores, but this often has a special meaning.

Now, to test your skills, find the incorrect variable names and what the problems are:
* `my_variable12.3`
* `-myVariableName2`
* `@myVariable`
* `my-variable&`
* `my+variable`
* `23variable`
* `my_variable2`
* `myVariable`
* `myV#ariable`
* `my_variaBle_32`
* `import`
* `my_import`

#### Variable Assignment

The way that we assign a variable is easy. We just use the `=` sign. That's it. We can also change the value of a variable by just assigning a new value using the equal sign (and so, the value **varies**).

Now, let's do a few examples of variable assignment. Here, we'll make use of the `print` function to track the value of the variables.

1. **Assignment**  
 Let's create a variable called `my_variable` with the value `42`.

In [108]:
# Your code here for variable assignment
my_variable = 42

my_variable

42

2. **Reassignment**  
 We can easily reassign a value using the equal sign again. Let's re-assign our variable `my_variable` to have the value `16`.

In [109]:
# Your code here for variable reassignment
my_variable = 16

my_variable

16

3. **Changing type**  
 There's no requirement for the new value to be of the same type as the original. Let's assign the string `"Hello"` to our variable `my_variable`:

In [110]:
# Your code here to assign a string
my_variable = "Hello"

my_variable

'Hello'

### Collection Types

So, we've seen how to store individual numbers and bits of text. Well, let's say we want to store a lot of values. If we have a small number, we can just create a bunch of variables:

In [111]:
# Your code here to create three variables with numbers
number1 = 3
number2 = 5
number3 = 16

Ok... so, this is a bit hard to manage for even small numbers of variables. Instead of working with multiple variables, we use **collection types**. A **collection** stores *multiple* values. We'll see three main types of collections:

1. Tuples - store a small fixed number of values.
2. Lists - store multiple values; can add or remove elements.
3. Dictionaries - store multiple values based on keys; can add or remove elements.

We'll go into more detail about each later, but first let's see some examples. Each collection type has specific notation:

1. Tuples - Values separated by commas between parentheses `(a, b)`
2. Lists - Values separated by commas between square brackets `[a, b]`
3. Dictionaries - Keys-Value pair defined by colons, separated by commas between brace brackets `{a:b, c:d}`

Confused? Let's see some examples.

First, a tuple example:

In [112]:
# Your code here to create a tuple
my_tuple = (3, 5, 16)

my_tuple

(3, 5, 16)

Now, let's see a list:

In [113]:
# Your code here to create a list
my_list = [3, 5, 16]

my_list

[3, 5, 16]

And now a dictionary:

In [114]:
# Your code here to create a dictionary
my_dictionary = {"Milk": 3, "Apples": 5, "Eggs": 16}

my_dictionary

{'Milk': 3, 'Apples': 5, 'Eggs': 16}

We'll see all of these in **much** more detail later. For now, what's important is that you know they exist. We'll see how to use them later.

### Introduction to Functions

So far, we've seen how to store data... but our code hasn't actually done anything. We'll see soon how to start writing code that makes decisions and does calculations. But first, let's talk about **functions**.

We can think of a **function** as a sort of machine. It takes **inputs**, does some sort of operation on them, and produces **outputs**. People often represent functions as a little black box.

![black box](../assets/Function.png)

When we use a function, this is known as **calling** the function. When we *call a function*, we tell it what data to use to perform the operations and where to store the result (typically in a variable). The syntax to call a function and store its result in a variable `x` is as follows:

```python
    x = function_name(arguments_here)
```

We've actually already seen a function! We used the `print` function a while ago. This function takes a string that we want to display, shows it on the screen and doesn't return any output. Let's call this function:

In [115]:
# Your code here to call the `print` function
print("Hello!!!!")

Hello!!!!


Python has other built-in functions available. There's a list available at [this link](https://docs.python.org/3/library/functions.html). For example, we can use the `abs` function to take the absolute value of a number:

In [116]:
# Your code here to call the absolute value function on some input
my_number = -5

# Call the absolute value function and store the result in a new variable
my_abs_number = abs(my_number)

# Display that new variable
my_abs_number

5

Functions can also take multiple inputs. For example, we can round numbers using the `round` function, described [here](https://docs.python.org/3/library/functions.html#round):

In [2]:
# Your code here to call the round function on a decimal number 2.95 to 1 decimal place.
my_number = 2.95

round(my_number, ndigits=1)

3.0

Here, `ndigits` is an optional **keyword argument**. Functions often have many of these, which have default values. To specify what value a keyword argument should take, you simply write it like you would a variable assignment. In this case, since the keyword argument is called `ndigits` in the documentation, and we want to set it to `1`, in the function call, we must write `ndigits=1`.

We'll talk a bit more about functions later. The important thing to remember about functions is that they are **defined packets of behaviour** that **encapsulate** certain operations. We call them on inputs and they produce outputs without us having to worry about the internal details.

## Numbers and Comparisons

We've seen how we can store data in variables. But just storing the data is boring! In this section, we'll start talking about things that we can *do* with the data. Specifically, we'll see operations that we can do on numbers and Booleans.

### Mathematical Operations

Python gives users the ability to perform simple mathematical operations on numbers. The following operations that you know very well can be easily done:
* **Addition** is performed using the `+` operator
* **Subtraction** is performed using the `-` operator
* **Multiplication** is performed using the `*` operator
* **Division** is performed using the `/` operator (does not round)

Python offers a few other operations as well:
* **Exponents** can be taken using the `**` operator (**NOT** `^`)
* **Modulus** (remainder) can be taken using the `%` operator (**Warning:** for anyone who uses MATLAB, this is **not** a comment!)
* **Integer division** (dividing and rounding down) can be performed using the `//` operator (**Warning:** for anyone who knows Java or C or any number of other languages, this is **not** a comment in Python!)

To perform a basic mathematical operation, all you need to do is type in the numbers, along with the operator, in the same way that you'd write the expression on paper. For example, to add 5 and 4, we would write the following:

In [117]:
# Put your code here
5 + 4

9

We can also chain operations together. Remember that the rules of **BEMDAS** apply. Let's do an example to show this. 

Write code that computes and prints the following results: $4+5\times 3$ and $(4+5) \times 3$. 

**Hint:** Remember the `print` function from above and use it to show the result of two different calculations in the same Jupyter notebook cell.

In [118]:
# Put your code here
print(4 + 5 * 3)
print((4 + 5) * 3)

19
27


These examples contained integers, known in Python as `int`s. We can also do calculations that involve decimal numbers, known as **floating point numbers** or simply, `float`s. We can also mix the two different types of numbers.

Now, it's your turn! Write code to perform the following calculations:
* $3\times 4 - 6 \div 2$
* $(3.23 + 5.2) \times 4.3^2$
* $\textup{floor}(\frac{5}{2}) \times (6 \mod 4)$

In [119]:
# Put your code here

# Example 1
ans1 = 3 * 4 - 6 / 2 
print("The first answer is", ans1)

# Example 2
ans2 = (3.23 + 5.2) * (4.3 ** 2)
print("The second answer is", ans2)

# Example 3
ans3 = 5 // 2 * (6 % 4)
print("The third answer is", ans3)

The first answer is 9.0
The second answer is 155.87069999999997
The third answer is 4


These rules don't only apply when working with numbers. We can also plug in variables that hold numeric values.

For example, let's set `a=5`, `b=4`, `c=2`. Let's compute the following:

* $a\times b \div c$
* $(a + b) \mod c$

In [120]:
# Your code here
a = 5
b = 4
c = 2

# Example 1
ans1 = a * b / c
print("First answer:", ans1)

# Example 2
ans2 = (a + b) % c
print("Second answer:", ans2)

First answer: 10.0
Second answer: 1


We just saw that we can use variables in our calculations. We can also easily assign the results of calculations to variables. Let's some variable assignment. 

Let's create a variable called `my_variable` with the value `35`. Let's then multiply it by `2` and store this result in the same `my_variable` variable.

In [121]:
# Your code here
my_variable = 35

# Multiply by 2 and store in same variable
my_variable = my_variable * 2

# Show the result
my_variable

70

This assignment looked a bit bulky! For some of these operations, we have a shortcut so that we don't have to rewrite the variable name twice. For each operation, we can use a new assignment operator:
* We replace assignment and `+` with `+=`
* We replace assignment and `-` with `-=`
* We replace assignment and `*` with `*=`
* We replace assignment and `/` with `/=`
* We replace assignment and `**` with `**=`
* We replace assignment and `%` with `%=`
* We replace assignment and `//` with `//=`

So, we can rewrite the last example we did:

In [122]:
# Your code here
my_variable = 35

# Use the assignment operator
my_variable *= 2

# Show the result
my_variable

70

For more information on `int`s and `float`s and the numeric types in Python, see [this page](https://docs.python.org/3/library/stdtypes.html#typesnumeric) from the official Python documentation.

### Booleans

A **boolean** represents a value that is either `True` or `False`. In this section, we'll see how to generate them, and then we'll see fun things we can do with them!

#### Comparisons

Think back to when you were starting to learn math... What was one of the first things they taught you? For me, it was **comparisons** and **inequalities**. We had two numbers, and we had to put the correct sign, `>,<,=` in between (some of you were maybe also told to think of a crocodile opening its mouth to the bigger number...).

Well, this is an important idea in programming too! We can use the following operations to generate boolean values. Let's say that `a` and `b` are both numbers (either `int`s or `float`s):
* `a > b` -- **greater than**, evaluates to `True` if `a` is bigger than `b`, otherwise evaluates to `False`
* `a >= b`-- **greater than or equal to**
* `a < b` -- **less than**
* `a <= b` -- **less than or equal to**
* `a == b` -- **equal** -- ***NOTE:*** there are ***TWO*** equal signs!!!!!
* `a != b` -- **not equal**

Again, I want to emphasize that for the equals comparison, you must must must put two equal signs `==`! Otherwise, Python will think you're trying to assign a variable and it will get mad at you and give you an error!

Also, for `>=` and `<=`, the order of the two signs matters! Do **NOT** write `=>` or `=<`! If you forget, remember that the order is the same as we read it. **Less that or equal to** is first *less than*, so `<` and then *equal to*, so `=`, so the order is `<=`.

Now, let's see some examples:

In [123]:
# Your code here
a = 92
b = 43

# Complete these lines: # Your code here
print("a is greater than b:", a > b)
print("a is less than b:", a < b)
print("a is equal to b:", a == b)
print("a is not equal to b:", a != b)
print("a is greater than or equal to b:", a >= b)
print("a is less than or equal to b:", a <= b)

a is greater than b: True
a is less than b: False
a is equal to b: False
a is not equal to b: True
a is greater than or equal to b: True
a is less than or equal to b: False


Feel free to change the values of `a` and `b` and see how the output changes!

These operations don't only work on numbers! We can use `==` and `!=` on just about any other data. Let's see some examples on strings:

In [124]:
# Your code here for string comparisons
user_password = "Password"
actual_password = "passWorD"

user_password == actual_password

False

In [125]:
# Your code here for string comparisons
user_password = "Password"
actual_password = "Password"

user_password == actual_password

True

These types of comparisons are very important. We'll see why in a bit... But first, let's see some other cool things we can do with Booleans.

#### Boolean Operations

We've seen how to generate booleans using numbers and strings. We can also perform operations on booleans to get... more booleans! These three operations are **logical operations**:
* `and`
* `or`
* `not`

#### The `and` operation
The `and` operation takes **two** boolean values `a` and `b`. If **both** `a` and `b` are `True`, then `a and b` is also `True`. Otherwise, `a and b` is `False`. People coming from other programming languages may know `and` as `&&` or `&`. We can represent this operation using a **truth table**:

| `a` | `b` | `a and b` |
| --- | --- | --- |
| `False` | `False` | `False` |
| `False` | `True` | `False` |
| `True` | `False` | `False` |
| `True` | `True` | `True` |

Let's also see some examples:


In [126]:
# Your code here for a simple example

True and False

False

In practice, you'll often work with Booleans that you've generated using comparisons. Now, let's a more complicated example. Let's set `a=4`, `b=5` and `c=6` and evaluate `(a < b) and (c > b)`:

In [127]:
# Your code here for a more complicated example
a = 4
b = 5
c = 6

# Perform the operation
(a < b) and (c > b)

True

Let's think about that last example: we have `a=4`, `b=5`, `c=6`. We're looking at the logical expression
```python
a < b and c > b
```

So, we start by breaking it up into the two parts:
* `a < b`
* `c > b`

Now, we look at each part separately:
* `a < b`: well, we have `a=4` and `b=5`, so we have `4 < 5`, which is `True`
* `c > b`: we have `c=6` and `b=5`, so we have `6 > 5`, which is `True`

Now, we can put these two back together: for `a < b and c > b` both the left and the right are `True`, which makes the whole expression `True`!

#### The `or` operation

The `or` operation also takes **two** boolean values `a` and `b`, but it evaluates to `True` if **at least one** of `a` or `b` is `True`. If both values are `False`, then `a or b` is `False`. Otherwise, `a or b` is `True`. In other programming languages, the `or` operation is represented as `a || b` or `a | b`.

To help visualise, here's the truth table:

| `a` | `b` | `a or b` |
| --- | --- | --- |
| `False` | `False` | `False` |
| `False` | `True` | `True` |
| `True` | `False` | `True` |
| `True` | `True` | `True` |

Now, let's do some examples

In [128]:
# Your code here for a simple example
True or False

True

Once again, you'll often work directly with numbers and comparisons. Let's do an example where we have `a = 5`, `b = 6`, `c = 7` and let's evaluate `a > b or c > b`:

In [129]:
# Your code here for a numeric example:
a = 5
b = 6
c = 7

# Perform the comparisons and logical operation
a > b or c > b

True

Let's go through that last example. We have `a=5`, `b=6`, `c=7`. Let's again break up our expression into two parts:
* `a > b`
* `c > b`

Let's look at each one:
* `a > b` --> `5 > 6` --> `False`
* `c > b` --> `7 > 6` --> `True`

Since at least one of the two boolean values is `True`, then `a > b or c > b` is `True`.

We don't have to use comparisons of variables of all the same type. Let's set `n1 = 5`, `n2 = 6` and `password = "Hello"`. Let's now check to see if the product of the two numbers is less than 28 **or** the password is equal to `"World"`:

In [130]:
# Your code here for a more complicated example:

# Define our variables
n1 = 5
n2 = 6
password = "Hello"

# Perform our comparisons
(n1 * n2 < 30) or password == "World"

False

Let's now change the password to `"World"` and try again:

In [131]:
# Your code here...

# Define our variables
n1 = 5
n2 = 6
password = "World"

# Perform our comparisons
(n1 * n2 < 30) or password == "World"

True

Hopefully, you're starting to see that we can use these booleans to make decisions. We'll come back to this idea **really soon**.

#### The `not` operation

The `not` operation only takes in **one** boolean value `a` and flips its value. If `a` is `True`, then `not a` is `False` and if `a` is `False`, then `not a` is `True`. In other languages, it may be represented by `!a` or `~a`.

Here's the truth table:

| `a` | `not a` |
| --- | --- |
|`False` | `True`|
|`True` | `False` |

The easy way to understand it is that it's opposite day! When you add the `not` operator, everything that is usually `True` becomes `False` and everything that is usually `False` becomes `True`. And here are a couple of examples:

In [132]:
# Your code here for a simple example
not True

False

As usual, we're not often going to work directly with Booleans. Let's do a numeric example. Let's set `a=6` and `b=8`. Let's evaluate `not a > b`:

In [133]:
# Your code here for a numeric example

# Define our variables
a = 6
b = 8

# Perform the logical operation
not a > b

True

For the last example, let's look a bit more closely. We have `a=6` and `b=8`.

The value of `a > b` is `6 > 8`, which is `False`. But the `not` operation flips this from `False` to `True`.

**Note:** When you want to invert equality, *DO NOT* do `not a == b`. We have an operation that does this in one step, called `!=`. So, you should do `a != b` instead. It's cleaner and simpler.

Now that we have a basic understanding of booleans, let's see one of their most practical uses...

## Intro to Control Flow and Loops

So far, our code has just run line-by-line. Everything we've written has run. But, we have ways of making decisions and repeating certain lines. In this section, we'll see how to do this using:

* Control Flow
* `while` Loops
* `for` Loops 

### Control Flow: the `if` Statement

Let's say you're heading down to campus. You take the metro and get off at Peel. You get out at the corner of Peel and de Maisonneuve and look around. In your head, you're thinking, `if` Peel is open, I'll walk up there, otherwise (`else`), I'll go to Metcalfe. **Congratulations!!!** You've just done control flow!

Control flow is about **making decisions** using boolean values. The important keyword here is `if`. Our basic control flow has the structure:
```python
    if boolean_value:
        do_something

    some_other_code_here...
```

Here are a few things to note:
* there is a **colon** (:) after the boolean value.
* the line `do_something` only runs if the `boolean_value` evaluates to `True`. 
* the line `do_something` is **indented**. In other languages, you might be used to curly brackets. Python **DOES NOT** use these. In Python, different blocks of code are indented. Also, note that in Python, we don't need to write `end` when we're done! It's enough to stop indenting.
* the line `some_other_code_here` runs *regardless* of whether the `boolean_value` is `True`. We can tell because it's **not** indented.

Let's see an example to help illustrate.

In [134]:
# Your code here

peel_is_closed = True

print("I'm out of the metro...")

if peel_is_closed:
    print("I'm heading over to Metcalfe... again.")

print("I'm heading up to McIntyre")


I'm out of the metro...
I'm heading over to Metcalfe... again.
I'm heading up to McIntyre


Try changing the variable `peel_is_closed` to `False` and see what happens...

You may have noticed that the lines under the `if` statement are indented. That tells Python that they will only run when the `if` condition is met. The lines underneath that aren't indented tell Python that they run no matter what.

You may have also noticed that I didn't write:
```python
if peel_is_closed == True
```

This isn't necessary, since we already have a boolean. Putting in the extra comparison makes our code less clean. Also, just try reading the code like it's a sentence. It even sounds like a conversation:
"If Peel is closed, [print] I'm taking Metcalfe".

Now, we can also replace the boolean with one of the comparisons we have above...

In [135]:
# Freezing point example: Your code here

# Set the current temperature
current_temperature = -5

# Give an introductory message
print("We're taking the temperature...")

# Check if the temperature is below zero
if current_temperature < 0:
    print("We're below freezing!")

# Give a concluding message
print("Done taking the temperature")

We're taking the temperature...
We're below freezing!
Done taking the temperature


In this example, we put an expression that evaluates to a boolean after the `if`. Try setting the value of `current_temperature` to be above zero and see what happens.

In the last example, it would've been nice if we could print a different message if we were above freezing... or if we're in some different temperature range. Well, we can do this with `elif` clauses and a final `else` clause! We can extend the structure do be:

```python
    if some_boolean:
        do_something
    elif some_other_boolean:
        do_something_else
    elif yet_another_boolean:
        do_another_something_else
    ...
    else:
        all_else_has_failed_so_lets_do_this
```

So, if the `some_boolean` is `True`, then the line `do_something` runs. If it isn't `True`, then we test to see if `some_other_boolean` is `True`. If it is, then we run `do_something_else`. Otherwise, we keep going down the list of conditions until one of them is `True`. If all conditions are `False`, then the code under `else` runs.

**Notes:**
* There is no limit to the number of `elif` clauses you can have. You can have zero, one, or as many as you want.
* There is no requirement to add an `else` clause. You can lots of `elif` clauses without a final `else`.
* You can only have at most **one** `else` clause.

Now, for practice, let's write code that again takes a temperature, and this time tells you specifically if you are:
* below freezing
* at freezing
* above freezing

In [136]:
# Freezing point example: Your code here

current_temperature = 100

# Give an introductory message
print("We're taking the temperature...")

# Check if the temperature is below zero
if current_temperature < 0:
    print("We're below freezing!")
elif current_temperature == 0:
    print("We're at freezing!")
else:
    print("We're above freezing!")

# Give a concluding message
print("Done taking the temperature")

We're taking the temperature...
We're above freezing!
Done taking the temperature


We can also use variables that are strings. For example, let's say we want our code to be friendly to users in the US. We want to have a variable `units` that can be `C`, `F` or `K` to determine whether our temperature is in Celsius, Fahrenheit or Kelvin and decide on freezing based on that:

In [137]:
# Your code here for example with units

# Store the current temperature and the units
current_temperature = 10
units = "F"

# Give an introductory message
print("We're taking the temperature...")

# Get the value for the freezing point
if units == "C":
    freezing_point = 0
elif units == "F":
    freezing_point = 32
elif units == "K":
    freezing_point = 273.15
else:
    print("Invalid units! Assuming Celsius!")
    freezing_point = 0

# Check if the temperature is below zero
if current_temperature < freezing_point:
    print("We're below freezing!")
elif current_temperature == freezing_point:
    print("We're at freezing!")
else:
    print("We're above freezing!")

# Give a concluding message
print("Done taking the temperature")

We're taking the temperature...
We're below freezing!
Done taking the temperature


### `while` loops

So, control flow is great for choosing which lines of code to run, but what if we want to run a line more than once? To do this, we can use **loops**. There are two main kinds of loops in Python:
* `while` loops
* `for` loops

They are similar, but `for` loops run for a predetermined number of times and `while` loops run for an arbitrary number of iterations. We'll start with `while` loops.

Syntax:

```python
    while some_boolean:
        do_some_code
    
    code_after_loop...
```

Now, you'll pretty much **NEVER** want to put a raw boolean value in the `while`. You'll instead want to use some sort of operation that returns a boolean. This operation usually involves a variable that you update in the loop. Again, notice the indent!

Sticking with our temperature theme... Let's write an example where the temperature starts at -15 and increases by 2° at every iteration until it hits 10°. At each iteration, we print a message saying the current temperature and whether we are below, at or above freezing:

In [138]:
current_temperature = -15

# Your code here
while current_temperature < 10:
    # Print our message
    if current_temperature > 0:
        message = "is above zero."
    elif current_temperature == 0:
        message = "is at zero."
    else:
        message = "is below zero."
    print("Current temperature", current_temperature, message)
    
    # Increase the temperature
    current_temperature += 2

Current temperature -15 is below zero.
Current temperature -13 is below zero.
Current temperature -11 is below zero.
Current temperature -9 is below zero.
Current temperature -7 is below zero.
Current temperature -5 is below zero.
Current temperature -3 is below zero.
Current temperature -1 is below zero.
Current temperature 1 is above zero.
Current temperature 3 is above zero.
Current temperature 5 is above zero.
Current temperature 7 is above zero.
Current temperature 9 is above zero.


Try changing the increment or the starting value to see the differences in the output.

### Iteration with `for` loops

`for` loops are a bit simpler, since they involve running for a pre-determined number of times. To use a `for` loop, we need something to iterate over. One basic iterable uses the `range` function.

The `range` function takes up to three arguments:
```python
    range(a,b,c)
```

This function gives you all the numbers going from `a` up to but excluding `b`, skipping by `c`. If you leave out the last argument, it will give you every number from `a` to `b`. If you only give one argument, it will give you every number from 0 to that number (excluding it).

Here is the `for` loop syntax:
```python

    for var_name in iterable:
        some_code
    
    code_when_finished

```

At each step, we get a new value stored in `var_name`.

Let's see an example where we're calculating the squares of all numbers between 1 and 10 (excluding 10):

In [139]:
# Your code here

for i in range(1, 10):
    i_squared = i * i
    print(i, "squared is", i_squared)


1 squared is 1
2 squared is 4
3 squared is 9
4 squared is 16
5 squared is 25
6 squared is 36
7 squared is 49
8 squared is 64
9 squared is 81


Another common iterable is a list. Let's see an example of iterating over a list of integers:

In [140]:
# Your code here for iterating over a list

my_list = [1, 1, 2, 3, 5, 8, 13, 21, 34]

for number in my_list:
    doubled_number = number * 2
    print(number, "doubled is", doubled_number)

1 doubled is 2
1 doubled is 2
2 doubled is 4
3 doubled is 6
5 doubled is 10
8 doubled is 16
13 doubled is 26
21 doubled is 42
34 doubled is 68


Sometimes, you may want to interrupt a loop early, or skip one iteration. For this, we have the keywords `break` and `continue`.

We use `break` if we want to stop going through a loop. For example, let's say we are using a `for` loop to calculate squares, but we don't want to go above 50:

In [141]:
# Your code here
for i in range(10):
    i_squared = i * i
    print(i, "squared is", i_squared)
    
    # Check if above 50
    if i_squared > 50:
        print("We're above 50! Stopping!")
        break

0 squared is 0
1 squared is 1
2 squared is 4
3 squared is 9
4 squared is 16
5 squared is 25
6 squared is 36
7 squared is 49
8 squared is 64
We're above 50! Stopping!


For an example with `continue`, let's say we have a list and we only want to compute squares of all even numbers:

In [142]:
# Your code here

# Here's our list
my_list = [5, 23, 4, 1, 2, 2, 6, 7, 8, 5, 3, 4, 8]

# Iterate over the list
for n in my_list:
    # Check if n is odd
    if n % 2 == 1:
        print("Skipping odd number...")
        continue
    n_squared = n * n
    print(n, "squared is", n_squared)


Skipping odd number...
Skipping odd number...
4 squared is 16
Skipping odd number...
2 squared is 4
2 squared is 4
6 squared is 36
Skipping odd number...
8 squared is 64
Skipping odd number...
Skipping odd number...
4 squared is 16
8 squared is 64


These two keywords can also be used in `while` loops.

## Exercise: Temperature Conversions

We have reached the end of this module!!!

Here's a mini-project to work on based on what we saw this module:

In the United States, the temperature is commonly reported in Fahrenheit. But, here in Canada (and in much of the rest of the world), the temperature is recorded in Celsius. The conversion to Fahrenheit from Celsius is given by:
$$
    \textup{F} = \frac{9}{5}\textup{C} + 32
$$

To convert from Fahrenheit back to Celsius, we use the equation:
$$
    \textup{C} = \frac{5}{9}(\textup{F} - 32)
$$

(P.S. if you ever forget, easy way to remember: the relationship is linear -- the lines intersect at -40 -- and we know that water freezes at 32°F and 0°C and boils at 212°F and 100°C; with any two of these three points, you can definitely find the line).

First, let's write code to convert between the two units. Users will include an input temperature and input unit (either as `"F"` or `"C"`) and the code should convert to the other unit.

In [143]:
# Your code here

input_temperature = 60
input_units = "F"

if input_units == "C":
    output_units = "F"
    converted_temperature = 9 / 5 * input_temperature + 32
else:
    output_units = "C"
    converted_temperature = 5 / 9 * (input_temperature - 32)

print("Unit conversion complete!", input_temperature, input_units, "equals", converted_temperature, output_units)

Unit conversion complete! 60 F equals 15.555555555555557 C


Now, for another example, let's find the temperature in Fahrenheit for all Celsius temperatures from $-40^\circ \textup{C}$ to $+35^\circ \textup{C}$ (inclusively), incrementing by $5^\circ$.

**BONUS:** Write this code twice: once using a `for` loop and once using a `while` loop.

In [144]:
# Put your code here...

# For loop solution
print("===== FOR LOOP RESULTS ======")

for c in range(-40, 36, 5): # Notice that we have to go above 35, since 35 is excluded
    f = 9 / 5 * c + 32
    print(c, "C equals", f, "F.")

# While loop solution
print("\n\n===== FOR LOOP RESULTS ======")

# Note that here we need to set the initial temperature
c = -40

while c <= 35:
    f = 9 / 5 * c + 32
    print(c, "C equals", f, "F.")

    # We need to explicitly increment
    c += 5

-40 C equals -40.0 F.
-35 C equals -31.0 F.
-30 C equals -22.0 F.
-25 C equals -13.0 F.
-20 C equals -4.0 F.
-15 C equals 5.0 F.
-10 C equals 14.0 F.
-5 C equals 23.0 F.
0 C equals 32.0 F.
5 C equals 41.0 F.
10 C equals 50.0 F.
15 C equals 59.0 F.
20 C equals 68.0 F.
25 C equals 77.0 F.
30 C equals 86.0 F.
35 C equals 95.0 F.


-40 C equals -40.0 F.
-35 C equals -31.0 F.
-30 C equals -22.0 F.
-25 C equals -13.0 F.
-20 C equals -4.0 F.
-15 C equals 5.0 F.
-10 C equals 14.0 F.
-5 C equals 23.0 F.
0 C equals 32.0 F.
5 C equals 41.0 F.
10 C equals 50.0 F.
15 C equals 59.0 F.
20 C equals 68.0 F.
25 C equals 77.0 F.
30 C equals 86.0 F.
35 C equals 95.0 F.


### BONUS: Replacing `for` Loops with `while` Loops

Any time that you use a `for` loop, you can actually use a `while` loop instead. It's just not always as nice and clean:

In [145]:
# Done using a `for` loop
for i in range(10):
    print("The value of i is now", i)
    # print("The operation 2 * i gives us:", 2 * i)


# Done using a `while` loop.
i = 0

while i < 10:
    print("The value of i is now", i)
    i += 1

The value of i is now 0
The value of i is now 1
The value of i is now 2
The value of i is now 3
The value of i is now 4
The value of i is now 5
The value of i is now 6
The value of i is now 7
The value of i is now 8
The value of i is now 9
The value of i is now 0
The value of i is now 1
The value of i is now 2
The value of i is now 3
The value of i is now 4
The value of i is now 5
The value of i is now 6
The value of i is now 7
The value of i is now 8
The value of i is now 9


## Module Summary

Congratulations! You've made it through the basics! In this module we've seen:

* How to *store* different *types* of data in **variables**.
* How to perform *basic mathematical operations* on **integers** and **floating-point numbers**.
* How to perform **boolean operations** and apply these to **control flow** through **`if` statements**.
* How to repeat tasks using **`for` and `while` loops**.

# Module 3 - Strings and Collections: An Object Primer

In this module, we'll take things up to a new level. We've seen how to write code that does stuff with basic data, such as numbers and booleans. We've also played a bit with strings. Now, let's go into a bit more depth on strings and collections.

## Introducing the String!

A **string** is a sequence of text characters, surrounded by quotation marks. We saw an example above when we wrote the "Hello, World!" program. We can use either single quotes or double quotes:

In [146]:
# Your code here
print("This is a string")
print('This is also a string')

This is a string
This is also a string


We can also use triple-quotes to have a longer string that has line breaks in it.

In [147]:
# Your code here

print("""
This is a much longer string.

It spans multiple lines.

Look at all this text.
""")


This is a much longer string.

It spans multiple lines.

Look at all this text.



These types of strings will be useful later on. It's very very very important that you remember the quotation marks! Otherwise, Python will think you're talking about variables:

In [148]:
# Your code here to produce an error
# This line produces an error
# print(Not a valid string!)

In [149]:
# Your code here to produce successful output

print("Valid string!")

Valid string!


There's lots of stuff that we can do with these strings. Let's discuss a few operations on strings.

Printing strings is great, but we want to actually process them. We can use the `len` function to get the number of characters in a string:

In [150]:
# Your code here
my_string = "I like Python!"
string_length = len(my_string)
print("The length of my string is", string_length, "characters")

The length of my string is 14 characters


**NOTE**: Those of you who have learned Java have undoubtedly seen that you can't compare strings with the `==` operator. Well, good news! In Python, you **CAN**.

In [151]:
string1 = "Hello"
string2 = str("Hello")

print("Checking string equality:", string1 == string2)

Checking string equality: True


Even though we made a new `str` object, the equality still holds!

### String Slicing

We can also access individual characters or substrings using the **bracket operator** `[]`. But first, we need to talk about **indexing**. In a Python string, every character has a numbered position. It's **extremely** important to remember that in Python, the first position is indexed with the number **0**.

Again, I'll repeat that...

***The first character in a Python string has index 0.***

So, you can also figure out that the last character in a string with *n* characters has index *n-1*, **not** *n*.

This diagram should help clarify it:

![string indexing](../assets/StringIndexingPositive.png)

Note that blank spaces are counted! To get the character at an index, stored in variable `i`, we'd write the following:

```python
character_of_interest = my_string[i]
```

To get a substring starting at index `i` and going to the character at index `j` (**excluding** that character), we write:
```python
my_substring = my_string[i:j]
```

If we omit `i`, then we get everything from the beginning up to (but **excluding**) `j`. If we omit `j`, then we get the substring starting at index `i`.

We can even skip every `k` characters by adding a third number:
```python
my_substring = my_string[i:j:k]
```

Now, let's see some examples of string indexing and taking substrings. In Python, this process is commonly referred to as *slicing*.

In [152]:
my_string = "my string text"

# Your code here

# Let's look at single characters
print("The first character in the string is:", my_string[0])
print("The last character in the string is:", my_string[len(my_string) - 1])

# Now, let's look at substrings
print("The substring from index 3 to index 12 is:", my_string[3:12])

# Now, let's skip a few characters
print("The substring from 5 to the end, skipping every 2 is:", my_string[5::2])

The first character in the string is: m
The last character in the string is: t
The substring from index 3 to index 12 is: string te
The substring from 5 to the end, skipping every 2 is: rn et


Python also has a great feature where we can use **negative** indices! The last character has an index of -1 and the values go back to -n, where n is the length of the string. Here's an updated diagram:

![Negative indices](../assets/StringIndexingNegative.png)

Now, it's your turn! Let's do some string indexing with negative indices. **Note:** We *can* combine positive and negative indices.

In [153]:
# Reproduce the above strings using negative indexing where convenient
my_string = "my string text"

# Your code here

# Let's look at single characters
print("The first character in the string is:", my_string[0])
print("The last character in the string is:", my_string[-1])

# Now, let's look at substrings
print("The substring from index 3 to index 12 is:",  my_string[3:12])
print("The substring from the beginning to index 6 is:", my_string[:6])
print("The substring from index 7 to the end is:", my_string[7:])

The first character in the string is: m
The last character in the string is: t
The substring from index 3 to index 12 is: string te
The substring from the beginning to index 6 is: my str
The substring from index 7 to the end is: ng text


One last note on string slicing and indexing: Strings are **immutable**, meaning that you can't change any of the individual characters or substrings. You can create a new string using existing strings, but you **cannot** change the content of a string.

In [154]:
# This code produces an error:
# my_string[3] = 'b'

### String Operations and Methods

#### Concatenation and Formatting
A common operation on strings is **concatenation**, or combining strings. We can combine strings with the `+` sign:

In [155]:
string_1 = "Hello,"
string_2 = "World!"

# Your code here
concatenated_string = string_1 + string_2

print("Concatenated string is:", concatenated_string)

Concatenated string is: Hello,World!


This example shows something very important! Concatenation does **NOT** add in any spaces. It just takes the two strings and combines them together. If you want there to be spaces, you need to make sure to add them in!

Also, concatenation only works on **strings**! Let's look at this example:

In [156]:
string_1 = "The meaning of life, the universe and everything is "
meaning_of_life = 42

# This gives an error!
# print(string_1 + meaning_of_life)

This is very important to remember if you know JavaScript! Running this gives us an error! We can't concatenate an integer and a string. If we want to add the two together, we **must convert the `int` to a string** using `str`:

In [157]:
string_1 = "The meaning of life, the universe and everything is "
meaning_of_life = 42

# Your code here
complete_sentence = string_1 + str(meaning_of_life)

print(complete_sentence)

The meaning of life, the universe and everything is 42


But, there's a shortcut using **string formatting**, or **f-strings**, which let you put a variable directly into a string:
```python
my_formatted_string = f"The meaning of life, the universe and everything is... {meaning_of_life}"
```

In [158]:
# Your code here for string formatting
my_formatted_string = f"The meaning of life, the universe and everything is {meaning_of_life}."

print(my_formatted_string)

The meaning of life, the universe and everything is 42.


Notice that there is an **f** before the opening quotation mark and that the variable goes in curly braces. This tool makes life **much** easier! There are also cool ways of formatting numbers with extra zeros and spaces... but we won't see them today.

#### Converting Strings to Numbers

Let's say, you've gotten some data from a file or the internet and it contains a number. You want to do some sort of mathematical operation on it... and you rush to Python and you do this:

```python
    my_number_from_file = "32.3"

    my_answer = 3 * my_number_from_file

    print("The answer to my computation is:", my_answer)
```

What do you think will print?

In [159]:
my_number_from_file = "32.3"

# Your code here to multiply by 3
my_answer = my_number_from_file * 3

print("The answer to my computation is:", my_answer)

The answer to my computation is: 32.332.332.3


The answer may surprise you. Depending on which operation you're doing, you'll either get:
* a complete nonsense answer
* an error

There's an important step that we need to do before we can do any mathematical operations: we must convert the strings to numeric types. This is very easy:
* To convert a string to a `float`, just call the `float()` function with the string as the argument.
* To convert a string to an `int`, just call the `int()` function with the string as the argument.

For example:

In [160]:
my_string_float = "32.3"
my_string_int = "41"

# Fill in the blanks to perform the type conversions
# Your code here

my_int = int(my_string_int)
my_float = float(my_string_float)

print("The product of 32.3 and 41 is:", my_float * my_int)

The product of 32.3 and 41 is: 1324.3


**Fun fact**: The `int` function can also be used on numbers that are not base-10!

### Finding a Substring - Intro to Methods and Objects
And now, for a string exercise! Remember that I said you can't change the contents of a string. Well, let's now create a new string that has a single character that is different. And, since this is an MiCM workshop, let's use DNA as an example.

In [161]:
dna_sequence = "AAGGACCTTAGAAGGGGACCATTATTAAATTCCCGCA"

There are more things that we can do with strings. In Python, strings are a type of **object**. An **object** is a grouping of variables, known as **attributes**, and functions, known as **methods** that all relate to one thing. String objects have various methods that we can use, or **call**, to do different things with the text contents. To call a method, we use the syntax

```python
    variable_name.method_name(arguments)
```

***This syntax will look quite familiar to anyone coming from Java or a C-based language. It may be a bit confusing for people coming from R or MATLAB. Remember, in Python, the dot `.` is NOT part of the variable name. It is an operator that lets us access functions and variables that belong to certain objects.***

Remember from earlier that **functions** may take inputs, or **arguments**, perform calculations, and then **return** outputs. Let's see a few examples of methods that we can use on strings.

For example, one method we can use on strings is `find`. Let's look at the documentation to see what this method does: https://docs.python.org/3/library/stdtypes.html#str.find

The `find` method looks for a specified substring within a whole string, or part of a string, and returns the index where it is located.

In [162]:
dna_sequence = "AAGGACCTTAGAAGGGGACCATTATTAAATTCCCGCA"

# Put in your code to find the index of the first T nucleotide
index_of_first_t = dna_sequence.find("T")

print("The first thymine nucleotide is located at index", index_of_first_t)
print(dna_sequence[index_of_first_t])

The first thymine nucleotide is located at index 7
T


### Replacing Characters

Well, let's say we want to replace this `T` nucleotide with a `G` nucleotide. We can use another useful method: `replace`. As the name suggests, this method replaces specified characters or substrings with the provided new ones. It's documentation is [here](https://docs.python.org/3/library/stdtypes.html#str.replace).

The syntax is:
```python
    new_string = my_string.replace("old", "new", optional_count)
```

Let's go back to our DNA sequence and replace only the first `T` with `G`:

In [163]:
# Your code here
mutated_dna_sequence = dna_sequence.replace("T", "G", 1)

print("Our modified sequence is:", mutated_dna_sequence)

Our modified sequence is: AAGGACCGTAGAAGGGGACCATTATTAAATTCCCGCA


There are many more methods we can call for strings. To learn more, see the `str` reference on the Python documentation website (https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str).

## String Iteration and the `for` Loop

Remember, earlier we saw the `for` loop. Well, we can do fun things with the `for` loop in strings! We can iterate over each character in the string.

Here's the syntax:

```python
    for c in my_string:
        do_something
```

Here, `c` is a single character in the string. Let's see an example:

In [164]:
my_dna_sequence = "ACGGACAGGAGCGAGATTTGACAGCATTA"

number_of_purines = 0
number_of_pyrimidines = 0

# Your code here
for nucleotide in my_dna_sequence:
    if nucleotide == "C" or nucleotide == "T":
        number_of_pyrimidines += 1
    elif nucleotide == "A" or nucleotide == "G":
        number_of_purines += 1
    else:
        print("Invalid nucleotide! Skipping!")

print(f"In our sequence there are {number_of_purines} purines and {number_of_pyrimidines} pyrimidines.")

In our sequence there are 19 purines and 10 pyrimidines.


There's actually an easy way to clean up our boolean conditions. Instead of using string equality, we can check if the nucleotide is contained in a string using the `in` keyword:

In [165]:
my_dna_sequence = "ACGGACAGGAGCGAGATTTGACAGCATTA"

number_of_purines = 0
number_of_pyrimidines = 0

for nucleotide in my_dna_sequence:
    # Your code here to simplify
    if nucleotide in "CT":
        number_of_pyrimidines += 1
    elif nucleotide in "AG":
        number_of_purines += 1
    else:
        print("Invalid nucleotide! Skipping!")

print(f"In our sequence there are {number_of_purines} purines and {number_of_pyrimidines} pyrimidines.")

In our sequence there are 19 purines and 10 pyrimidines.


### Exercise: DNA transcription and mRNA processing

Now that we're done discussing variables, numeric types and strings, let's do a few exercises!

1. Ahhh, the joys of transcription! Remember that DNA and RNA share *most* of their nucleotides, but they differ in one of the pyrimidines. DNA has thymine while RNA has uracil. I'm giving you the **non-template** strand DNA. Recall that the non-template strand is identical to the produced mRNA, with the exception that the thymine is replaced by uracil. 

Replace all the thymine nucleotides with uracil to get the result of transcription.

(a) The non-template strand is `AGCAGATGCATTAGCCATTAGTTTGCACCAGTATATGCAGAGTTTAGGAGACCATAATTAACGAGAGCCGATAGCTAGA`.

In [166]:
dna_sequence = "AGCAGATGCATTAGCCATTAGTTTGCACCAGTATATGCAGAGTTTAGGAGACCATAATTAACGAGAGCCGATAGCTAGA"

# Put your code here
rna_sequence = dna_sequence.replace("T", "U")

rna_sequence

'AGCAGAUGCAUUAGCCAUUAGUUUGCACCAGUAUAUGCAGAGUUUAGGAGACCAUAAUUAACGAGAGCCGAUAGCUAGA'

(b) (Time Permitting) Now, let's say the **template** strand is `AGCAGATGCATTAGCCATTAGTTTGCACCAGTATATGCAGAGTTTAGGAGACCATAATTAACGAGAGCCGATAGCTAGA`. Now, you must find the complementary nucleotides to transcribe to mRNA.

**Remember:** we first need to find the complementary strand and reverse the direction!

In [167]:
dna_sequence = "AGCAGATGCATTAGCCATTAGTTTGCACCAGTATATGCAGAGTTTAGGAGACCATAATTAACGAGAGCCGATAGCTAGA"

# Put your code here

# Solution 1
reversed_strand = dna_sequence[::-1]

print(reversed_strand)
print("-"*len(reversed_strand))

rna_sequence = ""

for nucleotide in reversed_strand:
    if nucleotide == "A":
        rna_sequence += "U"
    elif nucleotide == "T":
        rna_sequence += "A"
    elif nucleotide == "C":
        rna_sequence += "G"
    elif nucleotide == "G":
        rna_sequence += "C"

# Solution 2
rna_sequence = ""

for i in range(len(dna_sequence) - 1, -1, -1):
    nucleotide = dna_sequence[i]
    if nucleotide == "A":
        rna_sequence += "U"
    elif nucleotide == "T":
        rna_sequence += "A"
    elif nucleotide == "C":
        rna_sequence += "G"
    elif nucleotide == "G":
        rna_sequence += "C"

print(rna_sequence)

# Solution 3
pairings = {"A": "U", "T": "A", "C": "G", "G": "C"}

rna_sequence = ""

for i in range(len(dna_sequence) - 1, -1, -1):
    nucleotide = dna_sequence[i]
    rna_sequence += pairings[nucleotide]

print(rna_sequence)

# Solution 4
pairings = {"A": "U", "T": "A", "C": "G", "G": "C"}

rna_sequence = "".join([pairings[dna_sequence[i]] for i in range(len(dna_sequence) - 1, -1, -1)])

print(rna_sequence)

# Solution 5
pairings = {"A": "U", "T": "A", "C": "G", "G": "C"}

rna_sequence = "".join([pairings[nt] for nt in reversed(dna_sequence)])

print(rna_sequence)


AGATCGATAGCCGAGAGCAATTAATACCAGAGGATTTGAGACGTATATGACCACGTTTGATTACCGATTACGTAGACGA
-------------------------------------------------------------------------------
UCUAGCUAUCGGCUCUCGUUAAUUAUGGUCUCCUAAACUCUGCAUAUACUGGUGCAAACUAAUGGCUAAUGCAUCUGCU
UCUAGCUAUCGGCUCUCGUUAAUUAUGGUCUCCUAAACUCUGCAUAUACUGGUGCAAACUAAUGGCUAAUGCAUCUGCU
UCUAGCUAUCGGCUCUCGUUAAUUAUGGUCUCCUAAACUCUGCAUAUACUGGUGCAAACUAAUGGCUAAUGCAUCUGCU
UCUAGCUAUCGGCUCUCGUUAAUUAUGGUCUCCUAAACUCUGCAUAUACUGGUGCAAACUAAUGGCUAAUGCAUCUGCU


2. In eukaryotes, mRNA must be processed before it is translated by the ribosome. This processing involves three steps:
* Capping
* Splicing
* Polyadenylation

We'll skip the capping, but let's now do some splicing! We won't deal with actual splice sites. Instead, let's say that there are introns at the following indices (start and end both in intron):
* Start at the ninth **nucleotide** and ending at the 17th nucleotide
* Start at 20 nucleotides from the end of the sequence and going until 12 from the end

Splice out these introns and stick the exons together.

In [168]:
# Your code here for splicing
spliced_sequence = rna_sequence[8:17] + rna_sequence[19:-12]

spliced_sequence

'UCGGCUCUCUAAUUAUGGUCUCCUAAACUCUGCAUAUACUGGUGCAAACUAAUGGCU'

Now, for polyadenylation, add a sequence of 15 `A` nucleotides. Hint: you can use the `*` operator to repeat a string!

In [169]:
# Your code here for polyadenylation
polyadenylated_sequence = spliced_sequence + 15 * "A"

polyadenylated_sequence

'UCGGCUCUCUAAUUAUGGUCUCCUAAACUCUGCAUAUACUGGUGCAAACUAAUGGCUAAAAAAAAAAAAAAA'

3. What is the maximum number of codons we could fit in a sequence with the same length as the original? How many nucleotides would be left over?

In [170]:
# Put your calculations here.
maximum_number_of_codons = len(dna_sequence) // 3
print(f"We can fit a maximum of {maximum_number_of_codons} codons in the original.")

remainder = len(polyadenylated_sequence) % 3
print(f"We would have {remainder} nucleotides left.")

We can fit a maximum of 26 codons in the original.
We would have 0 nucleotides left.


## Collection Types - Introduction to Tuples, Lists and Dictionaries

We've seen that we can store data in basic types, like strings, `int`s and `float`s. But, let's say we want to store many of these at a time. For example, let say we have 100 DNA sequences that we want to store and process? Well, for this we have **collection types**. In this section, we'll see three important collection types:
* Tuples
* Lists
* Dictionaries

For more information on tuples and lists, see [this page](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range) of the Python documentation. For more info about dictionaries, see [here](https://docs.python.org/3/library/stdtypes.html#mapping-types-dict).

### Tuples

A tuple is a way of packaging a fixed number of values together. The number of values can't be changed, and neither can the values themselves. Tuples are **immutable**, like strings. Remember, though, we can always assign a new tuple to the same variable. Tuples are represented using multiple values separated by commas within round brackets (parentheses) -- `()`.

In [171]:
# Your code here
tuple1 = (10, 23)

tuple1

(10, 23)

In [172]:
# Your code here
tuple2 = ("Hello", "World", "!")

tuple2

('Hello', 'World', '!')

In [173]:
# Your code here
tuple3 = ("Error", 404)

tuple3

('Error', 404)

#### Accessing Elements
There are two different ways to access individual elements in a tuple:
* Slicing
* Unpacking

When working with tuples, **slicing** works the *exact same way* that it did with strings, described above.

Sorry to be pedantic and repetitive, but remember that **_INDEXING STARTS AT ZERO_**.

Fill in the following example to confirm that.

We have the tuple `(4, 5, "Hello", "World!", 12, True, 4.5)`.

1. Use slicing to isolate the sub-tuple containing "Hello" and "World!".
2. Use slicing to get the last two elements.

In [174]:
my_tuple = (4, 5, "Hello", "World!", 12, True, 4.5)

In [175]:
# Put your code here for question 1.
tuple_1 = my_tuple[2:4]
print("Question 1:", tuple_1)

Question 1: ('Hello', 'World!')


In [176]:
# Put your code here for question 2.
tuple_2 = my_tuple[-2:]
print("Question 2:", tuple_2)

Question 2: (True, 4.5)


#### Tuple Unpacking
**Unpacking** is a different process. Let's say we have a tuple with 2 elements in it. We can assign each one of these elements to a variable, like this:

In [177]:
my_point = (-3, 5)

# Your code here to assign x and y
x, y = my_point

print("The value of my point is:", my_point)
print("The value of x is:", x)
print("The value of y is:", y)

The value of my point is: (-3, 5)
The value of x is: -3
The value of y is: 5


**NOTE:** You **MUST** have the same number of variables and the number of elements in the tuple. Otherwise, unpacking won't work and you'll get an error from Python.

Finally, like with strings, we can concatenate tuples using the `+` operation.

In [178]:
# Your code here to concatenate tuples
my_combined_tuples = my_tuple + my_point

my_combined_tuples

(4, 5, 'Hello', 'World!', 12, True, 4.5, -3, 5)

### Lists and List Methods

List are more exciting than tuples. Lists are **mutable**! So, we can add entries to a list, remove entries from a list, and change the entries in a list. Lists are represented as comma-separated values in square brackets -- `[]`. Unlike tuples, we can't unpack lists. Lists also *usually* contain elements of the same or similar type (although they don't have to).

In [179]:
# Your code here
# Here's an example of a list

my_list = [1, 2, 4, 8, 16, 32]

my_list

[1, 2, 4, 8, 16, 32]

In [180]:
# Your code here for another example list
my_list2 = ["The", "quick", "brown", "fox"]

my_list2

['The', 'quick', 'brown', 'fox']

In [181]:
# Your code here for yet another example list
my_list3 = ["hello", 4, True, "WorlD"]

my_list3

['hello', 4, True, 'WorlD']

Now, I've told you all these great things that we can do with lists... but how do we do them?

#### Length of a List
Well, let's start with the simplest thing... taking the **length** of a list. We do this in the exact same way that we took the length of a string! We use the `len` function.

In [182]:
# Your code here
my_squares = [1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121]
print("My squares list has length:", len(my_squares))

My squares list has length: 11


#### List Slicing

We can obtain individual items and sublists through *slicing*, exactly the same way that we did with strings and tuples.

Here's an exercise to test your skills with this...

I'm giving you this list: `[1, 1, 2, 3, 5, 8, 13, 21, 34]`

Using slicing, find:
* the last element
* the values `3, 5, 8`
* the values `1, 2, 5, 13`

In [183]:
my_list = [1, 1, 2, 3, 5, 8, 13, 21, 34]

# Your code here
print("The last element in the list is:", my_list[-1])
print("The sublist is:", my_list[3:6])
print("The sublist is:", my_list[0:-1:2])

The last element in the list is: 34
The sublist is: [3, 5, 8]
The sublist is: [1, 2, 5, 13]


But, there's more that we can do with the slicing! We can now update values using the `=` sign! We can do this for both individual elements and for sublists!

Let's take this example: `[1, 2, 4, 9, 16, 32, 64, 129, 257]`

Any idea what this sequence is? There are three mistakes that we need to correct!

So... Where are the mistakes? How do we correct them?

In [184]:
# Here is our error-filled list:
powers_of_two = [1, 2, 4, 9, 16, 32, 64, 129, 257]

# Your code here to correct
powers_of_two[3] = 8
powers_of_two[-2:] = [128, 256]


print("The corrected list is:", powers_of_two)

The corrected list is: [1, 2, 4, 8, 16, 32, 64, 128, 256]


#### Adding Elements

Now for the fun part! Let's insert new items! Remember that the list is **mutable**, so when we add new items, we are actually *changing* the list. We are **not** creating a new list. To change the list, we use **methods** from the list object.

Let's start with adding a new item at the **end** of the list. This process is known as *appending* to a list. So, naturally, the method to do this is called `append`:

In [185]:
# Example using our powers of two
# Your code here to continue the list
powers_of_two.append(512)

print("Powers of two is now:", powers_of_two)

Powers of two is now: [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]


We can also insert at any index `i` using the method called... `insert`! This method takes **two** arguments: the index `i` *before which* we want to insert the new element and the new element that we want to insert. 

***NOTE:*** You must respect this order of arguments.

Here's an example:

In [186]:
days_of_the_week = ["Sunday", "Tuesday", "Wednesday", "Thursday", "Saturday"]

# Your code here to add Monday in the correct spot
days_of_the_week.insert(1, "Monday")

# Your code here to add Friday in the right spot (hint: negative indexing)
days_of_the_week.insert(-1, "Friday")


print(f"The {len(days_of_the_week)} days of the week are: {days_of_the_week}")

The 7 days of the week are: ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']


How can we learn more about these methods? We can check out the [documentation](https://docs.python.org/3/library/stdtypes.html#list). We can also see other methods, like `index`, which we can use to find the position of an element.

#### Removing Elements

Sometimes, we want to delete elements from a list. There are a few ways to do this:
- using the `del` keyword
- using an assignment
- using the `pop` method
- using the `clear` method

Here are the details:
* The `del` keyword can be used to get rid of single elements or a range. `del` is **not** a function, so we **don't** use brackets. 
* To remove a range, we can alternatively just use slicing and assign an empty list to the desired range (see [here](https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types)).
* We can use the `pop` method without an argument to remove the last item from a list, or with an index as argument to remove the item at index `i`. The `pop` method returns the removed element, so it can be stored in a variable.
* We can use the `clear` method to remove **all** items from a list.

In [187]:
test_list = [2, 3, 5, 7, 9, 11, 13, 17, 19, 23]

# Your code here

# Get index of 9
my_index = test_list.index(9)

# Remove the number which doesn't belong using `del`...
del test_list[my_index]

print("Test list is now:", test_list)

Test list is now: [2, 3, 5, 7, 11, 13, 17, 19, 23]


In [188]:
test_list = [2, 3, 5, 7, 9, 11, 13, 17, 19, 23]

# Your code here to remove the number which doesn't belong using `pop`...
my_index = test_list.index(9)
removed_element = test_list.pop(my_index)


print("Test list is now:", test_list, "since we removed item:", removed_element)

Test list is now: [2, 3, 5, 7, 11, 13, 17, 19, 23] since we removed item: 9


In [189]:
test_list_2 = [1, 2, 2, 1, 2, 3, 1, 2, 2, 1, 1, 2, 2]

# Your code here to remove the numbers that disrupt the pattern using `del`.
del test_list_2[4:6]


print("Test list 2 is now:", test_list_2)

Test list 2 is now: [1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2]


In [190]:
test_list_2 = [1, 2, 2, 1, 2, 3, 1, 2, 2, 1, 1, 2, 2]

# Your code here to remove the numbers that disrupt the pattern using assignment.
test_list_2[4:6] = []

print("Test list 2 is now:", test_list_2)

Test list 2 is now: [1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2]


In [191]:
# Your code here to remove all elements using `clear`
test_list_2.clear()

print("The test list 2 is now:", test_list_2)

The test list 2 is now: []


#### List Concatenation

One last operation: lists can be concatenated using the `+` operator. Remember that **both** the left and the right must be lists! You can't add a number to a list by concatenation! You must first embed it in a list.

In [192]:
# Your code here to define list_a and list_b and concatenate the two lists
list_a = [1, 4, 6]
list_b = [2, 4, 6]

joined_list = list_a + list_b

print("The joined list is:", joined_list)

The joined list is: [1, 4, 6, 2, 4, 6]


In [193]:
# Your code here to add 3 to the end of list_a to create list_c
list_c = list_a + [3]

print("Modified list a is:", list_c)

Modified list a is: [1, 4, 6, 3]


#### List Iteration

Remember how we went through each character in a string? Well, we can do the exact same thing with a list!

```python
    for item in my_list:
        do_something...
```

Here's an example:

In [194]:
my_list = [2, 4, 6, 5, 8, 7, 1, 3, 5, 7, 8, 9, 10, 22, 11, 95]

# Your code here to extract the even and odd numbers from the sequence

even_numbers = []
odd_numbers = []

for n in my_list:
    if n % 2 == 0:
        even_numbers.append(n)
    else:
        odd_numbers.append(n)


print(f"Our list has {len(even_numbers)} even numbers and {len(odd_numbers)} odd numbers.")

Our list has 7 even numbers and 9 odd numbers.


Now, let's say we want to get the index of the element... Well, we can use the `enumerate` function. This returns a tuple containing the index and the item from the list.

**Note:** In the `for` loop, we can **immediately unpack** the tuple!

In [195]:
my_list = [2, 4, 6, 5, 8, 7, 1, 3, 5, 7, 8, 9, 10, 22, 11, 95]

number_of_even = 0
number_of_odd = 0

last_even_index = -1
last_odd_index = -1

# Your code here to extract the number of odd and even and get the final indices of each
for i, n in enumerate(my_list):
    if n % 2 == 0:
        number_of_even += 1
        last_even_index = i
    else:
        number_of_odd += 1
        last_odd_index = i


print("Our list has", number_of_even, "even numbers and", number_of_odd, "odd numbers.")
print("The last even number was at index", last_even_index, "and the last odd number was at index", last_odd_index)


Our list has 7 even numbers and 9 odd numbers.
The last even number was at index 13 and the last odd number was at index 15


#### List Exercise

Now, time to practice lists! Let's take a string of RNA and turn it into a list of codons. At the end, print the number of codons.

In [196]:
my_rna = "AGCAGCAUGACCGAGUCAGUCAGCUUGCGGCUACGUACUGGCCAUUAGCAGUACAGU"

# Your code here

In [197]:
my_rna = "AGCAGCAUGACCGAGUCAGUCAGCUUGCGGCUACGUACUGGCCAUUAGCAGUACAGU"

# Your code here

# Here are a few hints ...

# 1. Create an empty codon list
my_codons = []

# 2. Find the start codon
start_codon_index = my_rna.find("AUG")

# 3. Iterate over the string
for i in range(start_codon_index, len(my_rna) - 2, 3):
    # 4. Get the codon...
    new_codon = my_rna[i: i + 3]

    # 5. Add codon to list
    my_codons.append(new_codon)
    

print("We found", len(my_codons), "codons")
print(my_codons)

We found 17 codons
['AUG', 'ACC', 'GAG', 'UCA', 'GUC', 'AGC', 'UUG', 'CGG', 'CUA', 'CGU', 'ACU', 'GGC', 'CAU', 'UAG', 'CAG', 'UAC', 'AGU']


### Dictionaries

So... How many of you can remember using a paper dictionary? What's the idea behind them?

#### Key-Value Storage

Well, we're not going to be defining words... but think about the **structure** of a dictionary. You look up a word and you get an associated piece of information, a definition. Let's call the word a **key** and the associated information a **value**. A **dictionary** is a collection that stores **Key-Value** pairs.

Now, for the syntax... Well, tuples involved round brackets, and lists involved square brackets... so it's only natural that the syntax for dictionaries uses curly brackets, or brace brackets `{}`. But, there's another twist here. 

We need both keys and values! The **values** can be any type, but the **keys** must be **immutable**. So, the keys can be numbers, tuples or strings (or booleans, I guess, but that may not be useful), but they **cannot** be lists. In addition, keys **cannot** be duplicated, but values can. If you try to duplicate a key, only one of the values is kept.

In [198]:
# Your code here: dictionary example for image_counts

image_counts = {"microCT": 12, "FIB-SEM": 5, "confocal": 36, "STORM": 6, "cryoTEM": 2}

image_counts

{'microCT': 12, 'FIB-SEM': 5, 'confocal': 36, 'STORM': 6, 'cryoTEM': 2}

Note: the keys and the values don't have to be in order.

Now, there are lots of operations that we can do on dictionaries!

#### Accessing and Modifying Dictionary Entries

Recall that in strings, tuples and lists we used the square brackets `[]` for indexing. We're still going to use them here, but instead of using a *numeric* index, we put a key in the brackets instead. We can then perform our usual operations of retrieving and replacing values.

In [199]:
# Your code here to access the number of microCT scans and store it in micro_ct_scans
micro_ct_scans = image_counts["microCT"]

print(f"We have {micro_ct_scans} microCT scans in our database!")

# Your code here to modify the number of confocal images
image_counts["confocal"] = 39

print("Imaging database now has the following datasets:", image_counts)


We have 12 microCT scans in our database!
Imaging database now has the following datasets: {'microCT': 12, 'FIB-SEM': 5, 'confocal': 39, 'STORM': 6, 'cryoTEM': 2}


#### Adding Keys

Adding new elements to a dictionary is easy! We just need the new key and the new value, and then we write:
```python
    my_dictionary[new_key] = new_value
```

For example:

In [200]:
# Your code here to add TEM to our imaging database
image_counts["TEM"] = 10

print(f"Our imaging database now has the following datasets available: {image_counts}")

Our imaging database now has the following datasets available: {'microCT': 12, 'FIB-SEM': 5, 'confocal': 39, 'STORM': 6, 'cryoTEM': 2, 'TEM': 10}


#### Removing Entries

To remove an entry, we can again use the `del` keyword, or we can use `pop`. Like with lists, `pop` gives us the value that we removed in a variable.

In [201]:
# Your code here to remove the STORM datasets and store them in a variable storm_datasets
storm_datasets = image_counts.pop("STORM")

print("The number of STORM datasets was:", storm_datasets)

print("Our dictionary is now:", image_counts)

The number of STORM datasets was: 6
Our dictionary is now: {'microCT': 12, 'FIB-SEM': 5, 'confocal': 39, 'cryoTEM': 2, 'TEM': 10}


#### Other Operations

Much of the expected behaviour of dictionaries is similar to lists. There are a few methods that are exclusively used by dictionaries:
* The `keys` method returns the keys in the dictionary.
* The `values` method returns the values in the dictionary.
* The `items` method returns tuples containing `(key, value)` pairs.
* The `update` method can be used for combining dictionaries (Concatenation doesn't work!). **This method updates the current dictionary and does not produce a new one!**

In [202]:
# Your code here
my_keys = image_counts.keys()
my_values = image_counts.values()
my_items = image_counts.items()

print("The keys are:", my_keys)
print("The values are:", my_values)
print("The items are:", my_items)

The keys are: dict_keys(['microCT', 'FIB-SEM', 'confocal', 'cryoTEM', 'TEM'])
The values are: dict_values([12, 5, 39, 2, 10])
The items are: dict_items([('microCT', 12), ('FIB-SEM', 5), ('confocal', 39), ('cryoTEM', 2), ('TEM', 10)])


In [203]:
new_datasets = {
    "synchrotron": 3,
    "STEM": 4,
}

# Your code here to update the dictionary
image_counts.update(new_datasets)

print("Imaging catalogue now has data:", image_counts)

Imaging catalogue now has data: {'microCT': 12, 'FIB-SEM': 5, 'confocal': 39, 'cryoTEM': 2, 'TEM': 10, 'synchrotron': 3, 'STEM': 4}


#### Dictionary Iteration

To do things with all data stored in the dictionary, we don't usually iterate over indices. Instead, we can iterate over the keys, or the values, or the `items` which contain both. To iterate over the keys, we can just do the following:

```python
    for k in my_dictionary:
        do_something
```

As an example, let's find the average of our imaging catalogue counts from above:

In [204]:
image_counts = {
    "microCT": 12,
    "FIB-SEM": 5,
    "confocal": 36,
    "STORM": 6,
    "cryoTEM": 2
}

# Your code here to compute the average number of datasets for the modalities and store it in average_count
number_of_datasets = 0

for modality in image_counts:
    n = image_counts[modality]
    number_of_datasets += n

average_count = number_of_datasets / len(image_counts)

print("The average number of image datasets is", average_count)

The average number of image datasets is 12.2


## Exercise: Strings and Collections for DNA and Protein Processing: Translation

So... We made a list of codons before. Now, let's take it a step farther. In this exercise, we will write code to translate the mRNA to proteins. I'll provide you with a codon table... but backwards! You need to start by creating the table that goes from codon to amino acid. Codon table from here: https://en.wikipedia.org/wiki/DNA_and_RNA_codon_tables.

**Recall:** Your list of codons from the DNA sequence earlier should still be in the variable `my_codons`.

In [205]:
amino_acid_to_codon_table = {
    "F": ["UUU", "UUC"],
    "L": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "I": ["AUU", "AUC", "AUA"],
    "M": ["AUG"],
    "V": ["GUU", "GUC", "GUA", "GUG"],
    "S": ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"],
    "P": ["CCU", "CCC", "CCA", "CCG"],
    "T": ["ACU", "ACC", "ACA", "ACG"],
    "A": ["GCU", "GCC", "GCA", "GCG"],
    "Y": ["UAU", "UAC"],
    "STOP": ["UAA", "UAG", "UGA"],
    "H": ["CAU", "CAC"],
    "Q": ["CAA", "CAG"],
    "N": ["AAU", "AAC"],
    "K": ["AAA", "AAG"],
    "D": ["GAU", "GAC"],
    "E": ["GAA", "GAG"],
    "C": ["UGU", "UGC"],
    "W": ["UGG"],
    "R": ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "G": ["GGU", "GGC", "GGA", "GGG"]
}
# Your code here

# Start by creating a new dictionary where the codons are the keys
forward_codon_table = {} # This creates an empty dictionary

# Use iteration to create the opposite table: Codons to Amino Acids
for amino_acid in amino_acid_to_codon_table:
    for codon in amino_acid_to_codon_table[amino_acid]:
        forward_codon_table[codon] = amino_acid

# Perform the translation on the provided codon list:
my_codons = ['AUG', 'ACC', 'GAG', 'UCA', 'GUC', 'AGC', 'UUG', 'CGG',
          'CUA', 'CGU', 'ACU', 'GGC', 'CAU', 'UAG', 'CAG', 'UAC', 'AGU']

my_protein = ""

for codon in my_codons:
    # Get the corresponding amino acid for the codon
    new_amino_acid = forward_codon_table[codon]
    
    # Check if it is the stop codon
    if new_amino_acid == "STOP":
        print("STOP CODON!")
        break # End the loop

    # Add the new amino acid to the protein
    my_protein += new_amino_acid
    print(f"Added new amino acid {new_amino_acid} for codon {codon}!")

print("Our protein has amino acid sequence:", my_protein)

Added new amino acid M for codon AUG!
Added new amino acid T for codon ACC!
Added new amino acid E for codon GAG!
Added new amino acid S for codon UCA!
Added new amino acid V for codon GUC!
Added new amino acid S for codon AGC!
Added new amino acid L for codon UUG!
Added new amino acid R for codon CGG!
Added new amino acid L for codon CUA!
Added new amino acid R for codon CGU!
Added new amino acid T for codon ACU!
Added new amino acid G for codon GGC!
Added new amino acid H for codon CAU!
STOP CODON!
Our protein has amino acid sequence: MTESVSLRLRTGH


## Module Summary

Yay! We've made it through another module! Here, we've explored the basics of strings and collection types. Here are the main points that we saw:

* A **string** represents *text* in Python. We can use **slicing** to access its elements. We can also perform operations, like **concatenation and string formatting**, and use **methods** to get extra info about a string or create modified versions of it.
* A **tuple** represents a *small number of objects grouped together*. To access elements, we can either use slicing, or we can **unpack** its contents into the corresponding number of variables. Tuples can't be modified.
* A **list** represents a *variable-length collection* of objects. We can add or remove objects from the list using **list methods**, such as `append`, `insert` and `pop`. We can also iterate over all elements of a list using a `for` loop.
* A **dictionary** represents *key-value storage*. Instead of having a numeric index, we access **values** using a **key**. We can add or remove elements using keys and we can modify the dictionary using **dictionary methods**, such as `pop` and `update`. We can also use the `keys`, `values` and `items` methods to get different pieces of information.
* All of these are **objects**, which means that they store information and have functions, or **methods** associated with them.
* We can iterate over all these types of objects to process individual elements.

For more information about any of these objects, check out the official Python documentation. There's a lot of detail about each type:
* Strings: https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str
* Tuples: https://docs.python.org/3/library/stdtypes.html#tuple
* Lists: https://docs.python.org/3/library/stdtypes.html#list
* Dictionaries: https://docs.python.org/3/library/stdtypes.html#mapping-types-dict

Finally, there's another collection type that I didn't discuss, called a *set*. If you want to learn about it, check out this page: https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset.

# Module 4 - Modules and Packages

We've been seen the basics of how to store data, make decisions and run code over and over again. In practice, more complicated code will be wrapped up in **functions** that other people have written. The good news is that in Python it's **very easy** to use code from other people. In this module, we'll talk about how Python code is arranged and how you can **import** code and use it as if you had written it yourself. Here's the outline for this module:

1.	Using Modules
    1.	What is a Module?
    2.	Importing a Module
    3.	Importing Specific Functions
2.	Package Management
    1.	What is a Package?
    2.	Installing Packages using conda 
    3.	Installing Packages using pip
    4.	Other Installation Tips
    5.	Using Packages and Reading Documentation
3.	Exercise: importing a module from the standard library and using its functions.

## Using Modules

Python code is organised in *modules*. But wait???? What's a module? I'm glad you asked...


### What is a Module?

Simple answer: a **module** is a file. That's it. Any time you create a new Python file and assign it a name that ends with `.py`, you've created a module. If you share this file with someone else, they can use your code in their own files without having to copy-paste it. We'll see the details in a bit.

So, what does this module look like? Usually, it contains a bunch of different code:
* **Functions**: bits of repeatable behaviour to simplify tasks.
* **Classes**: code that defines new types of objects.
* **Constants**: variables that have important pre-determined values, like $\pi$.

All of these are also typically accompanied by **documentation**, which explains how they work, what you can do with them, and how you can use them. This documentation is just a series of triple-quoted strings in the module file.

This will become clearer in a bit. First, it's important to know that Python comes with **a lot** of built-in modules. You can see a list [here](https://docs.python.org/3/py-modindex.html).

We *can* also run Python code to see what modules we have available. **BUT!!! This code may take some time to run, especially if you installed Python using Anaconda! So, think twice before running this line!**

In [206]:
# help("modules")

This list doesn't only include built-in modules, but also those that you've installed from other packages. We'll talk about this later.

### Importing a Module

To use code from a module, we have to **import** it. Importing the module tells Python that we want to access its contents and use them in our code.

To import a module so that we can use it in our code, here's the syntax:
```python
    import module_name
```

Let's do an example. Let's say you're working with a long sequence of DNA that's hundres of base pairs long and you want to break it up over several lines. Well, Python has a [`textwrap` module](https://docs.python.org/3/library/textwrap.html#module-textwrap) that can help! First, let's import the module

In [207]:
# Your code here to import the textwrap module

import textwrap

Great! We've imported the module! That's our first step done. The next step is to **read how to use the module**. We have two ways to do this:
1. Go to the website to read the documentation.
2. Use the `help` function in Python.

If we use the `help` function, we can read the help right from Python without having to search the internet! The downside is that the `help` function is entirely text-based, so there are no pictures and it's harder to navigate. **Usually, I look at the online help.**

In [208]:
help(textwrap)

Help on module textwrap:

NAME
    textwrap - Text wrapping and filling.

MODULE REFERENCE
    https://docs.python.org/3.8/library/textwrap
    
    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

CLASSES
    builtins.object
        TextWrapper
    
    class TextWrapper(builtins.object)
     |  TextWrapper(width=70, initial_indent='', subsequent_indent='', expand_tabs=True, replace_whitespace=True, fix_sentence_endings=False, break_long_words=True, drop_whitespace=True, break_on_hyphens=True, tabsize=8, *, max_lines=None, placeholder=' [...]')
     |  
     |  Object for wrapping/filling text.  The public interface consists of
     |  the wrap() and fill() methods; the other methods are just there for
     |  subclasses 

So, we've now seen that we can use the `wrap` function in the `textwrap` module to break text up into lines of a given length. To actually use this function ourselves, we use the **dot notation**, like when we were calling methods on strings and lists.

In [209]:
my_long_dna = "AGGACAGTTGTACGATGCATCGTGCTACGATCGATGCTAGCGACGTACGTAGCATGCTAGCTAGCTGACGAGCGCGCGCGATCAGCATGCGCCGGACGTCAGTCAGTGTCAGTCATGCAGTACTGCAGTGTACGTCAGTACGTACTGCAGTCGTCATGTCGATGCATGCCATGTGACGTATGACTGCATGACGTACTG"

# Your code here for an example of using `textwrap.wrap`
wrapped_lines = textwrap.wrap(my_long_dna, 80)

wrapped_lines

['AGGACAGTTGTACGATGCATCGTGCTACGATCGATGCTAGCGACGTACGTAGCATGCTAGCTAGCTGACGAGCGCGCGCG',
 'ATCAGCATGCGCCGGACGTCAGTCAGTGTCAGTCATGCAGTACTGCAGTGTACGTCAGTACGTACTGCAGTCGTCATGTC',
 'GATGCATGCCATGTGACGTATGACTGCATGACGTACTG']

The two important lines to remember in this example are:
* `import textwrap` --> we imported the `textwrap` module
* `wrapped_lines = textwrap.wrap(my_long_dna, 80)` --> we use the `wrap` function from the `textwrap` module

Note that we have to write `module_name.function_name`. This is because we are importing the ***whole module***.

Let's do another example. Let's look at some math! Good news, Python has a `math` module. Let's import it and compute some sines and cosines of 180°. First, let's look at the [documentation](https://docs.python.org/3/library/math.html#module-math) and then write some code:

In [210]:
# Your code here... Compute sines and cosines using `math`

import math

my_angle = 180
sin180 = math.sin(my_angle)
cos180 = math.cos(my_angle)

print(f"sin(180)={sin180} and cos(180)={cos180}")

sin(180)=-0.8011526357338306 and cos(180)=-0.5984600690578581


Wait! Hang on! That's not right! We remember from high school math that $sin(180^\circ)=0$ and $cos(180^\circ)=-1$. What's going on??? Well, the answer is in the documentation. We can call the `help` function on specific functions!

In [211]:
help(math.cos)

Help on built-in function cos in module math:

cos(x, /)
    Return the cosine of x (measured in radians).



Aha! The angle has to be in radians! So, we need to convert the angle to radians first! We can either do this manually by doing $\textup{radians} = \pi/180 \times \textup{degrees}$ or we can use another function from math! Let's see both:

In [212]:
# Your code here for solution 1
import math

my_angle_in_degrees = 180
my_angle_in_radians = math.pi / 180 * my_angle_in_degrees
sin180 = math.sin(my_angle_in_radians)
cos180 = math.cos(my_angle_in_radians)

print(f"sin(180)={sin180} and cos(180)={cos180}")

sin(180)=1.2246467991473532e-16 and cos(180)=-1.0


In [213]:
# Your code here for solution 2
import math

my_angle_in_degrees = 180
my_angle_in_radians = math.radians(my_angle_in_degrees)
sin180 = math.sin(my_angle_in_radians)
cos180 = math.cos(my_angle_in_radians)

print(f"sin(180)={sin180} and cos(180)={cos180}")

sin(180)=1.2246467991473532e-16 and cos(180)=-1.0


In the first solution, we see an example of using a **constant** (well, actually a variable) from a module.

*Note:* You may be thinking... Hang on! The value of `sin(180°)` didn't come out to zero. Well, it's something very small due to problems representing decimal numbers on a computer. So, for our intents and purposes, we can say $1\times 10^{-16} \approx 0$.

You may be thinking in that last example that we've had to write `math` a lot! We had to write `math.sin` and `math.cos` and `math.pi` and `math.radians`. Can't there be an easier way??? Turns out, there is!

### Importing Specific Functions

Sometimes, we don't want to import an entire module. We may want to just import a specific function. For this, the syntax is:
```python
    from module_name import function_name
```

Then, when we call the function, we **don't** need to write the module name. We only need to write the function name. We ca also import **constants** in this way.

We aren't restricted to importing only one function or constant. We can import a bunch:
```python
    from module_name import function1, function2, constant
```

Let's apply this example to our previous sine and cosine example:

In [214]:
# Your code here to import the specific functions for our sine and cosine example
from math import sin, cos, pi

my_angle_in_degrees = 180
my_angle_in_radians = pi / 180 * my_angle_in_degrees
sin180 = sin(my_angle_in_radians)
cos180 = cos(my_angle_in_radians)

print(f"sin(180)={sin180} and cos(180)={cos180}")

sin(180)=1.2246467991473532e-16 and cos(180)=-1.0


Notice that we were able to call the `sin` and `cos` functions directly and use `pi` as if it were a variable that we had defined.

So, you may be wondering what's the best approach to use. Well, it's really a **case-by-case** decision:
* Does the module have a long name? If so, you may want to just import the functions you'll use.
* Will you forget where the function came from? If so, leave it as a module import so that you remember where the function came from and you don't try to find where you've defined it.
* How much of the module are you using? If you have to import 20 different functions specificially, don't waste the room with the import statement.

## Package Management

All this has been good, but we've only been looking at code that comes with Python. A lot of stuff comes with Python, but not everything. Now, we'll see how to go beyond what Python gives us and explore the big world of **packages**.

### What is a Package?

A **package** is a collection of modules that interact and have been grouped together to be easily distributed to other people. Packages usually have a very specific focus. Here are some very common ones that you will almost definitely encounter in your career:

* **NumPy**: Offers mathematical tools for processing large numeric arrays in many dimensions.
* **SciPy**: Offers scientific tools for signal processing, interpolation, high-dimensional image processing and much, much more.
* **Pandas**: Offers data processing tools for working with tables.
* **Matplotlib**: Offers tools for generating many different types of plots in 2D and 3D.
* **scikit-image**: Offers tools for image processing.
* **scikit-learn**: Offers statistics and machine learning tools.
* **TensorFlow** and **PyTorch**: Offer deep learning and AI tools.

These packages can be quite big! If you installed Anaconda, then great! You have most of them installed automatically. If you didn't install Anaconda, no problem! It's really easy to install packages. There are two main tools that you'll use:
* `conda` -- available if you've installed Anaconda or miniconda
* `pip` -- generally available, even if not using Anaconda

Let's see how to use each of them!

### Installing Packages using `conda`

The packages available to install in `conda` come from the **Anaconda** repository: https://anaconda.org/. We can search online to find a package that we want to install. For example, if you want to install matplotlib, you can search for **matplotlib** on the Anaconda repository. Then, you have **two ways** to install it:

1. On the command line
2. In a graphical user interface

#### Installing Packages on the Command Line

To install using the **command line**, open up the **Terminal** on macOS or Linux, or the **Anaconda Prompt** on Windows. It's very important to **not** use a Python shell for this. Again, we do this in a terminal, **NOT IN A PYTHON SHELL**.

In general, to install a package with `conda`, at the **command prompt** you would write:
```bash
$ conda install package_name
```

Press enter, wait for it to prompt you, type `y` and hit enter again to install! If you don't want to be prompted, then you can just add `-y` to the command so that it automatically answers "yes" to the prompt for installation.

Sometimes, the package that you want isn't available in the main channel, so you may have to specify an additional option (see [here](https://docs.conda.io/projects/conda/en/latest/commands/install.html) for more details). For example, packages may come from a channel called `conda-forge`, so you would have to specify:
```bash
$ conda install -c conda-forge package_name
```

You can also add additional channels, such as [**bioconda**](https://bioconda.github.io/).

#### Installing Packages Graphically

To install packages in the graphical user interface, you need to open **Anaconda Navigator** and select the **Environments** tab.

![Anaconda Navigator showing the environments](../assets/anaconda_navigator.png)

Anaconda Navigator comes pre-installed when you install Anaconda. On Windows and macOS, it's easy to find. On Linux, you may have to open the command line and run:

```bash
$ anaconda-navigator
```

Once you have the **Environments** tab open, you can search for a package and install it.

#### Using Environments
Anaconda lets you set up **Environments** that let you keep different versions of packages. Sometimes, updates change package features or break compatibility with other packages. For scientific work, you may want to have very specific versions of packages. Good news! Anaconda lets you export a list of all of these package versions to a file. Other users can then easily recreate your setup. The process is discussed briefly in the `conda` documentation [here](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).

At the command prompt (with you `conda` environment active), type the following (after the dollar sign):

```bash
    $ conda env export -f requirements.txt
```

To create a new `conda` environment called `new_env` based on the file, type this:
```bash
    $ conda env create --file requirements.txt --name new_env
```

To activate this new environment, you would type:
```bash
    $ conda activate new_env
```

Environments help you keep multiple versions of packages, and even Python itself, separate.

### Installing Packages using `pip`

What if you don't have Anaconda or Miniconda? Don't worry! Every installation of Python comes with `pip`, official tool for installing packages. `pip` lets you download packages from the official Python Packaging Index (PyPI), found at https://pypi.org/.

To install a package, first you can search for it on PyPI. For example, if we want to install Open3D, which is a package for working with 3D point clouds and models, we can search on PyPI. When you click on the result, it even gives you the code to be able to install the package!


To install packages using `pip`, again you must open the command line. At the prompt, you write:
```bash
    $ pip install package_name
```

When it's done installing, you can use the package!

**Note:** `pip` should come with just about any installation of Python. If you didn't install Anaconda, things may get a bit messy. There are two major versions of Python in use: 2.7 and 3.*. On some operating systems, typing in `python` or `pip` on the command line use Python 2, while you must use `python3` or `pip3` to use the more updated and supported version of Python. When you install Anaconda, you no longer need to deal with this issue.

#### Requirements and Environments

Similar to how we exported a `conda` environment, we can export a list of package versions with `pip`. The package information is stored in a **requirements** file, commonly called `requirements.txt`. To create this file, at the command line, you use `pip freeze`:

```bash
    $ pip freeze > requirements.txt
```

Then, on another computer, if you want to install all these packages, you just write:
```bash
    $ pip install -r requirements.txt
```

You can also create multiple environments using `pip` and `virtualenv`. If you have both `conda` and `pip` installed, the `conda` [documentation](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) recommends trying to install packages with `conda` first. Creating these virtual environments is outside the scope of this workshop, but there's plenty of online documentation that can help.

### Other Installation Tips

Most packages give you information in the **documentation** about how to install them. In practice, you rarely have to search Anaconda or PyPI. Usually, you just need to search for the package, and it will explain how to set it up.

For example, NumPy provides the following [page](https://numpy.org/install/).

Matplotlib provides [this page](https://matplotlib.org/stable/users/getting_started/index.html#installation-quick-start). 

For anyone interested in user interface development, PyQt provides [this page](https://www.riverbankcomputing.com/software/pyqt/download).

**Usually**, the installation instructions are simple, telling you to `pip install` the package. There are a few cases, though, where things are more complicated.

An example is [CuPy](https://cupy.dev/), which allows performing NumPy and SciPy operations on the GPU. This package **does not** work on all systems. It requires an NVIDIA GPU and CUDA, which is not available on macOS. In these cases, it's very important to read the [installation instructions](https://docs.cupy.dev/en/stable/install.html).

### Using Packages and Reading Documentation

We've seen what packages are and how to install them, but now how do we use them?

To use a package, we have to import it, just like we import a module. Often, since we use a lot of functions from a package, we typically want to give the package a shorter name when we import it. Here's the syntax for doing this:
```python
    import package_name as short_name
```

You'll see this commonly for the NumPy package:

In [215]:
# Your code here to import numpy
import numpy as np

Now that we have done this, we don't write `numpy` before all functions. Instead, we write `np`:

In [216]:
my_arr = np.arange(8)

For more details on [NumPy](https://numpy.org/), check out its [website](https://numpy.org/) and [documentation](https://numpy.org/doc/stable/).

Packages can be very big! So, instead of dumping all their code in one module, developers often create additional modules and subpackages. Let's look at [SciPy](https://docs.scipy.org/doc/scipy/reference/index.html#scipy-api) as an example.

To import subpackages, we use the **dot notation**. For example, let's import the `interpolate` subpackage from SciPy.

In [217]:
# Your code here to import interpolate from scipy
import scipy.interpolate

We can also rename an imported subpackage. A common example of this is the `matplotlib` plotting package. We commonly use the `pyplot` sub-package. It would be **really, really** long to keep writing `matplotlib.pyplot` everywhere. Instead, when we import it, we commonly see this line:

In [218]:
# Your code here for importing matplotlib
import matplotlib.pyplot as plt

Here, we've given the subpackage a much shorter name.

So, how do we learn all the awesome things we can do with a package? The answer is **DOCUMENTATION**. Many of these big packages are **extremely well-maintained**. So, they have teams of people who dedicate tons of time to writing documentation to help **you**. All these packages have many different types of information available online:

* API references: provide the detailed information, or *docstrings*, about each function, class, method and constant in the package. Example: [NumPy](https://numpy.org/doc/stable/reference/index.html).
* User guides: introductory material and tutorials telling you how to accomplish common tasks and how to follow common conventions for the package. Example: [SciPy](https://docs.scipy.org/doc/scipy/tutorial/index.html).
* Examples: worked out, sometimes step-by-step, examples of how to use the package, with the code available. Example: [matplotlib](https://matplotlib.org/stable/gallery/index).

All of these resources are here for **you**, so make sure that you use them! When in doubt, **consult the documentation!**

## Exercise: Working with Modules

We've seen how to work with modules and packages. Now, let's do an exercise. We saw that using the `math` module we can perform basic mathematical operations. Well, it turns out there's also a built-in `statistics` module that we can use to perform basic stats. The documentation for this module is found [here](https://docs.python.org/3/library/statistics.html#module-statistics). In this problem, I've given you code that generates a list of DNA sequences. Count the number of `C` nucleuotides in each and compute the mean and sample standard deviation of these values.

In [219]:
import random

nucleotides = ["A", "T", "C", "G"]

# Don't worry about this part :-)
random_sequences = [
    "".join(
        random.choices(nucleotides, weights=[
            2 * (i ** 2) - 3 * i, 2 * (i ** 2), i**2 - i, i ** 2 - 4 * i
        ], k=80)
    ) for i in range(1, 101)
]

print("Random sequences are:")
print("\n".join(random_sequences))

# Your code here
c_counts = [seq.count("C") for seq in random_sequences]

import statistics

print(f"The mean C count is {statistics.mean(c_counts)} and the standard deviation is {statistics.stdev(c_counts)}.")

Random sequences are:
AAAATTATTAATATTAAATTTAATTAATATATTTTTATAAAAAATATTTTATAATATTTTATTAAAAATTTTTAAATTTT
ATATTATATTATAATTAATTATATTTATATAAATTATTAAAATTTATTTTATAAATTTATTATTTTTTTTTTATTTTATT
CATTTATATCCATTTTATTCTTTTTTTTACTCCCTAAACTTCTTATTTTCATCTTAATTTTTTTTTACTATCTATTAATT
ATTATATACTTTCTACTTATATTTACTTTCCAAAACTACTCAAAATCTTACTTAACTATATTTTTTCTTATTATATTTTA
CAATCCCCTTTTATTAACAATATATGAATTTTGATTAAATCTCATTCAATATTGCAATATCACTCCTCTATCATTTTAAT
TTTAGTAAAATTTTAACAGATAATAATTCATATAGGTCTAATTACACTTTAACAATATAATGTATTTATTAGGCAATACT
ACATACAAAGTCTAACATTAAGTCGGTTATCCTATATAGAATATGCAAGTTCTTTATAATCACTACCTTAATGTTTGAAG
AACTCGTTTCATAAGTGCGTCTAAATTTACAGTTATTTTTTTCTTGCTTTTGTTTTCGAGTGAGTATTCAACTACAATTT
CCTCACTAAAACTTATTCACATCAATAGCTATTTTTGATCGCTGATCGTACCGTCCAATACAACGAAATACTTACATCCT
AGAAAATGCAATACACTTAAAATGCTAAGTGAGATTTTAGTCTTAAAAGAAGTTTCCTAAATTTTAATAATTAATGCTAT
ATTATTTCATGAAATGCACTAAATTACGATTAATGATTCATCAAGGCAAAGCTTGAATCCCCCAACATTCATGTTAATTG
TGGCCGATCATACATGTACTCACTGAATAACTCACTGGGTACCACCCAACACACACCATATAGAAAGACTCATGTTCTGG
AAATGA

## Module Summary

Congratulations! Another module done! Here are the main points we saw in this module on modules:
* Python code is organised into **modules** that we can easily **import** into our own code to use.
* We can import **an entire module** or we can import **specific functions and constants** to accomplish certain tasks.
* Not all modules we need come installed with Python. We can install **packages** using `conda` or `pip` to get even more functionality.
* We can easily **import** packages into our code to use their added functionality.
* Many of these packages have **lots of documentation** that provides **reference, tutorials and examples** on how to use these packages.

Now you can both write your own code and use code from existing modules and packages!

# Module 5 - Where to Go From Here

We're just about at the end of our workshop! Over the course of these few hours, we've seen the basics of variables and numbers, Booleans and strings, as well as more complicated collection types. We've also seen how to use built-in modules and install packages to gain extra functionality.

So... what comes next?

## What to Learn Next? How?

What great questions? Well, there are still a bunch of topics that I didn't cover today. We talked a bit about **functions**. Well, you can actually **write your own**. Functions are very helpful when you need to repeat certain tasks and you want to have building blocks.

You can also write your own **classes** and define **new types of objects**. Through **object-oriented programming**, you can easily represent the world around you in code.

How to learn all of these? There are plenty of resources out there. Keep your eyes open for other workshops! And check online for tutorials and videos. I'll talk a bit more about these soon.

## How to Get Help and How NOT to Get Help?

When writing code, there are a bunch of resources that can help you!

### Your Code Editor

Yes! That's write! The software you're using to write code can give you lots of help. It can suggest completions and tell when there are errors and even help you reformat your files and restructure your code. So, please, please, please, **DO NOT** write your code in a simple text editor that has not additional features. And ***PLEASE*** don't use a word processing software. Use software that is made for coding!

### Documentation

I mentioned this one earlier. Big projects have big documentation. Take a look at their guides for getting started. For example, [Pandas](https://pandas.pydata.org/) has a [10 minutes to pandas](https://pandas.pydata.org/docs/user_guide/10min.html) tutorial. Use these resources! If you want to learn how to use a function, **look it up** and read the paragraph about it. It will tell you how to use the arguments, any quirks to expect, and in some cases it will give you references about the papers behind the function. This is especially true in image processing and other fields that rely heavily on algorithms. So, the documentation will tell you not only how to use the code, but also **where it comes from**. And make sure to check out the Official Python docs at https://docs.python.org/3/.

### Books

Books, books, books! There are tons! And tons of books out there! For example, there are a couple that are free online:
* *Think Python 2e* by Allen B. Downey (FREE book): https://greenteapress.com/wp/think-python-2e/
* *Data Structures and Information Retrieval in Python* also by Allen B. Downey (FREE book): https://greenteapress.com/wp/data-structures-and-information-retrieval-in-python/

Through the databases at the McGill Library, we also have access to lots of books **for free**. Check out the library's online catalogue to see more.

### Tutorials

Tutorials are also great! And very much abundant! From more formal ones on sites like [freeCodeCamp](https://www.freecodecamp.org/) and [W3Schools](https://www.w3schools.com/python/default.asp) to less formal ones on [DEV](https://dev.to/), you can get lots of insight from these. There are also lots posted on Medium that you can check out. In addition to text-based tutorials, there are also videos on YouTube. And don't forget the official tutorials in the documentation! Tutorials are a very valuable resource that can help you see how to put pieces of code together in real-world examples.

### Stack Overflow (and Pitfalls)

If you have a Python question, chances are that someone, somewhere has asked it on [Stack Overflow](https://stackoverflow.com/). Stack Overflow is a **great** resource for finding answers to real questions about programming. **But** make sure that you're using it properly. Try the other resources before going to Stack Overflow. The answer may turn out to be on the documentation page for the function you're looking for. If there's a link to the docs in a Stack Overflow answer, **use it**. Check out in more detail. Make sure that you understand the code that you're about to add to your project and don't just copy-paste it. Coding is a thinking game. Make sure that you have thought about all the code that you're putting in and that you understand why it's there. And use your judgement and intuition when borrowing that code. If it looks sketchy, it could very well be sketchy and there may be a better way.

### ChatGPT (and Pitfalls)

Everything I said above about Stack Overflow. And more. Answers on Stack Overflow are written by humans who have written the code, tested it, and run the results. Be careful when using ChatGPT for code (if you're allowed to at all). Make extra sure that it makes sense, and test it. Don't just trust it because AI wrote it for you. You need to make extra sure that it actually makes sense and runs properly, because you don't have that same guarantee that a human has used this exact code in their own experience. Use your coding judgement and intuition.

Again, ALWAYS remember to **read the documentation**. Often, if you're stuck, the answer is **right there**. If it's not, then it's probably on Stack Overflow. It's often a good idea to check the documentation **first** to see if there's an official explanation or an official example. And don't just copy a Stack Overflow answer or sample code. Think about what the code is doing. Does it make sense? Is there a better way? Try to look line by line to understand what is going on (play around in the IPython interpreter or in a Jupyter notebook!).

## Other Cool Programming Topics

So, I talked a bit about functions and classes, but there's much more that you can look into to help build your programming skills and write code that others will want to use.

### Writing Packages

We've seen how to install and use packages. But, you can also **write your own packages**. There are many great resources online about writing packages. The one that I most recommend is [this free online book](https://py-pkgs.org/): *Python Packages* by Tomas Beuszen and Tiffany Timbers. It's an easy read and helps you learn not only how to organise your code, but how to publish it, too. The authors also walk through how to render your own nice-looking documentation and host that online.

### Object-Oriented Programming

Writing code with loops and control flow is fun, but it's even better when we can combine everything into functions and classes and work in an **object-oriented** manner. This paradigm helps you organise your code differently, constructing building blocks that can work together to build elaborate programs.

### Developing Graphical User Interfaces

Jupyter notebooks and command line scripts are powerful, but they aren't accessible for people who don't know how to code. Solution: build a graphical user interface! Using PyQt, the process is quite straightforward. Check out [this online tutorial series](https://www.pythonguis.com/) by Martin Fitzpatrick to learn about developing GUIs in Python.

### Hosting Projects on GitHub

What fun is a project if other people can't use it? By hosting your project on GitHub, you let others easily contribute to your project and build on it. Learning Git and GitHub are essential! And so are a few other skills along the way, like writing documents in Markdown. MiCM often has Git and GitHub workshops, so check out their workshop schedule!

## The End

We've reached the end of our workshop! For those of you who have previous programming experience, congratulations on adding another language to your repertoire. For those of you who are new, welcome to the world of programming! Just remember, programming is like art: you start with an empty text file and soon enough, you have hundreds (or thousands) of lines of code!

Don't hesitate to reach out if you have any further questions. Happy coding!

In [220]:
from time import sleep


print("Good luck with your programming future!", end=" ")

i = 1
s = "/-\|"

print(s[0], end="")

while i < 10:
    print("\b" + s[i % len(s)], end="")
    i += 1
    sleep(0.5)

print("\b🎉")

Good luck with your programming future! 🎉
