# Intermediate Python

MiCM Workshop - November 1, 2024

Benjamin Z. Rudski, PhD Candidate, Quantitative Life Sciences, McGill University

Dear `Reader | Workshop Attendee`,

Welcome! In this interactive Jupyter notebook, we will explore intermediate-level skills in the Python programming language. We'll cover material from functions and data classes to processing and visualising data using popular libraries. This workshop assumes that you have a basic knowledge of Python. If you don't, feel free to check out some beginner resources. In a shameless self-promotion plug, you may find my [Intro to Python workshop]() helpful. We'll also review some of the core concepts before diving into new material, but I encourage you to make sure you're comfortable with the basics before tackling the new material.

This workshop contains various exercises to help you practice the material. This notebook is the **student version**, which contains several blanks where I will write code during the workshop and where you can fill out exercises. There is a **solution version** in the [`solutions`](../solutions/) folder. I recommend trying the exercises yourself before looking at the solutions. There is often more than one way to answer a programming question, so you should focus more on understanding the code that you are writing, instead of just copying my answers. You may come up with an answer better than the one I've provided!

# Table of Contents

1. Module 1 – Getting Up to Speed (10 minutes)
    1. Quick Review
        1. What is Python?
        2. Key Ideas and Syntax
2. Module 2 – Introduction to Functions (45 minutes)
    1. Function Overview
        1. What is a function?
    2. Writing Custom Functions
        1. Basic function definitions
        2. Passing inputs: Defining parameters
        3. Producing outputs: Return values
    3. Documenting Functions
        1. Defining function docstrings
        2. How to get help from your IDE: Type annotations (optional)
    4. Exercise: Writing functions for biological sequences.
3. Module 3 – Modules and Packages (45 minutes)
    1. Using Modules
        1. What is a Module?
        2. Importing a Module
        3. Importing Specific Functions
    2. Package and Environment Management
        1. What is a Package?
        2. Installing Packages using conda
        3. Installing Packages using pip
        4. Using Packages and Reading Documentation
        5. A Brief Intro to Environments
    3. Exercise: Using `textwrap` to nicely print DNA sequences.
4. Module 4 – Where to go from here (10 minutes)
    1. What to learn next? How?
    2. How to get help and how not to get help
        1. Your code editor
        2. Documentation
        3. Books
        4. Tutorials
        5. Stack Overflow (and pitfalls)
        6. ChatGPT (and pitfalls)
    3. Glimpse of other cool programming topics

# Learning Objectives

By the end of this workshop, you'll have the skills necessary to:

1.	Define and call new functions.
2.	Import and use code from built-in Python modules.
3.	Install new packages to access even more tools.

Ready? Let's dive into the material!

# Module 1 - Getting up to Speed

In this module, we make sure that everyone is on the same playing field in terms of basic Python knowledge. This section is meant to **briefly** cover the important concepts. It does **not** replace an introductory course.

## Quick Review

In this quick review, we explore two main questions:

1. What is Python?
2. What are its key ideas and important syntax elements?

### What is Python?

Python is a **free and open-source**, **object-oriented** and **interpreted** programming language. Here's a bit more detail:

* **Free and open-source:** Anyone can download, install, use, copy, modify and redistribute Python. Everyone can see how Python is developed and even contribute to the project.
* **Object-oriented:** Everything in Python is represented as an **object**: a grouping of data, called **attributes**, with behaviour, called **methods**.
* **Interpreted:** There's no need to compile Python code. Python runs your code one line at a time, so it's easy to make changes to your code.

For more information about Python, make sure to check out the official Python website at https://www.python.org.

### Key Ideas and Syntax

Here, we do a lightning review of important concepts, with some illustrations.

#### Variables
**Variables** contain pieces of data. A variable has a name, which may only contain letters, numbers and underscores. A variable name may **not** start with a number.

Python has no constants.

Variables are simply assigned using the `=` sign:

In [None]:
my_variable = 5

#### Data Types

Python has the following commonly-used data types:

##### Primitive Data Types

* `int` - Integer, representing a whole number. For example: `5`.
* `float` - Floating point number, representing a decimal number. For example: `4.0`.
* `bool` - Boolean value, `True` or `False`.
* `str` - Text string. For example: `"Hello, World!"`.

##### Collection Types
* `tuple` - Tuple, containing a small fixed number of values, not necessarily of the same type. Enclosed in round brackets. For example: `(404, "Not Found")`.
* `list` - List, containing a variable number of elements, typically of the same type. Enclosed in square brackets. For example: `["T", "R", "M", "A", "Q"]`.
* `dict` - Dictionary, containing key-value pairs. Enclosed in curly brackets (braces). For example: `{"purines": ["A", "G"], "pyrimidines": ["C", "U", "T"]}`.

To find out the type of a variable, we can use the `type()` function.
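
As a quick illustration, here's a minimal sketch that calls `type()` on one value of each type listed above. The variable names `examples` and `type_names` are made up for this example.

```python
# One example value for each commonly-used type (names are illustrative only)
examples = [5, 4.0, True, "Hello, World!", (404, "Not Found"), ["T", "R"], {"purines": ["A", "G"]}]

type_names = []
for value in examples:
    # type() returns the class of the object; __name__ gives its short name
    type_names.append(type(value).__name__)

print(type_names)
```

Each entry in the printed list is the short name of the type of the matching value in `examples`.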

#### Control Flow and Loops

Control flow and loops let us change the running of the code, allowing some lines to run, but not others, and allowing lines to be repeated multiple times.

##### Control Flow - The `if` Statement

```python

if some_boolean:
    run_some_code()
    ...
elif some_other_boolean:
    run_other_code()
    ...
elif yet_another_boolean:
    run_other_other_code()
    ...
else:
    run_code_if_no_match()
    ...

run_code_regardless()
```

The `if` clause is the only required part. You can add as many `elif` clauses as you want (within reason), and at most one `else` clause. The most important thing to remember is to **indent**.
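
To make the template above more concrete, here's a small sketch with real conditions. The variable names and the thresholds are invented for illustration and have no biological meaning.

```python
# Hypothetical number of sequencing reads (made-up value)
n_reads = 120

if n_reads == 0:
    coverage = "no data"
elif n_reads < 100:
    coverage = "low"
elif n_reads < 1000:
    coverage = "moderate"
else:
    coverage = "high"

# This line is not indented, so it runs regardless of which branch matched
print(coverage)
```

Python checks the conditions from top to bottom and runs only the first branch that matches.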

##### `while` Loops

```python

while some_boolean:
    run_some_code()
    ...
    update_boolean()

run_other_code()

```

Again, notice the indent. The block is done when the indent changes.

**Warning:** (almost) never put a pure boolean value in as `some_boolean`. Instead, put an expression that relies on a value that gets updated in the loop.
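
Here's a minimal sketch of a loop that follows this advice. The condition depends on `value`, which changes on every pass, so the loop is guaranteed to stop. The names and the starting number are arbitrary.

```python
# Count how many times we can halve a number before it drops below 1
value = 100.0
n_halvings = 0

while value >= 1:       # the condition depends on `value`...
    value = value / 2   # ...which is updated inside the loop
    n_halvings += 1

# The indent has ended, so this runs once, after the loop finishes
print(n_halvings)
```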

##### `for` Loops

```python

for variable in some_iterable:
    run_some_code()

run_other_code()
```

The key idea here is that the value of `variable` changes each time the loop runs.
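
For instance, in this short sketch (with a made-up list of codons), the variable `codon` takes on each element of the list in turn:

```python
codons = ["ATG", "GCC", "TAA"]
first_bases = []

for codon in codons:
    # On each pass, `codon` holds the next element of `codons`
    first_bases.append(codon[0])

print(first_bases)
```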


Now that we've seen the key concepts, let's jump into our material!

# Module 2 - Introduction to Functions

In this module, we'll explore functions. By now, you've almost certainly used existing functions, like `abs` or `round` or, of course, `print`. Here, we'll see not only how to *use* functions, but also how to *define* them.

Here's the outline for this module:

1. Function Overview
    1. What is a function?
2. Writing Custom Functions
    1. Basic function definitions
    2. Passing inputs: Defining parameters
    3. Producing outputs: Return values
3. Documenting Functions
    1. Defining function docstrings
    2. How to get help from your IDE: Type annotations
4. Exercise


## Function Overview

We hear this word a lot... *function, function, function...*, but what does it mean? Let's find out!

### What is a Function?

We can think of functions as **machines** that take in **inputs**, run code (do calculations, magic or a bit of both), and then produce an **output** that can be used.

The inputs are known as *parameters* or *arguments* and the outputs are known as *return values*.

Here's a diagram to illustrate this.

![Function as a machine](../assets/function/Function.png)

Like anything in Python, a function has a **name**. To run the function, we must **call it** by writing its name, and then including the arguments in brackets.

**Remember! Even if the function has no arguments, you must put the brackets!**

If the function **returns** a value, we can store it in a variable using the typical `=` assignment.

Python has a number of [built-in](https://docs.python.org/3/library/functions.html) functions. Let's explore one of them. Let's call the [`round`](https://docs.python.org/3/library/functions.html#round) function:

In [None]:
# Your code here


We can learn more about any function using the built-in `help` function:

In [None]:
# Your code here


This help documentation, known as a **docstring**, tells us important information about the function. It describes the parameters and return values, as well as any quirks that the function may have.

In addition to using the `help` function, we can also read the docstring online, at the official Python documentation: https://docs.python.org/3/library/functions.html#round.

## Writing Custom Functions

Now that we've reviewed what functions are and how to *use* them, let's dive into **defining** our own.

### Why write our own functions?

It's all good and fun to write all the steps you want to do line-by-line. But, let's say you want to run the same set of steps multiple times, potentially on different inputs. You could just copy-paste the code... but what happens if you have to change it? You'll have to change all the copies!

Instead of copying the code, we can write new **functions**.

### A Bit of Syntax

In Python, functions are defined using the `def` keyword. The syntax is:

```python

def function_name(argument1, argument2, argument3, ..., argumentN):
    """
    documentation here
    """

    your_code_here...

    return some_value

```
Here are the important elements to notice when **defining** a function:

* The function definition begins with the `def` keyword. This is similar to the `function` keyword in JavaScript, or the `func` keyword in Swift.
* The **function name** follows the same rules as variable names. There are different naming conventions for names that consist of multiple words (`snake_case` vs `camelCase`). By common convention, the function name starts with a **lowercase** letter.
* After the function name, you can include a list of parameters in parentheses. **If your function takes no arguments, you must still put the brackets.** Each argument in the list must have a valid variable name. We'll discuss these in more detail later.
* After closing the argument list bracket, we put a **colon** (`:`).
* After the first line, we must **indent**. This tells Python where the function body begins and ends.
* We can start the body with a **docstring**, which describes the function. We'll discuss these more later.
* Then, you write your code as normal. In this function body, treat the arguments **like normal variables**.
* To **output** a result that can be used later, use the keyword `return`, followed by the result. We'll discuss this more later.
* Once you've finished defining the function, simply stop indenting. There's no need to close any brackets or type `end`.

To demonstrate, let's write a function with no arguments that simply prints a string onto the screen:

In [None]:
# Your code here


Wait! What happened? Or well, what didn't happen? We didn't see any string... What's going on?

Well, we only **defined** the function. To actually run the function we must *call* it. To call a function, simply write the name of the function, followed by the desired arguments in brackets. **If the function takes no arguments, you must still type the empty brackets.**

Let's call the function we just defined:

In [None]:
# Your code here


### Function Parameters

This function worked, but we didn't really put the *fun* in *function*.

We said that a function takes input and produces output... This does neither!!! So, let's create a function with some parameters! Let's look at the specific syntax:

```python

def my_function(arg1, arg2, arg3, ..., argN):
    my_code...

```

We separate each parameter using **commas (,)**. We can then refer to these as variables in the function body. In this case, in the function body's code, you can refer to `arg1` just as you would any other variable.

As an example, let's write a function that takes a DNA sequence as input and prints the transcribed RNA. To make it more interesting, let's add an extra parameter that indicates whether we are considering the sequence to be on the template strand or not. For simplicity, let's ignore the directionality of DNA.

Remember, if the DNA is on the template strand, we must perform base-pairing!

In [None]:
# Your code here


And now, let's call this function using a specific sequence.

In [None]:
my_sequence = "AATTAGCGAGCCGAATATATAGCCGCGATTCAGACAGTTCCAGCGCA"

# Your code here


This works well! Except, what if most of the time, we're going to call the function on the template strand? It would be nice if we didn't have to specify this argument every time we call the function and if we could give it a default value.

#### Keyword Arguments

Good news! We can set default values for function parameters. Parameters with a default value are known as *keyword* arguments, and parameters without one are known as *positional* arguments. To specify the default value, simply assign it with `=` in the function definition:

```python

def my_function(my_positional_arg, my_kw_arg=default_value):
    ...

```

Let's extend our transcription example to set a default value for the `is_template_strand` parameter:

In [None]:
# Your code here to modify the function


So, now we can call the function without having to specify a value for the second parameter:

In [None]:
# Your code here to call the function


There are a few **important rules** to remember about positional and keyword arguments:

1. Positional arguments **always** come first, both when defining and when calling functions.
2. When calling a function, you **must** include **all** positional arguments, but you can omit keyword arguments (since they have default values).
3. Keyword arguments can be passed in **any order**, but positional arguments must be kept in the same order.

Here are some more technical notes:
1. **Any** argument can be written in keyword argument form when calling a function, but if you write a positional argument in keyword form, **all** subsequent arguments must be written in keyword form.
2. **Any** argument can be written in positional form when calling a function, but **all** preceding arguments must be written in this positional form as well.
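
These rules are easier to see in action. Here's a small sketch using a hypothetical function, `describe_sample`, invented purely for this illustration. It has two positional arguments and two keyword arguments.

```python
def describe_sample(name, volume_ml, buffer="PBS", temperature_c=4):
    """Build a one-line description of a (hypothetical) lab sample."""
    return f"{name}: {volume_ml} mL in {buffer} at {temperature_c} C"


# Only the positional arguments: the keyword arguments use their defaults
a = describe_sample("S1", 1.5)

# Keyword arguments can be passed in any order
b = describe_sample("S1", 1.5, temperature_c=20, buffer="water")

# Positional arguments can also be written in keyword form
c = describe_sample(name="S1", volume_ml=1.5)

# A keyword argument can be written in positional form,
# as long as everything before it is positional too
d = describe_sample("S1", 1.5, "water")

# NOT allowed: a positional-form argument after a keyword-form one.
# Uncommenting the next line gives a SyntaxError.
# describe_sample(name="S1", 1.5)

print(a)
print(b)
print(c)
print(d)
```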

### Function Return Values

So, we've seen how to pass information into functions, but now, how do we get information out? The answer is **return values**. These return values let us capture the result of a function, which we can then use like a normal variable in code. To return a value, we simply type `return` followed by the value we want to return.

Here's the syntax:
```python

def my_function(...):
    ...

    my_result = ...

    ...

    return my_result

```

Let's now switch our previous transcription function to *return* the mRNA instead of simply printing it:

In [None]:
# Your code here to modify the function to return a result


So, this is how to return the value. Now, let's see how to capture and use it. To capture the value, we simply assign it to a variable, like normal, using the equal sign `=`.

In [None]:
# Your code here


**Note:** If your code has multiple branches, you can put multiple return statements in your code. **But**, once your code reaches the `return` line, the function **stops** and returns to the code that called it. Any code that you've written after the `return` statement **will not run**.

Let's just repeat that again: **Code underneath a `return` statement WILL NOT RUN.**

If you're using a good code editor, it will give you a warning about this "dead code".
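
Here's a minimal sketch of a function with several `return` statements, including one deliberately unreachable line. The function `sign` is just an illustration.

```python
def sign(number):
    """Describe the sign of a number as a string."""
    if number > 0:
        return "positive"
    if number < 0:
        return "negative"
        print("This line is never reached")  # dead code: it sits after a return
    # We only get here if neither branch above returned
    return "zero"


print(sign(3))
print(sign(-2))
print(sign(0))
```

Notice that the `print` call inside the function never runs, no matter which input we pass.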

We can also return *multiple* values using tuples, lists or dictionaries. For example, let's say we want to count the number of each type of nucleotide in a sequence of DNA:

In [None]:
# Your code here


Now, let's run this code on our example sequence:

In [None]:
# Your code here


This is great! But let's say you get this function from someone else to import and use in your own code. You don't want to have to find this function and read all the code just to use it... But, how do we know what parameters this function takes and what values it returns...

## Documenting Functions

The answer to this question is **documentation**. Remember how we looked at the `help` for the `round` function earlier? We can do the same thing for our custom functions!

### Defining Function Docstrings

When defining a function, we can provide a *docstring*, which describes the important information about a function in a **human-readable** form. The docstring is just a string that a person can read to learn more about a function. If you're using a code editor or IDE, like VS Code or PyCharm, this string appears when you hover your mouse over a function. The information contained in this docstring can include:

* A brief description of the function.
* A longer description of the function. If you're implementing an existing approach, it could be good to include a citation here. You can also include equations here.
* A description of the function parameters, including their types.
* A description of the function return values, as well as their types. This is especially useful if you are returning multiple values and need to include their order.

Let's clarify our previous example by adding a docstring:


In [None]:
# Your code here to add a docstring to our function


Now that we have a docstring, we can actually read it using the `help` function!

In [None]:
# Your code here to look at the help for count_nucleotides.


While there are not many rules for how to write docstrings, there are some guidelines laid out in the Python documentation in [PEP 257](https://peps.python.org/pep-0257/). There are also a number of common conventions used. One is the **numpydoc** style, which is used by the developers of the NumPy project. This style is described online [here](https://numpydoc.readthedocs.io/) and is integrated into some code editors.
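
To give a feel for the numpydoc style, here's a sketch of a docstring written for a hypothetical function, `gc_content`, which is not part of this workshop's code. The section names (`Parameters`, `Returns`) and the dashed underlines follow the numpydoc convention. The exact wording is up to you.

```python
def gc_content(sequence):
    """
    Compute the GC content of a DNA sequence.

    Parameters
    ----------
    sequence : str
        DNA sequence written with the upper-case letters A, C, G and T.

    Returns
    -------
    float
        Fraction of bases that are G or C, between 0 and 1. Returns 0.0
        for an empty sequence.
    """
    if len(sequence) == 0:
        return 0.0

    n_gc = sequence.count("G") + sequence.count("C")
    return n_gc / len(sequence)


print(gc_content("ATGC"))
help(gc_content)
```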

### How to Get Help from your IDE: Type Annotations (Optional)

So, this is great for making it easy for other people to read... But, the docstring is just a string. The code editor doesn't understand it and can't give us suggestions based on it. But, we can get this extra help using **type hints**.

**Type hints** are a relatively recent addition to Python. They allow us to *explicitly* indicate the types of function parameters, return values and any other variable. That way, the code editor can tell us if we've passed the wrong type of value somewhere, or even give us suggestions as we type.

#### A quick refresh on types

Everything in Python has a type. When you have a string of text, such as `"Hello, world!"`, it is a Python *object* of type `str` (string). If you have an integer number, like `4`, it is an object of type `int`. If you have a list, it is an object of type `list`. Hopefully, you get the idea by now.

In many cases the type can be inferred. For example, if you write
```python

x = 5

```

Then you know that `x` has type `int`. You can check the type of a variable `x` by running:
```python
type(x)
```

We'll see later how to create new types.

Let's see a few more examples of types:

In [None]:
# Your code here


#### Type Hints

Although the type was implied in the earlier cases, we can also explicitly indicate the type of different variables. We do this by adding a colon and the name of the type after the variable name. For example:
```python

x: int = 6
y: str = "world"

```

This extra bit that we add is known as a *type hint*. It gives a hint to the reader and the code editor about the specific type of the variable.

**Note:** Type hints are simply annotations. They *don't* actually change the type of the object. If you want to convert from one type to another, you need to call a conversion function, such as `int()` or `str()`.
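
Here's a short sketch of what this means in practice. The hint on `count` is deliberately wrong, to show that Python itself ignores it when the code runs. A code editor or type checker would likely flag this line.

```python
# The hint says int, but the value is a str. Python runs this without complaint.
count: int = "42"
print(type(count))  # the object is still a str

# To actually get an integer, we have to convert explicitly
converted = int(count)
print(type(converted))
```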

#### Type Hints and Functions

There are certain operations that we can only perform on certain types. For example, you can subtract two `int`s or two `float`s, but you can't subtract two `str`s. In your function, you perform operations on the arguments that are passed in. These operations often make assumptions about the **type** of the arguments. With type hints, you express clearly for the computer, as well as your code editor, what exactly those assumptions are.

To add type hints to the **parameters**, we just repeat the above syntax with the colon and the type name after the parameter names.

We can also use type hints to indicate the **return type**. After defining the parameters, but **before the colon**, we can put an arrow `->` followed by the return type.


Let's add type hints to our earlier transcription code:

In [None]:
# Your code here to add type hints


You may not be entirely convinced yet... But, if you're using a code editor like **Microsoft Visual Studio Code** or **PyCharm**, you'll get error highlights if you try to call the function with the wrong type of argument.

For collection types, like `list` and `tuple`, we can also modify the type hint to give information about the *contents*. We put the type contained in the collection in **square brackets** after the collection type name. For example, you can denote a list of strings as:

```python
my_list: list[str] = ["Hello", "World"]

my_float_tuple: tuple[float, float] = (3.2, -5.7)
```
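
The same notation works in function signatures. Here's a sketch with a hypothetical function, `min_max`, which takes a list of floats and returns a tuple of two floats. This bracket notation on the built-in `list` and `tuple` types assumes Python 3.9 or newer.

```python
def min_max(values: list[float]) -> tuple[float, float]:
    """Return the smallest and largest values in a non-empty list."""
    return min(values), max(values)


# We can unpack the returned tuple into two variables
smallest, largest = min_max([3.2, -5.7, 1.0])
print(smallest, largest)
```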

Another cool thing is that in some editors, if you hover your mouse over the function name, when you see the function header, you see the types! So this will help you get a better idea of what types you need.

## Module Summary

In this module, we've explored **functions**. Specifically, we've seen:

* What functions are and how to **call** them.
* How to **define new functions**, which take in **parameters** and **return** results.
* How to **document** functions using **docstrings** and **type hints (optional)** to make them easier to understand and reuse.

Now, let's take a look at our exercise!

## Exercise: Writing Functions for Biological Sequences

Proteins are composed of sequences of amino acids, arranged in polypeptide sequences. There are 20 common amino acids, which have different properties. We'll focus on polarity and charge. Amino acids are grouped into four categories:
1. Non-polar
2. Polar
3. Acidic
4. Basic

Let's write a function called `compute_amino_acid_properties` that takes a peptide sequence and returns the number of amino acids falling into each category. I've given you a dictionary with the amino acids and their properties as a starting point (obtained from [Wikipedia](https://en.wikipedia.org/wiki/DNA_and_RNA_codon_tables)).

In [None]:
AMINO_ACID_PROPERTIES = {
    "NON_POLAR": ["F", "L", "I", "M", "V", "P", "A", "W", "G"],
    "POLAR": ["S", "T", "Y", "Q", "N", "C"],
    "ACIDIC": ["D", "E"],
    "BASIC": ["H", "K", "R"]
}

# Your code here


In [None]:
# Here's an artificial amino acid sequence to test with:
test_peptide = ("EDEQLPAMFYDHSRMGQDCTIQYRAFFKFKCDEVVICPRMCRFDM"
                "GYLSCNWPDQWQFWPPNPHTDSTWVSLDYPLRWDCCRKPHTFEPY"
                "TMHASWCTERDPDIWACIKDSWMSPFEPQGSWGSTELVKEDPGFF"
                "SVFALRPCVWAAPTT")

test_peptide_properties = compute_amino_acid_properties(test_peptide)

print(test_peptide_properties)

# Module 3 - Modules and Packages

We've now seen that we can write functions to package up repeatable tasks and call them on different inputs. We also saw that there are a number of built-in Python functions.

But, what about doing more complicated tasks? Let's say you need to generate random numbers, or compute statistics, or calculate sines and cosines, or work with matrices and generate nice plots? How are you going to do that?

In practice, more complicated code will be wrapped up in **functions** that other people have written. The good news is that in Python it's **very easy** to use code from other people. In this module, we'll talk about how Python code is arranged and how you can **import** code and use it as if you had written it yourself. Here's the outline for this module:

1. Using Modules
    1. What is a Module?
    2. Importing a Module
    3. Importing Specific Functions
2. Package and Environment Management
    1. What is a Package?
    2. Installing Packages using conda
    3. Installing Packages using pip
    4. Using Packages and Reading Documentation
    5. A Brief Intro to Environments
3. Exercise: Using `textwrap` to nicely print DNA sequences.

## Using Modules

Python code is organised in *modules*. But wait???? What's a module? I'm glad you asked...


### What is a Module?

Simple answer: a **module** is a file.

That's it.

Any time you create a new Python file and assign it a name that ends with `.py`, you've created a module. If you share this file with someone else, they can use your code in their own files without having to copy-paste it. We'll see the details in a bit.

So, what does this module look like? Usually, it contains a bunch of different code:
* **Functions**: bits of repeatable behaviour to simplify tasks.
* **Classes**: code that defines new types of objects.
* **Constants**: variables that have important pre-determined values, like $\pi$.

All of these are also typically accompanied by **documentation**, which explains how they work, what you can do with them, and how you can use them. This documentation is just a series of docstrings from the module file.

This will become clearer in a bit. First, it's important to know that Python comes with **a lot** of built-in modules. You can see a list [here](https://docs.python.org/3/py-modindex.html).

We *can* also run Python code to see what modules we have available. **BUT!!! This code may take some time to run, especially if you installed Python using Anaconda! So, think twice before running this line!**

In [None]:
# help("modules")

This list doesn't only include built-in modules, but also those that you've installed from other packages. We'll talk about this later.

### Importing a Module

To use code from a module, we have to **import** it. Importing the module tells Python that we want to access its contents and use them in our code.

To import a module so that we can use it in our code, here's the syntax:
```python
    import module_name
```

If you're importing code you've written yourself, then the `module_name` is just the name of your file, without the `.py` extension. Module names follow the same rules as variable names, so if you want to be able to import your code in a different file, you must give it a name that is valid.

Let's do an example. Let's import the `math` module, which provides basic functions for performing more complicated mathematical operations:

In [None]:
# Your code here to import the math module

import math

Great! We've imported the module! That's our first step done. The next step is to **read how to use the module**. We have two ways to do this:
1. Go to the website to read the documentation.
2. Use the `help` function in Python.

If we use the `help` function, we can read the help right from Python without having to search the internet! The downside is that the `help` function is entirely text-based, so there are no pictures and it's harder to navigate. **Usually, I look at the online help.**

In [None]:
help(math)

Now, let's look at the [online documentation](https://docs.python.org/3/library/math.html#module-math).

We've seen that there are a lot of functions available to do things like compute square roots and trigonometric results. Let's try to use some of the functions! Let's try to compute the sine and cosine of 180°. We expect to find the following:

$$
\begin{align*}
    \sin 180^\circ &= 0\\
    \cos 180^\circ &= -1
\end{align*}
$$

Let's try to compute these values in code now:

In [None]:
# Your code here... Compute sines and cosines using `math`


Wait! Hang on! That's not right! What's going on??? Well, the answer is in the documentation. We can call the `help` function on specific functions!

In [None]:
# Your code here to get help on math.cos


Aha! The angle has to be in radians! So, we need to convert the angle to radians first! We can do this manually by doing $\textup{radians} = \pi/180 \times \textup{degrees}$ or... we can use **another function** from `math`!

In [None]:
# Your code here


In the first solution, we see an example of using a **constant** (well, actually a variable) from a module.

*Note:* You may be thinking... Hang on! The value of `sin(180°)` didn't come out to zero. Well, it's something very small due to problems representing decimal numbers on a computer. So, for our intents and purposes, we can say $1\times 10^{-16} \approx 0$.
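
If you need to check whether such a result is "close enough" to zero in code, one option is `math.isclose`. This is a minimal sketch, and the tolerance `1e-9` is an arbitrary choice for illustration. We pass `abs_tol` because the default relative tolerance can't succeed when comparing against exactly zero.

```python
import math

result = math.sin(math.radians(180))

# Exact comparison fails because of floating-point round-off
print(result == 0)

# Comparison within an absolute tolerance succeeds
print(math.isclose(result, 0.0, abs_tol=1e-9))
```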

You may be thinking in that last example that we've had to write `math` a lot! We had to write `math.sin` and `math.cos` and `math.radians`. Can't there be an easier way??? Turns out, there is!

### Importing Specific Functions

Sometimes, we don't want to import an entire module. We may want to just import a specific function. For this, the syntax is:
```python
    from module_name import function_name
```

Then, when we call the function, we **don't** need to write the module name. We only need to write the function name. We can also import **constants** in this way.

We aren't restricted to importing only one function or constant. We can import a bunch:
```python
    from module_name import function1, function2, constant
```
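
As a quick sketch with a different built-in module, here we import two functions from `statistics` and call them directly, without the module name. The list of numbers is made up.

```python
from statistics import mean, median

# Made-up peptide lengths
lengths = [12, 15, 11, 40, 14]

# No need to write statistics.mean or statistics.median
mean_length = mean(lengths)
median_length = median(lengths)

print(mean_length, median_length)
```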

Let's apply this syntax to our previous sine and cosine example:

In [None]:
# Your code here to import the specific functions for our sine and cosine example


Notice that we were able to call the `sin` and `cos` functions directly and use `pi` as if it were a variable that we had defined.

So, you may be wondering what's the best approach to use. Well, it's really a **case-by-case** decision:
* Does the module have a long name? If so, you may want to just import the functions you'll use.
* Will you forget where the function came from? If so, leave it as a module import so that you remember where the function came from and you don't try to find where you've defined it.
* How much of the module are you using? If you have to import 20 different functions specifically, don't waste the room with a long import statement. Just import the whole module.

## Package and Environment Management

All this has been good, but we've only been looking at code that comes *with* Python. When doing scientific computing, we often need to use a lot of code that doesn't come included.

We need to go beyond what Python gives us and explore the big world of **packages**.

### What is a Package?

A **package** is a collection of modules that usually interact and have been grouped together to be easily **distributed** to other people. Packages usually have a very specific focus. 

Here are some very common packages that you will almost definitely encounter in your career:

* **NumPy**: Offers mathematical tools for processing large numeric arrays in many dimensions.
* **SciPy**: Offers scientific tools for signal processing, interpolation, high-dimensional image processing and much, much more.
* **Pandas**: Offers data processing tools for working with tables.
* **Matplotlib**: Offers tools for generating many different types of plots in 2D and 3D.
* **scikit-image**: Offers tools for image processing.
* **scikit-learn**: Offers statistics and machine learning tools.
* **TensorFlow** and **PyTorch**: Offer deep learning and AI tools.

And there are **many** more out there that you'll likely use at some point.

If you installed Anaconda, then great! You have most of the packages you'll ever need (and a bunch you'll never need) installed automatically. If you didn't install Anaconda, no problem! It's really easy to install packages. There are two main tools that you'll use:
* `conda` -- available if you've installed Anaconda or miniconda.
* `pip` -- always available, regardless of how you installed Python.

Let's see how to use each of them!

### Installing Packages using `conda`

The packages available to install in `conda` come from various channels available via the **Anaconda** repository: https://anaconda.org/. We can search online to find a package that we want to install. For example, if you want to install matplotlib, you can search for **matplotlib** on the Anaconda repository. Then, you can easily install it using the **command line terminal** (not a Python console).

#### Installing Packages on the Command Line

To install using the **command line**, open up the **Terminal** on macOS or Linux, or the **Anaconda Prompt** on Windows. It's very important to **not** use a Python shell for this. Again, we do this in a terminal, **NOT IN A PYTHON SHELL**.

If everything is set up properly, you should see `(base)` before the prompt. This indicates that you are in the base `conda` environment.

In general, to install a package with `conda`, at the **command prompt** you would write:
```bash
$ conda install package_name
```

Press enter, wait for it to prompt you, type `y` and hit enter again to install! If you don't want to be prompted, then you can just add `-y` to the command so that it automatically answers "yes" to the prompt for installation.

Sometimes, the package that you want isn't available in the main channel, or only an older version is available. If the package you want isn't available in the main `anaconda` channel, you can specify the `conda-forge` channel instead using the `-c` option (see [here](https://docs.conda.io/projects/conda/en/latest/commands/install.html) for more details).

```bash
$ conda install -c conda-forge package_name
```

You can also add additional channels, such as [**bioconda**](https://bioconda.github.io/).

Let's do an example. Let's try to install `numpy`, `scipy` and `matplotlib` from the `conda-forge` channel. Remember, we need to use a **terminal**. Thankfully, we can open terminals in Jupyter Lab.

**Tip:** We can install multiple packages at the same time by including all their names.

We can do other package management operations in `conda`. These are described in the `conda` [documentation](https://docs.conda.io/projects/conda/en/stable/commands/index.html). The main ones are:
* `conda remove` - uninstall a package.
* `conda update` - update a package.

There are also various options for each command.

### Installing Packages using `pip`

What if you don't have Anaconda or Miniconda? Don't worry! Almost every installation of Python comes with `pip`, the official tool for installing packages. `pip` lets you download packages from the official Python Package Index (PyPI), found at https://pypi.org/.

To install a package, first you can search for it on PyPI. For example, if we want to install Open3D, which is a package for working with 3D point clouds and models, we can search on PyPI. When you click on the result, it even gives you the code to be able to install the package!


To install packages using `pip`, again you must open the command line. At the prompt, you write:
```bash
$ pip install package_name
```

When it's done installing, you can use the package!

Similar to `conda`, `pip` has a variety of other operations it can perform, which are all described in its [online documentation](https://pip.pypa.io/en/stable/cli/). The most important one for now is `pip uninstall package_name` which removes an installed package.

**Note:** `pip` should come with just about any installation of Python. If you didn't install Anaconda, things may get a bit messy. There are two major versions of Python in use: 2.7 and 3.*. On some operating systems, typing `python` or `pip` on the command line uses Python 2, while you must type `python3` or `pip3` to use the more up-to-date and supported version of Python. When you install Anaconda or Miniconda, you no longer need to deal with this issue. This issue is becoming less common, as Python 2.7 reached its end of life in 2020.
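
If you're not sure which Python a command belongs to, you can ask Python itself. The sketch below prints the version of the interpreter that is running and builds (but does not run) a `pip` command tied to that exact interpreter using `python -m pip`. The helper name `pip_install_command` is made up for this example.

```python
import sys


def pip_install_command(package_name):
    """Build (but do not run) a pip command for the current interpreter."""
    # `sys.executable` is the path of the Python that is running this code,
    # so `-m pip` is guaranteed to be the pip that belongs to it.
    return [sys.executable, "-m", "pip", "install", package_name]


print("Running Python", sys.version_info.major, sys.version_info.minor)
print(" ".join(pip_install_command("package_name")))
```

You can copy the printed command into a terminal if you want to be sure that the package ends up in the same Python that you are using.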

### Jupyter Notebook Trick - Magic Commands

If you're using Jupyter notebooks, there are built-in [magic commands](https://ipython.readthedocs.io/en/stable/interactive/magics.html) that let you install packages within Python code cells. (These are also discussed [here](https://discourse.jupyter.org/t/why-users-can-install-modules-from-pip-but-not-from-conda/10722/4?u=fomightez) and [here](https://discourse.jupyter.org/t/python-in-terminal-finds-module-jupyter-notebook-does-not/2262/9) and mentioned briefly in a comment [here](https://stackoverflow.com/questions/38694081/executing-terminal-commands-in-jupyter-notebook).)

To install a package using `pip` or `conda`, you write the same line you would at the terminal, but put the `%` sign at the beginning.

For example, to install NumPy in the environment associated with the Jupyter notebook using `conda`, write:

```python
%conda install numpy
```

To install Matplotlib using `pip`, write:

```python
%pip install matplotlib
```

There are other magic commands that can be used only in **Jupyter notebooks** and in the **IPython shell**. You can read more about them [here](https://ipython.readthedocs.io/en/stable/interactive/magics.html).

### Other Installation Tips

Most packages give you information in the **documentation** about how to install them. In practice, you rarely have to search Anaconda or PyPI. Usually, you just need to search for the package, and it will explain how to set it up.

For example, NumPy provides the following [page](https://numpy.org/install/).

Matplotlib provides [this page](https://matplotlib.org/stable/users/getting_started/index.html#installation-quick-start). 

For anyone interested in user interface development, PyQt provides [this page](https://www.riverbankcomputing.com/software/pyqt/download).

**Usually**, the installation instructions are simple, telling you to `pip install` the package. There are a few cases, though, where things are more complicated.

An example is [CuPy](https://cupy.dev/), which allows performing NumPy and SciPy operations on the GPU. This package **does not** work on all systems. It requires an NVIDIA GPU and CUDA, which is not available on macOS. In these cases, it's very important to read the [installation instructions](https://docs.cupy.dev/en/stable/install.html).

If you have both `conda` and `pip` installed, the `conda` [documentation](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) recommends trying to install packages with `conda` first. You can easily search on https://anaconda.org to see if the package is available. Installing packages with `conda` makes it easier to manage multiple *environments* (which we'll discuss soon).

### Using Packages and Reading Documentation

We've seen what packages are and how to install them, but now how do we use them?

#### Importing Packages

To use a package, we have to import it, just like we import a module. Often, since we use a lot of functions from a package, we typically want to give the package a shorter name when we import it. Here's the syntax for doing this:
```python
import package_name as short_name
```

You'll see this commonly for the NumPy package:

In [None]:
# Your code here to import numpy


Now that we have done this, we don't write `numpy` before all functions. Instead, we write `np`:

In [None]:
# Your code here to generate a simple array


For more details on [NumPy](https://numpy.org/), check out its [website](https://numpy.org/) and [documentation](https://numpy.org/doc/stable/).

Packages can be very big! So, instead of dumping all their code in one module, developers often create additional modules and subpackages. Let's look at [SciPy](https://docs.scipy.org/doc/scipy/reference/index.html#scipy-api) as an example.

To import subpackages, we use the **dot notation**. For example, let's import the `interpolate` subpackage from SciPy.

In [None]:
# Your code here to import interpolate from scipy


We can also rename an imported subpackage. An extremely common example of this is the `matplotlib` plotting package. We commonly use the `pyplot` sub-package. It would be **really, really** long to keep writing `matplotlib.pyplot` everywhere. Instead, when we import it, we commonly see this line:

In [None]:
# Your code here for importing matplotlib


Here, we've given the subpackage a much shorter name.

**Note:** This renaming works for all imports, not just from packages and subpackages. You can even rename functions that you import (although this can become confusing).
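
Here is a small sketch of the same ideas that runs without installing anything. It uses the built-in `urllib` package and `math` module as stand-ins. The short names `up` and `fact` are arbitrary choices for this example, not common conventions.

```python
# Import a submodule using dot notation and give it a shorter name.
import urllib.parse as up

# Import a single function and rename it (use sparingly, it can be confusing).
from math import factorial as fact

parts = up.urlparse("https://numpy.org/doc/stable/")
print(parts.netloc)  # numpy.org
print(fact(5))       # 120
```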

#### Reading Documentation

So, how do we learn all the awesome things we can do with a package? The answer is **DOCUMENTATION**. Many of these big packages are **extremely well-maintained**. So, they have teams of people who dedicate tons of time to writing documentation to help **you**. All these packages have many different types of information available online:

* API references: provide the detailed information, or *docstrings*, about each function, class, method and constant in the package. Example: [NumPy](https://numpy.org/doc/stable/reference/index.html).
* User guides: introductory material and tutorials telling you how to accomplish common tasks and how to follow common conventions for the package. Example: [SciPy](https://docs.scipy.org/doc/scipy/tutorial/index.html).
* Examples: worked out, sometimes step-by-step, examples of how to use the package, with the code available. Example: [matplotlib](https://matplotlib.org/stable/gallery/index).

All of these resources are here for **you**, so make sure that you use them! When in doubt, **consult the documentation!**

When starting with a new package, I recommend looking for two things:
1. a **getting started** guide,
2. **examples**.

The guide will help you learn important conventions quickly, while examples will show you the breadth of what the package can do. There are usually **tons** of functions and classes in a package, so it's good to see if any of the examples are similar to what you are looking to do.

### A Brief Introduction to Environments

I've tossed the word **environment** out a couple of times. But, what is an environment?

In Python programming, an **environment** contains your **Python interpreter** and all the **associated packages**.
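
You can ask Python which environment it is running in. The sketch below uses only the built-in `sys` module. The function name `describe_environment` is made up for this example.

```python
import sys


def describe_environment():
    """Collect a few facts about the interpreter running this code."""
    return {
        # Path to the Python interpreter itself.
        "interpreter": sys.executable,
        # Version of Python, e.g. 3.12.1.
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        # Folder containing this environment's files and packages.
        "environment_folder": sys.prefix,
    }


for key, value in describe_environment().items():
    print(key, "->", value)
```

If you run this in two different environments, you should see different interpreter paths and environment folders.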

#### Why Use Environments?

If you start working on complicated projects, you'll notice a few things:

* Things sometimes break over time. New versions of packages add new features, but sometimes remove or change other features. Sometimes, your task may require a specific version of a package. For example, your task may require `numpy` version `1.26` and not `2.0`.
* The above, but for your dependencies. Sometimes, a package that you use depends on a specific version of another package.
* Some projects you're working on may require some versions of a package, while others require a different version.
* If you're sharing your code with a colleague, they'll need to install specific packages to run your code. It's easier to give them as short a list of dependencies as possible, and you need to communicate the correct package versions to them.
* New versions of Python itself add new features and introduce new syntax. It's important to let people know exactly what version of Python you're working with so that your code will run properly.
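
To see which versions you currently have, you can use the built-in `importlib.metadata` module (available in Python 3.8 and later). This is a minimal sketch. The helper name `installed_version` is made up for this example, and the output depends entirely on what is installed on your computer.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report the versions of a few common packages in the current environment.
for name in ["numpy", "scipy", "matplotlib"]:
    print(name, installed_version(name))
```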

![Sample environments for different projects.](../assets/environment/Environments.png)

Image made with PowerPoint.

Python environments help you to simplify these tasks by allowing you to:

* Keep different versions of the same package isolated.
* Have multiple versions of Python installed.
* Export your installed packages and dependencies to a file.


There are **many** different ways to create new Python environments. We'll discuss two:

* `conda` environments
* `virtualenv` environments

#### Managing Environments with `conda`

Using `conda`, it's easy to set up environments.

By default, when you install Anaconda or Miniconda, you are provided with the `base` environment. To create a new environment, open up the **terminal** (or the **Anaconda Prompt** on Windows) and type:

```bash
conda create -n environment_name
```

You can optionally specify packages to install immediately when creating the environment. For example, you can create a new environment called `py39` with Python 3.9 and NumPy 1.26 installed from `conda-forge` by writing:

```bash
conda create -n py39 -c conda-forge python=3.9 numpy=1.26
```

After creating your new environment, you have to **activate** it. This is easy! You just need to type

```bash
conda activate environment_name
```

In our example, you would write

```bash
conda activate py39
```

Now, you'll see that the environment name beside your command prompt changes. To deactivate the environment and return to the base, you would write

```bash
conda deactivate
```

**Note:** If you run `conda deactivate` while in the base environment, no Conda environment is active anymore. To get back, run `conda activate base` (or restart the terminal or Anaconda Prompt).

One of the advantages of Conda environments is that they are very easy to export as a file listing all your installed packages. You do this using the `conda env export` subcommand:

```bash
conda env export -f my_env.yml
```

Now you have a file that can tell `conda` exactly how to recreate your Python environment.

To create a new environment based on this file, you can simply run this code:

```bash
conda env create -f my_env.yml -n new_env
```

Now, you'll have a new environment called `new_env` that has the **same packages** that are listed in `my_env.yml`.

To activate this new environment, you would type:
```bash
conda activate new_env
```

For more details on managing Conda environments, check out the `conda` documentation [here](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html). There are also detailed help pages about all of the `conda` commands and subcommands.

#### Creating Virtual Environments with `virtualenv` and `pip`

You can also create environments using `virtualenv` and `pip`. These environments are known as **virtual environments**.

First, we need to install `virtualenv`. If you've installed Anaconda or Miniconda, you can install `virtualenv` using `conda`:

```bash
conda install virtualenv
```

Otherwise, you can install `virtualenv` using `pip`:

```bash
pip install virtualenv
```

Unlike working with Conda environments, when creating environments with `virtualenv`, you **must** specify where you want the environment files to be stored. To create an environment in the subdirectory `env` of the current folder, run the following:

```bash
virtualenv env
```

This will create the environment and will give you a message saying it has been created. Like with Conda environments, you must activate this virtual environment before using it. The activation command is **different** on Windows than on macOS and Linux.

On macOS and Linux, you must type:

```bash
source env/bin/activate
```

On Windows, you must type:

```powershell
.\env\Scripts\activate
```

In both cases, after you activate your environment, you should see the name of the environment before the command prompt. To deactivate the environment, simply type:
```bash
deactivate
```
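
If you're ever unsure whether an environment is active, you can check from inside Python. This sketch assumes a recent version of `virtualenv` (which makes `sys.prefix` differ from `sys.base_prefix`) and relies on the `CONDA_DEFAULT_ENV` variable that `conda activate` sets. The function name `active_environment` is made up for this example.

```python
import os
import sys


def active_environment():
    """Describe which kind of environment this interpreter is running in."""
    # In a virtual environment, sys.prefix points at the environment folder,
    # while sys.base_prefix points at the Python it was created from.
    if sys.prefix != sys.base_prefix:
        return "virtual environment at " + sys.prefix
    # `conda activate` stores the active environment name in this variable.
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return "conda environment " + conda_env
    return "no environment detected"


print(active_environment())
```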

With the environment active, you can install packages using `pip`. These packages will only be installed in the **active** environment. **DON'T INSTALL PACKAGES USING `conda` WHEN WORKING IN A VIRTUAL ENVIRONMENT!**

**Note:** If you had to use `pip3` before, you won't have to in a virtual environment.

Similar to how we exported a Conda environment, we can export a list of package versions with `pip`. The package information is stored in a **requirements** file, commonly called `requirements.txt`. To create this file, at the command line, you use `pip freeze`:

```bash
pip freeze > requirements.txt
```

This will create a file containing the names of all your installed packages and their exact versions.
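
To get a feel for what ends up in that file, here is a rough sketch that builds similar `name==version` lines from inside Python using the built-in `importlib.metadata` module. It is only an illustration and not a replacement for `pip freeze`, which handles special cases that this sketch ignores. The function name `frozen_requirements` is made up for this example.

```python
from importlib.metadata import distributions


def frozen_requirements():
    """Return sorted 'name==version' lines for the installed packages."""
    lines = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries whose metadata is missing a name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


# Show only the first few lines to keep the output short.
for line in frozen_requirements()[:5]:
    print(line)
```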

After creating another environment, you can easily install all these packages using `pip` by writing:

```bash
pip install -r requirements.txt
```

For more information about creating virtual environments using `virtualenv`, make sure to check out the [documentation](https://virtualenv.pypa.io/en/stable/).

#### Closing Environment Remarks

There are several differences between Conda environments and virtual environments that are beyond the scope of this workshop. There are also several other tools for managing environments and dependencies. These include **Poetry** (https://python-poetry.org/) and **Pixi** (https://pixi.sh/latest/). I **strongly** recommend that you check these out if you are interested in developing and distributing Python packages for other people to use.

And lastly, **please, please, please** don't mix `conda` and `pip` unless you **absolutely have to**. If you are working with a Conda environment and the package documentation doesn't indicate a way to install it using `conda`, check the [online list](https://conda-forge.org/packages/) at `conda-forge` to see if someone has uploaded a version to `conda-forge`.

## Exercise: Working with Modules

We've seen how to work with modules and packages. Now, let's do a more biological exercise that involves working with modules.

DNA sequences can be very long. After all, we have billions of nucleotides of DNA in each cell.

Sequences that are very long don't look nice on the screen. To make our sequences easier to read (and to make them easier to ultimately export to FASTA files), we want to wrap the sequences and break them into several smaller lines of 80 nucleotides.

We could do this manually... but, as it turns out, Python includes a [`textwrap` module](https://docs.python.org/3/library/textwrap.html#module-textwrap) that can help! Read the module documentation and write code to break a DNA sequence into smaller chunks. I've given you a DNA sequence to test this code on.

After breaking the DNA sequence up, print each line so that we get a nice wrapped sequence.

In [None]:
my_long_dna = "AGGACAGTTGTACGATGCATCGTGCTACGATCGATGCTAGCGACGTACGTAGCATGCTAGCTAGCTGACGAGCGCGCGCGATCAGCATGCGCCGGACGTCAGTCAGTGTCAGTCATGCAGTACTGCAGTGTACGTCAGTACGTACTGCAGTCGTCATGTCGATGCATGCCATGTGACGTATGACTGCATGACGTACTG"
# Your code here to run the wrapping example


Ok, now that we can do this, let's add a twist. Python also includes a [`random` module](https://docs.python.org/3/library/random.html). Write a function that uses `random` to generate a long random sequence of nucleotides with different weights for each nucleotide.

Your function should have the following signature:

```python
generate_random_dna(length, weight_a, weight_t, weight_c, weight_g)
```
and return a string. If you have time, add a docstring to describe your function.

**Hint:** To convert a list of strings into a single string, you can use the following code:

```python
my_string = "".join(my_list)
```
where `""` is the empty string and `my_list` is your list of strings.

Generate some random DNA sequences and then use your code from above to wrap them to 80 characters.

In [None]:
# Your code here to generate random DNA sequences


## Module Summary

Congratulations! Another module done! Here are the main points we saw in this module on modules:
* Python code is organised into **modules** that we can easily **import** into our own code to use.
* We can import **an entire module** or we can import **specific functions and constants** to accomplish certain tasks.
* Python comes with **many pre-installed modules** for performing common tasks, like mathematical operations and generating random numbers.
* Not all modules we need come installed with Python. We can install **packages** using `conda` or `pip` to get even more functionality.
* We can easily **import** packages into our code to use their added functionality.
* Many of these packages have **lots of documentation** that provides **reference, tutorials and examples** on how to use these packages.
* We can set up **environments** to keep projects separate and install **different versions** of packages and of the Python interpreter.

Now you can both write your own code and use code from existing modules and packages!

# Module 4 - Where to Go From Here

Congratulations! You've now reached the end of this **Intermediate Python** workshop. In this workshop, we've seen the following big ideas:

* How to wrap repeated tasks into **functions** to allow easier reuse.
* How to import **modules** to use code written by other people.
* How to install **packages** to gain even more functionality.

By now, you've seen a fair amount of Python, and you're well on your way to writing successful code.

## What to Learn Next... and How?

We've seen a lot. But, the learning is never done! There are many more topics that you can explore, both in terms of Python syntax skills and external packages. With the tools that you've now learned, it should be fairly easy for you to learn them. Here are some topics that are definitely worth looking into:

### Using NumPy for Array Operations

While we can use basic lists for storing multiple values, things get quite complicated if you want to represent higher dimensional data and perform operations on large sets of data. Good news! The [NumPy](https://numpy.org) package offers a new type of object: the `ndarray`, the `N`-dimensional array. Using these arrays, you can easily store and process lots of numbers. NumPy also contains the mathematical operations you need for processing. NumPy is **extremely commonly used**, so if you're doing something that involves lots of data, chances are that you'll be using NumPy in some form. Make sure to check out the [online documentation](https://numpy.org/doc/stable/) for help getting started.

### Generating Plots

If you have lots of data, you'll certainly want to use plots to help with visualisations. There are several commonly-used plotting packages in Python. Make sure to check out [Matplotlib](https://matplotlib.org/). The documentation for this package contains **tons of examples** showing just what you can do in terms of plotting. [Seaborn](https://seaborn.pydata.org/) is another package that builds on Matplotlib. If you're familiar with R, then [Plotly](https://plotly.com/python/) can also be used in Python. If you're doing visualisation in 3D, there are also packages for that!

### Using SciPy for Scientific Computing and Pandas for Data Processing

[SciPy](https://scipy.org/) and [Pandas](https://pandas.pydata.org/) are the next logical steps after NumPy. SciPy provides some more advanced scientific operations, like signal processing, spatial operations and mathematical optimisation. The docs are structured quite similarly to NumPy, and NumPy arrays are at the basis of just about all of SciPy, so it shouldn't be too hard to jump right in. Meanwhile, Pandas provides more advanced data manipulation for spreadsheet-like objects. This is quite helpful if your data are mostly 2D and you have more than just numeric data. Check out these packages' websites for tips to get started.

### Image Processing with Scikit-Image

Much life science work relies on processing images. [Scikit-image](https://scikit-image.org/) provides many functions for processing images and getting insight.

### Basic Machine Learning with Scikit-Learn

Want to get started with advanced statistics and machine learning in Python? Check out [Scikit-learn](https://scikit-learn.org/stable/). This package provides tools for clustering, dimensionality reduction and much more!

### Object-Oriented Programming

We've talked about how everything in Python is an **object**. You can create new types of objects using **classes**. A **class** is a **template** for creating new objects. Basic object-oriented programming should be covered in most introductory books on Python, as well as in online tutorials. You can also check out the official Python documentation for some [info](https://docs.python.org/3/tutorial/classes.html) on defining classes.
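
As a small taste, here is a minimal sketch of a class. The class `DnaSequence` and its method `gc_fraction` are made up for this example. They are just one way such a template could look.

```python
class DnaSequence:
    """A hypothetical class representing a named DNA sequence."""

    def __init__(self, name, bases):
        # This code runs whenever a new object is created from the class.
        self.name = name
        self.bases = bases.upper()

    def gc_fraction(self):
        """Return the fraction of bases that are G or C."""
        if not self.bases:
            return 0.0
        gc_count = self.bases.count("G") + self.bases.count("C")
        return gc_count / len(self.bases)


# Create an object from the template and call one of its methods.
seq = DnaSequence("example", "AGGC")
print(seq.name, seq.gc_fraction())  # example 0.75
```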

## How to get help... and how not to get help

In the software development process, you'll inevitably run into bugs.

In fact, if your code always runs perfectly the first time, then something may be wrong.

There will be (many) times when your code won't work.

It happens to everyone.

All the time.

Not just you.

And not just your friend.

Really, everyone!

So, how can you get help when you need it? Here are some important resources that may (or may not) be of use (adapted from my previous iteration of the **Intermediate Python** workshop found [here](https://github.com/bzrudski/micm_intermediate_python_summer_2024)):

### Your Code Editor

Think about it... when you're writing code, you're using a piece of software that is designed **specifically for one purpose**: to help you code.

Yes! That's right!

Your IDE isn't just a text editor; it can suggest code completions, tell when there are errors, help you keep track of variables, find variable definitions and even help you reformat your files and restructure your code.

So, please, please, please, **DO NOT** write your code in a simple text editor that has no additional features. There are **many** IDEs out there that have Python support, including:

* [PyCharm](https://www.jetbrains.com/pycharm/)
* [Microsoft Visual Studio Code](https://code.visualstudio.com/)
* [Spyder](https://www.spyder-ide.org/)
* [Zed](https://zed.dev/)

And these are all either completely free or have a free version with most of the functionality. And ***PLEASE*** don't use word processing software to write code. Use software that is made for coding!

### Documentation

Big projects have big, well-maintained documentation. Take a look at their guides for getting started. For example, [Pandas](https://pandas.pydata.org/) has a [10 minutes to pandas](https://pandas.pydata.org/docs/user_guide/10min.html) tutorial. Use these resources! If you want to learn how to use a function, **look it up** and read the paragraph about it. The docs will tell you how to use the arguments, as well as any quirks to expect. In some cases, the authors have even included references to the papers behind the function. This is especially true in image processing and other fields that rely heavily on algorithms. So, the documentation will tell you not only how to use the code, but also **where it comes from**. And make sure to check out the **Official Python docs** at https://docs.python.org/3/.
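
You can also read a function's documentation without leaving Python. Here is a small sketch using the built-in `inspect` module. The function `statistics.median` is just an arbitrary example, and you can swap in whichever function you are curious about.

```python
import inspect
import statistics

# Show the parameters the function accepts.
print(inspect.signature(statistics.median))

# Show the docstring, which is the same text the API reference is built from.
print(inspect.getdoc(statistics.median))
```

In a Jupyter notebook or the IPython shell, writing `help(statistics.median)` or `statistics.median?` shows similar information.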

### Books

Books, books, books!

There are tons!

And tons!

AND TONS of books out there!


For example, there are a couple of general books that are free online:
* *Think Python 2e* by Allen B. Downey (FREE book): https://greenteapress.com/wp/think-python-2e/
* *Data Structures and Information Retrieval in Python* also by Allen B. Downey (FREE book): https://greenteapress.com/wp/data-structures-and-information-retrieval-in-python/
* *Introduction to Python Programming* by Udayan Das et al., published by OpenStax: https://openstax.org/details/books/introduction-python-programming
* *The Hitchhiker's Guide to Python* by Kenneth Reitz and Tanya Schlusser: https://docs.python-guide.org/

There are also books online about more specialised topics, such as:

* Package development: *Python Packages* by Tomas Beuzen and Tiffany Timbers -- https://py-pkgs.org/
* Data science:
    * *Python for Data Analysis, 3E* by Wes McKinney -- https://wesmckinney.com/book/
    * *Python Data Science Handbook* by Jake VanderPlas -- https://jakevdp.github.io/PythonDataScienceHandbook/

Another book, which covers software development for research more generally and puts more emphasis on the tools used, is:

* *Research Software Engineering with Python* by Damien Irving, et al.: https://third-bit.com/py-rse/index.html

Through the databases at the McGill Library, we also have access to lots of books **for free**. Check out the library's online catalogue to see more.

### Tutorials

Tutorials are also great! And very much abundant! From more formal ones on sites like [freeCodeCamp](https://www.freecodecamp.org/) and [W3Schools](https://www.w3schools.com/python/default.asp) to less formal ones on [DEV](https://dev.to/), you can get lots of insight from these. There are also lots posted on Medium that you can check out. In addition to text-based tutorials, there are also videos on YouTube. And don't forget the official tutorials in the documentation! Tutorials are a very valuable resource that can help you see how to put pieces of code together in real-world examples.

### Stack Overflow (and Pitfalls)

If you're starting to learn Python, and you have a question, chances are that someone, somewhere has also had this question.

And they've probably asked it online.

And so they've almost certainly asked it on [Stack Overflow](https://stackoverflow.com/).

Stack Overflow is a **great** resource for finding answers to real questions about programming. People encounter real problems, they post about them, and they get answers.

**But** make sure that you're using this tool **properly**.

Try the other resources **before** going to Stack Overflow. The answer may turn out to be on the documentation page for the function you're looking for. Stack Overflow answers often include links to the documentation! If there's such a link, **use it**. Check it out in more detail.

Make sure that you **understand** the code that you're about to add to your project and **don't just copy-paste** it. Re-type it yourself. Coding is a thinking game. Make sure that you have thought about all the code that you're putting in and that you understand why it's there. 

Use your judgement and intuition when borrowing that code. If it looks sketchy, it could very well be sketchy and there may be a better way.

In other words: Documentation **first**, documentation **last**, documentation **always**.

### ChatGPT (and Pitfalls)

Everything I said above about Stack Overflow. And more.

Answers on Stack Overflow are written by **humans** who have written the code, tested it, and run it themselves. So, even if it doesn't work for you, you know that it worked for someone at some point in time.

**Be careful** when using ChatGPT for code (if you're allowed to at all). Read the code **carefully**, make extra sure that it makes sense, and test it. Don't just stare at it for two seconds, blink, and think "Looks right". Actually try to dissect the code. Don't just trust it because AI wrote it for you. Otherwise you might wind up putting [glue on your pizza](https://www.theverge.com/2024/5/23/24162896/google-ai-overview-hallucinations-glue-in-pizza).

You need to make extra sure that it actually makes sense and runs properly, because you don't have that same guarantee that a human has used this exact code in their own experience. Use your coding judgement and intuition.
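
One simple way to test code that you didn't write yourself is to run it on small inputs where you already know the right answer. Here is a minimal sketch. The function `reverse_complement` is a made-up stand-in for any piece of code you got from somewhere else.

```python
def reverse_complement(sequence):
    """Hypothetical borrowed function: reverse complement of a DNA string."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(sequence))


# Check a few cases that are easy to work out by hand.
assert reverse_complement("A") == "T"
assert reverse_complement("ATGC") == "GCAT"
assert reverse_complement("") == ""
print("All checks passed")
```

If any `assert` fails, Python raises an `AssertionError` and you know that the code doesn't do what you expected.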

### Concluding Help Remarks...

Again, ALWAYS remember to **read the documentation**. Often, if you're stuck, the answer is **right there**. If it's not, then it's probably on Stack Overflow. It's often a good idea to check the documentation **first** to see if there's an official explanation or an official example. And don't just copy a Stack Overflow answer or sample code. Think about what the code is doing. Does it make sense? Is there a better way? Try to look line by line to understand what is going on (play around in the IPython interpreter or in a Jupyter notebook!).

## Other Cool Programming Topics

Aside from the packages that I discussed above, there are other cool topics that you should definitely take a look at! These will help you write code that runs better, is easier to update and is easier to share.

### Writing Packages

We've seen how to install and use packages. But, you can also **write your own packages**. There are many great resources online about writing packages. The one that I most recommend is [this free online book](https://py-pkgs.org/): *Python Packages* by Tomas Beuzen and Tiffany Timbers. It's an easy read and helps you learn not only how to organise your code, but how to publish it, too. The authors also walk through how to render your own nice-looking documentation and host that online. This book doesn't only talk about code, but really about an entire ecosystem of tools that are extremely helpful when developing something new.

### Object-Oriented and Functional Programming

Python is an **object-oriented language**. So, everything is an object! Using classes, you can create new types of objects. Python offers different ways of creating classes that are worth checking out, including data classes and enumerations. Python also offers **functional programming** tools, allowing you to do more with functions.
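
Here is a brief sketch that touches on each of these ideas using only built-in modules. The names `Strand` and `Gene` and the gene data are made up for this example.

```python
from dataclasses import dataclass
from enum import Enum
from functools import reduce


class Strand(Enum):
    """An enumeration: a fixed set of named values."""
    FORWARD = "+"
    REVERSE = "-"


@dataclass
class Gene:
    """A data class: Python writes __init__ and __repr__ for us."""
    name: str
    length: int
    strand: Strand


genes = [
    Gene("geneA", 1200, Strand.FORWARD),
    Gene("geneB", 800, Strand.REVERSE),
    Gene("geneC", 1500, Strand.FORWARD),
]

# Functional tools: pass functions as arguments to other functions.
forward_genes = list(filter(lambda gene: gene.strand is Strand.FORWARD, genes))
names = list(map(lambda gene: gene.name, forward_genes))
total_length = reduce(lambda total, gene: total + gene.length, genes, 0)

print(names, total_length)  # ['geneA', 'geneC'] 3500
```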

### Developing Graphical User Interfaces

Jupyter notebooks and command line scripts are powerful, but they aren't accessible for people who don't know how to code. Solution: build a graphical user interface! Using PyQt, the process is quite straightforward. Check out [this online tutorial series](https://www.pythonguis.com/) by Martin Fitzpatrick to learn about developing GUIs in Python. It has been a great help to me in my own research.

**Warning:** GUI development takes **a lot** of time. Make sure that your processing code is stable before you embark on wrapping it up into a GUI.

### Hosting Projects on GitHub

What fun is a project if other people can't use it? By hosting your project on GitHub, you let others easily contribute to your project and build on it. Learning Git and GitHub is essential! And so are a few other skills along the way, like writing documents in Markdown. MiCM often has Git and GitHub workshops, so check out their workshop schedule!

## Conclusion
We've reached the end of this workshop. You now have more skills that will help you:

1. develop the software tools you'll need in your research, and
2. learn how to use additional packages not covered.

If you have any questions, please reach out!

```python
print("Goodbye!")
```