# Introduction to Python - DDI training session - 22 Feb 2021

Charlotte Desvages

## Welcome!

Today I'll give you an overview of what Python looks like, some examples of what it can do, and point you towards great resources to continue learning.

We will have interactive code examples throughout the session, which you'll be able to run yourselves.

We'll use a cloud service called Binder -- no need to install anything!

## Zoom communication

You'll find the non-verbal feedback buttons at the bottom of the **participants list**:

<img src="graphics/zoom.png" alt="Zoom non-verbal feedback buttons." width="400"/>

You can also use the **chat** to ask questions.

### How do I code along?

Here: [link]

When you go to this URL, you should see the content of these slides. This is a **Jupyter notebook** -- a Python programming environment that runs in your browser. Wait until it loads completely (~1min), then:
- Scroll down until you see the flags 🚩🚩🚩. Then, click on the **Python code cell** just below. You should see a green frame appear around it.
- Click the <kbd>> Run</kbd> button in the toolbar at the top (or press <kbd>Ctrl</kbd> + <kbd>Enter</kbd> on your keyboard). This will **run** the code inside the cell, and you will see the result below.

# 🚩🚩🚩 Example 1

In [None]:
print('Success! :)')


When you've run the code, `Success! :)` should appear **below** the code cell. If that's the case, come back on Zoom and click the green "yes" button; if not, click the red "no" button. **Don't close your browser tab or you'll lose your progress!**

You can follow the presentation on Zoom. When there are code examples you can run and change yourself, they will be flagged with 🚩🚩🚩 if you want to jump back into your notebook.

## What is Python?

"Python" refers to both the **programming language** and the **interpreter**. The interpreter is the tool which instructs your computer to execute the code you write.

### Why should I learn Python?

- It's **easier to learn** than most other languages.
- It's **free** and open-source, and available on all major platforms.
- There are a **lot** of well-maintained third-party libraries available for a wide variety of applications. We will see a couple of these today!
- It has a very **large and welcoming community**, which is growing every day.

## Let's go!

First, let's get Python to show us something, using the `print()` command.
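As a minimal sketch of what `print()` can do (the greeting text and the name `total` are made up for illustration), try displaying a number, the result of a calculation, and some text:

```python
print(456)               # a number
print(2 + 3)             # Python works out the sum first, then shows 5
print('Hello, Python!')  # text goes between quotes

# print() accepts several items separated by commas and shows them on one line
total = 2 + 3
print('2 + 3 is', total)  # shows: 2 + 3 is 5
```

Each `print()` call shows its result on a new line below the cell.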

## Variables and objects

In Python, all data is an **object** of a certain **type**. For example, we've seen number objects (`456`, `2 + 3`...), and string objects which represent text (`'Success! :)'`).

We can store objects in memory to reuse them later, using **variables**.

A **variable** is a **label** for some place in your computer's memory, where an object is stored.

We often say that, in Python, a variable *points to* or *refers to* an object.

In [None]:
a = 0.5

<img alt="Assigning 0.5 to a" src="graphics/var1.png" width=600px>
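To make the "label" idea a little more concrete, here is a small sketch. The names `first_label` and `second_label` are hypothetical, and `is` is a Python keyword that checks whether two labels point to the very same object:

```python
first_label = 'hello'        # first_label points to a string object
second_label = first_label   # a second label for the very same object

# "is" checks whether two labels point to the same object in memory
same_object = first_label is second_label
print(same_object)           # True
```

Assigning one variable to another does not copy the object; it only attaches a second label to it.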

# 🚩🚩🚩 Example 2

What will be the output of each of these cells?

In [None]:
a = 0.5
b = a + 2
print(b)

In [None]:
c = a * b
print(c)

In [None]:
x = 1
y = x
x = 2
print(y)

In [None]:
a = a + 1
print(a)

In [None]:
s = 'ABC'
t = 3 * s
print(t)

## Data types

Python can tell us what *type* an object is, using the function `type()`.
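For instance, a quick sketch with a few objects (the variable names here are chosen for illustration only):

```python
whole_number = 456
decimal_number = 4.5
text = 'Success! :)'

print(type(whole_number))    # <class 'int'>: integer (whole number)
print(type(decimal_number))  # <class 'float'>: number with a decimal point
print(type(text))            # <class 'str'>: string (text)

# Mixing an int and a float in a calculation gives a float
mixed = 2 + 3.0
print(type(mixed))           # <class 'float'>
```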

## A useful type to make decisions: `bool`


In [None]:
a = True
b = False



In [None]:
x = 10
y = 1
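A minimal sketch of how `bool` values can be combined and created, reusing the values of `a`, `b`, `x` and `y` from the cells above (the name `is_bigger` is made up for illustration):

```python
a = True
b = False

print(a and b)   # False: "and" needs both sides to be True
print(a or b)    # True: "or" needs at least one side to be True
print(not a)     # False: "not" flips the value

x = 10
y = 1

# Comparing two numbers produces a bool
print(x > y)     # True
print(x == y)    # False: "==" tests equality, "=" assigns

is_bigger = x > y
print(type(is_bigger))   # <class 'bool'>
```

These `True`/`False` results are what Python uses to decide whether to keep going in a loop, as in the next example.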



# 🚩🚩🚩 Example 3

Let's play **Guess the number**:

In [None]:
# Import the "random" library
import random

# Generate a random whole number between 1 and 10 (both included)
target = random.randint(1, 10)

# Ask user to type a number
guess = int(input('Guess the number! '))

# Keep guessing while we haven't guessed correctly yet
while target != guess:
    guess = int(input('Wrong, guess again? '))

print('Victory!')

## Containers: lists and dictionaries

Python also has objects which can *contain* several other objects inside them -- they're called **containers**.

In [None]:
# Create a list called a
a = [1, 2, 3, 10, 6]

<img alt="The list a in memory" src="graphics/lists.png" width=800px>
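Elements of a list are accessed by their position, counting from 0. As a short sketch using the same list `a` (the new values 20 and 7 are arbitrary):

```python
a = [1, 2, 3, 10, 6]

print(a[0])     # 1: the first element is at index 0
print(a[3])     # 10
print(a[-1])    # 6: negative indices count from the end

a[1] = 20       # replace the second element
print(a)        # [1, 20, 3, 10, 6]

a.append(7)     # add a new element at the end
print(len(a))   # 6: the number of elements in the list
```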

In [None]:
# Display and change individual elements


We can put anything in a list -- even other lists!

In [None]:
a_very_diverse_list = ['my', 1, 4.5, ['you', 'they', True], 432, -2.3, 33, [False, 1]]
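To reach an element of a list that sits inside another list, index twice: first pick the inner list, then the element inside it. A small sketch (the names `inner` and `first_inner` are made up for illustration):

```python
a_very_diverse_list = ['my', 1, 4.5, ['you', 'they', True], 432, -2.3, 33, [False, 1]]

inner = a_very_diverse_list[3]           # the fourth element is itself a list
print(inner)                             # ['you', 'they', True]

first_inner = a_very_diverse_list[3][0]  # first element of that inner list
print(first_inner)                       # you

print(type(a_very_diverse_list[2]))      # <class 'float'>
```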


**Dictionaries** are similar to lists, but they index elements with a **label** (called a *key*) instead of a number.

In [None]:
# Create a dictionary called scores
scores = {'Alice': 80, 'Bob': 64, 'Charlie': 72}


<img alt="The dictionary scores in memory" src="graphics/dict.png" width=800px>
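Looking up and changing values in a dictionary works much like with lists, using the label inside the square brackets. A short sketch with the same `scores` dictionary ('Dana' and the new scores are hypothetical):

```python
scores = {'Alice': 80, 'Bob': 64, 'Charlie': 72}

print(scores['Bob'])       # 64: look up a value by its label

scores['Bob'] = 70         # change an existing value
scores['Dana'] = 91        # add a new label (hypothetical student)
print(scores)              # {'Alice': 80, 'Bob': 70, 'Charlie': 72, 'Dana': 91}

print('Alice' in scores)   # True: check whether a label exists
```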

## Python and data: the `pandas` library

A **library** is a bit like an extra toolbox with specialist tools.

`pandas` (Python Data Analysis Library) is particularly useful to deal with data.

In [None]:
import pandas as pd

We'll use the [astronauts database](https://github.com/rfordatascience/tidytuesday/tree/master/data/2020/2020-07-14).

In [None]:
astronauts = pd.read_csv('astronauts.csv')
print(type(astronauts))

# 🚩🚩🚩 Example 4

Run the cell below to import pandas and read the CSV file.

In [None]:
import pandas as pd
astronauts = pd.read_csv('astronauts.csv')

Then, try these commands (one at a time):

```python
print(astronauts.columns)
astronauts.head()
astronauts.tail()
astronauts.head(10)
```

## Summary statistics

In [None]:
astronauts
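As a sketch of the kind of summary statistics pandas can compute, here is a tiny made-up table. The rows are invented for illustration; only the column names `in_orbit` and `hours_mission` are borrowed from the astronauts data. The same methods can be called on the `astronauts` dataframe.

```python
import pandas as pd

# Made-up rows standing in for the real astronauts table
toy = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'in_orbit': ['Mir', 'ISS', 'ISS', 'Mir'],
    'hours_mission': [100.0, 300.0, 500.0, 200.0],
})

# Count, mean, standard deviation, min, quartiles and max of numeric columns
print(toy.describe())

# Statistics for a single column
mean_hours = toy['hours_mission'].mean()
longest = toy['hours_mission'].max()
print(mean_hours, longest)   # 275.0 500.0

# How many rows there are for each distinct value in a column
trips_per_station = toy['in_orbit'].value_counts()
print(trips_per_station)
```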

## Data visualisation with `seaborn`

The `seaborn` documentation has a great [examples gallery](https://seaborn.pydata.org/examples/index.html). Let's visualise some of the astronauts data.

# 🚩🚩🚩 Example 5

In [None]:
import seaborn as sns

# Plot histograms using catplot
grid = sns.catplot(data=astronauts,           # the dataframe
                   x='year_of_mission',       # the x-axis (the bins)
                   col='military_civilian',   # separate plots for each value in the column 'military_civilian'
                   hue='sex',                 # different colours (hues) for male and female astronauts
                   kind='count',              # the type of plot (countplot, or a histogram)
                   legend=True,               # display the legend for the different colours
                   col_wrap=1,                # start new row of subplots after just 1 plot
                   height=3,                  # height of each plot
                   aspect=4)                  # aspect ratio of each plot (width/height)

# Rotate the tick labels so we can read them all
grid.set_xticklabels(rotation=45,
                     verticalalignment='top',
                     horizontalalignment='right')

Let's look at flights to Mir and to the International Space Station.

In [None]:
# "|" here means "or", but works on whole columns at once, one row at a time
rows = (astronauts['in_orbit'] == 'Mir') | (astronauts['in_orbit'] == 'ISS')
columns = ['year_of_mission', 'ascend_shuttle', 'in_orbit', 'hours_mission']
station_trips = astronauts.loc[rows, columns]

# Group labels together
station_trips.loc[station_trips['ascend_shuttle'].str.contains('soyuz', case=False), 'ascend_shuttle'] = 'Soyuz'
station_trips.loc[station_trips['ascend_shuttle'].str.contains('sts', case=False), 'ascend_shuttle'] = 'Space Shuttle'

# relplot returns a grid of plots (a FacetGrid), like catplot above
grid = sns.relplot(data=station_trips,
                   x='year_of_mission',
                   y='hours_mission',
                   hue='ascend_shuttle',
                   style='in_orbit',
                   height=6,
                   aspect=2)

grid.set_xticklabels(rotation=45,
                     verticalalignment='top',
                     horizontalalignment='right')
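The row selection used above can be easier to follow on a small example. This is a sketch with made-up rows (only the column names come from the astronauts data): each comparison produces a column of `True`/`False` values, `|` combines them row by row, and `.loc` keeps the rows marked `True`.

```python
import pandas as pd

# Made-up rows for illustration
toy = pd.DataFrame({
    'in_orbit': ['Mir', 'ISS', 'Salyut 6', 'ISS'],
    'hours_mission': [100.0, 300.0, 50.0, 500.0],
})

is_mir = toy['in_orbit'] == 'Mir'   # True where the station is Mir
is_iss = toy['in_orbit'] == 'ISS'   # True where the station is the ISS
rows = is_mir | is_iss              # True where either one is True
print(rows.tolist())                # [True, True, False, True]

# Keep only the selected rows, and only the listed columns
station_trips = toy.loc[rows, ['in_orbit', 'hours_mission']]
print(len(station_trips))           # 3
```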

# 🚩🚩🚩 Example 6

- Change the code above to display the sex of the astronauts in 2 different colours, instead of the ascend shuttle.
- Change `'hours_mission'` to `'age'` above to plot the age of astronauts going to Mir or the ISS over the years.
- What is the average duration of a mission to Mir? to the ISS?
- How long was the longest mission, in **days**? Who was the astronaut?
- Which countries have sent civilians to space?
- How many astronauts were from the USSR?

## What's next?

Install Python to run code on your computer:
- If you're interested in doing data science or scientific computing, install [Anaconda](https://www.anaconda.com/products/individual). This will install Python, Jupyter (what we've used today) and Spyder (an IDE), together with lots of useful libraries, such as pandas and seaborn, which we've seen today.
- If you just want Python (e.g. for scripting), you can also [install it directly](https://www.python.org/downloads/).
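Once Python is installed, one quick way to check that it works is to run a couple of lines like these in a notebook or at the Python prompt. This is only a sketch; the version numbers you see will depend on what you installed.

```python
import sys

# sys.version_info holds the version of the Python interpreter running this code
version = sys.version_info
print('Running Python', version.major, version.minor)
```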

Learn more about Python:
- The official [Python documentation](https://docs.python.org/3/) includes a comprehensive [tutorial](https://docs.python.org/3/tutorial/index.html) for beginners.
- Two excellent free online books by Jake VanderPlas:
    - [A Whirlwind Tour of Python](https://jakevdp.github.io/WhirlwindTourOfPython/)
    - [The Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/)
- [Software Carpentry](https://software-carpentry.org/) is a non-profit that runs regular workshops to teach Python (and other things!) at different levels of experience. All their teaching materials are open-source and [freely available online](https://software-carpentry.org/lessons/). The [Edinburgh branch](https://edcarp.github.io/) is also quite active and holds regular workshops (online at the moment!).

Learn more about Python for data science and scientific computing:
- The pandas documentation has excellent [Getting Started tutorials](https://pandas.pydata.org/docs/getting_started/intro_tutorials/) and [user guides](https://pandas.pydata.org/docs/user_guide/index.html). In particular, I'd recommend the tutorial ["10 minutes to Pandas"](https://pandas.pydata.org/docs/user_guide/10min.html).
- The seaborn documentation also has great [tutorials](https://seaborn.pydata.org/tutorial.html) and a [showcase gallery](https://seaborn.pydata.org/examples/index.html).
- For less data-oriented scientific computing, libraries like [NumPy](https://numpy.org/learn/) and [SciPy](https://www.scipy.org/getting-started.html) (also with a great [tutorial page](https://docs.scipy.org/doc/scipy/reference/tutorial/index.html)) are widely used, together with [matplotlib](https://matplotlib.org/1.5.3/users/beginner.html) for plotting.

# Thank you!