# Lab: Introduction to Python {#sec-intro}

Welcome to the first Python lab. Our aim is to introduce a little bit of Python and help you become comfortable with using Jupyter notebooks. No previous knowledge is assumed. We encourage you to make steady progress through the notebooks and to ask questions along the way!


## Technological stack {#sec-technological-stack}

This module uses a combination of different technologies (technological stack) besides Python. You will need to be familiar with all their components to follow the labs as well as the assignments. You can find a brief description of the main components below, and you can refer to @sec-setup for a more technical explanation of the setup.

::: callout-important
Please make sure you have Anaconda Python installed and running on your computer. Instructions for doing this can be found in @sec-setup.
:::

### Python

[Python](https://www.python.org/) is a very popular general purpose programming language. Data scientists use it to clean up, analyse and visualise data. A major strength of Python is that the core Python functionality can be extended using libraries. In future labs, you will learn about popular data science libraries such as pandas and numpy.

It is useful to think of programming languages as a structured way to tell the computer what to do. We cover some of the basic features of Python in this lab.

### Anaconda

[Anaconda](https://anaconda.com)[^anaconda] is a _distribution platform_ that manages and installs many tools used in data science, such as _programming languages_ (Python), _libraries_ and _IDEs_[^ides], as well as a _GUI_ (Anaconda Navigator) and a _CLI_ (Anaconda Prompt). Anaconda also does other useful things, such as creating isolated _virtual environments_, which we will be using for this module.

[^ides]: Some specific Integrated Development Environments (IDEs) for Python included in Anaconda are [VS Code](https://code.visualstudio.com/), [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/getting_started/overview.html) and [Spyder](https://www.spyder-ide.org/). In this module, there's no preferred IDE (in fact, different members of staff use different IDEs) and you can use the one you are most familiar with.

::: callout-tip

### Virtual environments

_Virtual environments_ are a way to install all the dependencies (in the right versions) that a project requires, isolated from other projects: each environment has its own Python and its own library versions. Everyone who recreates the virtual environment will be using the same packages and versions, which reduces errors and improves reproducibility.
While they are considered an advanced practice, and are therefore out of scope for this course, you may want to learn about Python's environments here: <https://realpython.com/python-virtual-environments-a-primer/>

:::

[^anaconda]: If you want to know more about Anaconda, these tutorials can be a good start: <https://www.tangolearn.com/what-is-anaconda/>, <https://www.upgrad.com/blog/python-anaconda-tutorial/>


### Jupyter Notebooks

[Jupyter notebooks](https://jupyter.org/), such as this one, allow you to combine text and code into documents you can edit in the browser. The power of these notebooks is in documenting or describing what you are doing with the code alongside that code. For example, you could detail why you chose a particular clustering algorithm above the clustering code itself. In other words, it adds narrative and helps clarify your workflow.

![This same notebook, as displayed within JupyterLab](img/jupyter-notebook.png)

## Getting started

If you send Python a number then it will print that number for you.

In [None]:
45

You will see both the input and output displayed. The input will have a label next to it like 'In [1]', where the number tells you how many code cells have been sent to the Python interpreter so far (the interpreter is the program that reads the Python code and returns the result). A label such as 'In [100]' tells you that 99 code cells have already been passed to the Python interpreter in the current session.

Python can carry out simple arithmetic.

In [None]:
44 + 87

In [None]:
188 / 12

In [None]:
46 - 128

As seen above, each time the code in a cell is run, the result from the Python interpreter is displayed.
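
As a small illustration of the division cell above, the sketch below compares three related operators. The variable names are ours and not part of the lab. In Python 3, `/` always returns a float, while `//` and `%` give the whole-number part and the remainder.

```python
# Illustrative names only; they are not used elsewhere in the lab.
quotient = 188 / 12      # true division: the result is always a float
whole_part = 188 // 12   # floor division: keeps only the whole-number part
remainder = 188 % 12     # modulo: what is left over after dividing

print(quotient)
print(whole_part, remainder)
```

Multiplying the whole part by 12 and adding the remainder gets you back to 188, which is a quick way to check the result.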

## Data Types {#sec-data-types}

As we saw in [this unit in the Skills Programme](https://pages.github.warwick.ac.uk/CIM-Methods/coding_skills/content/computational_skills/computational_thinking.html#sec-data-types), programming languages use types to help them understand what a piece of data might represent. Knowing how data types work is important because they define what can and cannot be done with the data. Each programming language supports different data types, which can be extended by libraries such as pandas, but these are some of the most common ones (and the ones we will encounter most often).

### int, floats, strings

Integers are whole numbers, like the ones we used above. We can check an object's data type using `type()`:

In [None]:
type(33)

You can also have floats (numbers with decimal points)

In [None]:
33.4

In [None]:
type(33.4)

and a series of characters (strings).

In [None]:
'I have a plan, sir.'

In [None]:
type('I have a plan, sir.')

Data types matter because operators such as `*` do different things depending on the data type. For instance,

In [None]:
33 * 3

That seems quite sensible. What if we had a string? Run the line below. What is the `*` doing?

In [None]:
'I have a plan, sir' * 3

There are also operators which only work with particular data types.

In [None]:
#| error: true
'I have a plan, sir.' / 2

This error message is very informative indeed. It tells us the line which caused the problem and that we have an error. Specifically, our error is a `TypeError`. 

::: callout-note
### Understanding the error
In this case, the message says that the line `'I have a plan, sir.' / 2` has the form `str / int`. We are trying to divide a string (`str`) by an integer (`int`), and the `/` operator does not support that combination of types.
:::
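
One way to confirm this diagnosis yourself is sketched below: check each operand with `type()`, then trigger the error in a controlled way and read its message. The `try`/`except` construct is not covered in this lab and is used here only so that the example runs to the end. The variable names are illustrative.

```python
# Illustrative names; the values match the failing cell above.
text = 'I have a plan, sir.'
number = 2

# Check the type of each operand before combining them.
print(type(text))
print(type(number))

# Dividing a str by an int is not defined, so Python raises a TypeError.
try:
    text / number
except TypeError as error:
    message = str(error)
    print('TypeError:', message)
```

The message names both types involved, which is usually enough to spot which value needs changing.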

### lists and dictionaries {#sec-list-dictionaries}

You can collect multiple values in a list. This is what they look like:

In [None]:
[35, 'brown', 'yes']

And we can check their type:

In [None]:
type([35, 'brown', 'yes'])

Or add keys to the values as a dictionary.

In [None]:
{'age':35, 'hair colour': 'brown', 'Glasses': 'yes'}

In [None]:
type({'age':35, 'hair colour': 'brown', 'Glasses': 'yes'})
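
To show why the keys are useful, here is a minimal sketch that looks up values by key. It stores the dictionary under a name (`person`, our choice) so that it can be reused, which is what the Variables section introduces next.

```python
# 'person' is an illustrative name for the dictionary shown above.
person = {'age': 35, 'hair colour': 'brown', 'Glasses': 'yes'}

# Look up a value by putting its key in square brackets.
age = person['age']
hair = person['hair colour']

# Keys are case-sensitive: 'Glasses' exists, but 'glasses' does not.
has_capitalised_key = 'Glasses' in person
has_lowercase_key = 'glasses' in person

print(age, hair, has_capitalised_key, has_lowercase_key)
```

Unlike a list, where you need to remember the position of each value, a dictionary lets you ask for a value by a meaningful label.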

## Variables

Variables are bins for storing things in. These things can be values of any data type. For example, the code below creates a variable called `my_height` and stores the int 140 there.

In [None]:
my_height = 140

The Python interpreter is now storing the int 140 in a bin called `my_height`. If you pass the variable name to the interpreter, it will behave just as if you had typed 140 instead.

In [None]:
my_height
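
As a short sketch of what "behaves like 140" means in practice, a variable can be used in arithmetic, and assigning to the same name again replaces the stored value. The name `taller` and the new value 152 are made up for illustration.

```python
my_height = 140

# Using the variable in an expression is the same as writing 140 + 10.
taller = my_height + 10

# Assigning again replaces what is stored in the bin.
my_height = 152

print(taller, my_height)
```

Note that `taller` keeps the value it was given at the time; changing `my_height` afterwards does not update it.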

Variables are neat when it comes to lists.

In [None]:
my_heights = [231, 234, 43]
my_heights

In [None]:
my_heights[1]

Wait, what happened above? What do you think the [1] does?

You can also pull several values out of a list at once, which is called slicing.

In [None]:
my_heights[0:2]
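
The two cells above can be summarised in a small sketch. Positions in a Python list are counted from 0, and a slice `start:stop` includes `start` but stops before `stop`. The variable names on the left are ours.

```python
my_heights = [231, 234, 43]

first = my_heights[0]        # counting starts at 0, so this is 231
second = my_heights[1]       # position 1 is the second value, 234
last = my_heights[-1]        # negative positions count from the end, 43
first_two = my_heights[0:2]  # positions 0 and 1; position 2 is excluded

print(first, second, last, first_two)
```

Indexing with a single position returns one value, while slicing always returns a new list, even if it contains only one item.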

## Bringing it all together

What does the code below do?

In [None]:
radius = 40
pi = 3.1415
circle_area = pi * (radius * radius)

length = 12
square_area = length * length

my_areas = [circle_area, square_area]

In [None]:
my_areas
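
For comparison, here is a hedged variant of the same calculation. It assumes you are happy to borrow a more precise value of pi from Python's standard `math` module (libraries are covered properly in the next notebook) and it uses the `**` operator to square a number.

```python
import math  # standard library module that provides a precise value of pi

radius = 40
length = 12

# radius ** 2 means "radius to the power of 2", the same as radius * radius.
circle_area = math.pi * radius ** 2
square_area = length ** 2

my_areas = [circle_area, square_area]
print(my_areas)
```

The circle area differs slightly from the one above because `math.pi` carries more decimal places than 3.1415.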

As an aside, you can include comments which are not evaluated by the Python interpreter.

In [None]:
# this is a comment. Python will ignore it.
# another comment. 
n = 33

## Congratulations

You've reached the end of the first notebook. We've looked at basic data types and variables. These are key components of all programming languages and a key part of working with data.

In the next notebook we will look at using libraries, loading data, loops and functions.