# Python Basics

## Setting Up The Python Environment

1. Go to [github.com/astral-sh/uv](https://github.com/astral-sh/uv) and follow the setup instructions for the `uv` Python package manager.
2. Install `jupyter-lab` with useful dependencies (copy and paste these instructions!).
   ```bash
   uv tool install --with="numba,numpy,scipy,matplotlib,MDAnalysis,ruff,jupyter-ruff,jupyterlab-lsp,python-lsp-server" jupyterlab
   ```
3. Download this notebook to your computer and open it.
   ```bash
   jupyter-lab lab1.ipynb
   ```
4. Optionally, enable format on save and format on run: Settings -> Settings Editor -> Jupyter Ruff -> Check all boxes.

## Python as a Simple Calculator

Python supports basic arithmetic through the mathematical **operators** for addition, `+`, subtraction, `-`, multiplication, `*`, division, `/`, and exponentiation, `**`.
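As a small sketch of these operators (the numbers are our own illustrative choices, not taken from the original notebook):

```python
# Each line combines two operands with one of the operators listed above.
total = 7 + 3        # addition
difference = 7 - 3   # subtraction
product = 7 * 3      # multiplication
quotient = 7 / 2     # division
power = 2 ** 10      # exponentiation

print(total, difference, product, quotient, power)  # 10 4 21 3.5 1024
```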

The combinations of operands and operators shown above are called **expressions**. 
Expressions can be combined and chained.
Python is consistent with the common precedence rules of mathematics.
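A short illustration of chaining and precedence, using values we made up for this purpose:

```python
# Exponentiation binds tighter than multiplication,
# which in turn binds tighter than addition.
chained = 1 + 2 * 3 ** 2      # evaluated as 1 + (2 * (3 ** 2))
grouped = (1 + 2) * 3 ** 2    # parentheses override the default order

print(chained, grouped)  # 19 27
```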

Python supports many **data types**.
We have encountered two: **integers** ('ints') and **floating point numbers** ('floats').
The former represent whole numbers, the latter approximate real numbers.
When numerical data types are mixed in an expression, the result does not lose precision:
integers are automatically converted to floats, but not vice versa.
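The automatic conversion can be checked with the built-in `type` function. This is a minimal sketch with illustrative values:

```python
mixed = 2 + 0.5      # int + float: the int is converted, the result is a float
whole = 2 * 3        # int * int: the result stays an int

print(mixed, type(mixed))
print(whole, type(whole))
```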

Division always evaluates to a floating point number.
If this is not desired, the integer division operator `//` can be used.

We can explicitly convert ints to floats and vice versa using the `int` and `float` functions. Note that `int` does not round; it truncates toward zero.
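The two division operators and the explicit conversions described above can be compared side by side. The values are illustrative only:

```python
true_div = 6 / 3        # 2.0: a float, even though the result is a whole number
floor_div = 7 // 2      # 3: integer division of two ints gives an int
truncated = int(-2.7)   # -2: truncation toward zero, not rounding
as_float = float(3)     # 3.0

print(true_div, floor_div, truncated, as_float)
```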

## More Complex Math

More complex math requires the loading of **libraries**. We will use **numpy**.
The following statement imports the numpy library under the shorthand name `np`.
Unlike in this exercise, it is good practice to load modules at the top of a notebook.
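The import statement conventionally looks like this:

```python
# Make numpy available under the conventional alias np.
import numpy as np
```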

Numpy provides basic mathematical constants, like $\pi$ or $e$.

However, to compute the exponential function, always use `np.exp` rather than raising `np.e` to a power.
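A brief sketch of the constants and the exponential function; the argument `2.0` is an arbitrary choice of ours:

```python
import numpy as np

print(np.pi)  # the constant pi
print(np.e)   # Euler's number

exp_two = np.exp(2.0)  # the exponential function evaluated at 2
print(exp_two)
```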

Scientific constants are a part of **scipy**.
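As a sketch, assuming the `scipy.constants` module is the intended entry point, a few constants can be looked up like this. They are given in SI units:

```python
from scipy import constants

print(constants.Boltzmann)  # Boltzmann constant in J/K
print(constants.Avogadro)   # Avogadro constant in 1/mol
print(constants.R)          # molar gas constant in J/(mol K)
```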

## Exercise

Convert $1\ \mathsf{u\ nm^2\ ps^{-2}}$ to $\mathsf{kJ\ mol^{-1}}$ and to $\mathsf{kcal\ mol^{-1}}$.

## Variables

We can assign numbers and the result of an expression to **variables** as follows.
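For example, with values chosen only for illustration:

```python
a = 2          # assign a number
b = a * 3.5    # assign the result of an expression
```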

The preceding lines are called **statements**. They do not evaluate to a value and nothing is printed. To get the value of `a` and `b`, we can just type them at the end of a cell or we can use the `print` function.

It is possible and common to have multiple statements per cell.
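A cell with several statements might look like the following sketch. The variable names and values are our own:

```python
a = 2
b = a * 3.5
c = a + b

# print shows values explicitly; in a notebook, a bare name on the
# last line of a cell would also display its value.
print(a, b)
print(c)
```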

## Strings

**Strings** are another **data type**. They store text.
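A minimal sketch with made-up text: strings are written in single or double quotes and can be joined with `+`.

```python
greeting = "Hello"
name = 'world'   # single and double quotes are equivalent

message = greeting + ", " + name + "!"
print(message)       # Hello, world!
print(len(message))  # 13
```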

## Functions

We have already encountered some **built-in** functions, such as `print`, `int`, or `float`, as well as **library functions**, such as `np.exp`.
We can also define our own functions.
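A function definition could look like this; `square` is a hypothetical example name:

```python
def square(x):
    """Return x multiplied by itself."""
    return x * x


print(square(3))  # 9
```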

Functions can accept multiple **arguments**.

We can also refer to the arguments of a function by name.

We may also set default values.
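The following sketch puts multiple arguments, named arguments, and default values together. The function `scaled_sum` and its parameters are hypothetical:

```python
def scaled_sum(x, y, scale=1.0):
    """Return the sum of x and y, multiplied by scale (default 1.0)."""
    return scale * (x + y)


positional = scaled_sum(1, 2)              # scale falls back to its default
by_name = scaled_sum(y=2, x=1)             # named arguments may come in any order
with_scale = scaled_sum(1, 2, scale=10.0)  # override the default

print(positional, by_name, with_scale)  # 3.0 3.0 30.0
```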

## Exercise

Write a function called `v_harmonic` that computes the harmonic potential,

$$
V_\mathsf{harmonic} = \frac{1}{2}k(r-r_0)^2\,
$$

at a position `r`, given some force constant `k` and minimum position `r0`.
Verify that your results are correct by comparing with the output below.

## Rounding

Occasionally, we need to round floating point numbers to whole numbers in a defined way. This is possible with the numpy functions `round`, `ceil`, and `floor`. They return floats, so wrap the result in `int` if an integer is needed.

`round` rounds to the nearest integer (values exactly halfway between two integers go to the even one).
`ceil` rounds up to the nearest integer.
`floor` rounds down to the nearest integer.
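A quick comparison of the three functions on illustrative inputs. Note that numpy returns floats and rounds exact halves to the nearest even number:

```python
import numpy as np

rounded = np.round(2.7)       # 3.0
ceiled = np.ceil(2.1)         # 3.0
floored = np.floor(2.9)       # 2.0
half_to_even = np.round(2.5)  # 2.0, because halves go to the even neighbor
ceiled_negative = np.ceil(-2.7)  # -2.0

as_int = int(np.floor(2.9))   # convert the float result to an int: 2

print(rounded, ceiled, floored, half_to_even, ceiled_negative, as_int)
```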

## Exercise

Write a function called `wrap`, which accepts two arguments: `x` and `length`.
Set a default value of `1.0` to `length`.
The function should return a number between `0` and `length`, by adding or subtracting multiples of `length` to/from `x`.
Verify that your results are correct by comparing with the output below.

## Sequences

We often work with many numbers, which we can organize in various data structures.
**Sequences** are data structures that have an order.
A simple sequence in Python is called a **tuple**.

To access elements, we use square brackets. Note: Indexing in Python is zero-based.
Tuples are immutable sequences. They cannot be changed.
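A sketch of creating and indexing a tuple, with made-up values. The attempted assignment fails, which shows the immutability:

```python
position = (1.0, 2.5, -0.5)

first = position[0]    # zero-based: index 0 is the first element
last = position[-1]    # negative indices count from the end

try:
    position[0] = 9.0  # tuples do not support item assignment
except TypeError as error:
    print("Cannot change a tuple:", error)
```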

Another sequence in Python is called a **list**. Lists are mutable sequences. They can be changed.
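For comparison, a list with illustrative values can be modified in place:

```python
masses = [1.0, 12.0, 16.0]

masses[0] = 2.0      # replace an element
masses.append(14.0)  # add an element at the end

print(masses)  # [2.0, 12.0, 16.0, 14.0]
```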

## Arrays

**Arrays** are sequences of fixed size containing only one data type. Unlike tuples, their elements can be changed in place.
They support **fast numerical operations**.
Arrays will be our go-to data type for MD.
They are a part of **numpy**.
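As a minimal sketch, an array can be built from a list with `np.array`. The input values are illustrative; because one of them is a float, all elements are stored as floats:

```python
import numpy as np

masses = np.array([1, 12, 16.0])

print(masses)
print(masses.dtype)  # float64: a single data type for all elements
```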

We can create a sequence of numbers using `np.arange`.
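For instance, with arbitrary limits chosen here for illustration:

```python
import numpy as np

indices = np.arange(5)       # 0, 1, 2, 3, 4 (the stop value is excluded)
evens = np.arange(0, 10, 2)  # start, stop (excluded), step

print(indices)
print(evens)
```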

Other useful functions for creating arrays are `np.zeros`, `np.ones`, `np.full`, as well as `np.zeros_like`, `np.ones_like`, and `np.full_like`.
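A sketch of these creation routines; the sizes and fill values are arbitrary:

```python
import numpy as np

zeros = np.zeros(3)       # three zeros
ones = np.ones(3)         # three ones
sevens = np.full(3, 7.0)  # three copies of 7.0

template = np.arange(4)
zeros_like = np.zeros_like(template)  # same shape and data type as template
halves = np.full_like(zeros, 0.5)     # same shape as zeros, filled with 0.5

print(zeros, ones, sevens, zeros_like, halves)
```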

Very often we have to create evenly-spaced arrays.
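One numpy function for this is `np.linspace`. We assume here that it is the intended one, since the original code cell is not shown. It takes the start, the end (included), and the number of points:

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 5)  # five points from 0 to 1, both ends included

print(grid)
```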

Arrays support arithmetic operations.
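A short sketch with illustrative values; each operation is applied element by element:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 20.0, 30.0])

added = x + y     # [11, 22, 33]
doubled = 2 * x   # [2, 4, 6]
products = x * y  # [10, 40, 90]
squares = x ** 2  # [1, 4, 9]

print(added, doubled, products, squares)
```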

## Plotting

There are various plotting libraries for Python. Here, we present **matplotlib**, a simple plotting library with an interface called pyplot, inspired by MATLAB.
Pyplot supports various styles. We use the ggplot theme.
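A minimal plotting sketch, assuming the style is selected with `plt.style.use`. The plotted parabola and the axis labels are our own example, not the original notebook's:

```python
import matplotlib.pyplot as plt
import numpy as np

plt.style.use("ggplot")  # select the ggplot theme

x = np.linspace(-2.0, 2.0, 50)

fig, ax = plt.subplots()
ax.plot(x, x**2)  # draw a parabola as a line
ax.set_xlabel("x")
ax.set_ylabel("y")
```

In a notebook, the figure is displayed below the cell once the cell has run.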

## Exercise

Plot a sine and a cosine function in one plot from $0$ to $2\pi$.
Add a legend.

## Comparing Values

We can compare values by using the **logical comparison operators** for equality, `==`, inequality, `!=`, as well as the ones for greater-than, `>`, less-than, `<`, greater-or-equal, `>=`, and less-or-equal, `<=`.

Comparisons can be chained.
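A few comparisons with illustrative values; each one evaluates to `True` or `False`:

```python
is_equal = 1 + 1 == 2    # True
is_different = 3 != 3    # False
is_larger = 2.5 > 2      # True
in_range = 0 <= 0.5 < 1  # chained: True only if both comparisons hold

print(is_equal, is_different, is_larger, in_range)
```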

## Boolean Operators

We can also use the **boolean operators** `and`, `or`, and `not`, as follows.
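A sketch with hypothetical variables `temperature` and `pressure`:

```python
temperature = 300
pressure = 1.0

both = temperature > 273 and pressure < 2.0    # True: both conditions hold
either = temperature > 400 or pressure < 2.0   # True: at least one holds
negated = not temperature > 273                # False: negates the comparison

print(both, either, negated)
```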

## Looping and Branching

**Loops** are used to repeat a statement multiple times.
In Python, we always try to loop over sequences.
**Indentation** is important.
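A minimal loop over a list of made-up values. The indented line forms the body of the loop and runs once per element:

```python
total_mass = 0.0
for mass in [1.0, 12.0, 16.0]:
    total_mass = total_mass + mass  # indented: part of the loop body

print(total_mass)  # not indented: runs once, after the loop
```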

If we want to loop over numbers, we use the `range` function.
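For example, collecting the first few square numbers (an illustration of our own):

```python
squares = []
for i in range(5):  # i takes the values 0, 1, 2, 3, 4
    squares.append(i**2)

print(squares)  # [0, 1, 4, 9, 16]
```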

Sometimes we need to evaluate a statement **conditionally**. This is called branching.
A common pattern that we see in MD is shown below.
It involves the **modulo** operator, which returns the remainder of a division.
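The original cell is not shown. We assume the pattern meant is periodic output, where something happens only every few steps of a loop. The names `n_steps` and `output_interval` are hypothetical:

```python
n_steps = 10
output_interval = 5

reported = []
for step in range(n_steps):
    # step % output_interval is the remainder of step / output_interval;
    # it is zero exactly when step is a multiple of output_interval.
    if step % output_interval == 0:
        reported.append(step)
        print("Reached step", step)
```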

## Exercise

Write a function that determines if some number `n` is a prime number.
Here's a simple, though inefficient algorithm.
Return `False` if `n <= 1`.
Loop from `2` to `n-1` and check if `n` can be divided by this number. If so, return `False`.
After the loop has finished, return `True`.
Check your implementation for the numbers `1` to `11`.

## Random Numbers

Numpy gives access to random number generators through the functions `np.random.rand` (uniform distribution on $[0, 1)$) and `np.random.randn` (standard normal distribution).
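A sketch of drawing samples. The sample size and the seed are arbitrary choices of ours; seeding is optional and only makes the numbers reproducible:

```python
import numpy as np

np.random.seed(42)  # optional: fix the seed for reproducible numbers

uniform = np.random.rand(1000)  # 1000 samples, uniform on [0, 1)
normal = np.random.randn(1000)  # 1000 samples, standard normal

print(uniform[:3])
print(normal[:3])
```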

Let's visualize this by plotting histograms.

## Statistics

Once we have some random data, we can compute the sample mean and the sample standard deviation as follows.
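A sketch assuming `np.mean` and `np.std` are the functions meant. It uses a small fixed data set of our own so the numbers are easy to check. Note that `np.std` divides by $N$ by default; passing `ddof=1` divides by $N-1$, which gives the sample standard deviation:

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = np.mean(data)               # 5.0
std_default = np.std(data)         # 2.0, divides by N
std_sample = np.std(data, ddof=1)  # divides by N - 1, slightly larger

print(mean, std_default, std_sample)
```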

We can also do a regression.
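The original cell is not shown. As one possible approach, a straight-line least-squares fit can be done with `np.polyfit`; the notebook may have used a different routine. The data below are synthetic and lie exactly on a line:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0  # synthetic data with slope 2 and intercept 1

# Fit a polynomial of degree 1; coefficients come highest power first.
slope, intercept = np.polyfit(x, y, 1)

print(slope, intercept)
```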

## Multidimensional Arrays

Arrays can have multiple dimensions. A two-dimensional array, for example, is a matrix.
We will often work with 2D arrays to save the 3N coordinates and 3N velocities of a system.

We can use the `size` and `shape` attributes of an array to query information on the number of elements and the shape.
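A small sketch with made-up coordinates of two particles in three dimensions:

```python
import numpy as np

positions = np.array([[0.0, 0.0, 0.0], [1.0, 0.5, 0.0]])

print(positions.size)   # 6: total number of elements
print(positions.shape)  # (2, 3): two rows, three columns
```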

The array creation routines `np.zeros`, `np.ones`, `np.full`, and the corresponding "`_like`" functions support multiple dimensions.

The same is true for the random number generators, but here we don't need to provide a tuple.
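The difference in calling convention can be seen in this sketch. The shape `(4, 3)` is arbitrary:

```python
import numpy as np

zeros = np.zeros((4, 3))      # the shape is passed as one tuple
ones = np.ones_like(zeros)    # the shape is taken from an existing array
threes = np.full((4, 3), 3.0)

uniform = np.random.rand(4, 3)  # dimensions are passed as separate arguments
normal = np.random.randn(4, 3)

print(zeros.shape, ones.shape, threes.shape, uniform.shape, normal.shape)
```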

We can also reshape 1D arrays or stack multiple ones.
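A sketch using `reshape` and `np.stack`. We assume these are the functions meant; numpy offers several stacking routines. The values are illustrative:

```python
import numpy as np

flat = np.arange(6)
matrix = flat.reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])
z = np.array([5.0, 6.0])
stacked = np.stack([x, y, z], axis=1)  # each 1D array becomes a column

print(matrix)
print(stacked)
```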

Elements of multidimensional arrays can be accessed in various ways.
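Some common access patterns, shown on a small example array of our own:

```python
import numpy as np

m = np.arange(12).reshape(4, 3)  # rows: [0 1 2], [3 4 5], [6 7 8], [9 10 11]

element = m[1, 2]   # row 1, column 2: 5
row = m[0]          # the first row: [0, 1, 2]
column = m[:, 0]    # the first column: [0, 3, 6, 9]
block = m[1:3, :2]  # rows 1 and 2, first two columns: [[3, 4], [6, 7]]

print(element, row, column, block)
```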

## Exercise

Create a 2D array called `velocities` with shape `(10000, 3)`. It should correspond to the velocities of 10000 particles in three dimensions.
The array should contain normally distributed random numbers with zero mean and a standard deviation of $\sqrt{kT/m}$.
Choose $T=300\ \mathsf{K}$ and $m=10\ \mathsf{u}$.
Make sure that the velocities have units of $\mathsf{nm/ps}$.
Verify that the mean is close to zero and the standard deviation is close to 0.5 using `np.mean` and `np.std`.
Plot histograms of the velocities in each dimension.

## Array Operations

Most numpy operations are element-wise: they work on each element separately.
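For example, with illustrative values:

```python
import numpy as np

x = np.array([0.0, 1.0, 4.0])

roots = np.sqrt(x)  # square root of every element: [0, 1, 2]
shifted = x + 1.0   # the scalar is added to every element: [1, 2, 5]

print(roots, shifted)
```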

Some operations work on the array as a whole.

Many of these functions accept a keyword `axis`, to restrict the function evaluation to a particular dimension.
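A sketch of whole-array operations with and without `axis`, using `np.sum` and `np.mean` as examples on a small made-up matrix:

```python
import numpy as np

m = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

total = np.sum(m)                # over all elements: 21.0
column_sums = np.sum(m, axis=0)  # collapse the rows: [5, 7, 9]
row_means = np.mean(m, axis=1)   # collapse the columns: [2, 5]

print(total, column_sums, row_means)
```

The axis that is named is the one that disappears from the shape of the result.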

## Exercise

In this exercise, we will work with our array `velocities` again.
It should be an array of shape `(10000, 3)`.
Compute the magnitude of the velocity vector of each particle, that is, reduce velocities to a 1D array with 10000 elements.
Call this array `vabs`.
Plot a histogram.
What is the name of this distribution?
Compute the mean absolute velocity.