# Intro to Python: Essential Essentials

Now that we've added the syllabus to our class GitHub repo and created a conda environment, it's time to get moving with our first Python notebook. The overall goal is to learn all of the basics of Python, with an emphasis on code transparency; efficiency comes later.

We begin by installing some of the libraries we'll use in this section.

## Basic data types

The essential data types in Python are integers, floats, strings, and booleans. Let's play with them.
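As a minimal sketch of the four types (the variable names and values here are illustrative, not taken from the original notebook):

```python
# One example of each basic type.
n = 3            # int
x = 2.5          # float
s = "hello"      # str
flag = True      # bool

for value in (n, x, s, flag):
    print(value, type(value).__name__)

# Mixing an int and a float in arithmetic gives a float.
total = n + x
print(total, type(total).__name__)   # 5.5 float

# True division always returns a float; floor division of ints returns an int.
print(7 / 2, 7 // 2)                 # 3.5 3

# Strings concatenate with + and repeat with *.
greeting = s + "!" * n
print(greeting)                      # hello!!!

# bool is a subclass of int, so True behaves like 1 in arithmetic.
print(flag + 1)                      # 2
```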

## Conditionals and Loops

Conditionals are the bread and butter of programming. They allow us to make decisions based on the state of our program. The basic syntax uses the `if`, `elif`, and `else` keywords, each followed by an indented block.
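A small sketch of an `if`/`elif`/`else` chain; the temperature value and the labels are made up for illustration:

```python
temperature = 310.0  # kelvin; an arbitrary illustrative value

# Only the first branch whose condition is true gets executed.
if temperature < 273.15:
    state = "below freezing"
elif temperature < 373.15:
    state = "between freezing and boiling"
else:
    state = "above boiling"

print(state)  # between freezing and boiling
```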

Loops are also very important. They allow us to repeat a task many times.
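The two loop forms can be sketched as follows (the numbers are arbitrary examples):

```python
# A for loop visits each item of an iterable in turn.
squares = []
for i in range(5):
    squares.append(i ** 2)
print(squares)  # [0, 1, 4, 9, 16]

# A while loop repeats until its condition becomes false.
n = 100
halvings = 0
while n > 1:
    n = n // 2
    halvings += 1
print(halvings)  # 6
```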

## Collections

Essential types of collections are lists, tuples, dictionaries, and sets. We'll go through each of them.
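Here is a quick sketch of one of each, with illustrative names and values:

```python
particles = ["electron", "muon", "tau"]        # list: ordered, mutable
position = (0.0, 1.5, -2.0)                    # tuple: ordered, immutable
charge = {"electron": -1, "proton": +1}        # dict: maps keys to values
seen = {"electron", "muon", "electron"}        # set: unique items, unordered

particles.append("neutrino")                   # lists can grow in place
print(particles[0], len(particles))            # electron 4
print(position[1])                             # 1.5
print(charge["proton"])                        # 1
print(len(seen))                               # 2, the duplicate is dropped
```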

You're a physicist and probably think sequences are cool.
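The original cell is not reproduced here, so as a stand-in, assume a recurrence such as the Fibonacci sequence, where each new term is built from terms already stored in the list:

```python
# Assumed example: Fibonacci numbers, built by appending to the same list
# that the recurrence reads from.
fib = [0, 1]
for _ in range(8):
    fib.append(fib[-1] + fib[-2])

print(fib)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```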

This one is a little complicated because the variable refers to itself. But for sequences that don't, we have some clever tricks.
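One such trick, assuming that comprehensions are what is meant here, builds a whole sequence in a single expression when each term depends only on its index:

```python
# List comprehension: one expression per element.
squares = [n ** 2 for n in range(6)]
print(squares)        # [0, 1, 4, 9, 16, 25]

# An optional condition filters the elements.
even_squares = [n ** 2 for n in range(6) if n % 2 == 0]
print(even_squares)   # [0, 4, 16]

# The same idea works for dictionaries.
square_of = {n: n ** 2 for n in range(4)}
print(square_of)      # {0: 0, 1: 1, 2: 4, 3: 9}
```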

Sets aren't subscriptable because they are unordered. But they are very useful for checking membership because a lookup takes O(1) time on average.

Lists, on the other hand, are ordered and subscriptable. Indexing a list is fast, but checking membership with `in` takes O(n) time because Python may have to scan the whole list.
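A short sketch of the difference, using made-up values:

```python
items_list = [3, 1, 4, 1, 5]
items_set = set(items_list)

print(items_list[0])   # 3, because lists support indexing

# Indexing a set raises a TypeError.
try:
    items_set[0]
except TypeError as err:
    set_error = str(err)
    print(set_error)

# Membership tests work on both; only the cost differs.
print(4 in items_set, 4 in items_list)  # True True
```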

# Plotting

We need to learn how to plot stuff.
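As a minimal sketch of a line plot with matplotlib (the function plotted and the labels are arbitrary choices):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
# In a notebook the figure renders automatically; in a script, call plt.show().
```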

Other types of plots in matplotlib and seaborn are histograms, scatter plots, and 3D plots.

Let's make a histogram of the normal distribution.
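One way to do this, sketched with an arbitrary sample size, bin count, and seed:

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)                    # fixed seed so the figure is reproducible
samples = np.random.randn(10_000)    # draws from a standard normal

fig, ax = plt.subplots()
# density=True rescales the bars so that the total area is 1.
counts, edges, _ = ax.hist(samples, bins=50, density=True)
ax.set_xlabel("x")
ax.set_ylabel("probability density")
```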

# Scientific Computing Libraries

Some essential scientific computing libraries in Python are:
- [NumPy](https://numpy.org/): Numerical Python
- [SciPy](https://www.scipy.org/): Scientific Python
- [Matplotlib](https://matplotlib.org/): Plotting library
- [Seaborn](https://seaborn.pydata.org/): Statistical data visualization
- [Pandas](https://pandas.pydata.org/): Data analysis library
- [SymPy](https://www.sympy.org/en/index.html): Symbolic Python

You saw some of these utilized in the previous lecture, but in this lecture we'll introduce them a little more thoroughly.

# NumPy

NumPy is the fundamental package for scientific computing in Python. It provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, and much more.

## NumPy Essentials

To get used to some of the operators, we'll create some random arrays. Functions that do this in NumPy include `np.random.rand` and `np.random.randn`. The first creates an array of random numbers drawn uniformly from the interval [0, 1), while the second creates an array of random numbers from a standard normal distribution.
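A brief sketch of both functions; the shapes and the seed are arbitrary. Note that both take the array dimensions as separate positional arguments rather than as a tuple:

```python
import numpy as np

np.random.seed(0)            # fix the seed so results are reproducible

u = np.random.rand(5)        # 5 uniform samples in [0, 1)
z = np.random.randn(2, 3)    # 2x3 array of standard normal samples

print(u)
print(z.shape)               # (2, 3)
```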

If you want to create an array of random integers, you can use `np.random.randint`. We'll use this for now to keep our math simpler.
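For instance, a sketch with an arbitrary range and size:

```python
import numpy as np

np.random.seed(0)

# Integers from 1 to 9: the lower bound is inclusive, the upper bound exclusive.
a = np.random.randint(1, 10, size=4)
b = np.random.randint(1, 10, size=4)

print(a, b)
```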

Let's do some basic operations.
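A minimal sketch with two fixed integer vectors (chosen by hand here so the results are easy to check):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([2, 2, 4, 8])

total = a + b        # 3, 4, 7, 12
difference = a - b   # -1, 0, -1, -4
product = a * b      # 2, 4, 12, 32
quotient = a / b     # 0.5, 1.0, 0.75, 0.5

print(total)
print(difference)
print(product)
print(quotient)
```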

Note: it wasn't obvious, a priori, what to expect for multiplication and division, since these are vectors. We see that NumPy performs both element-wise.

This isn't something we normally do in standard linear algebra, so let's do some of that with some 2x2 matrices.

We can always see the shape of an array with the `shape` attribute.
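For example, with a hand-picked matrix and vector:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
v = np.array([1, 2])

print(A.shape)  # (2, 2)
print(v.shape)  # (2,)  a 1-D array has a one-element shape tuple
```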

Let's multiply.
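A sketch contrasting the two kinds of product on hand-picked 2x2 matrices: `*` is element-wise, while `@` (equivalently `np.matmul`) is the matrix product from linear algebra.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

elementwise = A * B   # [[0, 2], [3, 0]]
matmul = A @ B        # [[2, 1], [4, 3]]: B swaps the columns of A

print(elementwise)
print(matmul)
```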

## NumPy Linear Algebra

Ok, so we see that NumPy can do some linear algebra. What about some other operations?

Some of the most essential operations in `np.linalg` are:
- `np.linalg.inv`: Inverse of a matrix
- `np.linalg.det`: Determinant of a matrix
- `np.linalg.eig`: Eigenvalues and eigenvectors of a matrix
- `np.linalg.solve`: Solve a linear system of equations

We saw the first; let's see the rest.
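A minimal sketch of the remaining three, using a small matrix chosen so the answers can be checked by hand (the matrix `A` and vector `b` are illustrative):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([5.0, 5.0])

det = np.linalg.det(A)                # 4*3 - 1*2 = 10
x = np.linalg.solve(A, b)             # solves A @ x = b, here x = [1, 1]
eigvals, eigvecs = np.linalg.eig(A)   # eigenvalues 2 and 5, order not guaranteed

print(det)
print(x, np.allclose(A @ x, b))

# Each column of eigvecs pairs with the eigenvalue at the same index.
columns_ok = []
for i in range(len(eigvals)):
    v = eigvecs[:, i]
    columns_ok.append(np.allclose(A @ v, eigvals[i] * v))
print(columns_ok)  # [True, True]
```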

In this syntax, `eigvecs[:,i]` means "all rows, column i". This is common syntax in Python and NumPy, and is called "slicing". We'll see it again later.
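A few more slices on a small illustrative array show the pattern:

```python
import numpy as np

M = np.arange(12).reshape(3, 4)
# M is [[ 0,  1,  2,  3],
#       [ 4,  5,  6,  7],
#       [ 8,  9, 10, 11]]

col1 = M[:, 1]       # all rows, column 1      -> 1, 5, 9
row0 = M[0, :]       # row 0, all columns      -> 0, 1, 2, 3
block = M[:2, 1:3]   # rows 0-1, columns 1-2   -> [[1, 2], [5, 6]]

print(col1)
print(row0)
print(block)
```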

In the context of this eigenvalue problem, it is easy to do the wrong thing: `eigvecs` is a matrix, and we need to be careful about whether its rows or its columns are the eigenvectors. Let's see what happens if we do the wrong thing.
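A sketch of that mistake, using an illustrative non-symmetric matrix and treating the rows of `eigvecs` as if they were the eigenvectors:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigvals, eigvecs = np.linalg.eig(A)

# Wrong: take row i instead of column i and test A @ v == lambda_i * v.
rows_ok = []
for i in range(len(eigvals)):
    v = eigvecs[i, :]
    print(A @ v, eigvals[i] * v)
    rows_ok.append(np.allclose(A @ v, eigvals[i] * v))

print(rows_ok)  # [False, False]
```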

We see these clearly don't match.

So: just be careful. Simple tests can always help you sort things out.

## NumPy Statistics

Some of the essential statistical tools in NumPy are in `np.random`, and others, such as `np.mean`, are top-level functions. We'll see some of these in action. Some of the most essential functions are:
- `np.random.rand`: Uniform random numbers in [0, 1)
- `np.random.randn`: Standard normal random numbers
- `np.random.randint`: Random integers
- `np.random.choice`: Random choice from an array
- `np.random.shuffle`: Randomly shuffle an array
- `np.random.seed`: Set the random seed
- `np.mean`: Mean of an array
- `np.std`: Standard deviation of an array
- `np.var`: Variance of an array, which is the square of the standard deviation
- `np.cov`: Covariance matrix of an array, which is a matrix of all pairwise covariances. The diagonal part of the covariance matrix is the variance of each variable.
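The relationships in the last few entries can be checked directly. This is a sketch on random data with an arbitrary shape and seed. One detail to watch: `np.cov` normalizes by N - 1 by default, while `np.var` and `np.std` normalize by N unless `ddof=1` is passed.

```python
import numpy as np

np.random.seed(0)
data = np.random.randn(3, 100)   # 3 variables (rows), 100 observations each

# Variance is the square of the standard deviation.
print(np.isclose(np.var(data), np.std(data) ** 2))   # True

# By default np.cov treats each row as one variable.
cov = np.cov(data)
print(cov.shape)                                     # (3, 3)

# The diagonal matches the per-variable variance once ddof agrees.
row_var = np.var(data, axis=1, ddof=1)
print(np.allclose(np.diag(cov), row_var))            # True
```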

Let's play with `np.mean`.
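A minimal sketch on a small square array with hand-picked values:

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

overall = np.mean(data)   # with no axis argument, averages every entry
print(overall)            # 5.0
```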

That's the mean of the whole thing! What if we want the mean of each row?
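Passing `axis=1` averages across the columns within each row. A sketch on the same kind of illustrative 3x3 array:

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

row_means = np.mean(data, axis=1)   # one value per row: 2, 5, 8
print(row_means, row_means.shape)
```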

And the mean of each column?
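Passing `axis=0` instead averages down the rows within each column, again sketched on an illustrative 3x3 array:

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

col_means = np.mean(data, axis=0)   # one value per column: 4, 5, 6
print(col_means, col_means.shape)
```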

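These two results have the same shape when the array is square, so the shape alone will not tell you which axis you averaged over. Be careful.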

A lot of the other functions take an axis argument in a similar way.
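For instance, `np.sum`, `np.std`, and `np.var` accept `axis` the same way. A sketch on an illustrative array:

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 8.0, 9.0]])

col_sums = np.sum(data, axis=0)   # 12, 15, 18
row_std = np.std(data, axis=1)    # each row is evenly spaced, so all equal
row_var = np.var(data, axis=1)    # 2/3 for every row

print(col_sums)
print(row_std)
print(row_var)
```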

Let's check out `np.random.choice`.
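A short sketch of its main options; the arrays, sizes, probabilities, and seed are arbitrary illustrations:

```python
import numpy as np

np.random.seed(0)
outcomes = np.array(["heads", "tails"])

# Sample with replacement (the default).
flips = np.random.choice(outcomes, size=10)
print(flips)

# Sample without replacement: no value can repeat.
hand = np.random.choice(np.arange(10), size=3, replace=False)
print(hand)

# Weighted sampling with the p argument (probabilities must sum to 1).
biased = np.random.choice(outcomes, size=1000, p=[0.9, 0.1])
heads_fraction = np.mean(biased == "heads")
print(heads_fraction)
```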