# Project 0
## Installation of Jupyter, Tools for Use, and Gases

### Intro
Throughout this course there will be a few projects which use Jupyter Notebook and Python. Moreover, you are encouraged (but *not* required) to use these same tools to solve certain homework problems. For those unfamiliar, Jupyter Notebook is a tool built to integrate text and code together in one document. The following is an introduction to Jupyter and Python that assumes little to no prior knowledge of these tools. If you have experience using them, feel free to skip ahead to the section labeled **Installation**.

### Jupyter & Python
Jupyter itself is an application which you access through a web browser; Jupyter *is not* a programming language. Instead, Jupyter connects to what are called kernels. These are separate applications which evaluate code from various languages. Most often (and always in this class) we will use Jupyter with a Python kernel. This means that we will write code in Jupyter, and when we ask Jupyter to evaluate a code cell, that cell is sent off to the Python kernel for evaluation. This raises the question: what is Python?

A computer-science-heavy definition of Python is a high-level, multi-paradigm, dynamically typed, interpreted language. A more useful definition for us is that Python is the current lingua franca of astronomy. That is to say, for approximately the last 10 years almost all professional astronomy has been done, in some way, in Python. In order to operate in the world of astronomy today, **knowing how to use Python is a key skill**. That is not to say that *all* astronomy software is written in Python. Fortran and C still hold important places when it comes to high-performance simulations, while IDL and IRAF scripting still exist in a limited sense.

If we ask ourselves why Python is so dominant in the field, we have to contend with the fact that its dominance is relatively recent. Up until about 10 years ago C, Fortran, IDL, and IRAF were far more important in day-to-day astronomy (and not just for data-reduction pipelines or intense numerical simulations). Why then did these established languages get displaced by Python? It is not because Python brings a performance increase. In fact, of all the languages listed, Python is by far the least performant. Instead, Python brings an increase in ease of use. Python code is, generally, far easier to write and to read than equivalent C or Fortran code would be. This turns out to be a tradeoff which a great many astronomers have decided is acceptable.

Jupyter developed somewhat in step with Python's adoption by the community (not just in astronomy but in the wider scientific community, though astronomers played a large role in Jupyter development). Jupyter developed as a way to integrate text (such as what you are currently reading) into the same document as code. Jupyter uses a **cell model** where each cell can be either a **markdown cell**, which contains formatted text, or a **code cell**, which contains some code to be evaluated. This is useful when you want to tell a story where code is a large part. The other important use of Jupyter, I find, is as a **visual and dynamic code editor**. That is to say that in Jupyter visualizations are rendered in-line instead of opening in a separate window. This does mean that by default you do not have the ability to pan, scroll, or zoom on them (as they are literally rendered to a png and then that png is embedded into the webpage); however, I find that having those visualizations right next to the code actually lets me iterate and explore faster.
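To make the cell model a little more concrete, here is a minimal sketch of what a notebook file looks like on disk. Notebook files are JSON documents containing a list of cells. The dictionary below is a simplified, hand-written illustration of that layout (real notebooks carry more metadata), not something you need to write yourself.

```python
import json

# A hand-written, simplified notebook with one markdown cell and one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": "# A heading\nSome *formatted* text.",
        },
        {
            "cell_type": "code",
            "execution_count": None,  # not yet run by a kernel
            "metadata": {},
            "outputs": [],            # filled in after the kernel evaluates the cell
            "source": "print(1 + 1)",
        },
    ],
}

# Round-trip through JSON text, which is how the file is stored on disk.
text = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in json.loads(text)["cells"]]
print(cell_types)
```

Jupyter reads and writes this structure for you; the point is only that markdown cells and code cells live side by side in one file.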

To use Jupyter you interact with a file called a notebook (*.ipynb files). These can be opened with a number of programs such as the original Jupyter Notebook, the more modern Jupyter Lab (which is what I recommend you use for this class), or even more traditional text editors such as VS Code, which has plugins that emulate the cell-like behavior of Jupyter. Below I provide instructions for installing and using **Jupyter Lab**.

### Installation
The installation instructions will change somewhat from system to system. In general they will probably be a bit simpler on a Mac or Linux system and a bit more involved on a Windows system. I will try my best to provide installation assistance to Windows users; however, I am simply not familiar with those systems. If after a decent effort you are unable to install Jupyter Lab on Windows, either come to my office hours or email me and I will work on it. If that does not work I will likely direct you to John Griffen, the Department of Physics and Astronomy's IT professional, who will be more qualified to help.

I recommend installing the Python distribution and package manager **<a href="https://docs.anaconda.com/anaconda/install/">anaconda</a>**, which will package everything you need in order to use Python and Jupyter in this class. The link provided should detail how to install anaconda on various systems. **DURING INSTALLATION ANSWER YES TO ANY PROMPT ASKING IF YOU WOULD LIKE TO ADD ANACONDA TO YOUR SHELL OR IF YOU WANT TO INITIALIZE CONDA**. Anaconda will install a version of Python on your system and then tell your system to prefer that version over any previously installed version of Python. If you don't want to use anaconda that is *okay*; you can either install Jupyter on your own or come to me and we can work on installing it without anaconda. That being said, the following instructions will assume that anaconda has been installed.

First, check to make sure the version of Python you are using is from anaconda. All of these slightly indented code blocks are meant to be run in a terminal emulator: on a Mac the default one is called Terminal, Linux distros all have some variant of this, and I believe similar things can be done from CMD on Windows (where the equivalent of `which` is `where`).

```bash
which python
```
This should print out something like `/Users/username/anaconda3/bin/python`. If it does not, close your terminal emulator and then reopen it (or re-source your profile file). Next we are going to install Jupyter and IPython (which provides the Python kernel Jupyter will use) using the Python package manager (pip).

```bash
pip install ipython
pip install jupyterlab
```
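If you are unsure which Python and which tools your terminal is picking up (for example on Windows, where `which` is not available), the same check can be sketched from inside Python itself. The helper name `describe_environment` is made up for this illustration; start `python` in your terminal and run:

```python
import shutil
import sys

def describe_environment():
    """Collect the paths of the active interpreter and the command-line tools."""
    return {
        "interpreter": sys.executable,               # the Python running this code
        "python on PATH": shutil.which("python"),    # None if nothing is found
        "pip on PATH": shutil.which("pip"),
        "jupyter on PATH": shutil.which("jupyter"),
    }

environment = describe_environment()
for label, path in environment.items():
    print(f"{label:16s} {path}")
```

If anaconda is set up as described above, these paths would be expected to point inside your anaconda directory; a `None` means that command was not found on your PATH.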

At this point, assuming you did not get any error messages, you should be set to go. To start Jupyter Lab, run

```bash
jupyter lab
```

This should automatically open a browser and redirect it to a local address. Jupyter is actually running a local webserver which your browser is then connecting to. Within Jupyter Lab you can browse to different locations on your computer using the file browser in the sidebar, create new notebooks by clicking the blue + button at the top of the sidebar, or make new folders with the + folder icon in the same place. When you click the blue plus you will be asked to choose a kernel. You likely only have one available, so select that. Now, finally, you are in a Jupyter notebook. **At this point I recommend opening the \*.ipynb version of this file (as opposed to the pdf version) so you can follow along in Jupyter.**

### Getting Started
There are a few python modules (external code written by others) which we are going to use in this project. These can all also be installed with the python package manager. We will need

1. numpy (general math and array tools)
2. matplotlib (graphing and visualization)
3. scipy (more advanced numerical models)
4. astropy (astronomy specific tools)
5. astroquery (access to astronomy datasets)
6. pandas (more complex data tables)

Some of these are likely already installed (as they come with anaconda by default); however, just to make sure, run `pip install ...` for each of them (e.g. `pip install numpy`).

Once those are all installed you can use them in the notebook which you already created.
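One way to confirm that everything in the list above is present is to ask Python for the installed version of each package. This is only a sketch using the standard library's `importlib.metadata` (available in Python 3.8 and later); the helper name `installed_versions` is made up for this example.

```python
from importlib import metadata

# The six packages listed above.
REQUIRED = ["numpy", "matplotlib", "scipy", "astropy", "astroquery", "pandas"]

def installed_versions(names):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(REQUIRED).items():
    print(f"{name:12s} {version or 'NOT INSTALLED'}")
```

Any package reported as missing can be installed with `pip install` as described above.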

In [None]:
# There are a few different ways to import modules into python

import numpy as np # names the numpy module np for faster typing, this is super common to do
import matplotlib.pyplot as plt # imports only the pyplot submodule from matplotlib and names it plt
import pandas as pd
from astropy import units as u
from astropy import constants as const

## Using units

Astropy is a package designed by and for astronomers. We'll use two modules from astropy, which we'll import directly. Here, we'll be using the astropy "units" module to keep track of units for us. This is extremely convenient because it allows us to convert between units, and keep track of our units, automatically. From personal experience, there can be huge problems if you get this wrong. As a warning, you do need to make sure you have the units you want before using the value somewhere that you aren't keeping track of units. 

After this project I will **not** require the use of astropy units; however, I do encourage you to use them as they really do help prevent some simple mistakes. (Note that in certain circumstances they can complicate things and it might make sense not to use them; in those cases you just need to be extra careful about tracking your units manually.)

In [None]:
myvelocity = 1e-5*const.c  # use the speed of light stored in astropy.constants
print(myvelocity)
print(myvelocity.to('cm/s'))   # change the units to cm/s

mydistance = myvelocity*(5*u.min)  # multiply by 5 min
print(mydistance)
# hmmm, units of "m min / s" are kind of silly
# either of these simplifies the units for you
print(mydistance.decompose(), "or", mydistance.to('m')) 

## Q1: an array of random numbers

Create an array of random numbers for velocities and distances, using one of the options in np.random (e.g. a Gaussian, using "normal" or a Rayleigh distribution). I've included an example creating an array of random integers between 0 and 10, appending units on the end. Print out your array.

In [None]:
myvelocities = np.random.randint(low=0,high=10,size=20)*u.m/u.s
mydistances = np.random.randint(low=0,high=10,size=20)*u.m
print(myvelocities)
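
A side note on reproducibility: the example above gives different numbers every time it runs. If you want a notebook that produces the same "random" values on every run, one option is a seeded generator from `np.random.default_rng`. This is a sketch only; the seed value and variable names are arbitrary choices for illustration.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng = np.random.default_rng(seed=12345)
speeds = rng.integers(low=0, high=10, size=20)   # integers from 0 to 9 inclusive

rng_again = np.random.default_rng(seed=12345)
speeds_again = rng_again.integers(low=0, high=10, size=20)

print(speeds)
print(np.array_equal(speeds, speeds_again))
```

The generator object also provides the other distributions mentioned above (for example `rng.normal` and `rng.rayleigh`), and units can be appended to the result in the same way as before.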

In [None]:
# plot the data (x = distance, y = velocity, so the points match the axis labels below)
plt.scatter(mydistances, myvelocities)

# extract the units so you can use them in your axes labels
xunit_string = mydistances.unit.to_string('latex_inline')
yunit_string = myvelocities.decompose().unit.to_string('latex_inline')

# label the axes
plt.ylabel('Velocity ({})'.format(yunit_string))
plt.xlabel('Distance ({})'.format(xunit_string))

## Q2: An Ideal Gas

Earth's atmospheric pressure at sea level is $1.01 \times 10^5~Pa$ (where $1~Pa = 1~N/m^2 = 10^{-5}~bar$). At room temperature ($293~K$) and this pressure, how much volume would be filled by a little cloud of $10^6~N_2$ molecules?

In [None]:
# For the sake of getting used to Python and jupyter notebooks,
# please use this code cell to perform your calculation.
# (Be sure to indicate the units associated with your answer).

N = int(1e6)                  # total number of molecules
pressure = 1.01e5*u.Pa    # in units of Pa
temperature = 293*u.K     # in units of K
k_B = const.k_B           # Boltzmann's constant

# ...

## Q3: Speeds in Gases

Let's play around a little with these $10^6~N_2$ molecules, and the speeds at which they're moving. The Maxwell-Boltzmann distribution gives the probability that a particle in an ideal gas has the $x$-component of its *velocity* between $v_x$ and $v_x + dv_x$. It is written as

$$ f(v_x)dv_x = \left(\frac{m}{2\pi k_B T}\right)^{1/2}\exp \left(\frac{-mv_x^2}{2 k_B T} \right)dv_x $$

where $m$ is the mass of the particle, $T$ is the temperature of the gas, and $k_B = 1.38 \times 10^{-23}~J~K^{-1}$ is Boltzmann's constant. If we define the quantity $\sigma = \sqrt{k_B T/m}$, then the above expression can be rewritten as 

$$ f(v_x)dv_x = \frac{1}{\sqrt{2\pi} \sigma}\exp \left(\frac{-v_x^2}{2 \sigma^2} \right)dv_x $$

which is precisely the equation for a Gaussian or "normal" probability distribution, centered at $v_x = 0$ and with a width of $\sigma$. 
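
As a quick check of the rewriting step, substituting the definition of $\sigma$ into the two places where $m$, $k_B$, and $T$ appear gives

$$ \sigma^2 = \frac{k_B T}{m} \quad \Rightarrow \quad \left(\frac{m}{2\pi k_B T}\right)^{1/2} = \frac{1}{\sqrt{2\pi}\,\sigma}, \qquad \frac{m v_x^2}{2 k_B T} = \frac{v_x^2}{2\sigma^2} $$

so the prefactor and the exponent of the first expression turn into those of the second.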

Let's use Python to calculate $\sigma$ and simulate the $x$-component of the velocities for $10^6$ imaginary $N_2$ molecules: we'll create a `numpy` array that contains $10^6$ random numbers drawn from this probability distribution.

In [None]:
# define our values
temperature = 293*u.K         # the temperature, in units of K
k_B = const.k_B               # Boltzmann's constant, in units of J/K
mass = 28*u.Dalton            # the mass of an N2 molecule, in amu (or Daltons)

# calculate sigma from these (check on paper that the units make sense!)
sigma = np.sqrt( (k_B*temperature/mass).decompose() )   # in units of m/s

# draw N random numbers (N = 1e6, defined in the Q2 cell) from a Gaussian ("normal") distribution,
# centered at vx=0 and with a width of sigma
# note these numpy functions can't use numbers with units
vx = np.random.normal(0, sigma.value, N)*sigma.unit  # strip sigma of its units, but make sure vx keeps the right units
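
As a cross-check on the cell above, the same calculation can be sketched with plain numbers and no astropy units. The constants are typed in by hand here (SI values for $k_B$ and the atomic mass unit), and the seed is an arbitrary choice; with these inputs $\sigma$ should come out close to 295 m/s, and the mean and standard deviation of the samples should land near 0 and $\sigma$.

```python
import numpy as np

# Hand-typed SI constants (assumed values for this sketch)
k_B = 1.380649e-23        # Boltzmann's constant, J/K
amu = 1.66053906660e-27   # atomic mass unit, kg

temperature = 293.0       # K
mass = 28 * amu           # approximate mass of an N2 molecule, kg

sigma = np.sqrt(k_B * temperature / mass)   # m/s

rng = np.random.default_rng(0)              # arbitrary seed, for repeatability
vx = rng.normal(0.0, sigma, 1_000_000)      # m/s

print(f"sigma       = {sigma:.1f} m/s")
print(f"sample mean = {vx.mean():.2f} m/s")
print(f"sample std  = {vx.std():.1f} m/s")
```

If the unit-aware cell and this plain-number version disagree, a unit conversion has probably gone wrong somewhere.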

Now, let's see how frequently particular values of $v_x$ occur by plotting a histogram. We will use `plt.hist` (https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.hist.html) to make the histogram and then add axis labels (e.g. https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.xlabel.html). I chose the `density=True` option in `plt.hist`, which means the distribution is plotted as a probability density (a function that integrates to 1).
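
To see what "integrates to 1" means for a histogram, here is a small sketch using `np.histogram`, which accepts the same `density=True` option. The sample width of 300 is a made-up number for illustration. Each bar's height times its width is the probability of landing in that bin, so the bar areas should sum to 1.

```python
import numpy as np

rng = np.random.default_rng(1)               # arbitrary seed
samples = rng.normal(0.0, 300.0, 100_000)    # made-up Gaussian samples

# heights are probability densities; edges are the bin boundaries
density, edges = np.histogram(samples, bins=50, density=True)
widths = np.diff(edges)

area = np.sum(density * widths)              # total area under the bars
print(area)
```

This is also why the bar heights can be compared directly against the analytic formula for $f(v_x)$, which is itself a probability density.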

In [None]:
# Let's define a domain to work over 
vx_axis = np.linspace(-1500, 1500, 1000)*u.m/u.s

# Using the formula given above to get the velocity distribution
probability = (1/(np.sqrt(2*np.pi)*sigma))*np.exp(-vx_axis**2/2/sigma**2)


# It is good to separate plotting code from calculation code.

# instantiate the figure (overarching container) and axis (actual dataspace between x and y axis) objects
fig, ax = plt.subplots(1, 1, figsize=(10, 7))
ax.hist(vx.value, bins=50, density=True)
ax.set_xlabel(f'$x$-component of Velocity ({yunit_string})')
ax.set_ylabel('Probability Density')
ax.set_title('Distribution of Velocities in the $x$ Direction')
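# pass plain numbers (m/s and s/m) so the curve shares axes with the unit-less histogram above
ax.plot(vx_axis.value, probability.value, linewidth=5, color='darkorange', alpha=0.5)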

# we can plot the expected average value for vx
ax.axvline(0, color='gray', lw=4, label=r'$\bar{v}_{expected}$')

# we can plot the actual average value of vx, measured from our simulation
vx_average = np.mean(vx.value)
ax.axvline(vx_average, linestyle='--', color='black', label=r'$\bar{v}_{actual}$')

# add a legend, including the labels that were assigned to each line
ax.legend(frameon=False, loc='best');

Hey, cool! The histogram of random numbers that we created with Python follows the shape of the distribution we drew them from. Maybe that's not so surprising, but it's a useful test! Now, let's expand this to three dimensions of motion, and calculate a few more quantities.

+ Create two more arrays for $v_y$ and $v_z$ in the same way you did for $v_x$, assuming the motion in the three directions is totally independent.

+ Plot a histogram of $s = \sqrt{v_x^2 + v_y^2 + v_z^2}$, where $s$ indicates the speed of the particles (the magnitude of their 3D velocity vectors).

+ Indicate with vertical lines on your plot the gravitational escape speed of the planet Earth (11.2 km/s) and the approximate escape speed of the comet 67P/Churyumov–Gerasimenko (1 m/s). 

+ Briefly discuss the implications of the particle speeds and escape speeds for the long-term persistence of an atmosphere on these two bodies.

**Your Brief Discussion Here** (Double click to edit):

**Instructions for Submission:** You can submit this file to Canvas as a .ipynb file. Before doing so, make sure that you restart the kernel (from the menu bar, Kernel>Restart Kernel) and then run all cells in order (Run>Run All Cells). You are also welcome to submit as a PDF (File>Save And Export Notebook As>PDF). Note that PDF export may not work if you do not have pdflatex installed on your computer.