# Part I: Whomst've is an Python?

## Functions
Just as in math irl, a function in Python is something that gives you an output based on an input. You've probably most commonly seen them in the form:

$f(x) = y$

an efficient way of telling you that $y$ is the output of the function $f$, given the input $x$.

While some languages (and versions of Python) use different conventions, Python 3 is convenient in that its functions take the same form. To $\textbf{call}$ a function, or to tell Python to do it, you simply write the function name followed by your input, the $\textbf{argument}$ of the function, in parentheses.

Python is very powerful in that you can define your own functions, but it also has its own built-in functions that don't require you to define anything. One of these is $\texttt{print}$, probably the most useful function ever.

Run the cell below and see what the $\texttt{print}$ function does:

In [None]:
print('Hello World!')

In this case, the argument of the $\texttt{print}$ function was the phrase $\texttt{'Hello World!'}$, and the output was the argument, printed so you can read it.

Another useful function is $\texttt{len}$:

In [None]:
len('Hello World!')

As you might have guessed, this function returns the length of the argument, or how many elements are in it. In this case, we've learned that the phrase 'Hello World!' has 12 characters. This can also be applied to $\textbf{lists}$ of numbers:

In [None]:
len([0,1,2,3,4,5,6,7,8,9,10])

Lists are exactly what you think they are. They are within square brackets, and their elements are separated by commas.

You can also use the $\texttt{list}$ function to create lists:

In [None]:
print(list((0,1,2,3,4,5,6,7,8,9,10)))

When dealing with numbers, $\texttt{min}$ and $\texttt{max}$ are also useful functions:

In [None]:
print(min([0,1,2,3,4,5,6,7,8,9,10]))
print(max([0,1,2,3,4,5,6,7,8,9,10]))

Notice this time I told Python to print the outputs of these two functions. This means two things: one, if a cell contains more than one expression, the notebook only displays the output of the last one automatically, so you need $\texttt{print}$ for the rest. Two, YOU CAN CALL FUNCTIONS OF FUNCTIONS. Neat, huh?

Just like IRL functions, functions in Python can also have multiple arguments, separated by commas in the parentheses:

In [None]:
print( min([0,1,2,3,4,5,6,7,8,9,10]) , max([0,1,2,3,4,5,6,7,8,9,10]))

Yet another built-in function I use all the time is $\texttt{range}$. Given two arguments, it returns a range of integers with a step size of 1 beginning at the first argument and ending at the number before the second. That is, the range is not inclusive of the second argument. This will come back later when we work more with lists again.

So for example, to return the same numbers as the list we're working with, I need only type:

In [None]:
list(range(0,11))

In fact, if I know I will be starting from 0, I only need to give the top number:

In [None]:
list(range(11))

By default, Python and the other morally superior coding languages begin at 0 rather than 1 when making lists of numbers or indexing (something we will get into soon).

Notice how below, the argument is 11, but the min and max of the range are the same as before because it starts at 0 and ends at 10.

In [None]:
print(min(range(11)),max(range(11)))

And if I want a step size other than one, $\texttt{range}$ can take that as a third argument:

In [None]:
list(range(0,11,2))

Now, it's really getting tedious to keep typing out these things. Luckily, in Python, you can also define $\textbf{variables}$. These serve as placeholders for your data and make your life much easier. To set a variable, just type the variable name, =, and its value:

In [None]:
s = 'Hello World!'
l = range(11)

### Benchmark
In the cell below, use Python's built in functions to print both the length of the phrase 'Hello World!' and the list of numbers by using the variables above.

Those are just some of the functions already in Python without us having to define anything! But of course, we'll want to define our own functions to do all sorts of crazy math and science!

To define a new function, we write $\texttt{def}$ then the name of the function and the input variables, followed by a colon:

In [None]:
def function(x,y):
    print(x,y)

Notice how after the colon, I've started a new line and indented. In Python, indentation and spaces matter a lot and take the place of curly brackets and other things in different languages.

After defining a function, I can simply call it just like I would with the built-in functions. Notice that because I placed a print statement within the function, when I call the function my two inputs will be printed and I won't have to explicitly say $\texttt{print(function(x,y))}$.

In [None]:
function(s,l)

If we want a function to return an output, we have to place $\texttt{return}$ and then define what we want the output to be:

In [None]:
def function2(x,y):
    return x + y

function2(2,2)
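
To make the difference between printing and returning concrete, here is a minimal sketch. The function names $\texttt{show\_sum}$ and $\texttt{get\_sum}$ are made up for this illustration. A function that only prints hands back nothing (Python calls that $\texttt{None}$), while a function that returns gives you a value you can keep using:

```python
def show_sum(x, y):
    # Prints the result, but has no return statement, so it returns None
    print(x + y)

def get_sum(x, y):
    # Hands the result back to whoever called the function
    return x + y

printed = show_sum(2, 2)    # the 4 shows up on screen here
returned = get_sum(2, 2)    # nothing is printed here

print(printed)              # None
print(returned)             # 4
print(get_sum(2, 2) * 10)   # a returned value can go straight into more math
```

So if you want to use a function's result later on, $\texttt{return}$ it rather than only printing it.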

### Benchmark
In the cell below, define a function that takes two arguments - both integers. It should print a list beginning at the greater of the two numbers, then every integer between that and the lesser with a step size of 5.

## Data - types and basic operations

Let's define some more variables to work with:

In [None]:
a = 42
b = 10.0
c = '24'

Cool. Now let's go over all the crazy things we can do with these!

Most of the basic arithmetic we can do in Python is pretty intuitive:

In [None]:
2 + 4

In [None]:
72 - 86

In [None]:
3 * 4

In [None]:
10 / 2

You might think exponents would be $\texttt{^}$, but instead in Python we use $\texttt{**}$

In [None]:
10**2

You can of course use math when defining variables:

In [None]:
d = a + 8
print(d)

Something that's neat you can do but also need to be careful about is changing variables by doing operations on them:

In [None]:
d = 50
print(d)
d = d + 5
print(d)
d += 5
print(d)

Use the cell below to do some basic operations with our new variables $\texttt{a}$, $\texttt{b}$, and $\texttt{c}$.

As you might have noticed, some numbers are green and some are red, and Python won't let them all work together!! This is because each of these variables is of a different $\textbf{type}$!!

In the cell below, use the function $\texttt{type()}$ on each of our variables to see what type they are:

You should have discovered the three types of data: $\textbf{int}$, $\textbf{float}$, and $\textbf{str}$. Ints and floats usually work fine together (although not all the time!), but both of these have problems working with strings. That's okay though! We can use the $\texttt{int()}$, $\texttt{float()}$, and $\texttt{str()}$ commands to change the type of our objects.
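
As a rough sketch of how these conversions behave (the variable names $\texttt{n}$, $\texttt{x}$, and $\texttt{text}$ are just for this example, not the ones defined above):

```python
n = 7          # int
x = 2.5        # float
text = '3'     # str

print(type(n), type(x), type(text))

print(n + x)             # int + float works and gives a float: 9.5
print(n + int(text))     # convert the string to an int first: 10
print(float(text) / 2)   # or to a float: 1.5
print(str(n) + text)     # two strings are joined end to end: '73'

# Mixing a number and a string directly is what Python refuses to do
try:
    n + text
except TypeError as err:
    print('TypeError:', err)
```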

In the cell below, change the type of $\texttt{c}$ so that we can do mathematical operations with it on the other variables. Print $\texttt{a + c}$

As far as formatting is concerned, we don't have to worry much when typing or defining ints and floats. However, to define strings, we have to enclose them in quotation marks. It doesn't matter whether you use single or double. The neat thing about strings is that we can multiply them by ints, or add them to other strings:

In [None]:
print('Yikes' * 5)
print('Put two ' + 'and two ' + 'together')

### Benchmark
Define a function below that takes two arguments - one, a string, the other an int. If you input a person's name (string) and age (int), the function will print $\texttt{[Name] is [age] years old.}$

In addition to basic arithmetic, Python can do $\textbf{boolean logic}$. To check if a variable is equivalent to something, we use a double equals sign. If the two objects are equal, Python will return the $\textbf{boolean}$ $\texttt{True}$. If not, it will return the boolean $\texttt{False}$.

In [None]:
a == 42

In [None]:
a == 37

There are other logic operations that you can do that are intuitive:

In [None]:
15 > 10

In [None]:
5 <= 7

In [None]:
3 >= 5

You can also use $\texttt{and}$ statements and $\texttt{or}$ statements. 

For $\texttt{and}$ statements, both statements must be true in order to return $\texttt{True}$. Otherwise, it will return $\texttt{False}$.

For $\texttt{or}$ statements, at least one of the statements must be true in order to return $\texttt{True}$. If both are false, it will return $\texttt{False}$.

In [None]:
3 == 5 or 3 < 5

In [None]:
10 > 2 and 2 < 1

## Data - Lists and Arrays
Of course, science isn't usually just comparing a couple of values at a time. So far we've just used Python as a fancy calculator, but usually we deal with large data sets. Often, these are in the form of our old friend the list. However, lists have their limitations. For example, I can't simply add the values of two lists together. Instead I get this huge monster list:

In [None]:
list(range(0,21,2)) + list(range(0,11))

This is where arrays come in. Arrays are an object not built in to Python already, but in a package called numpy. To import numpy, we simply write $\texttt{import}$ and numpy.

In [None]:
import numpy

To call functions from the numpy module, we type $\texttt{numpy.}$ and then the function we want:

In [None]:
numpy.array(range(11))

numpy has lots of very useful things, and you can end up using it a lot while typing your code, so to save time, we usually import numpy as np:

In [None]:
import numpy as np

And now we only need to type np and Python will know what we're talking about

In [None]:
np.array(range(11))

Now, two drawbacks of the $\texttt{range}$ function are that its step size must be a whole number and that it isn't inclusive of the endpoint. A great alternative is $\texttt{np.linspace}$. It also takes three arguments - the first is the beginning of your number sequence, the second is the end (inclusive), and the third is the number of values. linspace then spaces the values evenly so you get a nice even distribution of floats.

In [None]:
np.linspace(0,10,10)

In [None]:
np.linspace(0,10,100)

If you don't give a third argument, the default number is 50

In [None]:
array = np.linspace(0,20)
print(len(array))

If you really like the range function though, numpy also has $\texttt{arange}$, which works the same, but returns your values as a numpy array:

In [None]:
np.arange(5,50,5)
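
To see the difference between the two side by side, here is a small sketch with arbitrary numbers. $\texttt{arange}$ fixes the step size and stops before the end value, while $\texttt{linspace}$ fixes the number of values and includes the end value:

```python
import numpy as np

stepped = np.arange(0, 1, 0.25)   # fixed step of 0.25, stops before 1
spaced = np.linspace(0, 1, 5)     # exactly 5 values, 1 is included

print(stepped)   # 0, 0.25, 0.5, 0.75
print(spaced)    # 0, 0.25, 0.5, 0.75, 1

# The spacing linspace uses is (end - start) / (number of values - 1)
print(spaced[1] - spaced[0])
```

This is also why $\texttt{np.linspace(0,10,10)}$ gives slightly awkward numbers: ten values means nine gaps, so the spacing is 10/9 rather than 1.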

What's so special about arrays anyway?? Well, for one, you can do mathematical operations on them:

In [None]:
array = np.linspace(0,20)
print(array)
array *= 0.5
print(array)

or combine the elements of two:

In [None]:
array1 = np.linspace(0,20,10); array2 = np.linspace(0,0.5,10)
array3 = array1 * array2
print(array3)
print(array1 + array2)
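
Here is a quick comparison of how lists and arrays respond to the same operators, using small made-up values so the results are easy to check by eye:

```python
import numpy as np

list_a = [1, 2, 3]
list_b = [10, 20, 30]
arr_a = np.array(list_a)
arr_b = np.array(list_b)

# Lists: + glues them together and * repeats them
print(list_a + list_b)   # [1, 2, 3, 10, 20, 30]
print(list_a * 2)        # [1, 2, 3, 1, 2, 3]

# Arrays: the math is done element by element
print(arr_a + arr_b)     # 11, 22, 33
print(arr_a * arr_b)     # 10, 40, 90
print(arr_a * 2)         # 2, 4, 6
```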

You can even pass arrays through functions:

In [None]:
def arrayfunction(x):
    return x**2

arrayfunction(np.arange(0,10))

If you already have your data in a list, you don't have to worry either, because you can easily turn that list into an array:

In [None]:
mylist = [0,5,10,15,20]
array = np.array(mylist)
print(array/5)

Arrays can also be multidimensional:

In [None]:
np.array([[1,0],[0,1]])

If you're not sure of the dimension or size of an array, you can use the $\texttt{shape()}$ function:

In [None]:
array = np.array([[1,0,0], [0,0,1]])
np.shape(array)
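
One thing worth knowing about the output of $\texttt{np.shape}$: it is always a tuple with one entry per dimension, so a 1d array gives a tuple with a single number in it. A short sketch (the names $\texttt{flat}$ and $\texttt{grid}$ are just for illustration):

```python
import numpy as np

flat = np.array([4, 8, 12, 16, 20])
grid = np.array([[1, 0, 0], [0, 0, 1]])

print(np.shape(flat))        # (5,)   one dimension, five elements
print(np.shape(grid))        # (2, 3) two rows, three columns
print(len(np.shape(flat)))   # 1, the number of dimensions
print(len(np.shape(grid)))   # 2
```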

And of course, to define your array manually is very much like creating a list. Within the array function, the data must also be within square brackets:

In [None]:
array = np.array([4,8,12,16])

Okay, I have arrays now, but what if I only need to work with or change one or a few numbers in that array? We can use indexing! To index a list or array, we use square brackets. Like I mentioned before, Python is zero-based, meaning the first element is element 0:

In [None]:
array[0]

You can try to think of this as the "zeroth" element, or just displace by one whenever you think about indices.

To get the last element of a list or array, you can put the index of that element, or -1. Think of this as just indexing in the opposite direction. In general, any distance from the end of the list will just be the negative of the number of elements away from the end:

In [None]:
array[-1]

In [None]:
array[-2]

To index multiple elements, we use a colon. The rules of these are as follows:

$\texttt{a:b}$ - from index a to index b (exclusive)

$\texttt{a:b:c}$ - every c'th entry between indices a and b

$\texttt{a:}$ - from index a until the end

$\texttt{:a}$ - from the beginning to index a (exclusive)
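
The rules above can be tried out on a short list of letters (chosen only so that it's obvious which elements get picked). The same syntax works on numpy arrays:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

print(letters[2:5])     # ['c', 'd', 'e']   index 5 itself is left out
print(letters[1:7:2])   # ['b', 'd', 'f']   every 2nd entry from index 1 up to 7
print(letters[5:])      # ['f', 'g', 'h']   from index 5 to the end
print(letters[:3])      # ['a', 'b', 'c']   from the start up to index 3
print(letters[::3])     # ['a', 'd', 'g']   the whole list in steps of 3
```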

In the cell below, create a new 1d array that ranges from 0 to 100 with 25 elements. Print: a) the array until index 5; b) the array between indices 10 and 20; c) the entire array in steps of 3

When indexing 2d arrays, the convention is $\texttt{[row,column]}$:

In [None]:
array2d = np.array([[1,2,3],
                    [4,5,6],
                    [7,8,9]])
print(array2d[1,1])

Based on the 1d indexing rules, how do you think you could select an entire row of a 2d array? An entire column? Print the second column and the third row of array2d in the cell below:

Another useful trick when working with lists and arrays is to make an "empty" array that you will populate later. This is done using $\texttt{np.zeros()}$. 

For a 1d array, simply put the size you want in the argument. For higher dimensions, $\texttt{(rows,cols)}$.

In [None]:
emptyarray = np.zeros((2,3))
emptyarray

### Benchmark
In the cell below, create the 3x3 identity matrix by creating an empty matrix and populating it.

Say you're looking at a very large array or only care about one portion of your data. You can use the $\texttt{np.where}$ function to retrieve the indices of the values within your array that meet certain criteria.

Consider this hypothetical list of grades:

In [None]:
grades = np.array([100, 62, 77, 56, 98, 54, 83, 91, 69, 96])

passing  = np.where(grades > 70)
print(passing)

$\texttt{np.where}$ returns a tuple, which is like a list that can't be changed once it's made. As you can see, the indices we care about are in the first element of this tuple:

In [None]:
passinggrades = grades[passing[0]]
print(passinggrades)

Notice how we can use a list of indices as itself an index in order to list multiple elements.
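
As a side note, the comparison inside $\texttt{np.where}$ is itself an array of $\texttt{True}$ and $\texttt{False}$ values, and numpy will also accept that directly as an index. Here is a sketch with a shorter made-up array; the $\texttt{\&}$ operator for combining two conditions is something I'm adding here, and the parentheses around each condition are required:

```python
import numpy as np

scores = np.array([100, 62, 77, 56, 98])

mask = scores > 70
print(mask)                  # True, False, True, False, True
print(scores[mask])          # 100, 77, 98
print(np.where(mask)[0])     # 0, 2, 4  (the matching indices)

# Two conditions at once, element by element
middle = scores[(scores > 60) & (scores < 90)]
print(middle)                # 62, 77
```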

### Benchmark
In the cell below, create a 1d array that spans from 2 to 40 with step size 2. Think of this as our X values. Then create a step function by defining a function that returns 0 for values less than 20 and 1 for values greater than or equal to 20. Pass the array through this function

## If statements and Loops
A lot of Python's power comes from its logic and automation abilities. Using these in a smart way can save you a lot of work. As the saying goes - work smarter, not harder!

$\texttt{if}$ statements provide Python with a condition that must be met for the indented code beneath them to run. If this condition is not met, Python skips that block and moves on, checking any other conditions you've given it.

Let's look at the syntax of how this works:

$\texttt{if condition is met:}$

$\quad \texttt{do something}$

Just as with defining functions, keeping track of indentations is very important here. It can easily become confusing once you start nesting these bad boys.

In [None]:
x = 'Python'

if type(x) == str:
    print('Python is a string')

To provide another possible condition, we can use $\texttt{else}$:

In [None]:
x = 5

if x > 10:
    print('X is a pretty big number')
else:
    print('X is a lame number')

To provide even more conditions, we use $\texttt{elif}$, or "else if" statements. We can add as many of these as we want, as long as they are between the $\texttt{if}$ and $\texttt{else}$ statements.

In [None]:
array = np.array([[1,5,7],[3,2,8],[9,1,5]])

if len(np.shape(array)) == 1:
    print('The array is a 1d list')
elif np.shape(array) == (2,2):
    print('The array is a 2x2 matrix')
elif np.shape(array) == (3,3):
    print('The array is a 3x3 matrix')
else:
    print('Not sure what the array is')

In the cell below, create a 1d array with 10 values between 0 and 1, inclusive. Write an $\texttt{if}$ statement that checks whether the last entry in the array is less than one. If so, print the last entry. If not, print "[last entry] is not less than one."

$\texttt{for}$ loops allow you to iterate through lists and arrays and lots of other things and do some operation for every iteration. The syntax is:

$\texttt{for x in y:}$

$\quad \texttt{do something}$

By convention, people like to use $\texttt{i}$, probably for "iteration", but it actually doesn't really matter as long as you keep track of it.

In [None]:
for i in range(10):
    print(i**2)

In [None]:
hidden_message = ['siht','si','a','neddih','egassem']
for word in hidden_message:
    print(word[::-1])

Like I said, you can nest loops and if statements within each other for extra fun, and things can get crazy. In my research I often use five layers of loops. Don't try this at home, kids.

In [None]:
students = ['Peter','Joseph','Pranav','Aryn']
grades = [100,82,65,70]
for i in range(len(students)):
    if grades[i] > 90:
        print(students[i],'is a great student')
    elif grades[i] > 75:
        print(students[i],'is an okay student')
    else:
        print(students[i],'is a terrible student')
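
Looping over $\texttt{range(len(...))}$ works fine, but Python also has two built-in helpers that can make this kind of loop easier to read: $\texttt{zip}$ and $\texttt{enumerate}$. A small sketch with made-up names and scores:

```python
names = ['Ada', 'Grace', 'Alan']   # made-up example data
scores = [95, 80, 60]

# zip pairs up the elements of the two lists, one pair per iteration
for name, score in zip(names, scores):
    print(name, 'scored', score)

# enumerate hands you the index and the element together
for i, name in enumerate(names):
    print(i, name)

# Turning zip into a list shows the pairs it produces
pairs = list(zip(names, scores))
print(pairs)
```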

In the cell below, iterate through the array, and for every value, check whether it is even. If so, print the value.

Hint: the remainder of x / y in Python is $\texttt{x % y}$

In [None]:
array = np.linspace(0,20,21)




A more dynamic variation of the $\texttt{for}$ loop is the $\texttt{while}$ loop. This will run the code as long as a certain condition is met and then stop once it is no longer true.

In [None]:
x = 0
while x < 10:
    print(x)
    x += 1
else:
    print('x is too big!')
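
Where $\texttt{while}$ really earns its keep is when you don't know ahead of time how many iterations you'll need. Here is a minimal sketch with arbitrary numbers that keeps halving a value until it drops to 1 or below and counts how many steps that took:

```python
value = 100.0
steps = 0

while value > 1:
    value /= 2      # halve the value
    steps += 1      # and keep count
else:
    # The else block runs once, when the condition stops being true
    print('Done halving')

print(steps, value)   # 7 0.78125
```

Be careful that something inside the loop actually changes the condition, otherwise the loop never ends.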

## Data - Visualization
You know your way around numpy arrays and the logic of Python! Great! But numbers are very abstract, and once you get really large datasets, looking through those long lists just won't be able to cut it. Plotting data is the best way to get the most out of your research! But how to....?

Most people like to use another package called $\textbf{matplotlib}$, specifically its $\texttt{pyplot}$ module, abbreviated to "plt". Just like numpy, we need to import it:

In [None]:
import matplotlib.pyplot as plt

The most common function you will call from matplotlib is $\textbf{plot}$. Aptly named, it plots your data! Give your X data first and Y data second:

In [None]:
x = np.linspace(0,20)
y = np.linspace(0,10)

plt.plot(x,y)

There are many many different ways to customize and tweak these plots, and there's no way I could go over them all, but I'll try to hit the major ones:

To change the size of your plot, before you call plot, type plt.figure, and within the parentheses set the keyword argument figsize=(width,height), measured in inches:

In [None]:
plt.figure(figsize=(15,5))
plt.plot(x,y)

To change the actual line of the plot, there are several additional arguments you can put in plot(). Some are

linestyle: solid line, dashed line, dot dashed, whatever

color: plot in style!! what's the best color??

alpha: opacity basically. Useful if you have a lot of other plots in one graph

label: doesn't affect the graph but will show up in the legend and is very helpful

Here's an example:

In [None]:
plt.figure(figsize=(15,5))
plt.plot(x,y,ls = '-.', color='m', alpha=0.7,label='beautiful plot')
plt.legend()

Labeling is also very important!! 

In [None]:
plt.xlabel('X data')
plt.ylabel('Y data')
plt.title('my beautiful plot')

You can also change the scale and limits

In [None]:
plt.yscale('log')
plt.xlim(4,50)

Besides doing solid, continuous lines, you can also make scatter plots and histograms:

In [None]:
x = [5,2,8,4]
y = [2,7,3,9]
plt.scatter(x,y,marker = '*',s=2**8)
plt.show()

grades = [40,57,80,76,100,92,46,78,98,72,89]
plt.hist(grades)
plt.title('Grade distribution')
plt.show()

You can save these plots as well! Create a variable "fig" and set it equal to plt.figure(). Before calling plt.show(), do fig.savefig('filename.pdf').
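
Putting those steps in order, a bare-bones sketch might look like this. The sine curve and the filename $\texttt{example\_plot.pdf}$ are placeholders I've picked for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

x_vals = np.linspace(0, 2 * np.pi, 100)

# Keep a handle on the figure so we can save it later
fig = plt.figure(figsize=(8, 4))
plt.plot(x_vals, np.sin(x_vals), ls='--', color='g', label='sin(x)')
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.legend()

# Save first, then show; 'example_plot.pdf' is a placeholder filename
fig.savefig('example_plot.pdf')
plt.show()
```

The file ends up in the same folder as your notebook unless you give a longer path.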

### Benchmark
Create a quadratic function and an array from 0 to 50 to use as X values. Plot X and f(X), with a logarithmic y scale. Make the length:height ratio of the plot 2:1, the color something other than default blue, and the line not solid. Then create another X array, with half the resolution of the first, and scatter plot X and f(X). Again, make the color anything other than default blue and the markers anything other than dots. Label the line plot 'Excellent Data' and have a legend. Title the plot 'My first Python plot'. Once you're happy with it, save it.

# Part II: Pythonic Boogaloo (or, now THIS is podracing)

## Reading and Writing Data
I keep talking about data and how big datasets can be, but where is it?? Surely we don't type in every single number and make a numpy array out of it? Fortunately for us, there are several ways Python can read in data. One of these is reading text files. Numpy has a nice function called $\texttt{loadtxt}$ that reads in strings of numbers in text files and creates arrays we can work with. To show this, I'm going to make you do some of my research for me, mwahahahahahaha

In [None]:
plan_spec = np.loadtxt('gj1132b_psurf1.0_alb0.3_chem_earth.spec',skiprows=2)

I've read in the thermal emission spectrum of a planet from a text file and assigned it to the variable $\texttt{plan_spec}$. This file is split into columns, with wavelength in the first column and flux density in the second. Thankfully, the syntax for this is similar to something we've seen before. Thinking back to another object we used today that is split by columns, plot the planet spectrum in the cell below:

The filename for the star that this planet orbits is 'gj1132_interp.txt'. Read it in and plot it below:

Great, now we have both the planet spectrum and the stellar spectrum. Before we continue to the next step though, I'd like to import another very useful package: astropy. As its name suggests, astropy has many tools catered for astronomy research through Python. In particular right now, I want to import its library of astronomical constants:

In [None]:
import astropy.constants as const

In [None]:
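# Ratio of the planet's radius to the star's radius
radius_ratio = (1.16 * const.R_earth) / (0.207 * const.R_sun)
# star_spec is the stellar spectrum you read in above; squaring the radius ratio gives the area ratio
eclipsedepth = (plan_spec[:,1]/star_spec[:,1]) * radius_ratio**2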
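
For reference, the cell above appears to compute the depth of the planet's secondary eclipse: the planet-to-star flux ratio at each wavelength, scaled by the ratio of the two objects' projected areas. Written out, with $F_p$ and $F_s$ as the planet and star flux densities and $R_p$ and $R_s$ as their radii (this notation is mine, not something from the data files):

$$\delta(\lambda) = \frac{F_p(\lambda)}{F_s(\lambda)} \left(\frac{R_p}{R_s}\right)^2$$

Here $R_p = 1.16\,R_\oplus$ and $R_s = 0.207\,R_\odot$, the values used in the code.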

In [None]:
plt.plot(plan_spec[:,0],eclipsedepth); plt.xlim(0,20)

In [None]:
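# Write wavelength and eclipse depth as two tab-separated columns;
# the with block closes the file automatically when it finishes
with open('New Spectrum.txt','w') as f:
    for i in range(len(eclipsedepth)):
        f.write(format(plan_spec[i,0]) + '\t' + format(eclipsedepth[i]) + '\n')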
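
Writing the file line by line like this works, but numpy can also write a whole array in one go with $\texttt{np.savetxt}$, the counterpart of $\texttt{loadtxt}$. Here is a sketch of that alternative using made-up numbers and a placeholder filename, not the spectrum from above:

```python
import numpy as np

# Made-up example data, not taken from the spectrum files
wavelength = np.array([1.0, 2.0, 3.0, 4.0])
depth = np.array([0.10, 0.25, 0.40, 0.55])

# Stack the two 1d arrays side by side as columns, then write them out
table = np.column_stack((wavelength, depth))
np.savetxt('example_columns.txt', table, delimiter='\t',
           header='wavelength\tdepth')

# savetxt starts the header line with '#', which loadtxt skips by default
readback = np.loadtxt('example_columns.txt')
print(readback[:, 0])   # first column: wavelength
print(readback[:, 1])   # second column: depth
```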

### Benchmark
Use the quadratic function you created in the last section to make a numpy array of the x and y values. Write them to a text file, then read the text file back in and plot it

## Curve Fitting
Data doesn't mean anything if you don't know what it means!! Astronomers frequently try to fit data to models and theory in order to constrain what's going on.

First, let's read in our next set of data:

In [None]:
RV = np.loadtxt('RV Curve.txt')

This is a radial velocity curve of a star. I want you to use scipy's $\texttt{curve_fit}$ function to fit this to a sinusoidal function to solve for the radial velocity of the star as well as the system's velocity through space.

However, one of the most important skills of a coder is knowing what to do when you don't know what to do. Therefore, this activity will be very hands off - I want you to google how to use this function and figure it out for yourselves.

Hint: You may make use of numpy's $\texttt{np.sin()}$ function.