<a href="https://colab.research.google.com/github/eonadler/Colab-Notebooks/blob/main/Week_1_Introduction_to_Python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Dark Matter & Data Visualization Week 1: Introduction to Python

Welcome to the interactive portion of the course!  This week, we'll introduce the programming language Python.

## Table of Contents:
* [Notebooks, Variables, and Printing](#first-bullet)
* [Arithmetic Operations](#second-bullet)
* [Arrays](#third-bullet)
* [Functions](#fourth-bullet)
* [Exercise: Hubble's Law](#fifth-bullet)

## Notebooks, Variables, and Printing <a name="first-bullet"></a>

This page is an interactive environment, called a "notebook," that lets you write and execute Python code. Notebooks consist of "cells," which are blocks of one or more Python instructions. Let's see how to run code in notebooks and how to define and print variables in Python.

Here's a cell that stores the result of a computation (the number of seconds in a day) in a variable and displays its value. Notebooks automatically show the value of the last line of a cell, so no explicit `print` is needed here. Click the "play" button to execute the cell; the result appears just below it. You can also execute the cell using Ctrl+Enter (or Command+Enter if you're on a Mac).

In [None]:
seconds_in_a_day = 24 * 60 * 60
seconds_in_a_day

86400

Variables defined in one cell can be used later in other cells, so the order of execution matters. For example, if we run the following cell without first running the cell that defines *seconds_in_a_day*, Python will raise an error because that variable does not exist yet.

In [None]:
#Lines beginning with # are "comments"; these notes aren't read as code
seconds_in_a_week = 7 * seconds_in_a_day

#Here's an example of how to print more than just the variable's value:
print('There are {} seconds in a week.'.format(seconds_in_a_week))

There are 604800 seconds in a week.
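
The same message can also be built with an "f-string," a formatting style available in Python 3.6 and later. The sketch below is an alternative to `.format`, not a replacement for the cell above; the variable name `message` is introduced here only for illustration.

```python
#Recompute the values so this cell runs on its own
seconds_in_a_day = 24 * 60 * 60
seconds_in_a_week = 7 * seconds_in_a_day

#An f-string fills each {} with the value of the expression written inside it
message = f'There are {seconds_in_a_week} seconds in a week.'
print(message)
```

Both approaches print the same sentence; which one to use is mostly a matter of taste.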


**Exercise.** Click on this cell, then click "+ Code" in the upper-left corner of the notebook. In the new cell, compute the number of seconds in a year by reusing the variables *seconds_in_a_week* and/or *seconds_in_a_day*, and print the result. Run the new cell to check your result.

In [None]:
seconds_in_a_year = seconds_in_a_day*365
print(seconds_in_a_year)

31536000


## Arithmetic Operations <a name="second-bullet"></a>

Python can perform arithmetic operations like + (addition), * (multiplication), / (division), ** (raise to a power), etc. For example:

In [None]:
#Define x
x = 5.
print(x)

#A few arithmetic operations:
print(x+3.)
print(2*x)
print(x/2)
print(x**2)

5.0
8.0
10.0
2.5
25.0
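
A few of the operations covered by "etc." above are sketched below: subtraction, floor division (division rounded down to a whole number), and the remainder after division. The cell redefines *x* so that it runs on its own.

```python
#Define x again so this cell is self-contained
x = 5.

#A few more arithmetic operations:
print(x - 3.)  #subtraction
print(x // 2)  #floor division: divide, then round down
print(x % 2)   #remainder left over after dividing by 2
```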


Arithmetic can also be performed between variables:

In [None]:
#Define y
y = 3.

#Arithmetic using x and y:
print(x+y)
print(x**y)

8.0
125.0


**Exercise.** Define a new variable, *z*, equal to any integer from 0 to 10. Print the result of an arithmetic operation between *x* and *z* that returns 9.0.

In [None]:
#Your code here!

9.0


## Arrays <a name="third-bullet"></a>

It's often useful to work with many numbers at once; we'll store such a collection of numbers in an *array*. We'll use a library called NumPy to create arrays and perform computations on them. For example, let's store the integers from 1 through 5 as a NumPy array:

In [None]:
#Imports only need to be performed once per notebook
import numpy as np

integer_array = np.array([1,2,3,4,5])
print(integer_array)

[1 2 3 4 5]


Each number in an array is referred to as an *element*. In Python, the zeroth (0th) element refers to the first entry, the first (1st) element refers to the second entry, and so on. For example:

In [None]:
print(integer_array[0])
print(integer_array[1])

1
2
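
Because counting starts at zero, the last entry of a five-element array sits at index 4. As a small sketch of this, the cell below looks up the last entry in two ways: by its index, and with the shorthand index -1, which counts from the end of the array. It rebuilds *integer_array* so that it runs on its own.

```python
import numpy as np

integer_array = np.array([1,2,3,4,5])

#len() gives the number of elements in the array
print(len(integer_array))

#The last element is at index 4, one less than the length
print(integer_array[4])

#Index -1 is shorthand for the last element
print(integer_array[-1])
```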


As with single-valued variables, arithmetic can be performed on and between arrays. Each operation is applied element by element. For example:

In [None]:
even_integers = np.array([0,2,4,6,8])
odd_integers = np.array([1,3,5,7,9])

print(2*even_integers)
print(even_integers+odd_integers)

[ 0  4  8 12 16]
[ 1  5  9 13 17]
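
Other operations between arrays work the same way, pairing up entries at matching positions. The sketch below uses subtraction and multiplication; the names `difference` and `product` are introduced here only for illustration.

```python
import numpy as np

even_integers = np.array([0,2,4,6,8])
odd_integers = np.array([1,3,5,7,9])

#Subtract each even integer from the odd integer at the same position
difference = odd_integers - even_integers
print(difference)

#Multiply the entries at matching positions
product = even_integers * odd_integers
print(product)
```

Note that both arrays must have the same number of elements for the entries to pair up this way.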


**Exercise.** Create an array containing the square of each even integer from -2 through 2. Print this array to check your result.

In [None]:
#Your code here!

[4 0 4]


## Functions <a name="fourth-bullet"></a>

When writing Python code, it's often helpful to split conceptually different tasks into separate components. A *function* is a named block of code that only executes when called; functions make code more modular, readable, and reusable. For example, compare the following two methods for generating an array and checking whether its greatest value exceeds a user-specified threshold:

In [None]:
#Method 1: No functions

#Set threshold
threshold = 0.98

#Generate 500 random numbers between 0 and 1
a = np.random.rand(500)

#Find largest number in the array
a_max = np.max(a)

#Check whether the largest number exceeds the threshold
if a_max > threshold:
  print('Success!')

Success!


In [None]:
#Method 2: Functions

def generate_rand_array(size):
  return np.random.rand(size)

def find_max(array):
  return np.max(array)

#The argument is named max_value because max is already a built-in Python function
def check_threshold(max_value,threshold):
  if max_value > threshold:
    print('Success!')

#Call functions in sequence:
a = generate_rand_array(500)
a_max = find_max(a)
check_threshold(a_max,0.98)

#We could also package multiple components into one function:
def check_max_threshold(size,threshold):
  a = np.random.rand(size)
  a_max = np.max(a)
  if a_max > threshold:
    print('Success!')

check_max_threshold(500,0.98)

Success!
Success!
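
The functions above print a message, but a function can also hand a result back to the code that called it using `return`, as `find_max` does. As a sketch of this, the hypothetical function `exceeds_threshold` below returns `True` or `False` instead of printing. It is tested on a small fixed array (values chosen here for illustration) rather than random numbers, so the result is the same every time it runs.

```python
import numpy as np

def exceeds_threshold(array,threshold):
  #Return True if the largest element of array is greater than threshold
  return bool(np.max(array) > threshold)

#A small fixed array whose largest value is 0.99
fixed_array = np.array([0.1,0.5,0.99])

print(exceeds_threshold(fixed_array,0.98))
print(exceeds_threshold(fixed_array,0.995))
```

Returning a value rather than printing lets the result be stored in a variable and reused in later cells.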


## Exercise: Hubble's Law <a name="fifth-bullet"></a>


Each week, the interactive portion will end with an exercise that applies the programming and data visualization concepts introduced in the notebook to the material presented in class.

**Exercise.** We learned about Hubble's law, which relates the speed at which galaxies are moving away from us to their distance from us. Mathematically, $v = H_0 D$, where $v$ is velocity in units of km/s, $D$ is distance in units of Mpc, and $H_0$ is a constant in units of km/s/Mpc.

1. Arrays containing velocity and distance data for the galaxies Hubble studied in 1929 are loaded below. Approximately what value of $H_0$ does this data yield, if Hubble's law holds?

2. We now know that $H_0$ is within a few percent of $70$ km/s/Mpc. What do you think could cause the differences between Hubble's measurements and today's known value?

*Hint: Think about what the measurements represent; should you use all of them when finding $H_0$, and how should you combine the ones you use?*

In [None]:
#Galaxy velocities measured in km/s
galaxy_velocities = np.array([170.,290.,-130.,-70.,-185.,-220.,200.,290.,270.,200.,300.,-30.,650.,150.,500.,920.,450.,500.,500.,960.,500.,850.,800.,1090.])

#Galaxy distances measured in Mpc
galaxy_distances = np.array([0.03,0.03,0.22,0.26,0.28,0.28,0.44,0.5,0.5,0.63,0.79,0.87,0.91,0.87,0.91,1.,1.1,1.1,1.38,1.74,2.,2.,2.,2.])

#Your code to estimate H_0 here!