## IPython notebooks are a browser-based Python environment that integrates:
- Text
- Executable Python code
- Plots and images
- Rendered mathematical equations

### Cell

The basic unit of an IPython notebook is a `cell`. A `cell` can contain any of the above elements.

In a notebook, to run a cell of code, hit `Shift-Enter`. This executes the cell and puts the cursor in the next cell below, or makes a new one if you are at the end. Alternatively, you can use:
    
- `Alt-Enter` to force the creation of a new cell unconditionally (useful when inserting new content in the middle of an existing notebook).
- `Control-Enter` to execute the cell and keep the cursor in the same cell (useful for quick experiments with snippets that you don't need to keep permanently).

### Hello World

In [6]:
print("Hello World!")

Hello World!


In [7]:
# lines that begin with a # are treated as comment lines and not executed

#print("This line is not printed")

print("This line is printed")

This line is printed


### Create a variable

In [8]:
g = 3.0 * 2.0

### Print out the value of the variable

In [4]:
print(g)

6.0


### or even easier:

In [5]:
g

6.0

### `UNIX` commands can be run by placing a `!` before the command:

In [9]:
!ls

01_UnixNotes.txt	      FirstLast_HW2.ipynb  UnixCommands.pdf
02_Python_Introduction.ipynb  LinuxSheet.pdf	   Week1_Assignment.pdf
Astro_Coordinates.pdf	      MainBelt.csv	   hw1temp
BrightStars.csv		      Planets.csv	   hw1temp~
Constellations.csv	      README.md		   images
DiegoMcDonald_HW1.txt	      Raven.txt		   junque.dat
DiegoMcDonald_hw1.txt	      Sillybus.pdf	   small.dat


# Datatypes

In computer programming, a data type is a classification identifying one of various types that data
can have. 

The most common data types we will see in this class are:

* **Integers** (`int`): Integers are whole numbers, which can be negative, zero, or positive: ... -3, -2, -1, 0, 1, 2, 3, 4, ...
    
* **Floating Point** (`float`): Floating point values are numbers with a decimal point: 1.2, 34.98, -67.23354435, ...
    
* **Booleans** (`bool`): Boolean types can only have one of two values: `True` or `False`. In many languages 0 is considered `False`, and any other value is considered `True`.

* **Strings** (`str`): Strings can be composed of one or more characters: `'a'`, `'spam'`, `'spam spam eggs and spam'`. Usually quotes (`'`) are used to specify a string. For example, `'12'` refers to the string, not the integer.
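
The distinctions in the list above can be checked directly in Python. This is only a minimal sketch; the variable names are made up for illustration.

```python
# Each literal has a type that Python reports with type()
whole = 12          # int
real = 34.98        # float
flag = True         # bool
text = '12'         # str: the quotes make this a string, not a number

# Truthiness: 0 converts to False, any other number converts to True
zero_is_false = bool(0)
nonzero_is_true = bool(-3)

# A string of digits must be converted before doing arithmetic with it
as_number = int(text) + 1      # 13
as_text = text + '1'           # '121' (string concatenation)

print(type(whole), type(real), type(flag), type(text))
print(zero_is_false, nonzero_is_true, as_number, as_text)
```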

## Collections of Data Types

* **Scalar**: A single value of any data type.

* **List**: A collection of values. May be mixed data types: `[1, 2.34, 'Spam', True]`, including lists of lists: `[1, [1, 2, 3], [3, 4]]`

* **Array**: A collection of values. Must be same data type. `[1, 2, 3, 4]` or `[1.2, 4.5, 2.6]` or `[True, False, False]` or `['Spam', 'Eggs', 'Spam']`

* **Matrix**: A multi-dimensional array: [[1,2], [3,4]] (an array of arrays).
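
As a rough illustration of the collections above, the sketch below builds a Python list and a few NumPy arrays. It assumes NumPy is installed; the variable names are illustrative only.

```python
import numpy as np

mixed_list = [1, 2.34, 'Spam', True]      # a list may mix data types
nested_list = [1, [1, 2, 3], [3, 4]]      # a list may contain other lists

int_array = np.array([1, 2, 3, 4])        # every element shares one data type
float_array = np.array([1, 2.5, 3])       # the ints are converted to floats
matrix = np.array([[1, 2], [3, 4]])       # a 2-D array (2 rows, 2 columns)

print([type(item).__name__ for item in mixed_list])
print(int_array.dtype, float_array.dtype, matrix.shape)
```

Note how NumPy converts the mixed input `[1, 2.5, 3]` to a single floating point type, whereas the list keeps each element's own type.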

In [11]:
a = 1
b = 2.3
c = True
d = "Spam"

In [12]:
type(a), type(b), type(c), type(d)

(int, float, bool, str)

In [13]:
a + b, type(a + b)

(3.3, float)

In [14]:
a + c, type(a + c)    # True = 1

(2, int)

In [15]:
a + d

TypeError: unsupported operand type(s) for +: 'int' and 'str'

In [16]:
str(a) + d

'1Spam'

# NumPy (Numerical Python) is the fundamental package for scientific computing with Python.

### Load the numpy library:

In [17]:
import numpy as np

### `pi` and `e` are available as NumPy constants:

In [18]:
np.pi, np.e

(3.141592653589793, 2.718281828459045)

### Different types of division

In [19]:
# Normal division

1/3

0.3333333333333333

In [20]:
# Integer division

1//3

0
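
The two division operators behave differently for negative numbers, which is easy to overlook. A short sketch in plain Python:

```python
# True division always returns a float
print(7 / 2)        # 3.5

# Floor division rounds down toward negative infinity
print(7 // 2)       # 3
print(-7 // 2)      # -4, not -3

# divmod returns the floor-division quotient and the remainder together
quotient, remainder = divmod(7, 2)
print(quotient, remainder)                    # 3 1
print(quotient * 2 + remainder == 7)          # True
```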

## Our basic unit will be the NumPy array

In [21]:
np.random.seed(42)                 # set the seed - everyone gets the same random numbers
x = np.random.randint(1,10,20)     # 20 random ints from 1 to 9 (the upper bound, 10, is excluded)
x

array([7, 4, 8, 5, 7, 3, 7, 8, 5, 4, 8, 8, 3, 6, 5, 2, 8, 6, 2, 5])

## Indexing

In [22]:
x[0]    # The 0th element of the array x

7

In [23]:
x[-1]    # The last element of the array x

5

## Slices

`x[start:stop:step]`
 
- `start` is the first item that you want [default = first element]
- `stop`  is the first item that you **do not** want [default = the end of the array, so the last element is included]
- `step`  defines the step size and whether you are moving forwards (positive) or backwards (negative) [default = 1]
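
The defaults described above can be tried on a small array whose values equal their own indices. This is a sketch assuming NumPy is installed; `v` is an illustrative name.

```python
import numpy as np

v = np.arange(10)        # array([0, 1, 2, ..., 9])

print(v[2:5])            # [2 3 4]      stop index 5 is excluded
print(v[:3])             # [0 1 2]      start defaults to the beginning
print(v[7:])             # [7 8 9]      stop defaults to the end
print(v[::3])            # [0 3 6 9]    every third element
print(v[8:5:-1])         # [8 7 6]      negative step walks backwards
```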

In [24]:
x

array([7, 4, 8, 5, 7, 3, 7, 8, 5, 4, 8, 8, 3, 6, 5, 2, 8, 6, 2, 5])

In [25]:
x[0:4]           # first 4 items

array([7, 4, 8, 5])

In [26]:
x[:4]            # same

array([7, 4, 8, 5])

In [27]:
x[0:4:2]         # first four items, step = 2

array([7, 8])

In [28]:
x[3::-1]         # first four items backwards, step = -1

array([5, 8, 4, 7])

In [29]:
x[::-1]          # Reverse the array x

array([5, 2, 6, 8, 2, 5, 6, 3, 8, 8, 4, 5, 8, 7, 3, 7, 5, 8, 4, 7])

In [30]:
print(x[-5:])    # last 5 elements of the array x

[2 8 6 2 5]


## There are lots of different `methods` that can be applied to a NumPy array

In [40]:
x.size                   # Number of elements in x

20

In [32]:
x.mean()                 # Average of the elements in x

5.5499999999999998

In [33]:
x.sum()                  # Total of the elements in x

111

In [34]:
x[-5:].sum()              # Total of last 5 elements in x

23

In [35]:
x.cumsum()                # Cumulative sum

array([  7,  11,  19,  24,  31,  34,  41,  49,  54,  58,  66,  74,  77,
        83,  88,  90,  98, 104, 106, 111])

In [36]:
x.cumsum()/x.sum()        # Cumulative fraction (multiply by 100 for a percentage)

array([ 0.06306306,  0.0990991 ,  0.17117117,  0.21621622,  0.27927928,
        0.30630631,  0.36936937,  0.44144144,  0.48648649,  0.52252252,
        0.59459459,  0.66666667,  0.69369369,  0.74774775,  0.79279279,
        0.81081081,  0.88288288,  0.93693694,  0.95495495,  1.        ])
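
A few more array methods, shown on the same values as `x`. The array is typed out here so that the sketch runs on its own.

```python
import numpy as np

x = np.array([7, 4, 8, 5, 7, 3, 7, 8, 5, 4, 8, 8, 3, 6, 5, 2, 8, 6, 2, 5])

print(x.min())       # 2    smallest value
print(x.max())       # 8    largest value
print(x.argmin())    # 15   index of the first occurrence of the smallest value
print(x.argmax())    # 2    index of the first occurrence of the largest value
```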

### Help about a function:

In [1]:
?x.min    # run this after x has been defined; otherwise IPython reports that the object is not found

Object `x.min` not found.


## NumPy math works over an entire array:

In [42]:
y = x * 2
y

array([14,  8, 16, 10, 14,  6, 14, 16, 10,  8, 16, 16,  6, 12, 10,  4, 16,
       12,  4, 10])

In [43]:
sin(x)     # fails: you need to use NumPy's math functions

NameError: name 'sin' is not defined

In [44]:
np.sin(x)

array([ 0.6569866 , -0.7568025 ,  0.98935825, -0.95892427,  0.6569866 ,
        0.14112001,  0.6569866 ,  0.98935825, -0.95892427, -0.7568025 ,
        0.98935825,  0.98935825,  0.14112001, -0.2794155 , -0.95892427,
        0.90929743,  0.98935825, -0.2794155 ,  0.90929743, -0.95892427])
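
To see what "works over an entire array" means, the sketch below compares an explicit loop using the standard library's `math.sin` with a single call to `np.sin`. The short array is illustrative only.

```python
import math
import numpy as np

x = np.array([7, 4, 8, 5])

# math.sin handles one number at a time, so an explicit loop is needed
looped = np.array([math.sin(value) for value in x])

# np.sin applies the function to every element in a single call
vectorized = np.sin(x)

print(np.allclose(looped, vectorized))    # True
```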

## Masking - The key to fast programs

In [45]:
mask1 = np.where(x>5)
x, mask1

(array([7, 4, 8, 5, 7, 3, 7, 8, 5, 4, 8, 8, 3, 6, 5, 2, 8, 6, 2, 5]),
 (array([ 0,  2,  4,  6,  7, 10, 11, 13, 16, 17]),))

In [46]:
x[mask1], y[mask1]

(array([7, 8, 7, 7, 8, 8, 8, 6, 8, 6]),
 array([14, 16, 14, 14, 16, 16, 16, 12, 16, 12]))

In [47]:
mask2 = np.where((x>3) & (x<7))
x[mask2]

array([4, 5, 5, 4, 6, 5, 6, 5])
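
The comparison `x > 5` is itself an array of `True`/`False` values, and it can be used as a mask directly, without `np.where`. A minimal sketch, with `x` typed out so that the block runs on its own:

```python
import numpy as np

x = np.array([7, 4, 8, 5, 7, 3, 7, 8, 5, 4, 8, 8, 3, 6, 5, 2, 8, 6, 2, 5])

bool_mask = x > 5                 # array of True/False, same length as x
index_mask = np.where(x > 5)      # tuple holding the indices where x > 5

print(bool_mask[:4])              # [ True False  True False]
print(x[bool_mask])               # [7 8 7 7 8 8 8 6 8 6]
print(np.array_equal(x[bool_mask], x[index_mask]))    # True
print(bool_mask.sum())            # 10  (True counts as 1)
```

Both forms select the same elements. The index form is handy when you need the positions themselves.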

## Fancy masking

In [48]:
mask3 = np.where(x >= 8)
x[mask3]

array([8, 8, 8, 8, 8])

In [49]:
# Set all values of x that match mask3 to 0

x[mask3] = 0
x

array([7, 4, 0, 5, 7, 3, 7, 0, 5, 4, 0, 0, 3, 6, 5, 2, 0, 6, 2, 5])

In [50]:
mask4 = np.where(x != 0)
mask4

(array([ 0,  1,  3,  4,  5,  6,  8,  9, 12, 13, 14, 15, 17, 18, 19]),)

In [51]:
# Add 100 to every value of x that matches mask4:

x[mask4] += 100
x

array([107, 104,   0, 105, 107, 103, 107,   0, 105, 104,   0,   0, 103,
       106, 105, 102,   0, 106, 102, 105])
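
The assignments above change `x` in place. If the original values are still needed, one alternative is the three-argument form `np.where(condition, value_if_true, value_if_false)`, which builds a new array. The sketch below uses a shortened array and illustrative names:

```python
import numpy as np

x = np.array([7, 4, 8, 5, 7, 3])

# 0 where x >= 8, otherwise the original value
capped = np.where(x >= 8, 0, x)

# add 100 to every non-zero value, leave the zeros alone
shifted = np.where(capped != 0, capped + 100, capped)

print(capped)     # [7 4 0 5 7 3]
print(shifted)    # [107 104   0 105 107 103]
print(x)          # [7 4 8 5 7 3]  (unchanged)
```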

## Sorting

In [52]:
np.random.seed(13)                 # set the seed - everyone gets the same random numbers
z = np.random.randint(1,10,20)     # 20 random ints from 1 to 9 (the upper bound, 10, is excluded)
z

array([3, 1, 1, 7, 3, 5, 4, 5, 3, 7, 6, 5, 3, 1, 4, 6, 4, 7, 6, 2])

In [53]:
np.sort(z)

array([1, 1, 1, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7])

In [54]:
np.sort(z)[0:4]

array([1, 1, 1, 2])

In [55]:
# Returns the indices that would sort an array

np.argsort(z)

array([ 1,  2, 13, 19, 12,  8,  0,  4, 14, 16,  6,  7,  5, 11, 18, 10, 15,
        3, 17,  9])

In [56]:
z, z[np.argsort(z)]

(array([3, 1, 1, 7, 3, 5, 4, 5, 3, 7, 6, 5, 3, 1, 4, 6, 4, 7, 6, 2]),
 array([1, 1, 1, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7]))

In [57]:
maskS = np.argsort(z)

z, z[maskS]

(array([3, 1, 1, 7, 3, 5, 4, 5, 3, 7, 6, 5, 3, 1, 4, 6, 4, 7, 6, 2]),
 array([1, 1, 1, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7]))
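
The indices from `np.argsort` are most useful for reordering a second array to match the first. A small sketch with made-up sample arrays:

```python
import numpy as np

names = np.array(['Mars', 'Venus', 'Earth', 'Mercury'])
a = np.array([1.5237, 0.7233, 0.9991, 0.3871])

order = np.argsort(a)           # indices that sort a from smallest to largest

print(order)                    # [3 1 2 0]
print(names[order])             # ['Mercury' 'Venus' 'Earth' 'Mars']
print(a[order][::-1])           # largest first: [1.5237 0.9991 0.7233 0.3871]
```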

# Reading in data - The `AstroPy` package

In [58]:
from astropy.table import QTable

In [59]:
!cat Planets.csv

Name,a,
Mercury,0.3871,0.2056
Venus,0.7233,0.0068
Earth,0.9991,0.0166
Mars,1.5237,0.0935
Jupiter,5.2016,0.0490
Saturn,9.5424,0.0547
Uranus,19.1727,0.0486
Neptune,29.9769,0.0088
Halley,17.8589,0.9680


In [60]:
T = QTable.read('Planets.csv', format='ascii.csv')

In [61]:
T

Name,a,col2
str7,float64,float64
Mercury,0.3871,0.2056
Venus,0.7233,0.0068
Earth,0.9991,0.0166
Mars,1.5237,0.0935
Jupiter,5.2016,0.049
Saturn,9.5424,0.0547
Uranus,19.1727,0.0486
Neptune,29.9769,0.0088
Halley,17.8589,0.968


In [62]:
print(T)

  Name     a     col2 
------- ------- ------
Mercury  0.3871 0.2056
  Venus  0.7233 0.0068
  Earth  0.9991 0.0166
   Mars  1.5237 0.0935
Jupiter  5.2016  0.049
 Saturn  9.5424 0.0547
 Uranus 19.1727 0.0486
Neptune 29.9769 0.0088
 Halley 17.8589  0.968


## Renaming columns

In [63]:
T.rename_column('col2', 'ecc')
print(T)

  Name     a     ecc  
------- ------- ------
Mercury  0.3871 0.2056
  Venus  0.7233 0.0068
  Earth  0.9991 0.0166
   Mars  1.5237 0.0935
Jupiter  5.2016  0.049
 Saturn  9.5424 0.0547
 Uranus 19.1727 0.0486
Neptune 29.9769 0.0088
 Halley 17.8589  0.968


In [64]:
T['Name']

0
Mercury
Venus
Earth
Mars
Jupiter
Saturn
Uranus
Neptune
Halley


In [65]:
T['Name'][0]

'Mercury'

## Sorting

In [67]:
T.sort(['ecc'])

In [68]:
T

Name,a,ecc
str7,float64,float64
Venus,0.7233,0.0068
Neptune,29.9769,0.0088
Earth,0.9991,0.0166
Uranus,19.1727,0.0486
Jupiter,5.2016,0.049
Saturn,9.5424,0.0547
Mars,1.5237,0.0935
Mercury,0.3871,0.2056
Halley,17.8589,0.968


## Masking

In [69]:
T.sort(['a'])    # re-sort our table

In [70]:
mask6 = np.where(T['a'] > 5)

mask6

(array([4, 5, 6, 7, 8]),)

In [71]:
T[mask6]

Name,a,ecc
str7,float64,float64
Jupiter,5.2016,0.049
Saturn,9.5424,0.0547
Halley,17.8589,0.968
Uranus,19.1727,0.0486
Neptune,29.9769,0.0088


In [72]:
mask7 = ((T['a'] > 5) & (T['ecc'] < 0.05))

T[mask7]

Name,a,ecc
str7,float64,float64
Jupiter,5.2016,0.049
Uranus,19.1727,0.0486
Neptune,29.9769,0.0088


## Functions

In computer science, a `function` (also called a `procedure`, `method`, `subroutine`, or `routine`) is a portion
of code within a larger program that performs a specific task and is relatively independent of the
remaining code. The big advantage of a `function` is that it breaks a program into smaller, easier
to understand pieces. It also makes debugging easier. A `function` can also be reused in another
program.

The basic idea of a `function` is that it will take various values, do something with them, and `return` a result. The variables in a `function` are local. That means that they do not affect anything outside the `function`.
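
A minimal sketch of what "local" means in practice. The function name `double` and the variable names are illustrative only.

```python
def double(value):
    result = value * 2      # this result exists only while double() is running
    return result

result = 10                 # a different variable that happens to share the name
answer = double(4)

print(answer)               # 8
print(result)               # 10, untouched by the function call
```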

Below is an example of a `function` that evaluates the mathematical function:

$ f(x,y) = x\ (1 - y)$

In the example, the name of the `function` is **george** (you can name `functions` whatever you want). The `function` **george** takes two arguments, `x` and `y`, and returns the value of the equation to the main program. In the main program, a variable named `GeorgeResult` is assigned the value returned by **george**. Notice that in the main program the `function` **george** is called using the arguments `T['a']` and `T['ecc']`. Since the variables in the `function` are local, you do not have to name them `x` and `y` in the main program.

In [82]:
def george(x,y):
    
    result = x * (1.0 - y)          # assign the variable result the value of the function
    return result                   # return the value of the function to the main program

In [75]:
GeorgeResult = george(T['a'],T['ecc'])

GeorgeResult

0
0.30751224
0.71838156
0.98251494
1.38123405
4.9467216
9.02043072
0.5714848
18.24090678
29.71310328


In [76]:
T['Perihelion'] = GeorgeResult

print(T)

  Name     a     ecc    Perihelion
------- ------- ------ -----------
Mercury  0.3871 0.2056  0.30751224
  Venus  0.7233 0.0068  0.71838156
  Earth  0.9991 0.0166  0.98251494
   Mars  1.5237 0.0935  1.38123405
Jupiter  5.2016  0.049   4.9467216
 Saturn  9.5424 0.0547  9.02043072
 Halley 17.8589  0.968   0.5714848
 Uranus 19.1727 0.0486 18.24090678
Neptune 29.9769 0.0088 29.71310328
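
Because the function body uses only arithmetic, it also works on plain NumPy arrays, element by element. The sketch below repeats the calculation for two rows of the table under a more descriptive, hypothetical name (`perihelion`):

```python
import numpy as np

def perihelion(a, ecc):
    """Return a * (1 - ecc), the same formula as george, in the units of a."""
    return a * (1.0 - ecc)

a = np.array([0.9991, 17.8589])       # Earth and Halley, values from the table
ecc = np.array([0.0166, 0.9680])

q = perihelion(a, ecc)
print(q)
```

Halley's large eccentricity brings its perihelion inside Venus's, even though its value of `a` lies between Saturn's and Uranus's.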


#### The results of one function can be used as the input to another function

In [77]:
def ringo(x):
    
    result = x - 0.98251494
    return result

In [78]:
ringo(GeorgeResult)

0
-0.6750027
-0.26413338
1.11022302463e-16
0.39871911
3.96420666
8.03791578
-0.41103014
17.25839184
28.73058834


### Saving a table

In [79]:
T.write('newfile.csv', format='ascii.csv')

In [80]:
!cat newfile.csv

Name,a,ecc,Perihelion
Mercury,0.3871,0.2056,0.30751224
Venus,0.7233,0.0068,0.7183815600000001
Earth,0.9991,0.0166,0.9825149400000001
Mars,1.5237,0.0935,1.38123405
Jupiter,5.2016,0.049,4.9467216
Saturn,9.5424,0.0547,9.02043072
Halley,17.8589,0.968,0.5714848000000005
Uranus,19.1727,0.0486,18.24090678
Neptune,29.9769,0.0088,29.71310328
