# TAURUS + REU 2019
## written by Jackie Champagne (UT Austin)

# Introduction to Python Day 1
# Basic Syntax, Variables & Arrays

We have some exercises for you ranging from the most basic tasks to some more advanced plotting techniques that will be helpful to you in astronomy. 

The Jupyter notebook is an interactive browser interface for Python, allowing you to write and edit programs all in one notebook. The whole notebook is automatically saved periodically, but you can also save the outputs from your code as text files, plots, or images separate from the notebook. It is a great tool when you are building code from scratch and want to troubleshoot it or make a quick plot.

This series of notebooks is meant to be self-guided, so no lecture is strictly required. However, the goal of this workshop is to acquaint you with working together on research problems, so we will balance brief lecture sections with time for you to sort through problems on your own. If you are attending this workshop live, the Questions in this notebook are meant to take only a few short lines of code, which you will write individually throughout the workshop. The Exercises, in a separate notebook, are longer problems you will begin working on together during the workshop, with the expectation that all exercises are finished and submitted by the end of the workshop. 

### Let's get started!

The first thing you'll need to do when writing any code is import the packages you expect to use. You can put your imports anywhere, as long as each one appears before the first call to a function from that package, but it's cleaner to put them all at the top. 

For this tutorial, let's load up numpy, which is an external library containing most mathematical functions you will use. To run the code in a cell, press SHIFT+ENTER.

To comment your code, put a hash sign (#) in front of your comment. Python ignores everything after it on that line. 

In [1]:
import numpy as np #'import' loads up the package. 
#you can use 'as' to define a shortcut so you don't need to type
#numpy before every use of the package. Most people use 'np'.

import astropy

import matplotlib.pyplot as plt

from scipy import integrate #'from' allows you to import a specific sub-package

## Setting Variables

To assign a value to a variable, use =.

In [2]:
a = 1
b = 2

Now Python will know that a is 1 for the rest of this notebook, unless you assign it a new value. Setting variables is useful for things like constants, such as c = 3.0e8. To check that this worked, we can print it out. The syntax for printing something is print(thing_you_want_to_print).

In [3]:
print(a)

1


The double equals sign, ==, tests whether two values are equal and returns a boolean: True or False. (A single = assigns a value; it does not compare.) Boolean logic will come in handy when you write code where your data must meet a certain criterion. 

OR statements: the statement is TRUE if statement A is true, statement B is true, or both statements are true; the statement is FALSE if both statements A and B are false.

AND statements: the statement is TRUE if and only if both A and B are true; the statement is FALSE if one or both is false.

In Python, the phrase "a == 1 or 2" does not do what it looks like. Python reads it as "(a == 1) or 2", and because any nonzero number counts as true, the result is never false. The full statement must be "a == 1 or a == 2" such that they are independent clauses.

In [4]:
a == 1

True

In [5]:
a == 2

False

In [6]:
a == 1 or a == 2

True

In [7]:
a == 1 and a == 2

False

In [8]:
a == 1 or b == 2

True
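
To make the point about independent clauses concrete, here is a minimal sketch. The variable names `x`, `misleading`, and `correct` are made up for this illustration and are not part of the workshop code.

```python
x = 5  # a value that is neither 1 nor 2

# Python reads this as (x == 1) or 2, and the 2 counts as "true"
misleading = x == 1 or 2
print(misleading)   # prints 2, which Python treats as true

# Each side of 'or' is a complete comparison
correct = x == 1 or x == 2
print(correct)      # prints False
```

The first form never raises an error, which is why it is an easy mistake to miss.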

## Variable types

There are three basic kinds of variables you will use most often in Python: strings, floats, and integers. (Python has other types as well, such as the True/False booleans above.) 

A floating point value (float) is a number with a decimal point. An integer is a whole number with no decimal point, and is required for discrete quantities such as indices (you can have entry 5 in an array, but not entry 5.5). A string has quotes around it and is treated as text rather than a numeric value.

### Question 1: What kinds of variables are the following? Fill it in as a comment.

In [11]:
i = 1 # integer
j = 2.43 # float
k = 'Hello world!' # string
L = 3. # float (the trailing decimal point makes it a float)
#m = hello world! # left commented out: without quotes this is a syntax error, not a string

You can convert between variable types if necessary, using the following commands:

    int()
    float()
    str()
    
int() returns the whole-number part of a float; it drops everything after the decimal point and *does not round*. float() turns an integer into the same number followed by .0, which sounds pointless but is sometimes necessary for arithmetic. str() turns a number into a string, so that Python treats it as text rather than as a numeric value.
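
As a rough illustration of the three conversions, the sketch below uses made-up variable names (`x`, `truncated`, and so on) and compares int() with Python's built-in round(), which is not otherwise covered in this notebook.

```python
x = 2.9

truncated = int(x)      # drops the decimal part: 2
rounded = round(x)      # rounds to the nearest whole number: 3
negative = int(-2.9)    # truncation goes toward zero: -2
as_float = float(3)     # 3.0
as_string = str(3)      # '3'

print(truncated, rounded, negative, as_float)
print(as_string + as_string)   # strings are joined, not added: 33
```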

### Question 2: Convert i to a float and to a string. Convert j into an integer. Print type(k) to check your answer.


In [16]:
# solution here

i_float = float(i)
i_string = str(i)
j_int = int(j)
print(i_float, i_string, j_int)
print(type(i_float), type(i_string), type(j_int))
print(type(k))

1.0 1 2
<class 'float'> <class 'str'> <class 'int'>
<class 'str'>


### Now convert k to an integer. What happens? 

In [17]:
#solution here

k_int = int(k)

ValueError: invalid literal for int() with base 10: 'Hello world!'

### Challenge question: Convert the string '1.4' into an integer. Hint: this takes multiple steps.

In [20]:
# solution here

q = '1.4'
o = int(float(q))
print(o)

1


## Python arithmetic

The arithmetic operators and functions are the following:
    + add
    - subtract
    * multiply
    / divide
    ** power
    np.log() natural log (base e)
    np.log10() log base 10
    np.exp() exponential (e to the power of the argument)

In [21]:
example = 5 + 3
print(example)

8


In [22]:
i + j

3.43

In [23]:
i * j

2.43

In [27]:
j - i

1.4300000000000002

In [25]:
i / j 

0.4115226337448559

You're using Python 3.5, where / always returns a float. Note that in version 2.7, it was important to remember whether your variables were floats or integers: dividing two integers gave you an integer answer even if there was a remainder, so watch out for that if you ever use 2.7.
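
For reference, Python 3 still offers integer-style division through the // operator, with % giving the remainder. This is a small sketch with made-up variable names, not part of the original exercises.

```python
true_div = 7 / 2     # true division: 3.5
floor_div = 7 // 2   # floor division, drops the remainder: 3
remainder = 7 % 2    # the remainder itself: 1
even_div = 6 / 2     # still a float in Python 3: 3.0

print(true_div, floor_div, remainder, even_div)
```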

## Arrays and Lists

When working with data, you usually won't be dealing with just one number, but an array of values. A list is denoted by brackets, [], and can hold floats, integers, strings, or a combination of them. An array must be defined with np.array(), and all of its entries share one type; if you mix types, numpy converts everything to a common type. 

The dimensions of arrays and lists are the same as in linear algebra: (rows, columns).

The following is a 1D list:

In [28]:
beemovie = ['Barry B. Benson', 'Vanessa Bloome', 'Ray Liotta as Ray Liotta']
print(beemovie)

['Barry B. Benson', 'Vanessa Bloome', 'Ray Liotta as Ray Liotta']


For strings, this is fine, but you will need arrays in order to manipulate numbers mathematically. The array function is built into numpy. Here are a 1D array and a 2D array:

In [30]:
myarray = np.array([1, 2, 3])
my2darray = np.array([[1, 2], [1, 2]])

You call np.array() with parentheses (). The whole array must be enclosed in a set of brackets []. For a 2D array, each row should be in its own set of brackets, separated by commas.
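
The difference between lists and arrays shows up as soon as you do math on them. The sketch below is illustrative only; `small_list`, `small_array`, `grid`, and `mixed` are made-up names.

```python
import numpy as np

small_list = [1, 2, 3]
small_array = np.array([1, 2, 3])

print(small_list * 2)    # a list is repeated: [1, 2, 3, 1, 2, 3]
print(small_array * 2)   # an array is multiplied entry by entry: 2, 4, 6

grid = np.array([[1, 2], [3, 4], [5, 6]])
print(grid.shape)        # (rows, columns): (3, 2)

# Mixing numbers and text in an array converts every entry to a string
mixed = np.array([1, 2.5, 'three'])
print(mixed.dtype.kind)  # 'U' is numpy's code for text
```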

### Question 3: Create the following 2D array:

    1 2 3
    4 5 6

In [32]:
#solution here

another_2d_array = np.array([[1, 2, 3], [4, 5, 6]])

print(another_2d_array)

[[1 2 3]
 [4 5 6]]


## Populating Arrays

You don't always have to put in the values of your array manually, especially if, for example, you want a function to sample numbers evenly along some axis. 

There are two *inclusive* ways to define arrays, meaning that the last number is included.

The first is np.linspace(), giving you an array between two values that are linearly spaced. The second is np.logspace(), giving you an array between two values that are spaced evenly in log10 (10^1, 10^1.1, 10^1.2, etc).
    
The syntax is the following:

    np.linspace(beginning number, end number, number of values)
    np.logspace(log10(beginning number), log10(end number), number of values)
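
A short sketch of both functions, using small made-up ranges so the endpoints are easy to check by eye:

```python
import numpy as np

# 5 evenly spaced values from 0 to 1, both ends included
lin = np.linspace(0, 1, 5)
print(lin)    # 0, 0.25, 0.5, 0.75, 1

# 4 values from 10**0 to 10**3, evenly spaced in log10
log = np.logspace(0, 3, 4)
print(log)    # 1, 10, 100, 1000
```

Note that np.logspace() takes the exponents (0 and 3), not the values themselves (1 and 1000).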
    
### Question 4: Create two arrays with ten entries between 1 and 100, one in linear space and the other in logspace.



In [33]:
# solution here

linarray = np.linspace(1, 100, 10)
logarray = np.logspace(0, 2, 10)
logarray_alt = np.logspace(np.log10(1), np.log10(100), 10)

print(linarray, logarray, logarray_alt)

[  1.  12.  23.  34.  45.  56.  67.  78.  89. 100.] [  1.           1.66810054   2.7825594    4.64158883   7.74263683
  12.91549665  21.5443469   35.93813664  59.94842503 100.        ] [  1.           1.66810054   2.7825594    4.64158883   7.74263683
  12.91549665  21.5443469   35.93813664  59.94842503 100.        ]


Another way to create an array is through np.arange(), but this is *exclusive*, meaning it will sample numbers linearly beginning with your initial number and ending one step size before your end number. So if you want your axis to go from 0 to 100 in steps of 1, make sure you tell it to go to 101.

Confusingly, the syntax is different. The third argument is your step size, not your number of data points, so this is good to use if the step size is important. Of course you can always figure out how many data points you need for the spacing to be such-and-such in linspace, but if you're lazy and bad at math this works too!

    np.arange(beginning number, end number+delta, delta)
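
To see the exclusive end point in action, here is a minimal sketch with made-up variable names:

```python
import numpy as np

# End number is 101, so the last entry is 100
axis = np.arange(0, 101, 1)
print(axis[0], axis[-1], axis.size)   # 0 100 101

# End number is 100, so the array stops one step short at 99
short_axis = np.arange(0, 100, 1)
print(short_axis[-1], short_axis.size)   # 99 100
```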
    
### Question 5: Create another array with ten entries between 1 and 100 using np.arange. Notice how this is different from linspace.

In [36]:
# solution here

arange_array = np.arange(1, 100, 10) #stops before 100, so the last entry is 91
print(arange_array)

[ 1 11 21 31 41 51 61 71 81 91]


Another quick way to create an array is through np.zeros. This populates an array with, well, zeros. It might sound useless at first, but it's an easy way to initialize an array that you will later replace with different values. It helps keep arrays at a fixed length, for instance. More on that later.

np.zeros() takes the shape of the array you want. If it's 1D, that is just the number of entries:

    np.zeros(3) #a 1D array of 3 zeros
    
For 2D and higher, pass the shape as (rows, columns) inside its own parentheses, so you need two sets of parentheses:

    np.zeros((rows, columns))
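
A small sketch of both forms, plus one way a zeros array might be filled in afterwards. The names `flat`, `table`, and `measurements`, and the values stored, are made up for illustration.

```python
import numpy as np

flat = np.zeros(3)
print(flat.shape)     # (3,)

table = np.zeros((2, 3))
print(table.shape)    # (2, 3)

# Start from zeros, then replace entries with real values
measurements = np.zeros(3)
measurements[0] = 4.2
measurements[2] = 7.1
print(measurements)   # 4.2, 0, 7.1
```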
    
    
### Question 6: Create a 3x3 array of zeros.

In [38]:
#solution here

zeros_array = np.zeros((3, 3))

print(zeros_array)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


## Indexing

## PYTHON USES ZERO-BASED INDEXING!

This means that the first value of an array is the 0th index. 

To call a certain value from an array, call the array name followed by brackets containing the index of the value you want:

    array[0]
    
The value inside the brackets can also be a variable, so long as the variable holds an integer. 

A helpful shortcut is that you can also count backwards in your array with a negative sign, so the last value in your array is always array[-1].
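
Here is a brief sketch of indexing on a made-up array; `values` and `idx` are illustrative names only.

```python
import numpy as np

values = np.array([10, 20, 30, 40, 50])
idx = 2   # an integer variable used as an index

print(values[0])     # first entry: 10
print(values[idx])   # entry at index 2: 30
print(values[-1])    # last entry: 50
print(values[-2])    # second to last: 40
```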

### Question 7: Print out the first and last value in your linspace array.

In [39]:
#solution here

first_value = linarray[0]
last_value = linarray[-1]
print(first_value, last_value)

1.0 100.0


Finally, you can also slice arrays between certain index values. This is helpful if you want to plot only a small subsample of your data, for example.

For slicing, use a colon. Syntax:

    :x - from beginning to index x
    x: - from index x until the end
    a:b - from index a up to, but not including, index b
    a:b:c - every c'th entry between indices a and b
    
These can be combined, e.g. a::c goes from index a until the end in steps of c.
    
Slicing is *exclusive*: the entry at the end index is not included.
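
The slicing rules can be sketched on an array whose values equal their own indices, which makes the results easy to read. The name `values` is made up for this example.

```python
import numpy as np

values = np.arange(10)   # 0, 1, 2, ..., 9

print(values[:3])      # indices 0 to 2: [0 1 2]
print(values[7:])      # index 7 to the end: [7 8 9]
print(values[2:5])     # indices 2 to 4, index 5 is left out: [2 3 4]
print(values[2:9:3])   # every 3rd entry from index 2, stopping before 9: [2 5 8]
print(values[1::4])    # every 4th entry from index 1 to the end: [1 5 9]
```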

### Question 8: Print out the following: a) your linear array until index 5; b) your log array beginning at index 1; c) your arange array between indices 4 and 8; and d) your full linear array in steps of 2 indices.

In [41]:
#solution here

until_index_5 = linarray[:5]
begin_at_1 = logarray[1:]
between_4_8 = arange_array[4:8]
steps_of_2 = linarray[::2]
steps_of_2_longway = linarray[0:10:2] #10 is the number of entries; an end index of -1 would leave out the last entry
print(until_index_5, begin_at_1, between_4_8, steps_of_2, steps_of_2_longway)

#bonus: reversing arrays!
linarray_backwards = linarray[::-1]
print(linarray_backwards)

[ 1. 12. 23. 34. 45.] [  1.66810054   2.7825594    4.64158883   7.74263683  12.91549665
  21.5443469   35.93813664  59.94842503 100.        ] [41 51 61 71] [ 1. 23. 45. 67. 89.] [ 1. 23. 45. 67. 89.]
[100.  89.  78.  67.  56.  45.  34.  23.  12.   1.]


## np.where()

This function allows you to impose a condition on an array so you can search the array for specific values. These conditions can be the following:

    < less than
    <= less than or equal to
    > greater than
    >= greater than or equal to
    == equal to
    != not equal to
    
This function takes some getting used to. The output will be the indices where the array meets your condition:

    np.where(array < 2)

To get the actual values where the condition is true, you would call the array with the output from np.where() as the indices:

    array[np.where(array < 2)]
    
You can also give np.where() a compound condition connected by an ampersand, &, with each condition in its own parentheses:

    array[np.where((array < 2) & (array > 0))]
    
Note that, as above, compound statements must be independent clauses, so something like "0<array<2" doesn't work.
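
As a worked sketch of the pattern above, using a made-up array called `data`:

```python
import numpy as np

data = np.array([-1.0, 0.5, 1.5, 3.0, 0.0])

# Indices where the condition holds (returned inside a tuple)
idx = np.where(data < 2)
print(idx[0])      # indices 0, 1, 2, 4

# Values at those indices
below_two = data[idx]
print(below_two)   # -1, 0.5, 1.5, 0

# Compound condition: each clause in its own parentheses, joined by &
between = data[np.where((data < 2) & (data > 0))]
print(between)     # 0.5, 1.5
```

np.where() returns a tuple with one index array per dimension, which is why `idx[0]` is used to look at the indices of a 1D array.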

### Question 9: Print out, from your linspace array, the following: a) where the array is greater than 10, b) the array where its values are greater than 10, c) where the array is greater than 1 and less than 10, and d) where the array divided by 2 is less than 50.

In [43]:
# solution here

#the first one is asking you to find the LOCATION where the condition is true
where_array = np.where(linarray > 10)

#the second one is asking you to find THE VALUES that meet the condition
array_where_true = linarray[where_array]
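#the third and fourth are asking for LOCATIONS again
where_gt1_lt10 = np.where( (linarray > 1) & (linarray < 10) )
where_div2_lt50 = np.where( linarray / 2 < 50 )

print(where_array, array_where_true, where_gt1_lt10, where_div2_lt50)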



## Array Manipulation and Attributes

Arithmetic on an array is applied to every entry at once, so you can manipulate a whole array with one statement. Check it out:

### Question 10: Create a new array which is your logspace array divided by 2. Create another new array which is your linspace array + 2.

In [45]:
#solution here

logarray_div_2 = logarray / 2
linarray_plus_2 = linarray + 2

print(logarray_div_2, linarray_plus_2)

[ 0.5         0.83405027  1.3912797   2.32079442  3.87131841  6.45774833
 10.77217345 17.96906832 29.97421252 50.        ] [  3.  14.  25.  36.  47.  58.  69.  80.  91. 102.]


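You can append values to the end of an array using np.append(). It returns a new, longer array and leaves the original unchanged, so assign the result to a variable. Use it like this: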

    np.append(array, something_appended)
    
You can even append another array, like this:

    np.append(array, [5, 6])
    np.append(array1, array2)
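
One detail worth seeing in a sketch: np.append() hands back a new array rather than changing the one you pass in. The names `original` and `longer` are made up for this example.

```python
import numpy as np

original = np.array([1, 2, 3])
longer = np.append(original, [5, 6])

print(original)   # unchanged: 1, 2, 3
print(longer)     # new array: 1, 2, 3, 5, 6
```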
    
### Question 11: Create a new array which is another linear array from 100 to 200 appended to your linspace array.

In [46]:
#solution here

linarray_append = np.append(linarray, np.linspace(100, 200, 10))
print(linarray_append)

[  1.          12.          23.          34.          45.
  56.          67.          78.          89.         100.
 100.         111.11111111 122.22222222 133.33333333 144.44444444
 155.55555556 166.66666667 177.77777778 188.88888889 200.        ]


The last part of today will be showing you how to acquire different information from an array. Some of these are attributes or methods of the array itself, some are functions of np, and one is built into Python, so you may need to look this up again in the future.

Attributes and methods of the array, meaning that you call them by nameofarray.command:
   
    ndim - number of dimensions of your array
    size - total number of elements in the n-dimensional array
    shape - shape given by (rows, columns)
    flatten() - collapses the array into one dimension
    T - transpose the matrix
    reshape(x, y) - change the dimensions of the array to x, y -- the total number of elements (x*y) MUST match
   
Functions of numpy, meaning that you call them by np.command(nameofarray):

    sum - sum all the elements in the array
    min - return the minimum value in the array
    max - return the maximum value
    sort - return a copy of the array sorted in ascending order
    dot - matrix multiplication, called with two arrays: np.dot(array1, array2)

A built-in Python function, meaning that you call it by command(nameofarray) with no np. in front:

    len - number of elements along the first (row) axis
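
The sketch below tries each of these on a small made-up 2x3 array called `m`. Note that len() is Python's built-in function, so it is called without the np. prefix.

```python
import numpy as np

m = np.array([[3, 1, 2], [6, 5, 4]])

print(m.ndim)                  # number of dimensions: 2
print(m.size)                  # total number of elements: 6
print(m.shape)                 # (rows, columns): (2, 3)
print(m.flatten())             # one dimension: 3, 1, 2, 6, 5, 4
print(m.T.shape)               # transpose swaps the axes: (3, 2)
print(m.reshape(3, 2).shape)   # same 6 elements, new shape: (3, 2)

print(np.sum(m))               # 21
print(np.min(m), np.max(m))    # 1 6
print(np.sort(m))              # each row sorted in ascending order
print(len(m))                  # number of rows: 2
print(np.dot(m, m.T))          # (2x3) times (3x2) gives a 2x2 matrix
```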
    

### If you are following along in the in person class:

### You will have received a link to our GitHub classroom which will have your exercise assignments. (These are also included in Exercises.ipynb in this directory, if you're following along outside of the in-person seminar.) Please do the Day 1 exercises and upload them to GitHub before the start of the next session on Thursday.