# TAURUS + NSF REU 2022
## written by Jackie Champagne & Oscar Chavez Ortiz (UT Austin)

# Introduction to Python Day 1
# Basic Syntax, Variables & Arrays

We have some exercises for you ranging from the most basic tasks to some more advanced plotting techniques that will be helpful to you in astronomy. 

The Jupyter notebook is an interactive browser interface for Python, allowing you to write and edit programs all in one notebook. The whole notebook is automatically saved periodically, but you can also save the outputs from your code as text files, plots, or images separate from the notebook. It is a great tool when you are building code from scratch and want to troubleshoot it or make a quick plot.

This series of notebooks is meant to be self-guided, so no lecture is strictly required. However, the goal of this workshop is to acquaint you with working together on research problems. If you are attending this workshop live, the Questions in this notebook are short, a few lines of code each, and you will do them individually throughout the workshop. The Exercises, in a separate notebook, are longer problems you will begin working on together during the workshop, with the expectation of having all exercises finished and submitted by the end of the workshop. 

### Let's get started!

The first thing you'll need to do when writing any code is import the packages you expect to use. You can put your imports anywhere, as long as each one appears somewhere before the first call to a function from that package, but it's cleaner to put them all at the top. 

For this tutorial, let's load up numpy, which is an external library containing most mathematical functions you will use. To run the code in a cell, press SHIFT+ENTER.

To comment your code, put a hash symbol (#) in front of your comment. 

In [1]:
import numpy as np #'import' loads up the package. 
#you can use 'as' to define a shortcut so you don't need to type
#numpy before every use of the package. Most people use 'np'.

import astropy

import matplotlib.pyplot as plt

from scipy import integrate #'from' allows you to import a specific sub-package

## Setting Variables

To assign a value to a variable, use =.

In [None]:
a = 1
b = 2

Now Python will know that a is 1 everywhere in this notebook, until you assign a new value to a. Setting variables is useful for things like constants, such as c = 3.0e8. To check that this worked, we can print it out. The syntax for printing something is print(thing_you_want_to_print). 

Alternatively, you can call the variable by just typing its name, which will show it as an output.

In [None]:
print(a)

In [None]:
a

# Note: 
The double equals sign, ==, tests whether two values are equal and returns True or False (a boolean). Do not confuse it with the single equals sign, =, which assigns a value. This will come in handy when you write code where your data must meet a certain criterion. 

# Boolean Logic

OR statements: statement is TRUE if statement A is true, statement B is true, or both statements are true; statement is FALSE if both statements A and B are false.

AND statements: statement is TRUE if and only if both A and B are true; statement is FALSE if one or both is false.

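In Python, the phrase "a == 1 or 2" does not do what it reads like in English. Python evaluates it as "(a == 1) or 2", and since 2 counts as true, the result is never False. The full statement must be "a == 1 or a == 2", so that each side of the "or" is a complete comparison.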

In [None]:
a == 1

In [None]:
a == 2

In [None]:
a == 1 or a == 2

In [None]:
a == 1 and a == 2

In [None]:
a == 1 or b == 2
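
To see why the incomplete form is misleading, here is a minimal sketch. The variable name `a_test` and its value are made up for illustration and are not part of the cells above.

```python
# Illustrative value: any number other than 1 or 2 shows the problem.
a_test = 5

# Python reads this as (a_test == 1) or 2. The comparison is False,
# so the whole expression evaluates to 2, which counts as true.
misleading = a_test == 1 or 2
print(misleading)

# The complete form compares a_test against each value separately.
correct = a_test == 1 or a_test == 2
print(correct)
```

The first print shows 2 rather than False, even though `a_test` is neither 1 nor 2.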

## Variable types

This tutorial uses 4 basic kinds of variables in Python: strings, floats, integers and booleans. 

A floating point value (float) is a number with a decimal point. An integer is a whole number, and is required for discrete quantities such as indices (you can have entry 5 in an array, but not entry 5.5). A string has quotes around it and is treated as text rather than a numeric value. Booleans take one of two values, True or False.

In [None]:
#Example of a float
pi = 3.14159

#Example of an int
integer = 15

#Example of a string
sentence = 'I am a string'
sentence2 = "This is also a string"

#example of a bool
facts = True
fake_news = False

# Converting between data types

You can convert between variable types if necessary, using the following commands:

    int()
    float()
    str()
    
int() returns the whole-number part of a float and *does not round*: the decimal part is simply dropped. float() turns an integer into the same number followed by .0, which sounds pointless but is sometimes necessary for Python arithmetic. str() turns a value into a string, so that Python treats it as text rather than as a number.

In [None]:
#Examples of int conversion
pi = 3.14

int_pi = int(pi)

print(int_pi)

string_10 = '10'

int_10 = int(string_10)

print(int_10)

In [None]:
#Do you think this one would work?
sentence = 'Python is Awesome :D!!'

int_sentence = int(sentence)
print(int_sentence)
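
The conversion rules can be checked with a short sketch. The variable names and the string 'ten' are illustrative choices, and round() is Python's built-in rounding function, mentioned here only for contrast with int().

```python
# int() drops the decimal part instead of rounding.
truncated_pos = int(3.99)
truncated_neg = int(-3.99)
print(truncated_pos, truncated_neg)

# round() is the built-in to use when you actually want rounding.
rounded = round(3.99)
print(rounded)

# float() and str() convert in the other directions.
as_float = float(7)
as_string = str(7)
print(as_float, as_string)

# Text that does not look like a number cannot be converted to an int.
try:
    int('ten')
    conversion_failed = False
except ValueError as err:
    conversion_failed = True
    print('Conversion failed:', err)
```

Note that int() moves toward zero for negative numbers too, so int(-3.99) gives -3, not -4.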

## Python arithmetic

The syntax for arithmetic is the following:

    + add
    
    - subtract
    
    * multiply
    
    / divide
    
    ** power
    
    np.log() log-base e
    
    np.log10() log-base 10
    
    np.exp() exponential 

In [None]:
example = 5 + 3
print(example)

In [None]:
i = 1.2 
j = 2.43

In [None]:
i + j

In [None]:
i * j

In [None]:
j - i

In [None]:
i / j 

In [None]:
i**j

You're using Python 3, where / always returns a float. Note that in version 2.7, it was important to remember whether your variables were floats or integers: dividing two integers gave you an integer answer even if there was a remainder, so watch out for that if you ever work with 2.7 code. In Python 3, use // if you want that integer (floor) division on purpose.
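
As a small sketch of how division behaves in Python 3 (the variable names are illustrative):

```python
# / is true division and always returns a float in Python 3.
true_div = 7 / 2
print(true_div)

# // is floor division: it discards the remainder.
floor_div = 7 // 2
print(floor_div)

# % gives the remainder that // threw away.
remainder = 7 % 2
print(remainder)

# Even when the answer is a whole number, / still returns a float.
whole = 6 / 3
print(type(whole))
```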

# Containers: Arrays and Lists

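When working with data, you usually won't be dealing with just one number, but with an array of values. A list can hold floats, integers, strings, or a combination of them. A numpy array stores a single type, so if you mix types numpy converts every entry to a common one (for example, numbers mixed with strings all become strings). A list is denoted by brackets: [], while an array must be defined with np.array(). 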

The dimensions of arrays and lists are the same as in linear algebra: (rows, columns).

The following is a one dimensional (1D) list:

In [None]:
beemovie = ['Barry B. Benson', 'Vanessa Bloome', 'Ray Liotta as Ray Liotta']
print(beemovie)

For strings, this is fine, but you will need arrays in order to manipulate them mathematically. The array function is built into numpy. Here are a 1D array and a 2D array:

In [None]:
myarray = np.array([1, 2, 3])
my2darray = np.array([[1, 2], [1, 2]])

You will call the array function within parentheses (). The whole array must be enclosed in a set of brackets []. Each row of the array should be in its own set of brackets, separated by commas.
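
The difference between lists and arrays shows up as soon as you do arithmetic. The following is a minimal sketch with made-up values and illustrative variable names:

```python
import numpy as np

numbers_list = [1, 2, 3]
numbers_array = np.array([1, 2, 3])

# Multiplying a list by 2 repeats the list.
doubled_list = numbers_list * 2
print(doubled_list)

# Multiplying an array by 2 multiplies every element.
doubled_array = numbers_array * 2
print(doubled_array)

# Each row of a 2D array sits in its own set of brackets.
grid = np.array([[1, 2], [3, 4]])
print(grid.shape)

# Mixing numbers and text in an array converts everything to strings.
mixed = np.array([1, 2.5, 'three'])
print(mixed.dtype)
```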

## Array Arithmetic

Arrays allow you to apply the same basic operations we covered above, such as multiplication and division, to the entire array at once, without the need for any loops.

In [None]:
#getting some random numbers 
base_array = np.random.randint(low = 1, high = 50, size = 50)

print(base_array)
print()

# You uncover that this array is actually off by a factor of 2 
# so we are multiplying the base_array by 2

new_arr = 2 * base_array
print(new_arr)
print()

# someone else came up to you after you multiplied it by 2 and said that
# to account for inflation we actually need to divide the array by 4 and add 100000
new_arr1 = new_arr/4 + 100000
print(new_arr1)

# Populating Arrays

You don't always have to put in the values of your array manually, especially if, for example, you want a function to sample numbers evenly along some axis. 

There are two *inclusive* ways to define arrays, meaning that the last number is included.

The first is np.linspace(), giving you an array between two values that are linearly spaced. The second is np.logspace(), giving you an array between two values that are spaced evenly in log10 (10^1, 10^1.1, 10^1.2, etc).
    
The syntax is the following:

    np.linspace(beginning number, end number, number of values)
    np.logspace(log10(beginning number), log10(end number), number of values)

In [None]:
#generating 50 numbers evenly spaced between 1 and 100
print(np.linspace(1, 100, 50))

#generating 50 numbers evenly spaced in log10 between 10^1 and 10^3
print(np.logspace(1, 3, 50))
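
With only a few points, it is easier to see that both end values are included. This is a sketch with illustrative values and variable names:

```python
import numpy as np

# 5 linearly spaced values from 0 to 10: 0, 2.5, 5, 7.5, 10
lin = np.linspace(0, 10, 5)
print(lin)

# 3 values spaced evenly in log10 from 10^1 to 10^3: 10, 100, 1000
log = np.logspace(1, 3, 3)
print(log)

# If you know the actual end values, take log10 of them first.
log_from_values = np.logspace(np.log10(10), np.log10(1000), 3)
print(log_from_values)
```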

Another way to create an array is through np.arange(), but this is *exclusive*, meaning it will sample numbers linearly beginning with your initial number and ending one step size before your end number. So if you want your axis to go from 0 to 100 in steps of 1, make sure you tell it to go to 101.

Confusingly, the syntax is different. The third argument is your step size, not your number of data points, so this is good to use if the step size is important. Of course you can always figure out how many data points you need for the spacing to be such-and-such in linspace, but if you're lazy and bad at math this works too!

    np.arange(beginning number, end number+delta, delta)

In [None]:
bins_for_histogram = np.arange(0, 100, 2)
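
A short sketch of the exclusive end point, using small illustrative numbers:

```python
import numpy as np

# The end value 10 is NOT included: 0, 2, 4, 6, 8
without_end = np.arange(0, 10, 2)
print(without_end)

# Adding one step to the end value includes it: 0, 2, 4, 6, 8, 10
with_end = np.arange(0, 10 + 2, 2)
print(with_end)
```

In the same way, np.arange(0, 100, 2) stops at 98.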

Another quick way to create an array is through np.zeros. This populates an array with, well, zeros. It might sound useless at first, but it's an easy way to initialize an array that you will later replace with different values. It helps keep arrays at a fixed length, for instance. More on that later.

The argument is the shape of the array: number of rows, number of columns. If it's 1D, then it can just be the length:

    np.zeros(3) #a 1D array of 3 zeros
    
If it's 2-D, you need two sets of parentheses:

    np.zeros((rows, columns))

In [None]:
#1D array of zeros of length 10
print(np.zeros(10))

#2D array with 3 rows and 10 columns
print(np.zeros((3, 10)))
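
As a sketch of how the shape argument works and how a placeholder zero gets replaced later, here is a small example. The names `flat` and `table` are illustrative, and the assignment uses indexing, which is covered in the next section.

```python
import numpy as np

flat = np.zeros(3)          # 1D: a single number gives the length
table = np.zeros((2, 4))    # 2D: a (rows, columns) tuple

print(flat.shape)
print(table.shape)

# np.zeros fills the array with floats by default.
print(flat.dtype)

# Replace the first placeholder zero with a real value.
flat[0] = 2.5
print(flat)
```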

# Indexing

## PYTHON USES ZERO-BASED INDEXING!

This means that the first value of an array is the 0th index. 

To call a certain value from an array, call the array name followed by brackets containing the index of the value you want:

    array[0]
    
The value inside the brackets can also be a variable, so long as the variable is an integer. 

A helpful shortcut is that you can also count backwards in your array with a negative sign, so the last value in your array is always array[-1].

In [None]:
base_array = np.random.randint(low = 1, high = 50, size = 50)

#getting the first entry
first_entry = base_array[0]
print(first_entry)
print()

#getting the last entry (length of array) - 1
last_entry = base_array[49]
print(last_entry)
print()
#or
last_entry = base_array[-1]
print(last_entry)
print()

#getting the 10th entry
tenth_entry = base_array[9]
print(tenth_entry)
print()

## Slicing
Finally, you can also slice arrays between certain index values. This is helpful if you want to plot only a small subsample of your data, for example.

For slicing, use a colon. Syntax:

    :x - from the beginning up to (but not including) index x
    x: - from index x until the end
    a:b - from index a up to (but not including) index b
    a:b:c - every c'th entry from index a up to (but not including) index b
    
These can be combined, e.g. a::c goes from index a until the end in steps of c.
    
Slicing is *exclusive*: the entry at the end index is not included.

In [None]:
long_array = np.arange(1, 1000, 1)

#getting a small subset between index 50-499
print(long_array[50:500])

#getting every other value, starting at index 0 (the even indices)
print(long_array[::2])

#getting every 2nd entry between index 50-499
print(long_array[50:500:2])
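
The slicing rules are easier to check on a short array. In this sketch the array holds the numbers 0 through 9, so each value equals its own index (the variable names are illustrative):

```python
import numpy as np

small = np.arange(10)   # values 0 through 9

first_three = small[:3]       # indices 0, 1, 2 (index 3 is excluded)
from_seven = small[7:]        # indices 7, 8, 9
middle = small[2:5]           # indices 2, 3, 4 (index 5 is excluded)
every_third = small[2:9:3]    # indices 2, 5, 8
stepped_to_end = small[1::4]  # indices 1, 5, 9
last_three = small[-3:]       # negative indices work in slices too

print(first_three, from_seven, middle)
print(every_third, stepped_to_end, last_three)
```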

## np.where()

This function allows you to impose a condition on an array so you can search the array for specific values. These conditions can be the following:

    < less than
    <= less than or equal to
    > greater than
    >= greater than or equal to
    == equal to
    != not equal to
    
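This function takes some getting used to. The output holds the indices where the array meets your condition (for a 1D array, a tuple containing one array of indices). To print the actual values, you would index the array with the output from np.where(). 

You can also give np.where() a compound condition: wrap each condition in parentheses and connect them with an ampersand, &, for "and" or a pipe, |, for "or". The basic form with a single condition is: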

    np.where(array < 2)

This gives the indices where this is true. The values of the array where it's true would be written as:

    array[np.where(array < 2)]
    
With a compound statement:

    array[np.where((array < 2) & (array > 0))]
    
Note that, as above, compound statements must be independent clauses, so something like "0<array<2" doesn't work on an array (Python raises an error).

In [None]:
arr = np.linspace(1, 100, 50)

print(arr[np.where(arr < 50)])

print(arr[np.where((arr < 50) & (arr > 25))])
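
Here is a minimal sketch of what np.where() returns, using a short array with made-up values. It also shows that the condition on its own is an array of True/False values that can be used as an index directly, which gives the same result.

```python
import numpy as np

values = np.array([5, 12, 7, 30, 18])

# np.where returns a tuple; its first entry holds the matching indices.
indices = np.where(values > 10)
print(indices[0])

# Indexing with that output gives the values themselves.
selected = values[indices]
print(selected)

# The condition alone is a True/False array (a "mask").
mask = values > 10
print(mask)
print(values[mask])

# Use | for "or"; each condition needs its own parentheses.
either = values[np.where((values < 6) | (values > 20))]
print(either)
```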

## Array Manipulation and Attributes

You can manipulate arrays with one statement. Check it out:

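You can append values to the end of an array using np.append(). It returns a new array and leaves the original unchanged, so assign the result to a variable if you want to keep it. Use it like this: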

    np.append(array, something_appended)
    
You can even append another array, like this:

    np.append(array, [5, 6])
    np.append(array1, array2)

In [36]:
array = np.array([12, 45, 6, 534, 2,3 ,4 ,42])

np.append(array, [5, 6])

array([ 12,  45,   6, 534,   2,   3,   4,  42,   5,   6])
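
One thing that is easy to miss: np.append() does not change the array you pass in. The sketch below uses illustrative names and values to show the difference between calling it and keeping its result.

```python
import numpy as np

original = np.array([1, 2, 3])

# The result is computed but not stored anywhere.
np.append(original, [4, 5])
print(original)   # still has 3 elements

# Assign the result to keep it.
extended = np.append(original, [4, 5])
print(extended)
```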

The last part of today will be showing you how to acquire different information from an array. Some of these are attributes of the array, and some of them are attributes of np itself, so you may need to look this up again in the future.

Attributes and methods of the array are called as nameofarray.command (methods also need parentheses):
   
    ndim - number of dimensions of your array
    size - total number of elements in the n-dimensional array
    shape - shape given by (rows, columns)
    flatten() - collapses the array into one dimension
    T - transpose of the matrix
    reshape(x, y) - change the dimensions of the array to x, y -- the total number of elements (x*y) MUST match
   
Functions of numpy, meaning that you call them as np.command(nameofarray):

    sum - sum all the elements in the array
    min - minimum value in array
    max - maximum value
    sort - return a copy of the array sorted in ascending order
    dot - matrix multiplication (takes two arrays)

One more, len, is built into Python rather than numpy, so you call it as len(nameofarray):

    len - number of elements along the first (row) axis
    

In [37]:
#example of reshape in action changing a size 100 1D array to a 20 x 5 2D array
array = np.linspace(0, 100, 100).reshape(20, 5)

In [None]:
print(f'Number of Dimensions: {array.ndim}')

print(f'Size of array is: {array.size}')

print(f'Shape of array is: {array.shape}')

print(f'Transpose of array is: {array.T}')
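
The commands listed above can be tried on a small 2 x 3 array, where the results are easy to verify by eye. The array `small` and its values are made up for this sketch.

```python
import numpy as np

small = np.array([[3, 1, 2],
                  [6, 5, 4]])

# Called on the array itself
print(small.ndim)            # 2 dimensions
print(small.size)            # 6 elements in total
print(small.shape)           # (2, 3): 2 rows, 3 columns
print(small.T.shape)         # transpose is (3, 2)
print(small.flatten())       # all 6 values in one dimension
print(small.reshape(3, 2))   # same 6 values, 3 rows and 2 columns

# Called through numpy
print(np.sum(small))         # 21
print(np.min(small), np.max(small))   # 1 and 6
print(np.sort(small))        # each row sorted in ascending order
print(np.dot(small, small.T))         # (2 x 3) times (3 x 2) gives 2 x 2

# Built into Python
print(len(small))            # 2 rows
```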

# Exercises

## Exercise 1
A. Create an array with 10 evenly spaced values in logspace ranging from 0.1 to 10,000.

B. Print the following values: The first value in the array, the final value in the array, and the range of 5th-8th values.

C. Append the numbers 10,001 and 10,002 (as floats) to the array. Make sure you assign the result to a variable!

D. Divide your new array by 2.

E. Reshape your array to be 3 x 4.

F. Multiply your array by itself.

G. Print out the number of dimensions and the maximum value.

In [None]:
# Code Goes Here

## Exercise 2

Make a 50 x 50 array using whatever method you like, and fill it with a checkerboard pattern that spans the rows and columns, kind of like the example below:

In [None]:
##########################
# 1 0 1 0 1 0 1 0 1 0 1 0
# 0 1 0 1 0 1 0 1 0 1 0 1
# 1 0 1 0 1 0 1 0 1 0 1 0
# 0 1 0 1 0 1 0 1 0 1 0 1
# 1 0 1 0 1 0 1 0 1 0 1 0
# 0 1 0 1 0 1 0 1 0 1 0 1
# 1 0 1 0 1 0 1 0 1 0 1 0
# 0 1 0 1 0 1 0 1 0 1 0 1
# 1 0 1 0 1 0 1 0 1 0 1 0
# 0 1 0 1 0 1 0 1 0 1 0 1
##########################

#code goes here

## Exercise 3

In image processing sometimes we are asked to change the color of an image by renormalizing the Red, Green and Blue images (RGB images). One quick and easy way to renormalize data is by using:

$Normalize = \frac{ImageData - MinData}{MaxData - MinData}$

Where MinData is the minimum value in your data, MaxData is the maximum value in your data and ImageData is the imaging data to be normalized.

Below is a variable called Image. Your task is to go through this 100 x 100 x 3 image and normalize each 100 x 100 layer separately. Once you have done that, store the result back in the variable Image and plot it with the plotting cell predefined below.

Hint: The color layers lie along the last axis, so Image[:, :, 0] is the first 100 x 100 image to renormalize (Red), Image[:, :, 1] is the second (Green), and Image[:, :, 2] is the third (Blue).

In [2]:
np.random.seed(50)
Image = np.random.uniform(low = 20, high = 1543, size = (100, 100, 3))

In [None]:
# Write Renormalizing Code Here



In [None]:
import matplotlib.pyplot as plt

In [None]:
plt.figure(figsize = (10, 10))
plt.imshow(Image.reshape(100, 100, 3))
plt.show()