# Session 4 - Python

Today we will cover some basic Python techniques and structures that are really useful for analyzing data.

## Today's Agenda
* Basics of Python
* List Comprehension
* Dictionaries
* Functions
* Classes

# Basics of Python

## The minimal Python script
Unlike many other languages, a simple Python script does not require any sort of header information in the code. So, we can look at the standard programming example, Hello World, in Python (below). Here we're simply printing to the screen. If we put that single line into a blank file (called, say, HelloWorld.py) and then run it from the command line by typing 'python HelloWorld.py', the script should run with no problems. This also shows off the first Python function, print, which can be used to print strings or numbers.

In [4]:
print("Hello World!") 

Hello World!


## Object types
Python has several built-in object types. Below are a few examples:

In [2]:
print( type("Hello World") ) #'str' is short for string

<class 'str'>


In [3]:
print( type(1) ) # 'int' is short for integer
print( type(1.25) ) # 'float' is for numbers with a decimal point (AKA floating-point numbers)

<class 'int'>
<class 'float'>


There are, however, a few lines that you will usually see in a Python script. The first line often starts with "#!" and is called the "shebang". For a Python script (a .py file), an example of the shebang line would be "#!/usr/bin/env python"

Within Python, any line that starts with # is a comment, and won't be executed when running the script. The shebang, though, is there for the shell. If you run the script by calling python explicitly, then the script will be executed in Python. If, however, you want to make the script an executable (which can be run just by typing "./HelloWorld.py") then the shell won't know what language the script should be run in. This is the information included in the shebang line. You don't need it, in general, but it's a good habit to have in case you ever decide to run a script as an executable.
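As a minimal sketch of the idea above, here is what a hypothetical HelloWorld.py with a shebang line could look like. The variable name `message` is just an illustrative choice.

```python
#!/usr/bin/env python
# The line above is the shebang: the shell reads it, Python treats it as a comment.

message = "Hello World!"
print(message)
```

On Linux or macOS, the file would typically be made executable with `chmod +x HelloWorld.py` and then run as `./HelloWorld.py`. Running `python HelloWorld.py` works with or without the shebang.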

Another common thing at the start of scripts is several lines that start with 'import'. These lines allow you to import individual functions or entire modules (files that contain multiple functions). These can be modules you write yourself, or things like numpy, matplotlib, etc.
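The three most common forms of the import line can be sketched as follows, using the standard library's math module as a stand-in for any module. The short name `m` is an arbitrary choice for illustration.

```python
import math                  # import a whole module
from math import sqrt        # import one function from a module
import math as m             # import a module under a shorter name

print(math.pi)        # access through the module name
print(sqrt(16))       # the imported function can be called directly
print(m.floor(2.7))   # access through the short name
```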

## Python variables

Some languages require that every variable be declared with a type. For example, in C++, a line like "int x" defines the variable x and specifies that it is an integer. Python, however, uses dynamic typing. That means that a variable's type is entirely determined by what is stored in the variable.

In the example below, we can see a few things happening. First of all, we can see that x behaves initially as a number (specifically, an integer; note that in Python 3 dividing with / always gives a float, which is why 42/4=10.5). However, we can put a string in there instead with no problems. Once we do, we can't treat it as a number anymore and add a number to it.

Try "un-commenting" the line "# print (x+10)" by removing the # at the front of that line. Python will raise a TypeError, because a string and an integer cannot be added together. Python will still join other *strings* onto it, as the line after it shows.

In [13]:
#with numbers we can do numerical operations
x=42
print(x, type(x))
print (x+10, type(x+10))
print (x/4, type(x/4)) # in Python 3, / always returns a float, even when both numbers are integers

# we can also add strings together; this does not do arithmetic on their contents
# but rather joins (concatenates) them
x="42"
print(x, type(x))
# print (x+10) # Note: you cannot add strings (inputs that are surrounded by quotes) to non-string objects
print (x+"10", type(x+"10"))

# Note: these numerical operations won't work on strings
# print (x-"10")
# print (x*"10")

42 <class 'int'>
52 <class 'int'>
10.5 <class 'float'>
42 <class 'str'>
4210 <class 'str'>
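To make the string-versus-number behaviour concrete, here is a small sketch that catches the error instead of stopping the script, and then converts explicitly with the built-in int and str functions. The names `as_number` and `as_text` are only illustrative.

```python
x = "42"

# Adding an integer to a string raises a TypeError
try:
    print(x + 10)
except TypeError as err:
    print("TypeError:", err)

# Converting explicitly makes the intent clear
as_number = int(x) + 10      # string -> integer, then add
as_text = x + str(10)        # integer -> string, then concatenate
print(as_number, type(as_number))
print(as_text, type(as_text))
```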


## Booleans
Booleans have one of two states, "True" or "False". Try setting a variable equal to True or False in the box below. You should see Python "color" the word to indicate that it is a special word in Python with a specific meaning.

In [43]:
T = True
F = False 

print(T,type(T),F,type(F))

True <class 'bool'> False <class 'bool'>
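Booleans most often show up as the result of a comparison. The sketch below, with illustrative variable names, shows a comparison producing a Boolean that is then used in an if statement.

```python
x = 42
is_large = x > 10               # a comparison evaluates to a Boolean
is_text = isinstance(x, str)    # checks whether x is a string
print(is_large, type(is_large))
print(is_text)

# Booleans can be combined with "and", "or" and "not"
if is_large and not is_text:
    print("x is a number larger than 10")
```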


## Lists
The basic way of storing larger amounts of data in Python (without using other modules like numpy) is Python's default option, lists. A list is, by its definition, one dimensional. To store more dimensions, we nest lists inside a list, which is referred to as a list of lists. This is *not* the same thing as an array, which is what numpy will use. Let's take a look at what a list does.

We'll start off with a nice simple list below. Here the list stores integers. Printing it back, we get exactly what we expect. However, because it's being treated as a list, not an array, it gets a little bit weird when we try to do addition or multiplication. Feel free to try changing the operations that we're using and see what causes errors, and what causes unexpected results.

In [15]:
x=[1, 2, 3]
y=[4, 5, 6]
print (x)
print (y)

# what happens when we perform numerical operations on a list?

print (x*2) # this will repeat the contents of the list twice, instead of multiplying each entry by 2
print (x+y) # this will append the y list to the x list, instead of adding each entry to each other
print (y+x) # similar as above except we are now appending the x list to the y list

[1, 2, 3]
[4, 5, 6]
[1, 2, 3, 1, 2, 3]
[1, 2, 3, 4, 5, 6]
[4, 5, 6, 1, 2, 3]
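The "lists of lists" mentioned above can be sketched as follows. The name `grid` is just an illustrative choice; each inner list acts as one row.

```python
# A list of lists: each inner list is one row
grid = [[1, 2, 3],
        [4, 5, 6]]

print(grid[0])                   # first row
print(grid[1][2])                # second row, third entry
print(len(grid), len(grid[0]))   # number of rows, entries in the first row

# The rows are ordinary lists, so + and * still concatenate and repeat
print(grid[0] + grid[1])
print(grid[0] * 2)
```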


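We can also set up a quick sequence of integers using the range function. If we give it just a single number, we get the integers from 0 to 1 less than that number. In Python 3, range returns a range object rather than a list, so wrap it in list() if an actual list is needed.

If we want a fancier sequence, we can also include the number to start at (first parameter) and the step size (last parameter). All three of these have to be integers.

If we need it, we can also set up blank lists.

The cell below uses numpy's linspace instead, which takes a start, a stop, and the number of points, and returns an array of evenly spaced floats (here including the endpoint 10).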

In [16]:
import numpy as np


print(np.linspace(0,10,11))


[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
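For comparison with the numpy cell, the range function and blank lists described earlier can be sketched in plain Python like this. The variable names are illustrative only.

```python
# range with one argument: 0 up to (but not including) the number
first_five = list(range(5))
print(first_five)

# range with start, stop and step
evens = list(range(2, 11, 2))
print(evens)

# a blank list, ready to be filled later
empty = []
print(empty, len(empty))
```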


If we want to, we can refer to subsets of the list. For just a single term, we can just use the number corresponding to that position. An important thing with Python is that the list index starts at 0, not at 1, starting from the first term. If we're more concerned about the last number in the list, then we can use negative numbers as the index. The last item in the list is -1, the item before that is -2, and so on.


We can also select a set of numbers by using a : to separate list indices. If you use this, and don't specify first or last index, it will presume you meant the start or end of the list, respectively.

After you try running the sample examples below, try to get the following results:
* [6] (using two methods)
* [3,4,5,6]
* [0,1,2,3,4,5,6]
* [7,8,9]

In [18]:
# x=range(1,11)
x=np.linspace(0,10,11)
print (x)
print ("First value", x[0])
print ("Last value", x[-1])
print ('First 5 values', x[0:5])
print ("Fourth to sixth values", x[3:6]) # the end index of a slice is not included


[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
First value 0.0
Last value 10.0
First 5 values [0. 1. 2. 3. 4.]
Fourth to sixth values [3. 4. 5.]
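The same indexing rules apply to plain lists. Here is a short sketch on a list of letters (chosen so it does not give away the exercise above) showing negative indices and slices with a missing start or end.

```python
letters = ["a", "b", "c", "d", "e", "f"]

print(letters[2])      # third item (index 2)
print(letters[-2])     # second-to-last item
print(letters[1:4])    # indices 1, 2, 3; the end index is not included
print(letters[:3])     # from the start up to index 2
print(letters[3:])     # from index 3 to the end
```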


## Modifying lists

The simplest change we can make to a list is to change it at a specific index just by redefining it, like in the second line in the code below.

There are three other handy ways to modify a list. append will add whatever we want as the next item in the list, but this means if we're adding more than a single value, it will add a list into our list, which may not always be what we want.

extend will expand the list to include each of the additional values, but only if we give it a list (or another sequence it can loop over). It won't work on a single integer (go ahead and try that).

Finally, insert will let us insert a value anywhere within the list. To do this, it requires a number for what spot in the list it should go, and also what we want to add into the list.

In [21]:
x=[1,2,3,4,5]

# let's change the 3rd element from 3 to 8
x[2]=8
print (x)

# let's append the number 6 to our list
print ("Testing append")
x.append(6)
print (x)

# let's append a 2nd list to our list
x.append([7,8])
print (x)


# let's say we want to add the items of a second list one by one, rather than nesting the list itself. We can use ".extend"
print ("Testing extend")
x=[1,2,3,4,5]
#x.extend(6) # un-comment to see the TypeError: an integer is not iterable
#print (x)
x.extend([7,8])
print (x)

# we can also use '.insert' to insert values at a given position in our list
print ("Testing insert")
x=[1,2,3,4,5]
x.insert(2, "in")
print (x)

[1, 2, 8, 4, 5]
Testing append
[1, 2, 8, 4, 5, 6]
[1, 2, 8, 4, 5, 6, [7, 8]]
Testing extend
[1, 2, 3, 4, 5, 7, 8]
Testing insert
[1, 2, 'in', 3, 4, 5]
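As a small sketch of how extend behaves with different inputs: it accepts anything it can loop over (a list, a tuple, a range), and a single value has to be wrapped in a list first.

```python
x = [1, 2, 3]

# extend needs something it can loop over, so wrap a single value in a list
x.extend([4])
print(x)

# extend also accepts other iterables, such as a tuple or a range
x.extend((5, 6))
x.extend(range(7, 9))
print(x)

# calling extend with a bare integer raises a TypeError
try:
    x.extend(9)
except TypeError as err:
    print("TypeError:", err)
```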


# Loops and List Comprehension

Like most languages, we can write loops in Python. One of the most standard loops is a for loop, so we'll focus on that one. Below is a 'standard' way of writing a 'for' loop. We'll do something simple, where all we want is to get the square of each number in the array.

In [24]:
x=np.linspace(0,10,11)

#let's make a blank list that we'll append to.
x_2=[]

# we can append new numbers to our blank list using a "for loop" which will look at every entry in the list independently
for i in x:
    i_2=i*i
    x_2.append(i_2)
print (x_2)

[0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0, 64.0, 81.0, 100.0]


Let's say we only want to perform an operation on the first 5 elements in x. We can run our for loop with the "range" function, like so:

In [26]:
x_2=[]
for i in range(5):
    x_2.append(x[i]*x[i]) # square the i-th entry of x
print(x_2)    

# we can also do this for different ranges

x_2=[]
for i in range(3,7):
    x_2.append(x[i]*x[i]) # square the i-th entry of x
print(x_2)    
    

[0.0, 1.0, 4.0, 9.0, 16.0]
[9.0, 16.0, 25.0, 36.0]


While that loop works, even this pretty simple example can be condensed into something a bit shorter. We have to set up a blank list, and then after that, the loop itself was 3 lines, so just getting the squares of all these values took 4 lines. We can do it in one with list comprehension.

This is basically a different way of writing a for loop, and will return a list, so we don't have to set up an empty list for the results.

In [27]:
x=np.linspace(0,10,11)
print (x)
x_2=[i*i for i in x]
print (x_2)

[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
[0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0, 64.0, 81.0, 100.0]
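A list comprehension can also include a condition, which filters the values as it goes. The sketch below uses a plain list of integers rather than the numpy array, and the names `squares` and `even_squares` are illustrative.

```python
x = list(range(11))   # integers 0 to 10

# square every value
squares = [i * i for i in x]

# square only the even values
even_squares = [i * i for i in x if i % 2 == 0]

print(squares)
print(even_squares)
```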


# Dictionaries

Dictionaries are another way of storing a large amount of data in Python, except instead of being referenced by an ordered set of numbers like in a list, each value is referenced by a key, which is usually a string or a number.

In [28]:
x={}
x['answer']=42
print (x['answer'])

42


These are particularly useful if you'll have a handful of values you'd like to call back to often. For an astronomy example, we can set up a dictionary that contains the absolute magnitude of the Sun in a bunch of bands (from Binney & Merrifield). We can now have a code that easily calls absolute magnitudes whenever needed using that dictionary.

We can also list out the dictionary, if needed, with AbMag.items(). There's some other tools for more advanced tricks with dictionaries, but this covers the basics.

In [34]:
AbMag={'U':5.61, 'B':5.48, 'V':4.83, 'R':4.42, 'I':4.08}

# we can print specific entries by using a specific key:
print(AbMag['V'])

print (AbMag.items()) #for showing all entries
print (AbMag.keys()) # for showing all keys
print (AbMag.values()) #for showing all values



4.83
dict_items([('U', 5.61), ('B', 5.48), ('V', 4.83), ('R', 4.42), ('I', 4.08)])
dict_keys(['U', 'B', 'V', 'R', 'I'])
dict_values([5.61, 5.48, 4.83, 4.42, 4.08])
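A few of the other common dictionary operations can be sketched as follows, reusing the AbMag values from the cell above. The 'K' key is a hypothetical missing band used only to show what happens when a key is absent.

```python
AbMag = {'U': 5.61, 'B': 5.48, 'V': 4.83, 'R': 4.42, 'I': 4.08}

# loop over key/value pairs
for band, mag in AbMag.items():
    print(band, mag)

# check whether a key exists before using it
has_k = 'K' in AbMag
print(has_k)

# .get returns None (or a fallback we choose) instead of raising a KeyError
print(AbMag.get('K'))
print(AbMag.get('V'))
```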


# Functions

At a certain point you'll be writing the same bits of code over and over again. That means that if you want to update it, you'll have to update it in every single spot you did the same thing. This is... a less than optimal use of time, and it also means it's really easy to screw up by forgetting to keep one spot the same as the rest. A function lets us write that code once and call it wherever it's needed.

We can try out a function by writing a crude function for the sum of a geometric series.
$$\frac{1}{r} + \frac{1}{r^2} + \frac{1}{r^3} + \frac{1}{r^4} + \ldots $$

Conveniently, so long as r is larger than 1, there's a known solution to this series. We can use that to see that this function works.
$$ \frac{1}{r-1} $$

Wrapping the calculation in a function means we can call it repeatedly and not need to change anything. In this case, you can try using the GeoSum function for several different numbers (remember, r>1) and see how close the 10-term sum gets to the exact value, just by changing TermValue.

In [40]:
def GeoSum(r):
    powers=range(1,11,1) #set up a list for the exponents 1 to 10
    terms=[(1./(r**x)) for x in powers] #calculate each term in the series
    return sum(terms) #return the sum of the list

TermValue=2
print (GeoSum(TermValue), (1.)/(TermValue-1))

0.9990234375 1.0
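One way to explore how close the partial sum gets is to let the number of terms vary. The sketch below is a variation on GeoSum, not a replacement for it: the `n_terms` parameter and its default of 10 are assumptions added for illustration.

```python
def GeoSum(r, n_terms=10):
    # exponents 1, 2, ..., n_terms
    powers = range(1, n_terms + 1)
    # calculate each term 1/r**p in the series
    terms = [1.0 / (r**p) for p in powers]
    return sum(terms)

r = 2
exact = 1.0 / (r - 1)
for n in (5, 10, 20):
    approx = GeoSum(r, n)
    # print the number of terms, the partial sum, and the gap to the exact value
    print(n, approx, exact - approx)
```

For a finite number of terms n, the partial sum is $(1 - r^{-n})/(r-1)$, so the gap to the exact value shrinks as $r^{-n}/(r-1)$. With r=2 and 10 terms that gap is 1/1024, which matches the 0.9990234375 printed above.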


# Classes

To steal a good line for this, ["Classes can be thought of as blueprints for creating objects."](https://jeffknupp.com/blog/2014/06/18/improve-your-python-python-classes-and-object-oriented-programming/)

With a class, we can create an object with a whole set of properties that we can access. This can be very useful when you want to deal with many objects with the same set of parameters, rather than trying to keep track of related variables over multiple lists, or even just having a single object's properties all stored in some hard to manage list or dictionary.

Here we'll just use a class that's set up to do some basic math. Note that the class consists of several smaller functions inside of it, which are called methods. The first method, called `__init__`, is going to be run as soon as we create an object belonging to this class, and so that'll create two attributes of that object, value and square. The other method, powerraise, only gets run if we call it. Try adding some other methods in there to try this out. They don't need to have anything new passed to them to be run.

In [42]:
class SampleClass:
    def __init__(self, value): #run when an object of this class is created, provide a value
        self.value = value
        self.square = value**2
        
    def powerraise(self, powerval): #only run when we call it, provide powerval
        self.powerval=powerval
        self.raisedpower=self.value**powerval # In python the exponential operation is "**" (not "^")

MyNum=SampleClass(3)
print (MyNum.value)
print (MyNum.square)
MyNum.powerraise(4)
print (MyNum.powerval)
print (MyNum.raisedpower)
print (MyNum.value,'^',MyNum.powerval,'=',MyNum.raisedpower)

3
9
4
81
3 ^ 4 = 81
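As a sketch of the suggestion to add more methods, the version below adds a hypothetical `cube` method that needs nothing new passed to it, and creates two separate objects from the same class to show that each one keeps its own attributes.

```python
class SampleClass:
    def __init__(self, value):
        # run when an object is created
        self.value = value
        self.square = value**2

    def cube(self):
        # hypothetical extra method: uses only self.value, so no new argument is needed
        return self.value**3

first = SampleClass(3)
second = SampleClass(5)
print(first.value, first.square, first.cube())
print(second.value, second.square, second.cube())
```

Unlike powerraise, this method returns its result instead of storing it as an attribute. Either approach works, and which one fits better depends on whether the value needs to be kept on the object.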


Next session, the first modules!