# Lecture Six: Review of Weeks 1-5 + Intro to Classes

The last few weeks have gone by pretty quickly, and there's been a lot to learn. It is also peak midterm time right now. So we take this week to review the content we've already covered and to introduce classes, since they bring a tricky(ish) new syntax that we'll be using next week.

## Part 1: Review

### Variables and Datatypes

- In Python, all of the quantities we are interested in working with are stored in what are called variables.

- We set a variable by writing our chosen variable name, an equals sign, and the value we want it to take.

- For example, if we want to take the transpose of an array A (which has already been defined), we would write:

In [None]:
A = np.transpose(A)

If we had just run the transpose function, Python would have done the transpose but not saved it anywhere, so we would have no way to access that transposed array.

We always have a choice between saving the outputs of functions into new variables and redefining variables we had before, if we want to keep the total number of variables down and have no interest in the previous quantity. When working in IPython notebooks, though, it is almost imperative that you use new variable names every time, since the order in which you run the cells matters: a cell that overwrites a variable can give a different result each time it is re-run.
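
Here is a small sketch of the difference between the two choices. The array values and the names A_T and B are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)

# Option 1: store the result under a new name; A itself is untouched.
A_T = np.transpose(A)          # shape (3, 2)

# Option 2: overwrite the old name. Running this line twice (as happens when
# a notebook cell is re-run) flips the array back to where it started.
B = A.copy()
B = np.transpose(B)            # shape (3, 2)
B = np.transpose(B)            # back to shape (2, 3)
```

With the new name, re-running the line always gives the same A_T. With the overwrite, the answer depends on how many times the line has been run.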

### Datatypes

The variables that we create have to be one of several available datatypes that python knows how to handle. The rules that apply to each datatype are different, so we want to choose to store our information in the datatype that is most suited to our needs. 

The primary datatypes we work with are:

- Integers, which use little memory

- Floats, which have decimal precision (most of our data is stored as floats)

- Strings, which let us store any sequence of characters in a single object

- Lists, which let us store any mix of datatypes in an ordered sequence

- Numpy arrays, which are also ordered, but have different operations and hold a single datatype

- Dictionaries, which store values that are looked up by a special key rather than by position

- Booleans (True and False)

- Tuples, which are ordered like lists but cannot be changed once created (this is what the np.where function returns)
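
As a quick reference, here is one variable of each datatype in the list above. All of the names and values are invented for illustration.

```python
import numpy as np

n_stars = 42                              # integer
distance = 4.37                           # float
name = "Alpha Centauri"                   # string
mixed = [n_stars, distance, name]         # list: any mix of datatypes, in order
mags = np.array([1.0, 2.5, 4])            # numpy array: one datatype for every element
star = {"name": name, "dist": distance}   # dictionary: values looked up by key
is_close = distance < 10                  # boolean
idx = np.where(mags > 2)                  # tuple of index arrays

# print the name of each datatype
for thing in (n_stars, distance, name, mixed, mags, star, is_close, idx):
    print(type(thing).__name__)
```

Note that the integer 4 passed to np.array is stored as a float, because the array holds only one datatype.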



### Indexing

Recall that we index the iterable datatypes (strings, lists, arrays) by adding square brackets to their variable names, containing the element number we want to access. Most of the tricky parts in programming involve figuring out which elements of a list we are actually interested in (i.e., where the peaks or the minimums are, etc.).
We can index multiple values using colon notation (i.e., thing[1:4]), or for multidimensional arrays, in row, column format (thing[1:3,2:4]).
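
A minimal sketch of both kinds of indexing follows. The list thing and the array grid are placeholder data, not anything from the tutorials.

```python
import numpy as np

thing = [10, 20, 30, 40, 50]
first = thing[0]        # 10: counting starts at zero
middle = thing[1:4]     # [20, 30, 40]: the stop index (4) is not included

grid = np.arange(20).reshape(4, 5)   # 4 rows, 5 columns, values 0 to 19
block = grid[1:3, 2:4]               # rows 1 and 2, columns 2 and 3
# block is [[ 7,  8],
#           [12, 13]]
```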

For dictionaries we have the same formatting but we substitute a key in the brackets. **Note** that dictionaries are unordered in Python 2 (and in Python 3 before version 3.7); newer versions remember the order in which entries were added, but either way you only extract things by key. When you want to use a for-loop over things in a dictionary, a handy method is dict.keys(), which works like

```
for i in some_dict.keys():
  val = some_dict[i]
```
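
If you want the key and the value at the same time, the built-in .items() method is an alternative. This is a small sketch with an invented dictionary; sorting the keys is one way to get a predictable order whatever Python version you run.

```python
some_dict = {"mass": 5.97e24, "radius": 6.371e6}

# .items() yields (key, value) pairs, so no lookup is needed inside the loop
pairs = []
for key, val in some_dict.items():
    pairs.append((key, val))

# sorted() returns the keys in alphabetical order
ordered_keys = sorted(some_dict.keys())   # ['mass', 'radius']
```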

Lastly, we learned about indexing by conditions using np.where. For most simple cases, numpy arrays also let you put the condition straight into the brackets; that is,

```
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = A[A>5] 
```

Here we are indexing A everywhere that A > 5 and storing that sub-array of A as B.
Why use np.where then? The method used here handles the "where" and the "index the thing" steps at the same time, so it never tells us the indices at which the condition is true, and sometimes we want those indices for other reasons.
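
To make the comparison concrete, here is a sketch that does it both ways. The second array, times, is an invented example of one of those "other reasons": a companion array we want to index at the same positions.

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

mask = A > 5               # boolean array, True wherever the condition holds
B = A[mask]                # values only: 6, 7, 8, 9, 10

ind = np.where(A > 5)[0]   # np.where returns a tuple; [0] picks out the index array
C = A[ind]                 # same values as B

# the saved indices can be reused on a second array of the same length
times = np.linspace(0.0, 0.9, 10)
times_above = times[ind]
```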

## Functions

Loops make it easy to automate tasks. Just as we have written scripts to perform a variety of tasks, we can write and define functions to do something more specific so we don't have to run whole scripts each time. 
Splitting programs up into functions makes them easier to read and easier to interface with.
A good function should perform a simple task. For example, if you want to convert rest masses into energies you would write something like

In [4]:
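def mass_to_en(mass):
    # E = m*c**2, with c the speed of light in m/s
    c = 3.0e8
    energy = mass * c**2
    # returning two values packs them into a tuple
    return energy, c

# unpack the returned tuple into two variables
en, cc = mass_to_en(10)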



In this example, mass is the argument which we pass into the function in order to get the energy, and the return statement hands back the quantities we are interested in (here the energy, along with the value of c that was used). The return statement is incredibly important. A function can return any datatype, and using the information a function returns is what makes it actually useful, so understanding how to work with return values is essential.


### Scope 

Remember that variables defined inside functions cannot be accessed once the function is done running unless those variables are somehow included in the return statement. 

Also remember that, technically, a function can access any globally defined variable in your code (say, if some variable ```star_name``` were defined up in your script and then you used that name in your function), but this is bad practice. You should pass all needed inputs to your functions as arguments and define everything else locally in your functions. That way your function can be easily transplanted to other codes and bugs in your scripts don't propagate to your functions.
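
A short sketch of what scope means in practice. The function kinetic_energy and its variable names are hypothetical, chosen only to show a local variable disappearing once the function returns.

```python
def kinetic_energy(mass, velocity):
    # half_mv2 is local: it only exists while the function is running
    half_mv2 = 0.5 * mass * velocity**2
    return half_mv2

ke = kinetic_energy(2.0, 3.0)    # 9.0, available because it was returned

try:
    half_mv2                     # not defined out here
    leaked = True
except NameError:
    leaked = False               # this branch runs
```

Everything the function needs comes in through its arguments, so it could be pasted into another script unchanged.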

### Optional Arguments 

As we saw in the last tutorial, sometimes we want to have optional arguments for our functions, things that are set by default but can be easily adjusted when running the function (this is what our ```do_plot=True``` in the last tutorial was doing). 

This can be handy for having optional plotting commands, or, say, for the way np.arange(10) gives the same result as np.arange(0,10,1) (where start and step are defaulted). When the creators of numpy wrote this function, it looked (something) like this:

```
def arange(stop, start=0, step=1):
    pass  # the real numpy implementation is omitted here
```

where stop is the only **required** argument and start and step have default values if you don't specify them. Note that when doing optional arguments like this (where their value is set), you have to list them after all the non-optional arguments as I've done above. Numpy then does some funky things so that when two numbers are entered it assumes start and stop, and when three are entered it assumes start, stop, step. (It will return an empty array if you enter them in the order stop, start, step, as in np.arange(10,0,1).)
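
Here is a runnable sketch of the same idea. make_range is a hypothetical stand-in written for these notes, not numpy's code, and it assumes a positive step.

```python
def make_range(stop, start=0, step=1):
    """Hypothetical helper: values from start up to (not including) stop."""
    values = []
    current = start
    while current < stop:        # assumes step > 0
        values.append(current)
        current += step
    return values

make_range(5)                    # [0, 1, 2, 3, 4]   both defaults used
make_range(5, start=2)           # [2, 3, 4]         start overridden by name
make_range(10, step=3)           # [0, 3, 6, 9]      step overridden, start left alone
```

Naming the optional argument when calling (step=3) lets you change one default without having to spell out the ones before it.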

### Variable Length Arguments

Here's something we haven't touched on yet. What if you want your function to be able to take 5, or 10, or 20 inputs, based on some other output of your code? 

We can do that using the \*args and \*\*kwargs options when defining functions. Here's an example

In [None]:
def test_function(farg,*args):
    print('formal_argument: ', farg)
    for arg in args:
        print('another arg: ',arg)
    

So what's going on here? The formal argument farg is read in like a normal argument. We could have any number of these. But we've specified the last argument as \*args, which tells Python "Hey, you're gonna get some unknown number of inputs after this, stick 'em all in a tuple called 'args' for me." Then, within the function, you can iterate through the tuple of extra inputs and do things with them individually. You can also do checks to see how many extra arguments were passed (using len(args)).
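
As a sketch of how this gets used, here is a function that returns something instead of printing. total_mass is a hypothetical name and the numbers are arbitrary.

```python
def total_mass(first, *args):
    """Hypothetical example: add one required mass to any number of extras."""
    total = first
    for extra in args:           # args holds the extra positional inputs
        total += extra
    return total, len(args)

total_mass(1.0)                  # (1.0, 0)
total_mass(1.0, 2.0, 3.0)        # (6.0, 2)

masses = [1.0, 2.0, 3.0, 4.0]
total_mass(*masses)              # (10.0, 3): the * unpacks the list into separate inputs
```

The last call is how you handle the "based on some other output of your code" case: build a list of whatever length you need, then unpack it with a star when calling.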

What about \*\*kwargs?

In [None]:
def test_function(farg,**kwargs):
    print('formal_argument: ', farg)
    for key in kwargs:
        print('another arg: ',kwargs[key])

Essentially, instead of adding new variables to your function when calling like 
```
test_function(farg, arg1, arg2, arg3)
```

like you would in the first example, all non-formal arguments would need a key attached (via equals sign) like 

```
test_function(farg,arg1=something,arg2=somethingelse)
```

and then, instead of a sequence of extra positional arguments, you have a dictionary in which arg1 and arg2 are the keys, and they can be used to access whatever something and somethingelse are.
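
A small sketch of \*\*kwargs in action. The function describe and the keyword names are hypothetical; the keys are sorted only so the output order is predictable.

```python
def describe(farg, **kwargs):
    """Hypothetical example: list whatever keyword arguments were passed."""
    lines = ['formal argument: {}'.format(farg)]
    for key in sorted(kwargs):                    # kwargs is a dictionary
        lines.append('{} = {}'.format(key, kwargs[key]))
    return lines

out = describe('star', mass=2.0, name='Vega')
# out is ['formal argument: star', 'mass = 2.0', 'name = Vega']

# a dictionary of options can be unpacked with ** when calling
options = {'name': 'Sirius', 'dist': 2.64}
out2 = describe('star', **options)
```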

In [3]:
def lists():
    # list() builds real lists in Python 3, where range() alone is lazy
    a = list(range(0, 100))
    b = list(range(100, 200))
    c = list(range(200, 300))
    return [a, b, c]

confusion = lists()
# what should the length of this be, and why?
print(len(confusion))
# first ten elements of the first list: index the outer list, then slice
print(confusion[0][0:10])

3
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


### Recursion

Recursion is a useful technique for reducing complex problems into simpler problems. Recursive formulas are often used in mathematics. The Fibonacci sequence is an example of a recursive formula. Recursion involves having a function call itself until it reaches a certain condition (the base case). For determining a factorial we would write

In [8]:
def factorial(n):
    if n == 0:
        # base case: stops the recursion
        return 1
    else:
        print(n)
        return n * factorial(n - 1)

print(factorial(12))

12
11
10
9
8
7
6
5
4
3
2
1
479001600


Recursion is an alternative to using loops and can often simplify the problem you might wish to solve if there is some base case. 
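
For comparison, here is a sketch of the Fibonacci sequence mentioned above written recursively, next to the factorial written as a loop. The names fib and factorial_loop are made up for this example, and fib uses the convention fib(0) = 0, fib(1) = 1.

```python
def fib(n):
    """n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    if n < 2:                        # base cases stop the recursion
        return n
    return fib(n - 1) + fib(n - 2)

def factorial_loop(n):
    """Same result as the recursive factorial, written with a for-loop."""
    result = 1
    for k in range(1, n + 1):
        result *= k
    return result

first_ten = [fib(i) for i in range(10)]   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
fact12 = factorial_loop(12)               # 479001600
```

The recursive fib reads almost exactly like the mathematical definition, but it recomputes the same values many times, so for large n a loop is usually the faster choice.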

## Part 2: Classes

Classes are structures which allow us to organize information so that it is easily accessible through dot notation. Classes contain methods and attributes. A method is a function available through dot notation. An attribute is a piece of information about the class as a whole or about a particular instance of a class. When we create an object from a class, that object is an instance of that specific class. Attributes can be of either class or instance type.
Class attributes are shared by every instance, independent of how each one was initialized.
Instance attributes contain information about that specific instance of a class.
An example of a class we have been using frequently is the numpy array. Each time we write arr = np.array([data]) it creates an instance of the numpy array class. When we want to access information about that array, like its number of elements, we would say arr_len = arr.size. If we wanted to make a copy of that array we would write arr.copy(), where copy is a method of the array class.
Check out the example of a planet class below.

In [5]:
G = 6.67e-11
import numpy as np

class planet(object):
    def __init__(self, mass, position):
        self.mass = mass
        self.position = position
        # randint's upper bound is exclusive, so (0, 2) gives 0 or 1 at random
        habitable_check = np.random.randint(0, 2)
        if habitable_check == 1:
            self.habitable = True
        else:
            self.habitable = False
    def grav(self, body2):
        return G*self.mass * body2.mass/((self.position - body2.position)**2)
    
planet1 = planet(1000, 10)
print('Planet 1 Mass: ',planet1.mass)
planet2 = planet(1500, 25)
grav1_2 = planet1.grav(planet2)
planet1.name = 'mars'
print('Planet 1 Name: ',planet1.name)
print('Planet 1 Habitable: ',planet1.habitable)
print('Grav Force between planet 1 and planet 2: ', grav1_2)

Planet 1 Mass:  1000
Planet 1 Name:  mars
Planet 1 Habitable:  False
Grav Force between planet 1 and planet 2:  4.4466666666666665e-07


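The \__init\__ method is how one sets up the initialization of classes in Python; it runs automatically each time a new instance is created. The self argument present in both methods is the class instance. It is filled in by the thing which precedes the dot when calling the method, so in planet1.grav(planet2), self is planet1 and body2 is planet2.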
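
To separate class attributes from instance attributes, here is a second, smaller sketch. The Star class and its values are hypothetical and are not part of the planet example above.

```python
class Star(object):
    # class attribute: shared by every instance
    kind = 'star'

    def __init__(self, name, mass):
        # instance attributes: set separately for each instance
        self.name = name
        self.mass = mass

    def heavier_than(self, other):
        # self is the instance before the dot, other is the argument
        return self.mass > other.mass

sun = Star('Sun', 1.0)
vega = Star('Vega', 2.1)

kinds = (sun.kind, vega.kind)              # ('star', 'star')
check = vega.heavier_than(sun)             # True
same_check = Star.heavier_than(vega, sun)  # True: same call with self passed explicitly
```

The last line shows what the dot notation is doing behind the scenes: vega.heavier_than(sun) hands vega to the method as self.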