## Introduction to Python III
**Nick Kern**
<br>
**Astro 9: Python Programming in Astronomy**
<br>
**UC Berkeley**

---

In this lesson, we will begin to study more advanced topics that make Python a powerful high-level language. We will start with an overview of the function object in Python: how to write functions and how they operate. We will then look at how to import Python packages, which have pre-defined functions that accomplish certain tasks. In doing so, we will look at the most useful and commonly used built-in packages that come with a standard Python distribution. This will naturally lead us to discussing the class object in Python, and in general the design of object-oriented programming. Lastly, we will conclude with an overview of reading-in and writing-out data (aka Input/Output or I/O).

1. [Functions](#Functions)
    1. [Namespaces](#Namespaces-and-Scope)
    2. [lambda functions](#lambda-functions)
    3. [Built-in Functions](#Built-in-Functions)
2. [Modules](#Modules)
3. [Classes and OOP](#Classes-and-OOP)
4. [File I/O](#File-I/O)

### Functions

**How-To**

You can think of a function in Python as a chunk of code that we have purposely separated from the rest of our code and given a name. At any point in the linear flow of our script, we can **call** this function and **pass it arguments** that allow it to run as we desire. Typically, a function will perform some calculation and will **return** the result, or will do something like take an existing data structure and append values to it. Let's start by writing some basic functions. The syntax for constructing a function is as follows:
```
def <func_name>(argument1, argument2, keyword_argument1=something, keyword_argument2=something):
    """
    This is the doc-string! Similar to the # character, the triple-quote characters mean that the
    following lines are not executed as code; they are meant to inform the user on how to use
    this function.
    """
    operation1 = argument1 * argument2
    if keyword_argument1 is not something:
        operation1 += 1
    
    return operation1
```
Note that the **indentation** specifies which lines of code are tied to the function.

In [None]:
# A simple function
def my_func(x, y=10):
    "this is my_func, which will return x * y"
    result = x * y
    return result

In [None]:
# inspect the doc string
my_func?

In [None]:
# feed it a number
output = my_func(5)
print(output)
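
As a small sketch of how positional and keyword arguments behave, the cell below redefines the same `my_func` and calls it three ways. The variable names and numbers are only illustrative.

```python
# Same function as above: x is a required positional argument,
# y is a keyword argument with a default value of 10
def my_func(x, y=10):
    "this is my_func, which will return x * y"
    result = x * y
    return result

default_y = my_func(5)        # y falls back to its default of 10
keyword_y = my_func(5, y=2)   # override the default by name
positional_y = my_func(5, 3)  # or override it by position

print(default_y, keyword_y, positional_y)
```

Passing `y` by name tends to be easier to read once a function has more than one or two keyword arguments.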

In [None]:
a = 10

In [None]:
# Try to print out result: this raises a NameError, because result only exists inside my_func
print(result)

### Namespaces and Scope

The previous test leads us to consider the concept of a ***namespace*** and ***scope***, which are intimately tied to the concept of a function in Python. A namespace is like a dictionary for the Python interpreter: it tells us which variable is attached to a certain piece of data. There can be multiple namespaces spanning a single program. The concept of scope tells us how to navigate the different namespaces depending on where we are within a program.

Spanning the entire program is the "global" namespace. When we define a variable like
```
a = 10
```
for example, we are assigning the variable in the global namespace. Anytime we enter a function, however, we temporarily construct a "local" namespace. Variables defined in this namespace get preference while performing operations in this namespace. This leads to the construction of a hierarchy of namespaces, with local being nested inside of global. If we entered yet another function within the first function, we would create a doubly-nested set of namespaces. When performing an operation with a variable, we need to evaluate its scope, which tells us which namespaces we can look in to search for the variable. A fundamental principle of scope tells us that when we are currently in some namespace, we can **always reach back** in the hierarchy to access the namespaces upstream that enclose us, but can **never reach forward** in the hierarchy to reach namespaces that are nested further down.

Let's take our function `my_func` for example. The global namespace exists everywhere, meaning we can access it within functions. Schematically, the namespaces we can access depending on where we are in the script are
```
<global namespace>

def func():
    <local namespace>
    <can also access global namespace>
    
<back to global namespace>
```
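
Since a namespace behaves much like a dictionary, one way to peek at it is the built-in `locals()` function, which returns the current local namespace as a dictionary. The following is a minimal sketch of the schematic above; the names `show_namespace`, `outer_value` and `inner_value` are made up for illustration.

```python
outer_value = 10  # defined outside the function

def show_namespace():
    inner_value = 5  # defined inside the function
    # locals() returns the function's local namespace as a dictionary
    local_names = sorted(locals())
    # names from the enclosing namespace are still readable in here
    total = inner_value + outer_value
    return local_names, total

local_names, total = show_namespace()
print(local_names, total)

# Outside the function we cannot reach forward into its local namespace
try:
    inner_value
except NameError:
    reached_inner = False
else:
    reached_inner = True

print(reached_inner)
```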

In [None]:
# Access the variable z while inside the function
z = 10
x = 0

def my_func(x, y=10):
    result = x * y * z
    print(result)
    
my_func(10)
print(result)

Two things happened here. First, we can see that even though we had previously defined `x = 0` in the global namespace, the scope of the function told us to look first in the local namespace to see if `x` existed. In this case it did, so the function used the locally defined `x` instead of the global `x`. Second, we can see that we did not define `z` locally, so the interpreter went up a hierarchy level to search `global`, and found it there. (The final `print(result)` still fails with a `NameError`, because `result` only exists in the function's local namespace.)

Anytime we enter a local namespace and then exit that namespace, the variables inside `local` are destroyed. This is important enough to state twice: the variables we define in `local` are lost unless we transport them back to the global namespace before we exit. This cleanup, which relies on Python's ***garbage collection***, is actually very convenient: it 1) keeps `global` from being cluttered up by a series of nested namespaces and 2) helps us recycle memory for future use. How can we transport `local` variables to `global`? Well, this is what the `return` statement does at the end of a function: it hands a reference to the desired object(s) back to the caller (here, the global namespace), where a variable awaits its assignment.

In [None]:
# sending output to the global namespace
def my_func(x, y=10):
    return x * y, x**2 + y**2, x + 3

output = my_func(10)

print(output)

In [None]:
# Another example of scope with two nested namespaces
z = 10
def my_func(x, y=10):
    def my_other_func(x):
        return x + y + z
    
    result = my_other_func(x)
    return result, my_other_func

output = my_func(2)
print(output)

### Built-in Functions

Python comes pre-programmed with some built-in functions. We have already explored some of them, like `print()`, `int()`, `float()`, `set()`, `dict()`, `hex()` and `id()`, to name a few. To get a list of all the built-in functions in Python 3, see [here](https://docs.python.org/3/library/functions.html). Let's explore some of the more useful ones we have yet to see.

`map()` is a very useful function for looping over an iterable and performing some calculation on it element-by-element. In some sense, you can think of it as a function and a for-loop pieced together in a single line. The syntax is `map(<function>, <iterable>)`. In Python 3 the output of `map()` is an iterator, whereas in Python 2 it is a list.

In [None]:
# Convert each element of the integer list range(5) to a float
result = list( map(float, [0, 1, 2, 3, 4] ) )
print(result)

In [None]:
# Map can also take a custom function as its first argument
# Flag which numbers in range(10) are odd
def find_odd(x):
    return bool(x % 2)

result = list( map(find_odd, range(10)) )

print(list( range(10) ))
print(result)
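
Because `map()` returns an iterator in Python 3, its values are produced on demand and can only be consumed once, which is why the cells above wrap it in `list()`. A short sketch of that behavior, using a made-up helper called `square`:

```python
def square(x):
    return x**2

squares = map(square, range(4))

first = next(squares)       # pull a single value from the iterator
remaining = list(squares)   # consume everything that is left
leftover = list(squares)    # the iterator is now exhausted

print(first)      # 0
print(remaining)  # [1, 4, 9]
print(leftover)   # []
```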

The `sorted()` function allows us to easily sort lists.

In [None]:
# Sort numbers
a = [3, 7, 1, 33, 6, 245, 8]
print(sorted(a))

# Sort letters
a = ['t', 'u', 'a', 'g', 'e', 'w']
print(sorted(a))
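
`sorted()` also accepts the optional keyword arguments `reverse` and `key`, which change the direction of the sort and the quantity being sorted on. The sketch below uses made-up example lists; note that `sorted()` returns a new list and leaves the original untouched.

```python
a = [3, 7, 1, 33, 6, 245, 8]

# reverse=True sorts from largest to smallest
descending = sorted(a, reverse=True)
print(descending)  # [245, 33, 8, 7, 6, 3, 1]

# key takes a function that is applied to each element before comparing
words = ['telescope', 'star', 'galaxy', 'sun']
by_length = sorted(words, key=len)
print(by_length)   # ['sun', 'star', 'galaxy', 'telescope']

# the original list is unchanged
print(a)           # [3, 7, 1, 33, 6, 245, 8]
```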

The `zip()` function weaves or "zips" two or more iterables together element-by-element.

In [None]:
# zip together these lists
states = ["Michigan", "Minnesota", "Massachusetts","Maine","Montana"]
capitals = ["Lansing", "Saint Paul", "Boston", "Augusta", "Helena"]
state_cap = list( zip(states, capitals) )
print(state_cap)
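
Two common ways to use `zip()` are looping over paired values and building a dictionary from two lists. The following sketch uses a shortened version of the lists above; the variable names `lookup`, `sentences` and `short` are just for illustration.

```python
states = ["Michigan", "Minnesota", "Massachusetts"]
capitals = ["Lansing", "Saint Paul", "Boston"]

# build a dictionary that maps each state to its capital
lookup = dict(zip(states, capitals))
print(lookup["Minnesota"])  # Saint Paul

# loop over both lists at once, unpacking each pair
sentences = []
for state, capital in zip(states, capitals):
    sentences.append("%s is the capital of %s" % (capital, state))
print(sentences[0])  # Lansing is the capital of Michigan

# zip stops as soon as the shortest input runs out
short = list(zip([1, 2, 3], ['a', 'b']))
print(short)  # [(1, 'a'), (2, 'b')]
```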

### Breakout

Write a function that calculates and returns either the area or the circumference of a circle when given a radius. Enable the selection between outputting the area or circumference by incorporating a keyword argument. It would be good to know that $\pi \approx 3.14159$.

### Modules

Besides the built-in functions we just explored, there are functions that live in external modules. A Python distribution typically comes with standard modules, which you won't need to download if you already have a working Python installation. Let's explore how we can import these modules into our interpreter environment and use them. Some of the modules we will explore are
* math
* time
* os
* sys
* collections

The syntax for importing a module is
```
import <module>.<submodule> as <module_reference>
```
or
```
from <module> import <submodule> as <module_reference>
```

But first...

In [None]:
import antigravity as ag

**math**

While Python natively allows for basic arithmetic, there are many mathematical operations that we might like to do that aren't pre-built into Python. The `math` module carries many such functions.

In [None]:
# This will import the math module and name it math
import math

# This will import the math module and name it m
import math as m

# To see a list of the functions and constants it contains, use dir
dir(m)

In [None]:
# import just one function from the math module
from math import sqrt

In [None]:
sqrt = "this is some string"

In [None]:
# import all methods from the math module
from math import *

# this is considered bad practice because it muddies up your global namespace

You can see the math module has a `log` function, trigonometric functions like `sin` and `cos`, a square-root function, rounding functions like `floor` and `ceil`, and fundamental constants like `pi` and `e`. We can access the functions attached to a module (called methods) with a dot syntax: `<module>.<method>`.

In [None]:
# take sqrt of 25
print( m.sqrt(25) )
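
A few more of the functions and constants mentioned above, accessed with the same dot syntax. This is only a quick sketch; the variable names are illustrative.

```python
import math

rounded_down = math.floor(2.7)        # largest integer <= 2.7
rounded_up = math.ceil(2.2)           # smallest integer >= 2.2
natural_log = math.log(math.e)        # log is the natural logarithm by default
sine = math.sin(math.pi / 2)          # trig functions expect radians
circle_area = math.pi * 2.0**2        # area of a circle with radius 2

print(rounded_down, rounded_up)
print(natural_log, sine)
print(circle_area)
```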

**time**

The `time` module allows us to get the current time and timezone, and to make our interpreter `sleep`.

In [None]:
import time

print("The current time and date in %s is %s" % (time.tzname[0], time.asctime()))

In [None]:
# use sleep to delay the interpreter from continuing for X seconds
print("start loop!")
for i in range(2):
    time.sleep(1)
    print("it's been %s seconds since the loop started..." % (i + 1))

**os**

The `os` module stands for operating system, and allows us to interact with our operating system like we might through a command line interface such as a bash shell.

In [None]:
import os

In [None]:
# similar to bash ls
print( os.listdir() )

In [None]:
# similar to bash pwd
print( os.getcwd() )

In [None]:
# similar to bash cd
os.chdir('../')
os.listdir()

In [None]:
# go back
os.chdir('02_IntroPython')

**sys**

The `sys` module allows us to interact with the interpreter. One thing we can do is access positional arguments that may have been fed to the script.

In [None]:
%%file hello.py
import sys
print(sys.argv)

print("hello!")

In [None]:
%run hello.py arg1 arg2 arg3
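
`sys.argv` is an ordinary list of strings: the first element is the script name and the rest are the arguments. The sketch below simulates such a list rather than reading the real one, so that it runs the same way anywhere; `split_argv` and `example_argv` are hypothetical names.

```python
import sys

def split_argv(argv):
    """Split an argv-style list into the script name and its arguments."""
    script_name = argv[0]
    arguments = argv[1:]
    return script_name, arguments

# Simulate what sys.argv would hold after running: hello.py arg1 2.5
example_argv = ['hello.py', 'arg1', '2.5']
script_name, arguments = split_argv(example_argv)

# arguments always arrive as strings, so convert them when numbers are needed
value = float(arguments[1])

print(script_name, arguments, value)

# in a real script you would pass sys.argv itself, which is also a list
print(type(sys.argv))
```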

**collections**

The `collections` module provides specialized container data structures as alternatives to Python's general-purpose built-in data structures. One of the more notable is the `OrderedDict`, a dictionary that remembers the order in which items were added, which can be desirable in some circumstances. (Since Python 3.7 the built-in `dict` also preserves insertion order, so the two outputs below will match on a recent interpreter.)


In [None]:
import collections

In [None]:
# OrderedDict example
d1 = collections.OrderedDict([['apple',1], ['banana',2],['orange',3],['pear',4]])
d2 = dict([['apple',1], ['banana',2],['orange',3],['pear',4]])

print(d1.keys())
print(d2.keys())
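
Another container from `collections` that is often handy is `Counter`, which tallies how many times each item appears in an iterable. A minimal sketch, using a made-up list of stellar spectral types:

```python
from collections import Counter

# hypothetical data: spectral types of a handful of stars
spectral_types = ['G', 'K', 'M', 'M', 'G', 'M']

counts = Counter(spectral_types)

print(counts['M'])            # 3
print(counts['O'])            # 0, missing items count as zero
print(counts.most_common(1))  # [('M', 3)]
```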

### How to explore new modules

Often we would like to accomplish a task or solve a problem that someone else has already solved and written a module for. Searching the web for Python modules, downloading them, installing them and inspecting them is a useful skill for any scientist. Use the interactive functionalities of Jupyter Notebooks to your advantage! Remember the following commands to help you learn what a module and its associated functions can be used for:
```
<module>.<tab>
```
is a way to see all of the methods and attributes of a module. Another way to print this out to the stdout in a non-interactive session is `dir(<module>)`.
You can use
```
<module>?
```
to see the doc-string of the module, which should give a reasonable explanation of the module and its capabilities. The way to print this to the stdout in a non-interactive session is `help(<module>)`.

In [None]:
import string

In [None]:
string.ascii_lowercase

In [None]:
string?

### Building our Own Module

To build a simple module, you need only write Python code into a script and "import" the script (with no .py suffix). When you import the script, you are actually running the script and importing the resultant global variables. If you want to define operations that only occur when the program is run directly from the command line, use the fact that the variable `__name__` is set to the string `"__main__"` only when the program is run directly (for example from the CLI), and to the module's name when it is imported. Take the following simple module as an example.

In [None]:
%%file module1.py 

myname = "Nick"

def my_simple_function(x):
    return x**2

In [None]:
import module1

In [None]:
dir(module1)

In [None]:
module1.myname

In [None]:
module1.my_simple_function(4)
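
To illustrate the `__name__` check described above, here is a sketch of what a slightly extended module could look like. The filename `module2.py` is hypothetical; the function is the same one as in `module1.py`.

```python
# contents of a hypothetical file named module2.py

myname = "Nick"

def my_simple_function(x):
    return x**2

if __name__ == "__main__":
    # This block runs for `python module2.py`,
    # but is skipped when another script does `import module2`
    print("running as a script:", my_simple_function(4))
```

Placing demo or test code under this check keeps it from running every time someone imports the module.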

### Breakout


* Write a function that approximates $\pi$ via the convergent series of inverse squares
\begin{align}
\frac{1}{1} + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \ldots = \frac{\pi^{2}}{6}
\end{align}
Take the number of terms in the sum as input, and print out the approximation as well as the fractional error from the true value.


* turn it into a module and import the function!


* don't forget to add an informative doc-string to the function!

### Classes and OOP

Now that we have reviewed functions and module importing, we will cover another fundamental level of Pythonic abstraction: classes. Classes are not unique to Python. In fact, the term object-oriented programming (OOP) implies that a language is built to handle class structures. Here, we will cover how Python uses classes to form high-level abstractions of containerized objects. To be precise, an object in Python is a container that holds **methods (functions)** and **attributes (variables)**, which can be accessed with a `<object>.<method>` syntax.

In this sense, the modules we have been working with are like objects, in that they hold both methods `math.sin` and attributes `math.pi`. In fact, almost everything in Python is an object, like strings (which have their own methods as we will see later), and all of the data structures we saw before. Another thing that separates classes from functions besides the ability of a class to hold methods and attributes, is that to use the class, we cannot just call it from the interpreter like we would a function; to use a class we first need to ***instantiate*** it. Let's first learn what this means through an example.

In [None]:
# Define a very simple class
class My_Class:
    i = 10

In [None]:
# Instantiate the class
MC = My_Class()

In [None]:
# look at class object
My_Class

In [None]:
# look at instantiation of class
print(MC)

This is a very simple class. We have called the object `My_Class` a class, and have assigned it one attribute, `i`, which we set equal to `10`. We then instantiated the class and called that particular instantiation `MC`. If we wanted to access the variable `i`, we can do so via the `<object>.<attribute>` syntax.

In [None]:
# print i
print(MC.i)

We can also add new attributes, or change old ones. This does not affect the class object, only the instantiation.

In [None]:
MC.a = 100
MC.i = 0

In [None]:
print(MC.a, MC.i)

Now let's make the class a little less simple.

In [None]:
# a little less simple class
class My_Class:
    """
    doc-string for the class
    """
    i = 10
    
    def func1(self, arg1):
        """
        doc-string for func1
        """
        return arg1**2

In [None]:
# Instantiate the class
MC = My_Class()

In [None]:
MC.func1(10)

Here, we have assigned the class a **doc string** for informative purposes. Along with `i`, we now define a single **method** for the class, called `func1`, which returns the square of a number passed to it. In this basic form, the class isn't all that helpful to us; its only purpose is to hold the function `func1`, and the attribute `i`, which we could have defined separately. Let's make this class a little more complex.

In [None]:
# a less simple class
class My_Class:
    """
    doc-string for the class
    """
    def __init__(self, var1, var2):
        self.var1 = var1
        self.var2 = var2
        
    def func1(self, arg1):
        """
        doc-string for func1
        """
        result = self.var1 * arg1
        return result


In [None]:
# Instantiate the class
MC = My_Class(10, 20)

Here we have defined an `__init__()` method, which has special meaning when defined inside a class. This method is run automatically when the class is instantiated, and whatever arguments we pass to the class upon instantiation are passed to this method. In it, we have defined the **instance attributes** `self.var1` and `self.var2`. Notice that we have modified `func1` to use the attribute `self.var1`. What's going on here? Remember our discussion about namespaces and scope? Intuition tells us that `self.var1` should be destroyed after instantiation, when the `__init__()` method is finished. What's happening is that, within a method, `self` refers to the instance the method was called on. By assigning to `self.var1`, we attach the variable to the instance's own namespace, which outlives the method call and can be accessed at any time, both inside the class (through `self`) and outside it (through the instance, e.g. `MC.var1`).

In [None]:
# access attributes assigned upon class instantiation
print(MC.var1)
print(MC.var2)

In [None]:
# Run func1
MC.func1(2.0)

Let's look at the differences between the objects `My_Class` and `MC`. The former is a class, while the latter is an instance (or an instantiation) of `My_Class`. Besides the semantics, what's the difference? A good analogy I once heard is that a class is like the blueprint for a certain kind of car: it details how the car should be made and what should come pre-built with it. An instance (or instantiation of the class) is like the actual car itself once it is made. I can have any number of actual cars from the initial blueprint, just as I could have an arbitrary number of instantiations of the original class. Methods such as `func1`, which take `self` as their first argument and are called on an instance like `MC`, are called instance methods.
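
To make the blueprint analogy concrete, here is a small sketch with a made-up `Car` class. Each instance carries its own copy of the attributes assigned through `self`, while attributes defined directly in the class body are shared by all instances.

```python
class Car:
    """A hypothetical blueprint for a car."""
    wheels = 4  # defined on the class: shared by every instance

    def __init__(self, color):
        self.color = color  # defined on the instance: unique to each car

    def describe(self):
        return "a %s car with %d wheels" % (self.color, self.wheels)

# two separate cars built from the same blueprint
car1 = Car('red')
car2 = Car('blue')

# repainting one car does not affect the other
car1.color = 'green'

print(car1.describe())  # a green car with 4 wheels
print(car2.describe())  # a blue car with 4 wheels
```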

### Special Class Methods

Along with the `__init__` method, there are other special class methods that allow a custom class to do many different things. When we want to print a class, for example, we are really calling the `<class>.__str__` method. If we try to add a number to a class via `<class> + number`, we are really calling `<class>.__add__(number)`. Changing the inherent `__add__()` method of a class is called operator overloading. More info on the available special methods can be found [here](http://www.diveintopython3.net/special-method-names.html).

In [None]:
# This example changes the string representation of the class when printed
class MyClass:
    
    def __init__(self, arg1):
        self.arg1 = arg1
    
    def __str__(self):
        return "MyClass object with arg1 = %s" % self.arg1

MC = MyClass(25)

In [None]:
# call __str__() method when printing
print(MC)

In [None]:
# call __repr__() method when looking at object directly
MC

In [None]:
# This example allows us to add custom class objects
class MyClass:
    
    def __init__(self, arg1):
        self.arg1 = arg1
    
    def __str__(self):
        return "MyClass object with arg1 = %s" % self.arg1
    
    def __repr__(self):
        return "%s" % self.arg1
    
    def __add__(self, other):
        return MyClass(self.arg1 + other.arg1)

MC = MyClass(10)

In [None]:
# Look at MyClass __str__
print(MC)

In [None]:
# Look at MyClass with __repr__
MC

In [None]:
# Add two MyClass instances together!
MC1 = MyClass(10)
MC2 = MyClass(15)
MC3 = MC1 + MC2

In [None]:
# look at __str__
print(MC3)

In [None]:
# look at __repr__
MC3

### Breakout

Write your own custom class called `Complex(r, i)`, where `r` is the real and `i` imaginary component. When instantiated, the class should have the following attributes
* self.real
* self.imag
* self.abs

and the method
* self.conjugate()

which returns the complex conjugate. Your custom Complex class should also support Complex addition.
The syntax and expected output of Complex should be something along the lines of
```
>>> C = Complex(1, 2)
>>> print(C)
(1, 2j)
>>> print(C.real)
1
>>> print(C.imag)
2
>>> print(C.conjugate())
(1, -2j)
>>> print(type(C.conjugate()))
<class '__main__.Complex'>
```

**Bonus**:

Allow your custom class to add with the built-in `complex` type. Hint: try using the `isinstance(object, type)` conditional to check what the type of an incoming object is...


### Inheritance and the cat-dog

When we write pieces of code that have separate functionalities, it can be a good idea to branch them into separate classes. This helps keep our codebase clean and organized, and when we want to use the code we've written we only need to import the code we want to use instead of everything in our codebase. There are times, however, that we want to merge the capabilities of two separate classes together, so that we can access the methods and attributes from both classes. We can do this through inheritance. 

In [None]:
class dog:
    def __init__(self, name):
        self.name = name
        
    def bark(self):
        print("bark!")
         
class cat:
    def __init__(self, name):
        self.name = name
    
    def meow(self):
        print("meow!")

In [None]:
# Instantiate
C = cat('tom')
D = dog('goofy')

C.meow()
D.bark()

In [None]:
# Example of inheritance
class dog:
    def __init__(self, name):
        self.name = name
        
    def bark(self):
        print("bark!")
         
class cat(dog):
    def __init__(self, name):
        self.name = name
    
    def meow(self):
        print("meow!")
        
#    def bark(self):
#        print("meark!")

In [None]:
C = cat('tom')

C.bark()
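
The commented-out lines in the inheritance example hint at one more feature: a child class can override a method it inherited. The sketch below is a variation on the classes above, under two assumptions made for illustration: the methods return strings instead of printing (so the results are easy to inspect), and `cat` defines no `__init__` of its own, so it reuses the one from `dog`.

```python
class dog:
    def __init__(self, name):
        self.name = name

    def bark(self):
        return "bark!"

class cat(dog):
    # no __init__ here: cat inherits dog's __init__

    def meow(self):
        return "meow!"

    def bark(self):
        # overrides the bark method inherited from dog
        return "meark!"

C = cat('tom')
D = dog('goofy')

print(C.name)              # tom
print(C.bark())            # meark!
print(D.bark())            # bark!
print(isinstance(C, dog))  # True: a cat instance is also a dog here
```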

Another solution is to compose the second class within the first via a technique called **composition**. One benefit is that this leads to non-overlapping namespaces.


In [None]:
# Example of composition
class dog:
    def __init__(self, name):
        self.name = name
        
    def bark(self):
        print("bark!")
         
class cat:
    def __init__(self, name):
        self.name = name
        self.dog = dog('goofy')
    
    def meow(self):
        print("meow!")
    
    def bark(self):
        print("meark!")

In [None]:
C = cat('tom')

In [None]:
C.bark()
print(C.name)

In [None]:
C.dog.bark()
print(C.dog.name)

Inheritance and composition are high-level concepts that you may not use in this course, but we would be remiss not to at least mention this powerful capability of the class structure. If you are interested in learning more, there are lots of great resources online. [This link](http://python-textbok.readthedocs.io/en/1.0/Object_Oriented_Programming.html) contains a fairly comprehensive overview.

### File I/O

We have spent a while talking about how you can manipulate, store and perform operations on data. What about when we are finished with the data? We would like to store it somewhere, share it, or publish it! Often we also want to take someone else's data and perform our own operations on it. To do these things with data, we need to talk about how to write and read data, aka file input / output (I/O). One of the many built-in functions we haven't talked about is the `open()` function, which allows us to access data within files. It returns a file object (loosely referred to here as a file descriptor), which allows us to step through the file line-by-line and read in or write out data. This works well for basic text files, like `.py`, `.sh`, `.txt`, `.tab`, and `.csv` files. Let's try to read in the `module1.py` file we created before.

In [None]:
# The second argument, 'r', specifies that we are opening the file in "read-only" mode: we will not
# be allowed to write into the file, even if we try
f = open('module1.py', 'r')
firstline = f.readline()
secondline = f.readline()
thirdline = f.readline()
fourthline = f.readline()
fifthline = f.readline()
print(firstline+'\n'+secondline+'\n'+thirdline+'\n'+fourthline+'\n'+fifthline)

In [None]:
# we can do this all at once with read
f = open('module1.py', 'r')
f.read()

In [None]:
# package line by line
f = open('module1.py', 'r')
f.readlines()

# good practice is to close the file descriptor when done
f.close()

In [None]:
# reading from a closed file raises a ValueError
f.readline()

In [None]:
# open the file again, then close it explicitly
f = open('module1.py', 'r')

f.close()

In [None]:
# we can handle file descriptors more easily with a "context manager"
with open('module1.py', 'r') as f:
    lines = f.readlines()
    
print(lines)
print(f.closed)
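
A file object can also be looped over directly, which yields one line per iteration without loading the whole file into memory at once. The sketch below writes a small throwaway file into a temporary directory first so that it is self-contained; the filename `example.txt` and its contents are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.txt')

    # create a small file to read back
    with open(path, 'w') as f:
        f.write('first line\nsecond line\nthird line\n')

    cleaned = []
    with open(path, 'r') as f:
        # the file object yields one line per iteration
        for line in f:
            # each line still ends with its newline character, so strip it
            cleaned.append(line.rstrip('\n'))

print(cleaned)  # ['first line', 'second line', 'third line']
```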

In [None]:
# let's try and write files, 'w' is for overwrite and 'a' is for append
with open('newfile.txt', 'w') as f:
    f.write('hello, this is the first line!')
    f.write('we forgot to make a line break, so this will look weird...\n')
    f.write('now we got the hang of things!\n')
    f.write('\tbut this will have an indent.     ')  

In [None]:
less newfile.txt

In [None]:
# We can use string methods to manipulate text
with open('newfile.txt', 'r') as f:
    lines = f.read()
    
lines

In [None]:
# Get rid of whitespace
lines.strip()

In [None]:
# separate the text at a delimiter, which is whitespace by default, but can be anything: \n or ',' for example
words = lines.split()
print(words)

In [None]:
# join them again!
print(' '.join(words))

In the case when we'd like to store / read tabular data, or entire data structures themselves, we will need to resort to serializing the data. We generally can't read these kinds of files with our command line text editors, but we can use Python modules like `pickle`, `numpy` or `fits` to read and write them. `pickle` happens to be my preference, so let's start with that one. `fits` (Flexible Image Transport System) files are **very common** in astronomy for handling imaging data, so we will likely see these kinds of files repeatedly.

In [None]:
# here we will store an entire dictionary to file
import pickle as pkl
from collections import OrderedDict

# make some data (doesn't need to be a dictionary, can be anything!)
d = OrderedDict([['var1', 1], ['var2', 2], ['animal', 'goat']])

# write it with pickle and a new option for the file descriptor
with open('diction.pkl', 'wb') as f:
    # Use pkl to wrap the file descriptor
    output = pkl.Pickler(f)
    
    # dump data
    output.dump(d)
    
    # close file by exiting context manager

In [None]:
# open file with pickle
with open('diction.pkl', 'rb') as f:
    inp = pkl.Unpickler(f)
    dictionary = inp.load()
    
print(dictionary)
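
For simple cases, the `pickle` module also offers the shorthand functions `pickle.dump()` and `pickle.load()`, which skip the explicit `Pickler` and `Unpickler` objects. A minimal sketch of a round trip, written to a temporary directory so it leaves no files behind (the data and filename are only illustrative):

```python
import os
import pickle
import tempfile

data = {'var1': 1, 'var2': 2, 'animal': 'goat'}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'diction.pkl')

    # write the object out in binary mode
    with open(path, 'wb') as f:
        pickle.dump(data, f)

    # read it back in, also in binary mode
    with open(path, 'rb') as f:
        restored = pickle.load(f)

print(restored == data)  # True: same contents
print(restored is data)  # False: a new object was built from the file
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.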

### Breakout

* Write a function that takes data from the `random.csv` file and writes it out to a `.tab` file (bonus: and vice versa). Note the format of the `random.csv` file and try to preserve it when you convert it to `.tab`.