**PySDS Week 1. Lecture 4. v1.1** Author: Bernie Hogan

# Learning Python: Functions and abstraction

# Section 1: Error Handling 

You will have noticed by now that code sometimes raises errors, and that an error is not always a real problem for the program. We might want to deal with the error and move on rather than have the program break. This will be especially pertinent in week three once we start doing work on the web, as socket and connectivity errors happen but we will still want the program to continue collecting data. 

Here is an example of an error: 

In [None]:
1/0

This is a ```ZeroDivisionError```: as we know from maths, division by zero is undefined (sparing us all the proofs of this, which are really interesting, but not for here). The important part is that we can catch the error and move on if we anticipate it and think it's not going to affect future code. We do this using ```try``` and ```except``` statements. See the example below: 

In [None]:
print(...)

In [None]:
import numpy as np 

for i in range(-5,5):
    try:
        print(1/i)
    except ZeroDivisionError:
        # i == 0 lands here; print a placeholder instead of stopping
        print(np.nan)


If we just said ```except``` and left out the ```ZeroDivisionError``` it would catch all possible errors. This is not necessarily what we want, since some of those errors might be legitimately concerning while others are things we anticipate as a matter of course. 
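As a rough sketch of why naming the exception matters, the loop below handles two anticipated errors separately. The list ```values``` and the placeholder results are made up for illustration.

```python
# Hypothetical mixed input: a number, a zero, and a string.
values = [2, 0, "three"]
results = []

for value in values:
    try:
        results.append(1 / value)
    except ZeroDivisionError:
        # Anticipated: dividing by zero. Store a placeholder and move on.
        results.append(None)
    except TypeError:
        # Anticipated: the value was not a number at all.
        results.append("not a number")

print(results)  # [0.5, None, 'not a number']
```

Any other kind of error would still stop the program, which is usually what we want for problems we did not anticipate.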

# Section 2. Building your own functions

Below I show some code to build a list called ```outlist```. Then I think, how about doing that in a list comprehension? The problem is that it needs if and else statements, and things are starting to get a bit complex. What we can do is write a function, put all the complicated bits in the function, and then just call the function inside the list comprehension. That's what we do with the ```makeEven``` function. 

In [None]:
# We make numbers even in bespoke, non-reusable code here
outlist = []
for i in range(10):
    if i%2 ==1:
        outlist.append(i+1)
    else:
        outlist.append(i)

print(outlist)


def makeEven(number):
    if number %2 ==1:
        return number+1
    else:
        return number
    
[makeEven(x) for x in range(10)]


Building your own functions is a crucial part of coding. Without functions, you are left with code that is literally just one command after another. With functions you can abstract away the common parts and just send the novel parts to the function as __arguments__. 

We have already seen a few functions such as ```print()``` and ```len()```. Below is an example of some repetitive code, followed by a function that does the same repetitive job in one place. As I mentioned earlier, if you are going more than three loops deep (some would say one loop deep) then you're doing it wrong. One of the things you ought to do rather than writing loops within loops is to call a function. How to do that will become clear as we learn more about these crucial parts of programming. 

In [None]:
x = 5
if x %2 == 1: 
    y = x * 2 
else:
    y = x

x = 8
if x % 2 == 1: 
    y = x * 2
else:
    y = x 
    
x = 10
if x % 2 == 1: 
    y = x * 2 
else:
    y = x
    
    
def doubleIfOdd(number):  # avoid naming the argument `input`, which is a built-in
    if number % 2 == 1:
        return number * 2
    else:
        return number

x = 7
y = doubleIfOdd(x) 
print(y)

This will serve as a tutorial to functions. It is really, really superficial. Later we will make these functions more complicated but for now, let's stick with the bare minimum. 

A function is a piece of code with a name, a place for some input, a place for some calculations and a means to return the results of the calculations back to wherever the function was invoked. 

Imagine I have a function called ```doubleTheNumber``` that literally just takes a number and doubles it.

~~~ python
x = 5
y = doubleTheNumber(x) 
print(y)
> 10
~~~

Now to build that function we need to do the four things specified above: 
- Name
- Inputs
- Calculations
- Outputs

~~~ python
def doubleTheNumber ( input_number): 
    output_number = input_number * 2
    return output_number
~~~ 

The code above has all four things we wanted. Of course, we could have just written ```input_number * 2``` directly, but that's not how we learn how to use functions. We can do lots of things inside a function, and we can then call that function inside a list comprehension. In the ```doubleIfOdd``` example above we did not just double the number, but doubled the number only if it was odd. That way all the numbers that get returned are even.

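Below we use a similar function, ```doubleTheNumberIfEven```, inside of a list comprehension. Note that this one doubles the even numbers and leaves the odd ones alone.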

In [None]:
def doubleTheNumberIfEven ( input_number ): 
    if input_number%2==0:
        return input_number * 2
    else:
        return input_number
    
numbers = [1,4,6,7,9,14,17]

new_numbers = [doubleTheNumberIfEven(i) for i in numbers]

print(new_numbers)

## Important notes on functions 
Functions are a huge topic. These will not be the last notes you'll need. 

### Note 1. Variables have a 'scope'. 
A variable that is created inside of a function is not the same as one created outside of that function, even if they have the same name. This is because the variable inside the function is a __local__ variable. Variables created in Jupyter are typically treated as __global__ variables if they are created in a cell but not in a function. To be global means that they can be used anywhere. Local variables are created and destroyed within their local context. You can watch this behavior with a code snippet. 

In [None]:
# Local / Global scope example 1: Variable in the function stays in there.

def multiplyTheValue(input_number):
    x = input_number * 2
    print("Inside the function",x)
    return x 

x = 4 
print( "Before the function",x)

output_number = multiplyTheValue(x)

print("After the function",x)
print(output_number)

But ```x``` wasn't the argument, input_number was. So what if we change input number inside the function? 

In [None]:
# Local / Global scope example 2: Argument sent to function doesn't escape the function.

x = 4 
# print("Before the function",input_number)

def multiplyTheValue(input_number):
    x = input_number * 2
    input_number = 33
    print("Inside the function",input_number)
    return x 

output_number = multiplyTheValue(x)

print("After the function",input_number)

The same holds for the argument. We changed ```input_number``` to 33 inside the function, yet when we try to print it outside of the function, Python raises a ```NameError```. This is because ```input_number``` was created inside the function, so it isn't available outside the function unless we explicitly make it available (which is often a very bad idea that leads to all sorts of unexpected issues). 

In [None]:
# Local / Global scope example 3: Declaring a variable as global makes it available outside the function.

x = 4
print("Before the function",x)

def multiplyTheValue(input_number):
    global x
    x = input_number * 2
    
    input_number = 33
    print("Inside the function",x)
    return x 

output_number = multiplyTheValue(x)

print("After the function",x)

In this third example, we can see that when we declare ```x``` as a global variable inside the function, the value assigned there becomes the value outside of the function. We double ```x``` inside the function, and when we later print ```x``` it is no longer 4: it retains the value it had inside the function. 
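Since ```global``` is often a bad idea, a more common pattern is to leave the outside variable alone and reassign it from the function's return value. Here is a minimal sketch of that pattern, where ```doubleTheValue``` is a made-up name for illustration:

```python
def doubleTheValue(input_number):
    # Only local names are touched in here.
    return input_number * 2

x = 4
print("Before the function", x)   # Before the function 4

# The change to x happens out here, where we can see it.
x = doubleTheValue(x)
print("After the function", x)    # After the function 8
```

This way anyone reading the cell can see exactly where ```x``` changes, without having to look inside the function.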

In [None]:
print("a","b","c",4456456)

### Note 2. There are all kinds of ways of passing data to a function. 


Functions can take more than one input. Here are some things we can do with inputs: 
1. Just give it a name. 

    ~~~ python 
    def example1( just_name):
        return just_name

    print ( example1("some data") )
    ~~~

2. Give it a name and a default value.

    ~~~ python
    def example2( just_name, name_default = True ):
        if name_default:
            return just_name
        return "Something else"

    print ( example2("some data") )
    ~~~

3. Leave it ambiguous as a list of values. You'll have to query these in order.

    ~~~ python
    def example3( just_name, *args):
        if len(args) > 0:
            for i in args: print(i)

    example3("some data","Maybe","more data")
    ~~~

4. Leave it ambiguous as a dictionary of variable names and values. You'll have to query these by key.

    ~~~ python
    def example4( **kwargs):
        if len(kwargs) > 0:
            for i,j in kwargs.items(): print(i,j)

    example4(var1="some data",var3="Maybe",var2="more data")
    ~~~

[This page from ProTech](https://www.protechtraining.com/content/python_fundamentals_tutorial-functions) gives a nice simple overview of these sorts of arguments. Below you can try these out for yourself.


In [None]:
# Example 1. Just a single positional argument
def example1( just_name):
    print( just_name)

example1("example 1 argument")

In [None]:
# Example 2. A positional argument with a default value
def example2( just_name, name_default = True ):
    if name_default:
        print(just_name)
    else:
        print("name_default was false")

example2("Are the defaults true?",name_default=False)

In [None]:
# Example 3. Positional arguments passed but not defined ahead of time
def example3( just_name, *args):
    if len(args) > 0:
        for i in args: print(i)

example3("some data","Maybe","more data")

In [None]:
# Example 4. Keyword arguments passed but not defined ahead of time
def example4(**kwargs):
    if len(kwargs) > 0:
        for i,j in kwargs.items(): print("var name:",i,"\tvalue:",j)

example4(var1="some data from var1",var3="Maybe it's var3?",var2="var2's valuedata")
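The four kinds of input can also be combined in a single function. The sketch below is only an illustration: ```example5``` and its argument values are made up, and it returns a dictionary simply so we can inspect where each piece of data ended up.

```python
def example5(just_name, *args, name_default=True, **kwargs):
    # Collect everything into a dictionary so we can see what arrived where.
    return {
        "just_name": just_name,
        "args": args,
        "name_default": name_default,
        "kwargs": kwargs,
    }

received = example5("some data", "Maybe", "more data", name_default=False, var1="extra")
print(received["just_name"])     # some data
print(received["args"])          # ('Maybe', 'more data')
print(received["name_default"])  # False
print(received["kwargs"])        # {'var1': 'extra'}
```

Note the order in the definition: the named argument comes first, then ```*args```, then arguments with defaults, then ```**kwargs```.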

In [None]:

def MakeDouble(value):
    try: 
        output = value*2
    except TypeError:
        output = None
        
    return output

print( MakeDouble(2)  )
print( MakeDouble("Double")  )
print( MakeDouble(["2"]))
print( MakeDouble({1:4}))



### Note 3. A function always returns, but it might be nothing at all.

Your function always stops at the return statement. You can have multiple return statements for different conditions (like saying if...return one thing and else...return another). Once a return statement runs, the rest of the function will not be evaluated. But if your function does not have a return, Python will still return ```None``` (which, as you may remember, is treated as ```False``` in a condition). Just try it for yourself. 

In [None]:
def noReturn():
    pass

print( noReturn())

if noReturn(): 
    print("Did it work?")
else:
    print("Oh right, None evaluates to false.")
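
The point about multiple return statements can be sketched in the same way. ```describeNumber``` is a made-up function for illustration; whichever ```return``` runs first ends the function.

```python
def describeNumber(number):
    if number < 0:
        return "negative"   # the function stops here for negative numbers
    if number == 0:
        return "zero"
    return "positive"       # only reached if neither return above ran

labels = [describeNumber(n) for n in [-3, 0, 7]]
print(labels)  # ['negative', 'zero', 'positive']
```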
    


# Section 3. Classes and Objects 

This is likely to be one of the hardest sections you have encountered yet. Not all programming is object-oriented these days. And not all programs need to be object-oriented. Regardless, Python is object-oriented (as are Java, C++, Swift, Objective-C, and Ruby, for example). Starting next week we will be using a lot of external packages in Python. These packages will require you to be familiar with a host of different tools. These tools are normally objects. So learning how an object works, what you can do with it and how you might want to create your own to help with your code is a very good idea. 

Remember before we said that there was a difference between functions and methods, but it wasn't that important. Now it gets a little more important. A __method__ is a function that is called by an object and operates on that object. This means that the method has a notion of a __self__. Self refers to the object that we created. So when we call:

~~~
"".join()
~~~
we are actually invoking a method that's part of the string __Class__. Later, when we work with tweet objects or Reddit objects through the praw package, we will be using methods specific to those objects. 

## Basic object example 
Imagine we have an object class called ```DoubleNumber``` that stores two numbers. We then have two methods, ```getNumberA()``` and ```getNumberB()```. These methods will know that numbers A and B were already stored in the ```DoubleNumber``` object. In fact, we could make it so that the object can't be created without declaring what will be number A and what will be number B. 

~~~ python 

class DoubleNumber: 

    def __init__ (self):
        self.numberA = 0
        self.numberB = 42 
    
    def getNumberA(self): 
        return self.numberA
    
    def getNumberB(self): 
        return self.numberB  
 
dn = DoubleNumber() 

~~~

See in the code above that the class is called DoubleNumber. When you call it by invoking ```dn = DoubleNumber()``` you now have an object called ```dn``` which is a ```DoubleNumber``` object. It comes ready-made with a numberA with a value of 0 and a numberB with a value of 42. You can discover these values by calling the ```get``` methods. Unfortunately, we can't use a method to change these values because we haven't created one! Below in the working code is a more elaborate class that you can use.  It will have get, set and another method for you to use. 



In [None]:
class DoubleNumber: 
    '''an object

    you guessed it
    '''
    
    def __init__(self):
        self.numberA = 0
        self.numberB = 42 
    
    def getNumber(self, number = 'a'): 
        if number == 'a':
            return self.numberA
        elif number == 'b':
            return self.numberB
        else:
            return
    
    def setNumber(self, new_value, number = 'a'):
        if number == 'a':
            self.numberA = new_value 
            return True 
        elif number == 'b': 
            self.numberB = new_value
            return True         
        else:
            return False  

dn = DoubleNumber() 

fn = DoubleNumber() 

print ( dn.getNumber() )

print ( dn.getNumber('b') )

dn.setNumber(9234523425401,'b')

print ( fn.getNumber('b') )

print( fn.numberB, dn.numberB)

In [None]:
import numpy 

x = numpy.array(5)
type(fn)
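
The earlier remark that an object could refuse to be created without a number A and a number B can be sketched as follows. This is a hypothetical variant of the class above (```RequiredDoubleNumber``` is a made-up name), where the init takes both numbers as arguments with no defaults.

```python
class RequiredDoubleNumber:
    def __init__(self, numberA, numberB):
        # No default values, so both numbers must be supplied.
        self.numberA = numberA
        self.numberB = numberB

    def getNumberA(self):
        return self.numberA

    def getNumberB(self):
        return self.numberB

rdn = RequiredDoubleNumber(3, 42)
print(rdn.getNumberA(), rdn.getNumberB())  # 3 42

try:
    RequiredDoubleNumber()      # missing both numbers
except TypeError:
    print("Cannot create the object without number A and number B")
```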

## Reasons to use a class

Part of the reason for showing a class is that it helps us understand the basis of objects, as each object is necessarily an __instantiation__ of some _class_ of object. Next week we will be exploring data wrangling, and this will involve the use of more complex data structures than we have seen up to this point. These structures are all their own kinds of objects with their own features. When you see: 

~~~
import pandas as pd 

df = pd.DataFrame(columns=["name","age"]) 
~~~

You will be able to note that pandas is a library. In this library, which we have imported under the name 'pd' rather than 'pandas', is a class called DataFrame. By calling ```df = pd.DataFrame()``` you are creating an __instance__ of the DataFrame class called df. As an instance of that class it has certain methods available to it. You can discover these with the ```dir(<instance>)``` command, which will list the attributes and methods available to that object. Try it with the list class below:  

In [None]:
dir([])

You will note that the class has an awful lot of methods with ```__``` at the front of them. These are special methods that Python itself calls behind the scenes (for example when you use ```+``` or ```len()```), and you will rarely need to call them directly unless you are doing some really low level hacking. Beyond those are the methods we have already introduced (plus a handful of others). Now if you call ```dir()``` on an object like a DataFrame, you will see a huge number of methods, since DataFrame is a very complex object with a ton of options. 
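
If the double-underscore names get in the way, one possible approach is to filter them out with a list comprehension. This is only a convenience sketch; ```public_methods``` is a made-up variable name.

```python
# Keep only the names that do not start with a double underscore.
public_methods = [name for name in dir([]) if not name.startswith("__")]
print(public_methods)
```

What remains should be the familiar list methods such as ```append``` and ```sort```.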


## Important notes on class files

### Note 1. A class variable is referenced differently inside and outside the class.

In the ```DoubleNumber``` example above we had a ```numberA``` and a ```numberB``` instantiated every time we created a DoubleNumber object. If you are in a method inside the class, such as the ```setNumber``` method, then numberA has to be referenced as ```self.numberA```. If you are outside the class, say perhaps if you create a DoubleNumber object called ```dn``` and you want to access its numberA value directly, it would be ```dn.numberA```. 

### Note 2. Classes always have an init statement at the top, the rest is up to you. 

Actually, you can cheat a little bit here and not even have an init statement, in which case Python falls back on a default one. Init is the method that is run whenever you create a new object of that class type. If you have a program that creates tweets for processing, then anything you store in the init (such as when you created the tweet) will be available later. 
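
The tweet idea can be sketched roughly as below. Note that this ```Tweet``` class is entirely hypothetical and is not the class from any Twitter package; it only shows values being stored in the init and used again later.

```python
from datetime import datetime

class Tweet:
    """Made-up class for illustration only."""

    def __init__(self, text):
        # Both values are stored when the object is created...
        self.text = text
        self.created_at = datetime.now()

    def summary(self):
        # ...and are still available here, via self, much later.
        return self.text[:20] + " (" + self.created_at.strftime("%Y-%m-%d") + ")"

tweet = Tweet("Learning about classes today")
print(tweet.text)        # outside the class we write tweet.text, not self.text
print(tweet.summary())
```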

### Note 3. Classes can have subclasses

If you create a class, it can inherit the methods of another class. This is a common pattern in Python programming. That is, the system will provide a class with some empty methods and then you create a subclass that overwrites those methods with ones that do the things you want to do. This can be a bit abstract, so we will just leave it here. Later on you will see classes like HTMLparser and TweetStreamer that want you to create a subclass to handle data that comes in. Below is a basic example of this.

In [None]:
class Fruit():
    def __init__ (self):
        self.name = "Fruit"
    
    def getName(self):
        return self.name
    
class Apple(Fruit):  # This is Apple inheriting the methods and properties of Fruit
    def __init__ (self,variety=""):
        self.name = "Apple"
        self.variety = variety

granny_smith = Apple(variety="Granny Smith")
print(granny_smith.getName())

Notice in the above example, we did not make a ```getName``` for ```Apple```? Instead we _inherited_ it from ```Fruit```. 
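
The other pattern mentioned above, where a provided class has an empty method that your subclass overwrites, might look something like this. ```DataHandler``` and ```ShoutingHandler``` are made-up names standing in for what a real package would provide; this is a sketch of the idea, not any particular library's API.

```python
class DataHandler:
    """Hypothetical base class, standing in for one a package might provide."""

    def __init__(self):
        self.seen = 0

    def handle(self, item):
        # Empty on purpose: subclasses are expected to replace this.
        pass

    def feed(self, items):
        # The base class does the looping and calls handle() on each item.
        for item in items:
            self.seen += 1
            self.handle(item)

class ShoutingHandler(DataHandler):
    def __init__(self):
        super().__init__()        # run DataHandler's init first
        self.collected = []

    def handle(self, item):       # overwrites the empty method
        self.collected.append(item.upper())

handler = ShoutingHandler()
handler.feed(["kermit", "fozzie"])
print(handler.collected)  # ['KERMIT', 'FOZZIE']
print(handler.seen)       # 2
```

The subclass only writes the part that is specific to its job; the looping in ```feed``` is inherited as is.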

# Section 4. Brief note on style. 

There are different ways to write variable names. There is: 

A. concatenatewords

B. underscore_words

C. camelCase

D. also, CamelCase

E. hyphen-words

F. UPPERCASE

So when are these used, is there any consistency and why? These tend to be slightly different depending on program and project, but there are some general guidelines. 

A. Classes: 
    - CamelCase, like DataFrame or DoubleNumber
B. Functions 
    - camelCase like getNumber. Function names are more inconsistent. 
C. Variables
    - underscore_words like muppet_list. 
D. Static variables. 
    - UPPERCASE like DOWNLOADDIR. Upper case means that the variable should not change throughout the program, like a setting or a REGISTRY_KEY in Windows.
    
These are rarely consistent, but this tends to be the case. It should help you navigate variable names and make your code and the code of others easier to read. 
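
To see the guidelines side by side, here is a small sketch. All of the names in it (```DOWNLOAD_DIR```, ```MuppetShow```, ```getMuppetCount```) are made up purely to illustrate the conventions.

```python
DOWNLOAD_DIR = "downloads"             # UPPERCASE: a setting that should not change

class MuppetShow:                      # CamelCase: a class
    def __init__(self, muppet_list):
        self.muppet_list = muppet_list # underscore_words: a variable

    def getMuppetCount(self):          # camelCase: a function or method
        return len(self.muppet_list)

show = MuppetShow(["Kermit", "Miss Piggy", "Gonzo"])
print(show.getMuppetCount())  # 3
```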