# Advanced Python Objects

While functions play a big role in the Python ecosystem, Python also has classes, which can have attached methods and be instantiated as objects. You will use objects a lot in Python, but you are less likely to create new classes when you use the interactive environment, because it's a bit verbose. Still, I think it's important to go over a few details of objects in Python, so that you aren't surprised when you see them.

First, you can define a class using the `class` keyword, followed by the class name and ending with a colon. Anything indented below this is within the scope of the class. Classes in Python are generally named using camel case, which means the first character of each word is capitalized.

You don't declare variables within the object, you just start using them. Class variables can also be declared. These are just variables which are shared across all instances. So in this example, we're saying that the default category for every `Animal` is herbivore.

To define a method, you just write it as you would a function. The one change is that, to have access to the instance on which a method is being invoked, you must include `self` as the first parameter in the method signature. Similarly, if you want to refer to instance variables set on the object, you prefix them with the word `self` and a full stop, as in `self.name`.

In this definition of an animal, for instance, we have written two methods, `set_name` and `set_food`. Both change instance-bound variables, called `name` and `food` respectively. When we run this cell, we see no output. The class exists, but we haven't created any objects yet. We can instantiate this class by calling the class name with empty parentheses after it.

In [1]:
class Animal:
    category = "herbivore"
    
    def set_name(self, new_name):
        self.name = new_name
    def set_food(self, new_food):
        self.food = new_food
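
As a minimal sketch of instantiation, the class above can be repeated and an object created from it. The variable name `deer` and the values `"Bambi"` and `"grass"` are hypothetical, chosen only for illustration.

```python
class Animal:
    category = "herbivore"  # class variable, shared by all instances

    def set_name(self, new_name):
        self.name = new_name  # instance variable, created on first assignment

    def set_food(self, new_food):
        self.food = new_food


# Calling the class name with empty parentheses creates a new instance.
deer = Animal()
deer.set_name("Bambi")
deer.set_food("grass")

# Attributes are read with dot notation; the class variable is visible too.
print("{} eats {} and is a {}".format(deer.name, deer.food, deer.category))
```

Note that `name` and `food` do not exist on the object until the corresponding setter has been called, since nothing declares them in advance.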

Then we can call methods and print out attributes of the object using dot notation, which is common in most languages. There are a couple of implications of object-oriented programming in Python that you should take away from this very brief example.

1. Objects in Python do not have private or protected members. If you instantiate an object, you have full access to any of the methods or attributes of that object.
2. There's no need for an explicit constructor when creating objects in Python. You can add a constructor if you want to by declaring the `__init__` method.
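
Both points can be sketched in a few lines. The class `Pet`, its attributes, and the values used here are hypothetical and not part of the example above; the sketch assumes only standard Python behavior.

```python
class Pet:
    category = "herbivore"  # class variable, shared by all instances

    def __init__(self, name, food):
        # __init__ runs automatically when Pet(...) is called.
        self.name = name
        self.food = food


rabbit = Pet("Thumper", "clover")
goat = Pet("Billy", "hay")

# Nothing is private: any code holding the object can read or rebind
# its attributes directly, without going through a method.
rabbit.food = "carrots"

print(rabbit.name, rabbit.food, rabbit.category)
print(goat.name, goat.food, goat.category)
```

With an `__init__` method in place, the instance variables are set at creation time, so there is no window in which `name` or `food` is missing from the object.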

Now I'm not going to dive any more into Python objects, because there's lots of subtlety and, to be honest, most of the object-oriented features of Python aren't really all that salient for an introduction to data science. If you'd like to learn more, I'd recommend checking out the Python tutorial in the official Python documentation. It's a fairly comprehensive overview of the object features of the language, and there will be a reference in the class resources.

# `map()`

The `map` function is one of the bases for functional programming in Python. Functional programming is a programming paradigm in which you explicitly declare all parameters which could change through execution of a given function. Thus functional programming is referred to as being side-effect free, because there is a software contract that describes what can actually change by calling a function. Now, Python isn't a functional programming language in the pure sense, since functions can have many side effects, and you certainly don't have to pass in as parameters everything that you're interested in changing.

But functional programming causes one to think more heavily about chaining operations together. And this really is a sort of underlying theme in much of data science, and data cleaning in particular. So functional programming methods are often used in Python, and it's not uncommon to see a parameter for a function be a function itself. The `map` built-in function is one example of a functional programming feature of Python that I think ties together a number of aspects of the language. The `map` function signature looks like this. The first parameter is the function that you want executed, and the second parameter, and every following parameter, is something which can be iterated upon.

`map(function, iterable, ...)`

All the iterable arguments are unpacked together: one item is taken from each iterable at a time, and those items are passed as arguments into the given function. That's a little cryptic, so let's take a look at an example. Imagine we have two lists of numbers, maybe prices from two different stores on exactly the same items. And we want to find the minimum that we would have to pay if we bought the cheaper item between the two stores. To do this, we could iterate through each list, comparing items and choosing the cheapest. With `map`, we can do this comparison in a single statement.

In [4]:
store1 = [10, 11, 12.34, 32.4]
store2 = [9.00, 21.00, 23.42, 1.34]
cheapest = map(min, store1, store2)
cheapest

<map at 0x110c4f080>

But when we go to print out the map, we see that we get an odd reference value instead of the list of items that we're expecting. This is called **lazy evaluation**. In Python, **the `map` function returns to you a `map` object. It doesn't actually try to run the function `min` on two items until you look inside for a value.** This is an interesting design pattern of the language, and it's commonly used when dealing with big data. It allows for very efficient memory management, because values are computed one at a time, only when they are requested, even though the computation itself might be complex.

In [5]:
for i in cheapest:
    print(i)

9.0
11
12.34
1.34
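
To see the lazy behavior more directly, here is a small sketch that records when the mapped function actually runs. The helper `noisy_min` and the `calls` list are hypothetical names introduced only for this illustration; the store prices are the same as above.

```python
calls = []  # records every pair of arguments that is actually evaluated


def noisy_min(a, b):
    """Behave like min() on two values, but log each call."""
    calls.append((a, b))
    return min(a, b)


store1 = [10, 11, 12.34, 32.4]
store2 = [9.00, 21.00, 23.42, 1.34]

cheapest = map(noisy_min, store1, store2)
calls_before = len(calls)       # nothing has been evaluated yet

first = next(cheapest)          # pulls one value, so one call happens
calls_after_first = len(calls)

remaining = list(cheapest)      # consumes the rest of the iterator
leftover = list(cheapest)       # the map object is now exhausted

print(calls_before, calls_after_first, first)
print(remaining)
print(leftover)
```

The last line prints an empty list: a `map` object is an iterator that can be consumed only once, so wrap it in `list()` if you need to look at the values more than once.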


In [12]:
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson', 'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

def split_title_and_name(person):
    title = person.split()[0]
    lastname = person.split()[-1]
    return '{} {}'.format(title, lastname)

list(map(split_title_and_name, people))

['Dr. Brooks', 'Dr. Collins-Thompson', 'Dr. Vydiswaran', 'Dr. Romero']

# lambda

Lambdas are Python's way of creating anonymous functions. These are the same as other functions, but they have no name. The intent is that they're simple or short-lived, and it's easier just to write out the function in one line instead of going to the trouble of creating a named function. The lambda syntax is fairly simple, but it might take a bit of time to get used to.

You declare a lambda function with the word `lambda`, followed by a list of arguments, followed by a colon and then a single expression. And this is key: there's only one expression to be evaluated in a lambda. The value of that expression is returned on execution of the lambda.

In [15]:
my_function = lambda a, b, c: a + b + c

In [16]:
my_function(1,2,3)

6

Note that you can't have statements or complex logic inside of the lambda itself, because you're limited to a single expression. (Lambda parameters can have default values, just like the parameters of a named function.) So lambdas are really much more limited than full function definitions. But I think they're very useful for simple little data cleaning tasks. And you'll see lots of examples with them on the web. So you should be able to read and write lambdas. Let's give it a try here.

In [17]:
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson', 'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

def split_title_and_name(person):
    return person.split()[0] + ' ' + person.split()[-1]

#option 1
for person in people:
    print(split_title_and_name(person) == (lambda x: x.split()[0] + ' ' + x.split()[-1])(person))

#option 2
list(map(split_title_and_name, people)) == list(map(lambda person: person.split()[0] + ' ' + person.split()[-1], people))

True
True
True
True


True
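
One place where short lambdas come up often is as the `key` argument to the built-in `sorted` function. As a sketch, assuming we want the same list of people ordered by last name (a task that is not part of the example above, and `by_last_name` is a made-up variable name):

```python
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson',
          'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

# The lambda extracts the last word of each string; sorted() compares those
# extracted values instead of the full strings.
by_last_name = sorted(people, key=lambda person: person.split()[-1])

for person in by_last_name:
    print(person)
```

Writing a named function for the key would work just as well; the lambda simply keeps a one-off rule next to the place it is used.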

# List Comprehensions

We've learned a lot about sequences and collections in Python: tuples, lists, dictionaries and so forth. These are structures that we can iterate over, and often we create them through loops or by reading in data from a file. Python has built-in support for creating lists using a more abbreviated syntax called **list comprehensions**.


In [21]:
my_list = []
for num in range(0, 10):
    if num % 2 == 0:
        my_list.append(num)

In [22]:
my_list

[0, 2, 4, 6, 8]

We can rewrite this as a list comprehension by pulling the iteration onto one line. We start the list comprehension with the value we want in the list. In this case, it's a number. Then we put in the for-loop, and then finally, we add any condition clauses. You can see that this is a much more compact format. And it tends to be faster as well.

In [24]:
my_list = [num for num in range(0, 10) if num % 2 == 0]
my_list

[0, 2, 4, 6, 8]

Just like lambdas, list comprehensions are a condensed format which may offer readability and performance benefits, and you'll often find them being used in data science tutorials or on Stack Overflow.
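
As a rough sketch of how the two styles relate, the title-and-last-name task from the lambda section can be written either with `map` and a lambda or as a list comprehension. The variable names `with_map` and `with_comprehension` are made up for this illustration.

```python
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson',
          'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

# map() applies the lambda to each person; list() forces the lazy map object.
with_map = list(map(lambda person: person.split()[0] + ' ' + person.split()[-1],
                    people))

# The comprehension states the same thing: the value first, then the loop.
with_comprehension = [person.split()[0] + ' ' + person.split()[-1]
                      for person in people]

print(with_map == with_comprehension)
print(with_comprehension)
```

Which form reads better is largely a matter of taste; the comprehension avoids the extra `list()` call because it builds the list directly.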

In [25]:
def times_tables():
    lst = []
    for i in range(10):
        for j in range(10):
            lst.append(i*j)
    return lst

times_tables() == [j*i for i in range(10) for j in range(10)]


True
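
The comparison above holds because the `for` clauses in a comprehension nest in the order they are written, with the leftmost clause acting as the outer loop. A small sketch with hypothetical, shorter ranges makes the order visible:

```python
# Nested for statements: i is the outer loop, j is the inner loop.
pairs_loop = []
for i in range(2):
    for j in range(3):
        pairs_loop.append((i, j))

# The same clauses in the same left-to-right order inside a comprehension.
pairs_comprehension = [(i, j) for i in range(2) for j in range(3)]

print(pairs_comprehension)
print(pairs_loop == pairs_comprehension)
```

Swapping the two `for` clauses in the comprehension would produce the same pairs in a different order, so the clause order matters whenever the order of the resulting list does.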