# Intermediate Concepts in Python

## Objectives

By the end of this lesson you will:
- Understand iterators and generators and implement them in your own code
- Have some basic familiarity with the functional programming paradigm
- Understand how to implement list comprehensions, lambda functions, and various functional programming methods to tighten up and speed up your workflow

<a id='toc'></a>
## Table of Contents

- [Iterators](#iterators)
- [Generators](#generators)
- [Introduction to Functional Programming](#functional)
- [Higher-Order Functions and Decorators](#decorators)
- [Next Steps](#nextsteps)
- [Appendix](#appendix)

<a id='iterators'></a>
## Iterators
([Back to the Table of Contents](#toc))

To understand iterators, we must first be comfortable with **iterables** -- which, perhaps without realizing it, you already are!

In layman's terms, an *iterable* is anything you can *iterate* over. This includes lists and dictionaries, but also strings!

In [1]:
for i in ['a', 'b', 'c']:
    print(i)

a
b
c


In [2]:
for i in {1:'a', 2:'b', 3:'c'}:
    print(i)

1
2
3


In [3]:
for i in 'abc':
    print(i)

a
b
c


### iter and next

So, in the for-loops above, what's happening under the hood?

First, let's get technical. An iterable is an object you can pass to the built-in function **iter**, which in turn calls the object's **\__iter__** method. That call returns an iterator object.

An iterator is an object with a **\__next__** method, which hands back the items of an iterable (e.g., a list or dictionary) one at a time. You normally invoke it through the built-in function **next**. (An iterator also has its own **\__iter__** method, which simply returns the iterator itself.)

More concretely, first we create a list ['a', 'b', 'c']. This list is our iterable.

Invoking a for loop passes this list to **iter**, which returns an iterator.

The loop then calls **next** on the iterator, which 'remembers' where we are in the list. The first call returns 'a'. After we have completed the first iteration of the loop, the following call returns the second element ('b') and we repeat the loop.

Once we get to the end of the list, **next** raises a **StopIteration** exception, which the for loop catches in order to end the loop.
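
To make this concrete, here is a small sketch that does by hand what a for loop does for us, using the built-in functions `iter()` and `next()` (which call `__iter__` and `__next__` behind the scenes). The helper name `manual_for` is made up for illustration.

```python
letters = ['a', 'b', 'c']

# Ask the list (our iterable) for an iterator
iterator = iter(letters)

# Each call to next() hands back one item and remembers the position
first = next(iterator)
second = next(iterator)
third = next(iterator)

# A fourth call has nothing left to return, so it raises StopIteration
try:
    next(iterator)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)


def manual_for(iterable):
    """Hypothetical helper: collect items the way a for loop would visit them."""
    collected = []
    it = iter(iterable)
    while True:
        try:
            item = next(it)
        except StopIteration:
            break  # this is the signal a for loop uses to stop
        collected.append(item)
    return collected


print(manual_for('abc'))
```

The same helper works on a string, a list, or a dictionary (where it visits the keys), because all of them are iterables.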

<a id='generators'></a>
## Generators
([Back to the Table of Contents](#toc))

The issue with the examples above is that each one loops over a list, and a list has to exist in memory in its **entirety** before we can loop over it, which can be memory intensive.

Here are two instances where this could be a problem.

First, consider the case where we want to sum every third number from 1 to some large *n*. In such a case, given what we know so far, we could create a list of all such numbers using a loop (increment by 3 until you hit n), and then find the sum of that list.

In [4]:
n = 10000
i = 1
l = []
while i <= n:
    l.append(i)
    i += 3

10,000 is a small number so this should've worked fine for you. But you can imagine we'd quickly hit a problem if we increased by several orders of magnitude or complicated the expressions within the loop in any way.

Think about our task: all we want to do is sum the *next* number in our loop to the numbers we already have. Why build a long list and store it in memory until the end, when we can then sum it?

What we want to do is start the loop, get the first number, STOP, and add it to a running total. Then, we can forget that number, get the next one, STOP, and add it to the total. In this case, we are only ever holding two numbers in memory: the running total and the current number.
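
As a rough sketch of that idea, the loop below keeps only a running total and the current number instead of building a list. The variable name `total` is my own choice for illustration.

```python
n = 10000
i = 1
total = 0  # running total; no list is ever built
while i <= n:
    total += i  # use the current number right away
    i += 3      # then move on and forget it

print(total)
```

This works, but the counting logic and the summing logic are now tangled together in one loop, which is the problem generators help with.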

Here is one more example of where building entire lists gets in our way.

Say we have a function that creates a list counting by three to some specified n:

In [5]:
def countbythree(end):
    i = 1
    l = []
    while i <= end:
        l.append(i)
        i += 3
    return l

countbythree(100)

[1,
 4,
 7,
 10,
 13,
 16,
 19,
 22,
 25,
 28,
 31,
 34,
 37,
 40,
 43,
 46,
 49,
 52,
 55,
 58,
 61,
 64,
 67,
 70,
 73,
 76,
 79,
 82,
 85,
 88,
 91,
 94,
 97,
 100]

But we also have a second function that does *something* to every element of that list:

In [6]:
def multiplybythree(tomultiply):
    l = []
    for i in tomultiply:
        l.append(i*3)
    return l

The problem with this is that you first have to create the entire list that gets passed in as *tomultiply*. Only then can you create the second list. That is, there's no way to create both concurrently: the second function cannot start until the first has finished building its whole list.

Rather, to overcome this issue, we need to utilize **generators**. A generator works much like a function, but **yields** a value and then pauses. Normally, a function has to run through its loop entirely before it hands anything back, but a generator can hand back one value, pause, and wait. When you are ready to continue, you can use **next** to get the next value.

To create a generator, you simply create a function. However, rather than **return**ing some object at the end, you **yield** it. Calling such a function automatically creates a generator.

Check this out:

In [7]:
def countbythree(end):
    i = 1
    while i <= end:
        yield i  # hand back the current value and pause here
        i += 3

for value in countbythree(100):
    print(value)

1
4
7
10
13
16
19
22
25
28
31
34
37
40
43
46
49
52
55
58
61
64
67
70
73
76
79
82
85
88
91
94
97
100


Do you see how this is different from what we've been doing?

Hint: Take a good long look at this:
> for value in countbythree(100):

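We're looping over the result of a function call! Calling countbythree(100) returns a generator, and the generator "remembers" where it is inside the function body: it runs until it reaches **yield**, hands over a value, and only proceeds once the for-loop asks for the next one.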
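
You can watch this pausing behavior yourself by calling `next()` on the generator by hand, as in this small sketch (the names `gen`, `a`, `b`, and `rest` are just for illustration):

```python
def countbythree(end):
    i = 1
    while i <= end:
        yield i
        i += 3


gen = countbythree(10)  # nothing inside the function has run yet

a = next(gen)     # runs until the first yield
b = next(gen)     # resumes where it paused and runs to the next yield
rest = list(gen)  # consumes whatever is left

print(a, b, rest)
```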

I won't say much more on generators except that they tend to be lighter on memory, because values are produced one at a time rather than stored all at once, which matters when you're working with larger datasets. For now, it's enough to know that these are a powerful resource and you can develop your familiarity with these further as you progress in your learning.
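
As a sketch of how generators address the two problems from earlier in this section, the example below sums the numbers without building a list, and chains two generators so that each number is multiplied as soon as it is produced. The name `multiplybythree_lazy` is made up here; it is simply a generator version of the multiplybythree function above.

```python
def countbythree(end):
    i = 1
    while i <= end:
        yield i
        i += 3


def multiplybythree_lazy(tomultiply):
    """Hypothetical generator version of multiplybythree."""
    for i in tomultiply:
        yield i * 3


# Problem 1: sum the numbers one at a time, with no list in between
total = sum(countbythree(10000))

# Problem 2: feed one generator into the other; both advance together
tripled = list(multiplybythree_lazy(countbythree(10)))

print(total)
print(tripled)
```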

<a id='functional'></a>
## Introduction to Functional Programming
([Back to the Table of Contents](#toc))

As I mentioned in the last tutorial, there are several different programming paradigms and Python has ways of handling multiple such coding philosophies.

While Python is often introduced to beginners as an object-oriented programming language, it is clear that different users can develop their own experiences with the language (e.g., developers probably treat it more like an object-oriented language, while data analysts probably treat it more like a procedural or functional language).

Functional programming expresses a program as a series of functions applied to data. In each line, we are telling our program what to do, not necessarily how to do it (the how is defined elsewhere). It has the advantage of being more logical in some ways ("Here is an input. Do something to it. Give me the output.") and is often praised for being faster and more concise. At first, it is harder to read (you often have to think through what is being done at each stage), but once you develop a feel for it, it may be quicker to read than even OOP.

Below, we'll go over list comprehensions, which are the more 'pythonic' way of implementing functional programming tasks (by 'pythonic', I mean that Python's creator and community style guide prefer list comprehensions and seem to not love other functional programming methods).

### List Comprehensions

One of the brilliant things about loops is that you can loop through a set of values to dynamically accomplish some other task.

In [8]:
evensto10 = [2, 4, 6, 8, 10]
oddsto10 = []
# Loop through the list evensto10 to build the list oddsto10
for i in evensto10:
    oddsto10.append(i-1)
oddsto10

[1, 3, 5, 7, 9]

You can even add conditions within loops:

In [9]:
evensto20 = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
oddsto10 = []
# Loop through the list evensto20, keeping only values up to 10, to build oddsto10
for i in evensto20:
    if i <= 10:
        oddsto10.append(i-1)
oddsto10

[1, 3, 5, 7, 9]

However, this is lengthy, and the more conditions you add, the farther you have to indent code to the right. Fortunately, Python has a concept called a list comprehension that achieves this task easily and parsimoniously.

In [10]:
evensto10 = [2, 4, 6, 8, 10]
oddsto10 = [i - 1 for i in evensto10] # "Calculate 'i - 1' for each item, which we will call 'i', in the list evensto10"
oddsto10

[1, 3, 5, 7, 9]

You can even add conditions to the list comprehension:

In [11]:
evensto20 = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
oddsto10 = [i - 1 for i in evensto20 if i <= 10]
oddsto10

[1, 3, 5, 7, 9]

By the way, did you notice that I could create the list evensto20 with a list comprehension?

In [12]:
evensto20 = [i for i in range(1,21) if i%2==0]
evensto20

[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

Note that this concept is not limited to lists. You can do something similar with dictionaries and other iterables as well.
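
For instance, here is a short sketch of a dictionary comprehension, a set comprehension, and a generator expression (a comprehension written with parentheses that produces values lazily). The variable names are my own and only for illustration.

```python
evens = [i for i in range(1, 21) if i % 2 == 0]

# Dictionary comprehension: map each even number up to 10 to the odd number before it
odd_for_even = {i: i - 1 for i in evens if i <= 10}

# Set comprehension: the distinct remainders when dividing by 3
remainders = {i % 3 for i in evens}

# Generator expression: like a list comprehension, but no list is built
sum_of_odds = sum(i - 1 for i in evens)

print(odd_for_even)
print(remainders)
print(sum_of_odds)
```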

### Lambda functions (or anonymous functions)

Recall that you can create a function as follows:

In [13]:
def getodds(evens):
    odds = [i-1 for i in evens]
    return odds

Then, to get a list of odds, you'd just need to pass a list of evens to the function

In [14]:
evens = [i for i in range(1,21) if i%2==0]
getodds(evens)

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]


Named functions are useful if you want to reuse the same logic repeatedly. However, often you don't want to take the time to define a function that you will call only once or use in one specific routine. Or, you might just have a really limited function that performs one specific task (rather than a series of related tasks).

In this case you can create an unnamed function called a lambda, or anonymous, function.

In [15]:
printstatement = lambda x: print(x)
printstatement("Hello, World")

Hello, World


Here's a clearer example.

In [16]:
add = lambda x, y : x + y  # named add rather than sum, so Python's built-in sum() is not overwritten
add(20, 30)

# Can you see how this is identical to the following:

def add(x, y):
    return x + y

In [17]:
#With comments

add = lambda x, y : x + y
# Create a function that takes two parameters called x and y
# After the colon, write an expression that uses x and y;
# the value of that expression is what the function returns
# Finally, store this function under the name **add**

add(20, 30)

50

The creator of Python and much of the Python community are fine with everything we have discussed to this point. Some in the community prefer to avoid the following functions where possible, but they can be very useful, and it is easy to see why some people prefer them.

### map

**map** is one of these extraordinarily useful functions. It allows us to perform some function over some iterable.

In [18]:
evens = [i for i in range(1,21) if i%2==0]
getodds = map(lambda x:x-1, evens) 
# for each item -- let's call them x -- in the list called evens, 
# subtract 1

list(getodds) # create a list from the map object

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

**map** is often used as a replacement for a for-loop. A **map** expression is more concise, and it can be faster, although how much depends on the function being applied.
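
To see how these pieces relate, the sketch below writes the same transformation three ways (for-loop, **map**, and list comprehension) and also shows that **map** accepts more than one iterable. The variable names are chosen for illustration.

```python
evens = [i for i in range(1, 21) if i % 2 == 0]

# 1. The for-loop version
odds_loop = []
for x in evens:
    odds_loop.append(x - 1)

# 2. The map version
odds_map = list(map(lambda x: x - 1, evens))

# 3. The list comprehension version
odds_comp = [x - 1 for x in evens]

# map can also walk several iterables in step, one argument from each
pair_sums = list(map(lambda x, y: x + y, [1, 2, 3], [10, 20, 30]))

print(odds_loop == odds_map == odds_comp)
print(pair_sums)
```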

### filter

**filter**, like **map**, iterates over every element in some iterable. However, rather than performing some function over each element, it simply returns those values for which some condition is true.

Again, this helps avoid a lengthy for-loop full of if-else expressions and can be faster.

In [19]:
from random import shuffle
evens = [i for i in range(1,21) if i%2==0]
shuffle(evens) # randomizes order of elements in list
evens

[6, 4, 2, 20, 14, 8, 16, 12, 18, 10]

In [20]:
evensto10 = filter(lambda x: x <= 10, evens)
evensto10

<filter at 0x560f1d0>

More on this soon, but note this returns a filter object. You have to convert this into a list:

In [21]:
list(evensto10)

[6, 4, 2, 8, 10]

In [22]:
# Or, all at once:
evensto10 = list(filter(lambda x: x <= 10, evens))
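
Because **filter** and **map** both accept any iterable, you can feed one into the other. The sketch below (with illustrative variable names) chains them and compares the result with the equivalent list comprehension from earlier.

```python
evens = [i for i in range(1, 21) if i % 2 == 0]

# Keep only the values up to 10, then subtract 1 from each survivor
chained = list(map(lambda x: x - 1, filter(lambda x: x <= 10, evens)))

# The same result written as a list comprehension with a condition
comprehension = [i - 1 for i in evens if i <= 10]

print(chained)
print(chained == comprehension)
```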

### reduce

**reduce** is a really cool function. It performs some operation on two items in a list. Then, it stores the result and performs that *same* operation on the result and a third item on the list. **reduce** repeats the process until there is nothing more to iterate over.

Recall the function that we created in the last notebook:

In [23]:
def addTwo(x, y):
    return x + y

The problem with this function is it only allows us to perform an operation on a pair of values at a time. So, we would not be able to get the sum of a list of numbers. However, this is precisely where **reduce** is useful!

In [24]:
evens = [i for i in range(1,21) if i%2==0]

In [25]:
from functools import reduce
reduce((lambda x, y: x + y), evens)

110
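
If it helps to see what **reduce** is doing step by step, here is a sketch that spells out the same accumulation as a plain loop. It also shows the optional third argument to `reduce`, a starting value, which is what gets returned when the iterable is empty. The variable names are mine.

```python
from functools import reduce

evens = [i for i in range(1, 21) if i % 2 == 0]

total = reduce(lambda x, y: x + y, evens)

# The same accumulation written out by hand
running = evens[0]
for value in evens[1:]:
    running = running + value  # the stored result combined with the next item

# With a starting value: 1 * 1 * 2 * 3 * 4
product = reduce(lambda x, y: x * y, [1, 2, 3, 4], 1)

# With an empty list, the starting value is simply returned
empty_total = reduce(lambda x, y: x + y, [], 0)

print(total, running, product, empty_total)
```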

<a id='decorators'></a>
## Higher-Order Functions
([Back to the Table of Contents](#toc))

A higher-order function is one where at least one of the parameters it takes is a function and/or it returns a separate function as a result.
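
Here is a minimal sketch of both flavors. The names `apply_twice` and `make_adder` are made up for illustration: the first takes a function as a parameter, and the second returns a new function.

```python
def apply_twice(func, value):
    """Takes a function as a parameter and applies it two times."""
    return func(func(value))


def make_adder(n):
    """Returns a new function that adds n to whatever it is given."""
    def adder(x):
        return x + n
    return adder


add_three = make_adder(3)           # add_three is itself a function
result = apply_twice(add_three, 10)  # (10 + 3) + 3

print(result)
```

Note that **map**, **filter**, and **reduce** are themselves higher-order functions, since each takes a function as its first argument.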

### Decorators

Decorators extend what a function does in a concise way, by wrapping it in another function rather than *explicitly* editing its code.

Let's start with the following nested function:

In [26]:
def playfootball(playtype):
    def initializeplay():
        print("The play has started.")
        playtype()
        print("The play ended.")
    return initializeplay
    
def runplay():
    print("They run the ball.")
    
def passplay():
    print("They pass the ball.")

In [27]:
foo = playfootball(runplay)
foo()

The play has started.
They run the ball.
The play ended.


Whoa! What happened?

So, we first created three functions. Since we did not execute these functions, these were just directions for what to do when called.

Then we called *playfootball* with the function *runplay* as its argument. *playfootball* defined the inner function *initializeplay* and returned it without running it, and we stored that returned function under the name *foo*. In effect we told Python, "Hey, I want a version of runplay wrapped inside *playfootball* -- but don't run anything just yet."

Finally, we execute *foo*, which is really *initializeplay*. It prints a statement, executes the function that was passed to *playfootball* (here, *runplay*), and prints a final statement.

Okay, but let's take this one step further and use the decorator syntax "@". Note that this produces an equivalent output, but saves a line of code (which I've commented out).

In [28]:
def playfootball(playtype):
    def initializeplay():
        print("The play has started.")
        playtype()
        print("The play ended.")
    return initializeplay

@playfootball
def runplay():
    print("They run the ball.")

@playfootball
def passplay():
    print("They pass the ball.")

#foo = playfootball(runplay) # we don't need this anymore
runplay()
passplay()

The play has started.
They run the ball.
The play ended.
The play has started.
They pass the ball.
The play ended.


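What happened here? The line @playfootball signals that the function defined right below it will be passed to **playfootball**, and that the function's name (e.g., *runplay*) will from then on refer to whatever **playfootball** returns. In this way, it is both good form (because it makes it easy to read) and serves a clear function.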
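
The wrapper in the example above only works for functions that take no arguments and return nothing. As a sketch of a more general pattern, the version below forwards any arguments and the return value, and uses `functools.wraps` from the standard library so the decorated function keeps its own name and docstring. The decorator name `announce` and the `yards` parameter are my own inventions for illustration.

```python
from functools import wraps


def announce(func):
    @wraps(func)  # copy func's name and docstring onto the wrapper
    def wrapper(*args, **kwargs):
        print("The play has started.")
        result = func(*args, **kwargs)  # forward whatever arguments were given
        print("The play ended.")
        return result                   # pass the return value back out
    return wrapper


@announce
def runplay(yards):
    """Run the ball for a number of yards."""
    print(f"They run the ball for {yards} yards.")
    return yards


gained = runplay(7)
print(gained, runplay.__name__)
```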

Also, I struggle with this so shout out to [Real Python](https://realpython.com/primer-on-python-decorators/) for making this easy to follow.

<a id='nextsteps'></a>
## Hey, wait...is that all? What do I do now???
([Back to the Table of Contents](#toc))

Glad to see you've been enjoying these tutorials. Okay, so now you know way more than enough to get started with using Python in your day-to-day tasks. Before I started writing these tutorials, I only understood about half of these concepts, but I was still able to write code to perform myriad tasks.

Your next step should be to learn some Python packages. If you want to learn data analysis or data science, your next step is likely learning **pandas** (which is the fourth and final tutorial in this series). However, you should also learn other packages. I don't have a list of things you should learn, but I do have one piece of advice:

**Think of a project and pursue it relentlessly.** When I started, I wanted to figure out when we could expect NBA players to start declining in skill. I set out to create an age curve, plotting player experience against player productivity. To do this, I had to learn how to download data from the Internet, store this information in a CSV file, and scrape the CSV for the relevant data points. I expanded by storing the final dataset in a SQL database. Then, I created a short script that would allow users to query the database (e.g., for a player or position), and would finally use the website plot.ly to output a graphic. 

I knew none of this when I started. I completed the project within a couple of months of being introduced to Python. To do this, I had to learn a dozen packages and how to connect to an API. Most importantly, I learned to read documentation and navigate StackOverflow. 

The point is this: **if you pick a project, you will have internal motivation and the project will create a natural syllabus for you.** Pursuing your own project, I believe, is the best way to move forward.

<a id='appendix'></a>
## Appendix
([Back to the Table of Contents](#toc))

### Underscores in method names (e.g., \__init\__, \__iter\__)

From the [style guide](https://www.python.org/dev/peps/pep-0008/#descriptive-naming-styles):

> These are "magic" objects or attributes that live in user-controlled namespaces.

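Not much more to say. You will only run into a handful of these names in everyday code, so when you do see them, it can be confusing. However, it's important to recognize these are just names that the creators of Python have deemed special in some way: Python calls them for you at particular moments (for example, **\__init__** when an object is created).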
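
As one illustration, here is a sketch of a class that defines three of these special methods in order to behave as an iterator. The class name `CountByThree` is made up; it mirrors the countbythree function from the Generators section.

```python
class CountByThree:
    def __init__(self, end):
        # Called automatically when we write CountByThree(10)
        self.end = end
        self.current = 1

    def __iter__(self):
        # Called by iter(), and therefore by every for loop
        return self

    def __next__(self):
        # Called by next(); raising StopIteration ends the loop
        if self.current > self.end:
            raise StopIteration
        value = self.current
        self.current += 3
        return value


values = [v for v in CountByThree(10)]
print(values)
```

A generator function gives you the same behavior with far less code, which is one reason generators are so popular.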

### High vs Low-Level Languages

First remember that a machine can only read binary -- sequences of 0s and 1s.

A low-level programming language is one that is written very close to machine code. [First-generation languages](https://en.wikipedia.org/wiki/Low-level_programming_language) might have used binary or, more practically (to someone who is not me), some translatable but simpler-to-use alternative like hexadecimal.

Second-generation (but still low-level) languages were an improvement because they used some mapping of English symbols to extend the first-gen languages.

The lowest-level programming languages are fast and powerful, but they are also very difficult to learn. Have you noticed that everything you've learned in Python is basically English? Furthermore, it has a clear grammar that you can follow and internalize. I would wager that, by now, any time you read most Python code, you have a strong intuition of what's going on.

Python is an example of a high-level programming language. The language is written for us, not for the computer. As I mentioned above, such languages generally read much like English and have a clear syntactical structure. They are slower and give us less control in programming; however, they are much easier to work with.

### Compiled vs Interpreted Implementations

If this tutorial is your first and only experience with a programming language, you might not be aware that, even among high-level programming languages, there is a further distinction in how the programs are implemented.

C, for instance, generally relies on a compiled implementation. You first write a program and save it. However, you cannot execute it immediately. It must separately be 'compiled', or translated, into machine code and only then is it executed.

Python, on the other hand, relies primarily on an interpreted implementation. An interpreter translates the code we write into a compact intermediate form (for Python, this is bytecode), which the interpreter then executes. This largely goes above my head and is not necessary for you to be able to program on a day-to-day basis. It is just worth knowing that interpreted code is typically slower but, again, more user-friendly.

### Lazy Evaluation

Did you notice anything about the objects returned by the functional programming methods we've been learning?

In [29]:
def countbythree(end):
    i = 1
    while i <= end:
        yield i
        i += 3
countbythree(100)

<generator object countbythree at 0x00000000055F42B0>

In [30]:
evens = [i for i in range(1,21) if i%2==0]
getodds = map(lambda x:x-1, evens) 
getodds

<map at 0x5602a58>

In [31]:
from random import shuffle
evens = [i for i in range(1,21) if i%2==0]
shuffle(evens) # randomizes order of elements in list
evensto10 = filter(lambda x: x <= 10, evens)
evensto10

<filter at 0x560f860>

None of these methods returns the list we were expecting. We are instead returned a generator, map, and filter object, respectively.

This is what is known as *lazy evaluation*. In each of these cases, Python stores a blueprint of the object. Consider the map object called *getodds*. Python is telling us, "Hey, you wanted an object that subtracts 1 from each item in the list evens. You don't need it right now, so I'm not going to go get it right now. But I'll keep in mind what you want." In this case, Python is either being an annoying little brother who won't go and get you the remote when you want it, or is being a terrific younger brother who brings you things exactly when you need them.

So, the instructions are ready to go and are stored in the name *getodds*. Once we're ready to use the numbers, we can do any number of things to actually get Python to evaluate those instructions. For instance, we might loop over those values or create a list out of them:

In [32]:
list(getodds)

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
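
Two consequences of lazy evaluation are worth seeing for yourself. The sketch below uses a named function (`subtract_one`, my own invention) that records each call, to show that nothing is computed until the values are requested, and that a map object can only be consumed once.

```python
evens = [i for i in range(1, 21) if i % 2 == 0]

calls = []  # a record of every value the function is actually called with


def subtract_one(x):
    calls.append(x)
    return x - 1


getodds = map(subtract_one, evens)
calls_before = len(calls)    # still 0: map has not called the function yet

first_pass = list(getodds)   # now every value is computed
second_pass = list(getodds)  # the map object is used up, so this is empty

print(calls_before, first_pass, second_pass)
```

If you need the values more than once, store the list (as in `first_pass`) rather than the map object.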
