<div style="text-align: right">INFO 6105 Data Science Eng Methods and Tools, Lecture 2 Day 2</div>
<div style="text-align: right">Dino Konstantopoulos, 27 January 2022</div>

## A brief introduction to the language Python in 10 chapters, Part 2

### List comprehensions

List comprehensions are one of the most useful and compact Python expressions. They allow you to loop over container types without writing any ugly loop structures. We'll use them ***a ton*** in class, and a list comprehension usually runs faster than the equivalent `for` loop that appends to a list.

In [None]:
str_list = ['things', 'stuff', 'Jones']

In [None]:
str_list

Pretty:

In [None]:
['(' + x + ')' for x in str_list]

In [None]:
['element (' + str(i) + '): ' + x for i,x in enumerate(str_list)]

Ugly:

In [None]:
mylist = []
i = 0
for x in str_list:
    mylist.append('my ' + str(i) + ':' + x)
    i += 1
mylist

In [None]:
[x.upper() for x in str_list]

In [None]:
a = [0,1,2,3,4]
b = [5,6,7,8,9]
list(zip(a,b))

In [None]:
[x+y for x,y in zip(a,b)] # using zip (above)

In [None]:
a = [10,11,12,13,14]
a

In [None]:
[x + 3 if (x > 3) else x for x in a]

And this is how you do the above in the traditional ***ugly*** way of classical computer languages: Using **loops**:

In [None]:
for x in a:
    if x > 3:  # same test as in the list comprehension above
        print(x + 3)
    else:
        print(x)

Oopsie... still not there! The loop prints the numbers one per line instead of building a list. Can you modify the code above to produce ***exactly*** the same result as the list comprehension cell above?

<div style="visibility: hidden">
answer = []
for x in a:
    if x > 3:
        answer.append(x + 3)
    else:
        answer.append(x)
answer
</div>

In [None]:
answer = []
for x in a:
    if x > 3:  # same test as in the list comprehension
        answer.append(x + 3)
    else:
        answer.append(x)
answer
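If you want to check the speed claim from the start of this section on your own machine, here is a small sketch using the standard-library `timeit` module. The helper names `with_loop` and `with_comprehension` are made up for this illustration, and the timings you see will depend on your machine and Python version.

```python
import timeit

data = list(range(1000))

def with_loop(values):
    # classic loop: build the list one append at a time
    result = []
    for v in values:
        if v > 3:
            result.append(v + 3)
        else:
            result.append(v)
    return result

def with_comprehension(values):
    # same logic written as a single list comprehension
    return [v + 3 if v > 3 else v for v in values]

# both versions must produce exactly the same list
same = with_loop(data) == with_comprehension(data)
print(same)

# time 200 runs of each version (numbers vary from machine to machine)
loop_time = timeit.timeit(lambda: with_loop(data), number=200)
comp_time = timeit.timeit(lambda: with_comprehension(data), number=200)
print(loop_time, comp_time)
```

Treat the two timings as a rough comparison only; what is guaranteed is that both versions build the same list.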

### Dictionaries 

One of the more flexible built-in data structures is the **dictionary**. A dictionary maps a set of keys to their associated values. These mappings are mutable and, unlike lists or tuples, are not indexed by position (although since Python 3.7 they do remember insertion order). Hence, rather than using a sequence index to return elements of the collection, the corresponding key must be used.

Dictionaries are specified by a comma-separated sequence of key and value pairs, with each key separated from its value by a colon. The dictionary is enclosed by curly braces. Dictionaries also closely resemble JSON objects, the common data format of the Web. For example:

In [None]:
my_dict = {'a':16, 'b':(4,5), 'foo':'''(noun) a term used as a universal substitute 
           for something real, especially when discussing technological ideas and 
           problems'''}
my_dict

In [None]:
if 'k' in my_dict:
    print('yes')
else:
    print('no')

In [None]:
if 'b' in my_dict:
    print(my_dict['b'])

In [None]:
'a' in my_dict  # Checks to see if 'a' is a key of my_dict

In [None]:
my_dict.items()  # Returns a view of the (key, value) pairs

In [None]:
my_dict.keys()  # Returns a view of the keys

In [None]:
my_dict.values()  # Returns a view of the values

In [None]:
my_dict['c']

The cell above raises a `KeyError` because `'c'` is not a key of the dictionary. If we would rather not get the error, we can use the `get` method, which returns `None` if the key is not present, or a default value of your choice:

In [None]:
my_dict.get('a')

In [None]:
my_dict.get('k', -1)

In [None]:
for (k,v) in my_dict.items():
    print(k,v)

In [None]:
for k in my_dict.keys():
    print(k, my_dict[k])

In [None]:
for v in my_dict.values():
    print(v)
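To tie the dictionary operations together, here is a minimal sketch that adds and updates keys, reads a missing key safely, and round-trips the dictionary through JSON with the standard-library `json` module. The `grades` data is invented for the example.

```python
import json

grades = {'alice': 90, 'bob': 72}   # hypothetical data
grades['carol'] = 85                # add a new key
grades['bob'] = 75                  # update an existing key

# iterating over a dictionary yields its keys, in insertion order (Python 3.7+)
names = list(grades)
print(names)

# get() returns the default instead of raising KeyError
missing = grades.get('dave', 0)
print(missing)

# a dictionary with string keys converts to JSON text and back
text = json.dumps(grades)
restored = json.loads(text)
print(text)
print(restored == grades)
```

Note that JSON only allows string keys, so a dictionary with other key types would not survive the round trip unchanged.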

## 7. Logical operators 

Logical operators will **test** for some condition and return a boolean (`True` or `False`).

#### Comparison operators

+ `>` : Greater than
+ `>=` : Greater than or equal to
+ `<` : Less than
+ `<=` : Less than or equal to
+ `==` : Equal to
+ `!=` : Not equal to

**is / is not**

Use **==** (**!=**) when comparing values and **is** (**is not**) when comparing **identities**.

In [None]:
x = 5.

In [None]:
type(x)

In [None]:
y = 5

In [None]:
type(y)

In [None]:
x == y

`x` is a float and `y` is an int. Their values are equal, but they are two different objects at different addresses in memory!

In [None]:
x is y
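The difference between equality and identity matters most for mutable objects such as lists. The following sketch (with made-up variable names) shows two lists that are equal but not identical, and an alias that is the very same object.

```python
first = [1, 2, 3]
second = [1, 2, 3]   # a separate list with the same contents
alias = first        # a second name for the same object

equal_values = first == second   # compares contents
same_object = first is second    # compares identities
alias_same = alias is first

print(equal_values, same_object, alias_same)

# changing the object through one name is visible through the other
alias.append(4)
print(first)
```

This is why `is` is normally reserved for checks such as `value is None`, while `==` is used to compare values.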

#### Some examples of common comparisons

In [None]:
a = 5
b = 6

In [None]:
a == b

In [None]:
a != b

In [None]:
(a > 4) and (b < 7)

In [None]:
(a > 4) and (b > 7)

In [None]:
(a > 4) or (b > 7)

`all` and `any` can be used on a *collection* of booleans:

In [None]:
x = [5,6,2,3,3]

In [None]:
cond = [item > 2 for item in x]

In [None]:
cond

In [None]:
all(cond)

In [None]:
any(cond)
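You do not have to build the intermediate list first. As a small sketch, `all` and `any` also accept a generator expression directly, and it is worth knowing how they behave on an empty collection.

```python
x = [5, 6, 2, 3, 3]

# pass the test directly, without storing the booleans in a list
all_above_two = all(item > 2 for item in x)   # False, because 2 > 2 fails
any_above_five = any(item > 5 for item in x)  # True, because 6 > 5

# edge cases: an empty collection
empty_all = all([])   # True: no element fails the test
empty_any = any([])   # False: no element passes the test

print(all_above_two, any_above_five, empty_all, empty_any)
```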

## 8. Control flow structures

### Indentation is meaningful

In Python, there are no annoying curly braces, parentheses, brackets etc., as in other languages, to delimit flow control blocks. Instead, **indentation** plays this role.

In [None]:
'aa' if False else 'bb'

In [None]:
# Let's just make a variable
some_var = 5

# Here is an if statement. Indentation is significant in python!
# prints "some_var is smaller than 10"
if some_var > 10:
    print("some_var is totally bigger than 10.")
elif some_var < 10:  # This elif clause is optional.
    print("some_var is smaller than 10.")
else:  # This is optional too.
    print("some_var is indeed 10.")

In [None]:
for x in range(10):
    if x < 5:
        print(x**2)
    else:
        print(x)

**Note**: A Jupyter notebook will guess the right indentation :-). When you edit a code cell in IPython, the indentation is handled intelligently. Try typing the following in a new blank cell:

    for x in range(10):
        if x < 5:
            print(x**2)
        else:
            print(x)
            

In [None]:
for x in range(10):
    if x < 5: 
        

For other editors, the standard is to use 4 spaces (**NOT** tabs) for the indentation, so set your favorite editor accordingly. For example in vi / vim:

    set tabstop=4
    set expandtab
    set shiftwidth=4
    set softtabstop=4

### if ... elif ... else

In [None]:
x = 10

if x < 10: # not met
    x = x + 1
elif x > 10: 
    x = x - 1 # not met either 
else: 
    x = x * 2
    
print(x)

In [None]:
x = 10

if (x > 5 and x < 8): 
    x = x+1
elif (x > 5 and x < 12): 
    x = x * 3
else:
    x = x-1
    
print(x)

### The For loop 

The basic structure of FOR loops is

    for item in iterable: 
        expression(s)
        

In [None]:
count = 0
# In Python 3, range does not build a list: it is a lazy sequence that
# produces its values on demand, so it has a small memory footprint
# (Python 2 needed xrange for that)
x = range(1,10)
for i in x:
    count += i
    print(count)

### try ... except

You can see it as a generalization of the ```if ... else``` construction, allowing more flexibility in handling failures in code.

In [None]:
text = ('a','1','54.1','43.a')
for t in text:
    try:
        temp = float(t)
        print(temp)
    except ValueError:
        print(str(t) + ' is Not convertible to a float')

A list of built-in exceptions is available here 

[http://docs.python.org/3.1/library/exceptions.html](http://docs.python.org/3.1/library/exceptions.html)
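A common way to reuse this pattern is to wrap the `try ... except` in a small function. The sketch below uses a hypothetical helper, `to_float`, that returns a default value when the conversion fails; the optional `else` branch runs only when no exception was raised.

```python
def to_float(text, default=None):
    """Hypothetical helper: convert text to a float, or return default."""
    try:
        value = float(text)
    except ValueError:
        # the conversion failed, fall back to the default
        return default
    else:
        # runs only if the try block raised no exception
        return value

converted = [to_float(t) for t in ('a', '1', '54.1', '43.a')]
print(converted)
```

Catching the specific `ValueError`, rather than every possible exception, keeps unrelated bugs from being silently hidden.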

## 9. Recycling code in Python

As with R, it's a good idea to write **functions** for bits of code that you use often. 

The syntax for defining a function in Python is: 

    def name_of_function(arguments): 
        "Some code here that works on arguments and produces outputs"
        ...
        return outputs

Note that the execution block **must be indented** ... 

You can create a file (a **module**: extension **.py** required) which contains **several** functions, and can also define variables, and import some other functions from other modules.

In [None]:
%%file some_module.py 

PI = 3.14159 # defining a variable

from numpy import arccos # importing a function from another module

def f(x): 
    """
    This is a function which adds 5 to its argument
     
    """
    return x + 5

def g(x, y): 
    """
    This is a function which sums its 2 arguments
    """
    return x + y

This is how we import an external module. Can you guess where the file resides?

In [None]:
import some_module

The magic `%whos` command (all commands preceded by % are called magics) gives us all the variables we declared in the notebook or imported from external files!

In [None]:
%whos

`dir()` lists the names defined in the module, including our functions. Note there are built-in names (the ones surrounded by double underscores), too.

In [None]:
dir(some_module)

And we can get help information from the module, which consists of the triple-quoted documentation string (docstring) for each defined function.

In [None]:
help(some_module)

And here's how we use our module. First, a variable in the module:

In [None]:
some_module.PI

In [None]:
some_module.f

Notice a cool trick by executing the cell below.

In [None]:
some_module.arccos

In [None]:
some_module.arccos?

A function in the module. Notice that with a function, we need to give it an input variable, too.

In [None]:
some_module.f(7)

In [None]:
help(some_module.f)

Here are two ways for creating shortcuts to the module:

In [None]:
from some_module import f

In [None]:
f(5)

In [None]:
import some_module as sm

In [None]:
sm.f(10)

The Zen of python says: 
    
```Namespaces are one honking great idea -- let's do more of those!```
    
so **don't** do: 

    from some_module import *
    
so as to avoid name conflicts.
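To see the kind of name conflict that namespaces protect you from, consider the standard-library modules `math` and `cmath`, which both define a function called `sqrt`. This is only an illustrative sketch; the same thing happens with any two modules that share a name.

```python
import math
import cmath

# with namespaces, both functions stay available side by side
real_root = math.sqrt(4)         # a float
complex_root = cmath.sqrt(-4)    # a complex number
print(real_root, complex_root)

# without namespaces, the second import silently replaces the first
from math import sqrt
from cmath import sqrt
shadowed = sqrt(4)
print(shadowed, type(shadowed))
```

After the two `from ... import` lines, the plain name `sqrt` refers to the `cmath` version, so it returns a complex number even for a positive input.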

### A bit more on functions: 

Functions can have **positional** as well as **keyword** arguments (with defaults, can be `None` if that's allowed / tested)

Positional arguments must always come before keyword arguments

In [None]:
type(1e3)

In [None]:
type(10**3)

In [None]:
def some_function(a,b,c=5,d=1e3): 
    res = (a + b) * c * d
    return res

In [None]:
some_function(2,3)

In [None]:
some_function(2, 3, c=5, d=0.01)
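Here is a short sketch of the rules for mixing positional and keyword arguments. The function `scale` is a hypothetical copy of `some_function` above, renamed so that it does not overwrite anything in the notebook.

```python
def scale(a, b, c=5, d=1e3):
    # a and b are required; c and d fall back to their defaults
    return (a + b) * c * d

r1 = scale(2, 3)          # both defaults are used
r2 = scale(2, 3, d=1)     # skip c, override only d
r3 = scale(b=3, a=2)      # keywords can come in any order
print(r1, r2, r3)

# a positional argument after a keyword argument is a syntax error,
# so we compile the offending call from a string to show it safely
try:
    compile("scale(a=2, 3)", "<demo>", "eval")
    rejected = False
except SyntaxError:
    rejected = True
print(rejected)
```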

You can return more than one output from a function, and by default it will be a tuple:

In [None]:
def some_function(a, b): 
    return a+1, b+1, a*b

In [None]:
a, b, c = some_function(2,3)
a

In [None]:
c

In [None]:
res = some_function(2, 3)  # without unpacking, the result is a single tuple
type(res)

## 10. Functions and Anonymous Functions are first class in Python

Functions in Python are just like data objects: you can create variables to store them and pass them around, even to other functions!

In [None]:
# Python has first class functions
def create_adder(x):
    def adder(y):
        return x + y

    return adder

In [None]:
create_adder

In [None]:
add_10 = create_adder(10)
add_10

In [None]:
add_10(3)  # => 13

In [None]:
create_adder(10)(3)  # => 13

You can define ***anonymous*** functions using `lambdas`:

In [None]:
def f(x):
    return x > 2

In [None]:
(lambda _: _ > 2)(3), f(3)

In [None]:
# There are also anonymous functions C#: x => x> 2     Java: x -> x > 2
(lambda _: _ > 2)(3)  # => True
(lambda x, y: x ** 2 + y ** 2)(2, 1)  # => 5

There are built-in higher order functions you should know of. It's ok if they're still a bit mysterious to you. We'll explore them more in later lectures.

In [None]:
list(zip([1,2,3], [4,5,6]))

In [None]:
list(map(add_10, [1, 2, 3]))

In [None]:
list(map(max, [1, 2, 3], [4, 2, 1]))

In [None]:
list(filter(lambda x: x > 5, [3, 4, 5, 6, 7]))

In [None]:
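# In Python 3, map and filter return lazy iterators:
# wrap them in list() to see the values
list(map(add_10, [1, 2, 3]))  # => [11, 12, 13]
list(map(max, [1, 2, 3], [4, 2, 1]))  # => [4, 2, 3]

list(filter(lambda x: x > 5, [3, 4, 5, 6, 7]))  # => [6, 7]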

You can use list comprehensions, too, as nice maps and filters:

In [None]:
[add_10(i) for i in [1, 2, 3]]  # => [11, 12, 13]
[x for x in [3, 4, 5, 6, 7] if x > 5]  # => [6, 7]

Note you can construct set and dict comprehensions as well:

In [None]:
{x for x in 'abcddeef' if x in 'abc'}  # => {'a', 'b', 'c'}
{x: x ** 2 for x in range(5)}  # => {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

Finally, Python's `*args` and `**kwargs` constructs let you collect and iterate over **positional arguments** and **named arguments**:

In [None]:
def magic(*args, **kwargs):
    print("unnamed args: ", args)
    print("keyword args: ", kwargs)

In [None]:
magic(1, 2, 3, a=4, b=5, c=6)
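The same `*` and `**` symbols also work in the other direction, at the call site, where they unpack a sequence or a dictionary into separate arguments. In this sketch, `magic` is redefined to return its arguments instead of printing them, purely so that the result is easy to inspect.

```python
def magic(*args, **kwargs):
    # args arrives as a tuple, kwargs as a dictionary
    return args, kwargs

numbers = [1, 2, 3]
options = {'a': 4, 'b': 5}

# * unpacks the list into positional arguments,
# ** unpacks the dictionary into keyword arguments
args, kwargs = magic(*numbers, **options)
print(args)
print(kwargs)
```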

## Generators

A generator *generates* values as they are requested instead of storing everything up front. Let's see what storing everything up front really means. 

First, let us consider the simple example of building a list and returning it.

In [None]:
def first_n(n):
    '''Build and return a list'''
    num, nums = 0, []
    while num < n:
        nums.append(num)
        num += 1
    return nums

In [None]:
sum_of_first_n = sum(first_n(1000000))
sum_of_first_n

The code is quite simple and straightforward, but it *builds the full list in memory*. This is a *problem*! Imagine that each of the $n$ items were a 10-megabyte object rather than a small integer: we could not afford to keep all of them in memory at once.

Let us rewrite the above function as a generator function instead:

In [None]:
# a generator that yields items instead of returning a list
def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1

In [None]:
sum_of_first_n = sum(firstn(1000000))
sum_of_first_n

The expression of the number generation logic is clear and natural. It is very similar to the implementation that built a list in memory, but it only ever holds one number at a time, so its memory usage stays small no matter how large $n$ is.

In fact, we can turn a list comprehension into a generator expression by replacing the square brackets ("[ ]") with parentheses. Alternately, we can think of list comprehensions as generator expressions wrapped in a list constructor.

In [None]:
# list comprehension
doubles = [2 * n for n in range(50)]

# same as the list comprehension above
doubles = list(2 * n for n in range(50))

Notice how a list comprehension looks essentially like a generator expression passed to a list constructor.

Thanks to generator expressions, we don't have to write a generator function if we do not need the list. If only list comprehensions were available, and we needed to lazily build a set of items to be processed, we would have to write a generator function.

This also means that we can use the same syntax we have been using for list comprehensions to build generators.

The performance improvement from the use of generators is the result of the lazy (on demand) generation of values, which translates to lower memory usage. Furthermore, we do not need to wait until all the elements have been generated before we start to use them. This is similar to the benefits provided by iterators, but the generator makes building iterators easy.
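To get a feel for the memory difference, the following sketch compares the size of a list with the size of the equivalent generator object using `sys.getsizeof`. Note that `getsizeof` reports only the size of the container itself, not of the integers it refers to, so treat the numbers as a rough indication.

```python
import sys

squares_list = [n * n for n in range(100000)]   # all values stored now
squares_gen = (n * n for n in range(100000))    # values produced on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size, gen_size)

# a generator can be consumed only once
first_total = sum(squares_gen)
second_total = sum(squares_gen)   # the generator is now exhausted
print(first_total, second_total)
```

The second `sum` returns 0 because the generator has already handed out all of its values. If you need to go over the data several times, a list is the better choice.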

Generators can be **composed**. Here we create a generator on the squares of consecutive integers.

In [None]:
#square is a generator
square = (i*i for i in range(1000000))
#add the squares
total = 0
for i in square:
    total += i
total

Here, we compose a square generator with the `takewhile` generator, to generate squares less than 100:

In [None]:
def count():
    num = 0
    while True:
        yield num
        num += 1

In [None]:
import itertools

#add squares less than 100
square = (i*i for i in count())
bounded_squares = itertools.takewhile(lambda x : x< 100, square)
total = 0
for i in bounded_squares:
    total += i
total
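Another handy tool for composing generators is `itertools.islice`, which takes a fixed number of items from a possibly infinite generator. This sketch reuses the infinite `count` generator defined above (the standard library also ships a ready-made `itertools.count`).

```python
import itertools

def count():
    # infinite generator: 0, 1, 2, ...
    num = 0
    while True:
        yield num
        num += 1

# compose: keep only the even numbers, then take the first five
evens = (n for n in count() if n % 2 == 0)
first_five = list(itertools.islice(evens, 5))
print(first_five)
```

Nothing is computed until `list` asks for values, which is why working with an infinite generator is safe here.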

A generator function is characterized by a ```yield``` (```yield return``` in C#) instead of a regular ```return```.

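A list comprehension is by nature a parallel construct: each element is computed independently of the others, so in principle the work could be split up. (The standard Python interpreter still evaluates it one element at a time.)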

In [None]:
def count(limit):
    num = 0
    while num < limit:
        yield num
        num += 1

In [None]:
count(100)

In [None]:
mylist = list(count(100))
print(mylist)

## Decorators (***optional***)

A decorator is a **higher order function**: A function which accepts and returns... a function! 

Here is a simple usage example: the `add_apples` decorator adds an 'Apple' element to the list of fruits returned by the target function `get_fruits`.

In [None]:
def add_apples(func):
    def get_fruits():
        fruits = func()
        fruits.append('Apple')
        return fruits
    return get_fruits

@add_apples
def get_fruits():
    return ['Banana', 'Mango', 'Orange']

# Prints out the list of fruits with 'Apple' element in it:
# Banana, Mango, Orange, Apple
print(', '.join(get_fruits()))

In this example, `beg` wraps `say`. `beg` will call `say`. If `say_please` is True then it will change the returned message:

In [None]:
from functools import wraps


def beg(target_function):
    @wraps(target_function)
    def wrapper(*args, **kwargs):
        msg, say_please = target_function(*args, **kwargs)
        if say_please:
            return "{} {}".format(msg, "Please! I am poor :(")
        return msg

    return wrapper


@beg
def say(say_please=False):
    msg = "Can you buy me a beer?"
    return msg, say_please


print(say())  # Can you buy me a beer?
print(say(say_please=True))  # Can you buy me a beer? Please! I am poor :(
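Writing `@decorator` above a function is shorthand for `func = decorator(func)`. As one more sketch, here is a hypothetical `count_calls` decorator that counts how often a function is called. It also shows what `functools.wraps` is for: it copies the name and docstring of the original function onto the wrapper.

```python
from functools import wraps

def count_calls(func):
    @wraps(func)   # keep func's name and docstring on the wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls += 1   # functions are objects, so they can carry attributes
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@count_calls
def greet(name):
    """Return a greeting."""
    return "Hello, " + name

print(greet("Ada"))
print(greet("Alan"))
print(greet.calls)       # number of calls so far
print(greet.__name__)    # the original name, thanks to wraps
```

Without `@wraps(func)`, `greet.__name__` would report `wrapper`, which makes debugging and `help()` output confusing.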

# Homework: Let's program!

And now dear class, you are ready to program in Python!

Please write below the *simplest possible* implementation of Fibonacci numbers, given what we learned above. This is the series in which each element is the sum of the previous two. Also, give me the first 100 Fibonacci numbers.

Now :-) Here's a hint:

In [None]:
a = 1
b = 2
a, b = b, a
print('a = ' + str(a))
print('b = ' + str(b))

Now, enough for this week?

<br />
<center>
<img src = ipynb.images/tree-sloth.jpg width = 600 />
</center>

*Almost!*