<div style="text-align: right">INFO 6105 Data Science Eng Methods and Tools, Lecture 2 Day 1</div>
<div style="text-align: right">Dino Konstantopoulos, 9 September 2019</div>

## A brief introduction to the language Python in 10 chapters, Part 2

### List comprehensions

List comprehensions are one of the most useful and compact Python expressions. They allow you to loop over container types without writing any ugly loop structures. We'll use this ***a ton*** in class!

In [1]:
a=[1,2,3]
b=[4,5,6]

In [2]:
c = zip(a,b)

In [3]:
str_list = ['things', 'stuff', 'Brady']

In [4]:
str_list

['things', 'stuff', 'Brady']

In [5]:
['my ' + x for x in str_list]

['my things', 'my stuff', 'my Brady']

In [6]:
[x.upper() for x in str_list]

['THINGS', 'STUFF', 'BRADY']

In [7]:
[x+y for x,y in zip(a,b)] # using zip (above)

[5, 7, 9]

In [8]:
a = [0,1,2,3,4]
a

[0, 1, 2, 3, 4]

In [9]:
[x + 6 if (x < 3) else x for x in a]

[6, 7, 8, 3, 4]

And this is how you do the above in the traditional ***ugly*** way of classical computer languages:

In [10]:
for x in a:
    if (x < 3):
        print (x + 6)
    else:
        print(x)

6
7
8
3
4


Oopsie... still not there! The loop prints each number on its own line instead of building a list. Can you modify the code above to produce exactly the same result as the list comprehension above (`[6, 7, 8, 3, 4]`)?

In [9]:
answer = []
for x in a:
    if (x < 3):
        answer.append(x + 6)
    else:
        answer.append(x)
answer

[6, 7, 8, 3, 4]
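
The two positions an `if` can take inside a comprehension do different jobs, and they are easy to mix up. Here is a small sketch (the names `transformed` and `filtered` are only illustrative): an `if ... else` placed before the `for` chooses a value for every element, while an `if` placed after the `for` drops the elements that fail the test.

```python
a = [0, 1, 2, 3, 4]

# `if ... else` before `for`: every element is kept, but transformed conditionally
transformed = [x + 6 if x < 3 else x for x in a]

# `if` after `for`: elements that fail the test are dropped altogether
filtered = [x + 6 for x in a if x < 3]

print(transformed)  # [6, 7, 8, 3, 4]
print(filtered)     # [6, 7, 8]
```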

### Dictionaries 

One of the more flexible built-in data structures is the **dictionary**. A dictionary maps a set of unique keys to their associated values. These mappings are mutable and, unlike lists or tuples, are not indexed by position (since Python 3.7 a dictionary does remember insertion order, but you still cannot ask for "element number 2"). Hence, rather than using a sequence index to return elements of the collection, the corresponding key must be used.

Dictionaries are specified by a comma-separated sequence of key/value pairs, with each key separated from its value by a colon. The dictionary is enclosed in curly braces. Dictionary literals also look very much like JSON objects, the most common data format of the Web. For example:

In [11]:
my_dict = {'a':16, 'b':(4,5), 'foo':'''(noun) a term used as a universal substitute 
           for something real, especially when discussing technological ideas and 
           problems'''}
my_dict

{'a': 16,
 'b': (4, 5),
 'foo': '(noun) a term used as a universal substitute \n           for something real, especially when discussing technological ideas and \n           problems'}

In [12]:
my_dict['b']

(4, 5)

In [13]:
'a' in my_dict  # Checks whether 'a' is a key of my_dict

True

In [14]:
'bar' in my_dict  # has_key() was removed in Python 3; use `in` to check whether a key exists

False

In [15]:
my_dict.items()  # Returns a view of the (key, value) pairs

dict_items([('a', 16), ('b', (4, 5)), ('foo', '(noun) a term used as a universal substitute \n           for something real, especially when discussing technological ideas and \n           problems')])

In [16]:
my_dict.keys()  # Returns a view of the keys

dict_keys(['a', 'b', 'foo'])

In [17]:
my_dict.values()  # Returns a view of the values

dict_values([16, (4, 5), '(noun) a term used as a universal substitute \n           for something real, especially when discussing technological ideas and \n           problems'])

In [18]:
my_dict['c']

KeyError: 'c'

If we would rather not get the error, we can use the `get` method, which returns `None` if the key is not present, or a default value of your choice.

In [19]:
my_dict.get('c')

In [20]:
my_dict.get('c', -1)

-1
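
To tie the dictionary methods together, here is a minimal sketch of the most common everyday patterns: looping over `items()`, testing membership with `in`, and counting with `get`. The `inventory` data is made up purely for illustration.

```python
inventory = {'apples': 3, 'pears': 0}  # hypothetical example data

# Loop over (key, value) pairs
for name, count in inventory.items():
    print(name, count)

# Membership tests look at the keys, not the values
has_pears = 'pears' in inventory  # True, even though the count is 0
has_kiwis = 'kiwis' in inventory  # False

# get() with a default avoids a KeyError when the key is new
inventory['kiwis'] = inventory.get('kiwis', 0) + 1
print(inventory)  # {'apples': 3, 'pears': 0, 'kiwis': 1}
```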

## 7. Logical operators 

Logical operators will **test** for some condition and return a boolean (True, False)

#### Comparison operators

+ `>` : Greater than
+ `>=` : Greater than or equal to
+ `<` : Less than
+ `<=` : Less than or equal to
+ `==` : Equal to
+ `!=` : Not equal to

**is / is not**

Use **==** (**!=**) when comparing values and **is** (**is not**) when comparing **identities**.

In [20]:
x = 5.

In [21]:
type(x)

float

In [22]:
y = 5

In [23]:
type(y)

int

In [24]:
x == y

True

`x` is a float and `y` is an int: they are two distinct objects stored at different addresses in memory, even though their values compare equal!

In [25]:
x is y

False
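
The difference between equality and identity matters most with mutable objects such as lists. The sketch below (variable names are illustrative) shows two lists that are equal but not identical, and an alias that is the very same object.

```python
first = [1, 2, 3]
second = [1, 2, 3]  # same contents, but a separate list object
alias = first       # a second name for the very same object

print(first == second)  # True: the values are equal
print(first is second)  # False: two different objects
print(first is alias)   # True: one object, two names

# Because alias and first are the same object, a change through one shows in the other
alias.append(4)
print(first)   # [1, 2, 3, 4]
print(second)  # [1, 2, 3]
```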

#### Some examples of common comparisons

In [26]:
a = 5
b = 6

In [27]:
a == b

False

In [28]:
a != b

True

In [29]:
(a > 4) and (b < 7)

True

In [30]:
(a > 4) and (b > 7)

False

In [31]:
(a > 4) or (b > 7)

True

The built-in functions `all` and `any` can be used on a *collection* of booleans:

In [32]:
x = [5,6,2,3,3]

In [33]:
cond = [item > 2 for item in x]

In [34]:
cond

[True, True, False, True, True]

In [35]:
all(cond)

False

In [36]:
any(cond)

True
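
You do not have to build the intermediate `cond` list first. As a sketch, `all` and `any` also accept a generator expression directly, and they stop scanning as soon as the answer is known. The behaviour on an empty collection is worth remembering too.

```python
x = [5, 6, 2, 3, 3]

# Pass the test straight to all() / any(), without an intermediate list
all_big = all(item > 2 for item in x)  # False, because 2 fails the test
any_big = any(item > 2 for item in x)  # True, because 5 already passes

print(all_big, any_big)  # False True

# Edge case: an empty collection
print(all([]))  # True  (no element fails the test)
print(any([]))  # False (no element passes the test)
```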

## 8. Control flow structures

### Indentation is meaningful

In Python, there are no annoying curly braces, parentheses, brackets etc., as in other languages, to delimit flow control blocks. Instead, **indentation** plays this role.

In [38]:
# Let's just make a variable
some_var = 5

# Here is an if statement. Indentation is significant in python!
# prints "some_var is smaller than 10"
if some_var > 10:
    print("some_var is totally bigger than 10.")
elif some_var < 10:  # This elif clause is optional.
    print("some_var is smaller than 10.")
else:  # This is optional too.
    print("some_var is indeed 10.")


some_var is smaller than 10.


In [40]:
for x in range(10): 
    if x < 5:
        print(x**2)
    else:
        print(x) 

0
1
4
9
16
5
6
7
8
9


**Note**: A Jupyter notebook will guess the right indentation :-). When you edit a code cell, the indentation is handled intelligently. Try typing the following in a new blank cell:

    for x in range(10): 
        if x < 5:
            print(x**2)
        else:
            print(x)
            

In [43]:
for x in range(10):
    if x < 5: 
        

SyntaxError: unexpected EOF while parsing (<ipython-input-43-60734ba9811c>, line 3)

For other editors, the standard is to use 4 spaces (**NOT** tabs) for indentation, so set your favorite editor accordingly. For example, in vi / vim:

    set tabstop=4
    set expandtab
    set shiftwidth=4
    set softtabstop=4

### if ... elif ... else

In [44]:
x = 10

if x < 10: # not met
    x = x + 1
elif x > 10: # not met either
    x = x - 1
else: 
    x = x * 2
    
print(x)

20


In [45]:
x = 10

if (x > 5 and x < 8): 
    x = x+1
elif (x > 5 and x < 12): 
    x = x * 3
else:
    x = x-1
    
print(x)

30


### The For loop 

The basic structure of `for` loops is:

    for item in iterable: 
        expression(s)
        

In [46]:
count = 0
# In Python 3, range does not build a list: it produces its values lazily,
# one at a time, which keeps the memory footprint small.
# (Python 2 had a separate xrange function for this; it no longer exists.)
x = range(1,10) 
for i in x:
    count += i
    print(count)

1
3
6
10
15
21
28
36
45


### try ... except

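You can see `try ... except` as a generalization of the ```if ... else``` construction, allowing more flexibility in handling failures in code: the block under `try` runs, and if it raises the named exception, the `except` block handles the failure instead of stopping the program.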

In [47]:
text = ('a','1','54.1','43.a')
for t in text:
    try:
        temp = float(t)
        print(temp)
    except ValueError:
        print(str(t) + ' is Not convertible to a float')

a is Not convertible to a float
1.0
54.1
43.a is Not convertible to a float


A list of built-in exceptions is available here 

[http://docs.python.org/3.1/library/exceptions.html](http://docs.python.org/3.1/library/exceptions.html)
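
A `try` statement can also carry an `else` clause (runs only when no exception was raised) and a `finally` clause (always runs). The sketch below reuses the conversion example from above; the helper `to_float` and the `log` list are hypothetical names introduced only to make the order of execution visible.

```python
log = []  # records which clauses ran, in order

def to_float(text):
    """Return float(text), or None when text is not a valid number."""
    try:
        value = float(text)
    except ValueError:
        log.append('failed: ' + text)  # runs only when float() raised
        return None
    else:
        log.append('ok: ' + text)      # runs only when float() succeeded
        return value
    finally:
        log.append('done: ' + text)    # always runs, even after a return

results = [to_float(t) for t in ('a', '1', '54.1', '43.a')]
print(results)  # [None, 1.0, 54.1, None]
print(log[:4])  # ['failed: a', 'done: a', 'ok: 1', 'done: 1']
```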

## 9. Recycling code in Python

As with R, it's a good idea to write **functions** for bits of code that you use often. 

The syntax for defining a function in Python is: 

    def name_of_function(arguments): 
        "Some code here that works on arguments and produces outputs"
        ...
        return outputs

Note that the execution block **must be indented** ... 

You can create a file (a **module**: extension **.py** required) which contains **several** functions, and can also define variables, and import some other functions from other modules.

In [48]:
%%file some_module.py 

PI = 3.14159 # defining a variable

from numpy import arccos # importing a function from another module

def f(x): 
    """
    This is a function which adds 5 to its argument
     
    """
    return x + 5

def g(x, y): 
    """
    This is a function which sums its 2 arguments
    """
    return x + y

Writing some_module.py


This is how we import an external module. Can you guess where the file resides?

In [49]:
import some_module

The magic `%whos` command (all commands preceded by `%` are called magics) gives us all the variables we declared in the notebook or imported from external files!

In [50]:
%whos

Variable      Type      Data/Info
---------------------------------
a             int       5
answer        list      n=5
b             int       6
cond          list      n=5
count         int       45
i             int       9
my_dict       dict      n=3
some_module   module    <module 'some_module' fro<...>Numbers\\some_module.py'>
some_var      int       5
str_list      list      n=3
t             str       43.a
temp          float     54.1
text          tuple     n=4
x             range     range(1, 10)
y             int       5


`dir()` lists the names defined in the module: our variable and functions, plus some built-in attributes that Python adds to every module.

In [51]:
dir(some_module)

['PI',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'arccos',
 'f',
 'g']

And we can get help information from the module, which consists of the triple-quoted documentation string (docstring) of each defined function.

In [53]:
help(some_module)

Help on module some_module:

NAME
    some_module

FUNCTIONS
    f(x)
        This is a function which adds 5 to its argument
    
    g(x, y)
        This is a function which sums its 2 arguments

DATA
    PI = 3.14159
    arccos = <ufunc 'arccos'>

FILE
    d:\neucourses\data science engineering methods and tools – info 6105\lecture2-just big numbers\some_module.py




And here's how we use our module. First, a variable in the module:

In [54]:
some_module.PI

3.14159

Notice a cool trick by executing the cell below.

In [56]:
some_module.arccos?

A function in the module. Notice that with a function, we need to give it an input variable, too.

In [57]:
some_module.f(7)

12

In [58]:
help(some_module.f)

Help on function f in module some_module:

f(x)
    This is a function which adds 5 to its argument



Here are two ways for creating shortcuts to the module:

In [59]:
from some_module import f

In [60]:
f(5)

10

In [61]:
import some_module as sm

In [62]:
sm.f(10)

15

The Zen of python says: 
    
```Namespaces are one honking great idea -- let's do more of those!```
    
so **don't** do: 

    from some_module import *
    
so as to avoid name conflicts ...
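
To see what a name conflict looks like, here is a sketch using two standard-library modules that both define `sqrt`. Importing the bare name from both makes the second import silently replace the first; a star import does the same thing for every name at once. The variables `real_root` and `complex_root` are illustrative.

```python
import math
import cmath

# With the module prefix there is never any doubt about which sqrt is meant
print(math.sqrt(4))    # 2.0
print(cmath.sqrt(-1))  # 1j

# Without the prefix, the most recent import wins
from math import sqrt
real_root = sqrt(4)     # math's version: 2.0
from cmath import sqrt  # silently rebinds the name sqrt
complex_root = sqrt(4)  # cmath's version: (2+0j)

print(real_root, complex_root)  # 2.0 (2+0j)
```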

### A bit more on functions: 

Functions can have **positional** as well as **keyword** arguments. Keyword arguments carry default values; the default can be `None`, provided the function body tests for it.

Positional arguments must always come before keyword arguments.

In [63]:
def some_function(a,b,c=5,d=1e3): 
    res = (a + b) * c * d
    return res

In [64]:
some_function(2,3)

25000.0

In [65]:
some_function(2, 3, c=5, d=0.01)

0.25
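
The `None` default mentioned above is the usual way to give a function an optional mutable argument. A minimal sketch follows; the function `append_item` and its parameter `bucket` are hypothetical names chosen for illustration.

```python
def append_item(item, bucket=None):
    """Append item to bucket, creating a fresh list when no bucket is given."""
    if bucket is None:  # test for the default before using the argument
        bucket = []
    bucket.append(item)
    return bucket

print(append_item(1))              # [1]
print(append_item(2))              # [2]  (a new list each time, not [1, 2])
print(append_item(3, bucket=[0]))  # [0, 3]
```

Writing `bucket=[]` in the signature instead would create the list once, when the function is defined, and every call would then share it.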

You can return more than one output from a function, and by default it will be a tuple:

In [66]:
def some_function(a, b): 
    return a+1, b+1, a*b

In [67]:
a,b,c = some_function(2,3)

In [68]:
c

6

In [69]:
type(res)

NameError: name 'res' is not defined
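
The `NameError` above arises because `res` only ever existed as a local name inside a function. To inspect the returned tuple, capture the whole return value in one variable first, as in this sketch (the names `first`, `second` and `product` are illustrative):

```python
def some_function(a, b):
    return a + 1, b + 1, a * b

# Capture all outputs in a single variable: it is a tuple
res = some_function(2, 3)
print(res)        # (3, 4, 6)
print(type(res))  # <class 'tuple'>

# The tuple can still be unpacked afterwards
first, second, product = res
print(product)    # 6
```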

## 10. Functions and Anonymous Functions are first class in Python

Functions in Python are just like data objects: you can create variables to store them and pass them around, even to other functions!

In [70]:
# Python has first class functions
def create_adder(x):
    def adder(y):
        return x + y

    return adder


add_10 = create_adder(10)
add_10(3)  # => 13

13

You can define anonymous functions using `lambda`:

In [71]:
# There are also anonymous functions
(lambda x: x > 2)(3)  # => True
(lambda x, y: x ** 2 + y ** 2)(2, 1)  # => 5

5
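
A very common place to meet a `lambda` is the `key` argument of `sorted`, which expects a function that computes the value to sort by. A small sketch, with illustrative variable names:

```python
words = ['things', 'stuff', 'Brady']

# Sort by length; ties keep their original relative order
by_length = sorted(words, key=lambda w: len(w))
print(by_length)  # ['stuff', 'Brady', 'things']

# Any function will do as a key, not only a lambda
by_lowercase = sorted(words, key=str.lower)
print(by_lowercase)  # ['Brady', 'stuff', 'things']
```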

There are built-in higher-order functions you should know of. It's OK if they're still a bit mysterious to you. We'll explore them more in later lectures.

In [79]:
list(map(add_10, [1, 2, 3]))  # => [11, 12, 13]
list(map(max, [1, 2, 3], [4, 2, 1]))  # => [4, 2, 3]

# In Python 3, map and filter return lazy iterators; wrap them in list() to see the values
list(filter(lambda x: x > 5, [3, 4, 5, 6, 7]))  # => [6, 7]

[6, 7]

You can use list comprehensions for nice maps and filters:

In [80]:
[add_10(i) for i in [1, 2, 3]]  # => [11, 12, 13]
[x for x in [3, 4, 5, 6, 7] if x > 5]  # => [6, 7]

[6, 7]

You can construct set and dict comprehensions as well:

In [81]:
{x for x in 'abcddeef' if x in 'abc'}  # => {'a', 'b', 'c'}
{x: x ** 2 for x in range(5)}  # => {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

Finally, Python's `*args` and `**kwargs` constructs let a function accept any number of **positional arguments** and **named arguments**, which it can then iterate over:

In [82]:
def magic(*args, **kwargs):
  print ("unnamed args: ", args)
  print ("keyword args: ", kwargs)

In [83]:
magic(1,2,3,a=4,b=5,c=6)

unnamed args:  (1, 2, 3)
keyword args:  {'a': 4, 'b': 5, 'c': 6}
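
The same `*` and `**` symbols also work in the other direction, at the call site, where they unpack a sequence or a dictionary into separate arguments. A sketch, using a variant of `magic` that returns its inputs instead of printing them (the names `numbers` and `options` are illustrative):

```python
def magic(*args, **kwargs):
    # Return what was received so that we can inspect it
    return args, kwargs

numbers = [1, 2, 3]
options = {'a': 4, 'b': 5}

# * spreads the list into positional arguments, ** spreads the dict into keyword arguments
received_args, received_kwargs = magic(*numbers, **options)

print(received_args)    # (1, 2, 3)
print(received_kwargs)  # {'a': 4, 'b': 5}
```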


## Generators

A generator "generates" values as they are requested instead of storing everything up front. Let's see what storing everything up front really means. The following function (*NOT* a generator) doubles all values and stores them in `double_arr`. For a large iterable, that list might get huge!

In [84]:
def double_numbers(iterable):
    double_arr = []
    for i in iterable:
        double_arr.append(i + i)
    return double_arr

Running the following would mean we'll double all values first and return all of them back to be checked by our condition:

In [85]:
for value in double_numbers(range(1000000)):  # `test_non_generator`
    print(value)
    if value > 5:
        break

0
2
4
6

We could instead use a generator to *generate* the doubled value as the item is being requested

In [86]:
def double_numbers_generator(iterable):
    for i in iterable:
        yield i + i

Running the same code as before, but with a generator, now lets us iterate
over the values and double them one by one as they are consumed by
our logic.

Thus, as soon as we see a value > 5, we break out of the
loop and never double most of the values sent in (MUCH FASTER!).

In [87]:
# Python 2 code would use xrange here; in Python 3, range is already lazy
for value in double_numbers_generator(range(1000000)):  # `test_generator`
    print(value)
    if value > 5:
        break

0
2
4
6

By the way, a note for readers of older Python 2 code: there, `range` and `xrange` were two different functions.

Just as `double_numbers_generator` is the lazy version of `double_numbers`, Python 2 had `xrange` as the lazy version of `range`.

In Python 2, `range` would return a list with 1000000 values for us to use, while `xrange` would generate the 1000000 values one by one as we request / iterate over those items.
In Python 3, `xrange` is gone and `range` itself is lazy: it never builds the full list up front.

Just as you can create a list comprehension, you can create generator comprehensions as well:

In [88]:
values = (-x for x in [1, 2, 3, 4, 5])
for x in values:
    print(x)  # prints -1 -2 -3 -4 -5 to console/terminal

-1
-2
-3
-4
-5


You can also cast a generator comprehension directly to a list:

In [89]:
values = (-x for x in [1, 2, 3, 4, 5])
gen_to_list = list(values)
print(gen_to_list)  # => [-1, -2, -3, -4, -5]

[-1, -2, -3, -4, -5]
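
Two properties of generators are worth seeing in action: they can describe an endless stream, because nothing is computed until it is requested, and they can be consumed only once. The sketch below uses `next` and `itertools.islice` from the standard library; the generator `count_up` is a hypothetical example.

```python
from itertools import islice

def count_up(start=0):
    """Yield start, start + 1, start + 2, ... without ever stopping."""
    n = start
    while True:
        yield n
        n += 1

gen = count_up()
print(next(gen))  # 0
print(next(gen))  # 1

# islice takes a finite slice of the endless stream, continuing where we left off
next_five = list(islice(gen, 5))
print(next_five)  # [2, 3, 4, 5, 6]

# A generator is exhausted after one pass
squares = (x * x for x in range(3))
first_pass = list(squares)
second_pass = list(squares)
print(first_pass)   # [0, 1, 4]
print(second_pass)  # []
```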


## Decorators

A decorator is a higher-order function that accepts a function and returns a function. As a simple usage example, the `add_apples` decorator below adds an `'Apple'` element to the list of fruits returned by the target function `get_fruits`.

In [90]:
def add_apples(func):
    def get_fruits():
        fruits = func()
        fruits.append('Apple')
        return fruits
    return get_fruits

@add_apples
def get_fruits():
    return ['Banana', 'Mango', 'Orange']

# Prints out the list of fruits with 'Apple' element in it:
# Banana, Mango, Orange, Apple
print(', '.join(get_fruits()))

Banana, Mango, Orange, Apple


In this example, `beg` wraps `say`. `beg` will call `say`. If `say_please` is True then it will change the returned message:

In [91]:
from functools import wraps


def beg(target_function):
    @wraps(target_function)
    def wrapper(*args, **kwargs):
        msg, say_please = target_function(*args, **kwargs)
        if say_please:
            return "{} {}".format(msg, "Please! I am poor :(")
        return msg

    return wrapper


@beg
def say(say_please=False):
    msg = "Can you buy me a beer?"
    return msg, say_please


print(say())  # Can you buy me a beer?
print(say(say_please=True))  # Can you buy me a beer? Please! I am poor :(

Can you buy me a beer?
Can you buy me a beer? Please! I am poor :(
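
Why does the example use `@wraps(target_function)`? Without it, the decorated function takes on the name and docstring of the inner `wrapper`. The sketch below compares two otherwise identical do-nothing decorators; `plain`, `polite`, `hello` and `goodbye` are hypothetical names.

```python
from functools import wraps

def plain(func):
    # No @wraps: the wrapper's own metadata replaces that of func
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def polite(func):
    @wraps(func)  # copies func's name and docstring onto the wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain
def hello():
    """Say hello."""
    return 'hello'

@polite
def goodbye():
    """Say goodbye."""
    return 'goodbye'

print(hello.__name__)    # wrapper
print(goodbye.__name__)  # goodbye
print(goodbye.__doc__)   # Say goodbye.
```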

## Let's program!

And now my dear students, you are ready to program in Python!

Please write below the simplest implementation of Fibonacci numbers. This is the series in which the next element is the sum of the previous two. And give me the first 100 Fibonacci numbers.

Now :-)

In [94]:
def Fcal(fNumber):
    f1 = 1
    f2 = 1
    print("1\n1")
    for i in range(fNumber):
        f3 = f1+f2
        print(f3)
        f1 = f2
        f2 = f3

In [102]:
Fcal(90)  # prints the two seed values, then 90 more terms: 92 numbers in all

1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
1597
2584
4181
6765
10946
17711
28657
46368
75025
121393
196418
317811
514229
832040
1346269
2178309
3524578
5702887
9227465
14930352
24157817
39088169
63245986
102334155
165580141
267914296
433494437
701408733
1134903170
1836311903
2971215073
4807526976
7778742049
12586269025
20365011074
32951280099
53316291173
86267571272
139583862445
225851433717
365435296162
591286729879
956722026041
1548008755920
2504730781961
4052739537881
6557470319842
10610209857723
17167680177565
27777890035288
44945570212853
72723460248141
117669030460994
190392490709135
308061521170129
498454011879264
806515533049393
1304969544928657
2111485077978050
3416454622906707
5527939700884757
8944394323791464
14472334024676221
23416728348467685
37889062373143906
61305790721611591
99194853094755497
160500643816367088
259695496911122585
420196140727489673
679891637638612258
1100087778366101931
1779979416004714189
2880067194370816120
4660046610375530309
7540113804746346429
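
Note that `Fcal` prints its two seed values and then `fNumber` further terms, so the count of printed numbers is `fNumber + 2`. As an alternative sketch that returns exactly the first 100 terms, and reuses the generators from earlier in this lecture, one could write the following (the names `fibonacci` and `first_100` are illustrative, not part of the original solution):

```python
from itertools import islice

def fibonacci():
    """Yield the Fibonacci numbers 1, 1, 2, 3, 5, ... without end."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b  # slide the window forward by one term

# Take exactly 100 terms from the endless generator
first_100 = list(islice(fibonacci(), 100))

print(len(first_100))  # 100
print(first_100[:10])  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
print(first_100[-1])   # 354224848179261915075
```

Python integers have arbitrary precision, so the later terms do not overflow.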


Enough for today?

![sloth](https://tellingthetruth1993.files.wordpress.com/2015/06/sloth-from-imgsoup-com.jpg)