# Learning Python
## Essential version 3.x information and syntax

A **String** in Python 3 is an immutable sequence of Unicode code points. Strings are sequences (data containers), albeit ones limited to storing Unicode text data.

In [21]:
str2 = 'This is a string.'
str3 = """This too
... is a multiline string
... built with triple double-quotes."""
print(str2, str3)
print(bytes([102, 111, 111]))
s = "This is üŋíc0de"
print(s)
encoded_s = s.encode('utf-8')
encoded_s

This is a string. This too
... is a multiline string
... built with triple double-quotes.
b'foo'
This is üŋíc0de


b'This is \xc3\xbc\xc5\x8b\xc3\xadc0de'

In [24]:
list(b'foo bar')
bytes_obj = b"A bytes object"
print(bytes_obj)

b'A bytes object'
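As a small sketch of how `str` and `bytes` relate, encoding turns text into bytes and decoding reverses it. The variable names are illustrative only, and the string is written with escapes so the code points are unambiguous:

```python
# Round trip between text (str) and raw bytes; names are illustrative.
text = "This is \u00fc\u014b\u00edc0de"   # same text as above: "This is üŋíc0de"
raw = text.encode("utf-8")                # str -> bytes
restored = raw.decode("utf-8")            # bytes -> str

print(len(text), len(raw))                # code points vs. bytes: 15 18
print(restored == text)                   # True

# Indexing bytes yields an int; indexing a str yields a one-character str.
print(b"foo"[0], "foo"[0])                # 102 f
```

Each of the three non-ASCII characters takes two bytes in UTF-8, which is why the byte length exceeds the character count by three.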


### Indexing and Slicing Strings

* Zero-based access to any position by indexing
* Slicing gets a subsequence.

my_sequence[start:stop:step] # all args are optional

[start is inclusive ... stop is exclusive)


In [4]:
s = "The trouble is you think you have time."
print(s[0])
print(s[5])
print(s[:4])
print(s[4:])
print(s[2:14:3])
print(s[:]) # quick way of making a copy of a sequence!
print(s[::-1])

T
r
The 
trouble is you think you have time.
erb 
The trouble is you think you have time.
.emit evah uoy kniht uoy si elbuort ehT


In [7]:
import re

st = "1,5,88,1203,12675"
stc = re.split(r'[,]+', st)
print(stc)

# Convert elements to integers
# Lambda functions in Python use: lambda x: x + 10
stc_ints = sorted(map(lambda x: int(x), stc))
print(stc_ints) 



['1', '5', '88', '1203', '12675']
[1, 5, 88, 1203, 12675]


### Packing/Unpacking of Sequences, Simultaneous Assignment, Spread/Rest Operations

In [11]:
# Common use is also when returning multiple values from a function
#     return x,y

data = 1,3,5,7,9 # data treated as tuple, automatic packing behavior
print(data)

# Unpacking
a, b, c, d = range(7, 11)
print(a,b,c,d)

# commonly used to iterate through key-value pairs that are 
# returned by the items( ) method of the dict class
# for k,v in mapping.items()

# simultaneous
a,b = b,a
print(a,b,c,d)

first, second, *rest = 0,1,2,3,4,5,6
print(first)
print(second)
print(rest)

# first, *inner, last = 0, 1, 2, 3  # also works

(1, 3, 5, 7, 9)
7 8 9 10
8 7 9 10
0
1
[2, 3, 4, 5, 6]
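The two uses mentioned in the comments above (returning multiple values, and iterating over `dict.items()`) can be sketched as follows. The function `min_max` and the `mapping` dict are hypothetical names made up for this illustration:

```python
def min_max(values):
    """Return the smallest and largest item as a 2-tuple (automatic packing)."""
    return min(values), max(values)

low, high = min_max([7, 3, 9, 1])   # unpack the returned tuple
print(low, high)                    # 1 9

mapping = {"a": 1, "b": 2, "c": 3}
pairs = []
for k, v in mapping.items():        # each item is a (key, value) tuple
    pairs.append(f"{k}={v}")
print(", ".join(pairs))             # a=1, b=2, c=3
```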


In [5]:
tuple(b'foo bar')

(102, 111, 111, 32, 98, 97, 114)

In [6]:
10/4

2.5

In [7]:
10//4

2

In [8]:
2 ** 1024

179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216

In [13]:
import sys
sys.float_info

sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)

### Complex and Fractions
See p. 37 of LEARNING_PYTHON.

In [1]:
c = 3.14159 + 2.73000j

In [8]:
c.real
c.imag
c.conjugate()

(3.14159-2.73j)

In [11]:
from fractions import Fraction

Fraction(10,6)


Fraction(5, 3)
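A short sketch of why `Fraction` is useful: binary floats cannot represent most decimal fractions exactly, while `Fraction` keeps exact numerators and denominators. The variable name `total` is illustrative:

```python
from fractions import Fraction

# Binary floats cannot represent 0.1 exactly, so this comparison fails.
print(0.1 + 0.2 == 0.3)                    # False

# Fractions keep exact numerators and denominators.
total = Fraction(1, 10) + Fraction(2, 10)
print(total, total == Fraction(3, 10))     # 3/10 True

# Fractions are reduced to lowest terms automatically.
reduced = Fraction(10, 6)
print(reduced.numerator, reduced.denominator)   # 5 3
```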

## Tips and Optimizations

### Map, Zip, Filter to Massage Sequences Efficiently

* The decorate-sort-undecorate idiom (also known as Schwartzian transform)
* Map applies function to every element in list
* Zip combined with Map
* Filter passing only those arguments that return True from predicate function

Lambda functions in JavaScript use: __(x) => x + 10__

Lambda functions in Python use: __lambda x: x + 10__

In [1]:
students = [
    dict(id=0, credits=dict(math=9, physics=6, history=7)),
    dict(id=1, credits=dict(math=6, physics=7, latin=10)),
    dict(id=2, credits=dict(history=8, physics=9, chemistry=10)),
    dict(id=3, credits=dict(math=5, physics=5, geography=7)),
]

In [2]:
def decorate(student):
    # create a 2-tuple (sum of credits, student) from student dict
    return (sum(student['credits'].values()), student)

def undecorate(decorated_student):
    # discard sum of credits, return original student dict
    return decorated_student[1]

In [5]:
students = sorted(map(decorate, students), reverse=True)
students = list(map(undecorate, students))

# sorted by sum of their credits
print(students)

[{'id': 2, 'credits': {'history': 8, 'physics': 9, 'chemistry': 10}}, {'id': 1, 'credits': {'math': 6, 'physics': 7, 'latin': 10}}, {'id': 0, 'credits': {'math': 9, 'physics': 6, 'history': 7}}, {'id': 3, 'credits': {'math': 5, 'physics': 5, 'geography': 7}}]
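In modern Python the same ordering is usually obtained with the `key` parameter of `sorted`, which performs the decorate and undecorate steps internally. The sketch below uses a copy of the data named `sample` and a helper named `total_credits`, both introduced here for illustration. One practical difference: with `key`, the dicts themselves are never compared, whereas sorting `(sum, dict)` tuples raises `TypeError` if two sums tie.

```python
# Sample data in the same shape as the students list above.
sample = [
    dict(id=0, credits=dict(math=9, physics=6, history=7)),
    dict(id=1, credits=dict(math=6, physics=7, latin=10)),
    dict(id=2, credits=dict(history=8, physics=9, chemistry=10)),
    dict(id=3, credits=dict(math=5, physics=5, geography=7)),
]

def total_credits(student):
    """Sort key: the sum of all credit values for one student."""
    return sum(student["credits"].values())

ranked = sorted(sample, key=total_credits, reverse=True)
print([s["id"] for s in ranked])   # [2, 1, 0, 3]
```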


In [6]:
grades = [18, 23, 30, 27, 15, 9, 22]
avgs = [22, 21, 29, 24, 18, 18, 24]
list(zip(avgs, grades))

[(22, 18), (21, 23), (29, 30), (24, 27), (18, 15), (18, 9), (24, 22)]

In [7]:
list(map(lambda *a: a, avgs, grades)) # equivalent to Zip

[(22, 18), (21, 23), (29, 30), (24, 27), (18, 15), (18, 9), (24, 22)]

In [11]:
# Using Map and Zip
# Calculate the element-wise maximum among sequences:
# the maximum of the first elements of each seq, then of the second elements, and so on
a = [5, 9, 2, 4, 7]
b = [3, 7, 1, 9, 2]
c = [6, 8, 0, 5, 3]

# zip groups the i-th elements together; map(max, a, b, c) would give the same result
maxs = map(lambda n: max(*n), zip(a,b,c))
list(maxs)

[6, 9, 2, 9, 7]

In [14]:
filData = [2,5,8,0,0,1,0]
list(filter(None, filData))

[2, 5, 8, 1]

In [15]:
list(filter(lambda x: x, filData)) # equivalent to prior filter

[2, 5, 8, 1]

In [16]:
list(filter(lambda x: x > 4, filData))

[5, 8]

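Use *OrderedDict* to keep keys in the order they were added. Since Python 3.7 a plain `dict` preserves insertion order as well; `OrderedDict` additionally offers order-sensitive equality and reordering methods such as `move_to_end`.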

In [4]:
from collections import OrderedDict

OrderedDict((str(number), None) for number in range(5)).keys()

odict_keys(['0', '1', '2', '3', '4'])
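The following sketch contrasts `OrderedDict` with a plain `dict`. It assumes Python 3.7 or later, where plain dicts keep insertion order; the names `plain` and `od` are illustrative:

```python
from collections import OrderedDict

plain = {str(n): None for n in range(5)}
print(list(plain))                       # ['0', '1', '2', '3', '4']

od = OrderedDict.fromkeys("abc")
od.move_to_end("a")                      # reorder in place
print(list(od))                          # ['b', 'c', 'a']

# Equality between two OrderedDicts is order-sensitive; plain dicts ignore order.
print(OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1))   # False
print(dict(a=1, b=2) == dict(b=2, a=1))                 # True
```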

### Comprehensions

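Python offers several types of comprehensions: list, dict, and set, plus the closely related generator expression.

A list comprehension consists of brackets containing an expression followed by
a `for` clause, then zero or more `for` or `if` clauses. The result is a new list built by evaluating the expression for every combination of loop values that passes the conditions.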


In [3]:
squares = map(lambda n: n**2, range(10))
list(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [5]:
[n**2 for n in range(10) if not n%2]

[0, 4, 16, 36, 64]

In [7]:
# Nested comprehensions
#
items = 'ABCDE'
pairs = []
for a in range(len(items)):
    for b in range(a, len(items)):
        pairs.append((items[a], items[b]))
        
pairs

[('A', 'A'),
 ('A', 'B'),
 ('A', 'C'),
 ('A', 'D'),
 ('A', 'E'),
 ('B', 'B'),
 ('B', 'C'),
 ('B', 'D'),
 ('B', 'E'),
 ('C', 'C'),
 ('C', 'D'),
 ('C', 'E'),
 ('D', 'D'),
 ('D', 'E'),
 ('E', 'E')]

In [10]:
pairs2 = [(items[a], items[b]) 
          for a in range(len(items)) 
          for b in range(a, len(items))] # Order of for loops must be correct

pairs2

[('A', 'A'),
 ('A', 'B'),
 ('A', 'C'),
 ('A', 'D'),
 ('A', 'E'),
 ('B', 'B'),
 ('B', 'C'),
 ('B', 'D'),
 ('B', 'E'),
 ('C', 'C'),
 ('C', 'D'),
 ('C', 'E'),
 ('D', 'D'),
 ('D', 'E'),
 ('E', 'E')]

General comprehension syntax:

```
[ expression for value in iterable if condition ]

[ k*k for k in range(1, n+1) ]     # list comprehension
{ k*k for k in range(1, n+1) }     # set comprehension
( k*k for k in range(1, n+1) )     # generator comprehension
{ k : k*k for k in range(1, n+1) } # dictionary comprehension
```

The generator syntax is particularly attractive when results do not need to be stored in memory.
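A rough way to see the memory difference is to compare the container sizes reported by `sys.getsizeof`. This is only a sketch: `getsizeof` reports the size of the list or generator object itself, not of the integers it refers to, and the names `as_list` and `as_gen` are illustrative.

```python
import sys

n = 100_000
as_list = [k * k for k in range(1, n + 1)]      # all values stored at once
as_gen = (k * k for k in range(1, n + 1))       # values produced on demand

print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))   # True
print(sum(as_gen) == sum(as_list))                       # True
```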


### Enumerate

This built-in function provides a convenient way to get an index when a sequence is used in a loop.

In [1]:
for i, elem in enumerate(['one', 'two', 'three', 'four']):
    print(i, elem)

0 one
1 two
2 three
3 four
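`enumerate` also accepts a `start` argument, and it combines well with comprehensions. A brief sketch, with `names` and `index_of` as illustrative names:

```python
names = ['one', 'two', 'three', 'four']

# start=1 shifts the counter without touching the sequence itself.
for position, elem in enumerate(names, start=1):
    print(position, elem)

# enumerate pairs naturally with a dict comprehension.
index_of = {elem: i for i, elem in enumerate(names)}
print(index_of['three'])   # 2
```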


In [15]:
from math import sqrt

# Pythagorean Triples
#
mx = 10
legs = [(a, b, sqrt(a**2 + b**2)) 
        for a in range(1, mx)
        for b in range(a, mx)]

# filter out triples where the c "hypotenuse" isn't an integer
legs = list(filter(lambda t: t[2].is_integer(), legs))

# use list comprehension to combine filter/map in one clean list
legs = [(a, b, int(c)) for a, b, c in legs if c.is_integer()]

print(legs)

[(3, 4, 5), (6, 8, 10)]


### Dictionary and Set Comprehensions

These work exactly like list comprehensions; only the syntax differs slightly. Both use braces, and a dict comprehension takes a `key: value` expression.

In [17]:
# lettermap = {c: k for k, c in enumerate(ascii_lowercase, 1)}
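A runnable version of the commented-out line needs `ascii_lowercase` from the `string` module. The second comprehension (`even_letters`) is an extra illustration that is not part of the original notes:

```python
from string import ascii_lowercase

# Map each lowercase letter to its 1-based position in the alphabet.
lettermap = {c: k for k, c in enumerate(ascii_lowercase, 1)}
print(lettermap['a'], lettermap['z'])   # 1 26

# A set comprehension over the same data: letters at even positions.
even_letters = {c for c, k in lettermap.items() if k % 2 == 0}
print(sorted(even_letters)[:3])         # ['b', 'd', 'f']
```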

In [19]:
# Dicts do not allow duplicate keys;
# a repeated key simply keeps the last value assigned
#
word = 'Hello'
swaps = {c: c.swapcase() for c in word}
print(swaps)

{'H': 'h', 'e': 'E', 'l': 'L', 'o': 'O'}


In [20]:
# Sets also do not allow duplicates
ls1 = set(c for c in word)
ls2 = {c for c in word}
print(ls1)
print(ls2)
print(ls1 == ls2)

{'l', 'e', 'H', 'o'}
{'l', 'e', 'H', 'o'}
True


### Generators

Generator functions `yield` results instead of returning them:

* They suspend and resume their state between calls.
* Calling one returns a generator object that is its own iterator: yield one value, suspend, resume, yield another...
* They let you consume computations one at a time, instead of waiting for every computation to complete before getting any result.
* Sometimes the amount of data you have to iterate over is so large that you cannot keep it all in memory in a list. In this case generators are invaluable: they make possible what would not be possible otherwise.
* Generator expressions are similar to list comprehensions, but instead of returning a list they return an object that produces results one by one.
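The suspend/resume behavior can be sketched with an endless generator, which is only usable because values are produced lazily. The function name `naturals` is made up for this example; `itertools.islice` takes a fixed number of items from it:

```python
from itertools import islice

def naturals():
    """Yield 0, 1, 2, ... forever; practical only because values are lazy."""
    n = 0
    while True:
        yield n
        n += 1

gen = naturals()
print(list(islice(gen, 5)))   # [0, 1, 2, 3, 4]
print(next(gen))              # 5, the generator resumed where it stopped
```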


In [21]:
# normal function
def get_squares(n):
    return [x**2 for x in range(n)]

print(get_squares(10))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


In [22]:
# generator function
def get_squares_gen(n):
    for x in range(n):
        yield x**2
        
print(get_squares_gen(10))

<generator object get_squares_gen at 0x000001D86513A728>


In [27]:
squares = get_squares_gen(4)
print(next(squares)) # same as squares.__next__()
print(next(squares))
print(next(squares))
print(next(squares))
# The generator is now exhausted, so the next call raises StopIteration.
# A for loop catches this exception internally to know the iteration is over.
print(next(squares))

0
1
4
9


StopIteration: 

In [4]:
# LEARNING_PYTHON -- see p.151
#
# MORE Material Needed Here to Explain
# CREATING your Own ITERATOR
#

# What happens when you call next(generator) is that 
# you're calling the generator.__next__() method on the object

# and its purpose is to return the next element of the
# iteration, or to raise StopIteration when the iteration 
# is over and there are no more elements to return.

def geomProg(a, q):
    k = 0
    while True:
        result = a * q**k   # geometric progression/series
        if result <= 100000:
            yield result
        else:
            return
        k += 1
        
for n in geomProg(2, 5):
    print(n)

    # should print 2, 10, 50, 250, 1250, 6250, 31250
    # the next term would have been 156250, which exceeds the 100000 limit

2
10
50
250
1250
6250
31250


### Generator Expressions

Generator expressions use parentheses `()` instead of the brackets `[]` of list comprehensions. They behave the same,
except that they allow only ONE iteration; after that they are exhausted.

* You can reproduce `map` and `filter` using generator expressions.

* NOTE: as a rough rule of thumb, `map` calls can be about twice as fast as equivalent for loops, and list comprehensions are often faster than `map` calls that need a lambda. Results depend on the workload and the Python version, so measure before relying on this.

In [6]:
cubes = [k**3 for k in range(10)]
cubes

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]

In [11]:
cubes_gen = (k**3 for k in range(10))
cubes_gen

<generator object <genexpr> at 0x00000198F926C728>

In [12]:
list(cubes_gen) # same as manually calling next in a for loop

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]

In [13]:
list(cubes_gen) # nothing more to give

[]

Generator expressions can reproduce the built-in `map` and `filter` functions, often with
better readability.


In [29]:
N = 20

cubes = [x**3 for x in range(10)]

odd_cubes1 = filter(lambda cube: cube % 2, cubes)
cubes1 = map(
    lambda n: (n, n**3),
    filter(lambda n: n % 3 == 0 or n % 5 == 0, range(N)))

odd_cubes1 # map/filter style
cubes1 # map/filter style

odd_cubes2 = (cube for cube in cubes if cube % 2)
cubes2 = ((n, n**3) for n in range(N) if n % 3 == 0 or n % 5 == 0)
    
odd_cubes2 # Generator Expression
cubes2 # Generator Expression

# Return is 2-tuples (n, n^3)
print(list(cubes1))
print(list(cubes2))

[(0, 0), (3, 27), (5, 125), (6, 216), (9, 729), (10, 1000), (12, 1728), (15, 3375), (18, 5832)]
[(0, 0), (3, 27), (5, 125), (6, 216), (9, 729), (10, 1000), (12, 1728), (15, 3375), (18, 5832)]


* Here is where efficiency matters. `s1` and `s3` produce exactly the same sum. Inside `s1` there is a list comprehension, so the entire million-element list must be built BEFORE `sum` can start iterating over it, which costs extra memory.

* With a generator expression, `sum` pulls one value at a time. There is no need to have all the squares for `range(10**6)` stored in a list.

In [33]:
# Builds the whole million-element list first: more memory
s1 = sum([n**2 for n in range(10**6)])
print(s1)

# Sums directly from a generator expression: no intermediate list
s3 = sum(n**2 for n in range(10**6))
print(s3)

333332833333500000
333332833333500000


In [34]:
from time import time

mx = 5500 # this is the max I could reach with my computer...

t = time() # start time for the for loop
dmloop = []
for a in range(1, mx):
    for b in range(a, mx):
        dmloop.append(divmod(a, b))
print('for loop: {:.4f} s'.format(time() - t)) # elapsed time

t = time() # start time for the list comprehension
dmlist = [
    divmod(a, b) for a in range(1, mx) for b in range(a, mx)]
print('list comprehension: {:.4f} s'.format(time() - t))

t = time() # start time for the generator expression
dmgen = list(
    divmod(a, b) for a in range(1, mx) for b in range(a, mx))
print('generator expression: {:.4f} s'.format(time() - t))

# verify correctness of results and number of items in each list
print(dmloop == dmlist == dmgen, len(dmloop))

for loop: 7.9241 s
list comprehension: 4.5837 s
generator expression: 5.8249 s
True 15122250


In [37]:
from time import time

mx = 2 * 10 ** 6

t = time()
absloop = []
for n in range(mx):
    absloop.append(abs(n))
print('for loop: {:.4f} s'.format(time() - t))

t = time()
abslist = [abs(n) for n in range(mx)]
print('list comprehension: {:.4f} s'.format(time() - t))

t = time()
absmap = list(map(abs, range(mx)))

print('map: {:.4f} s'.format(time() - t))
print(absloop == abslist == absmap)

for loop: 0.9206 s
list comprehension: 0.5396 s
map: 0.1715 s
True


__Name Localization__: Python 3 localizes loop variables in all four forms of comprehensions: 
list, dict, set, and generator expressions. This behavior is therefore different from that
of the for loop.

In [30]:
# None of these alter the global name A
A = 100
ex1 = [A for A in range(5)]
print(A) # prints: 100

ex2 = list(A for A in range(5))
print(A) # prints: 100

ex3 = dict((A, 2 * A) for A in range(5))
print(A) # prints: 100

ex4 = set(A for A in range(5))
print(A) # prints: 100

# The for loop does modify it!
nloc = 0
for A in range(5):
    nloc += A
print(A)

100
100
100
100
4


In [4]:
def gcd(a, b):
    """Calculate the Greatest Common Divisor of (a, b). """
    while b != 0:
        a, b = b, a % b
    return a

In [8]:
#
# Cool.  But not straightforward to read
#
N = 50
triples = sorted(
    ((a, b, c) for a, b, c in (
        ((m**2 - n**2), (2 * m * n), (m**2 + n**2))
        for m in range(1, int(N**.5) + 1)
        for n in range(1, m)
        if (m - n) % 2 and gcd(m, n) == 1
    ) if c <= N), key=sum  # sort by a + b + c
)

print(triples)

[(3, 4, 5), (5, 12, 13), (15, 8, 17), (7, 24, 25), (21, 20, 29), (35, 12, 37), (9, 40, 41)]


In [7]:
#
# Easier to read and understand
#
def gen_triples(N):
    for m in range(1, int(N**.5) + 1):
        for n in range(1, m):
            if (m - n) % 2 and gcd(m, n) == 1:
                c = m**2 + n**2
                if c <= N:
                    a = m**2 - n**2
                    b = 2 * m * n
                    yield (a, b, c)
                    
triples = sorted(gen_triples(50), key=sum)  # sort by a + b + c
print(triples)

[(3, 4, 5), (5, 12, 13), (15, 8, 17), (7, 24, 25), (21, 20, 29), (35, 12, 37), (9, 40, 41)]


## Example Problem

Submitted to candidates for a Python developer role.

The problem is the following: given the sequence 0 1 1 2 3 5 8 13 21 ... write a function 
that would return the terms of this sequence up to some limit N.


In [22]:
# Brute rough draft of function
#
def fibonacci(N):
    """Return all fibonacci numbers up to N. """
    
    result = [0]
    next_n = 1
    
    while next_n <= N:
        result.append(next_n)
        next_n = sum(result[-2:])

    return result

print(fibonacci(0))
print(fibonacci(1))
print(fibonacci(50))

[0]
[0, 1, 1]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]


In [26]:
# Better as a generator function
#
def fib(N):
    """Yield all Fibonacci numbers up to N. """
    a, b = 0, 1
    while a <= N:
        yield a
        a, b = b, a + b  # must be inside the loop, or it never terminates

# generator functions need to be iterated over to get results (hence the list())
print(list(fib(0)))
print(list(fib(1)))
print(list(fib(50)))

[0]
[0, 1, 1]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]


##### Discussion

Now the `fib` function is a generator function. It starts with `a, b = 0, 1` and, as long as `a <= N`, yields `a` and then updates `a` and `b` together. When the loop ends the function returns, which raises `StopIteration` for whoever is consuming the generator. All we need in order to produce the next element of the sequence is the past two: `a` and `b`.
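Because only the last two terms are needed, the limit can also be separated from the generator itself. The sketch below is one possible variation, not the solution the notes settle on: `fib_forever` is a hypothetical unbounded generator, and `itertools.takewhile` applies the cutoff.

```python
from itertools import takewhile

def fib_forever():
    """Yield Fibonacci numbers without an upper bound (illustrative helper)."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

print(list(takewhile(lambda x: x <= 50, fib_forever())))
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```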

## Iterators

Iterator: an object that manages an iteration through a series of values. `next(i)` produces the next element from the underlying series, or raises `StopIteration` when none remain.

Iterable: an object that produces an iterator via `iter(obj)`.

From the same object you can create multiple iterators, each with its own state. They do not copy the data, so the data can change underneath them.

Some functions and classes produce an iterable series of values using lazy evaluation, with no memory set aside for storing all the items at once:

```
for j in range(10000000):
    pass  # range yields each j on demand; no 10-million-element list is built
```
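To make the iterator/iterable distinction concrete, here is a minimal sketch of a hand-written pair. `Countdown` and `CountdownIterator` are hypothetical classes invented for this example; the iterable hands out a fresh iterator on each `iter()` call, so separate loops do not share state.

```python
class Countdown:
    """Iterable that counts down from start to 1 (hypothetical example class)."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Return a fresh iterator each time so loops do not share state.
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator holding the current position for one pass."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self          # iterators are themselves iterable

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


cd = Countdown(3)
first, second = iter(cd), iter(cd)
print(next(first), next(first))   # 3 2
print(next(second))               # 3, independent state
print(list(cd))                   # [3, 2, 1]
```

A generator function would achieve the same with far less code; the explicit classes only show what `iter()` and `next()` call under the hood.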

## Functions

* Python passes arguments by assignment: each formal parameter becomes another name (an alias) for the object the caller supplied, and the object is not copied.
* The communication of a return value from the function back to the caller is similarly implemented as an assignment.
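A small sketch of what "not copied" means in practice. The function names `append_item` and `rebind` are made up for this illustration: mutating the argument is visible to the caller, while rebinding the parameter name is not.

```python
def append_item(seq):
    seq.append(99)        # mutates the caller's object (same list, not a copy)

def rebind(seq):
    seq = [0]             # rebinds the local name only; caller is unaffected
    return seq

data = [1, 2, 3]
append_item(data)
print(data)               # [1, 2, 3, 99]

result = rebind(data)
print(data, result)       # [1, 2, 3, 99] [0]
print(result is data)     # False
```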

In [None]:
# Computes a GPA; the default grading system in `points` is used if another isn't provided
def compute_gpa(grades, points={'A+': 4.0, 'A': 4.0, 'A-': 3.67, 'B+': 3.33, 'B': 3.0, 'B-': 2.67,
                                'C+': 2.33, 'C': 2.0, 'C-': 1.67, 'D+': 1.33, 'D': 1.0, 'F': 0.0}):
    num_courses = 0
    total_points = 0
    for g in grades:
        if g in points: # a recognizable grade
            num_courses += 1
            total_points += points[g]
    return total_points / num_courses # raises ZeroDivisionError if no grade was recognized




#### Keyword Arguments

The traditional mechanism for matching the actual parameters sent by a caller to the formal parameters declared by the function signature is based on positional arguments. Python also supports keyword arguments, which are matched by name instead.

In the example below, the built-in `max` is polymorphic in the number of arguments, allowing a call such as `max(a, b, c, d)`; therefore a key function cannot be designated as a traditional positional element and must be passed by keyword.

Sorting functions in Python also support a similar `key` parameter for indicating a nonstandard order.


In [None]:
# Example
# In order to vary the notion of "maximum" that is used.
a, b = -7, 5 # sample values (illustrative)
m = max(a, b, key=abs) # built-in abs func itself sent as value assoc with keyword param 'key'
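The same `key` keyword works with `sorted`. A brief sketch using made-up sample values:

```python
values = [-7, 3, -1, 5]

print(max(values))                   # 5
print(max(values, key=abs))          # -7, largest magnitude
print(sorted(values, key=abs))       # [-1, 3, 5, -7]
print(sorted(["pear", "fig", "banana"], key=len))   # ['fig', 'pear', 'banana']
```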


## Simple I/O

In [3]:
# OUTPUT

# Comma-separated, arbitrary number of arguments;
# prints the args with spaces between them and a \n added at the end
print(1,2,3,4)
# optional keywords: separator (sep), end-of-line (end), output stream (file)
print(1,2,3,4, sep=':')
# print(1,2,3,4, end=':')
# print(1,2,3,4, file=fp) # fp: an open, writable file object (not a filename string)

1 2 3 4
1:2:3:4


In [7]:
# INPUT

age = int(input('Enter your age in years: '))
print(age)

# When processing a file, the proxy maintains a current position within the file
#     as an offset from the beginning, measured in number of bytes.

#fp.read() # return the remaining contents of a readable file as a string
#fp.readline() # return the remainder of the current line as a string
#fp.readlines() # return all remaining lines as a list of strings
#for line in fp: # iterate over all remaining lines of the file
#fp.seek(k) # change the current position to the kth byte of the file
#fp.tell() # return the current position as a byte offset from the start
#fp.write(string) # write the string at the current position of the writable file
#fp.writelines(seq) # write each string of the sequence at the current position




Enter your age in years: 4
4
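The file methods listed in the comments above can be tried safely on a temporary file. This is a minimal sketch: the file name `demo.txt` and the variable names are made up, and the exact value returned by `tell()` depends on the platform's newline handling, so it is not shown.

```python
import os
import tempfile

# Work inside a throwaway directory so the sketch does not touch real data.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")   # hypothetical file name

    with open(path, "w", encoding="utf-8") as fp:
        print(1, 2, 3, 4, sep=":", file=fp)          # print to a file object
        fp.write("second line\n")
        fp.writelines(["third line\n", "fourth line\n"])

    with open(path, encoding="utf-8") as fp:
        first = fp.readline()      # the line written by print()
        position = fp.tell()       # offset just past the first line
        rest = fp.readlines()      # remaining lines as a list
        fp.seek(0)                 # jump back to the start
        everything = fp.read()     # whole file as one string

print(repr(first))                 # '1:2:3:4\n'
print(rest)
print(everything.count("\n"))      # 4
```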


#### Example Data Manipulation

In [9]:
import re

st = "1,5,88,1203,12675"
stc = re.split(r'[,]+', st)
print(stc)

# Convert elements to integers
# Lambda functions in Python use: lambda x: x + 10
stc_ints = sorted(map(lambda x: int(x), stc))
set_integers = set(stc_ints) # sets are unordered, so the sorted order is not kept
result = ', '.join(str(s) for s in set_integers)
print(result)

['1', '5', '88', '1203', '12675']
1, 12675, 5, 1203, 88
