# Advanced Python Features

## Language Features

### Python Truth Values

In Python, any object can be tested for its Boolean truth value. By default, an object is considered true unless its class defines a `__bool__()` method that returns `False`, or a `__len__()` method that returns zero.
Two constants are defined to evaluate to false:

* the `False` Boolean constant itself
* the `None` constant, which you may have seen represented in other languages as null

Any value of a built-in numeric type that equals zero is also considered false.

The empty string and empty collection objects are all considered false. This includes the result of calling the built-in `set()` function with no arguments and a range of length zero, such as `range(0)`. Instances of custom classes are true by default, unless the class defines `__bool__()` or `__len__()` as described above.

There are also three basic Boolean operations: `and`, `or`, and `not`. The first two are short-circuit operators. In the case of `and`, if the first operand evaluates to false, the second operand isn't evaluated, because anything and-ed with false is false. Similarly, `or` only evaluates the second operand if the first is false, because anything or-ed with true is true.
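As a minimal sketch of the rules above, the following defines two hypothetical classes, `Inbox` and `Switch` (names invented for illustration), and a small `record` helper that notes which operands of `and` and `or` were actually evaluated.

```python
class Inbox:
    """Hypothetical container whose truth value follows its length."""
    def __init__(self, messages):
        self.messages = list(messages)

    def __len__(self):
        return len(self.messages)


class Switch:
    """Hypothetical object that defines its truth value directly."""
    def __init__(self, on):
        self.on = on

    def __bool__(self):
        return self.on


# every one of these values tests as false
falsy = [False, None, 0, 0.0, "", [], {}, set(), range(0),
         Inbox([]), Switch(False)]
all_falsy = not any(falsy)

# record which operands get evaluated
calls = []

def record(name, value):
    calls.append(name)
    return value

result_and = record("a", 0) and record("b", 1)      # "b" is skipped
result_or = record("c", "x") or record("d", "y")    # "d" is skipped
```

Note that `and` and `or` return one of their operands rather than a strict `True` or `False`: here `result_and` is `0` and `result_or` is `"x"`.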

### String vs Bytes

In Python 3, there are very important differences between the notions of strings and bytes. A string in Python 3 is a sequence of Unicode characters, while bytes are a sequence of raw eight-bit values.

In [1]:
# strings and bytes are not directly interchangeable
# strings contain unicode, bytes are raw 8-bit values
b = bytes([0x41, 0x42, 0x43, 0x44])
b

b'ABCD'

In [2]:
s = "This is a string"
s

'This is a string'

The output shows that the first value is a sequence of bytes, as indicated by the `b` prefix, while the second is an ordinary string. Trying to combine them directly raises an error:

In [3]:
s+b

TypeError: can only concatenate str (not "bytes") to str

In [4]:
# Bytes and strings need to be properly encoded and decoded
# before you can work on them together
s2 = b.decode('utf-8')
s+s2

'This is a stringABCD'

In [5]:
b2 = s.encode('utf-8')
b+b2

b'ABCDThis is a string'

In [6]:
# encode the string as UTF-32
b3 = s.encode('utf-32')
b3

b'\xff\xfe\x00\x00T\x00\x00\x00h\x00\x00\x00i\x00\x00\x00s\x00\x00\x00 \x00\x00\x00i\x00\x00\x00s\x00\x00\x00 \x00\x00\x00a\x00\x00\x00 \x00\x00\x00s\x00\x00\x00t\x00\x00\x00r\x00\x00\x00i\x00\x00\x00n\x00\x00\x00g\x00\x00\x00'

### Template Strings

Why would you use a different method of string formatting instead of the regular string `format()` function?  
If all you need to do is simple variable substitution:
* the template string method is much easier to use
* the code is more readable

With `format()`, by contrast, you can control the output formatting with all kinds of specifiers, which template strings do not support:
* spacing
* number formatting
* justification
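
To make the contrast concrete, here is a small sketch. The sample values are assumptions chosen for illustration: `format()` handles the padded number, while `Template` handles plain substitution and, through `safe_substitute`, tolerates a missing value.

```python
from string import Template

# format() supports specifiers: right-justify in 8 columns, 2 decimal places
padded = "{:>8.2f}".format(3.14159)

# Template only performs plain substitution of named placeholders
templ = Template("You're looking at ${title} by ${software}")
partial = {"title": "Python 3 Code"}   # 'software' deliberately left out

# substitute() raises KeyError when a placeholder has no value
try:
    templ.substitute(partial)
    missing = None
except KeyError as err:
    missing = err.args[0]

# safe_substitute() leaves unknown placeholders in place instead
safe = templ.safe_substitute(partial)
```

Here `missing` ends up as `'software'`, and `safe` still contains the literal `${software}` placeholder.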


In [7]:
from string import Template

# Usual string formatting with format()
str1 = "You're looking at {0} by {1}".format("Python 3 Code", "Jupyter Notebook")
str1

"You're looking at Python 3 Code by Jupyter Notebook"

In [8]:
# create a template with placeholders
templ = Template("You're looking at ${title} by ${software}")

# use the substitute method with keyword arguments
str2 = templ.substitute(title="Python 3 Code", software="Jupyter Notebook")
str2

"You're looking at Python 3 Code by Jupyter Notebook"

In [9]:
# use the substitute method with a dictionary
data = {"software": "Jupyter Notebook", 
        "title": "Python 3 Code"}
    
str3 = templ.substitute(data) 
str3

"You're looking at Python 3 Code by Jupyter Notebook"

## Built-in Functions

### Utilities

In [10]:
# demonstrate built-in utility functions
# use any() and all() to test sequences for boolean values
list1 = [1, 2, 3, 0, 5, 6]
any(list1)    

True

In [11]:
all(list1)

False

In [12]:
min(list1)

0

In [13]:
max(list1)

6

In [14]:
sum(list1)

17

### Iterators

Looping over the items of a collection is called iteration. The built-in `iter()` function creates an iterator object from an iterable, such as a sequence, and `next()` retrieves the following item from that iterator.

In [15]:
# use iterator functions like enumerate, zip, iter, next
# define a list of days in English and French
days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
daysFr = ["Dim", "Lun", "Mar", "Mer", "Jeu", "Ven", "Sam"]

In [16]:
# use iter to create an iterator over a collection
i = iter(days)

In [17]:
next(i)

'Sun'

In [18]:
next(i)

'Mon'
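
What happens when the iterator runs out is worth seeing once. The sketch below uses a shortened list (an assumption, to keep it brief) and shows that `next()` raises `StopIteration` at the end unless you pass a default value.

```python
days = ["Sun", "Mon", "Tue"]
it = iter(days)

# consume every item by hand
seen = [next(it), next(it), next(it)]

# a further call signals exhaustion with StopIteration
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

# passing a default avoids the exception
fallback = next(it, "no more days")
```

A `for` loop performs the same `iter()` and `next()` calls behind the scenes and stops quietly when `StopIteration` is raised.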

In [24]:
# File part
# iterate using a function and a sentinel
with open("testfile.txt", "r") as fp:
    for line in iter(fp.readline, ''):
        print(line)

This is line 1

This is line 2

This is line 3

This is line 4

This is line 5

This is line 6



In [25]:
# Regular iteration
# use regular iteration over the days
for m in range(len(days)):
    print(m+1, days[m])

1 Sun
2 Mon
3 Tue
4 Wed
5 Thu
6 Fri
7 Sat


In [26]:
# Enumerate part
# using enumerate reduces code and provides a counter
for i, m in enumerate(days, start=1):
    print(i, m)

1 Sun
2 Mon
3 Tue
4 Wed
5 Thu
6 Fri
7 Sat


In [27]:
# Zip part
# use zip to combine sequences
for m in zip(days, daysFr):
    print(m)

('Sun', 'Dim')
('Mon', 'Lun')
('Tue', 'Mar')
('Wed', 'Mer')
('Thu', 'Jeu')
('Fri', 'Ven')
('Sat', 'Sam')


In [28]:
# Zip and Enumerate part    
for i, m in enumerate(zip(days, daysFr), start=1):
    print(i, m[0], "=", m[1], "in French")

1 Sun = Dim in French
2 Mon = Lun in French
3 Tue = Mar in French
4 Wed = Mer in French
5 Thu = Jeu in French
6 Fri = Ven in French
7 Sat = Sam in French


**MORE ON ZIP FUNCTION**
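
A few more `zip` behaviors, sketched with shortened lists (the uneven lengths are an assumption made on purpose): `zip` stops at the shortest input, `itertools.zip_longest` pads instead, `dict(zip(...))` builds a lookup table, and `zip(*pairs)` reverses the pairing.

```python
from itertools import zip_longest

days = ["Sun", "Mon", "Tue"]
daysFr = ["Dim", "Lun"]          # one item short on purpose

# zip stops as soon as the shortest input is exhausted
pairs = list(zip(days, daysFr))

# zip_longest keeps going and fills the gaps
padded = list(zip_longest(days, daysFr, fillvalue="?"))

# pairs convert directly into a dictionary
lookup = dict(zip(days, daysFr))

# unpacking the pairs back into zip "unzips" them
english, french = zip(*pairs)
```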

### Transformers

The Python standard library provides built-in functions for transforming sequences of data.   
  
The `filter` function does essentially what its name implies: it creates an iterator that filters out values from a given sequence. You pass it a function that performs a Boolean test, and any item for which the test returns false is left out of the resulting sequence.  
  
The map function creates an iterator that takes one or more sequences of values and produces a new sequence by applying a given function to each value in the original sequences. 

In [30]:
def filterFunc(x):
    if x % 2 == 0:
        return False
    return True


def filterFunc2(x):
    if x.isupper():
        return False
    return True


def squareFunc(x):
    return x**2


def toGrade(x):
    if (x >= 90):
        return "A"
    elif (x >= 80 and x < 90):
        return "B"
    elif (x >= 70 and x < 80):
        return "C"
    elif (x >= 65 and x < 70):
        return "D"
    return "F"

In [31]:
# define some sample sequences to operate on
nums = (1, 8, 4, 5, 13, 26, 381, 410, 58, 47)
chars = "abcDeFGHiJklmnoP"
grades = (81, 89, 94, 78, 61, 66, 99, 74)

In [32]:
# use filter to remove items from a list
odds = list(filter(filterFunc, nums))
odds

[1, 5, 13, 381, 47]

In [33]:
# use filter on non-numeric sequence
lowers = list(filter(filterFunc2, chars))
lowers

['a', 'b', 'c', 'e', 'i', 'k', 'l', 'm', 'n', 'o']

In [34]:
# use map to create a new sequence of values
squares = list(map(squareFunc, nums))
squares

[1, 64, 16, 25, 169, 676, 145161, 168100, 3364, 2209]

In [35]:
# use sorted and map to change numbers to grades
grades = sorted(grades)
letters = list(map(toGrade, grades))
letters

['F', 'D', 'C', 'C', 'B', 'B', 'A', 'A']
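
The text above mentions that `map` can take more than one sequence. A minimal sketch, using made-up price and quantity data, shows that case along with the fact that `filter` and `map` return lazy iterators rather than lists.

```python
import operator

# hypothetical sample data
prices = [2.5, 4.0, 1.25]
quantities = [4, 1, 8]

# with two sequences, the function receives one item from each
totals = list(map(operator.mul, prices, quantities))

# filter returns an iterator: values are produced only on demand
lazy_odds = filter(lambda n: n % 2, [1, 2, 3, 4, 5])
first = next(lazy_odds)
rest = list(lazy_odds)
```

Because the iterator is consumed as it goes, `rest` holds only the values that come after `first`.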

### Itertools

The `itertools` functions are not technically built-in language functions, but they are part of the standard library that comes with Python, and they are incredibly useful for creating iterators to handle a variety of common scenarios.

Some of them are infinite iterators, which generate values for as long as you need them and never end on their own.

* `cycle` is the first infinite iterator, and it does what its name implies: it cycles over a set of values.
* `count` is the other infinite iterator, and it does pretty much what you'd expect: it creates a counter.
* `accumulate` aggregates values together. It defaults to addition, but you can supply a different function. It ends when its input does.
* `chain` takes multiple sequences and chains them together so that they act as one.

In [36]:
import itertools


def testFunction(x):
    return x < 40

In [37]:
# cycle iterator can be used to cycle over a collection
seq1 = ["Joe", "John", "Mike"]
cycle1 = itertools.cycle(seq1)
print(next(cycle1))
print(next(cycle1))
print(next(cycle1))
print(next(cycle1))

Joe
John
Mike
Joe


In [38]:
# use count to create a simple counter
count1 = itertools.count(100, 10)
print(next(count1))
print(next(count1))
print(next(count1))

100
110
120


In [39]:
# accumulate creates an iterator that accumulates values
vals = [10,20,30,40,50,40,30]
acc = itertools.accumulate(vals, max)
list(acc)

[10, 20, 30, 40, 50, 50, 50]

In [40]:
# use chain to connect sequences together
x = itertools.chain("ABCD", "1234")
list(x)

['A', 'B', 'C', 'D', '1', '2', '3', '4']
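
Because `cycle` and `count` never finish, passing them straight to `list()` would never return. One common way to bound them is `itertools.islice`, sketched here together with the default (additive) behavior of `accumulate`.

```python
import itertools

# take only the first five values of an endless counter
first_five = list(itertools.islice(itertools.count(100, 10), 5))

# take seven values from an endless rotation
rotation = list(itertools.islice(
    itertools.cycle(["Joe", "John", "Mike"]), 7))

# without a function argument, accumulate produces running sums
running_total = list(itertools.accumulate([10, 20, 30, 40]))
```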

The `itertools` module provides two similar functions called `dropwhile` and `takewhile`. Both take a predicate function that tests each value, and both change behavior the first time that test returns false. `dropwhile` drops values from the sequence while the predicate returns true, and then returns every value after that, whether or not the later values pass the test.

`takewhile` is the opposite. It returns values from the sequence while the predicate returns true, and then it stops giving you values.

In [41]:
# dropwhile and takewhile will return values until
# a certain condition is met that stops them
list(itertools.dropwhile(testFunction, vals))

[40, 50, 40, 30]

In [42]:
list(itertools.takewhile(testFunction, vals))

[10, 20, 30]
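
It can help to compare both functions with `filter`, which tests every item instead of stopping at the first failure. This sketch reuses the same values with a predicate named `below_40` (a name chosen here for illustration).

```python
import itertools

vals = [10, 20, 30, 40, 50, 40, 30]

def below_40(x):
    return x < 40

taken = list(itertools.takewhile(below_40, vals))
dropped = list(itertools.dropwhile(below_40, vals))

# filter keeps the trailing 30 that takewhile never reaches
filtered = list(filter(below_40, vals))
```

Together, `taken` and `dropped` split the original sequence at the first value that fails the test.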

## Advanced Functions

### Variable Arguments

Python functions support variable argument lists and this makes it possible to build functions that have a high degree of flexibility by accepting different numbers of parameters.  

A good example of this might be an addition function that adds up the parameters passed to it. It would be pretty inconvenient to require callers of this function to conform to putting all the numbers into a list, so defining the function to accept a variable list of parameters is a better way to go. The variable argument parameter, written with a leading asterisk such as `*args`, has to come after all the other positional parameters that the function defines.  
  
There is a potential drawback to using variable argument lists: if you decide later to change the function signature to add more positional parameters, then all of the callers of your function will also have to change.

In [43]:
# define a function that takes variable arguments
def addition(base, *args):
    # start from the required first argument, then add the rest
    result = base
    for arg in args:
        result += arg
    return result

In [44]:
addition(5, 10, 15, 20)

50

In [45]:
addition(1, 2, 3)

6

In [46]:
# pass an existing list
myNums = [5, 10, 15, 20]
addition(*myNums)

50
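
The drawback mentioned earlier, that adding positional parameters affects every caller, can be sketched as follows. Both functions and the `scale` parameter are hypothetical, invented only to show how an unchanged call silently takes on a new meaning.

```python
def total_v1(base, *args):
    # original signature: everything is summed
    return base + sum(args)


def total_v2(base, scale, *args):
    # a new positional parameter was inserted before *args
    return scale * (base + sum(args))


# the same call now binds 10 to 'scale' instead of adding it
old_result = total_v1(5, 10, 15, 20)
new_result = total_v2(5, 10, 15, 20)
```

No error is raised, which is what makes this kind of signature change risky: `old_result` is 50 while `new_result` is 400.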

### Lambda Functions

Lambda functions can be passed as arguments to other functions to perform some processing work. Typically, you see these used in situations where defining a whole separate function would needlessly increase the complexity of the code and reduce readability.  

Lambdas are defined by using the keyword lambda followed by any arguments that the lambda function takes, and then followed by an expression.

In [47]:
def CelsiusToFahrenheit(temp):
    return (temp * 9/5) + 32


def FahrenheitToCelsius(temp):
    return (temp-32) * 5/9

In [48]:
ctemps = [0, 12, 34, 100]
ftemps = [32, 65, 100, 212]

In [49]:
# Use regular functions to convert temps
list(map(FahrenheitToCelsius, ftemps))

[0.0, 18.333333333333332, 37.77777777777778, 100.0]

In [50]:
list(map(CelsiusToFahrenheit, ctemps))

[32.0, 53.6, 93.2, 212.0]

In [51]:
# Use lambdas to accomplish the same thing
list(map(lambda t: (t-32) * 5/9, ftemps))

[0.0, 18.333333333333332, 37.77777777777778, 100.0]

In [52]:
list(map(lambda t: (t * 9/5) + 32, ctemps))

[32.0, 53.6, 93.2, 212.0]
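
Lambdas are also commonly passed as the `key` argument of `sorted`, `min`, and `max`. A brief sketch with a made-up word list:

```python
words = ["banana", "Kiwi", "apple", "Cherry"]

# sort by length; ties keep their original order
by_length = sorted(words, key=lambda w: len(w))

# sort alphabetically, ignoring case
ignoring_case = sorted(words, key=lambda w: w.lower())

# pick the longest word (the first one wins a tie)
longest = max(words, key=lambda w: len(w))
```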

### Keyword Arguments

Python provides a way of specifying function arguments using keywords.

For example, you can define a function that takes positional parameters along with parameters that have default values. When you call the function, you can specify values by position or by keyword. In some cases, however, you may want to require that callers specify certain arguments by keyword only, to make the calling code more readable.

Suppose we have a function that performs a critical operation and provides an option to suppress exceptions. One way to write this function is to declare a regular parameter with a default value. The problem with this approach is that the option can be set just by passing a positional argument. Since this parameter has a significant effect on how the program runs, it is better to require that it be specified by keyword. This way, the caller is aware of the significance of the parameter, and others who read the code can easily see and understand what's happening.

To accomplish this in Python 3, place a single asterisk character after your positional parameters. Every parameter that follows the asterisk is keyword-only.

In [53]:
# Demonstrate the use of keyword-only arguments
# use keyword-only arguments to help ensure code clarity
def myFunction(arg1, arg2, *, suppressExceptions=False):
    print(arg1, arg2, suppressExceptions)

In [54]:
# try to call the function without the keyword
myFunction(1, 2, True)

TypeError: myFunction() takes 2 positional arguments but 3 were given

In [55]:
myFunction(1, 2, suppressExceptions=True)

1 2 True
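
For comparison, here is a sketch of the two designs side by side. The function names `run_task` and `run_task_strict` are hypothetical; the first accepts the flag positionally, the second does not.

```python
def run_task(name, suppress_exceptions=False):
    # regular parameter: callers may pass it by position
    return (name, suppress_exceptions)


def run_task_strict(name, *, suppress_exceptions=False):
    # keyword-only parameter: callers must name it
    return (name, suppress_exceptions)


# legal, but a reader cannot tell what True means
unclear = run_task("backup", True)

# the intent is visible at the call site
clear = run_task_strict("backup", suppress_exceptions=True)

# passing the flag by position is rejected
try:
    run_task_strict("backup", True)
    rejected = False
except TypeError:
    rejected = True
```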


## Collections

### Named Tuple

Suppose I wanted to define a data structure to represent a geometric point on a typical x and y axis. I could easily do this by defining a regular tuple with two elements, the x and y values of the point, and use positional indexes to access each one. The drawback is that an index such as `p[0]` says nothing about what the value means, and nothing identifies the tuple as a point.

Namedtuples help to solve this problem by assigning meaning to each of the values, along with the tuple itself. They also provide some helpful functions for working with them.

In [56]:
from collections import namedtuple

In [58]:
# create a Point namedtuple
Point = namedtuple("Point", "x y")

In [59]:
p1 = Point(10, 20)
p1

Point(x=10, y=20)

In [60]:
p2 = Point(30, 40)
p2

Point(x=30, y=40)

In [61]:
(p1.x, p1.y)

(10, 20)

In [62]:
p1, p2

(Point(x=10, y=20), Point(x=30, y=40))

In [63]:
# use _replace to create a new instance
p1 = p1._replace(x=100)
p1

Point(x=100, y=20)
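
A few of the helper functions mentioned above, sketched on the same kind of `Point` type. The sketch also shows that a namedtuple is still a tuple: it unpacks and indexes normally, and its fields cannot be reassigned.

```python
from collections import namedtuple

Point = namedtuple("Point", "x y")
p = Point(10, 20)

fields = Point._fields          # the field names, in order
as_dict = p._asdict()           # field names mapped to values
from_list = Point._make([3, 4]) # build a Point from any iterable

# still a regular tuple underneath
x, y = p
first = p[0]

# fields are read-only; use _replace to get a modified copy
try:
    p.x = 99
    immutable = False
except AttributeError:
    immutable = True
```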

### Default Dictionary

The collections module provides dictionary subclasses that help with common scenarios where a regular dictionary would need extra boilerplate code. It's fairly common to use dictionaries to keep track of data, such as the result of counting operations. With a regular dictionary, you have to check whether a key exists before updating its value. A `defaultdict` instead calls a factory function, such as `int`, to create a default value the first time a missing key is accessed.

If you have a situation where the fact that a key is missing from the dictionary is an important indicator, then `defaultdict` is probably not the right collection to use. In other situations, however, it can make your code simpler and easier to read and test.

In [64]:
from collections import defaultdict

In [66]:
# define a list of items that we want to count
fruits = ['apple', 'pear', 'orange', 'banana',
          'apple', 'grape', 'banana', 'banana']

In [67]:
# use a dictionary to count each element
fruitCounter = defaultdict(int)

In [68]:
# Count the elements in the list
for fruit in fruits:
    fruitCounter[fruit] += 1
    
# print the result
for (k, v) in fruitCounter.items():
    print(k + ": " + str(v))

apple: 2
pear: 1
orange: 1
banana: 3
grape: 1
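
To show what `defaultdict` saves, this sketch counts a shortened fruit list both ways. It also illustrates the caveat about missing keys: simply reading a key that does not exist creates it. The `"kiwi"` key and the grouping by first letter are assumptions added for illustration.

```python
from collections import defaultdict

fruits = ["apple", "pear", "apple", "banana"]

# regular dictionary: the key must be initialized first
plain = {}
for fruit in fruits:
    if fruit not in plain:
        plain[fruit] = 0
    plain[fruit] += 1

# defaultdict(int): missing keys start at int(), which is 0
counts = defaultdict(int)
for fruit in fruits:
    counts[fruit] += 1

same = (plain == dict(counts))

# reading a missing key inserts it with the default value
had_kiwi = "kiwi" in counts
counts["kiwi"]
has_kiwi = "kiwi" in counts

# any zero-argument callable works as the factory, e.g. list
by_letter = defaultdict(list)
for fruit in fruits:
    by_letter[fruit[0]].append(fruit)
```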


### Counters

The collections module supplies a `Counter` class, which is a dictionary subclass for counting hashable objects. Compared to `defaultdict`, a `Counter` has some nice additional features for working with counts of items.

In [69]:
from collections import Counter

In [70]:
# list of students in class 1
class1 = ["Bob", "James", "Chad", "Darcy", "Penny", "Hannah",
          "Kevin", "James", "Melanie", "Becky", "Steve", "Frank"]

# list of students in class 2
class2 = ["Bill", "Barry", "Cindy", "Debbie", "Frank",
          "Gabby", "Kelly", "James", "Joe", "Sam", "Tara", "Ziggy"]

# Create a Counter for class1 and class2
c1 = Counter(class1)
c2 = Counter(class2)

In [71]:
# How many students in class 1 named James?
c1["James"]

2

In [72]:
# How many students are in class 1?
sum(c1.values())

12

In [73]:
# Combine the two classes
c1.update(class2)
sum(c1.values())

24

In [74]:
# What's the most common name in the two classes?
c1.most_common(3)

[('James', 3), ('Frank', 2), ('Bob', 1)]

In [75]:
# Separate the classes again
c1.subtract(class2)
c1.most_common(1)

[('James', 2)]

In [76]:
# What's common between the two classes?
c1 & c2

Counter({'James': 1, 'Frank': 1})
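
Some of the other `Counter` features can be sketched on small, made-up inputs. One point worth noting is the difference between the `-` operator, which discards counts that fall to zero or below, and the `subtract()` method, which keeps them.

```python
from collections import Counter

# count the characters of a string
letters = Counter("mississippi")
top_two = letters.most_common(2)

# a missing key reports 0 instead of raising KeyError
absent = letters["z"]

a = Counter(a=3, b=1)
b = Counter(a=1, b=2)

added = a + b        # counts are summed
difference = a - b   # non-positive results are dropped
union = a | b        # maximum of each count

# subtract() works in place and keeps negative counts
in_place = Counter(a)
in_place.subtract(b)
```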

### Ordered Dictionary

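Before Python 3.7, the regular dictionary object did not guarantee any order among its items; since 3.7, a regular `dict` preserves insertion order as well. The `OrderedDict` is a dictionary subclass that remembers the order in which items are inserted and adds order-aware features, such as removing items from either end with `popitem()` and treating order as significant when two `OrderedDict` objects are compared. Because it is a subclass of `dict`, you can substitute an `OrderedDict` anywhere you would use a regular dictionary.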

In [77]:
from collections import OrderedDict

In [78]:
# list of sport teams with wins and losses
sportTeams = [("Royals", (18, 12)), ("Rockets", (24, 6)), 
            ("Cardinals", (20, 10)), ("Dragons", (22, 8)),
            ("Kings", (15, 15)), ("Chargers", (20, 10)), 
            ("Jets", (16, 14)), ("Warriors", (25, 5))]

In [79]:
# sort the teams by number of wins
sortedTeams = sorted(sportTeams, key=lambda t: t[1][0], reverse=True)

In [80]:
# create an ordered dictionary of the teams
teams = OrderedDict(sortedTeams)
teams

OrderedDict([('Warriors', (25, 5)),
             ('Rockets', (24, 6)),
             ('Dragons', (22, 8)),
             ('Cardinals', (20, 10)),
             ('Chargers', (20, 10)),
             ('Royals', (18, 12)),
             ('Jets', (16, 14)),
             ('Kings', (15, 15))])

In [81]:
# Use popitem with last=False to remove the first (top) item
tm, wl = teams.popitem(last=False)
print("Top team: ", tm, wl)

Top team:  Warriors (25, 5)


In [82]:
# What are the next top 4 teams?
for i, team in enumerate(teams, start=1):
    print(i, team)
    if i == 4:
        break

1 Rockets
2 Dragons
3 Cardinals
4 Chargers


In [83]:
# test for equality
a = OrderedDict({"a": 1, "b": 2, "c": 3})
b = OrderedDict({"a": 1, "c": 3, "b": 2})
print("Equality test: ", a == b)

Equality test:  False
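
The order-sensitive comparison applies only when both sides are `OrderedDict` objects. The sketch below contrasts that with regular dictionaries and also shows `move_to_end`, another order-aware method of `OrderedDict`.

```python
from collections import OrderedDict

a = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
b = OrderedDict([("a", 1), ("c", 3), ("b", 2)])

ordered_equal = (a == b)            # order matters here
plain_equal = (dict(a) == dict(b))  # order is ignored
mixed_equal = (a == dict(b))        # also ignores order

# move a key to the end, then another to the front
a.move_to_end("a")
after_end = list(a)
a.move_to_end("c", last=False)
after_front = list(a)
```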


## Logging

### Basic Logging

The logging module provides five methods for recording log messages, one per severity level:

* Debug
* Info
* Warning
* Error
* Critical

Each of these methods corresponds to a particular type of message, used to indicate a different kind of application status.

* Debug messages are typically used to provide diagnostic information that's useful when you are trying to track down a problem.
* Info messages are usually used to indicate that a particular operation of interest was able to complete normally.
* Warning messages indicate that something unexpected happened or that a more serious problem might be approaching, such as running out of storage space or the inability to communicate with a remote server.
* Error messages indicate that a particular operation was unable to complete successfully.
* Critical messages indicate that the program has suffered a serious error and might not be able to continue running.

In [84]:
# use the built-in logging module
import logging

In [85]:
# Use basicConfig to configure logging
# this is only executed once, subsequent calls to
# basicConfig will have no effect
logging.basicConfig(level=logging.DEBUG,
                    filemode="w",
                    filename="output.log")

In [86]:
# Try out each of the log levels
logging.debug("This is a debug-level log message")
logging.info("This is an info-level log message")
logging.warning("This is a warning-level message")
logging.error("This is an error-level message")
logging.critical("This is a critical-level message")

In [87]:
# Output formatted string to the log
logging.info("Here's a {} variable and an int: {}".format("string", 10))
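
The configured level acts as a threshold: messages below it are discarded. The following sketch makes that visible by writing to an in-memory stream instead of `output.log`. The logger name `levels_demo` is an arbitrary choice for this example, and a named logger is used so that the root configuration is left alone.

```python
import io
import logging

# capture log output in memory so it can be inspected
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

logger = logging.getLogger("levels_demo")
logger.setLevel(logging.WARNING)   # threshold: WARNING and above
logger.addHandler(handler)
logger.propagate = False           # keep messages away from the root logger

logger.debug("diagnostic detail")              # below threshold, dropped
logger.info("operation completed")             # below threshold, dropped
logger.warning("storage space is running low")
logger.error("could not save the file")
# %-style arguments are merged only if the message is emitted
logger.critical("Here's a %s variable and an int: %d", "string", 10)

lines = stream.getvalue().splitlines()
```

Only the warning, error, and critical messages reach the stream.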

### Custom Logging

The `basicConfig` function takes two additional arguments, `format` and `datefmt`. The `format` argument specifies a string that controls the precise formatting of the output message that is sent to the log.

The `datefmt` argument is used in conjunction with the `format` argument. If the format string contains a date specifier, such as `%(asctime)s`, then `datefmt` is used to format the date using the same kind of formatting strings that you would pass to the `strftime` function.

The format string can contain a number of formatting tokens. For example, the `asctime` token is a human-readable time, the `filename` and `funcName` tokens are the file and function names where the log message originated, and so on.

In [88]:
# Already imported
# import logging

extData = {'user': 'noname@lol.com'}

def anotherFunction():
    logging.debug("This is a debug-level log message", extra=extData)

In [89]:
# set the output file and debug level, and
# use a custom formatting specification
# force=True (Python 3.8+) replaces the handlers installed by the
# earlier basicConfig call; without it this call would have no effect
fmtStr = "%(asctime)s: %(levelname)s: %(funcName)s Line:%(lineno)d User:%(user)s %(message)s"
dateStr = "%m/%d/%Y %I:%M:%S %p"
logging.basicConfig(filename="output.log",
                    level=logging.DEBUG,
                    format=fmtStr,
                    datefmt=dateStr,
                    force=True)

In [90]:
logging.info("This is an info-level log message", extra=extData)
logging.warning("This is a warning-level message", extra=extData)
anotherFunction()
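
Since the cells above write to a file, the formatted result is not shown. As a rough sketch of what such a format produces, the following applies a similar format string through `logging.Formatter` on an in-memory stream. The logger name `format_demo`, the shortened date format, and the example address are all assumptions made for illustration.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    fmt="%(asctime)s %(levelname)s: %(funcName)s User:%(user)s %(message)s",
    datefmt="%Y-%m-%d"))

logger = logging.getLogger("format_demo")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False

# 'user' is not a built-in token; it is supplied through 'extra'
ext_data = {"user": "someone@example.com"}

def another_function():
    logger.debug("inside a function", extra=ext_data)

logger.info("top level", extra=ext_data)
another_function()

lines = stream.getvalue().splitlines()
```

Each line begins with the date in the `datefmt` layout, followed by the level, the name of the calling function, the custom `user` value, and the message. Because the format refers to `%(user)s`, every logging call that uses it needs to pass `extra`.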

## Comprehensions

### List Comprehensions

In [91]:
# define two lists of numbers
evens = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
odds = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

In [92]:
# Perform a mapping and filter function on a list
evenSquared = list(map(lambda e: e**2, filter(lambda e: e > 4 and e < 16, evens)))
evenSquared

[36, 64, 100, 144, 196]

In [93]:
# Derive a new list of numbers from a given list
evenSquared = [e ** 2 for e in evens]
evenSquared

[4, 16, 36, 64, 100, 144, 196, 256, 324, 400]

In [94]:
# Limit the items operated on with a predicate condition
oddSquared = [e ** 2 for e in odds if e > 3 and e < 17]
oddSquared

[25, 49, 81, 121, 169, 225]
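
A comprehension with a condition is equivalent to the `map` and `filter` combination shown earlier, and to an explicit loop. This sketch writes the same computation all three ways, using a chained comparison for the range test.

```python
evens = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

# map + filter with lambdas
via_map = list(map(lambda e: e ** 2,
                   filter(lambda e: 4 < e < 16, evens)))

# the same thing as a list comprehension
via_comprehension = [e ** 2 for e in evens if 4 < e < 16]

# and as an explicit loop
via_loop = []
for e in evens:
    if 4 < e < 16:
        via_loop.append(e ** 2)
```

All three produce the same list; the comprehension usually reads most directly.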

### Dictionary Comprehensions

In [95]:
# define a list of temperature values
ctemps = [0, 12, 34, 100]

In [96]:
# Use a comprehension to build a dictionary
tempDict = {t: (t * 9/5) + 32 for t in ctemps if t < 100}
tempDict

{0: 32.0, 12: 53.6, 34: 93.2}

In [97]:
tempDict[12]

53.6

In [98]:
# Merge two dictionaries with a comprehension
team1 = {"Jones": 24, "Jameson": 18, "Smith": 58, "Burns": 7}
team2 = {"White": 12, "Macke": 88, "Perce": 4}
newTeam = {k: v for team in (team1, team2) for k, v in team.items()}
newTeam

{'Jones': 24,
 'Jameson': 18,
 'Smith': 58,
 'Burns': 7,
 'White': 12,
 'Macke': 88,
 'Perce': 4}

### Set Comprehensions

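Sets in Python contain unique values; that is, each value in a given set can occur only once. A set comprehension uses the same syntax as a list comprehension, but with curly braces, and discards duplicates as it builds the result.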

In [99]:
# define a list of temperature data points
ctemps = [5, 10, 12, 14, 10, 23, 41, 30, 12, 24, 12, 18, 29]

In [100]:
# build a set of unique Fahrenheit temperatures
ftemps1 = [(t * 9/5) + 32 for t in ctemps]
ftemps2 = {(t * 9/5) + 32 for t in ctemps}

In [101]:
ftemps1

[41.0, 50.0, 53.6, 57.2, 50.0, 73.4, 105.8, 86.0, 53.6, 75.2, 53.6, 64.4, 84.2]

In [102]:
ftemps2

{41.0, 50.0, 53.6, 57.2, 64.4, 73.4, 75.2, 84.2, 86.0, 105.8}

In [103]:
# build a set from an input source
sTemp = "The quick brown fox jumped over the lazy dog"
chars = {c.upper() for c in sTemp if not c.isspace()}
chars

{'A',
 'B',
 'C',
 'D',
 'E',
 'F',
 'G',
 'H',
 'I',
 'J',
 'K',
 'L',
 'M',
 'N',
 'O',
 'P',
 'Q',
 'R',
 'T',
 'U',
 'V',
 'W',
 'X',
 'Y',
 'Z'}
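
Once the characters are in a set, set operations apply directly. As a small sketch, subtracting the result from the full uppercase alphabet shows which letters the sentence never uses.

```python
import string

sTemp = "The quick brown fox jumped over the lazy dog"
chars = {c.upper() for c in sTemp if not c.isspace()}

# letters of the alphabet that do not appear in the sentence
missing = set(string.ascii_uppercase) - chars
unique_count = len(chars)
```

Because the sentence says "jumped" rather than "jumps", the letter S is the only one missing.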