**WORKING WITH EXTERNAL LIBRARIES**

Imports, operator overloading, and survival tips for venturing into the world of external libraries


**Imports**

One of the best things about Python (especially if you're a data scientist) is the vast number of high-quality custom libraries that have been written for it. 
Some of these libraries are in the "standard library", meaning you can find them anywhere you run Python. Other libraries can be easily added, even if they aren't always shipped with Python. 
Either way, we'll access this code with imports.


In [1]:
import math

In [3]:
print("It's math! It has type {}".format(type(math)))

It's math! It has type <class 'module'>


math is a module. A module is just a collection of variables (a namespace, if you like) defined by someone else. We can see all the names in math using the built-in function dir().
--> A module in Python is like a toolbox: it holds many useful functions and constants grouped under a single name.
In this case, math contains mathematical functions that are not loaded by default, but that you can use once you import the module.

In [5]:
# dir() shows every name (functions, constants, internal variables...)
# defined inside the math module.
print(dir(math))

['__doc__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'cbrt', 'ceil', 'comb', 'copysign', 'cos', 'cosh', 'degrees', 'dist', 'e', 'erf', 'erfc', 'exp', 'exp2', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'isqrt', 'lcm', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'nextafter', 'perm', 'pi', 'pow', 'prod', 'radians', 'remainder', 'sin', 'sinh', 'sqrt', 'sumprod', 'tan', 'tanh', 'tau', 'trunc', 'ulp']


We can access these variables using dot syntax. Some of them refer to simple values, like math.pi

In [11]:
print("This is pi to 4 significant digits = {:.4}".format(math.pi))

This is pi to 4 significant digits = 3.142
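
The format spec {:.4} used above asks for 4 significant digits, which is not the same thing as 4 decimal places. A small sketch of the difference, using only the standard library:

```python
import math

# '.4' with no type code means 4 significant digits
sig_digits = "{:.4}".format(math.pi)

# '.4f' means fixed-point notation with 4 digits after the decimal point
decimals = "{:.4f}".format(math.pi)

print(sig_digits)  # 3.142
print(decimals)    # 3.1416
```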


But most of what we'll find in the module are functions, like math.log:

In [13]:
math.log(32, 2)

5.0

If we don't know what math.log does, we can call help() on it:

In [15]:
help(math.log)

Help on built-in function log in module math:

log(...)
    log(x, [base=math.e])
    Return the logarithm of x to the given base.

    If the base is not specified, returns the natural logarithm (base e) of x.
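
To make the optional base argument concrete, here is a minimal sketch that calls math.log with and without a base, next to the dedicated math.log2 function. The variable names are only for illustration.

```python
import math

natural = math.log(math.e)   # no base given, so this is the natural logarithm
base_two = math.log(32, 2)   # explicit base passed as the second argument
dedicated = math.log2(32)    # dedicated base-2 function from the same module

print(natural, base_two, dedicated)
```

All three values are floats; the first should be 1 and the other two should be 5, up to floating-point rounding.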



We can also call help() on the module itself. This will give us the combined documentation for all the functions and values in the module (as well as a high-level description of the module).

In [17]:
help(math)

Help on built-in module math:

NAME
    math

DESCRIPTION
    This module provides access to the mathematical functions
    defined by the C standard.

FUNCTIONS
    acos(x, /)
        Return the arc cosine (measured in radians) of x.

        The result is between 0 and pi.

    acosh(x, /)
        Return the inverse hyperbolic cosine of x.

    asin(x, /)
        Return the arc sine (measured in radians) of x.

        The result is between -pi/2 and pi/2.

    asinh(x, /)
        Return the inverse hyperbolic sine of x.

    atan(x, /)
        Return the arc tangent (measured in radians) of x.

        The result is between -pi/2 and pi/2.

    atan2(y, x, /)
        Return the arc tangent (measured in radians) of y/x.

        Unlike atan(y/x), the signs of both x and y are considered.

    atanh(x, /)
        Return the inverse hyperbolic tangent of x.

    cbrt(x, /)
        Return the cube root of x.

    ceil(x, /)
        Return the ceiling of x as an Integral.

        This i

**Other import syntax**

If we know we'll be using functions in math frequently we can import it under a shorter alias to save some typing. 
(You may have seen code that does this with certain popular libraries like Pandas, Numpy, Tensorflow, or Matplotlib. For example, it's a common convention to import numpy as np and import pandas as pd.)

In [19]:
import math as mt
mt.pi

3.141592653589793

The as simply renames the imported module. It's equivalent to doing something like:
```
import math
mt = math
```
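
One way to check that the alias really is just another name for the same module is to compare the two names with is. A minimal sketch:

```python
import math
import math as mt

# Both names are bound to the very same module object
same_object = mt is math
print(same_object)  # True

# So anything reachable through math is reachable through mt
print(mt.sqrt(16))  # 4.0
```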

Wouldn't it be great if we could refer to all the variables in the math module by themselves? i.e. if we could just refer to pi instead of math.pi or mt.pi? Good news: we can do that.

In [21]:
from math import *
print(pi, log(32, 2))

3.141592653589793 5.0


import * makes **all the module's variables directly accessible** to you (without any dotted prefix).
Bad news: some purists might grumble at you for doing this.

Worse: they kind of have a point.

In [23]:
from math import *
from numpy import *
print(pi, log(32, 2))

TypeError: return arrays must be of ArrayType

These kinds of "star imports" can occasionally lead to weird, difficult-to-debug situations.

The problem in this case is that the math and numpy modules both have functions called log, but they have different semantics. Because we import from numpy second, its log overwrites (or "shadows") the log variable we imported from math.
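
The same shadowing effect can be reproduced with two standard-library modules, which avoids needing numpy installed. The sketch below assumes math and cmath as the pair (both define sqrt) and uses explicit imports instead of star imports so that each rebinding is visible:

```python
from math import sqrt
first = sqrt(4)          # math.sqrt returns a float

from cmath import sqrt   # rebinds the name sqrt, shadowing math's version
second = sqrt(4)         # cmath.sqrt returns a complex number

print(first)   # 2.0
print(second)  # (2+0j)
```

The call looks identical both times; only the most recent import decides which function runs.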

A good compromise is to import only the specific things we'll need from each module:

In [25]:
from math import log, pi
from numpy import asarray
print(pi, log(32, 2))

3.141592653589793 5.0


**Submodules**

We've seen that modules contain variables which can refer to functions or values. Something to be aware of is that they can also have variables referring to other modules.

In [27]:
import numpy
print("numpy.random is a", type(numpy.random))
print("it contains names such as...",
      dir(numpy.random)[-15:]
     )

numpy.random is a <class 'module'>
it contains names such as... ['set_bit_generator', 'set_state', 'shuffle', 'standard_cauchy', 'standard_exponential', 'standard_gamma', 'standard_normal', 'standard_t', 'test', 'triangular', 'uniform', 'vonmises', 'wald', 'weibull', 'zipf']


So if we import numpy as above, then calling a function in the random "submodule" will require two dots.

In [29]:
# Roll 10 dice (high is exclusive, so high=7 gives values from 1 to 6)
rolls = numpy.random.randint(low=1, high=7, size=10)
rolls

array([3, 1, 2, 1, 5, 2, 3, 4, 2, 5])
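
The standard library has the same kind of nesting. As a rough sketch, os.path is a module reachable as an attribute of os, so calling one of its functions also takes two dots. The file and folder names here are made up for illustration.

```python
import os

# os.path is itself a module, stored as an attribute of os
kind = type(os.path).__name__
print(kind)  # module

# Two dots: outer module, inner module, function
joined = os.path.join("data", "rolls.csv")
print(joined)
```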

**Tools**

As you work with various libraries for specialized tasks, you'll find that they define their own types which you'll have to learn to work with. For example, if you work with the graphing library matplotlib, you'll be coming into contact with objects it defines which represent Subplots, Figures, TickMarks, and Annotations. pandas functions will give you DataFrames and Series.

**Three tools for understanding strange objects**

In the cell above, we saw that calling a numpy function gave us an "array". We've never seen anything like this before (not in this course anyways). But don't panic: we have three familiar builtin functions to help us here.

**1: type()** --> 'what is this thing?'

In [31]:
type(rolls)

numpy.ndarray

**2: dir()** --> 'what can i do with it?'

In [33]:
print(dir(rolls))

['T', '__abs__', '__add__', '__and__', '__array__', '__array_finalize__', '__array_function__', '__array_interface__', '__array_prepare__', '__array_priority__', '__array_struct__', '__array_ufunc__', '__array_wrap__', '__bool__', '__buffer__', '__class__', '__class_getitem__', '__complex__', '__contains__', '__copy__', '__deepcopy__', '__delattr__', '__delitem__', '__dir__', '__divmod__', '__dlpack__', '__dlpack_device__', '__doc__', '__eq__', '__float__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__', '__iadd__', '__iand__', '__ifloordiv__', '__ilshift__', '__imatmul__', '__imod__', '__imul__', '__index__', '__init__', '__init_subclass__', '__int__', '__invert__', '__ior__', '__ipow__', '__irshift__', '__isub__', '__iter__', '__itruediv__', '__ixor__', '__le__', '__len__', '__lshift__', '__lt__', '__matmul__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', 

In [35]:
# If I want the average roll, the "mean" method looks promising...
rolls.mean()

2.8

In [37]:
# Or maybe I just want to turn the array into a list, in which case I can use "tolist"
rolls.tolist()

[3, 1, 2, 1, 5, 2, 3, 4, 2, 5]

**3: help()** --> 'tell me more'

In [39]:
help(rolls.ravel)

Help on built-in function ravel:

ravel(...) method of numpy.ndarray instance
    a.ravel([order])

    Return a flattened array.

    Refer to `numpy.ravel` for full documentation.

    See Also
    --------
    numpy.ravel : equivalent function

    ndarray.flat : a flat iterator on the array.



In [None]:
# Okay, just tell me everything there is to know about numpy.ndarray
help(rolls)
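
The same three tools work on any unfamiliar object, not just numpy arrays. Here is a sketch that applies them to a Fraction from the standard library. Filtering out the underscore names is just one convenient way to shorten the dir() output.

```python
from fractions import Fraction

half = Fraction(1, 2)

# 1: what is this thing?
print(type(half))

# 2: what can I do with it? (skip names that start with an underscore)
public_names = [name for name in dir(half) if not name.startswith("_")]
print(public_names)

# 3: tell me more about one method that looks interesting
help(half.limit_denominator)
```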

**Operator overloading**

Operator overloading is when an operator such as +, -, *, or == behaves differently depending on the type of data it is applied to.
```
# For numbers
print(3 + 4)  # 7

# For strings
print("Hello" + " World")  # "Hello World"

# For lists
print([1, 2] + [3, 4])  # [1, 2, 3, 4]
```

In [43]:
[3, 4, 1, 2, 2, 1] + 10

TypeError: can only concatenate list (not "int") to list

In [45]:
rolls + 10

array([13, 11, 12, 11, 15, 12, 13, 14, 12, 15])

We might think that Python strictly polices how pieces of its core syntax behave such as +, <, in, ==, or square brackets for indexing and slicing. But in fact, it takes a very hands-off approach. When you define a new type, you can choose how addition works for it, or what it means for an object of that type to be equal to something else.

The designers of lists decided that adding them to numbers wasn't allowed. The designers of numpy arrays went a different way (adding the number to each element of the array).
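
For comparison, this is roughly how the same "add 10 to every roll" result could be obtained from a plain list, where + only means concatenation. The list literal simply copies the rolls shown earlier.

```python
rolls_list = [3, 1, 2, 1, 5, 2, 3, 4, 2, 5]

# Lists don't add element by element, so spell the loop out
shifted = [roll + 10 for roll in rolls_list]
print(shifted)  # [13, 11, 12, 11, 15, 12, 13, 14, 12, 15]

# The + that lists do support joins two lists together
longer = rolls_list + [10]
print(len(longer))  # 11
```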

Here are a few more examples of how numpy arrays interact unexpectedly with Python operators (or at least differently from lists).

In [47]:
# At which indices are the dice less than or equal to 3?
rolls <= 3

array([ True,  True,  True,  True, False,  True,  True, False,  True,
       False])

In [53]:
# Named my_list so that the built-in name list is not overwritten
my_list = [1,2,3,4,5]
my_list<=3

TypeError: '<=' not supported between instances of 'list' and 'int'

In [55]:
xlist = [[1,2,3],[2,4,6],]
# Create a 2-dimensional array
x = numpy.asarray(xlist)
print("xlist = {}\nx =\n{}".format(xlist, x))

xlist = [[1, 2, 3], [2, 4, 6]]
x =
[[1 2 3]
 [2 4 6]]


In [57]:
# Get the last element of the second row of our numpy array
x[1,-1]

6

In [63]:
# Get the last element of the second sublist of our nested list?
xlist[1,-1]

TypeError: list indices must be integers or slices, not tuple

numpy's ndarray type is specialized for working with multi-dimensional data, so it defines its own logic for indexing, allowing us to index by a tuple to specify the index at each dimension.

What is an ndarray?
It is NumPy's core data structure. The name stands for N-dimensional array.
Think of it as a list, but far more powerful: it can be 1D (like an ordinary list), 2D (like a table), 3D (like a stack of tables), and so on.
NumPy defines its own indexing logic, which lets you access elements with a tuple *giving the position along each dimension*.
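
To see what "index by a tuple" means without numpy, the sketch below uses a hypothetical ShowIndex class whose only job is to hand back whatever was written between the square brackets. It also shows the nested-list way of reaching the same element.

```python
class ShowIndex:
    """Hypothetical helper: returns whatever index it is given."""
    def __getitem__(self, index):
        return index

probe = ShowIndex()
print(probe[1, -1])   # (1, -1)  a comma inside [] builds a tuple
print(probe[1])       # 1

# A plain nested list has to be indexed one level at a time
xlist = [[1, 2, 3], [2, 4, 6]]
print(xlist[1][-1])   # 6
```

Lists reject the tuple, while a type like ndarray can choose to interpret it as one position per dimension.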

*When does 1 + 1 not equal 2?*
Things can get weirder than this. You may have heard of (or even used) tensorflow, a Python library popularly used for deep learning. It makes extensive use of operator overloading.

In [None]:
import tensorflow as tf
# Create two constants, each with value 1
a = tf.constant(1)
b = tf.constant(1)
# Add them together to get...
a + b

#<tf.Tensor: shape=(), dtype=int32, numpy=2>

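a + b isn't the Python int 2. It is a tensorflow Tensor. In the eager mode that recent versions use by default, the Tensor wraps the value 2, as the output above shows. In older graph-mode versions it was (to quote tensorflow's documentation)...
"a symbolic handle to one of the outputs of an Operation. It does not hold the values of that operation's output, but instead provides a means of computing those values in a TensorFlow tf.Session."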
It's important just to be aware of the fact that this sort of thing is possible and that libraries will often use operator overloading in non-obvious or magical-seeming ways.
Understanding how Python's operators work when applied to ints, strings, and lists is no guarantee that you'll be able to immediately understand what they do when applied to a tensorflow Tensor, or a numpy ndarray, or a pandas DataFrame.
Once you've had a little taste of DataFrames, for example, an expression like the one below starts to look appealingly intuitive:






In [None]:
# Get the rows with population over 1m in South America
df[(df['population'] > 10**6) & (df['continent'] == 'South America')]

But why does it work? The example above features something like 5 different overloaded operators. What's each of those operations doing? It can help to know the answer when things start going wrong.

*Curious how it all works?*
Have you ever called help() or dir() on an object and wondered what the heck all those names with the double-underscores were?

In [84]:
print(dir(list))

['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']


This turns out to be directly related to operator overloading.

When Python programmers want to define how operators behave on their types, they do so by implementing methods with special names beginning and ending with 2 underscores such as __lt__, __setattr__, or __contains__. Generally, names that follow this double-underscore format have a special meaning to Python.

So, for example, the expression x in [1, 2, 3] is actually calling the list method __contains__ behind-the-scenes. It's equivalent to (the much uglier) [1, 2, 3].__contains__(x).
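
As a rough illustration, the sketch below first checks that the operator form and the method form agree, then defines a toy type that mimics the element-by-element behaviour seen with numpy arrays. The Rolls class is hypothetical and far simpler than a real ndarray.

```python
# The operator form and the method form give the same answer
print(2 in [1, 2, 3])               # True
print([1, 2, 3].__contains__(2))    # True

class Rolls:
    """Hypothetical toy type that adds and compares element by element."""
    def __init__(self, values):
        self.values = list(values)

    def __add__(self, number):
        # Called for: rolls + number
        return Rolls(v + number for v in self.values)

    def __le__(self, number):
        # Called for: rolls <= number
        return [v <= number for v in self.values]

toy = Rolls([3, 1, 5])
print((toy + 10).values)   # [13, 11, 15]
print(toy <= 3)            # [True, True, False]
```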

If you're curious to learn more, you can check out Python's official documentation, which describes many, many more of these special "underscores" methods.

We won't be defining our own types in these lessons (if only there was time!), but I hope you'll get to experience the joys of defining your own wonderful, weird types later down the road.

**EXERCISES**

**Exercise 3**

Suppose we wanted to create a new type to represent hands in blackjack. One thing we might want to do with this type is overload the comparison operators like > and <= so that we could use them to check whether one hand beats another. e.g. it'd be cool if we could do this:

>>> hand1 = BlackjackHand(['K', 'A'])
>>> hand2 = BlackjackHand(['7', '10', 'A'])
>>> hand1 > hand2
True
Well, we're not going to do all that in this question (defining custom classes is a bit beyond the scope of these lessons), but the code we're asking you to write in the function below is very similar to what we'd have to write if we were defining our own BlackjackHand class. (We'd put it in the __gt__ magic method to define our custom behaviour for >.)

Fill in the body of the blackjack_hand_greater_than function according to the docstring:

In [71]:
def blackjack_hand_greater_than(hand_1, hand_2):
    """
    Return True if hand_1 beats hand_2, and False otherwise.
    
    In order for hand_1 to beat hand_2 the following must be true:
    - The total of hand_1 must not exceed 21
    - The total of hand_1 must exceed the total of hand_2 OR hand_2's total must exceed 21
    
    Hands are represented as a list of cards. Each card is represented by a string.
    
    When adding up a hand's total, cards with numbers count for that many points. Face
    cards ('J', 'Q', and 'K') are worth 10 points. 'A' can count for 1 or 11.
    
    When determining a hand's total, you should try to count aces in the way that 
    maximizes the hand's total without going over 21. e.g. the total of ['A', 'A', '9'] is 21,
    the total of ['A', 'A', '9', '3'] is 14.
    
    Examples:
    >>> blackjack_hand_greater_than(['K'], ['3', '4'])
    True
    >>> blackjack_hand_greater_than(['K'], ['10'])
    False
    >>> blackjack_hand_greater_than(['K', 'K', '2'], ['3'])
    False
    """
    # Point values for every card except aces
    dict_A = {"K":10, "Q":10, "J":10, "2":2, "3":3, "4":4, "5":5, "6":6, "7":7, "8":8, "9":9, "10":10}

    # Total of the non-ace cards in each hand
    list_A1 = sum([dict_A[i] for i in hand_1 if i in dict_A])
    list_A2 = sum([dict_A[i] for i in hand_2 if i in dict_A])

    # Count every ace as 1 first, then upgrade a single ace to 11 when that
    # does not bust the hand (two aces at 11 would already total 22)
    aces_1 = hand_1.count("A")
    list_A1 += aces_1
    if aces_1 > 0 and list_A1 + 10 <= 21:
        list_A1 += 10
    aces_2 = hand_2.count("A")
    list_A2 += aces_2
    if aces_2 > 0 and list_A2 + 10 <= 21:
        list_A2 += 10

    #Print total points of each hand
    print("Hand 1: {} \nHand 2: {}".format(list_A1, list_A2))
    
    #Result by comparing the two hands (with conditions indicated)
    result = list_A1 <=21 and (list_A1>list_A2 or list_A2>21)
    print("Result -->", result)
    return result
    
print(blackjack_hand_greater_than(['K'], ['3', '4']))
print(blackjack_hand_greater_than(['K'], ['10']))
print(blackjack_hand_greater_than(['K', 'K', '2'], ['3']))
print(blackjack_hand_greater_than(['A', 'A', '9'], ['A', 'A', '9', '3']))

Hand 1: 10 
Hand 2: 7
Result --> True
True
Hand 1: 10 
Hand 2: 10
Result --> False
False
Hand 1: 22 
Hand 2: 3
Result --> False
False
Hand 1: 21 
Hand 2: 14
Result --> True
True


In [75]:
#Solution:
def hand_total(hand):
    """Helper function to calculate the total points of a blackjack hand.
    """
    total = 0
    # Count the number of aces and deal with how to apply them at the end.
    aces = 0
    for card in hand:
        if card in ['J', 'Q', 'K']:
            total += 10
        elif card == 'A':
            aces += 1
        else:
            # Convert number cards (e.g. '7') to ints
            total += int(card)
    # At this point, total is the sum of this hand's cards *not counting aces*.

    # Add aces, counting them as 1 for now. This is the smallest total we can make from this hand
    total += aces
    # "Upgrade" aces from 1 to 11 as long as it helps us get closer to 21
    # without busting
    while total + 10 <= 21 and aces > 0:
        # Upgrade an ace from 1 to 11
        total += 10
        aces -= 1
    return total

def blackjack_hand_greater_than(hand_1, hand_2):
    total_1 = hand_total(hand_1)
    total_2 = hand_total(hand_2)
    return total_1 <= 21 and (total_1 > total_2 or total_2 > 21)


print(blackjack_hand_greater_than(['K'], ['3', '4']))
print(blackjack_hand_greater_than(['K'], ['10']))
print(blackjack_hand_greater_than(['K', 'K', '2'], ['3']))
print(blackjack_hand_greater_than(['A', 'A', '9'], ['A', 'A', '9', '3']))

True
False
False
True
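
For the curious, here is a sketch of how the same logic might look inside the BlackjackHand class imagined at the start of the exercise, with the comparison placed in __gt__. The class layout and the total method are assumptions made for illustration; the exercise itself only asks for the function.

```python
class BlackjackHand:
    """Hypothetical wrapper type; a sketch, not part of the exercise."""
    def __init__(self, cards):
        self.cards = cards

    def total(self):
        # Count every ace as 1, then upgrade one ace to 11 if that doesn't bust
        total = 0
        for card in self.cards:
            if card in ('J', 'Q', 'K'):
                total += 10
            elif card == 'A':
                total += 1
            else:
                total += int(card)
        if 'A' in self.cards and total + 10 <= 21:
            total += 10
        return total

    def __gt__(self, other):
        # Called for: self > other
        mine, theirs = self.total(), other.total()
        return mine <= 21 and (mine > theirs or theirs > 21)

hand1 = BlackjackHand(['K', 'A'])
hand2 = BlackjackHand(['7', '10', 'A'])
print(hand1 > hand2)  # True
```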


**Exercise 2:**

In [35]:
def best_items(racers):
    """Given a list of racer dictionaries, return a dictionary mapping items to the number
    of times those items were picked up by racers who finished in first place.
    [
    {'name': 'Peach', 'items': ['green shell', 'banana', 'green shell',], 'finish': 3},
    {'name': 'Bowser', 'items': ['green shell',], 'finish': 1},
    # Sometimes the racer's name wasn't recorded
    {'name': None, 'items': ['mushroom',], 'finish': 2},
    {'name': 'Toad', 'items': ['green shell', 'mushroom'], 'finish': 1},
    ]
    """
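    # Named counts so that the built-in name dict is not overwritten
    counts = {}
    for racer in racers:
        # Only first-place finishers contribute to the tally
        if racer["finish"] == 1:
            for item in racer["items"]:
                if item in counts:
                    counts[item] += 1
                else:
                    counts[item] = 1
    return counts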
    
print(best_items([
    {'name': 'Peach', 'items': ['green shell', 'banana', 'green shell',], 'finish': 1},
    {'name': 'Bowser', 'items': ['green shell',], 'finish': 1},
    # Sometimes the racer's name wasn't recorded
    {'name': None, 'items': ['mushroom',], 'finish': 2},
    {'name': 'Toad', 'items': ['green shell', 'mushroom'], 'finish': 1},
    ]))

{'green shell': 4, 'banana': 1, 'mushroom': 1}
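
The standard library's collections.Counter can do the same tallying with less bookkeeping. The variant below is only a sketch; the name best_items_counter is made up, and the sample data repeats the call above.

```python
from collections import Counter

def best_items_counter(racers):
    """Hypothetical variant of best_items built on collections.Counter."""
    winners = Counter()
    for racer in racers:
        if racer["finish"] == 1:
            # update() adds one count for each item in the list
            winners.update(racer["items"])
    return dict(winners)

sample = [
    {'name': 'Peach', 'items': ['green shell', 'banana', 'green shell'], 'finish': 1},
    {'name': 'Bowser', 'items': ['green shell'], 'finish': 1},
    {'name': None, 'items': ['mushroom'], 'finish': 2},
    {'name': 'Toad', 'items': ['green shell', 'mushroom'], 'finish': 1},
]
result = best_items_counter(sample)
print(result)
```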
