## writing-functions-in-python/best-practices



## Best Practices
The goal of this course is to transform you into a Python expert, and so the first chapter starts off with best practices when writing functions. You'll cover docstrings and why they matter and how to know when you need to turn a chunk of code into a function. You will also learn the details of how Python passes arguments to functions, as well as some common gotchas that can cause debugging headaches when calling functions.

Docstrings       50 xp
Crafting a docstring       100 xp
Retrieving docstrings       100 xp
Docstrings to the rescue!       50 xp
DRY and "Do One Thing"       50 xp
Extract a function       100 xp
Split up a function       100 xp
Pass by assignment       50 xp
Mutable or immutable?       50 xp
Best practice for default arguments       100 xp
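
One of the gotchas this chapter covers is using a mutable object as a default argument. The sketch below is a minimal illustration; the function names `add_item_buggy` and `add_item` are made up for this note and do not come from the course.

```python
def add_item_buggy(item, bucket=[]):
    """Append item to bucket. The default list is created once and shared."""
    bucket.append(item)
    return bucket


def add_item(item, bucket=None):
    """Append item to bucket, creating a fresh list when none is given."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = add_item_buggy("a")
second = add_item_buggy("b")    # reuses the very same default list
print(first is second, second)  # True ['a', 'b']

print(add_item("a"), add_item("b"))  # ['a'] ['b']
```

Using `None` as the default and building the list inside the function gives every call its own list.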

## Context Managers
If you've ever seen the "with" keyword in Python and wondered what its deal was, then this is the chapter for you! Context managers are a convenient way to provide connections in Python and guarantee that those connections get cleaned up when you are done using them. This chapter will show you how to use context managers, as well as how to write your own.

Using context managers       50 xp
The number of cats       100 xp
The speed of cats       100 xp
Writing context managers       50 xp
The timer() context manager       100 xp
A read-only open() context manager       100 xp
Advanced topics       50 xp
Context manager use cases       50 xp
Scraping the NASDAQ       100 xp
Changing the working directory       100 xp
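
As a rough sketch of what "writing your own" context manager can look like, here is a small `timer()` built with `contextlib.contextmanager`. The body is an assumption written for this note, not the course's solution.

```python
import contextlib
import time


@contextlib.contextmanager
def timer():
    """Time the enclosed block and print the elapsed seconds."""
    start = time.perf_counter()
    try:
        # Control passes to the body of the `with` block here.
        yield
    finally:
        # Runs even if the body raises, which is the cleanup guarantee.
        elapsed = time.perf_counter() - start
        print("Elapsed: {:.4f}s".format(elapsed))


with timer():
    total = sum(range(100000))

print(total)  # 4999950000
```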

## Decorators
Decorators are an extremely powerful concept in Python. They allow you to modify the behavior of a function without changing the code of the function itself. This chapter will lay the foundational concepts needed to thoroughly understand decorators (functions as objects, scope, and closures), and give you a good introduction into how decorators are used and defined. This deep dive into Python internals will set you up to be a superstar Pythonista.

Functions are objects       50 xp
Building a command line data app       100 xp
Reviewing your co-worker's code       100 xp
Returning functions for a math game       100 xp
Scope       50 xp
Understanding scope       50 xp
Modifying variables outside local scope       100 xp
Closures       50 xp
Checking for closure       100 xp
Closures keep your values safe       100 xp
Decorators       50 xp
Using decorator syntax       100 xp
Defining a decorator       100 xp
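
The three building blocks named above (functions as objects, scope, closures) can be seen together in a very small decorator. This is an illustrative sketch; `double_args` and `multiply` are example names chosen for this note.

```python
def double_args(func):
    """Decorator that doubles both arguments before calling func."""
    def wrapper(a, b):
        # `func` is not local to wrapper; it is remembered through the closure.
        return func(a * 2, b * 2)
    # Functions are objects, so one can be returned like any other value.
    return wrapper


@double_args
def multiply(a, b):
    return a * b


print(multiply(1, 5))             # 20
print(len(multiply.__closure__))  # 1
```

The `@double_args` line is shorthand for `multiply = double_args(multiply)`, so the name `multiply` now refers to `wrapper`.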

## More on Decorators
Now that you understand how decorators work under the hood, this chapter gives you a bunch of real-world examples of when and how you would write decorators in your own code. You will also learn advanced decorator concepts like how to preserve the metadata of your decorated functions and how to write decorators that take arguments.

Real-world examples       50 xp
Print the return type       100 xp
Counter       100 xp
Decorators and metadata       50 xp
Preserving docstrings when decorating functions       100 xp
Measuring decorator overhead       100 xp
Decorators that take arguments       50 xp
Run_n_times()       100 xp
HTML Generator       100 xp
Timeout(): a real world example       50 xp
Tag your functions       100 xp
Check the return type       100 xp
Great job!       50 xp
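
A minimal sketch of the two advanced ideas mentioned above: a decorator that takes an argument, and `functools.wraps` to keep the decorated function's metadata. The body of `run_n_times()` is an assumption for illustration, and `record` is a made-up example function.

```python
import functools


def run_n_times(n):
    """Return a decorator that calls the decorated function n times."""
    def decorator(func):
        @functools.wraps(func)  # copy func's __name__ and __doc__ onto wrapper
        def wrapper(*args, **kwargs):
            result = None
            for _ in range(n):
                result = func(*args, **kwargs)
            return result
        return wrapper
    return decorator


calls = []


@run_n_times(3)
def record(value):
    """Append value to the calls list."""
    calls.append(value)
    return len(calls)


record("x")
print(calls)            # ['x', 'x', 'x']
print(record.__name__)  # record
print(record.__doc__)   # Append value to the calls list.
```

Without `functools.wraps`, `record.__name__` would be `wrapper` and `record.__doc__` would be `None`.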

## Docstrings



A docstring should use imperative language:
    for example: "Split the DataFrame and stack the columns."
    instead of: "This function will split the DataFrame and stack the columns."

'''
1. Description of what the function does.
2. Description of the arguments, if any.
3. Description of the return values, if any.
4. Description of the errors raised, if any.
5. Optional extra notes or examples of usage.
'''


In [None]:
import numpy as np
import pandas as pd


def split_and_stack(df, new_names):
    '''Split a DataFrame's columns into two halves and then stack
    them vertically, returning a new DataFrame with `new_names` as the
    column names.

    Args:
      df (DataFrame): The DataFrame to split.
      new_names (iterable of str): The column names for the new DataFrame.

    Returns:
      DataFrame
    '''
    # Integer division gives the index of the first column of the right half.
    half = len(df.columns) // 2
    left = df.iloc[:, :half]
    right = df.iloc[:, half:]
    return pd.DataFrame(
        data=np.vstack([left.values, right.values]), columns=new_names)




In [None]:
# Google style:


def function(arg_1, arg_2=42):
    '''Description of what the function does.

    Args:
      arg_1 (str): Description of arg_1 that can break onto the next line
        if needed.
      arg_2 (int, optional): Write optional when an argument has a default
        value.

    Returns:
      bool: Optional description of the return value.
      Extra lines are not indented.

    Raises:
      ValueError: Include any error types that the function intentionally
        raises.

    Notes:
      See https://www.datacamp.com/community/tutorials/docstrings-python
      for more info.
    '''
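
To make the template concrete, here is a small function documented in Google style, including a Raises section. `safe_divide` is a made-up example for this note.

```python
def safe_divide(numerator, denominator=1.0):
    """Divide numerator by denominator.

    Args:
      numerator (float): The number to divide.
      denominator (float, optional): The number to divide by.
        Defaults to 1.0.

    Returns:
      float: The quotient.

    Raises:
      ValueError: If `denominator` is zero.
    """
    if denominator == 0:
        raise ValueError("`denominator` must not be zero.")
    return numerator / denominator


print(safe_divide(6, 3))  # 2.0
```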

In [None]:
# Numpydoc style:


def function(arg_1, arg_2=42):
    '''
    Description of what the function does.

    Parameters
    ----------
    arg_1 : expected type of arg_1
      Description of arg_1.
    arg_2 : int, optional
      Write optional when an argument has a default value.
      Default=42.

    Returns
    -------
    The type of the return value
      Can include a description of the return value.
      Replace "Returns" with "Yields" if the function is a generator.
    '''
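
As an example of the "Yields" variant mentioned in the template, here is a small generator documented in Numpydoc style. `countdown` is a made-up example for this note.

```python
def countdown(start):
    """
    Count down from start to 1.

    Parameters
    ----------
    start : int
        The first number to yield.

    Yields
    ------
    int
        The next number in the countdown.
    """
    while start > 0:
        yield start
        start -= 1


print(list(countdown(3)))  # [3, 2, 1]
```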

In [5]:
# Retrieving docstrings


def the_answer():
    '''Return the answer to life, the universe, and everything
    
    
    Returns:
      int
      
    '''
    return 42



print(the_answer.__doc__)


import inspect
print(inspect.getdoc(the_answer))

Return the answer to life, the universe, and everything
    
    
    Returns:
      int
      
    
Return the answer to life, the universe, and everything


Returns:
  int
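
The two outputs above differ only in indentation. A short sketch of why: `inspect.getdoc()` cleans the docstring the same way `inspect.cleandoc()` does, removing the common leading whitespace and the blank lines at the start and end. Note that Python 3.13 and later already strip the common indentation when the function is compiled, so on those versions the raw `__doc__` looks much closer to the cleaned one.

```python
import inspect


def the_answer():
    """Return the answer to life, the universe, and everything.

    Returns:
      int
    """
    return 42


raw = the_answer.__doc__
clean = inspect.getdoc(the_answer)

# getdoc() gives the same result as running cleandoc() on the raw attribute.
print(inspect.cleandoc(raw) == clean)  # True
print(repr(clean.splitlines()[2]))     # 'Returns:'
# On Python 3.12 and earlier this line still carries the source indentation.
print(repr(raw.splitlines()[2]))
```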
  


## Crafting a docstring

You've decided to write the world's greatest open-source natural language processing Python package. It will revolutionize working with free-form text, the way numpy did for arrays, pandas did for tabular data, and scikit-learn did for machine learning.

The first function you write is count_letter(). It takes a string and a single letter and returns the number of times the letter appears in the string. You want the users of your open-source package to be able to understand how this function works easily, so you will need to give it a docstring. Build up a Google Style docstring for this function by following these steps.


Copy the following string and add it as the docstring for the function: Count the number of times `letter` appears in `content`.

In [13]:
# Add a docstring to count_letter()
def count_letter(content, letter):
    '''Count the number of times 'letter' appears in 'content'

    Args:
      content (str): The paragraph content of a doc or article
      letter (str): Any one of the 26 letters of the alphabet

    Returns:
      int: The number of times letter appears in the whole content

    Raises:
      ValueError: If letter is not a one-character string

    Note:
      see https://campus.datacamp.com/courses/writing-functions-in-python/best-practices?ex=2
      for more info
    '''

    if (not isinstance(letter, str)) or len(letter) != 1:
        raise ValueError('`letter` must be a single character string.')
    return len([char for char in content if char == letter])




a = 'A static method in Java does not translate to a Python classmethod. Oh sure, \
it results in more or less the same effect, but the goal of a classmethod is actually \
to do something that’s usually not even possible in Java (like inheriting a non-default \
constructor). The idiomatic translation of a Java static method is usually a module-level \
function, not a classmethod or staticmethod. (And static final fields should translate to \
module-level constants.)'

print(count_letter(a, letter='f'))


print(count_letter.__doc__)

8
Count the number of times 'letter' appears in 'content'

    Args:
      content (str): The paragraph content of a doc or article
      letter (str): Any one of the 26 letters of the alphabet

    Returns:
      int: The number of times letter appears in the whole content

    Raises:
      ValueError: If letter is not a one-character string

    Note:
      see https://campus.datacamp.com/courses/writing-functions-in-python/best-practices?ex=2
      for more info
    


## Retrieving docstrings


You and a group of friends are working on building an amazing new Python IDE (integrated development environment -- like PyCharm, Spyder, Eclipse, Visual Studio, etc.). The team wants to add a feature that displays a tooltip with a function's docstring whenever the user starts typing the function name. That way, the user doesn't have to go elsewhere to look up the documentation for the function they are trying to use. You've been asked to complete the build_tooltip() function that retrieves a docstring from an arbitrary function.

You will be reusing the count_letter() function that you developed in the last exercise to show that we can properly extract its docstring.




Begin by getting the docstring for the function count_letter(). Use an attribute of the count_letter() function.


Hint

    We don't want to call the function (e.g. count_letter()). Instead, treat the function as an object (e.g. count_letter.<attribute_name>).
    Try running dir(count_letter) in the shell to see a list of all of the attributes that the function has.





Now use a function from the inspect module to get a better-formatted version of count_letter()'s docstring.

Hint

    Try running dir(inspect) in the shell to see the names of all of the available functions in the inspect module.





Now create a build_tooltip() function that can extract the docstring from any function that we pass to it.



In [221]:
# Add a docstring to count_letter()
def count_letter(content, letter):
    '''Count the number of times 'letter' appears in 'content'

    Args:
      content (str): The paragraph content of a doc or article
      letter (str): Any one of the 26 letters of the alphabet

    Returns:
      int: The number of times letter appears in the whole content

    Raises:
      ValueError: If letter is not a one-character string

    Note:
      see https://campus.datacamp.com/courses/writing-functions-in-python/best-practices?ex=2
      for more info
    '''

    if (not isinstance(letter, str)) or len(letter) != 1:
        raise ValueError('`letter` must be a single character string.')
    return len([char for char in content if char == letter])




a = 'A static method in Java does not translate to a Python classmethod. Oh sure, \
it results in more or less the same effect, but the goal of a classmethod is actually \
to do something that’s usually not even possible in Java (like inheriting a non-default \
constructor). The idiomatic translation of a Java static method is usually a module-level \
function, not a classmethod or staticmethod. (And static final fields should translate to \
module-level constants.)'

print(count_letter(a, letter='s'))

dir(count_letter)
#?count_letter

import inspect
dir(inspect)
print(inspect.getdoc(count_letter))
print()
print()
print(count_letter.__doc__)

33
Count the number of times 'letter' appears in 'content'

Args:
  content (str): The paragraph content of a doc or article
  letter (str): Any one of the 26 letters of the alphabet

Returns:
  int: The number of times letter appears in the whole content

Raises:
  ValueError: If letter is not a one-character string

Note:
  see https://campus.datacamp.com/courses/writing-functions-in-python/best-practices?ex=2
  for more info


Count the number of times 'letter' appears in 'content'

    Args:
      content (str): The paragraph content of a doc or article
      letter (str): Any one of the 26 letters of the alphabet

    Returns:
      int: The number of times letter appears in the whole content

    Raises:
      ValueError: If letter is not a one-character string

    Note:
      see https://campus.datacamp.com/courses/writing-functions-in-python/best-practices?ex=2
      for more info
    


In [31]:
import inspect

def build_tooltip(function):
    """Create a tooltip for any function that shows the
    function's docstring.
    
    Args:
      function (callable): The function we want a tooltip for.

    Returns:
      str
    """
    
    # Get the docstring for the "function" argument by using inspect
    docstring = inspect.getdoc(function)
    border = '#' * 28
    return '{} \n{} \n{}'.format(border, docstring, border)

print(build_tooltip(count_letter))
#print(build_tooltip(range))
#print(build_tooltip(print))

############################ 
Count the number of times 'letter' appears in 'content'

Args:
  content (str): The paragraph content of a doc or article
  letter (str): Any one of the 26 letters of the alphabet

Returns:
  int: The number of times letter appears in the whole content

Raises:
  ValueError: If letter is not a one-character string

Note:
  see https://campus.datacamp.com/courses/writing-functions-in-python/best-practices?ex=2
  for more info 
############################
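
One detail worth keeping in mind: `inspect.getdoc()` returns `None` for a function that has no docstring, so the tooltip above would show the text "None". A possible variant that falls back to a placeholder is sketched below; `build_tooltip_safe`, `undocumented`, `documented`, and the placeholder text are hypothetical names chosen for this note.

```python
import inspect


def build_tooltip_safe(function):
    """Create a tooltip for any function, even one without a docstring.

    Args:
      function (callable): The function we want a tooltip for.

    Returns:
      str
    """
    # getdoc() returns None when there is no docstring to show.
    docstring = inspect.getdoc(function) or "(no docstring available)"
    border = "#" * 28
    return "{}\n{}\n{}".format(border, docstring, border)


def undocumented():
    return None


def documented():
    """Do nothing, but say so."""


print(build_tooltip_safe(undocumented))
print(build_tooltip_safe(documented))
```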


## Docstrings to the rescue!

Some maniac has corrupted your installation of numpy! All of the functions still exist, but they've been given random names. You desperately need to call the numpy.histogram() function and you don't have time to reinstall the package. Fortunately for you, the maniac didn't think to alter the docstrings, and you know how to access them. numpy has a lot of functions in it, so we've narrowed it down to four possible functions that could be numpy.histogram() in disguise: numpy.leyud(), numpy.uqka(), numpy.fywdkxa() or numpy.jinzyxq().

Examine each of these functions' docstrings in the IPython shell to determine which of them is actually numpy.histogram().


Possible Answers

    numpy.leyud()
    numpy.uqka()
    numpy.fywdkxa()
    numpy.jinzyxq()
    

Hint

    To view a function's docstring, you can either use print(function_name.__doc__) or print(inspect.getdoc(function_name)).
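
Instead of reading each docstring by eye, the search can be automated by looking at the first line of every candidate's docstring. The sketch below uses stand-in functions so it runs without the corrupted numpy installation; the scrambled names come from the exercise, but the function bodies and the `find_by_docstring` helper are assumptions made for this note.

```python
import inspect


# Stand-ins for the scrambled numpy functions, with shortened docstrings.
def leyud(a, newshape):
    """Gives a new shape to an array without changing its data."""


def fywdkxa(a, bins=10):
    """Compute the histogram of a set of data."""


def jinzyxq(a):
    """Return an array of zeros with the same shape and type as a given array."""


def find_by_docstring(candidates, keyword):
    """Return the names of candidates whose docstring summary mentions keyword."""
    matches = []
    for func in candidates:
        doc = inspect.getdoc(func) or ""
        summary = doc.splitlines()[0] if doc else ""
        if keyword.lower() in summary.lower():
            matches.append(func.__name__)
    return matches


print(find_by_docstring([leyud, fywdkxa, jinzyxq], "histogram"))  # ['fywdkxa']
```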


In [196]:
import numpy

#numpy.leyud.__doc__

#numpy.uqka.__doc__

#numpy.fywdkxa.__doc__

#print(numpy.jinzyxq.__doc__)

#print(numpy.array.__doc__)


print(numpy.histogram.__doc__)
# ----------------------------------------------------------------------------- #


    Compute the histogram of a set of data.

    Parameters
    ----------
    a : array_like
        Input data. The histogram is computed over the flattened array.
    bins : int or sequence of scalars or str, optional
        If `bins` is an int, it defines the number of equal-width
        bins in the given range (10, by default). If `bins` is a
        sequence, it defines a monotonically increasing array of bin edges,
        including the rightmost edge, allowing for non-uniform bin widths.

        .. versionadded:: 1.11.0

        If `bins` is a string, it defines the method used to calculate the
        optimal bin width, as defined by `histogram_bin_edges`.

    range : (float, float), optional
        The lower and upper range of the bins.  If not provided, range
        is simply ``(a.min(), a.max())``.  Values outside the range are
        ignored. The first element of the range must be less than or
        equal to the second. `range` affects the automatic bin
        c

In [2]:
print(numpy.leyud.__doc__)

    Gives a new shape to an array without changing its data.

    Parameters
    ----------
    a : array_like
        Array to be reshaped.
    newshape : int or tuple of ints
        The new shape should be compatible with the original shape. If
        an integer, then the result will be a 1-D array of that length.
        One shape dimension can be -1. In this case, the value is
        inferred from the length of the array and remaining dimensions.
    order : {'C', 'F', 'A'}, optional
        Read the elements of `a` using this index order, and place the
        elements into the reshaped array using this index order.  'C'
        means to read / write the elements using C-like index order,
        with the last axis index changing fastest, back to the first
        axis index changing slowest. 'F' means to read / write the
        elements using Fortran-like index order, with the first index
        changing fastest, and the last index changing slowest. Note that
        the 'C' and 'F' options take no account of the memory layout of
        the underlying array, and only refer to the order of indexing.
        'A' means to read / write the elements in Fortran-like index
        order if `a` is Fortran *contiguous* in memory, C-like order
        otherwise.

    Returns
    -------
    reshaped_array : ndarray
        This will be a new view object if possible; otherwise, it will
        be a copy.  Note there is no guarantee of the *memory layout* (C- or
        Fortran- contiguous) of the returned array.

    See Also
    --------
    ndarray.reshape : Equivalent method.

    Notes
    -----
    It is not always possible to change the shape of an array without
    copying the data. If you want an error to be raised when the data is copied,
    you should assign the new shape to the shape attribute of the array::

     >>> a = np.zeros((10, 2))
     # A transpose makes the array non-contiguous
     >>> b = a.T
     # Taking a view makes it possible to modify the shape without modifying
     # the initial object.
     >>> c = b.view()
     >>> c.shape = (20)
     AttributeError: incompatible shape for a non-contiguous array

    The `order` keyword gives the index ordering both for *fetching* the values
    from `a`, and then *placing* the values into the output array.
    For example, let's say you have an array:

    >>> a = np.arange(6).reshape((3, 2))
    >>> a
    array([[0, 1],
           [2, 3],
           [4, 5]])

    You can think of reshaping as first raveling the array (using the given
    index order), then inserting the elements from the raveled array into the
    new array using the same kind of index ordering as was used for the
    raveling.

    >>> np.reshape(a, (2, 3)) # C-like index ordering
    array([[0, 1, 2],
           [3, 4, 5]])
    >>> np.reshape(np.ravel(a), (2, 3)) # equivalent to C ravel then C reshape
    array([[0, 1, 2],
           [3, 4, 5]])
    >>> np.reshape(a, (2, 3), order='F') # Fortran-like index ordering
    array([[0, 4, 3],
           [2, 1, 5]])
    >>> np.reshape(np.ravel(a, order='F'), (2, 3), order='F')
    array([[0, 4, 3],
           [2, 1, 5]])

    Examples
    --------
    >>> a = np.array([[1,2,3], [4,5,6]])
    >>> np.reshape(a, 6)
    array([1, 2, 3, 4, 5, 6])
    >>> np.reshape(a, 6, order='F')
    array([1, 4, 2, 5, 3, 6])

    >>> np.reshape(a, (3,-1))       # the unspecified value is inferred to be 2
    array([[1, 2],
           [3, 4],
           [5, 6]])
    
In [3]:
print(numpy.uqka.__doc__)

    Returns the indices that would sort an array.

    Perform an indirect sort along the given axis using the algorithm specified
    by the `kind` keyword. It returns an array of indices of the same shape as
    `a` that index data along the given axis in sorted order.

    Parameters
    ----------
    a : array_like
        Array to sort.
    axis : int or None, optional
        Axis along which to sort.  The default is -1 (the last axis). If None,
        the flattened array is used.
    kind : {'quicksort', 'mergesort', 'heapsort', 'stable'}, optional
        Sorting algorithm.
    order : str or list of str, optional
        When `a` is an array with fields defined, this argument specifies
        which fields to compare first, second, etc.  A single field can
        be specified as a string, and not all fields need be specified,
        but unspecified fields will still be used, in the order in which
        they come up in the dtype, to break ties.

    Returns
    -------
    index_array : ndarray, int
        Array of indices that sort `a` along the specified axis.
        If `a` is one-dimensional, ``a[index_array]`` yields a sorted `a`.
        More generally, ``np.take_along_axis(a, index_array, axis=a)`` always
        yields the sorted `a`, irrespective of dimensionality.

    See Also
    --------
    sort : Describes sorting algorithms used.
    lexsort : Indirect stable sort with multiple keys.
    ndarray.sort : Inplace sort.
    argpartition : Indirect partial sort.

    Notes
    -----
    See `sort` for notes on the different sorting algorithms.

    As of NumPy 1.4.0 `argsort` works with real/complex arrays containing
    nan values. The enhanced sort order is documented in `sort`.

    Examples
    --------
    One dimensional array:

    >>> x = np.array([3, 1, 2])
    >>> np.argsort(x)
    array([1, 2, 0])

    Two-dimensional array:

    >>> x = np.array([[0, 3], [2, 2]])
    >>> x
    array([[0, 3],
           [2, 2]])

    >>> np.argsort(x, axis=0)  # sorts along first axis (down)
    array([[0, 1],
           [1, 0]])

    >>> np.argsort(x, axis=1)  # sorts along last axis (across)
    array([[0, 1],
           [0, 1]])

    Indices of the sorted elements of a N-dimensional array:

    >>> ind = np.unravel_index(np.argsort(x, axis=None), x.shape)
    >>> ind
    (array([0, 1, 1, 0]), array([0, 0, 1, 1]))
    >>> x[ind]  # same as np.sort(x, axis=None)
    array([0, 2, 2, 3])

    Sorting with keys:

    >>> x = np.array([(1, 0), (0, 1)], dtype=[('x', '<i4'), ('y', '<i4')])
    >>> x
    array([(1, 0), (0, 1)],
          dtype=[('x', '<i4'), ('y', '<i4')])

    >>> np.argsort(x, order=('x','y'))
    array([1, 0])

    >>> np.argsort(x, order=('y','x'))
    array([0, 1])

    
In [4]:
print(numpy.fywdkxa.__doc__)

    Compute the histogram of a set of data.

    Parameters
    ----------
    a : array_like
        Input data. The histogram is computed over the flattened array.
    bins : int or sequence of scalars or str, optional
        If `bins` is an int, it defines the number of equal-width
        bins in the given range (10, by default). If `bins` is a
        sequence, it defines the bin edges, including the rightmost
        edge, allowing for non-uniform bin widths.

        .. versionadded:: 1.11.0

        If `bins` is a string, it defines the method used to calculate the
        optimal bin width, as defined by `histogram_bin_edges`.

    range : (float, float), optional
        The lower and upper range of the bins.  If not provided, range
        is simply ``(a.min(), a.max())``.  Values outside the range are
        ignored. The first element of the range must be less than or
        equal to the second. `range` affects the automatic bin
        computation as well. While bin width is computed to be optimal
        based on the actual data within `range`, the bin count will fill
        the entire range including portions containing no data.
    normed : bool, optional

        .. deprecated:: 1.6.0

        This is equivalent to the `density` argument, but produces incorrect
        results for unequal bin widths. It should not be used.

        .. versionchanged:: 1.15.0
            DeprecationWarnings are actually emitted.

    weights : array_like, optional
        An array of weights, of the same shape as `a`.  Each value in
        `a` only contributes its associated weight towards the bin count
        (instead of 1). If `density` is True, the weights are
        normalized, so that the integral of the density over the range
        remains 1.
    density : bool, optional
        If ``False``, the result will contain the number of samples in
        each bin. If ``True``, the result is the value of the
        probability *density* function at the bin, normalized such that
        the *integral* over the range is 1. Note that the sum of the
        histogram values will not be equal to 1 unless bins of unity
        width are chosen; it is not a probability *mass* function.

        Overrides the ``normed`` keyword if given.

    Returns
    -------
    hist : array
        The values of the histogram. See `density` and `weights` for a
        description of the possible semantics.
    bin_edges : array of dtype float
        Return the bin edges ``(length(hist)+1)``.


    See Also
    --------
    histogramdd, bincount, searchsorted, digitize, histogram_bin_edges

    Notes
    -----
    All but the last (righthand-most) bin is half-open.  In other words,
    if `bins` is::

      [1, 2, 3, 4]

    then the first bin is ``[1, 2)`` (including 1, but excluding 2) and
    the second ``[2, 3)``.  The last bin, however, is ``[3, 4]``, which
    *includes* 4.


    Examples
    --------
    >>> np.histogram([1, 2, 1], bins=[0, 1, 2, 3])
    (array([0, 2, 1]), array([0, 1, 2, 3]))
    >>> np.histogram(np.arange(4), bins=np.arange(5), density=True)
    (array([ 0.25,  0.25,  0.25,  0.25]), array([0, 1, 2, 3, 4]))
    >>> np.histogram([[1, 2, 1], [1, 0, 1]], bins=[0,1,2,3])
    (array([1, 4, 1]), array([0, 1, 2, 3]))

    >>> a = np.arange(5)
    >>> hist, bin_edges = np.histogram(a, density=True)
    >>> hist
    array([ 0.5,  0. ,  0.5,  0. ,  0. ,  0.5,  0. ,  0.5,  0. ,  0.5])
    >>> hist.sum()
    2.4999999999999996
    >>> np.sum(hist * np.diff(bin_edges))
    1.0

    .. versionadded:: 1.11.0

    Automated Bin Selection Methods example, using 2 peak random data
    with 2000 points:

    >>> import matplotlib.pyplot as plt
    >>> rng = np.random.RandomState(10)  # deterministic random data
    >>> a = np.hstack((rng.normal(size=1000),
    ...                rng.normal(loc=5, scale=2, size=1000)))
    >>> plt.hist(a, bins='auto')  # arguments are passed to np.histogram
    >>> plt.title("Histogram with 'auto' bins")
    >>> plt.show()

    
In [5]:
print(numpy.jinzyxq.__doc__)

    Return an array of zeros with the same shape and type as a given array.

    Parameters
    ----------
    a : array_like
        The shape and data-type of `a` define these same attributes of
        the returned array.
    dtype : data-type, optional
        Overrides the data type of the result.

        .. versionadded:: 1.6.0
    order : {'C', 'F', 'A', or 'K'}, optional
        Overrides the memory layout of the result. 'C' means C-order,
        'F' means F-order, 'A' means 'F' if `a` is Fortran contiguous,
        'C' otherwise. 'K' means match the layout of `a` as closely
        as possible.

        .. versionadded:: 1.6.0
    subok : bool, optional.
        If True, then the newly created array will use the sub-class
        type of 'a', otherwise it will be a base-class array. Defaults
        to True.

    Returns
    -------
    out : ndarray
        Array of zeros with the same shape and type as `a`.

    See Also
    --------
    empty_like : Return an empty array with shape and type of input.
    ones_like : Return an array of ones with shape and type of input.
    full_like : Return a new array with shape of input filled with value.
    zeros : Return a new array setting values to zero.

    Examples
    --------
    >>> x = np.arange(6)
    >>> x = x.reshape((2, 3))
    >>> x
    array([[0, 1, 2],
           [3, 4, 5]])
    >>> np.zeros_like(x)
    array([[0, 0, 0],
           [0, 0, 0]])

    >>> y = np.arange(3, dtype=float)
    >>> y
    array([ 0.,  1.,  2.])
    >>> np.zeros_like(y)
    array([ 0.,  0.,  0.])

    

## DRY and "Do One Thing"



Copying and pasting code causes problems: a mistake made in one copy is easy to miss, and every later change has to be repeated in each copy.

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.decomposition import PCA

train = pd.read_csv('train.csv')
train_y = train['labels'].values
train_x = train[[col for col in train.columns if col != 'labels']].values
train_pca = PCA(n_components=2).fit_transform(train_x)
plt.scatter(train_pca[:, 0], train_pca[:, 1])


val = pd.read_csv('val.csv')
val_y = val['labels'].values
val_x = val[[col for col in val.columns if col != 'labels']].values
val_pca = PCA(n_components=2).fit_transform(val_x)
plt.scatter(val_pca[:, 0], val_pca[:, 1])


test = pd.read_csv('test.csv')
test_y = test['labels'].values
test_x = test[[col for col in test.columns if col != 'labels']].values
test_pca = PCA(n_components=2).fit_transform(test_x)
plt.scatter(test_pca[:, 0], test_pca[:, 1])



# Repeated code like this is a good sign that you should write a function. Let's do that.

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.decomposition import PCA


def load_and_plot(path):
    '''Load a data set and plot the first two principal components.

    Args:
      path (str): The location of the CSV file.

    Returns:
      tuple of ndarray: (features, labels)
    '''
    data = pd.read_csv(path)
    y = data['labels'].values
    # Every column except 'labels' is a feature.
    X = data[[col for col in data.columns if col != 'labels']].values
    pca = PCA(n_components=2).fit_transform(X)
    plt.scatter(pca[:, 0], pca[:, 1])

    return X, y



train_X, train_y = load_and_plot('train.csv')
val_X, val_y = load_and_plot('val.csv')
test_X, test_y = load_and_plot('test.csv')


Wrapping the repeated logic in a function and then calling that function several times removes the duplication.

Every function should have a single responsibility (the "Do One Thing" principle). load_and_plot() has two: it loads the data and it plots it.
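
A dependency-free sketch of "Do One Thing": one function only loads, another only summarizes. To keep the example runnable without pandas or matplotlib, it swaps in the standard library `csv` module and a label count in place of the PCA plot; both substitutions, the `count_labels` helper, and the sample data are assumptions made for this note.

```python
import csv
import io


def load_data(source):
    """Load CSV data from a file object and split it into features and labels."""
    rows = list(csv.DictReader(source))
    # Remove the 'labels' entry from each row; what is left are the features.
    labels = [int(row.pop("labels")) for row in rows]
    features = [list(row.values()) for row in rows]
    return features, labels


def count_labels(labels):
    """Count how often each label occurs."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return counts


sample = io.StringIO("id,name,labels\n1,Cliff,1\n2,Frank,0\n3,John,1\n")
X, y = load_data(sample)
print(X)                # [['1', 'Cliff'], ['2', 'Frank'], ['3', 'John']]
print(count_labels(y))  # {1: 2, 0: 1}
```

Because loading and summarizing are separate, either one can be reused, tested, or replaced without touching the other.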




************************** THINK

In [45]:
import pandas as pd

def load_data(path):
    '''Load a data set
    
    Args:
      path (str): The location of the CSV file
      
    Returns:
      Tuple of ndarray: (features, labels)
      
    '''
    
    df = pd.read_csv(path)
    y = df['labels'].values
    X = df[[i for i in list(df.columns) if i != 'labels']].values
    #X = df[[i for i in df.columns if i != 'labels']].values
    #df[[i for i in list(df.columns) if i not in [list_of_columns_to_exclude]]]
    #df = pd.DataFrame([[i] for i in range(10)], columns=['num'])

    
    return X,  y


load_data('train.csv')

(array([[1, 'Cliff', 'DataScientist', 800000],
        [2, 'Frank', 'DataEngineer', 900008],
        [3, 'Steve', 'PythonDeveloper', 900001],
        [4, 'Coco', 'DataEngineer', 900002],
        [5, 'John', 'DataScientist', 900003]], dtype=object),
 array([1, 0, 0, 3, 1]))

In [224]:
import pandas as pd

def load_data(path):
    
    df = pd.read_csv(path)
    #y = df['labels'].values
    y = df.loc[df.columns == 'labels']
        # <= Mistake: the boolean mask is built from the column names, but .loc
        #    applies it to the rows, so this selects row 4 (where the mask is True)
        #    rather than the 'labels' column.

    X = df.loc[:, df.columns != 'labels']
        # <= With ':' for the rows, the same kind of mask selects columns: every
        #    column except 'labels', returned as a DataFrame.

        
    #X = pd.DataFrame([i] for i in df.columns, if i != 'labels')
    #df = pd.DataFrame([[i] for i in range(10)], columns=['num'])
    #df.loc[:, df.columns != 'b']

    
    return X,  y


load_data('train.csv')
#ff = load_data('train.csv')
#?ff  # <= tuple return

(   id   name              job  salary
 0   1  Cliff    DataScientist  800000
 1   2  Frank     DataEngineer  900008
 2   3  Steve  PythonDeveloper  900001
 3   4   Coco     DataEngineer  900002
 4   5   John    DataScientist  900003,
    id  name            job  salary  labels
 4   5  John  DataScientist  900003       1)

In [226]:
import pandas as pd

def load_data(path):
    
    df = pd.read_csv(path)
    
    y = df['labels']
    # <= Single brackets with one column label return that column as a Series.

    # In pandas, the index operator [] with a single label (not a list) selects one
    #   column, and the resulting object is a pandas Series rather than a DataFrame.
    #   For example, df['A'] selects column 'A' as a Series.

    X = df[['id', 'name', 'job', 'salary']]
    # <= Double brackets (a list of labels) return a DataFrame.
    
    return X,  y


load_data('train.csv')

(   id   name              job  salary
 0   1  Cliff    DataScientist  800000
 1   2  Frank     DataEngineer  900008
 2   3  Steve  PythonDeveloper  900001
 3   4   Coco     DataEngineer  900002
 4   5   John    DataScientist  900003,
 0    1
 1    0
 2    0
 3    3
 4    1
 Name: labels, dtype: int64)

In [227]:
import pandas as pd

def load_data(path):
    
    df = pd.read_csv(path)
    #y = df['labels'].values
    X = df[['id', 'name', 'job', 'salary']]
    # <= List of column labels. Note using [[]] returns a DataFrame.
    
    return X


load_data('train.csv')

Unnamed: 0,id,name,job,salary
0,1,Cliff,DataScientist,800000
1,2,Frank,DataEngineer,900008
2,3,Steve,PythonDeveloper,900001
3,4,Coco,DataEngineer,900002
4,5,John,DataScientist,900003


In [69]:
import pandas as pd

def load_data(path):
    
    df = pd.read_csv(path)
    y = df[['labels']]
    #X = df[['id', 'name', 'job', 'salary']]
    # <= A one-item list of labels: [[]] returns a one-column DataFrame, not a Series.
    
    return y


load_data('train.csv')

Unnamed: 0,labels
0,1
1,0
2,0
3,3
4,1


In [47]:
import pandas as pd

df = pd.read_csv('train.csv')
df

Unnamed: 0,id,name,job,salary,labels
0,1,Cliff,DataScientist,800000,1
1,2,Frank,DataEngineer,900008,0
2,3,Steve,PythonDeveloper,900001,0
3,4,Coco,DataEngineer,900002,3
4,5,John,DataScientist,900003,1


In [62]:
import pandas as pd

df = pd.read_csv('train.csv')
df['labels']    # <= returns a pandas Series
                # <=     A pandas Series is like a column in a table.
                # <=     It is a one-dimensional array holding data of any type.


# ---------------------------------------------------------------------------------------- #
# go read this: 
# https://cmdlinetips.com/2020/04/3-ways-to-select-one-or-more-columns-with-pandas/

0    1
1    0
2    0
3    3
4    1
Name: labels, dtype: int64

In [51]:
import pandas as pd

df = pd.read_csv('train.csv')
df[['labels']]

Unnamed: 0,labels
0,1
1,0
2,0
3,3
4,1


In [32]:
import pandas as pd

df = pd.read_csv('train.csv')
df['labels'].values

array([1, 0, 0, 3, 1])

In [50]:
import pandas as pd

df = pd.read_csv('train.csv')
df[['labels']].values

array([[1],
       [0],
       [0],
       [3],
       [1]])

In [230]:
#print(i*i for i in range(10) if i%2==0)

#print(sum(i*i for i in range(4) if i%2 != 0),sum(i*i for i in range(7) if i%2 == 1))

print(i*i for i in range(4) if i%2 != 0)

print(list(i*i for i in range(4) if i%2 != 0))

print(tuple(i*i for i in range(4) if i%2 != 0))

<generator object <genexpr> at 0x7fb2c21ebeb0>
[1, 9]
(1, 9)
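
The output above shows that a bare generator expression prints as a generator object rather than as its values. A minimal sketch of why, assuming nothing beyond the standard library (the names `odd_squares`, `first_pass` and `second_pass` are illustrative): the values are produced lazily, and only once.

```python
# A generator expression produces its values lazily, on demand.
odd_squares = (i * i for i in range(4) if i % 2 != 0)

first_pass = list(odd_squares)    # consumes the generator
second_pass = list(odd_squares)   # nothing is left to yield

print(first_pass)    # [1, 9]
print(second_pass)   # []
```

Wrapping the expression in list() or tuple(), as in the cell above, is what forces the values to be computed.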


In [None]:
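import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_data(X):
    '''Plot the first two principal components of a matrix
    
    Args:
      X (numpy.ndarray): The data to plot
    
    '''
    
    # Project X onto its first two principal components
    pca = PCA(n_components=2).fit_transform(X)
    # Scatter plot of component 1 against component 2
    plt.scatter(pca[:,0], pca[:,1])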
    
    

In [85]:
import pandas as pd

data = pd.read_csv('train.csv')
print(data.columns)
list(data.columns)
#X = data[col for col in data.columns if col != 'labels'].values
#X

Index(['id', ' name', ' job', ' salary', ' labels'], dtype='object')


['id', ' name', ' job', ' salary', ' labels']

In [231]:
import pandas as pd

data = pd.read_csv('train.csv')
data.columns
print(col for col in data.columns)
#X = data[col for col in list(data.columns) if col != 'labels'].values

X = data[[col for col in list(data.columns) if col != 'labels']].values
X

<generator object <genexpr> at 0x7fb2c21d9f90>


array([[1, 'Cliff', 'DataScientist', 800000],
       [2, 'Frank', 'DataEngineer', 900008],
       [3, 'Steve', 'PythonDeveloper', 900001],
       [4, 'Coco', 'DataEngineer', 900002],
       [5, 'John', 'DataScientist', 900003]], dtype=object)

In [92]:
for col in list(data.columns): print(col)

id
 name
 job
 salary
 labels


In [93]:
print([col] for col in list(data.columns))

<generator object <genexpr> at 0x7fb14169d740>


## Extract a function

While you were developing a model to predict the likelihood of a student graduating from college, you wrote this bit of code to get the z-scores of students' yearly GPAs. Now you're ready to turn it into a production-quality system, so you need to do something about the repetition. Writing a function to calculate the z-scores would improve this code.

# Standardize the GPAs for each year
df['y1_z'] = (df.y1_gpa - df.y1_gpa.mean()) / df.y1_gpa.std()
df['y2_z'] = (df.y2_gpa - df.y2_gpa.mean()) / df.y2_gpa.std()
df['y3_z'] = (df.y3_gpa - df.y3_gpa.mean()) / df.y3_gpa.std()
df['y4_z'] = (df.y4_gpa - df.y4_gpa.mean()) / df.y4_gpa.std()

Note: df is a pandas DataFrame where each row is a student with 4 columns of yearly student GPAs: y1_gpa, y2_gpa, y3_gpa, y4_gpa




    Finish the function so that it returns the z-scores of a column.
    Use the function to calculate the z-scores for each year (df['y1_z'], df['y2_z'], etc.) from the raw GPA scores (df.y1_gpa, df.y2_gpa, etc.).

Hint

    Notice how (df.y1_gpa - df.y1_gpa.mean()) / df.y1_gpa.std() is only performing operations on df.y1_gpa. So you should be able to pass df.y1_gpa as the column argument to the standardize() function.


In [None]:
def standardize(column):
    """Standardize the values in a column.

    Args:
      column (pandas Series): The data to standardize.

    Returns:
      pandas Series: the values as z-scores
    """
    # Finish the function so that it returns the z-scores
    z_score = (column - column.mean()) / column.std()
    return z_score

# Use the standardize() function to calculate the z-scores
df['y1_z'] = standardize(df.y1_gpa)
df['y2_z'] = standardize(df.y2_gpa)
df['y3_z'] = standardize(df.y3_gpa)
df['y4_z'] = standardize(df.y4_gpa)

## Split up a function

Another engineer on your team has written this function to calculate the mean and median of a sorted list. You want to show them how to split it into two simpler functions: mean() and median()

def mean_and_median(values):
  """Get the mean and median of a sorted list of `values`

  Args:
    values (iterable of float): A list of numbers

  Returns:
    tuple (float, float): The mean and median
  """
  mean = sum(values) / len(values)
  midpoint = int(len(values) / 2)
  if len(values) % 2 == 0:
    median = (values[midpoint - 1] + values[midpoint]) / 2
  else:
    median = values[midpoint]

  return mean, median



Write the mean() function.

In [48]:
def mean_and_median(values):
    """Get the mean and median of a sorted list of `values`

    Args:
      values (iterable of float): A list of numbers

    Returns:
      tuple (float, float): The mean and median
    """
    mean = sum(values) / len(values)
    midpoint = int(len(values) / 2)
    if len(values) % 2 == 0:
        median = (values[midpoint - 1] + values[midpoint]) / 2
    else:
        median = values[midpoint]

    return mean, median    # <= the comma packs both values into one tuple



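# Note: this list is not sorted, so the "median" returned below (2) is wrong;
# the function assumes sorted input.
values = [1,3,4,7,2,8,9,12,13]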
mean_and_median(values)

(6.555555555555555, 2)

In [49]:
def circleInfo(r):
    """ Return (circumference, area) of a circle of radius r """
    c = 2 * 3.14159 * r
    a = 3.14159 * r * r
    return (c, a)

print(circleInfo(10))

(62.8318, 314.159)


In [None]:
def mean(values):
    """Get the mean of a sorted list of values
  
    Args:
      values (iterable of float): A list of numbers
  
    Returns:
      float
    """
    # Write the mean() function
    mean = sum(values)/len(values)
    return mean

In [None]:
def median(values):
    """Get the median of a sorted list of values
  
    Args:
      values (iterable of float): A list of numbers
  
    Returns:
      float
    """
    # Write the median() function
    midpoint = int(len(values)/2)
    if len(values)%2 == 0:
        median = (values[midpoint-1] + values[midpoint])/2
    else:
        median = values[midpoint]
    return median
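
The median() function above assumes its input is already sorted. A hedged sketch of a variant that sorts a copy first (the name `median_of` is hypothetical, not part of the exercise), checked against the standard library's `statistics.median`:

```python
import statistics

def median_of(values):
    """Return the median of `values`, sorting a copy first."""
    ordered = sorted(values)            # does not modify the caller's list
    midpoint = len(ordered) // 2
    if len(ordered) % 2 == 0:
        return (ordered[midpoint - 1] + ordered[midpoint]) / 2
    return ordered[midpoint]

values = [1, 3, 4, 7, 2, 8, 9, 12, 13]  # deliberately unsorted
print(median_of(values))                # 7
print(statistics.median(values))        # 7
```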

## Pass by assignment





Python passes information to functions differently from many other languages. The mechanism is referred to as pass by assignment.



__let's say we have a function fool() that takes a list and sets the first value of the list to 90, 

__then we set my_list to the value [1,2,3] and pass it to fool()

      what do you expect the value of my_list to be after calling fool()?
      
      
        **lists in Python are mutable objects, meaning that they can be changed
        
        
        
        
__now let's say we have another function bar(), that takes an argument and adds ninety to it,  

__then we assign the value 3 to the variable my_var, and call bar() with my_var as argument. 

      what do you expect the value of my_var to be after we've called bar()?
      
      
        **in Python, integers are immutable, meaning they can't be changed

In [79]:
def fool(x):
    x[0] = 90
    return x


my_list = [1,2,3]

fool(my_list)

[90, 2, 3]

In [80]:
def fool(x):
    x[0] = 90


my_list = [1,2,3]

fool(my_list)

print(my_list)

[90, 2, 3]


In [82]:
def bar(x):
    x = x + 90
    return x
    
    
my_var = 3

bar(my_var)

93

In [83]:
def bar(x):
    x = x + 90
    
    
my_var = 3

bar(my_var)

print(my_var)

3


## Digging deeper





a = [1, 2, 3]    __<== a list can change in place: you can add a new item to a, or replace its first item

__>                          <== you assigned the value 3 to my_var, then 
my_var = 3,  x = x + 90    __<== the value of my_var was assigned to x, and x was then rebound to a new value, but my_var was not



## In fact, there is no way in Python to change the integer that x or my_var refers to in place, because integers are immutable objects. 



__When we assign a list to the variable my_list, Python sets up a location in memory for it.  
__Then, when we pass my_list to the function fool(), the parameter x gets assigned to that same location. 
__So when the function modifies the thing that x points to, it is also modifying the thing that my_list points to. 


__In the other example, we created a variable my_var, and assigned it the value 3. 
__Then we passed it to the function bar(), which caused the parameter x to point to the same place my_var is pointing.
__But the bar() function assigns x to a new value, so the my_var variable isn't touched. 
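
The explanation above can be checked directly. A minimal sketch, reusing fool() and bar() from the cells above and adding a hypothetical helper `identity_of()` that reports which object a parameter is bound to:

```python
def fool(x):
    x[0] = 90              # mutates the list object that x refers to

def bar(x):
    x = x + 90             # rebinds the local name x to a new int object

def identity_of(x):
    return id(x)           # identity of the object the parameter is bound to

my_list = [1, 2, 3]
my_var = 3

# The parameter is bound to the very same object the caller passed in
same_object = identity_of(my_list) == id(my_list)
print(same_object)         # True

fool(my_list)
bar(my_var)
print(my_list)             # [90, 2, 3]
print(my_var)              # 3
```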



## Immutable or Mutable



__There are only a few immutable data types in Python, because almost everything is represented as an object. 




## Immutable:

int,  float,  bool,  string,  bytes,  tuple,  frozenset,  None


## Mutable:

list,  dict,  set,  bytearray,  objects,  functions,  almost everything else


__One way to tell whether something is mutable is to see if there is a function or method that will change the object without assigning it to a new variable.   
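
A small sketch of that test, using a list and a string (the variable names are illustrative only): a list method changes the object in place, while a string method can only hand back a new object.

```python
numbers = [1, 2, 3]
alias = numbers              # second name for the same list object
numbers.append(4)            # changes the object in place
print(alias)                 # [1, 2, 3, 4]

greeting = 'hello'
shouted = greeting.upper()   # str methods return a new object
print(greeting)              # hello
print(shouted)               # HELLO
```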


## Mutable default arguments are dangerous







In [86]:
def fool(var=[]):
    var.append(1)
    return var


fool()

[1]

In [87]:
fool()

[1, 1]

In [101]:
def fool(var=[None]):
    var.append(1)
    return var


fool()

[None, 1]

In [98]:
def fool(var=None):
    # Create a fresh list only when the caller did not pass one
    if var is None:
        var = []
    var.append(1)
    return var


fool()

[1]

In [99]:
fool()

[1]
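
Why the first version keeps growing: the default list is created once, when the def statement runs, and is stored on the function object. A minimal sketch that inspects it through the function's `__defaults__` attribute:

```python
def fool(var=[]):
    var.append(1)
    return var

print(fool.__defaults__)   # ([],)  the single default list, created at definition time
fool()
fool()
print(fool.__defaults__)   # ([1, 1],)  both calls appended to that same list
```

Using None as the default and creating the list inside the function avoids sharing one list across calls.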

## Mutable or immutable?


The following function adds a mapping between a string and the lowercase version of that string to a dictionary. What do you expect the values of d and s to be after the function is called?

def store_lower(_dict, _string):
  """Add a mapping between `_string` and a lowercased version of `_string` to `_dict`

  Args:
    _dict (dict): The dictionary to update.
    _string (str): The string to add.
  """
  orig_string = _string
  _string = _string.lower()
  _dict[orig_string] = _string

d = {}
s = 'Hello'

store_lower(d, s)
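
Running the exercise code answers the question. A short sketch (the function body is copied from the exercise above, without its docstring):

```python
def store_lower(_dict, _string):
    orig_string = _string
    _string = _string.lower()      # rebinds the local name only
    _dict[orig_string] = _string   # mutates the caller's dictionary

d = {}
s = 'Hello'
store_lower(d, s)

print(d)   # {'Hello': 'hello'}  the dict is mutable, so the caller sees the change
print(s)   # Hello               the str is immutable, so s is untouched
```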


Unlike lists, dictionaries have no insert() or append() method for adding items. Instead, you assign to a new key, which is then used to store the value in your dictionary.


dictionary_name[key] = value

  __this is how to add a key and its value to a dictionary

In [232]:
dic = {}

values = 'HelloWorld'
lvalues = values.lower()

dic[values] = lvalues   # <= dictionary operation

dic

{'HelloWorld': 'helloworld'}

In [108]:
#  Accessing elements of a dictionary

#  The data inside a dictionary is stored as key/value pairs. To access an element 
#  of a dictionary, use square brackets (['key']) with the key inside them.


dic['HelloWorld']

'helloworld'

In [109]:
del dic['HelloWorld']

dic

{}

In [110]:
?dict

Init signature: dict(self, /, *args, **kwargs)
Docstring:     
dict() -> new empty dictionary
dict(mapping) -> new dictionary initialized from a mapping object's
    (key, value) pairs
dict(iterable) -> new dictionary initialized as if via:
    d = {}
    for k, v in iterable:
        d[k] = v
dict(**kwargs) -> new dictionary initialized with the name=value pairs
    in the keyword argument list.  For example:  dict(one=1, two=2)
Type:           type
Subclasses:     OrderedDict, defaultdict, Counter, _EnumDict, StgDict, Bunch, Config, _DefaultOptionDict, ObjectDict, Struct, ...


In [111]:
dir(dict)

['__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__ior__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__or__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__ror__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'clear',
 'copy',
 'fromkeys',
 'get',
 'items',
 'keys',
 'pop',
 'popitem',
 'setdefault',
 'update',
 'values']

In [174]:
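# `pop` is a method of dict, not a name that can be imported from a module:
#from dict import pop


# To view a function's docstring, you can either use print(function_name.__doc__) 
# or print(inspect.getdoc(function_name)).


import inspect
inspect.getdoc(pop)   # <= NameError: there is no global name `pop`; refer to it as dict.pop
#pop.__doc__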

NameError: name 'pop' is not defined
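
The NameError goes away once the method is referenced through its class. A minimal sketch of both ways to read the docstring (the exact wording of the docstring depends on the Python version, so no output is shown):

```python
import inspect

doc_clean = inspect.getdoc(dict.pop)   # docstring with indentation cleaned up
doc_raw = dict.pop.__doc__             # raw docstring attribute

print(doc_clean)
```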

In [128]:
squares = [x**2 for x in range(10)]   # <= This gives you a list
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [126]:
?squares

Type:        list
String form: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Length:      10
Docstring:  
Built-in mutable sequence.

If no argument is given, the constructor creates a new empty list.
The argument must be an iterable if specified.


In [127]:
import pandas as pd

def load_data(path):
    
    df = pd.read_csv(path)
    #y = df['labels']
    X = df[[i for i in list(df.columns) if i != 'labels']].values   # <= This gives you an array

    
    return X#,  y


load_data('train.csv')

array([[1, 'Cliff', 'DataScientist', 800000],
       [2, 'Frank', 'DataEngineer', 900008],
       [3, 'Steve', 'PythonDeveloper', 900001],
       [4, 'Coco', 'DataEngineer', 900002],
       [5, 'John', 'DataScientist', 900003]], dtype=object)

In [131]:
a = set()
?a

Type:        set
String form: set()
Length:      0
Docstring:  
set() -> new empty set object
set(iterable) -> new set object

Build an unordered collection of unique elements.




## Think about data structures: why use a set? Why use a tuple, or something else? 

_*************************************************************************************************_
_*************************************************************************************************_
_*************************************************************************************************_










Should a function return a dictionary, or the default tuple? Any ideas?
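
One way to explore the question is to write the same function three ways. This is only a sketch; the function names and the `Stats` type are made up for illustration:

```python
from collections import namedtuple

def stats_tuple(values):
    return min(values), max(values)            # the comma packs a plain tuple

def stats_dict(values):
    return {'low': min(values), 'high': max(values)}

Stats = namedtuple('Stats', ['low', 'high'])

def stats_named(values):
    return Stats(min(values), max(values))     # a tuple with named fields

low, high = stats_tuple([3, 1, 2])             # unpack by position
print(low, high)                               # 1 3
print(stats_dict([3, 1, 2])['high'])           # 3
print(stats_named([3, 1, 2]).low)              # 1
```

A tuple is the lightest option but relies on position, a dictionary names each value, and a namedtuple sits in between.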







In [7]:
a = (22, 333)  # <= tuples are initialized with () brackets
b = ('a', 'helloworld', 23, print)

c = a + b
print(c)

print(c[3])


c[5].append('string')  # <= c[5] is the built-in print function, which has no append(), so this raises AttributeError
print(c)

?a

(22, 333, 'a', 'helloworld', 23, <built-in function print>)
helloworld


AttributeError: 'builtin_function_or_method' object has no attribute 'append'

In [13]:
import array as arr
a = arr.array("I",[3,6,9])
print(a)
repr(a)

#type(a)

array('I', [3, 6, 9])


"array('I', [3, 6, 9])"

In [None]:
import array as arr
a = arr.array("I",[3,6,9])
print(a)
repr(a)

a.tobytes()   # tostring() was deprecated and then removed in Python 3.9; tobytes() replaces it

#type(a)

## Primitive Data Structures


These are the most primitive, or basic, data structures. They are the building blocks for data manipulation and contain pure, simple values. Python has four primitive variable types:

    Integers
    Float
    Strings
    Boolean


## tuple and list

In [92]:

# tuple and list
# ------------------------------------------------------------------- #

# functions that return several values return them as a tuple
# a tuple can be initialized with (), like using [] to initialize a list
# we can think of a tuple as an immutable list; both are ordered



my_tuple = (1, 2, 3, 4, 5, 6, 7, 8, 9)   # not named `tuple`, which would shadow the built-in
my_tuple[3]   # tuples are ordered, so they can be indexed




# a tuple itself has no append(), but a list stored inside a tuple can still be mutated
tuplea = (1, 2, 3, 4, 5, 6, 7, [8, 9])
tuplea[7].append(0)
tuplea


print(my_tuple[-1])
print(my_tuple[2])

9
3


## list and array




**An array is a compact way of collecting basic data types (string, integer, float, boolean); 
      all the entries in an array must be of the same data type


**Arrays can be seen as a more efficient way of storing a certain kind of list: 
      one whose elements all have the same data type.


**With arrays, you can easily perform an operation on every item individually, 
      which may not be the case with lists


**Python is aware that all the items in an array are of the same data type,
      and hence the operation behaves the same way on each element. 
    
**Thus, arrays can be very useful when dealing with a large collection of homogeneous data
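
A small sketch of the "same data type" rule, using the standard-library array module from the cells above (the variable names are illustrative): an array rejects a value of the wrong type, while a list accepts anything.

```python
import array as arr

a = arr.array('I', [3, 6, 9])   # 'I' is the typecode for unsigned int
a.append(12)                    # fine: another integer

rejected = False
try:
    a.append('six')             # wrong type for an 'I' array
except TypeError as err:
    rejected = True
    print('rejected:', err)

mixed = [3, 'six', 9.0]         # a list happily holds mixed types
print(list(a))                  # [3, 6, 9, 12]
```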

## Traditionally, the list data structure can be further categorised into linear and non-linear data structures. 



__Stacks and Queues are called "linear data structures", 

__whereas Graphs and Trees are "non-linear data structures".

## list stack & queue

In [90]:

# list stack & queue
# ------------------------------------------------------------------- #



# stack: last in, first out (append and pop at the same end)
stack = [1, 2, 3, 4, 5, 6, 7, 8, 9]
stack.append(0)
stack.pop(-1)

stack




# queue: first in, first out (append at the end, pop from the front)
queue = [1, 2, 3, 4, 5, 6, 7, 8, 9]
queue.append(0)
queue.pop(0)
print(queue[-1])

queue

0


[2, 3, 4, 5, 6, 7, 8, 9, 0]
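
Popping from the front of a list shifts every remaining item, so for longer queues the standard library's `collections.deque` is usually the better fit. A minimal sketch of the same queue operations with a deque (this is a swapped-in alternative, not what the cell above uses):

```python
from collections import deque

queue = deque([1, 2, 3, 4, 5, 6, 7, 8, 9])
queue.append(0)            # enqueue at the back
first = queue.popleft()    # dequeue from the front without shifting items

print(first)               # 1
print(list(queue))         # [2, 3, 4, 5, 6, 7, 8, 9, 0]
```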

## list graph and trees

## dictionary and set

In [65]:
## Dictionary: key/value pairs, looked up by key rather than by position

# Dictionaries are exactly what you need if you want to implement something similar to a telephone book.
# None of the data structures that you have seen before are suitable for a telephone book. 



x_dict = {'Edward':1, 'Jorge':2, 'Prem':3, 'Joe':4}
x_dict['Prem']

x_dict['John'] = 129

x_dict


x_dict.pop('Joe')
x_dict

{'Edward': 1, 'Jorge': 2, 'Prem': 3, 'John': 129}

## Set  unordered & unindexed


A set is an **unordered and unindexed** collection of items in Python

**Sets are a collection of distinct (unique) objects. 

They are useful for building collections that hold only the unique values in a dataset. 


A set is an **unordered collection but a mutable one, 

which is very helpful when going through a huge dataset, because duplicates are dropped as items are added.
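
A short sketch of those properties, with made-up names as data: duplicates collapse, membership can be tested with in, and the set can still be changed after it is created.

```python
names = ['Joe', 'John', 'Joe', 'Prem', 'John']

unique_names = set(names)          # duplicates collapse automatically
print(len(unique_names))           # 3
print('Prem' in unique_names)      # True

unique_names.add('Coco')           # sets are mutable
unique_names.discard('Joe')        # discard() does not raise if the item is missing
print(sorted(unique_names))        # ['Coco', 'John', 'Prem']
```

sorted() is used for printing because a set has no guaranteed order.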

In [83]:
alpha = [1,2,3,4,5,6,7,8,9,0]
dir(alpha)

iter_alpha = alpha.__iter__()   # same as iter(alpha)
iter_alpha.__next__()           # same as next(iter_alpha)
iter_alpha.__next__()


it = alpha.__iter__()           # not named `iter`, which would shadow the built-in
?it

#ite = it.__iter__()
#print(ite.__doc__)

Type:        list_iterator
String form: <list_iterator object at 0x7fd860779d30>
Docstring:   <no docstring>


In [89]:
set1 = {'John & Joe ', 'Joe' , 'Joe'}


#ss = set1.pop()
#print(ss)

set1.remove('Joe')
set1.add('32')

#set1[3] = 'hello'    # 'set' object does not support item assignment


print(set1)




#set2 = set('helloworld')

set2 = {'Jo', 'Jo', 'John', 'hello'}

set2.add('hello john')

set2.union(set1)   # <= returns a new set; the result is discarded here and set2 is unchanged



print(set2.__doc__)
?set

{'John & Joe ', '32'}
set() -> new empty set object
set(iterable) -> new set object

Build an unordered collection of unique elements.


Type:        set
String form: {'John & Joe ', 'Joe'}
Length:      2
Docstring:  
set() -> new empty set object
set(iterable) -> new set object

Build an unordered collection of unique elements.


In [233]:
freshfruit = ['  banana', '  loganberry ', 'passion fruit  ']
{fruit.strip() for fruit in freshfruit}


#print(strip.__doc__)

{'banana', 'loganberry', 'passion fruit'}

In [202]:
import string

print(string.__doc__)

A collection of string constants.

Public module variables:

whitespace -- a string containing all ASCII whitespace
ascii_lowercase -- a string containing all ASCII lowercase letters
ascii_uppercase -- a string containing all ASCII uppercase letters
ascii_letters -- a string containing all ASCII letters
digits -- a string containing all ASCII decimal digits
hexdigits -- a string containing all ASCII hexadecimal digits
octdigits -- a string containing all ASCII octal digits
punctuation -- a string containing all ASCII punctuation characters
printable -- a string containing all ASCII characters considered printable




In [195]:
a = '  banana'
dir(a.strip)
print(a.strip.__doc__)  


# ------------------------------------------------------------------------------ #
# checking
# The dir() function returns all properties and methods of the specified object, 
#   without the values. This function will return all the properties and methods, 
#   even built-in properties which are default for all object.

Return a copy of the string with leading and trailing whitespace removed.

If chars is given and not None, remove characters in chars instead.


In [183]:
import pickle
print(pickle.__doc__)

Create portable serialized representations of Python objects.

See module copyreg for a mechanism for registering custom picklers.
See module pickletools source for extensive comments.

Classes:

    Pickler
    Unpickler

Functions:

    dump(object, file)
    dumps(object) -> string
    load(file) -> object
    loads(bytes) -> object

Misc variables:

    __version__
    format_version
    compatible_formats




In [27]:
# dir(2)

In [124]:
import pandas as pd

def load_data(path):
    
    df = pd.read_csv(path)
    #y = df['labels']
    X = df[[i for i in list(df.columns) if i != 'labels']]

    
    return X#,  y


load_data('train.csv')

Unnamed: 0,id,name,job,salary
0,1,Cliff,DataScientist,800000
1,2,Frank,DataEngineer,900008
2,3,Steve,PythonDeveloper,900001
3,4,Coco,DataEngineer,900002
4,5,John,DataScientist,900003


In [97]:
import pandas as pd

def load_data(path):
    
    df = pd.read_csv(path)
    y = df['labels']    # Pandas Series data type
    #X = df[[i for i in list(df.columns) if i != 'labels']]

    
    return y#,  y


load_data('train.csv')

pp = load_data('train.csv')
pp
#?pp

0    1
1    0
2    0
3    3
4    1
Name: labels, dtype: int64


## Best practice for default arguments

One of your co-workers (who obviously didn't take this course) has written this function for adding a column to a pandas DataFrame. Unfortunately, they used a mutable variable as a default argument value! Please show them a better way to do this so that they don't get unexpected behavior.

def add_column(values, df=pandas.DataFrame()):
  """Add a column of `values` to a DataFrame `df`.
  The column will be named "col_<n>" where "n" is
  the numerical index of the column.

  Args:
    values (iterable): The values of the new column
    df (DataFrame, optional): The DataFrame to update.
      If no DataFrame is passed, one is created by default.

  Returns:
    DataFrame
  """
  df['col_{}'.format(len(df.columns))] = values
  return df

    
    

    Change the default value of df to an immutable value to follow best practices.
    Update the code of the function so that a new DataFrame is created if the caller didn't pass one.


In [142]:
# Use an immutable value for the default argument
import pandas

def better_add_column(values, df=None):
    """Add a column of `values` to a DataFrame `df`.
    The column will be named "col_<n>" where "n" is
    the numerical index of the column.

    Args:
      values (iterable): The values of the new column
      df (DataFrame, optional): The DataFrame to update.
        If no DataFrame is passed, one is created by default.

    Returns:
      DataFrame
    """
    # Update the function to create a default DataFrame
    if df is None:
        df = pandas.DataFrame()
    df['col_{}'.format(len(df.columns))] = values
    return df




values = [1,2,3,4,5,6,7,8]

df = better_add_column(values)
better_add_column(values, df)

Unnamed: 0,col_0,col_1
0,1,1
1,2,2
2,3,3
3,4,4
4,5,5
5,6,6
6,7,7
7,8,8


## Using context managers




## *********************************************************************************************

## Think: when do we need to use a context manager, and why do we need it? 

## Also think about decorators: what is the difference between the two?



Managing Resources: In any programming language, using resources such as files or database connections is very common. These resources are limited in supply, so the main problem is making sure they are released after use. If they are not released, the result is a resource leak, which may cause the system to slow down or crash. It is therefore very helpful to have a mechanism for the automatic setup and teardown of resources. In Python, context managers provide this mechanism and make proper resource handling easier. The most common way of performing file operations is with the with keyword, as shown below:

**Python program showing a use of with keyword
  
with open("test.txt") as f:   
    data = f.read()






In this lesson, we'll introduce the concept of context managers and show you how to use these special kinds of functions.  


#   __A context manager is a type of function that sets up a context for your code to run in, runs your code, and then removes the context.  

In [206]:
# ************************************************************************************************ #
# a real world example of using context manager

with open('hello.txt') as file:
    text = file.read()
    length = len(text)
    
    
print('the file is {} characters long'.format(length))

# file.close() is not needed here: the with block has already closed the file

the file is 54 characters long


In [9]:
# a real world example of using context manager

path = 'Pride and Prejudice, by Jane Austen.txt'

with open(path, 'r') as file:
    #length = len(text)
    
    #text = file.readline()
    text = file.read()
    
    
print('the file content is: \n {} '.format(len(text)))

the file content is: 
 774838 


## the open function is a context manager


   When you write with open(), it opens a file that you can read from ('r'), write to ('w'), or append to ('a').  
   
   Then it gives control back to your code so that you can perform operations on the file object.  
    
    
    __In this example, we read the text of the file, store the contents in the variable text, and store the length of the contents in the variable length. When the code inside the indented block is done, the open() function makes sure that the file is closed before continuing on in the script.  
    
    __The print statement is outside of the context, so by the time it runs the file is closed. 
    
    
    
    
# *********************************************************************************
with <context-manager>(<args>) as <variable-name>:
    # Run your code here
    # This code is running "inside the context"
    
# This code runs after the context is removed
# *********************************************************************************

    
    __Any time you use a context manager, it will look like this: 
         the keyword 'with' lets Python know that you are trying to enter a context.  
    
    __Then you call a function. You can call any function that is built to work as a context manager. 
    __A context manager can take arguments like a normal function. You end the with statement with a colon. 
     
    Statements in Python that have an indented block after them, like for loops, if/else statements, function definitions, etc., are called compound statements.
    The with statement is another type of compound statement. Any code that you want to run inside the context that the context manager created needs to be indented. When the indented block is done, the context manager gets a chance to clean up anything that it needs to, like when the open() context manager closed the file. 
    
    Some context managers want to return a value that you can use inside the context. By adding as and a variable name at the end of the with statement, you can assign the returned value to the variable name.
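
The notes above can be checked with a small self-contained sketch. It writes a throwaway file into a temporary directory first (the file name and its contents are made up), then shows that the file object is open inside the with block and closed once the block ends:

```python
import os
import tempfile

# Create a small file to read back (hypothetical name and contents)
path = os.path.join(tempfile.mkdtemp(), 'hello.txt')
with open(path, 'w') as file:
    file.write('hello context managers')

with open(path) as file:        # "as file" binds the value open() hands back
    text = file.read()
    open_inside = file.closed   # False: still inside the context

closed_after = file.closed      # True: the context manager closed the file

print(open_inside)    # False
print(closed_after)   # True
print(len(text))      # 22
```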
    

## with is a statement, not a function




with file_obj as file:
    content = file.readline()


## The number of cats

You are working on a natural language processing project to determine what makes great writers so great. Your current hypothesis is that great writers talk about cats a lot. To prove it, you want to count the number of times the word "cat" appears in "Alice's Adventures in Wonderland" by Lewis Carroll. You have already downloaded a text file, alice.txt, with the entire contents of this great book.


Use the open() context manager to open alice.txt and assign the file to the file variable

In [None]:
# Open "alice.txt" and assign the file to "file"
with open('alice_in_wonderland.txt') as file:
    text = file.read()

    
n = 0
for word in text.split():
    if word.lower() in ['cat', 'cats']:
        n += 1

print('Lewis Carroll uses the word "cat" {} times'.format(n))

Lewis Carroll uses the word "cat" 0 times


In [43]:
with open ('alice_in_wonderland.txt') as file:
    text = file.read()
    #content = print(text)  #*
    
a = text.split()  # <= it becomes a list
# a.head()

a[0:5] 

#content  #*

["Alice's", 'Adventures', 'in', 'Wonderland', "ALICE'S"]

In [46]:
print(print.__doc__)

print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the current sys.stdout.
sep:   string inserted between values, default a space.
end:   string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.


## The speed of cats

You're working on a new web service that processes Instagram feeds to identify which pictures contain cats (don't ask why -- it's the internet). The code that processes the data is slower than you would like it to be, so you are working on tuning it up to run faster. Given an image, image, you have two functions that can process it:

    process_with_numpy(image)
    process_with_pytorch(image)

Your colleague wrote a context manager, timer(), that will print out how long the code inside the context block takes to run. She is suggesting you use it to see which of the two options is faster. Time each function to determine which one to use in your web service.



Hint

    The timer() context manager does not take any arguments.
    The timer() context manager doesn't yield a value, so you don't need an as <variable name> in your with statement.


In [None]:
image = get_image_from_instagram()

# Time how long process_with_numpy(image) takes to run
with timer():
    print('Numpy version')
    process_with_numpy(image)

# Time how long process_with_pytorch(image) takes to run
with timer():
    print('Pytorch version')
    process_with_pytorch(image)

In [12]:
#import timer()
# how can I write this %time function, or agic %time function

import contextlib, time

# ********************************************************************************************* #
'''
@contextlib.contextmanager                    #  5, @contextlib.contextmanager decorator
def my_context():                             #  1, a function
    # add any setup code you need             #  2, (optional) setup code
    yield                                     #  3, yield keyword
    # add any teardown code you need          #  4, teardown code     '''
# ********************************************************************************************* #

@contextlib.contextmanager
def timer():
    start = time.time()
    try:
        yield
    finally:
        # Report inside finally so the time is printed even if the block raises
        t_spend = time.time() - start
        print(f"Spend {t_spend} s to run")
        #print("Spend: {:.5}s to run".format(t_spend))
    
    
    
#import contextlib, time
#
#@contextlib.contextmanager
#def timer():
#    """Time the execution of a context block.
#    """
#    start = time.time()
#    # Send control back to the context block
#    
#    try:
#        yield
#        
#    finally:
#        end = time.time()
#        print('Elapsed: {:.7f}s'.format(end - start))   
#    
#    


# If the file is too big, can we apply file.readline() to solve this problem?
def count_bingley(path, count=0):
    with open(path, "r") as file:
        content = file.read()
        for word in content.split():
            if word.lower() in ["bingley", "bingley's"]:
                count += 1
    print(count)
    return count

def count_darcy(path, count=0):
    with open(path, "r") as file:
        content = file.read()
        for word in content.split():
            if word.lower() in ["darcy", "darcy's"]:
                count += 1
    print(count)
    return count


path = 'Pride and Prejudice, by Jane Austen.txt'

with timer():
    print('Bingley version')
    count_bingley(path)

with timer():
    print('Darcy version')
    count_darcy(path)

Bingley version
148
Spend 0.05097055435180664 s to run
Darcy version
214
Spend 0.05250740051269531 s to run
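
On the question in the comment above about files that are too big: one possible answer is to iterate over the file object line by line instead of calling file.read(). The sketch below also merges the two near-identical counting functions into one. The name count_words, its targets parameter, and the sample text are made up for illustration.

```python
import os
import tempfile


def count_words(path, targets):
    """Count words in the file at `path` that match any entry in `targets`."""
    targets = {t.lower() for t in targets}
    count = 0
    with open(path, "r") as file:
        # Iterating over the file object reads one line at a time,
        # so the whole file never has to fit in memory.
        for line in file:
            for word in line.split():
                if word.lower() in targets:
                    count += 1
    return count


# Small self-contained demo on a temporary file (the sample text is invented).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Mr. Bingley met Darcy.\nDarcy's friend Bingley left.\n")
    demo_path = tmp.name

bingley_count = count_words(demo_path, ["bingley", "bingley's"])
darcy_count = count_words(demo_path, ["darcy", "darcy's"])
os.remove(demo_path)
```

Like the original functions, this splits on whitespace only, so a word followed by punctuation ("Darcy.") is not counted.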


In [219]:
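# Note: a bare `%time` line times an empty statement, which is why the output below is only a few microseconds.
# To time the whole cell, put `%%time` on the first line of the cell instead.
%time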

def count_cat(path, count=0):
    
    with open(path) as file:
        content = file.read()
        
        for word in content.split():
            if word.lower() in ['cat', 'cats']:
                count += 1
            
    return count


count_cat('alice_in_wonderland.txt')

CPU times: user 3 µs, sys: 1e+03 ns, total: 4 µs
Wall time: 6.2 µs


24

## Writing context managers






There are two ways to define a context manager in Python:  
    __by using a class that has special __enter__() and __exit__() methods, or 
    __by decorating a certain function *
    
    
    
#    How to create a context manager? 
   
      1, define a function,
      2, (optional) add any setup code your context manager needs
      3, use the yield keyword to signal to Python that this is a special kind of function
      4, after the yield statement, add any teardown code that you need to clean up the context
      5, finally, decorate the function with @contextmanager from the contextlib module

In [None]:
# ---------------------------------------------------------------------------------------------- #

import contextlib

@contextlib.contextmanager                    #  5, @contextlib.contextmanager decorator
def my_context():                             #  1, a function
    # add any setup code you need             #  2, (optional) setup code
    yield                                     #  3, yield keyword
    # add any teardown code you need          #  4, teardown code
    
# ---------------------------------------------------------------------------------------------- #
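
The notes mention the class-based route but only show the decorator route. As a rough sketch of the other option, a class with __enter__() and __exit__() could look like the following. The class name MyContext and its events list are invented for illustration.

```python
class MyContext:
    """Class-based counterpart of the my_context() template (illustrative only)."""

    def __enter__(self):
        # Setup code goes here; the return value is what `as` binds to.
        self.events = ["enter"]
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Teardown code goes here; it runs even if the block raised an error.
        self.events.append("exit")
        return False  # False means: do not swallow exceptions from the block


with MyContext() as ctx:
    ctx.events.append("body")
```

After the block finishes, ctx.events records the order in which the three parts ran.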






## decorators will be discussed in the next chapter



__yield means that you are going to return a value, 
      but you expect to finish the rest of the function at some point in the future. 

__The value that your context manager yields can be assigned to a variable in the with statement by adding "as <variable name>". 





In fact, a context manager function is technically a generator that yields a single value.




### the ability for a function to yield control and know that it will get to finish running later is what makes context managers so useful. 
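
To make the "yield control and finish later" idea concrete, here is a small sketch that drives a plain generator by hand with next(), which is roughly what the with statement does on your behalf. The names two_steps and log are made up for this example.

```python
log = []


def two_steps():
    log.append("setup")      # runs on the first next()
    yield 42                 # pause here and hand 42 to the caller
    log.append("teardown")   # runs only when the generator is resumed


gen = two_steps()
value = next(gen)            # run up to the yield
log.append("body")           # the caller does its own work while the generator is paused
try:
    next(gen)                # resume after the yield; the generator then finishes
except StopIteration:
    pass                     # a finished generator signals completion this way
```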







In [53]:
import contextlib


@contextlib.contextmanager
def my_context():
    print('hello')
    
    yield 42
    
    print('goodbye')
    
    
with my_context() as foo:
    print('foo is {}'.format(foo))

hello
foo is 42
goodbye


In [14]:
# ***************************************************************************************** #
# ***************************************************************************************** #
# ***************************************************************************************** #
from contextlib import contextmanager

@contextmanager
def CustomFileOpen(filename, method):
    """Custom context manager for opening a file."""

    f = open(filename, method)
    try:
        yield f
    finally:
        f.close()

        
        
with CustomFileOpen('Pride and Prejudice, by Jane Austen.txt', 'r') as file:
    content = file.readline()
    
print(content)

# ***************************************************************************************** #
# ***************************************************************************************** #
# ***************************************************************************************** #

﻿The Project Gutenberg eBook of Pride and Prejudice, by Jane Austen



In [52]:
import contextlib

@contextlib.contextmanager
def database(url):
    # Setup a database connection
    db = postgres.connect(url)
    
    yield db
    
    # Teardown database connection
    db.disconnect()
    
    
    
url = 'datacamp.com/data'

with database(url) as my_db:
    course_list = my_db.execute(
    'SELECT * FROM courses')

NameError: name 'postgres' is not defined
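
The cell above fails because postgres is a placeholder, not an importable module. A runnable stand-in, assuming the standard-library sqlite3 module and an in-memory database in place of a real server, might look like this. The table name and row are invented, and try/finally is added so the connection is closed even if the block raises.

```python
import contextlib
import sqlite3


@contextlib.contextmanager
def database(path):
    # Set up a database connection (sqlite3 stands in for the postgres placeholder)
    db = sqlite3.connect(path)
    try:
        yield db
    finally:
        # Tear down the connection, whether or not the block raised
        db.close()


with database(":memory:") as my_db:
    my_db.execute("CREATE TABLE courses (title TEXT)")
    my_db.execute("INSERT INTO courses VALUES ('Writing Functions in Python')")
    course_list = my_db.execute("SELECT * FROM courses").fetchall()
```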

## Yielding a value or None






__The database() context manager yields a specific value, the database connection, that can be used in the context block.  


Some context managers don't yield an explicit value. in_dir() is a context manager that changes the current working directory to a specific path and then changes it back after the context block is done. It does not need to return anything with its yield statement.  

In [None]:
import contextlib, os

@contextlib.contextmanager
def in_dir(path):
    # save current working directory
    old_dir = os.getcwd()
    
    # switch to new working directory
    os.chdir(path)
    
    yield
    
    # change back to previous working directory
    os.chdir(old_dir)
    
    
    
with in_dir('/data/project_1/'):
    project_files = os.listdir()

## The timer() context manager

A colleague of yours is working on a web service that processes Instagram photos. Customers are complaining that the service takes too long to identify whether or not an image has a cat in it, so your colleague has come to you for help. You decide to write a context manager that they can use to time how long their functions take to run.



    Add a decorator from the contextlib module to the timer() function that will make it act like a context manager.
    Send control from the timer() function to the context block.



Hint

    Remember that context managers use yield to return control to the context.
    timer() does not return a specific value when it yields control.


In [15]:
# ********************************************************************************************* #
# ********************************************************************************************* #
# ********************************************************************************************* #


# Add a decorator that will make timer() a context manager
import contextlib, time

@contextlib.contextmanager
def timer():
    """Time the execution of a context block.
    
    Yields:
      None
    """
    start = time.time()
    # Send control back to the context block
    
    try:
        yield
    finally:
        end = time.time()
        print('Elapsed: {:.7f}s'.format(end - start))   
        # {:.7f} formats the number in fixed-point notation with 7 digits after the decimal point

    
with timer():
    print('This should take approximately 0.25 seconds')
    time.sleep(0.25)
    
    
    
#del timer()   # <= SyntaxError: cannot delete function call
del timer


This should take approximately 0.25 seconds
Elapsed: 0.2506285s
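
A possible variation, not part of the exercise: time.perf_counter() is a monotonic, high-resolution clock in the standard library and is usually a better fit than time.time() for measuring short intervals. The label parameter is an addition made up for this sketch.

```python
import contextlib
import time


@contextlib.contextmanager
def timer(label="Elapsed"):
    """Time the execution of a context block (sketch using perf_counter)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs whether or not the block raised, so the time is always reported
        elapsed = time.perf_counter() - start
        print("{}: {:.7f}s".format(label, elapsed))


with timer("Sleep"):
    time.sleep(0.05)
```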


In [258]:
# Add a decorator that will make timer() a context manager
import contextlib, time

@contextlib.contextmanager
def timer():
    """Time the execution of a context block.
    """
    
    start = time.time()
    # Send control back to the context block
    yield
    end = time.time()
    print('Elapsed: {:.5}s'.format(end - start))
    
    

def count_cat(path, count=0):
    with open(path) as file:
        content = file.read()
        
        for word in content.split():
            if word.lower() in ['cat', 'cats', "cat's"]:
                count += 1
            
    return count



def count_alice(path, count=0):
    with open(path) as file:
        content = file.read()
        
        for word in content.split():
            if word.lower() in ['alice', "alice's"]:
                count += 1
            
    return count



def count_rabbit(path, count=0):
    with open(path) as file:
        content = file.read()
        
        for word in content.split():
            if word.lower() in ["rabbit's", 'rabbit']:
                count += 1
                
        print(count)



path = 'alice_in_wonderland.txt'

# Time how long count_cat(path) takes to run
with timer():
    print('Cat version')
    count_cat(path)

# Time how long count_alice(path) takes to run
with timer():
    print('Alice version')
    count_alice(path)
    
# Time how long count_rabbit(path) takes to run
with timer():
    print('Rabbit version')
    count_rabbit(path)

Cat version
Elapsed: 0.013591s
Alice version
Elapsed: 0.0093617s
Rabbit version
33
Elapsed: 0.0090773s


## A read-only open() context manager

You have a bunch of data files for your next deep learning project that took you months to collect and clean. It would be terrible if you accidentally overwrote one of those files when trying to read it in for training, so you decide to create a read-only version of the open() context manager to use in your project.

The regular open() context manager:

    takes a filename and a mode ('r' for read, 'w' for write, or 'a' for append)
    opens the file for reading, writing, or appending
    yields control back to the context, along with a reference to the file
    waits for the context to finish
    and then closes the file before exiting

Your context manager will do the same thing, except it will only take the filename as an argument and it will only open the file for reading.




    Yield control from open_read_only() to the context block, ensuring that the read_only_file object gets assigned to my_file.
    Use read_only_file's .close() method to ensure that you don't leave open files lying around.

Hint

    The open() function creates a reference to a file.
    The function open_read_only() should send that file back to the context.
    You close a file with the .close() method of a file object.


In [257]:
@contextlib.contextmanager
def open_read_only(filename):
    """Open a file in read-only mode.

    Args:
      filename (str): The location of the file to read

    Yields:
      file object
    """
    read_only_file = open(filename, mode='r')
    # Yield read_only_file so it can be assigned to my_file
    
    yield read_only_file
    # Close read_only_file
    read_only_file.close()

        
    
with open_read_only('alice_in_wonderland.txt') as my_file:
    print(my_file.readline())

Alice's Adventures in Wonderland



## Advanced topics




In this lesson, we'll cover __nested contexts, handling errors, and how to know when to create a context manager. 


Imagine you are implementing a copy() function that copies the contents of one file to another file. One way you could write this function would be to open the source file, read its contents into a content variable, then open the destination file and write the content into it. 






## This approach works fine, until we try to copy a file that is too large to fit in memory. 



A better idea would be to open both files at once and copy over one line at a time. The file object that the open() context manager returns can be iterated over in a for loop. The statement "for line in my_file" will read in the contents of my_file one line at a time, until the end of the file.


with open('abc.txt') as my_file:
    for line in my_file:
        pass  # do something with line here

In [253]:
def copy(src, dst):
    """Copy the contents of one file into another
    
    Args:
      src (str): File name of source to be copied
      dst (str): File name of new file
    """
    
    # Open the source file and read in the contents
    with open(src, 'r') as f_src:
        content = f_src.read()     # <= read the whole file into the content variable
        
    # Open the destination file and write out the content
    with open(dst, 'w') as f_dst:  # 'w' overwrites, so dst ends up as an exact copy
        f_dst.write(content)     # <= write the content variable into the destination file
    
    
    
copy('abc.txt', 'ABC.txt')

In [18]:
# ***************************************************************************************** #
# ***************************************************************************************** #

def copy(src, dst):
    """Copy the contents of one file into another
    
    Args:
      src (str): File name of source to be copied
      dst (str): File name of new file
    """
    
    # Open the source file first
    with open(src, 'r') as f_src:
        with open(dst, 'w') as f_dst:  # Then open the destination file inside the source file's context
            for line in f_src:  
                # Code inside the inner block has access to both f_src and f_dst
                f_dst.write(line)    
    
    
#copy('abc.txt', 'ABC.txt')



import contextlib, time

@contextlib.contextmanager
def timer():
    """Time the execution of a context block.
    """
    
    start = time.time()
    # Send control back to the context block
    
    yield
    end = time.time()
    print('Elapsed: {:.5}s'.format(end - start))
    
    
    
    
with timer():
    copy('abc.txt', 'ABC.txt')

Elapsed: 0.002811s
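
Nested with blocks can also be written as a single with statement that lists several context managers separated by commas. The sketch below applies that to copy(). The destination is opened with 'w' so each run starts from an empty file, and the temporary directory and file contents are made up to keep the demo self-contained.

```python
import os
import tempfile


def copy(src, dst):
    """Copy the contents of one file into another, one line at a time."""
    # Both files are opened in one with statement; both are closed on exit.
    with open(src, "r") as f_src, open(dst, "w") as f_dst:
        for line in f_src:
            f_dst.write(line)


# Demo inside a temporary directory (paths and contents are illustrative).
with tempfile.TemporaryDirectory() as tmp_dir:
    src_path = os.path.join(tmp_dir, "abc.txt")
    dst_path = os.path.join(tmp_dir, "ABC.txt")
    with open(src_path, "w") as f:
        f.write("first line\nsecond line\n")
    copy(src_path, dst_path)
    with open(dst_path) as f:
        copied = f.read()
```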


## Handling errors




One thing you want to think about when writing your context manager is: what happens if the programmer who uses your context manager writes code that causes an error? 




In [None]:
import contextlib

@contextlib.contextmanager
def get_printer(ip):
    # connect_to_printer() is assumed to be defined elsewhere
    p = connect_to_printer(ip)
    
    yield p
    
    # This MUST be called or no one else will be able to connect to the printer 
    p.disconnect()
    print('disconnected from the printer')
    
    
    
    
doc = {"text": "This is my text"}
    
with get_printer('10.0.34.111') as printer:
    printer.print_page(doc['txt'])  
    # ---------------------------------------------------------------------------------------------- #
    # <= this will raise a KeyError, so the code stops here and p.disconnect() never gets called
    #                                                             ---------------------------------- #
    
    
    

## Handling errors



try:
    # code that might raise exception
    
except:
    # do something about errors
    
finally:
    # this code block runs no matter what
    
    
    
    
## Python OOP Course notes:
# --------------------------------------------------------------------------------------- #
try:
    # try running some code or program
    
except ExceptionNameHere:
    # run this code if ExceptionNameHere happened
    
except AnotherExceptionNameHere:
    # run this code if AnotherExceptionNameHere happened
    
......
    
finally:        # <= this is optional
    # run this code no matter what
    # this code block is best used for cleaning up, like closing open files. 
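
The templates above are pseudocode. A small runnable sketch of the same try / except / finally flow is shown here; the function name lookup and the steps list are invented so the order of execution can be inspected.

```python
def lookup(doc, key):
    """Return (value, steps), where steps records which blocks ran."""
    steps = []
    try:
        steps.append("try")
        value = doc[key]          # may raise KeyError
    except KeyError:
        steps.append("except")    # runs only if the KeyError happened
        value = None
    finally:
        steps.append("finally")   # runs no matter what
    return value, steps


doc = {"text": "This is my text"}
ok = lookup(doc, "text")
missing = lookup(doc, "txt")      # misspelled key triggers the except branch
```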

In [None]:
import contextlib

@contextlib.contextmanager
def get_printer(ip):
    p = connect_to_printer(ip)
    
    
    try:
        yield p
        
    # except
    
    finally:
        p.disconnect()
        print('disconnected from the printer')
    

## When the sloppy programmer runs the code, they still get the KeyError, but finally ensures that p.disconnect() is called before the error is raised.  
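
Since connect_to_printer() is not available in these notes, the behavior can be checked with a stand-in object. In this sketch, FakePrinter is purely hypothetical, and get_printer() takes the printer object directly instead of an IP address.

```python
import contextlib


class FakePrinter:
    """Stand-in for a real printer connection (purely illustrative)."""

    def __init__(self):
        self.connected = True
        self.pages = []

    def print_page(self, text):
        self.pages.append(text)

    def disconnect(self):
        self.connected = False


@contextlib.contextmanager
def get_printer(printer):
    try:
        yield printer
    finally:
        # Runs before the error from the block propagates to the caller
        printer.disconnect()


fake = FakePrinter()
doc = {"text": "This is my text"}
try:
    with get_printer(fake) as printer:
        printer.print_page(doc["txt"])   # misspelled key raises KeyError
except KeyError:
    error_seen = True
else:
    error_seen = False
```

The KeyError still reaches the caller, but by then the printer has already been disconnected.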

In [None]:
import contextlib

@contextlib.contextmanager
def get_printer(ip):
    p = connect_to_printer(ip)
    
    try:
        yield p
        
    finally:
        p.disconnect()
        print('disconnected from the printer')
        
        
        
doc = {"text": "This is my text"}
        
with get_printer('10.0.34.111') as printer:
    printer.print_page(doc["text"])

In [50]:
def print_file(path):
    with open(path, 'r') as file:
        text = file.read()
        
    return {path: text}   # question: how are dictionaries used as function return values in practice, especially in a production environment?



path = 'alice_in_wonderland.txt'

alice = print_file(path)

alice['test'] = 123
alice['test']

123

In [259]:

import contextlib

@contextlib.contextmanager
def get_printer(ip):
    p = connect_to_printer(ip)
    
    try:
        yield p
    
    finally:
        
        # This MUST be called or no one else will be able to connect to the printer 
        p.disconnect()
        print('disconnected from the printer')
    
    
    
doc = 'alice_in_wonderland.txt'

with open(doc, 'r') as file:
    content = file.read()
    
    #return {doc: content}
    
    
    
#def doc_print(doc):
#    with open(doc, 'r') as file:
#        content = file.read()
#        
#    return {doc, content}
    
    
    
#print(content)
#content
?content

#doc_print(doc)
    

#doc = {"text": "This is my text"}
#    
#with get_printer('10.0.34.111') as printer:
#    printer.print_page(doc['text'])

Type:        str
String form:
Alice's Adventures in Wonderland
           
           ALICE'S ADVENTURES IN WONDERLAND
           
           <...>
           remembering her own child-life, and the happy summer days.
           
           THE END
Length:      148574
Docstring:  
str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or
errors is specified, then the object must expose a data buffer
that will be decoded using the given encoding and error handler.
Otherwise, returns the result of object.__str__() (if defined)
or repr(object).
encoding defaults to sys.getdefaultencoding().
errors defaults to 'strict'.


In [23]:
def doc_print(doc):
    with open(doc, 'r') as file:
        content = file.read()
        
    return {doc: content}   # {doc: content} is a dict; {doc, content} would build a set
    #return doc, content
    
    
    
doc = 'alice_in_wonderland.txt'
    
#print(content)
#content

#doc_print(doc)

## Context manager use cases

Which of the following would NOT be a good opportunity to use a context manager?
Possible Answers

    A function that starts a timer that keeps track of how long some block of code takes to run.
    A function that prints all of the prime numbers between 2 and some value n.
    A function that connects to a smart thermostat so that it can be programmed remotely.
    A function that prevents multiple users from updating an online spreadsheet at the same time by locking access to the spreadsheet before every operation.

In [67]:
a = []
n = 100

for i in range(2, n):
    for j in range(2, i):
        if i % j == 0:
            break
    else:
        a.append(i)
    
print(a)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]


In [180]:
# Python program to display all the prime numbers within an interval

lower = 2
upper = 100

print("Prime numbers between", lower, "and", upper, "are:")

a = []

for num in range(lower, upper + 1):
    # all prime numbers are greater than 1
    if num > 1:
        for i in range(2, num):
            if (num % i) == 0:
                break
        else:
            a.append(num)
                
print(a)

Prime numbers between 2 and 100 are:
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]


## Scraping the NASDAQ

Training deep neural nets is expensive! You might as well invest in NVIDIA stock since you're spending so much on GPUs. To pick the best time to invest, you are going to collect and analyze some data on how their stock is doing. The context manager stock('NVDA') will connect to the NASDAQ and return an object that you can use to get the latest price by calling its .price() method.

You want to connect to stock('NVDA') and record 10 timesteps of price data by writing it to the file NVDA.txt.

You will notice the use of an underscore when iterating over the for loop. If this is confusing to you, don't worry. It could easily be replaced with i, if we planned to do something with it, like use it as an index. Since we won't be using it, we can use the conventional throwaway name, _, which signals to readers that the loop variable is intentionally unused.



    Use the stock('NVDA') context manager and assign the result to nvda.
    Open a file for writing with open('NVDA.txt', 'w') and assign the file object to f_out so you can record the price over time.


In [None]:
# Use the "stock('NVDA')" context manager
# and assign the result to the variable "nvda"
with stock('NVDA') as nvda:
    # Open "NVDA.txt" for writing as f_out
    with open("NVDA.txt", 'w') as f_out:
        for _ in range(10):
            value = nvda.price()
            print('Logging ${:.2f} for NVDA'.format(value))
            f_out.write('{:.2f}\n'.format(value))

## Changing the working directory

You are using an open-source library that lets you train deep neural networks on your data. Unfortunately, during training, this library writes out checkpoint models (i.e., models that have been trained on a portion of the data) to the current working directory. You find that behavior frustrating because you don't want to have to launch the script from the directory where the models will be saved.

You decide that one way to fix this is to write a context manager that changes the current working directory, lets you build your models, and then resets the working directory to its original location. You'll want to be sure that any errors that occur during model training don't prevent you from resetting the working directory to its original location.



    Add a statement that lets you handle any errors that might occur inside the context.
    Add a statement that ensures os.chdir(current_dir) will be called, whether there was an error or not.


In [None]:
import contextlib, os

@contextlib.contextmanager
def in_dir(directory):
    """Change current working directory to `directory`,
    allow the user to run some code, and change back.
    
    Args:
      directory (str): The path to a directory to work in.
    """
    
    current_dir = os.getcwd()
    os.chdir(directory)
    
    # Add code that lets you handle errors
    try:
        yield
    
    # Ensure the directory is reset, whether there was an error or not
    finally:
        os.chdir(current_dir)
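
To check that the directory really is restored after a failure, a sketch like the following can be run. It redefines in_dir() with the decorator so the block is self-contained, and the temporary directory and the simulated error are stand-ins for a real training run.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def in_dir(directory):
    current_dir = os.getcwd()
    os.chdir(directory)
    try:
        yield
    finally:
        # Reset the working directory, whether there was an error or not
        os.chdir(current_dir)


start_dir = os.getcwd()
with tempfile.TemporaryDirectory() as tmp_dir:
    try:
        with in_dir(tmp_dir):
            inside_dir = os.getcwd()
            raise RuntimeError("simulated training failure")
    except RuntimeError:
        pass
    after_dir = os.getcwd()
```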

## Functions are objects





In this chapter, you are going to learn about decorators, a powerful way of modifying the behavior of functions. But first we need to build up some foundational concepts that will make decorators easy to understand. The main thing you should take away from this lesson is that functions are just like any other object in Python, so you can do anything to or with them that you would do with any other object. 

In [179]:
def my_function():   # with parentheses
    print('hello world')
    
    
def my_fu(text):   # with parentheses
    print(text)
    
    
hell = my_function   # without parentheses, when assign a function to a variable
hell            # its an object
#hell()    # with parentheses

hel = my_fu
hell()     # with parentheses, type the function with the parentheses mean you are calling the function
hel('hola')

hello world
hola


Notice that when you assign a function to a variable, you do not include the parentheses after the function name. This is a subtle but very important distinction. When you type the function name with the parentheses, you are calling that function. When you type the function name without the parentheses, you are referencing the function, and the expression evaluates to a function object.  


Since the function is just like anything else in python, you can pass it as an argument to another function.  



def has_docstring(func):
    """Check to see if the function func has a docstring
    
    Args: 
      func (callable): A function
      
    Return:
      bool
    
    """

    return func.__doc__ is not None

In [39]:
def has_docstring(func):
    """Check to see if the function func has a docstring
    
    Args: 
      func (callable): A function
      
    Return:
      bool
    
    """

    return func.__doc__ is not None


def no():
    return 42

def yes():
    """Returns the value 42
    """
    return 42
        
has_docstring(no)
has_docstring(yes)

True

In [72]:
list_of_functions = ["__repr__", print, open]
list_of_functions[1]('hello world')

hello world


In [77]:
dict_of_functions = {'f1':open, 'f2':print, 'f3':'__repr__'}
dict_of_functions['f2']('hello')

hello


## Functions can also be defined inside another function.  







These kinds of functions are called **nested functions**; you may also hear them called inner functions, helper functions, or child functions. A nested function can make code easier to read.



In [94]:
def fool(x, y):
    if x > 4 and x < 10 and y > 4 and y < 10:
        print(x * y)
        
        
        
def fool2(x, y):       # nested functions is okay
    def in_range(v):
        return v > 4 and v < 10
    if in_range(x) and in_range(y):
        print(x * y)
        
fool2(5,9)

45


In [102]:
def get_function():
    def print_me(s):
        print(s)
    return print_me


get_function()   # returns an object

n_fun = get_function()
n_fun('hello')

hello


### Nested functions






### Using Inner Functions: The Basics



The use cases of Python inner functions are varied. You can use them to provide encapsulation and hide your functions from external access, you can write helper inner functions, and you can also create closures and decorators. In this section, you’ll learn about the former two use cases of inner functions, and in later sections, you’ll learn how to create closure factory functions and decorators.

### Providing Encapsulation

A common use case of inner functions arises when you need to protect, or hide, a given function from everything happening outside of it so that the function is totally hidden from the global scope. This kind of behavior is commonly known as encapsulation.

Here’s an example that highlights that concept:


In [1]:
def increment(number):
    def inner_increment():
        return number + 1
    return inner_increment()


increment(10)

11

### Building Helper Inner Functions

Sometimes you have a function that performs the same chunk of code in several places within its body. For example, say you want to write a function to process a CSV file containing information about the Wi-Fi hotspots in New York City. To find the total number of hotspots in New York as well as the company that provides most of them, you create a script, hotspots.py, which is listed in a code cell further below.



Here, process_hotspots() takes file as an argument. The function checks if file is a string-based path to a physical file or a __file object__. Then it calls the helper inner function most_common_provider(), which takes a file object and performs the following operations:

    Read the file content into a generator that yields dictionaries using csv.DictReader.
    Create a list of Wi-Fi providers.
    Count the number of Wi-Fi hotspots per provider using a collections.Counter object.
    Print a message with the retrieved information.


 ### The canonical way to create a file object is by using the open() function.
 
 
 
 
 
 
 
 
 
### Using Inner vs Private Helper Functions

Typically, you create helper inner functions like most_common_provider() when you want to provide encapsulation. You can also create inner functions if you think you’re not going to call them anywhere else apart from the containing function.

Although writing your helper functions as inner functions achieves the desired result, you’ll probably be better served by extracting them as top-level functions. In this case, you could use a leading underscore (_) in the name of the function to indicate that it’s private to the current module or class. This will allow you to access your helper functions from anywhere else in the current module or class and reuse them as needed.

Extracting inner functions into top-level private functions can make your code cleaner and more readable. This practice can produce functions that consequently apply the single-responsibility principle.
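
As a rough illustration of that advice, the in_range() check used as an inner function earlier in these notes could be extracted into a module-level helper with a leading underscore, so that more than one function can reuse it. The names _in_range, product_if_in_range, and sum_if_in_range are invented for this sketch.

```python
def _in_range(v, low=4, high=10):
    """Private helper (note the leading underscore): is low < v < high?"""
    return low < v < high


def product_if_in_range(x, y):
    """Return x * y if both values are in range, otherwise None."""
    if _in_range(x) and _in_range(y):
        return x * y
    return None


def sum_if_in_range(x, y):
    """Return x + y if both values are in range, otherwise None."""
    if _in_range(x) and _in_range(y):
        return x + y
    return None
```

The underscore is only a naming convention: Python does not prevent other modules from calling _in_range(), but the name tells readers it is an implementation detail.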


In [198]:
with open('alice_in_wonderland.txt') as file:
    content = file.readline()
    words = []
    words.append(content.split())
    
    del file  # only removes the name `file`; the with block still closes the file on exit
    
    
words
file   # raises NameError because the name was deleted; without `del`, `file` would still exist but be closed

NameError: name 'file' is not defined

In [3]:
def factorial(number):
    # Validate input
    if not isinstance(number, int):
        raise TypeError("Sorry. 'number' must be an integer.")
    if number < 0:
        raise ValueError("Sorry. 'number' must be zero or positive.")
    
    # Calculate the factorial of number
    def inner_factorial(number):
        if number <= 1:
            return 1
        return number * inner_factorial(number - 1)
    
    return inner_factorial(number)


factorial(4)

24

In [19]:
# ------------------------------------------------------------------------------------------- #
# ------------------------------------------------------------------------------------------- #

# hotspots.py

import csv
from collections import Counter

def process_hotspots(file):
    def most_common_provider(file_obj):
        hotspots = []
        with file_obj as csv_file:      # file objects are context managers, so the file is closed afterwards
            content = csv.DictReader(csv_file)
            #print(list(content))  # <= {'OBJECTID': '1', 'Borough': 'BK', 'Type': 'Limited Free',...}
                                  # <csv.DictReader object at 0x7f94a4455310>
            for row in content:
                hotspots.append(row["Provider"])    # <= row['Provider'] picking dict 
            #print(hotspots)  # <= a list of Provider info, "LinkNYC - Citybridge"

        counter = Counter(hotspots)
        print(f"There are {len(hotspots)} Wi-Fi hotspots in NYC.\n"
              f"{counter.most_common(1)[0][0]} has the most with "
              f"{counter.most_common(1)[0][1]}.")
        #print(counter)   # <= 'LinkNYC - Citybridge': 1731
        
    if isinstance(file, str):
        file_obj = open(file, "r")      # a path string must be opened first; the helper expects a file object
        most_common_provider(file_obj)
    else:
        most_common_provider(file)

        
#process_hotspots("NYC_Wi-Fi_Hotspot_Locations.csv")   # <= file path as input of function

file_obj = open("./NYC_Wi-Fi_Hotspot_Locations.csv", "r")
process_hotspots(file_obj)                            # <= file object as input of function

del csv, Counter, process_hotspots, file_obj

There are 3167 Wi-Fi hotspots in NYC.
LinkNYC - Citybridge has the most with 1731.


In [267]:
# ****************************************************************************************** #
###  counter.most_common([n])

    # Return a list of the n most common elements and their counts from the most common to 
    # the least. If n is omitted or None, most_common() returns all elements in the counter. 
    # Elements with equal counts are ordered in the order first encountered:
    

Counter('abracadabra').most_common(3)
#[('a', 5), ('b', 2), ('r', 2)]


[('a', 5), ('b', 2), ('r', 2)]

In [196]:
class Employee:
  
    # Initializing 
    def __init__(self):
        print('Employee created')
  
    # Calling destructor
    def __del__(self):
        print("Destructor called")
        
def Create_obj():
    print('Making Object...')
    obj = Employee()
    print('function end...')
    return obj
  
print('Calling Create_obj() function...')
obj = Create_obj()
#print('Program End...')

Calling Create_obj() function...
Making Object...
Employee created
function end...
Destructor called


In [192]:
mytuple = ("apple", "banana", "cherry")
myit = iter(mytuple)

print(next(myit))
print(next(myit))
print(next(myit))

print(myit)

myit

?myit

apple
banana
cherry
<tuple_iterator object at 0x7fb308094b50>


Type:        tuple_iterator
String form: <tuple_iterator object at 0x7fb308094b50>
Docstring:   <no docstring>


## Building a command line data app

You are building a command line tool that lets a user interactively explore a data set. We've defined four functions: mean(), std(), minimum(), and maximum() that users can call to analyze their data. Help finish this section of the code so that your users can call any of these functions by typing the function name at the input prompt.

Note: The function get_user_input() in this exercise is a mock version of asking the user to enter a command. It randomly returns one of the four function names. In real life, you would ask for input and wait until the user entered a value.



    Add the functions std(), minimum(), and maximum() to the function_map dictionary, like we did with mean().
    The name of the function the user wants to call is stored in func_name. Use the dictionary of functions, function_map, to call the chosen function and pass data as an argument.
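
A minimal sketch of this dictionary-dispatch idea. It assumes the standard-library `statistics` module and the built-ins `min()` and `max()` as stand-ins for the course's mean(), std(), minimum(), and maximum(); the `run_command` helper is hypothetical and not part of the exercise.

```python
import statistics

# Map the command names a user can type to the functions that implement them
function_map = {
    'mean': statistics.mean,
    'std': statistics.pstdev,   # population standard deviation (an assumption)
    'minimum': min,
    'maximum': max,
}

def run_command(func_name, data):
    """Look up func_name in function_map and call the function on data."""
    if func_name not in function_map:
        raise ValueError(f'Unknown command: {func_name!r}')
    return function_map[func_name](data)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for name in function_map:
    print(name, run_command(name, data))
```

Because functions are objects, the dictionary stores references to them (no parentheses), and the call only happens when `function_map[func_name](data)` is evaluated.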


In [108]:
# Add the missing function references to the function map
function_map = {
  'mean': mean,
  'std': std,
  'minimum': minimum,
  'maximum': maximum
}

#data = load_data()
data = [1,2,3,4,5,6,7,8,9,10]
mean(data)

print(data)

func_name = get_user_input()

# Call the chosen function and pass "data" as an argument
function_map[func_name](data)

NameError: name 'mean' is not defined

In [132]:
# Add the missing function references to the function map

import statistics
import numpy as np


# The statistics module offers mean() and stdev(), but has no minimum() or maximum(),
# so NumPy's reductions are used instead.
# Note: np.minimum/np.maximum compare two arrays element-wise; np.min/np.max reduce one array.


function_map = {
    'mean': np.mean,
    'std': np.std,
    'minimum': np.min,
    'maximum': np.max
}

#data = load_data()
data = [1,2,3,4,5,6,7,8,9,10]
np.mean(data)

print(data)

#func_name = get_user_input()

# Call the chosen function and pass "data" as an argument
function_map['mean'](data)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


5.5

In [125]:
a = [1,2,3,4,5,6,7,8,9,10]

import statistics

statistics.mean(a)
#statistics.mean(1,2,3,4)
# TypeError: mean() takes 1 positional argument but 4 were given

5.5

## Reviewing your co-worker's code

Your co-worker is asking you to review some code that they've written and give them some tips on how to get it ready for production. You know that having a docstring is considered best practice for maintainable, reusable functions, so as a sanity check you decide to run this has_docstring() function on all of their functions.

def has_docstring(func):
  """Check to see if the function 
  `func` has a docstring.

  Args:
    func (callable): A function.

  Returns:
    bool
  """
  return func.__doc__ is not None



Instructions 1/3
35 XP

1
    Call has_docstring() on your co-worker's load_and_plot_data() function.

2
    Check if the function as_2D() has a docstring.

3
    Check if the function log_product() has a docstring.


In [148]:
import pandas as pd

def load_and_plot(path):
    '''Load a data set and plot the first two principal components
    
    Args:
      path (str): The location of csv file
      
    Returns:
      Tuple of numpy ndarray: (features, labels)
    
    '''
    
    data = pd.read_csv(path)
    Y = data['labels'].values
    X = data[[i for i in list(data.columns) if i != 'labels']].values
#    pca = PCA(n_components=2).fit_transform(X)
#    plt.scatter(pca[:,0], pca[:,1])
    
    return X, Y



#    df = pd.read_csv(path)
#    y = df['labels'].values
#    X = df[[i for i in list(df.columns) if i != 'labels']].values



train_X, train_y = load_and_plot('train.csv')
#val_X, val_y = load_and_plot('val.csv')
#test_X, test_y = load_and_plot('test.csv')


## Wrapping the repeated logic in a function and then calling that function several times.




# Call has_docstring() on the load_and_plot_data() function
ok = has_docstring(load_and_plot)

if not ok:   # has_docstring() returns a bool (True or False), so it can be tested directly
    print("load_and_plot_data() doesn't have a docstring!")
else:
    print("load_and_plot_data() looks ok")
    
    
    


load_and_plot_data() looks ok


## Returning functions for a math game

You are building an educational math game where the player enters a math term, and your program returns a function that matches that term. For instance, if the user types "add", your program returns a function that adds two numbers. So far you've only implemented the "add" function. Now you want to include a "subtract" function.


Define the subtract() function. It should take two arguments and return the first argument minus the second argument.






Hint

    subtract() should be defined as a nested function, just like add() is.


In [160]:
from numpy import subtract

# Read the code and think how they put things together, nested functions and others


def create_math_function(func_name):
    if func_name == 'add':
        def add(a, b):      # *********************
            return a + b
        return add          # *********************
    elif func_name == 'subtract':
        # Define the subtract() function
        def subtract(a, b):
            return a - b
        return subtract
    else:
        print("I don't know that one")
    
    
add = create_math_function('add')
print('5 + 2 = {}'.format(add(5, 2)))

subtract = create_math_function('subtract')
#print('5 - 2 = {}'.format(subtract(5, 2)))

print(f'5 - 2 = {subtract(5, 2)}')

5 + 2 = 7
5 - 2 = 3


## Scope



Python has to have strict rules about which variable you are referring to when you use a particular variable name.

**Local scope:**
      the inside of your function: its arguments and any variables defined in it

**Nonlocal scope:**
      in the case of nested functions, the scope of the parent (enclosing) function

**Global scope:**
      the things defined outside your functions

**Built-in scope:**
      always available in Python; for instance, the print() function is in the built-in scope

The interpreter looks in the local scope first. If it can't find the variable there, it expands the search to the nonlocal scope (for nested functions), then the global scope, and finally the built-in scope.
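
A small sketch of how name lookup moves outward through these scopes. All function names here are made up for illustration.

```python
x = "global"

def outer():
    x = "nonlocal"              # lives in outer()'s scope

    def inner():
        x = "local"             # shadows the x in outer() and the global x
        return x

    def inner_no_local():
        return x                # no local x, so Python finds outer()'s x

    return inner(), inner_no_local()

def uses_global():
    return x                    # no local or enclosing x, so the global x is used

def uses_builtin():
    return len("abc")           # len is not defined in this file; it comes from the built-in scope

results = (outer(), uses_global(), uses_builtin())
print(results)
```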


In [103]:
x = 7

def foo():
    x = 42
    print(x)
    
    
foo()

x

42


7

In [105]:
x = 7

def foo():
    global x    # global keyword to declare the global scope variable
                # same situation can be applied to nonlocal scope with nonlocal keyword
    x = 42
    print(x)
    
    
foo()

x

42


42

## Understanding scope

What four values does this script print?

x = 50

def one():
  x = 10

def two():
  global x
  x = 30

def three():
  x = 100
  print(x)

for func in [one, two, three]:
  func()     # function one local x=10, global x=50,   print global x=50
  print(x)   # function two local x=30, global x=30,   print global x=30
             # function three local x=100, global x=30,   print local x=100, then   print global=30



Hint

    x = 50 is defined in the global scope.
    First, determine whether each function is modifying the global x or a local x.
    Notice that one() and two() just modify x, but three() modifies x and prints a value.


In [107]:
x = 50

def one():
    x = 10    # no print()

def two():
    global x
    x = 30    # no print()

def three():
    x = 100
    print(x)

for func in [one, two, three]:
    func()
    print(x)

50
30
100
30


## Modifying variables outside local scope

Sometimes your functions will need to modify a variable that is outside of the local scope of that function. While it's generally not best practice to do so, it's still good to know how in case you need to do it. Update these functions so they can modify variables that would usually be outside of their scope.


Instructions 1/3

1    Add a keyword that lets us update call_count from inside the function.

2    Add a keyword that lets us modify file_contents from inside save_contents().

3    Add a keyword to done in check_is_done() so that wait_until_done() eventually stops looping.



Two of the three exercise parts are missing from these notes, but I guess they use the same trick: apply the **global** or **nonlocal** keyword to the variable before using it in the function or nested function.
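
The `global` case is covered by the cell that follows. For the `nonlocal` case, here is only a guess at its shape, not the course's code: `save_contents` and `file_contents` come from the instructions above, while `read_files`, its `chunks` parameter, and the function bodies are assumptions.

```python
def read_files(chunks):
    file_contents = None

    def save_contents(text):
        # Without nonlocal, assigning to file_contents would make it local to
        # save_contents(), and reading it before the assignment would fail.
        nonlocal file_contents
        if file_contents is None:
            file_contents = ''
        file_contents += text

    for chunk in chunks:          # stand-in for reading real files
        save_contents(chunk)
    return file_contents

combined = read_files(['first part. ', 'second part.'])
print(combined)
```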

In [108]:
call_count = 0

def my_function():
    # Use a keyword that lets us update call_count 
    global call_count
    call_count += 1
    
    print("You've called my_function() {} times!".format(
      call_count
    ))

for _ in range(20):
    my_function()

You've called my_function() 1 times!
You've called my_function() 2 times!
You've called my_function() 3 times!
You've called my_function() 4 times!
You've called my_function() 5 times!
You've called my_function() 6 times!
You've called my_function() 7 times!
You've called my_function() 8 times!
You've called my_function() 9 times!
You've called my_function() 10 times!
You've called my_function() 11 times!
You've called my_function() 12 times!
You've called my_function() 13 times!
You've called my_function() 14 times!
You've called my_function() 15 times!
You've called my_function() 16 times!
You've called my_function() 17 times!
You've called my_function() 18 times!
You've called my_function() 19 times!
You've called my_function() 20 times!


## Closures


A closure in Python is a tuple of variables that are no longer in scope, but that a function needs in order to run.


It's Python's way of attaching nonlocal variables to a returned function so that the function can operate even when it is called outside of its parent's scope.








## A closure lets a function object remember values from its enclosing scope even after that scope has finished running
   **The nested function's own local variables cannot be reached from outside; __closure__ only stores the values it uses from the enclosing scope




In [14]:
def outer():
    x = 3
    def inner():
        print(x)
    return inner()


outer()
a = outer()
a
print(a)           # None: outer() returns the result of calling inner(), and inner() returns nothing

3
3
None


In [20]:
def outer():
    x = 3
    def inner():
        y = 6
        return x+y
    return inner()


outer()
a = outer()
#a()                 # TypeError: 'int' object is not callable
print(a)             # 9: outer() returned the int x + y, not a function

9


In [16]:
def foo():
    a = 5
    def bar():
        print(a)
    return bar   # return the function itself; `return bar()` would return its result (None)


func = foo()
func()

print(func)

5
<function foo.<locals>.bar at 0x7f90006353a0>


In [114]:
type(func.__closure__)

tuple

In [115]:
len(func.__closure__)

1

In [116]:
func.__closure__[0].cell_contents

5

## Any nonlocal variables that the nested function is going to need are attached to the function object:
   **those variables get stored in a tuple in the __closure__ attribute of the function.
    
    
    

In [124]:
x = 25


def foo(value):
    def bar():
        print(value)
    return bar


my_func = foo(x)
my_func()    # type the function with the parentheses mean you are calling the function

25


In [125]:
del x
my_func()   # still works: the value 25 was captured in the closure when foo(x) was called

25


In [126]:
len(my_func.__closure__)

1

In [127]:
my_func.__closure__[0].cell_contents

25

## This works because foo()'s value argument gets added to the closure attached to the new my_func() function,
   **so even though x doesn't exist anymore, the value persists in its closure
   

In [23]:
def parent(arg_1, arg_2):
    value = 22
    my_dict = {'apple': 'good'}
    
    def child():
        
        pp = 123      # local to child(); it is not stored in __closure__ and can't be read from outside
        
        print(2 * value)
        print(my_dict['apple'])
        print(arg_1 + arg_2)
        
    return child



new_func = parent(3,4)
new_func()

#   print(new_func.value)     # AttributeError: closure variables are not attributes of the function
#   print(parent(3,4).value)  # same problem; read them through __closure__ instead
# ******************************************************************************** #


print([cell.cell_contents for cell in new_func.__closure__])

## A closure lets a function object remember values from the enclosing scope even
## after the enclosing function has returned.
## The nested function's own locals can't be reached; __closure__ only holds values from the enclosing scope


44
good
7
[3, 4, {'apple': 'good'}, 22]
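
The closure cells carry no names, which makes the list above hard to read. One possible way to see the names as well is `inspect.getclosurevars()` from the standard library. This is a sketch that re-creates `parent()` with a `child()` that returns its values instead of printing them (a change made here for illustration).

```python
import inspect

def parent(arg_1, arg_2):
    value = 22
    my_dict = {'apple': 'good'}

    def child():
        return 2 * value, my_dict['apple'], arg_1 + arg_2

    return child

new_func = parent(3, 4)

# Map each nonlocal variable name to the value stored in its closure cell
closure_vars = inspect.getclosurevars(new_func).nonlocals
print(closure_vars)
```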


In [156]:
def hi():
    
    bye = 5
    sigh = 10
    return bye 

hi()
x = hi()

hi.bye
print(x.bye)

AttributeError: 'function' object has no attribute 'bye'

In [155]:
def hi():
    # other code...
    hi.bye = 42  # Create function attribute.
    sigh = 10

hi()
print(hi.bye)  # -> 42


42



## It's Python's way of attaching nonlocal variables to the returned function so that the function can operate even when it is called outside of its parent's scope.

## Checking for closure

You're teaching your niece how to program in Python, and she is working on returning nested functions. She thinks she has written the code correctly, but she is worried that the returned function won't have the necessary information when called. Show her that all of the nonlocal variables she needs are in the new function's closure.
Instructions 1/3
35 XP

    1
    2 missing
    3 missing




    Use an attribute of the my_func() function to show that it has a closure that is not None.



In [165]:
def return_a_func(arg1, arg2):
    def new_func():
        print('arg1 was {}'.format(arg1))
        print('arg2 was {}'.format(arg2))
    return new_func
    
    
    
my_func = return_a_func(2, 17)


# Show that my_func()'s closure is not None
print(my_func.__closure__ is not None)


#print([cell.cell_contents for cell in new_func.__closure__])
print([cell.cell_contents for cell in my_func.__closure__])

True
[2, 17]


## Closures keep your values safe

You are still helping your niece understand closures. You have written the function get_new_func() that returns a nested function. The nested function call_func() calls whatever function was passed to get_new_func(). You've also written my_special_function() which simply prints a message that states that you are executing my_special_function().

You want to show your niece that no matter what you do to my_special_function() after passing it to get_new_func(), the new function still mimics the behavior of the original my_special_function() because it is in the new function's closure.
Instructions 1/3
35 XP



1    Show that you still get the original message even if you redefine my_special_function() to only print "hello".

2    Show that even if you delete my_special_function(), you can still call new_func() without any problems.

3    Show that you still get the original message even if you overwrite my_special_function() with the new function.

In [176]:
def my_special_function():
    print('You are running my_special_function()')

def get_new_func(func):
    def call_func():
        func()
    return call_func

new_func = get_new_func(my_special_function)


    
                         # ******************************************************************* #
del my_special_function  # safe: the closure captured the function object when get_new_func() was called
    
    


# Redefine my_special_function() to just print "hello"
def my_special_function():
    print("hello")
    
    
    
new_func()    # still prints the original message: the closure holds the original function object


You are running my_special_function()


In [79]:
def my_special_function():
    print('You are running my_special_function()')

def get_new_func(func):
    def call_func():
        func()            # look up func in the closure and call it
    return call_func      # return the nested function itself, without calling it


new_func = get_new_func(my_special_function)
new_func

# Redefine my_special_function() to just print "hello"
def my_special_function():
    print("hello")

    
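new_func = get_new_func(my_special_function)   # wrap again: this closure captures the redefined function
    
    
new_func()    # prints "hello" because new_func was rebuilt after the redefinition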


hello


In [10]:
# Python program to illustrate functions
# Functions can return another function
 
def create_adder(x):
    def adder(y):
        return x+y
 
    return adder

    
add_15 = create_adder(15)
                 # ********************************************************************** #
print(add_15)    # calling the outer function returns the inner function itself, not the result of calling it
    
    
print(add_15(10))   # add_15 is the inner function adder(), so this passes 10 as y
                    # and returns x + y = 15 + 10

#print(add_15(10).__name__)   # AttributeError: 'int' object has no attribute '__name__'
                              # add_15(10) returns the int x + y, not a function


#print([cell.cell_contents for cell in my_func.__closure__])
print([cell.cell_contents for cell in add_15.__closure__])   # add_15 is still a function (adder), so it has a closure
                                                             # ***************************************
                                                             # ***************************************
    
pp = add_15(12)
print(pp)
print([cell.cell_contents for cell in pp.__closure__])   # pp is an int, not a function, so this raises
     # AttributeError: 'int' object has no attribute '__closure__'

<function create_adder.<locals>.adder at 0x7f90047deaf0>
25
[15]
27


AttributeError: 'int' object has no attribute '__closure__'

## Decorators






A decorator is a wrapper that you can place around a function to change that function's behavior. You can modify the inputs, modify the outputs, or even change the behavior of the function itself.






In [13]:
@double_args
def multiply(a, b):
    return a*b


multiply(3, 4)

NameError: name 'double_args' is not defined

In [29]:
def multiply(a, b):
    return a*b


def double_args(func):
    return func                         # it returns a function, pass it to a variable, then call it


                                        # ***********************************************************
new_multiply = double_args(multiply)    # double_args was called, it returns the function passed to it
new_multiply(3, 5)                      # and its equal to multiply(), call it and pass arguments (3,5)

15

**In order for your decorator to return a modified function, it is usually helpful for it to define a new function to return. We'll call that nested function wrapper().

All wrapper() does is take two arguments and pass them on to whatever function was passed to double_args in the first place. If double_args then returns the new wrapper function, the return value acts exactly the same as whatever function was passed to double_args, assuming that the function passed to double_args also takes exactly two arguments.


In [36]:
# *********************************************************************************************** #
# *********************************************************************************************** #
# *********************************************************************************************** #


def multiply(a, b):
    return a*b

def double_arg(func):
    # define a new function that we can modify
    def wrapper(a, b):
        a_new, b_new = a*2, b*2
        # call the original function with the doubled arguments
        return func(a_new, b_new)
        #return func(a/2, b/2)
    # the enclosing function returns the nested function itself,
    return wrapper             # not the result of calling it; the @decorator syntax relies on this


                                     # *************************************************************
new_multiply = double_arg(multiply)  # pass multiply to double_arg and assign result to new_multiply
print('new_multiply(3,7):', new_multiply(3, 7))  # then call new_multiply, which is now equal to wrapper
                                                 # wrapper calls multiply, because it's the function 
                                                 # passed to double_arg. So wrapper calls multiply
                                                 # with the new arguments. 



@double_arg                     # same as writing new_add = double_arg(new_add) after the def
def new_add(a, b):
    return a+b

print('new_add(3,7):', new_add(3, 7))


new_add = double_arg(new_add)   # decorating a second time: the arguments are now doubled twice
print('new_add(3,7)', new_add(3, 7))

del double_arg, new_add, multiply

new_multiply(3,7): 84
new_add(3,7): 20
new_add(3,7) 40


In [37]:
def multiply(a, b):
    return a*b

def double_arg(func):    
    def wrapper(a, b):
        a_new, b_new = a*2, b*2
        return func(a_new, b_new)
    # the enclosing function returns the nested function itself,
    return wrapper             # not the result of calling it; the @decorator syntax relies on this


# ************************************************************************************************ #
# overwrite the multiply variable; calling multiply with arguments now gives the modified results
# we can do this because Python stores the original multiply function in the new function's closure

multiply = double_arg(multiply)  # pass multiply to double_arg and assign the result back to multiply
print('multiply(3,7):', multiply(3, 7))  # then call the decorated version


multiply.__closure__[0].cell_contents   # the original function: <function __main__.multiply(a, b)>

op = multiply(4, 6)
print(op)
#op.__closure__[0].cell_contents   # <function __main__.multiply(a, b)>

del op, multiply, double_arg

multiply(3,7): 84
96


In [38]:
def double_arg(func):
    
    # define a new function that we can modify
    def wrapper(a, b):
        a_new, b_new = a*2, b*2
        # call the original function with the doubled arguments
        return func(a_new, b_new)
        #return func(a/2, b/2)
        #a, b = a/2, b/2
    return wrapper
        
                
@double_arg
def multiply(a, b):
    return a*b

print(multiply(3, 5))

del multiply, double_arg

60


## Using decorator syntax

You have written a decorator called print_args that prints out all of the arguments and their values any time a function that it is decorating gets called.



Instructions 1/2
50 XP

1    Decorate my_function() with the print_args() decorator by redefining the my_function variable.

2    Decorate my_function() with the print_args() decorator using decorator syntax.

In [52]:
def my_function(a, b, c):
    print(a + b + c)
    
    
def print_args(func):
    def wrapper(a, b, c):      # hard-coded to a, b, c because my_function takes exactly those;
        print(a, b, c)         # *args and **kwargs would make the decorator work for any signature
        return func(a, b, c)   # call the decorated function, otherwise it never runs
    return wrapper
    
    
    
    
# Decorate my_function() with the print_args() decorator ######################################
my_function = print_args(my_function)

my_function(1, 2, 3)

1 2 3
6
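
On the question of whether print_args() can be made universal: one possible approach, sketched here and not taken from the course, accepts `*args` and `**kwargs` and uses `inspect.signature()` to recover the parameter names. The default value `c=10` is an assumption added to show how defaults are reported.

```python
import inspect
from functools import wraps

def print_args(func):
    sig = inspect.signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        # Match the passed values to the parameter names of func
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        described = ', '.join(
            f'{name}={value!r}' for name, value in bound.arguments.items()
        )
        print(f'{func.__name__} was called with {described}')
        return func(*args, **kwargs)
    return wrapper

@print_args
def my_function(a, b, c=10):
    return a + b + c

total = my_function(1, b=2)
print(total)
```

The wrapper never needs to know how many parameters the decorated function has, so the same decorator works on any function.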


## Defining a decorator

Your buddy has been working on a decorator that prints a "before" message before the decorated function is called and prints an "after" message after the decorated function is called. They are having trouble remembering how wrapping the decorated function is supposed to work. Help them out by finishing their print_before_and_after() decorator.



    Call the function being decorated and pass it the positional arguments *args.
    Return the new decorated function.


In [35]:
def print_before_and_after(func):
    def wrapper(*args):
        print('Before {}'.format(func.__name__))
        # Call the function being decorated with *args
        func(*args)
        print('After {}'.format(func.__name__))
    # Return the nested function
    return wrapper


@print_before_and_after
def multiply(a, b):
    print(a * b)

    
multiply(5, 10)

#del multiply, print_before_and_after

Before multiply
50
After multiply


## Real-world examples







**Although this is the last chapter, I should ask myself: did I really understand everything?

  **Nested function implementation, data structures, and how __closure__ relates to the local, nonlocal, and global scopes?



We'll go through some real-world decorators so that we can start to recognize common decorator patterns.

   **The timer() decorator runs the decorated function and then prints how long it took for the function to run.
   **It is a pretty easy way to figure out where your computational bottlenecks are.

In [33]:
import time

def timer(func):
    """A decorator that prints how long a function takes to run
    """
    
    def wrapper(*args, **kwargs):
        start = time.time()              # when wrapper was called, get the current time
        result = func(*args, **kwargs)   # call the decorated function and store the result
        t_spend = time.time() - start
        print(f'{func.__name__} took {t_spend} to run')
        return result
    return wrapper


@timer
def hi():
    print('hi')
    
@timer
def sleep_n_seconds(n):
    time.sleep(n)
    
sleep_n_seconds(21)
    
    
print(hi())   # prints None at the end because hi() has no return value

del hi, sleep_n_seconds, timer, time

sleep_n_seconds took 21.031243085861206 to run
hi
hi took 0.00014543533325195312 to run
None


## **Memoizing is the process of storing the results of a function so that the next time the function is called with the same arguments, you can just look up the answer. (The decorator in these notes spells its name memorize().)

  **We start by setting up a dictionary that will map arguments to results. Then, as usual, we create wrapper() to be the new decorated function that this decorator returns.
  **When the new function gets called, we check to see whether we've ever seen these arguments before. If we haven't, we send them to the decorated function and store the result in the cache dictionary. Either way, we return the cached result.
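
A compact sketch of those steps. It extends the idea to keyword arguments, which is an addition not in the course version, and the `calls` list and `slow_add()` function are hypothetical helpers used only to show that the second call never reaches the decorated function.

```python
from functools import wraps

def memoize(func):
    cache = {}   # maps arguments to results

    @wraps(func)
    def wrapper(*args, **kwargs):
        # Build a hashable key; kwargs are sorted so their order does not matter
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]
    return wrapper

calls = []

@memoize
def slow_add(a, b, c=0):
    calls.append((a, b, c))   # record each real execution
    return a + b + c

first = slow_add(12, 98, c=1000)
second = slow_add(12, 98, c=1000)   # served from the cache
print(first, second, len(calls))
```

The standard library also ships ready-made versions of this pattern, `functools.lru_cache` and `functools.cache`.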



In [31]:
def memorize(func):
    """Store the results of the decorated function for fast look up
    """
    # Store results in a dictionary that maps arguments to results
    cache = {}
    # Define wrapper function to return
    def wrapper(*args):
        if (args) not in cache:
            cache[(args)] = func(*args)
        return cache[(args)]
    return wrapper


# ********************************************************************************* #
# Dictionary keys must be hashable, so this cache only works when every argument is hashable
# (mutable containers such as lists and dicts cannot be used as keys)


import time
@memorize
def simple_add(a, b, c):
    print('Sleep for 7 seconds')
    time.sleep(7)
    return a+b+c


print(simple_add(12,98,1000))

Sleep for 7 seconds
1110


In [32]:
print(simple_add(12,98,1000))

del simple_add, memorize

1110


In [25]:
def memorize(func):
    """Store the results of the decorated function for fast look up
    """
    # Store results in a dictionary that maps arguments to results
    cache = {}
    # Define wrapper function to return
    def wrapper(*args):
        if (args) not in cache:
            cache[(args)] = func(*args)
        return cache[(args)]
    return wrapper


# ********************************************************************************* #
# Dictionary keys must be hashable, so this cache only works when every argument is hashable
# (mutable containers such as lists and dicts cannot be used as keys)


import time
@memorize           # decorators can be stacked, so @timer could be added above this line as well
def simple_add(a, b, c):
    print('Sleep for 7 seconds')
    time.sleep(7)
    return a+b+c


print(simple_add(12,98,1000))

del memorize#, simple_add

Sleep for 7 seconds
1110


In [26]:
print(simple_add(12,98,1000))

del simple_add

1110


In [27]:
# ****************************************************************************************** #
def factorial(number):
    # Validate input
    if not isinstance(number, int):
        raise TypeError("Sorry. 'number' must be an integer.")
    if number < 0:
        raise ValueError("Sorry. 'number' must be zero or positive.")
    
    # Calculate the factorial of number
    def inner_factorial(number):
        if number <= 1:
            return 1
        return number * inner_factorial(number - 1)
    return inner_factorial(number)


factorial(19)

# ****************************************************************************************** #

121645100408832000

## Print the return type ###########################################################################

You are debugging a package that you've been working on with your friends. Something weird is happening with the data being returned from one of your functions, but you're not even sure which function is causing the trouble. You know that sometimes bugs can sneak into your code when you are expecting a function to return one thing, and it returns something different. For instance, if you expect a function to return a numpy array, but it returns a list, you can get unexpected behavior. To ensure this is not what is causing the trouble, you decide to write a decorator, print_return_type(), that will print out the type of the variable that gets returned from every call of any function it is decorating.



    Create a nested function, wrapper(), that will become the new decorated function.
    Call the function being decorated.
    Return the new decorated function.


In [94]:
def print_return_type(func):
    # Define wrapper(), the decorated function
    def wrapper(*args, **kwargs):
        # Call the function being decorated
        result = func(*args, **kwargs)
        print('{}() returned type {}'.format(
            func.__name__, type(result)
        ))
        return result
    # Return the decorated function
    return wrapper


@print_return_type
def foo(value):
    return value
  
print(foo(42))
print(foo([1, 2, 3]))
print(foo({'a': 42}))

foo() returned type <class 'int'>
42
foo() returned type <class 'list'>
[1, 2, 3]
foo() returned type <class 'dict'>
{'a': 42}


## Counter ###########################################################################

You're working on a new web app, and you are curious about how many times each of the functions in it gets called. So you decide to write a decorator that adds a counter to each function that you decorate. You could use this information in the future to determine whether there are sections of code that you could remove because they are no longer being used by the app.



    Call the function being decorated and return the result.
    Return the new decorated function.
    Decorate foo() with the counter() decorator.


In [100]:
def counter(func):
    def wrapper(*args, **kwargs):
        wrapper.count += 1
        # Call the function being decorated and return the result
        return func(*args, **kwargs)
    wrapper.count = 0
    # Return the new decorated function
    return wrapper

    
# Decorate foo() with the counter() decorator      # *******************************************
@counter                                           # This is very useful and very good idea
def foo(a, b):
    print(a+b)
    print('calling foo()')

foo(1,2)
foo(3,4)

print('foo() was called {} times.'.format(foo.count))

3
calling foo()
7
calling foo()
foo() was called 2 times.


## Decorators and metadata




One of the problems with decorators is that they obscure the decorated function's metadata.  
     I'll try inspecting the __closure__ attribute.




In [20]:
import time
from functools import cache, wraps

def timer(func):
    """A decorator that prints how long a function takes to run
    """
    
    def wrapper(*args, **kwargs):
        start = time.time()*1000
        results = func(*args, **kwargs)
        t_spend = time.time()*1000 - start
        print(f"func: {func.__name__}, spend {t_spend} ms time to run")
        return results
    return wrapper                     #(*args, **kwargs)

@cache
@timer
def factorial(n):
    if not isinstance(n, int):
        raise ValueError('input n should be integer')
    if n<=0:
        raise ValueError('input n should be positive')
    
    def inner_factorial(n):
        if n <=1:
            return 1
        return n*inner_factorial(n-1)
    return inner_factorial(n)
    
    
factorial(199)

#factorial.__name__      # would be 'wrapper': factorial() was decorated, so the name now refers to wrapper()
del timer, wraps, cache, factorial

func: factorial, spend 0.086669921875 ms time to run


## Remember that when we write decorators, we almost always define a nested function to return. 
## Because the decorator overwrites the decorated function, when we ask for its name or docstring, 
## we are actually referencing the nested wrapper function that was returned by the decorator.
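
A minimal sketch of the problem just described, using made-up names (timer_without_wraps() and square() are not from the course): the decorator does nothing except wrap the function, yet the name and docstring we get back belong to wrapper().

```python
def timer_without_wraps(func):
    # No @wraps here, so wrapper() keeps its own metadata
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@timer_without_wraps
def square(n):
    """Return n squared."""
    return n * n

lost_name = square.__name__   # the name of the nested function, not 'square'
lost_doc = square.__doc__     # wrapper() has no docstring

print(lost_name)
print(lost_doc)
```

The function still computes the right result; only the metadata is hidden.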





**To preserve the metadata of the function we are decorating, apply functools.wraps to the wrapper.**

In [15]:
import time
from functools import wraps, cache

def timer(func):
    """A decorator that prints how long a function takes to run
    """
    
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()*1000
        results = func(*args, **kwargs)
        t_spend = time.time()*1000 - start
        print(f"func: {func.__name__}, spend {t_spend} ms time to run")
        return results
    return wrapper                     #(*args, **kwargs)

@timer
def factorial(n):
    """Calculate factorial of an int
    """
    if not isinstance(n, int):
        raise ValueError('input n should be int')
    if n<=0:
        raise ValueError('input n should be positive')
        
    def inner_factorial(n):
        if n <=1:
            return 1
        return n*inner_factorial(n-1)
    return inner_factorial(n)
    
    
print(factorial(19))

print(factorial.__name__)    # 'factorial': @wraps copied the original name onto wrapper()
print(factorial.__doc__)
print(factorial.__wrapped__)
del timer, cache, wraps, factorial

func: factorial, spend 0.01025390625 ms time to run
121645100408832000
factorial
Calculate factorial of an int
    
<function factorial at 0x7f786c190670>


## Preserving docstrings when decorating functions

Your friend has come to you with a problem. They've written some nifty decorators and added them to the functions in the open-source library they've been working on. However, they were running some tests and discovered that all of the docstrings have mysteriously disappeared from their decorated functions. Show your friend how to preserve docstrings and other metadata when writing decorators.


Instructions 1/4
25 XP


    Decorate print_sum() with the add_hello() decorator to replicate the issue that your friend saw - that the docstring disappears.


In [135]:
from functools import wraps


def add_hello(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print('Hello')
        return func(*args, **kwargs)
    return wrapper

# Decorate print_sum() with the add_hello() decorator
@add_hello
def print_sum(a, b):
    """Adds two numbers and prints the sum"""
    print(a + b)

print_sum(10, 20)
print_sum_docstring = print_sum.__doc__
print(print_sum_docstring)

Hello
30
Adds two numbers and prints the sum


## Measuring decorator overhead

Your boss wrote a decorator called check_everything() that they think is amazing, and they are insisting you use it on your function. However, you've noticed that when you use it to decorate your functions, it makes them run much slower. You need to convince your boss that the decorator is adding too much processing time to your function. To do this, you are going to measure how long the decorated function takes to run and compare it to how long the undecorated function would have taken to run. This is the decorator in question:

def check_everything(func):
  @wraps(func)
  def wrapper(*args, **kwargs):
    check_inputs(*args, **kwargs)
    result = func(*args, **kwargs)
    check_outputs(result)
    return result
  return wrapper

Instructions
100 XP

    Call the original function instead of the decorated version by using an attribute of the function that the wraps() statement in your boss's decorator added to the decorated function.


In [21]:
from functools import wraps
import time

def check_inputs(my_list):                # stand-in for the boss's input check
    if not isinstance(my_list, list):     # reject anything that is not a list
        raise ValueError('input my_list should be a list')
    
def check_outputs(result):                # stand-in for the boss's output check
    if not isinstance(result, list):      # reject anything that is not a list
        raise ValueError('output result should be a list')

def check_everything(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        check_inputs(*args, **kwargs)     # check_inputs function
        result = func(*args, **kwargs)
        check_outputs(result)             # check_outputs function
        return result
    return wrapper


@check_everything
def duplicate(my_list):
    """Return a new list that repeats the input twice"""
    return my_list + my_list


t_start = time.time()
duplicate(list(range(5000)))
t_end = time.time()
decorated_time = t_end - t_start

t_start = time.time()
# Call the original function instead of the decorated one
duplicate.__wrapped__(list(range(5000)))
t_end = time.time()
undecorated_time = t_end - t_start

print('Decorated time: {:.5f}s'.format(decorated_time))
print('Undecorated time: {:.5f}s'.format(undecorated_time))


del check_inputs, check_outputs, check_everything, duplicate, time

Decorated time: 0.00022s
Undecorated time: 0.00023s
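
A single call this fast is dominated by measurement noise, which is why the two timings above are almost identical. One possible way to get a steadier comparison is to repeat each call many times with the standard timeit module. The sketch below assumes a simplified check_everything() that only does type checks; it is a stand-in, not the boss's real decorator.

```python
import timeit
from functools import wraps

def check_everything(func):
    # Simplified stand-in: type-check every positional input and the output
    @wraps(func)
    def wrapper(*args, **kwargs):
        for arg in args:
            if not isinstance(arg, list):
                raise ValueError('input should be a list')
        result = func(*args, **kwargs)
        if not isinstance(result, list):
            raise ValueError('output should be a list')
        return result
    return wrapper

@check_everything
def duplicate(my_list):
    """Return a new list that repeats the input twice"""
    return my_list + my_list

data = list(range(5000))

# Run each version 1000 times; one call is too fast to time reliably
decorated_time = timeit.timeit(lambda: duplicate(data), number=1000)
undecorated_time = timeit.timeit(lambda: duplicate.__wrapped__(data), number=1000)

print('Decorated time: {:.5f}s for 1000 calls'.format(decorated_time))
print('Undecorated time: {:.5f}s for 1000 calls'.format(undecorated_time))
```

The exact numbers depend on the machine, so no output is shown here.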


In [14]:
from functools import wraps, cache
import time
import random


def check_inputs(my_list):                # stand-in for the boss's input check
    if not isinstance(my_list, list):     # reject anything that is not a list
        raise ValueError('input my_list should be a list')
    else:
        print('check_inputs passed')
    
def check_outputs(result):                # stand-in for the boss's output check
    if not isinstance(result, list):      # reject anything that is not a list
        raise ValueError('output result should be a list')
    else:
        print('check_outputs passed')

def check_everything(func):
    @wraps(func)
    def wrapper(*args):
        check_inputs(*args)     # check_inputs function
        result = func(*args)
        check_outputs(result)             # check_outputs function
        return result
    return wrapper

def timer(func):
    @wraps(func)
    def wrapper(*args):
        start = time.time()*1000
        result = func(*args)
        t_spend = time.time()*1000 - start
        print(f"func: {func.__name__}, spend {t_spend} ms time to run")
        return result
    return wrapper


@timer
@check_everything
def duplicate(my_list):
    """Return a new list that repeats the input twice"""
    return my_list + my_list

a = [random.randrange(1, 2000) for i in range(4000)]
#print(a)

#duplicate(a)

alpha = duplicate(a)
print(alpha[:10])


del check_inputs, check_outputs, check_everything, timer, wraps, cache, duplicate, a

check_inputs passed
check_outputs passed
func: duplicate, spend 0.132568359375 ms time to run
[1511, 935, 588, 1452, 955, 99, 208, 337, 522, 1664]


## How to write more efficient functions that take less time to run




In [12]:
import time
from functools import wraps, cache


def timer(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        t_spend = time.time()-start
        print('func: {}, spend {:.7f}s to run'.format(func.__name__, t_spend))
        #print(f"func: {func.__name__}, spend {t_spend} s to run")
        return result
    return wrapper


#def time_stop_5():
    
@timer
@cache
def simple_add(a, b, c):
    print('Sleep for 7 seconds')
    time.sleep(7)
    return a+b+c

simple_add(32, 17, 98)


del timer, cache, simple_add, wraps

Sleep for 7 seconds
func: simple_add, spend 7.0007727s to run
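
The cell above calls simple_add() only once, so the benefit of cache never shows. A minimal sketch of the same idea with a second call follows; the sleep is shortened to 0.2 seconds, and the last_elapsed attribute is a made-up addition used only to inspect the timings.

```python
import time
from functools import cache, wraps

def timer(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        # Hypothetical attribute: remember the last duration for inspection
        wrapper.last_elapsed = time.perf_counter() - start
        print('func: {}, spend {:.7f}s to run'.format(func.__name__, wrapper.last_elapsed))
        return result
    return wrapper

@timer
@cache
def simple_add(a, b, c):
    time.sleep(0.2)          # stands in for slow work
    return a + b + c

first = simple_add(32, 17, 98)      # computed, so it pays for the sleep
first_elapsed = simple_add.last_elapsed

second = simple_add(32, 17, 98)     # same arguments, answered from the cache
second_elapsed = simple_add.last_elapsed
```

Because timer sits outside cache, it measures both calls, and the second one returns without running the function body again.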


## Decorators that take arguments





**We want to pass n as an argument instead of hard-coding it in the decorator. If we had some way to pass n into the decorator, we could decorate any function we wanted with @run_n_times(n). But a decorator is only supposed to take one argument: the function it is decorating.**


## To make run_n_times() work, we have to turn it into a function that returns a decorator, rather than a function that is a decorator.  

**This works because, in decorator syntax, the expression after the @ symbol must evaluate to a decorator function, and calling run_n_times(n) produces one.**

In [23]:
def run_three_times(func):
    def wrapper(*args, **kwargs):
        for i in range(3):
            func(*args, **kwargs)
    return wrapper

@run_three_times
def print_sum(a, b):
    print(a+b)
    
print_sum(9, 7)

del run_three_times, print_sum

16
16
16




**Add another level of nesting:**

In [24]:
def run_n_times(n):
    def decorator(func):
        def wrapper(*args, **kwargs):
            for i in range(n):
                func(*args, **kwargs)
        return wrapper
    return decorator

@run_n_times(4)
def print_sum(a, b):
    print(a+b)
    
    
print_sum(9,7)

del print_sum, run_n_times

16
16
16
16
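
Note that wrapper() above throws away whatever func() returns. That is fine for print_sum(), which only prints, but not for functions that return a value. A possible variant, not taken from the course, collects the return values in a list and uses wraps() to keep the metadata; add() is a made-up example function.

```python
from functools import wraps

def run_n_times(n):
    """Define and return a decorator"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Call func n times and keep every return value
            return [func(*args, **kwargs) for _ in range(n)]
        return wrapper
    return decorator

@run_n_times(3)
def add(a, b):
    """Return a + b"""
    return a + b

print(add(9, 7))
print(add.__name__)
```

Returning a list is only one design choice; returning just the last result would work equally well, depending on what the caller needs.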


In [41]:
def run_n_times(n):
    def decorator(func):
        def wrapper(*args, **kwargs):
            for i in range(n):
                func(*args, **kwargs)
        return wrapper
    return decorator

run_3_times = run_n_times(3)   # run_n_times(3) returns a decorator; run_3_times is that decorator


@run_3_times
def print_sum(a, b):
    print(a+b)
    
def print_hello():
    print('HelloWorld')
    
print_sum(9,7)


print(run_3_times(print_hello))
ff = run_3_times(print_hello)
ff()

print = run_3_times(print)
print('pppp')

del print_sum, run_n_times, run_3_times, print_hello, ff

16
16
16
<function run_n_times.<locals>.decorator.<locals>.wrapper at 0x7f94a4464d30>
HelloWorld
HelloWorld
HelloWorld
pppp
pppp
pppp


In [43]:
print('12')

del print

12
12
12


In [44]:
print("hello")

hello


## Run_n_times()

In the video exercise, I showed you an example of a decorator that takes an argument: run_n_times(). The code for that decorator is repeated below to remind you how it works. Practice different ways of applying the decorator to the function print_sum(). Then I'll show you a funny prank you can play on your co-workers.

def run_n_times(n):
  """Define and return a decorator"""
  def decorator(func):
    def wrapper(*args, **kwargs):
      for i in range(n):
        func(*args, **kwargs)
    return wrapper
  return decorator



Instructions 1/3
35 XP

    1   Add the run_n_times() decorator to print_sum() using decorator syntax so that print_sum() runs 10 times.

    2   Use run_n_times() to create a decorator run_five_times() that will run any function five times.

    3   Here's the prank: use run_n_times() to modify the built-in print() function so that it always prints 20 times!
    
    
    
## too easy

## HTML Generator

You are writing a script that generates HTML for a webpage on the fly. So far, you have written two decorators that will add bold or italics tags to any function that returns a string. You notice, however, that these two decorators look very similar. Instead of writing a bunch of other similar looking decorators, you want to create one decorator, html(), that can take any pair of opening and closing tags.

def bold(func):
  @wraps(func)
  def wrapper(*args, **kwargs):
    msg = func(*args, **kwargs)
    return '<b>{}</b>'.format(msg)
  return wrapper

def italics(func):
  @wraps(func)
  def wrapper(*args, **kwargs):
    msg = func(*args, **kwargs)
    return '<i>{}</i>'.format(msg)
  return wrapper



Instructions 1/4
25 XP

    1   Return the decorator and the decorated function from the correct places in the new html() decorator.

    2   Use the html() decorator to wrap the return value of hello() in the strings <b> and </b> (the HTML tags that mean "bold").

    3   Use html() to wrap the return value of goodbye() in the strings <i> and </i> (the HTML tags that mean "italics").

    4   Use html() to wrap hello_goodbye() in a DIV, which is done by adding the strings <div> and </div> tags around a string.
    
    
    
Hint

    html() is like a decorator factory. It creates a new decorator and returns it.
    wrapper() will eventually overwrite whatever function is being decorated.


In [54]:
from functools import cache, wraps

def bold(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        msg = func(*args, **kwargs)
        return '<b>{}</b>'.format(msg)
    return wrapper

def italics(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        msg = func(*args, **kwargs)
        return '<i>{}</i>'.format(msg)
    return wrapper


def html(open_tag, close_tag):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            msg = func(*args, **kwargs)
            return '{}{}{}'.format(open_tag, msg, close_tag)
        # Return the decorated function
        return wrapper
    # Return the decorator
    return decorator


@html("<i>", "</i>")
def print_hello():
    return "Hello World"
    
@html("<div>", "</div>")
def hello_goodbye():
    return "hello and goodbye"
    
print(print_hello())
print(hello_goodbye())
    
del print_hello, hello_goodbye, html, italics, bold, cache, wraps

<i>Hello World</i>
<div>hello and goodbye</div>
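
Since html() returns an ordinary decorator, several of them can be stacked on one function. The sketch below is an illustration with a made-up function hello(); the decorator closest to the function is applied first, so its tags end up innermost.

```python
from functools import wraps

def html(open_tag, close_tag):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            msg = func(*args, **kwargs)
            return '{}{}{}'.format(open_tag, msg, close_tag)
        return wrapper
    return decorator

@html('<div>', '</div>')     # applied last, so it is the outermost tag
@html('<b>', '</b>')         # applied first, closest to the function
def hello(name):
    return 'Hello {}!'.format(name)

print(hello('Alice'))   # <div><b>Hello Alice!</b></div>
```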


## Timeout(): a real world example
50 XP





**Say we have some functions that occasionally either run for longer than we want them to or just hang and never return.  It would be nice if we could add some kind of timeout() decorator to those functions that will raise an error if the function runs for longer than expected.**


    To create the timeout() decorator, we are going to use some functions from Python's signal module. These functions have nothing to do with decorators, but understanding them will help you understand the timeout() decorator.  
    
    The raise_timeout() function simply raises a TimeoutError when called.  
    
    The signal() function tells Python: when you see the signal whose number is signalnum, call the handler function.  
    
    The alarm() function lets us set an alarm for some number of seconds in the future. Passing 0 cancels any pending alarm.  

In [1]:
import signal

def raise_timeout(*args, **kwargs):
    raise TimeoutError()
    
# When an alarm signal goes off, call raise_timeout function
signal.signal(signalnum=signal.SIGALRM, handler=raise_timeout)

# Set an alarm for 5 seconds
signal.alarm(5)

# Cancel the alarm
signal.alarm(0)

5

In [2]:
import signal, time

def raise_timeout(*args, **kwargs):
    raise TimeoutError()
    
# When an alarm signal goes off, call raise_timeout function
signal.signal(signalnum=signal.SIGALRM, handler=raise_timeout)

def timeout_in_5s(func):
    def wrapper(*args, **kwargs):
        signal.alarm(5)
        try:
            return func(*args, **kwargs)
        finally:
            signal.alarm(0)
    return wrapper

@timeout_in_5s
def sleep_6s():
    time.sleep(6)
    print("sleep 6 seconds")
    
sleep_6s()

del sleep_6s, timeout_in_5s, signal, time

TimeoutError: 

## Signal Enum Members

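Signals are represented as integers. The signal module exposes them as enum members that we can use instead of raw numbers.

The must-know members are:

    SIGBREAK (Windows only): Interrupt from the keyboard, sent when CTRL + BREAK is pressed
    CTRL_BREAK_EVENT (Windows only): The event for the CTRL + BREAK keystroke; it can only be used with os.kill()
    SIG_DFL: Not a signal but a handler option; it tells Python to perform the default action for the signal
    ......
    SIGABRT: Abort signal
    SIGALRM (Unix only): Timer signal raised by alarm(); this is the one the timeout() decorator relies on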

In [9]:
import signal, time

def raise_timeout(*args, **kwargs):
    raise TimeoutError("Timeout Occurred, out of second range")
    
# When an alarm signal goes off, call raise_timeout function
signal.signal(signalnum=signal.SIGALRM, handler=raise_timeout)

def timeout(n):
    def decorator(func):
        def wrapper(*args, **kwargs):
            signal.alarm(n)
            try:
                return func(*args, **kwargs)
            finally:
                signal.alarm(0)
        return wrapper
    return decorator

@timeout(7)
def sleep_n_second(n):
    time.sleep(n)
    print(f"sleep for {n} seconds")
            
sleep_n_second(8)

del sleep_n_second, timeout, signal, time, raise_timeout

TimeoutError: Timeout Occurred, out of second range
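
The decorator above relies on a handler that was registered globally before the function runs. A possible variant, sketched here as an assumption rather than the course's solution, installs the handler inside wrapper() and restores the previous one afterwards. It also uses signal.setitimer(), which accepts fractional seconds. Like the original, it only works on Unix and only in the main thread; slow() and fast() are made-up test functions.

```python
import signal
import time
from functools import wraps

def timeout(seconds):
    """Raise TimeoutError if the decorated function runs longer than `seconds`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            def raise_timeout(signum, frame):
                raise TimeoutError(
                    '{}() took longer than {}s'.format(func.__name__, seconds))
            # Install our handler and remember the previous one
            old_handler = signal.signal(signal.SIGALRM, raise_timeout)
            # setitimer() accepts fractional seconds, unlike alarm()
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                return func(*args, **kwargs)
            finally:
                signal.setitimer(signal.ITIMER_REAL, 0)      # cancel the timer
                signal.signal(signal.SIGALRM, old_handler)   # restore the old handler
        return wrapper
    return decorator

@timeout(0.2)
def slow():
    time.sleep(1)
    return 'done'

@timeout(1)
def fast():
    return 'done'

try:
    slow()
    timed_out = False
except TimeoutError as err:
    timed_out = True
    print(err)

print(fast())
```

Restoring the old handler keeps the decorator from interfering with other code that also uses SIGALRM.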

## Tag your functions



Tagging something means that you have given that thing one or more strings that act as labels. For instance, we often tag emails or photos so that we can search for them later. You've decided to write a decorator that will let you tag your functions with an arbitrary list of tags. You could use these tags for many things:

    Adding information about who has worked on the function, so a user can look up who to ask if they run into trouble using it.
    Labeling functions as "experimental" so that users know that the inputs and outputs might change in the future.
    Marking any functions that you plan to remove in a future version of the code.
    Etc.

Instructions
100 XP

    Define a new decorator, named decorator(), to return.
    Ensure the decorated function keeps its metadata.
    Call the function being decorated and return the result.
    Return the new decorator.



Hint

    Remember: decorators like tag() that take arguments are actually decorator factories. tag() should create a new decorator and return it.
    Use a decorator from the functools module to attach the metadata from func() to the wrapper() function.
    wrapper() is the decorated function that decorator() returns. Its only job is to call the function being decorated.


In [6]:
from functools import wraps

def tag(*tags):
    # Define a new decorator, named "decorator", to return
    def decorator(func):
        # Ensure the decorated function keeps its metadata
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Call the function being decorated and return the result
            return func(*args, **kwargs)
        wrapper.tags = tags
        return wrapper
    # Return the new decorator
    return decorator

@tag('test', 'this is a tag')
def foo():
    pass

print(foo.tags)

('test', 'this is a tag')
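
To illustrate one of the uses listed above, here is a minimal sketch that searches a list of functions by tag. The functions new_parser() and old_parser() and the helper with_tag() are hypothetical names introduced for this example.

```python
from functools import wraps

def tag(*tags):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        wrapper.tags = tags
        return wrapper
    return decorator

@tag('experimental', 'alice')
def new_parser():          # hypothetical function
    pass

@tag('deprecated')
def old_parser():          # hypothetical function
    pass

def with_tag(name, funcs):
    """Return the names of the functions in funcs that carry the tag `name`."""
    # getattr() with a default lets untagged functions pass through safely
    return [f.__name__ for f in funcs if name in getattr(f, 'tags', ())]

experimental = with_tag('experimental', [new_parser, old_parser])
deprecated = with_tag('deprecated', [new_parser, old_parser])
print(experimental)
print(deprecated)
```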


## Check the return type

Python's flexibility around data types is usually cited as one of the benefits of the language. It can sometimes cause problems though if incorrect data types go unnoticed. You've decided that in order to ensure your code is doing exactly what you want it to do, you will explicitly check the return types in all of your functions and make sure they're returning what you expect. To do that, you are going to create a decorator that checks if the return type of the decorated function is correct.

Note: assert is a keyword that you can use to test whether something is true. If you type assert condition and condition is True, the statement doesn't do anything. If condition is False, the statement raises an error. The type of error that it raises is called an AssertionError.



    Start by completing the returns_dict() decorator so that it raises an AssertionError if the return type of the decorated function is not a dictionary.



Hint

    Make sure wrapper() is flexible enough to take any arguments that the function it is decorating might take.
    wrapper() must call the function being decorated.
    Don't forget to return the newly decorated function.


In [7]:
def returns_dict(func):
    # Complete the returns_dict() decorator
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        assert type(result) == dict
        return result
    return wrapper

@returns_dict
def foo(value):
    return value

try:
    print(foo([1,2,3]))
except AssertionError:
    print('foo() did not return a dict!')

foo() did not return a dict!
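
returns_dict() hard-codes the expected type. Following the decorator-factory pattern from earlier in this chapter, one possible generalization is a returns() decorator that takes the expected type as an argument. This is a sketch under that assumption, with wraps() added to keep the metadata.

```python
from functools import wraps

def returns(return_type):
    # Decorator factory: build a decorator that checks for return_type
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            assert type(result) == return_type
            return result
        return wrapper
    return decorator

@returns(dict)
def foo(value):
    return value

try:
    print(foo([1, 2, 3]))
except AssertionError:
    print('foo() did not return a dict!')

print(foo({'a': 1}))
```

Keep in mind that assert statements are skipped when Python runs with the -O flag, so this kind of check suits development and testing rather than production validation.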


## YouTube Python Tutorials - Closure | Nested Functions






## Understanding the relationships between nested functions





In [6]:
def outer():
    x = 'hi'
    def inner():
        print(x)
    inner()
        
        
o = outer
o()

p = outer()
p

hi
hi


In [11]:
def outer():
    x = 'hi'
    def inner():
        print(x)
    return inner()
        
        
o = outer()             # calling outer() runs inner(), which prints 'hi'
o                       # o is None, because inner() returns nothing


hi


In [16]:
def outer():
    x = 'hi'
    def inner():
        print(x)
    return inner        # the enclosing function returns the nested function without calling it
        
        
o = outer()             # outer() returns the inner function object
o                       # <function __main__.outer.<locals>.inner()>
o()                     # calling o() is the same as calling inner(), so it prints 'hi'

print(o.__name__)       # 'inner', because outer() returned the inner function

hi
inner


In [9]:
def outer():
    x = 'hi'
    def inner():
        y = 'jho'
        string = x+y
        return string
    return inner()
        
        
o = outer()             # outer() returns the result of calling inner(): the string 'hijho'
print(o)                # inner() returns a value instead of printing it, so we print() it here

#print(o.__name__)

hijho


In [10]:
def outer():
    x = 'hi'
    def inner():
        y = 'jho'
        string = x+y
        return string
    return inner()
        
        
o = outer              # o is another name for outer; nothing is called yet
print(o)
o()                    # returns 'hijho', but the value is discarded
print(o.__name__)      # 'outer', because o refers to outer itself


<function outer at 0x7f87a0153e50>
outer


In [65]:
def outer():
    x = 'hi'
    def inner():
        y = 'jho'
        string = x+y
        return string
    return inner
        
        
o = outer              # o is another name for outer
print(o)               # the outer function object and its memory address
print(o())             # calling outer() returns the inner function object; inner() is not executed
print(o().__name__)    # 'inner', because outer() returns the inner function
o()


<function outer at 0x7f9a687af4c0>
<function outer.<locals>.inner at 0x7f9a687afca0>
inner


<function __main__.outer.<locals>.inner()>
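
The cells above show that outer() can hand back inner without calling it. What makes this a closure is that inner still remembers x after outer() has finished. A minimal sketch of how to see that through the __closure__ attribute, with inner() changed to return x instead of printing it:

```python
def outer():
    x = 'hi'
    def inner():
        return x
    return inner

o = outer()                  # outer() has returned, but x is still reachable

free_vars = o.__code__.co_freevars              # names inner() takes from outer()
captured = o.__closure__[0].cell_contents       # the value stored for x

print(free_vars)    # ('x',)
print(captured)     # hi
print(o())          # hi
```

This is the same mechanism that lets wrapper() in a decorator keep using func after the decorator itself has returned.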

In [177]:
list(range(10, 1, -1))

[10, 9, 8, 7, 6, 5, 4, 3, 2]

In [181]:
[x*2 for x in range(2,15)]

[4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28]

In [184]:
import numpy as np

a = np.array([1,2,3,4])
print(a[[False, True, False, True]])

[2 4]


In [185]:
[1,2,3]*3

[1, 2, 3, 1, 2, 3, 1, 2, 3]

In [190]:
import numpy as np

table = np.array([[1,8],
                  [6,4]])
print(table.max(axis = 0))

[6 8]


In [191]:
import numpy as np

table = np.array([[1,8],
                  [6,4]])
print(table.max(axis = 1))

[8 6]


In [194]:
y = "a:b:apple"

c = y.split(':')
print(c)
len(c)

['a', 'b', 'apple']


3