# Fundamentals of Information Systems

## Python Programming (for Data Science)

### Master's Degree in Data Science

#### Giorgio Maria Di Nunzio
#### (Courtesy of Gabriele Tolomei FIS 2018-2019)
<a href="mailto:giorgiomaria.dinunzio@unipd.it">giorgiomaria.dinunzio@unipd.it</a><br/>
University of Padua, Italy<br/>
2019/2020<br/>

# Lecture 5: Functions & I/O

# Functions

## Motivations and Basic Syntax

-  Functions are the primary and most important method of **code organization** and **reuse** in Python (and in any programming language, for that matter!).

-  In Python, functions are declared using the <code>**def**</code> keyword, and values are returned using the <code>**return**</code> statement.

-  A function may contain multiple <code>**return**</code> statements. 

-  If the end of a function's body is reached without encountering a <code>**return**</code> statement, <code>**None**</code> is returned automatically.
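The points above can be illustrated with a minimal sketch. The function names `sign` and `log_message` are hypothetical and chosen only for this example.

```python
# Hypothetical function with several return statements:
# execution stops at the first one that is reached.
def sign(x):
    if x > 0:
        return 1
    if x < 0:
        return -1
    return 0

# Hypothetical function whose body ends without a return statement.
def log_message(message):
    print("LOG: {}".format(message))

print(sign(-7))

# The call below prints the message; the value it returns is None.
result = log_message("hello")
print(result is None)
```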

## Input Arguments: _positional_ vs. _keyword_

-  Each function can have some number of **_positional_** arguments and some number of **_keyword_** arguments.

-  Keyword arguments are typically used to specify _default values_ or optional arguments.

-  The main restriction is that the keyword arguments **must** follow positional arguments (if any).

In [1]:
'''
Example of function definition.
my_function is the name of the function
This takes 4 input arguments:
- "a" and "b" are positional arguments
- "c" and "d" are keyword arguments
'''
def my_function(a, b, c=1.5, d=3):
    if c > 1:
        return c * (a + b) ** d
    else:
        return c / (a + b) ** d
    
# The function above can be called in either one of the following ways:
# 1. Without keyword arguments (i.e., default keyword argument values are used)
print(my_function(2, 3))

# 2.1. Passing c and d by position (without specifying the keywords)
print(my_function(2, 3, .8, 2))

# 2.2. Passing c and d by keyword (in any order, regardless of the signature)
print(my_function(2, 3, d=.5, c=2))

187.5
0.032
4.47213595499958


## Namespaces, Scope, and Local Functions

-  Functions can access variables in 2 different scopes: **global** and **local**. 

-  An alternate and more descriptive name describing a variable scope in Python is a **namespace**. 

-  Any variable that is assigned within a function by default is assigned to the **local namespace**. 

    -  The local namespace is created when the function is called and immediately populated by the function's arguments. After the function returns, the local namespace is destroyed (with some exceptions).
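As a rough sketch of how the local namespace is populated, the built-in `locals()` can be used to take a snapshot of it at different points of a call. The function `area` and its parameters are hypothetical names used only for illustration.

```python
def area(width, height):
    # Right after the call starts, the local namespace holds only the arguments.
    at_entry = dict(locals())

    # Assignments in the body add new names to the local namespace.
    result = width * height
    at_exit = dict(locals())

    return at_entry, at_exit

at_entry, at_exit = area(3, 4)
print(at_entry)
print(sorted(at_exit))
```

The first snapshot contains only `width` and `height`; the second one also contains the names assigned inside the body before the snapshot was taken.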

## _Global_ and _Local_ Variable Scope

-  Python variables are **local** if not otherwise declared. The reason is that **global** variables are generally considered bad practice and should be avoided.

-  When you define variables inside a function definition, they are local to **this** function by default. As such:
    -  Anything you do to those variables in the body of the function will have no effect on other variables outside of the function, **even if they have the same name**. 
    -  In other words, the function body is the **scope** of those variables, i.e., the context where names with their values are associated. 

## _Global_ and _Local_ Variable Scope

-  All variables have the scope of the block where they are **_declared_** and **_defined_**. (_They can only be used after the point of their declaration._)

-  Just to make things clear: variables do not have to be, and in fact cannot be, declared the way they are in programming languages like Java or C. 

-  Variables in Python are implicitly declared by defining them: the first time you assign a value to a name, the variable comes into existence, and its type is simply the type of the object currently assigned to it.
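A small sketch of implicit declaration follows. The names `value` and `undefined_name` are hypothetical; the latter is deliberately never assigned.

```python
# The first assignment creates the variable; no separate declaration is needed.
value = 42
first_type = type(value).__name__
print(first_type)

# Re-binding the same name to an object of another type is allowed:
# the type belongs to the object, not to the name.
value = "forty-two"
second_type = type(value).__name__
print(second_type)

# Using a name before any assignment raises a NameError.
try:
    print(undefined_name)
    lookup_failed = False
except NameError as error:
    lookup_failed = True
    print("NameError: {}".format(error))
```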

In [2]:
# Example 1: Trying to access a variable outside the local scope
# of the function where it has been defined (and declared)
def foo():
    # We define my_list within the function's body
    # This variable is assigned to the local namespace of this function
    my_list = [36, 49, 64]
    
    # Print the values contained in my_list
    print("Inside foo(), my_list is {}".format(my_list))
    
# Print the values contained in my_list before calling foo()
# NOTE: my_list has been only defined (and declared) within foo()'s scope!
print("Before calling foo(), my_list is {}".format(my_list))

# Calling foo()
foo()

NameError: name 'my_list' is not defined

In [3]:
# Example 2: Access a variable inside the local scope of a function yet 
# defined (and declared) outside of it
# We define a function foo() which in its body uses the variable my_list.
# Such a variable is defined (and therefore declared) before calling foo()
# Define foo()
def foo():
    # Print the values contained in my_list
    print("Inside foo(), my_list is {}".format(my_list))

# Define my_list outside the scope of foo() and before calling foo()
my_list = [42, 73, 96]

# Print the values contained in my_list before calling foo()
print("Before calling foo(), my_list is {}".format(my_list))

# Calling foo()
foo()

# As there is no local variable my_list defined in foo()'s body, 
# i.e., there is no assignment to my_list,
# the value from the outside global variable my_list will be used, 
# as this is the only existing binding of the name 'my_list' to a proper value.
# So, we expect the output to be the list [42, 73, 96].

Before calling foo(), my_list is [42, 73, 96]
Inside foo(), my_list is [42, 73, 96]


## Checkpoint Quiz

What will happen if we change the value of *<code>**my_list**</code>* inside the function *<code>**foo()**</code>*? Will this affect the outside global variable as well?

In [4]:
# Example 3: Modifying in place (mutating) an object that is bound to a name defined
# (and declared) outside of the function is reflected in the global variable
def foo():
    # We modify the variable my_list within the function's body
    my_list.extend([36, 49, 64])
    
    # Print the values contained in my_list
    print("Inside foo(), my_list is {}".format(my_list))

# Define my_list outside the scope of foo() and before calling foo()
my_list = [42, 73, 96]

# Print the values contained in my_list before calling foo()
print("Before calling foo(), my_list is {}".format(my_list))

# Calling foo()
foo()

# Print the values contained in my_list when foo() returns
print("After calling foo(), my_list is {}".format(my_list))

Before calling foo(), my_list is [42, 73, 96]
Inside foo(), my_list is [42, 73, 96, 36, 49, 64]
After calling foo(), my_list is [42, 73, 96, 36, 49, 64]


In [5]:
# Example 4: Re-binding a variable inside the local scope of a function yet 
# defined (and declared) outside of it does not affect global variable
def foo():
    # We re-define the variable my_list within the function's body
    # Actually, we are re-binding the name my_list (already used) to a different object
    my_list = [36, 49, 64]
    
    # Print the values contained in my_list
    print("Inside foo(), my_list is {}".format(my_list))

# Define my_list outside the scope of foo() and before calling foo()
my_list = [42, 73, 96]

# Print the values contained in my_list before calling foo()
print("Before calling foo(), my_list is {}".format(my_list))

# Calling foo()
foo()

# Print the values contained in my_list when foo() returns
print("After calling foo(), my_list is {}".format(my_list))

Before calling foo(), my_list is [42, 73, 96]
Inside foo(), my_list is [36, 49, 64]
After calling foo(), my_list is [42, 73, 96]


## Checkpoint Quiz

What if we combine the last two examples? That is, within the *<code>**foo()**</code>* function, we first access *<code>**my_list**</code>* with a *<code>**print()**</code>*, hoping to get the value bound in the __global namespace__ (outside *<code>**foo()**</code>*'s local scope), and then assign a new value to it. 
Assigning a value to it means creating a __local__ variable *<code>**my_list**</code>*. So, we would have the same name *<code>**my_list**</code>* bound to both a __global__ and a __local__ variable in the same scope, i.e., the body of the function.

In [6]:
# Example 5: Access AND try to re-bind a variable inside the local scope of a function 
# yet defined (and declared) outside of it
def foo():
    # Print the values contained in my_list
    print("Inside foo(), my_list is {}".format(my_list))
    
    # We re-define the variable my_list within the function's body
    # Actually, we are re-binding the name my_list (already used) to a different object
    my_list = [36, 49, 64]
    
    # Print the values contained in my_list
    print("Inside foo(), my_list is {}".format(my_list))

# Define my_list outside the scope of foo() and before calling foo()
my_list = [42, 73, 96]

# Print the values contained in my_list before calling foo()
print("Before calling foo(), my_list is {}".format(my_list))

# Calling foo()
foo()

# Print the values contained in my_list when foo() returns
print("After calling foo(), my_list is {}".format(my_list))

Before calling foo(), my_list is [42, 73, 96]


UnboundLocalError: local variable 'my_list' referenced before assignment

## Observations

-  A variable can't be both **_local_** and **_global_** inside of a function. 

-  Because <code>**foo()**</code> contains an assignment to <code>**my_list**</code>, Python treats <code>**my_list**</code> as a **_local_** variable throughout the whole function body. The first <code>**print()**</code> statement therefore refers to the **_local_** variable that is defined right after it, and **not** to the **_global_** one defined outside.

-  That is why we get the <code>**UnboundLocalError**</code>: we are trying to access a local variable before a value has been assigned to it.

-  To tell Python that we want to use the **_global_** variable, we have to explicitly state this by using the keyword <code>**global**</code>.

In [7]:
# Example 6: Access AND try to re-bind a variable inside the local scope of a function 
# yet defined (and declared) outside of it by explicitly telling Python that 
# we are referring to the global (outside defined) variable. 
def foo():
    # Explicitly tell Python we are referring to the variable my_list
    # defined outside this local scope (i.e., global)
    global my_list
    
    # Print the values contained in (global) my_list
    print("Inside foo(), my_list is {}".format(my_list))
    
    # We re-define the global variable my_list within the function's body
    # Actually, we are re-binding the name my_list (already used) to a different object
    my_list = [36, 49, 64]
    
    # Print the values contained in my_list
    print("Inside foo(), my_list is {}".format(my_list))

# Define my_list outside the scope of foo() and before calling foo()
my_list = [42, 73, 96]

# Print the values contained in my_list before calling foo()
print("Before calling foo(), my_list is {}".format(my_list))

# Calling foo()
foo()

# Print the values contained in my_list when foo() returns
print("After calling foo(), my_list is {}".format(my_list))

Before calling foo(), my_list is [42, 73, 96]
Inside foo(), my_list is [42, 73, 96]
Inside foo(), my_list is [36, 49, 64]
After calling foo(), my_list is [36, 49, 64]


## _Nonlocal_ Variables

-  Python 3 also introduces <code>**nonlocal**</code> variables.

-  Those are similar to <code>**global**</code> variables but they can only be used inside of nested functions. 

-  In other words, a <code>**nonlocal**</code> variable has to be defined in the enclosing function scope.

In [8]:
# Example 7: nonlocal variables can only be used if they are defined in an
# enclosing function
def foo():
    # Trying to access my_list as if it was global turns into an error!
    nonlocal my_list
    
    # Print the values contained in (nonlocal) my_list
    print("Inside foo(), my_list is {}".format(my_list))
    
# Define my_list outside the scope of foo() and before calling foo()
my_list = [42, 73, 96]

# Print the values contained in my_list before calling foo()
print("Before calling foo(), my_list is {}".format(my_list))

# Calling foo()
foo()

# Print the values contained in my_list when foo() returns
print("After calling foo(), my_list is {}".format(my_list))

SyntaxError: no binding for nonlocal 'my_list' found (1680621625.py, line 5)

In [9]:
# Example 8: correct usage of nonlocal variables
def foo():
    # Define my_list inside the scope of foo() and before calling bar()
    my_list = [36, 49, 64]
    
    # Print the values contained in my_list before calling bar()
    print("Inside foo() and before calling bar(), my_list is {}".format(my_list))
    
    def bar():
        # Trying to access my_list defined in foo()'s local scope
        nonlocal my_list
        
        my_list = [6, 7, 8]
        
        # Print the values contained in (nonlocal) my_list
        print("Inside bar(), my_list is {}".format(my_list))
        
    # Calling bar()
    bar()
    
    # Print the values contained in my_list after calling bar()
    print("Inside foo() and after calling bar(), my_list is {}".format(my_list))

# Define my_list outside the scope of foo() and before calling foo()
my_list = [42, 73, 96]

# Print the values contained in my_list before calling foo()
print("Before calling foo(), my_list is {}".format(my_list))

# Calling foo()
foo()

# Print the values contained in my_list when foo() returns
print("After calling foo(), my_list is {}".format(my_list))

Before calling foo(), my_list is [42, 73, 96]
Inside foo() and before calling bar(), my_list is [36, 49, 64]
Inside bar(), my_list is [6, 7, 8]
Inside foo() and after calling bar(), my_list is [6, 7, 8]
After calling foo(), my_list is [42, 73, 96]


## How a Function Call (Roughly) Works

-  Calling function <code>**foo()**</code> results in the creation of a so-called **stack frame** (a.k.a. **activation frame** or **activation record**).

-  The stack frame contains essentially 3 pieces of information:
    -  the actual input arguments to the called function (if any);
    -  the (memory) address where to return after the called function terminates;
    -  the called function's internal state (e.g., local variables).
    
-  Upon returning to the "caller", the stack frame corresponding to the called function is destroyed!

<span style="color: red"><b>NOTE:</b></span> *When the stack frame corresponding to a function call is destroyed, every variable defined in the function's __local namespace__ is destroyed as well. If, instead, the *<code>**global**</code>* keyword is used to refer to a variable living in the __global namespace__, any change made to that variable remains visible after the function returns to the caller.*
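As a rough illustration of the idea, the standard `inspect` module lets a function look at its own frame and at the frame of its caller. This is only a sketch: it assumes CPython (where `inspect.currentframe()` returns a frame object), and the names `inner`, `outer` and `info` are hypothetical.

```python
import inspect

def inner(value):
    # The frame of the function that is currently executing.
    frame = inspect.currentframe()
    # The frame of the caller, i.e., where execution resumes after returning.
    caller = frame.f_back
    info = {
        "function": frame.f_code.co_name,
        "caller": caller.f_code.co_name,
        # The arguments and local variables live in the frame's local namespace.
        "value_in_frame": frame.f_locals["value"],
    }
    # Drop the references to the frames explicitly to avoid reference cycles.
    del frame, caller
    return info

def outer():
    return inner(10)

info = outer()
print(info)
```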

## Returning Multiple Values

-  One of the most useful features of Python functions is their ability to return multiple values.

-  In many applications (especially, in data science), you will likely encounter many functions that may have multiple outputs.

In [10]:
# Consider the following function returning 3 values
def bar():
    x = 3
    y = 4
    z = 5
    return x, y, z

# Assign the values returned by bar() to 3 variables: a, b, and c
a,b,c = bar()
print("a = {0}; b = {1}; c = {2}".format(a,b,c))

a = 3; b = 4; c = 5


In [11]:
# Consider the same function returning 3 values
def bar():
    x = 3
    y = 4
    z = 5
    return x, y, z

# We can also bind the values returned by bar() to a single name: the result is a tuple
record = bar()
print("a = {0}; b = {1}; c = {2}".format(record[0],record[1],record[2]))

a = 3; b = 4; c = 5
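What happens in the two cells above is tuple packing and unpacking: the comma-separated values after `return` form a single tuple, which the caller may keep whole or unpack. The sketch below shows this with a hypothetical `summarize` function of the kind often met in data analysis.

```python
# Hypothetical helper returning three summary statistics at once.
def summarize(values):
    smallest = min(values)
    largest = max(values)
    mean = sum(values) / len(values)
    return smallest, largest, mean

# Keeping the result whole shows that it is a single tuple object.
stats = summarize([2, 4, 9])
print(type(stats).__name__)

# Unpacking assigns each element of the tuple to its own name.
low, high, average = stats
print("low = {}; high = {}; average = {}".format(low, high, average))
```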


## Functions _are_ Objects

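-  Since functions are objects, they can be stored in data structures and passed around like any other value; this allows the expression of many constructs that are difficult to express in other programming languages.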
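Before the data cleaning example, here is a minimal sketch of what "functions are objects" means in practice. The names `double`, `twice` and `operations` are hypothetical.

```python
def double(x):
    return x * 2

# A function can be bound to another name, just like any other object.
twice = double
print(twice is double)

# Function objects carry attributes, e.g., their name.
print(double.__name__)

# Functions (user-defined or built-in) can be stored in a data structure...
operations = {"double": double, "absolute": abs}

# ...and called later through that structure.
results = {name: op(-3) for name, op in operations.items()}
print(results)
```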

In [12]:
# Consider the following example, where we do some data cleaning 
# and need to apply a bunch of transformations to a ("messy") list of strings
states = ['   Alabama ', 'Georgia!', 'CAlifoRnia', 'texas!!', 'FlOrIda',
          'south carolina##', 'West virginia?']

# Lots of so-called preprocessing steps need to be done 
# to make this list of strings uniform and ready for analysis: 
# e.g., whitespace stripping, removing punctuation symbols, and proper capitalization

In [13]:
# 1st Solution
# The following Python statement tells the interpreter to import a specific module
# In particular, re stands for the regular expression module (more on import later)
import re

# Define a function that takes as input a list of strings and returns another list
# where strings are properly preprocessed and normalized
def clean_strings(strings):
    result = []
    
    for value in strings:
        # remove leading and trailing whitespaces
        value = value.strip() 
        
        # remove punctuation
        value = re.sub('[!#?]', '', value) #re.sub(what_to_sub, with_what, in_which)
        
        # capitalize the first letter of each word (and lowercase the rest)
        value = value.title() 
        
        result.append(value)
        
    return result

# Call the function defined above
clean_strings(states)

['Alabama',
 'Georgia',
 'California',
 'Texas',
 'Florida',
 'South Carolina',
 'West Virginia']

In [14]:
# 2nd Solution
# An alternate approach is to make a list of the operations to apply to each string:
# a more functional pattern (exploiting the fact that functions are objects)

# Define a function for removing punctuation from a string
def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

# Create a list of operations: either built-in string methods (e.g., str.title) or user-defined functions
clean_ops = [str.strip, remove_punctuation, str.title]

# Define our preprocessing function so that it takes the list of functions as 2nd argument 
def clean_strings(strings, ops):
    result = []

    # loop through each string as above
    for value in strings:
        # loop through each operation (function)
        for function in ops: 
            # function is the reference to a function object
            value = function(value) 
            
        result.append(value)
        
    return result
    
# Call the function defined above (this time with the list of operations)
clean_strings(states, clean_ops)

['Alabama',
 'Georgia',
 'California',
 'Texas',
 'Florida',
 'South Carolina',
 'West Virginia']

## Anonymous (_lambda_) Functions

-  Python has support for so-called **_anonymous_** or **_lambda_** functions.

-  Those are just functions consisting of a single expression, the result of which is the return value.

-  They are defined using the <code>**lambda**</code> keyword.

-  They are very convenient in data analysis because in many cases data transformation functions will take functions as arguments. 
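A common case of a function that takes another function as an argument is the `key` parameter of the built-ins `sorted()` and `max()`. The sketch below uses a hypothetical list `words`.

```python
words = ["banana", "fig", "cherry", "kiwi"]

# Sort by length; the lambda is called once per element to compute the sort key.
by_length = sorted(words, key=lambda word: len(word))
print(by_length)

# Pick the word whose last letter comes latest in the alphabet.
latest_ending = max(words, key=lambda word: word[-1])
print(latest_ending)
```

Since `sorted()` is stable, words of equal length keep their original relative order.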

In [15]:
# In the following example, we define a single-statement function in 2 ways:
# 1) The usual way as below
def short_foo(x):
    return x * 2

# 2) Using anonymous function via the lambda keyword
lambda_short_foo = lambda x: x * 2

In [9]:
# Consider the following function which takes as argument another function foo
# foo is in turn applied to each element of the list
def apply_to_list(some_list, foo):
    return [foo(x) for x in some_list]

# values is the input list
values = [4, 0, 1, 5, 6]

# When calling apply_to_list, instead of defining explicitly foo just use lambda
apply_to_list(values, lambda x: x * 2)

[8, 0, 2, 10, 12]

## Checkpoint Quiz

How would you implement <code>**apply_to_list**</code> function without using <code>**lambda**</code>?

In [10]:
def apply_to_list_no_lambda(some_list):
    '''
    Given the input list some_list, return another list
    whose i-th element is equal to the i-th element of some_list multiplied by 2 
    '''
    result_list = [x * 2 for x in some_list]
    return result_list

    
    
values = [1, 2, 3, 4]
apply_to_list_no_lambda(values)
    

[2, 4, 6, 8]

In [11]:
# Check if results are the same when invoking apply_to_list and apply_to_list_no_lambda
apply_to_list(values, lambda x: x * 2) == apply_to_list_no_lambda(values)

True

## Extended Function Call Syntax: <code>\*args</code>, <code>\**kwargs</code>

-  The way that function arguments work under the hood in Python is actually very simple. 

-  When you write <code>**foo(a, b, c, d=d_value, e=e_value)**</code>, **_positional_** arguments are packed into a <code>**tuple**</code>, whilst **_keyword_** arguments into a <code>**dict**</code>. 

-  So, internally, the function receives a <code>**tuple**</code> (<code>**args**</code>) and a <code>**dict**</code> (<code>**kwargs**</code>).
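The packing described above, and the reverse unpacking at call time, can be sketched as follows. The functions `describe_call` and `add` are hypothetical and exist only for this illustration.

```python
# Packing: extra positional arguments end up in a tuple,
# extra keyword arguments in a dict.
def describe_call(*args, **kwargs):
    return args, kwargs

positional, keyword = describe_call(1, 2, 3, d=4, e=5)
print(positional)
print(keyword)

# Unpacking: the same syntax at the call site spreads a tuple and a dict
# over the parameters of the called function.
def add(a, b, c=0):
    return a + b + c

numbers = (1, 2)
options = {"c": 10}
total = add(*numbers, **options)
print(total)
```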

In [12]:
# Consider the following function definition
# This takes as input another function foo, plus a tuple of positional arguments (*args)
# and a dict of keyword arguments (**kwargs)
def say_hello_then_call_foo(foo, *args, **kwargs):
    print('args = {}'.format(args))
    print('kwargs = {}'.format(kwargs))
    print("Hello! Now I'm going to call '{}({}, {}, {})'".format(foo.__name__, args[0], args[1], kwargs['c']))
    return foo(*args, **kwargs)

# This is the function which will be input to the above
def bar(a, b, c=1):
    return (a + b) / c

# Call say_hello_then_call_foo with bar
say_hello_then_call_foo(bar, 5, 7, c=4.0)

args = (5, 7)
kwargs = {'c': 4.0}
Hello! Now I'm going to call 'bar(5, 7, 4.0)'


3.0

## Iterators

-  Having a consistent way to iterate over sequences, such as the objects in a list, is an important Python feature. 

-  This is accomplished by means of the **_iterator protocol_**, a generic way to make objects **_iterable_**.

In [20]:
# For example, iterating over a dict yields the dict keys
a_dict = {'x': 1, 'y': 2, 'z': 3}

# When you write the loop below, the Python interpreter first attempts
# to create an iterator out of a_dict
for key in a_dict:
    print(key)
    
# An iterator is any object that will yield objects to the Python interpreter 
# when used in a context like a for loop. 
# Most methods expecting a list or list-like object will also accept an iterable object
dict_iterator = iter(a_dict)
dict_iterator

x
y
z


<dict_keyiterator at 0x7fd6985484f0>
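What a `for` loop does with the iterator can be spelled out by hand using the built-ins `iter()` and `next()`. This is a sketch of the protocol, not of the interpreter's actual implementation.

```python
a_dict = {'x': 1, 'y': 2, 'z': 3}

# Step 1: ask the iterable for an iterator.
iterator = iter(a_dict)

# Step 2: request elements one at a time until StopIteration is raised.
keys = []
while True:
    try:
        key = next(iterator)
    except StopIteration:
        # The iterator is exhausted: this is how a for loop knows when to stop.
        break
    keys.append(key)

print(keys)

# Functions such as list() consume an iterable through the same protocol.
print(list(a_dict) == keys)
```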

## Generators

-  A **_generator_** is a concise way to construct a new iterable object. 

-  Typically, a function runs to completion and returns its result (one or more values) all at once.

-  Generators, instead, are functions that return a sequence of values *lazily*, pausing after each one until the next one is requested. 

-  To create a generator function, use the <code>**yield**</code> keyword rather than <code>**return**</code>.
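The pause-and-resume behaviour can be observed by requesting values one at a time with the built-in `next()`. The generator function `countdown` below is a hypothetical example.

```python
def countdown(n):
    while n > 0:
        # Execution pauses here and resumes when the next value is requested.
        yield n
        n -= 1

gen = countdown(3)

# Each call to next() runs the body up to the next yield.
first = next(gen)
second = next(gen)
print(first, second)

# list() consumes whatever is left.
rest = list(gen)
print(rest)

# A generator can be consumed only once: it is now exhausted.
exhausted = list(gen)
print(exhausted)
```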

In [16]:
# Consider the following example
# Note that squares(n) uses yield instead of return
def squares(n=10):
    print('Generating squares of numbers from 1 to {0}'.format(n))
    for i in range(1, n + 1):
        yield i ** 2

# Check this is actually a generator
squares_gen = squares(12)
print(type(squares_gen))

# Until you request elements from the generator this won't execute its code

# Use the generator
for x in squares_gen:
    print(x, end=' ') # end=' ' Appends a whitespace instead of a newline
 

# When a function body contains the yield keyword, calling the function returns a generator.
# When the generator is iterated over, the body runs up to the first yield statement.
# At that point, the function's state is saved, and the yielded value is handed to the caller.
# The next time a value is requested, execution resumes from where it was paused and continues
# until it hits the next yield statement or reaches the end of the function.

<class 'generator'>
Generating squares of numbers from 1 to 12
1 4 9 16 25 36 49 64 81 100 121 144 

# I/O

## I/O Basics

-  In this class, we will be using high-level tools provided by third-party modules (e.g., <code>**pandas**</code>) to load data files from disk into Python data structures. 

-  Still, it is important to understand the basics of how to work with files in Python!

-  To open a file for reading or writing, use the built-in <code>**open()**</code> function with either a relative or absolute file path.

-  The full API specification for <code>**open()**</code> can be found [here](https://docs.python.org/3/library/functions.html#open).

In [22]:
# Assuming we have a file 'sample.txt' stored at this location (relative path)
path = './data/sample.txt'

# The result of the 'open' function is a 'handle' to a file object
# By default, the file is opened in 'read-only' text mode (binary mode is also available)
# In text mode, if encoding is not specified, the encoding used is platform dependent:
# locale.getpreferredencoding(False) is called to get the current locale encoding
infile = open(path) # This is equal to open(path, 'r')

FileNotFoundError: [Errno 2] No such file or directory: './data/sample.txt'

In [23]:
# infile can be iterated over as if it were a sequence of (Unicode) strings, one per line
# You can loop through each line of the file (delimited by '\n') as follows
for line in infile:
    print(line)

NameError: name 'infile' is not defined

In [24]:
# The following removes leading and trailing whitespace (including the EOL character) from each line
lines = [line.strip() for line in open(path)]
print(lines)

FileNotFoundError: [Errno 2] No such file or directory: './data/sample.txt'

In [25]:
# Whatever we have to do with a file's handle (reading from it or writing to it),
# eventually we should call the 'close()' method on it
infile.close()

NameError: name 'infile' is not defined

In [26]:
# Instead of explicitly closing the file's handle, we can use the 'with' statement,
# which closes the file automatically when the block is exited
with open(path) as infile:
    for line in infile:
        if line.strip() != '':
            print(line.strip())

FileNotFoundError: [Errno 2] No such file or directory: './data/sample.txt'
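The cells above assume that `./data/sample.txt` exists; the recorded outputs show that it was missing when the notebook was run. The following self-contained sketch reproduces the same pattern on a temporary file created on the fly. The file content and the names `sample_path` and `non_blank` are assumptions made for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample.txt')

    # Create a small file containing a blank line.
    with open(sample_path, 'w', encoding='utf-8') as outfile:
        outfile.write('first line\n\nsecond line\n')

    # Read it back, skipping blank lines and stripping the EOL characters.
    with open(sample_path, encoding='utf-8') as infile:
        non_blank = [line.strip() for line in infile if line.strip() != '']

    # After the 'with' block, the handle has been closed automatically.
    closed_after_with = infile.closed

print(non_blank)
print(closed_after_with)
```

Passing `encoding='utf-8'` explicitly avoids depending on the platform's default encoding.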

## Reading from Files

-  For readable files, some of the most commonly used methods are <code>**read()**</code>, <code>**seek()**</code>, and <code>**tell()**</code>. 

-  <code>**read()**</code> returns a certain number of characters from the file. 

-  What constitutes a "character" depends on whether the file is opened in **text** or **binary** mode.

In [27]:
# Example of using 'read' method
# Re-open the previously closed file in text mode (the default encoding is platform dependent; UTF-8 is assumed here)
text_file = open(path) # This is equal to open(path, 'r')

# Read 10 characters
print(text_file.read(10))

FileNotFoundError: [Errno 2] No such file or directory: './data/sample.txt'

In [28]:
# What if we use the 'read' method when the file is opened in binary mode?
binary_file = open(path, 'rb')

# Read 10 bytes
print(binary_file.read(10))

FileNotFoundError: [Errno 2] No such file or directory: './data/sample.txt'

In [29]:
# The 'read' method advances the file handle's position by the number of bytes read. 
# The 'tell' method gives you the current position
# This is the output when this is called on a file opened in text mode
print("Number of bytes read = {}".format(text_file.tell()))

# This is the output when this is called on a file opened in binary mode
print("Number of bytes read = {}".format(binary_file.tell()))

# The difference is due to the fact that the text-mode file handle 
# needs to advance by 11 bytes in order to decode 10 characters 
# (the file is assumed to be UTF-8-encoded). The extra byte originates
# from the fact that the 'ñ' character is encoded with 2 bytes in UTF-8.
# With a binary-mode file handle, instead, the number of "characters" read
# corresponds exactly to the number of bytes read!

NameError: name 'text_file' is not defined

In [30]:
# Lastly, 'seek' changes the file position to the indicated byte in the file
# Text-mode file handle
text_file.seek(3)
print("Character read at byte position 3 (text-mode opened) = {}".format(text_file.read(1)))

# Byte-mode file handle
binary_file.seek(3)
print("Byte read at byte position 3 (binary-mode opened) = {}".format(binary_file.read(1)))

# Finally, close both file handles
text_file.close()
binary_file.close()

NameError: name 'text_file' is not defined
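Since the sample file was not available when the cells above were run, the difference between text and binary mode can be sketched on a temporary file instead. The text written to the file is a hypothetical stand-in chosen because it contains an 'ñ'; it is not claimed to be the content of the original `sample.txt`.

```python
import os
import tempfile

# Hypothetical content: 'ñ' takes 2 bytes in UTF-8, all the other characters 1 byte.
text = 'Sueña el rico en su riqueza'

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample.txt')
    with open(sample_path, 'w', encoding='utf-8') as outfile:
        outfile.write(text)

    # Text mode: read(10) returns 10 decoded characters.
    with open(sample_path, encoding='utf-8') as text_file:
        chars = text_file.read(10)

    # Binary mode: read(10) returns 10 raw bytes.
    with open(sample_path, 'rb') as binary_file:
        raw = binary_file.read(10)
        position = binary_file.tell()
        # Jump to byte offset 3, which is the first of the two bytes of 'ñ'.
        binary_file.seek(3)
        byte_at_3 = binary_file.read(1)

print(chars)
print(raw)
# The 10 characters need 11 bytes once encoded in UTF-8.
print(len(chars), len(chars.encode('utf-8')))
print(position, byte_at_3)
```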

## Writing to Files

-  To write text to a file, you can use either the file's <code>**write()**</code> or <code>**writelines()**</code> methods. 

-  For example, we could create a version of the previous text file with no blank lines.

In [31]:
# Write each non-blank line of the original file to a new temporary file
with open('data/tmp.txt', 'w') as outfile, open(path) as infile:
    outfile.writelines(line for line in infile if len(line) > 1)

# See if the written file contains what we actually expect
with open('data/tmp.txt') as infile:
    print(infile.readlines())

FileNotFoundError: [Errno 2] No such file or directory: 'data/tmp.txt'
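The recorded error above suggests that the `data` directory did not exist when the cell was run. As a self-contained sketch of `write()` and `writelines()`, the example below writes to a temporary directory instead; the list `lines` and the header text are hypothetical.

```python
import os
import tempfile

# Hypothetical input: some lines are blank (they contain only the EOL character).
lines = ['alpha\n', '\n', 'beta\n', '\n', 'gamma\n']

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, 'tmp.txt')

    with open(out_path, 'w', encoding='utf-8') as outfile:
        # write() takes a single string...
        outfile.write('# header\n')
        # ...while writelines() takes an iterable of strings and adds no EOL itself.
        outfile.writelines(line for line in lines if len(line) > 1)

    # Read the file back to check what was actually written.
    with open(out_path, encoding='utf-8') as infile:
        written = infile.readlines()

print(written)
```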