# Writing a first Python script

To get started, let us take a deep dive into Python and write our first script. We will calculate a factorial of an integer in a few different ways. 

Here, you will learn about:
##### How to use significant whitespace
##### Writing basic for loops
##### Importing from the Python standard library


## Iterative and recursive implementations of n-factorial

In [2]:
def simple_factorial(n):
    result = 1
    for i in range(2,n+1):
        result *= i
    return result

print("Calculate factorial with a for-loop")
print(simple_factorial(3))

def recursive_factorial(n):
    if n == 0:
        return 1
    else:
        return n * recursive_factorial(n-1)

print("Calculate factorial using a recursive call")
print(recursive_factorial(3))    

Calculate factorial with a for-loop
6
Calculate factorial using a recursive call
6


## Calculate n-factorial using the Python standard library

<img src="graphics/std_lib.png">

In [4]:
import codecs
help(codecs)

Help on module codecs:

NAME
    codecs - codecs -- Python Codec Registry, API and helpers.

MODULE REFERENCE
    http://docs.python.org/3.5/library/codecs
    
    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    
    Written by Marc-Andre Lemburg (mal@lemburg.com).
    
    (c) Copyright CNRI, All Rights Reserved. NO WARRANTY.

CLASSES
    builtins.object
        Codec
            StreamReader
            StreamWriter
        IncrementalDecoder
        IncrementalEncoder
        StreamReaderWriter
        StreamRecoder
    builtins.tuple(builtins.object)
        CodecInfo
    
    class Codec(builtins.object)
     |  Defines the interface for stateless encoders/decoders.
     |  
     |  The .encode()/.de

In [6]:
from math import factorial
#from math import *
#import math
#import math as my

print(factorial(3))

6


# Python coding best practices and Python Enhancement Proposals

Coding style matters. I would strongly recommend that everyone spend some time getting acquainted with the basic Python principles formulated in the Zen of Python:

In [8]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


**As well as more advanced PEPs:** https://www.python.org/dev/peps/

# Python scalar types

A variable in Python is a name that refers to an object stored in memory. Creating a variable does not reserve a fixed-size slot in advance; the interpreter allocates whatever memory the object needs.

The type belongs to the object, not to the name. Therefore, the same variable can refer to an integer, a decimal number or a string at different times.

There are four fundamental **scalar** types supported in Python 3: 
*int*, *float*, *bool* and *None* (Python 2 also had a separate *long* integer type)

<img src="graphics/int.png">

Python has arbitrary precision integers so there is no true fixed maximum. You're only limited by available memory.
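
A quick way to see arbitrary precision in action is to compute values that would overflow a fixed 64-bit integer. The sketch below is only an illustration; the variable names are ours and not part of the original notebook.

```python
import math

# 2**64 does not fit in a 64-bit unsigned integer, yet Python stores it exactly.
big = 2 ** 64
print(big)                      # 18446744073709551616

# Factorials grow quickly; the result is still an exact int.
fact_25 = math.factorial(25)
print(fact_25)                  # 15511210043330985984000000
print(type(fact_25))            # <class 'int'>
```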

In [10]:
# In Python 2, plain ints were stored as a C long (at least 32 bits); Python 3 ints have arbitrary precision
10

10

In [10]:
type(10)

int

In [12]:
#Unlimited precision
long(10)

#In Python3 int and long were unified into a single arbitrary precision int type.

NameError: name 'long' is not defined

In [13]:
#binary representation of integers
0b10

2

In [14]:
#octal
0o10

8

In [15]:
#hexadecimal
0x10

16

<img src="graphics/bool.png">

Bool is a subtype of integer.

In [16]:
bool(0)

False

In [17]:
type(False)

bool

In [18]:
bool(1)

True

In [19]:
bool("")

False

In [20]:
bool("zero")

True
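
Because bool is a subtype of int, True and False can take part in arithmetic as 1 and 0. A minimal sketch of what that means in practice (the list `values` is made up for the illustration):

```python
# bool is a subclass of int, so True and False behave like 1 and 0.
is_subclass = issubclass(bool, int)
print(is_subclass)        # True
print(True + True)        # 2

# A common use: count how many items satisfy a condition.
values = [3, -1, 4, -5, 9]
positives = sum(v > 0 for v in values)
print(positives)          # 3
```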

<img src="graphics/none.png">

In [21]:
type(None)

NoneType

<img src="graphics/float.png">

In [22]:
type(3.)

float

In [23]:
import sys
sys.float_info

sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
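
The `epsilon` field above has a practical consequence: floats should rarely be compared with `==`. The following sketch, using only the standard `math` and `sys` modules, shows the usual workaround; the variable names are illustrative.

```python
import math
import sys

total = 0.1 + 0.2
print(total == 0.3)               # False: binary floats cannot represent 0.1 exactly
print(math.isclose(total, 0.3))   # True: compares within a relative tolerance

# epsilon is the gap between 1.0 and the next representable float.
eps = sys.float_info.epsilon
print(1.0 + eps > 1.0)            # True
print(1.0 + eps / 2 == 1.0)       # True: the increment is too small to register
```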

# Interacting Between Different Variable Types

Beware of integer division with Python 2. Unlike R, Python 2 doesn't assume that everything is a float unless explicitly told; it recognises that 2 is an integer, and this can be good and bad. In Python 3, we don't need to worry about this; the following code was run under a Python 3 kernel, but test it under Python 2 to see the difference.

In [24]:
myint = 2
myfloat = 3.14
print(type(myint), type(myfloat))

<class 'int'> <class 'float'>


In [25]:
# Multiplying an int with a float gives a float : the int was promoted.
print(myint * myfloat)
print(type(myint * myfloat))

6.28
<class 'float'>


In [26]:
# A minor difference between Python 2 and Python 3 :
print(7 / 3)
# Py2 : 2
# Py3 : 2.3333

2.3333333333333335


In [27]:
# In Python 2, an operation between two values of the same type gives a result of that type :
print(type(7 / 3))
# Py2 : <type 'int'>
# Py3 : <class 'float'>

<class 'float'>


In [28]:
# Quick hack with ints to floats - there's no need to typecast, just give it a float 
print(float(7) / 3)
print(7 / 3.0)

2.3333333333333335
2.3333333333333335


In [29]:
# In Python 3, this is handled "correctly"; you can use // as integer division
print(7 // 3)

2


In [30]:
# Quick note for Py2 users - see https://www.python.org/dev/peps/pep-0238/
from __future__ import division
print(7 / 3)

2.3333333333333335


# Relational operators  

<img src="graphics/rel_ops.png">

In [32]:
var1 = 2
var2 = 2.
var3 = 4.5

print("var3 is greater than var1: ", var3 > var1)
print("var3 is greater or equal to var1: ", var3 >= var1)
print("var1 is equal to var2: ", var1 == var2)
print("var1 is var2: ", var1 is var2)

var3 is greater than var1:  True
var3 is greater or equal to var1:  True
var1 is equal to var2:  True
var1 is var2:  False


# Conditional statements: if, elif, else

In [33]:
x = 5

if x > 3 :
    print("x is greater than 3.")
    
elif x == 5 :
    print("We aren't going to see this. Why ?")
    
else :
    print("x is not greater than 3.")
    
print("We can see this, it's not in the if statement.")

x is greater than 3.
We can see this, it's not in the if statement.


# While loops, for loops and control flow 

In [34]:
for outer in range(1, 3) :
    print("BIG CLICK, outer loop change to {}".format(outer))
    
    for inner in range(4) :
        print("*little click*, outer is still {}, and inner is {}.".format(outer, inner))
        
print("I'm done here.")

BIG CLICK, outer loop change to 1
*little click*, outer is still 1, and inner is 0.
*little click*, outer is still 1, and inner is 1.
*little click*, outer is still 1, and inner is 2.
*little click*, outer is still 1, and inner is 3.
BIG CLICK, outer loop change to 2
*little click*, outer is still 2, and inner is 0.
*little click*, outer is still 2, and inner is 1.
*little click*, outer is still 2, and inner is 2.
*little click*, outer is still 2, and inner is 3.
I'm done here.


In [35]:
#Same code with the while loop
outer_index = 1
while outer_index < 3:
    print("BIG CLICK, outer loop change to {}".format(outer_index))
    
    inner_index = 0
    while inner_index < 4:
        print("*little click*, outer is still {}, and inner is {}.".format(outer_index, inner_index))
        inner_index += 1
        
    # Increment after the inner loop so the output matches the for-loop version
    outer_index += 1
        
print("I'm done here.")

BIG CLICK, outer loop change to 1
*little click*, outer is still 1, and inner is 0.
*little click*, outer is still 1, and inner is 1.
*little click*, outer is still 1, and inner is 2.
*little click*, outer is still 1, and inner is 3.
BIG CLICK, outer loop change to 2
*little click*, outer is still 2, and inner is 0.
*little click*, outer is still 2, and inner is 1.
*little click*, outer is still 2, and inner is 2.
*little click*, outer is still 2, and inner is 3.
I'm done here.


In [None]:
#Example of an infinite loop

while True:
    print("Looping!")
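
In practice, a `while True` loop is usually paired with `break`, and `continue` skips to the next iteration. A small sketch of both statements (the counter and the limit of 5 are arbitrary choices for the illustration):

```python
# A "while True" loop runs until something inside it calls break.
count = 0
while True:
    count += 1
    if count % 2 == 0:
        continue          # skip the rest of the body for even counts
    print("Looping, count is", count)
    if count >= 5:
        break             # leave the loop entirely

print("Stopped after", count, "iterations")
```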

# Strings and Bytes

Strings in Python are sequences of characters enclosed in quotation marks; either single or double quotes may be used, as long as they match. Substrings can be taken using the slice operator ([ ] and [:]), with indexes starting at 0 at the beginning of the string and, counting backwards, at -1 at the end.

The plus (+) sign is the string concatenation operator and the asterisk (*) is the repetition operator. For example −

In [36]:
greeting = 'Hello World!'    # avoid the name "str": it would shadow the built-in type

print(greeting)          # Prints complete string
print(greeting[0])       # Prints first character of the string
print(greeting[2:5])     # Prints characters starting from 3rd to 5th
print(greeting[2:])      # Prints string starting from 3rd character
print(greeting * 2)      # Prints string two times
print(greeting + "TEST") # Prints concatenated string

Hello World!
H
llo
llo World!
Hello World!Hello World!
Hello World!TEST
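
Negative indexes and the relationship between strings and bytes are not covered by the cell above, so here is a short sketch of both. It assumes UTF-8 as the encoding; the variable names are illustrative.

```python
text = 'Hello World!'

print(text[-1])      # '!' : the last character
print(text[::-1])    # '!dlroW olleH' : a reversed copy (step of -1)

# Strings hold Unicode text; bytes hold raw 8-bit values.
raw = text.encode("utf-8")           # str -> bytes
print(raw)                           # b'Hello World!'
print(raw.decode("utf-8") == text)   # True: bytes -> str round trip
```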


# Lists, Dictionaries, and Tuples

## Python Lists

Lists are the most versatile of Python's compound data types. A list contains items separated by commas and enclosed within square brackets ([]). To some extent, lists are similar to arrays in C. One difference between them is that the items belonging to a list can be of different data types.

The values stored in a list can be accessed using the slice operator ([ ] and [:]), with indexes starting at 0 at the beginning of the list and running to length - 1 at the end; negative indexes count backwards from -1 at the last element. The plus (+) sign is the list concatenation operator, and the asterisk (*) is the repetition operator. For example −

In [37]:
mylist = [ 'abcd', 786 , 2.23, 'john', 70.2 ]   # avoid the name "list": it would shadow the built-in type
tinylist = [123, 'john']

print(mylist)          # Prints complete list
print(mylist[0])       # Prints first element of the list
print(mylist[1:3])     # Prints elements starting from 2nd till 3rd 
print(mylist[2:])      # Prints elements starting from 3rd element
print(tinylist * 2)    # Prints list two times
print(mylist + tinylist) # Prints concatenated lists
print(mylist[-1])      # Prints last element of the list

['abcd', 786, 2.23, 'john', 70.2]
abcd
[786, 2.23]
[2.23, 'john', 70.2]
[123, 'john', 123, 'john']
['abcd', 786, 2.23, 'john', 70.2, 123, 'john']
70.2


## Python Tuples

A tuple is another sequence data type that is similar to the list. A tuple consists of a number of values separated by commas. Unlike lists, however, tuples are enclosed within parentheses.

The main differences between lists and tuples are: Lists are enclosed in brackets ( [ ] ) and their elements and size can be changed, while tuples are enclosed in parentheses ( ( ) ) and cannot be updated. Tuples can be thought of as read-only lists. For example −

In [38]:
mytuple = ( 'abcd', 786 , 2.23, 'john', 70.2  )   # avoid the name "tuple": it would shadow the built-in type
tinytuple = (123, 'john')

print(mytuple)           # Prints complete tuple
print(mytuple[0])        # Prints first element of the tuple
print(mytuple[1:3])      # Prints elements starting from 2nd till 3rd 
print(mytuple[2:])       # Prints elements starting from 3rd element
print(tinytuple * 2)     # Prints tuple two times
print(mytuple + tinytuple) # Prints concatenated tuples

('abcd', 786, 2.23, 'john', 70.2)
abcd
(786, 2.23)
(2.23, 'john', 70.2)
(123, 'john', 123, 'john')
('abcd', 786, 2.23, 'john', 70.2, 123, 'john')


The following code fails for the tuple, because we attempt to update one of its elements, which is not allowed. The same assignment is valid for a list −

In [39]:
mytuple = ( 'abcd', 786 , 2.23, 'john', 70.2  )
mylist = [ 'abcd', 786 , 2.23, 'john', 70.2  ]
mylist[2] = 1000     # Valid: lists are mutable
mytuple[2] = 1000    # Raises TypeError: tuples are immutable

TypeError: 'tuple' object does not support item assignment

## Python Dictionary

Python's dictionaries are a kind of hash table. They work like the associative arrays or hashes found in Perl and consist of key-value pairs. A dictionary key can be any hashable (in practice, immutable) Python object, but keys are usually numbers or strings. Values, on the other hand, can be any arbitrary Python object.

Dictionaries are enclosed by curly braces ({ }) and values can be assigned and accessed using square brackets ([]). For example −

In [None]:
mydict = {}   # avoid the name "dict": it would shadow the built-in type
mydict['one'] = "This is one"
mydict[2]     = "This is two"

tinydict = {'name': 'john','code':6734, 'dept': 'sales'}


print(mydict['one'])       # Prints value for 'one' key
print(mydict[2])           # Prints value for 2 key
print(tinydict)          # Prints complete dictionary
print(tinydict.keys())   # Prints all the keys
print(tinydict.values()) # Prints all the values

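Before Python 3.7, dictionaries had no guaranteed order among elements; it was incorrect to say that the elements were "out of order", because they were simply unordered. Since Python 3.7, a dictionary preserves the order in which its keys were inserted, but elements are still looked up by key, not by position.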
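
Beyond indexing by key, a few other dictionary operations come up constantly: membership tests, lookups with a default, iteration over pairs, and deletion. A minimal sketch, using a made-up `inventory` dictionary:

```python
inventory = {'apples': 3, 'pears': 0}

print('apples' in inventory)          # True: membership tests look at keys
print(inventory.get('plums', 0))      # 0: get() returns a default instead of raising KeyError

# items() yields (key, value) pairs.
for name, quantity in inventory.items():
    print(name, quantity)

inventory['plums'] = 7                # assigning to a new key adds an entry
del inventory['pears']                # del removes an entry
print(sorted(inventory))              # ['apples', 'plums']
```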

# Checkpoint 1

### Write a Python code segment that constructs a list of (double-precision) floats with integer values from 0 to n.

### For each of the types bool, int, float, str, tuple, list, NoneType, which value yields false in an if clause?

### Give examples of two (builtin) types of objects that are mutable.

### Illustrate how the * operator can be used to construct a list.

### Illustrate how to access the last 4 characters of a string mystring.

### Create a list of a thousand No's.

# Creating, running and importing a Python module

Let us develop an example module in Python.

<img src="graphics/module.png">

Typically, Python code is structured as shown above: several .py files act as libraries, and a main driver script imports the functions they define. Alternatively, one can simply import these functions from the Python REPL. We will talk more about object-oriented Python later in this course.

## Word count script
We will use a simple word-count example for this section.

In [40]:
from urllib.request import urlopen

with urlopen("http://sixty-north.com/c/t.txt") as story:
    story_words = []
    for line in story:
        line_words = line.decode("utf-8").split()
        for word in line_words:
            story_words.append(word)
            
    print(story_words)

['It', 'was', 'the', 'best', 'of', 'times', 'it', 'was', 'the', 'worst', 'of', 'times', 'it', 'was', 'the', 'age', 'of', 'wisdom', 'it', 'was', 'the', 'age', 'of', 'foolishness', 'it', 'was', 'the', 'epoch', 'of', 'belief', 'it', 'was', 'the', 'epoch', 'of', 'incredulity', 'it', 'was', 'the', 'season', 'of', 'Light', 'it', 'was', 'the', 'season', 'of', 'Darkness', 'it', 'was', 'the', 'spring', 'of', 'hope', 'it', 'was', 'the', 'winter', 'of', 'despair', 'we', 'had', 'everything', 'before', 'us', 'we', 'had', 'nothing', 'before', 'us', 'we', 'were', 'all', 'going', 'direct', 'to', 'Heaven', 'we', 'were', 'all', 'going', 'direct', 'the', 'other', 'way', 'in', 'short', 'the', 'period', 'was', 'so', 'far', 'like', 'the', 'present', 'period', 'that', 'some', 'of', 'its', 'noisiest', 'authorities', 'insisted', 'on', 'its', 'being', 'received', 'for', 'good', 'or', 'for', 'evil', 'in', 'the', 'superlative', 'degree', 'of', 'comparison', 'only']


**Create a file in a text editor of your choice (I will use Vim)** Let us take this standalone function, and put it in a separate library file. You can call it **word_utils.py**. The contents of the file should be as follows:

```python
#!/usr/bin/env python3
# Contents of the word_utils.py file.
# The shebang line above is optional but, if present, it must be the first line of the file.
# On Unix-like systems it tells the shell to run the script with the python3 found on the PATH.

from urllib import request

def fetch_words():
    story = request.urlopen("http://sixty-north.com/c/t.txt")
    story_words = []
    for line in story:
        line_words = line.decode('utf-8').split()
        for word in line_words:
            story_words.append(word)
    return story_words


def print_items(items):
    for item in items:
        print(item)
```   

As you can see, we simply moved the original code into a function and replaced the print call inside the with block by a return statement, adding a separate function to print the items.

## Test your module with the Python REPL or an IPython notebook

Simply try and import the functions defined in the word_utils.py. There are various ways you can do it:

In [42]:
import word_utils
word_utils.fetch_words()

['It',
 'was',
 'the',
 'best',
 'of',
 'times',
 'it',
 'was',
 'the',
 'worst',
 'of',
 'times',
 'it',
 'was',
 'the',
 'age',
 'of',
 'wisdom',
 'it',
 'was',
 'the',
 'age',
 'of',
 'foolishness',
 'it',
 'was',
 'the',
 'epoch',
 'of',
 'belief',
 'it',
 'was',
 'the',
 'epoch',
 'of',
 'incredulity',
 'it',
 'was',
 'the',
 'season',
 'of',
 'Light',
 'it',
 'was',
 'the',
 'season',
 'of',
 'Darkness',
 'it',
 'was',
 'the',
 'spring',
 'of',
 'hope',
 'it',
 'was',
 'the',
 'winter',
 'of',
 'despair',
 'we',
 'had',
 'everything',
 'before',
 'us',
 'we',
 'had',
 'nothing',
 'before',
 'us',
 'we',
 'were',
 'all',
 'going',
 'direct',
 'to',
 'Heaven',
 'we',
 'were',
 'all',
 'going',
 'direct',
 'the',
 'other',
 'way',
 'in',
 'short',
 'the',
 'period',
 'was',
 'so',
 'far',
 'like',
 'the',
 'present',
 'period',
 'that',
 'some',
 'of',
 'its',
 'noisiest',
 'authorities',
 'insisted',
 'on',
 'its',
 'being',
 'received',
 'for',
 'good',
 'or',
 'for',
 'evil',
 'i

We will make one quick change to our code: remove the hardcoded URL from the body of the function, replacing it with a url string parameter. Our functions now look like:

```python
#!/usr/bin/env python3
from urllib import request

def fetch_words(url):
    story = request.urlopen(url)
    story_words = []
    for line in story:
        line_words = line.decode('utf-8').split()
        for word in line_words:
            story_words.append(word)
    return story_words


def print_items(items):
    for item in items:
        print(item)
```  

As the amount of code using your library functions grows, it helps to keep it in a separate script with a main() function. Let us create a new file called **driver.py** containing something like this:

In [44]:
#!/usr/bin/env python
""" Retrieve and print words from a URL.

Usage:

    python3 driver.py
"""
from word_utils import fetch_words, print_items

def main(url):
    """ Print each word from a text document from a URL.

    Args:
        url:  The URL of a UTF-8 text document.
    """
    words = fetch_words(url)
    print_items(words)


if __name__ == '__main__':
    main("http://sixty-north.com/c/t.txt")

It
was
the
best
of
times
it
was
the
worst
of
times
it
was
the
age
of
wisdom
it
was
the
age
of
foolishness
it
was
the
epoch
of
belief
it
was
the
epoch
of
incredulity
it
was
the
season
of
Light
it
was
the
season
of
Darkness
it
was
the
spring
of
hope
it
was
the
winter
of
despair
we
had
everything
before
us
we
had
nothing
before
us
we
were
all
going
direct
to
Heaven
we
were
all
going
direct
the
other
way
in
short
the
period
was
so
far
like
the
present
period
that
some
of
its
noisiest
authorities
insisted
on
its
being
received
for
good
or
for
evil
in
the
superlative
degree
of
comparison
only


## Special attributes

<img src="graphics/name.png">

For a complete list of special attributes do:

In [45]:
import word_utils as wu
dir(wu.fetch_words)

['__annotations__',
 '__call__',
 '__class__',
 '__closure__',
 '__code__',
 '__defaults__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__get__',
 '__getattribute__',
 '__globals__',
 '__gt__',
 '__hash__',
 '__init__',
 '__kwdefaults__',
 '__le__',
 '__lt__',
 '__module__',
 '__name__',
 '__ne__',
 '__new__',
 '__qualname__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']
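
Two of these attributes are used all the time: `__name__` and `__doc__`. The sketch below uses a throwaway function `greet` (not part of word_utils.py) to show what they contain.

```python
def greet(name):
    """Return a short greeting for name."""
    return "Hello, " + name


print(greet.__name__)     # greet
print(greet.__doc__)      # Return a short greeting for name.

# At module level, __name__ is "__main__" when the file is run as a script,
# and the module's own name when the file is imported.
print(__name__)
```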

## Docstrings

A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the `__doc__` special attribute of that object.

All modules should normally have docstrings, and all functions and classes exported by a module should also have docstrings. Public methods (including the `__init__` constructor) should also have docstrings. A package may be documented in the module docstring of the `__init__.py` file in the package directory.

https://www.python.org/dev/peps/pep-0257/

We will go ahead and modify our word_utils.py functions to add docstrings:

```python
#!/usr/bin/env python3

from urllib import request

def fetch_words(url="http://sixty-north.com/c/t.txt"):
    """Fetch a list of words from a URL.

    Args:
        url: The URL of a UTF-8 text document.

    Returns:
        A list of string containing the words from the document.
    """
    story = request.urlopen(url)
    story_words = []
    for line in story:
        line_words = line.decode('utf-8').split()
        for word in line_words:
            story_words.append(word)
    return story_words


def print_items(items):
    """ Print items one per line.

    Args:
        items: An iterable series of printable items.
    """
    for item in items:
        print(item)
```

In [12]:
import word_utils
help(word_utils.fetch_words)

Help on function fetch_words in module word_utils:

fetch_words(url)
    Fetch a list of words from a URL.
    
    Args:
        url: The URL of a UTF-8 text document.
    
    Returns:
        A list of string containing the words from the document.



The last change we will make is to the driver script, allowing the URL to be passed from the command line:

```python
#!/usr/bin/env python
""" Retrieve and print words from a URL.

Usage:

    python3 driver.py <URL>
"""

import sys
from word_utils import fetch_words, print_items


def main(url):
    """ Print each word from a text document from a URL.

    Args:
        url:  The URL of a UTF-8 text document.
    """
    words = fetch_words(url)
    print_items(words)


if __name__ == '__main__':
    main(sys.argv[1])  # The 0th argument is the module filename
```

And try running it from the command line (not from the Python REPL) as:

```bash
python3 driver.py "http://sixty-north.com/c/t.txt"
```
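
Note that `sys.argv[1]` raises an IndexError when the script is started without a URL. One possible way to make the driver more forgiving is sketched below; `parse_url` and `DEFAULT_URL` are hypothetical names introduced for this illustration, not part of the driver above.

```python
import sys

DEFAULT_URL = "http://sixty-north.com/c/t.txt"   # assumption: fall back to the sample text used above


def parse_url(argv):
    """Return the URL given on the command line, or DEFAULT_URL if none was given.

    Args:
        argv: A list shaped like sys.argv (script name first).
    """
    if len(argv) > 2:
        raise SystemExit("Usage: python3 driver.py [URL]")
    return argv[1] if len(argv) == 2 else DEFAULT_URL


print(parse_url(["driver.py"]))
print(parse_url(["driver.py", "http://example.com/story.txt"]))
```

In the driver, the last line would then become `main(parse_url(sys.argv))`.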


<img src="graphics/modularity.png">

<img src="graphics/modularity2.png">

# Variables and functions 


A Python variable is a **reference** to an object. To be more specific, a Python variable contains the
address of an object in memory. 
Each object x has a unique ID id(x), typically its address in memory. Whether or not variables
x and y refer to the same object can be checked with x is y.

What happens in the following:

In [46]:
x = 2
id(x)

4297537984

In [47]:
y = x
id(y)

4297537984

In [49]:
x = 3
id(y)  # y still refers to the object 2, so its id is unchanged

4297537984

is that x first contains the address of 2, then this address is copied into y, and finally the address
of 3 is put into x. The actual object consists of both a reference to its type and its data.

A list a: 

In [58]:
a = [[1,2],[3,4]]
id(a)

4385751688

In [59]:
id(a[0])

4385616072

is itself a sequence of references, each one of which might refer to
a different type of object.

The assignment b = a would make b a reference to the same list as a:

In [62]:
b = a
print(id(b))

print(id(a[0]))
print(id(b[0]))

4385751688
4385616072
4385616072


On the other hand:

In [63]:
b = a[:]
print(id(b))

print(id(a[0]))
print(id(b[0]))

4388958536
4385616072
4385616072


would make b refer to a copy of the list but not of the objects that are referenced by a[0], a[1]. 

<img src="graphics/shallow_copy_list.png">

If these objects are mutable and copies of them are also desired, import copy and set:

In [64]:
import copy

b = copy.deepcopy(a)
print(id(b))

print(id(a[0]))
print(id(b[0]))

4388923272
4385616072
4388916680
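
The ids above tell us which objects are shared; the practical consequence shows up as soon as a nested list is modified. A minimal sketch (the names `shallow` and `deep` are ours):

```python
import copy

a = [[1, 2], [3, 4]]
shallow = a[:]               # new outer list, same inner lists
deep = copy.deepcopy(a)      # new outer list and new inner lists

a[0][0] = 99                 # mutate an inner list through a

print(shallow[0][0])         # 99: the shallow copy shares a[0]
print(deep[0][0])            # 1: the deep copy is unaffected

a.append([5, 6])             # changing the outer list does not touch either copy
print(len(shallow), len(deep))   # 2 2
```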


If **a** is a dictionary, use `b = a.copy()` to make a shallow copy.

To save storage space, the interpreter sometimes stores two immutable objects having the same value
in the same memory. Therefore, the identity operator is should not be used to compare the values of two
immutable objects: instead, use the equality operator ==.
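
The difference between equality and identity is easiest to see with mutable objects, where the interpreter never merges equal values. A short sketch with illustrative names:

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

print(a == b)    # True: same contents
print(a is b)    # False: two distinct list objects
print(a is c)    # True: c is another name for the same object

# "is" is the right tool for singletons such as None.
result = None
print(result is None)   # True
```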

## Functions 

### Call by reference

Consider the code:

In [23]:
def zero (x, y):
    x = 0.
    for i in range(len(y)): y[i] = 0.

a = 1.
b = [2., 3., 5.]
zero(a, b)

When the function **zero** is called, the references in a and b are copied to local variables x and y,
respectively. The function changes x to refer to 0., but there is no change to a. 

The function does not change y. It changes just the references constituting the list, and this same list is referenced by  b as well as y. Upon exit, the local variables become undefined. The intended effect is successful
for b but not a. The modification of **b** is an example of a side effect. 

All Python functions return a value, the default is None.
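
When a side effect is not wanted, a function can build and return a new object instead of modifying its argument. The sketch below contrasts the two styles; `zero_in_place` and `zeroed` are hypothetical names chosen for the illustration.

```python
def zero_in_place(values):
    """Overwrite every element of values with 0.0 (side effect, returns None)."""
    for i in range(len(values)):
        values[i] = 0.


def zeroed(values):
    """Return a new list of zeros, leaving values untouched."""
    return [0. for _ in values]


b = [2., 3., 5.]
fresh = zeroed(b)
print(b, fresh)               # [2.0, 3.0, 5.0] [0.0, 0.0, 0.0]

returned = zero_in_place(b)
print(b, returned)            # [0.0, 0.0, 0.0] None
```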

## Argument lists

Python permits keyword arguments, which are optional labeled arguments. e.g., the matplotlib function call:
**plot(x, y, linewidth=1.0)**
It also permits a variable number of arguments as in min(a, b, c).
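
Both features can be used in our own functions. In the sketch below, `describe` is a made-up function: `*values` collects any number of positional arguments, while `sep` and `prefix` are keyword arguments with default values.

```python
def describe(*values, sep=", ", prefix="values: "):
    """Join any number of positional values into one string.

    sep and prefix are keyword arguments with default values.
    """
    return prefix + sep.join(str(v) for v in values)


print(describe(1, 2, 3))                    # values: 1, 2, 3
print(describe(1, 2, sep=" | "))            # values: 1 | 2
print(describe("a", prefix="", sep="-"))    # a

# A dictionary can supply keyword arguments with **.
options = {"sep": "/", "prefix": "path: "}
print(describe("usr", "bin", **options))    # path: usr/bin
```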

## Passing function names
Functions are objects, so a function name can be passed as an argument to another function. An example is:
**myplot(f, -1., 1.)**
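
Since `myplot` is not defined in this notebook, here is a hedged stand-in with the same calling pattern: a hypothetical `tabulate` function that receives another function `f` and evaluates it on an interval.

```python
import math


def tabulate(f, start, stop, n=5):
    """Return [(x, f(x)), ...] at n evenly spaced points from start to stop."""
    step = (stop - start) / (n - 1)
    return [(start + i * step, f(start + i * step)) for i in range(n)]


# The function is passed by name, without parentheses.
for x, fx in tabulate(math.cos, -1., 1.):
    print(x, fx)

# Any callable works, including built-in functions.
print(tabulate(abs, -1., 1., n=3))    # [(-1.0, 1.0), (0.0, 0.0), (1.0, 1.0)]
```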


## Lambda functions in Python

Python supports the creation of anonymous functions (i.e. functions that are not bound to a name) at runtime, using a construct called "lambda".
Sometimes you need to pass a function as an argument, or you want to do a short but complex operation multiple times. You could define your function the normal way, or you could make a lambda function, a mini-function that returns the result of a single expression. The two definitions below are equivalent:

In [None]:
##traditional named function
def add(a,b): return a+b

##lambda function
add2 = lambda a,b: a+b

The advantage of the lambda function is that it is in itself an expression, and can be used inside another statement. Here's an example using the map function, which calls a function on every element in a list; in Python 3 it returns an iterator over the results, which list() turns into a list:

In [None]:
squares = list(map(lambda a: a*a, [1,2,3,4,5]))
print(squares)


### Scope

<img src="graphics/scope.png">

The parameters of a function have only local scope. In the following code:

In [None]:
def func(x):
    a = 1.
    return x + a + b
a = 2.; b = 3.
func(5.)

the variable *a* is a local variable because it is assigned inside the function, whereas b is a global
variable because it is used in the function without being assigned there. 

An uninitialized variable will be assumed to come from the enclosing static context but not otherwise from its calling function. 

The presumption that a is local can be overridden by declaring it to be global before using it.
The list of names defined in the current scope is available by calling **dir()** with no arguments.
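
The effect of the global declaration can be seen in a small sketch. The names `counter`, `increment_local` and `increment_global` are invented for the illustration, and the code assumes it runs at module level (for example, in a notebook cell).

```python
counter = 0          # module-level (global) name


def increment_local():
    counter = 1      # creates a new local name; the global is untouched
    return counter


def increment_global():
    global counter   # assignments below now target the module-level name
    counter += 1
    return counter


increment_local()
print(counter)       # 0
increment_global()
print(counter)       # 1
```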

<img src="graphics/functions1.png">

<img src="graphics/functions2.png">

<img src="graphics/typing.png">

# Comprehensions, map and filter

## List comprehensions
List comprehensions provide a concise way to create lists. Common applications are to make new lists where each element is the result of some operations applied to each member of another sequence or iterable, or to create a subsequence of those elements that satisfy a certain condition.

For example, assume we want to create a list of squares, like:

In [5]:
squares = []
for x in range(10):
    squares.append(x**2)
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [4]:
#We can obtain the same result with:
squares = [x**2 for x in range(10)]
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [10]:
#This is also equivalent to 
squares = map(lambda x: x**2, range(10))
for x in squares:
    print(x)

0
1
4
9
16
25
36
49
64
81


A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result will be a new list resulting from evaluating the expression in the context of the for and if clauses which follow it. For example, this listcomp combines the elements of two lists if they are not equal:

In [11]:
combs = [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
print(combs)

#and it’s equivalent to:

combs = []
for x in [1,2,3]:
    for y in [3,1,4]:
        if x != y:
            combs.append((x, y))
print(combs)

#Note how the order of the for and if statements is the same in both these snippets.

[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]


## Filtering a list

What if you're more interested in filtering the list? Say you want to remove every element with a value equal to or greater than 4? 

In [12]:
numbers = [1,2,3,4,5]
numbers_under_4 = []
for number in numbers:
    if number < 4:
        numbers_under_4.append(number)
# Now, numbers_under_4 contains [1,2,3]
print("Numbers under 4 only: ", numbers_under_4)


#You could reduce the size of the code with the filter function:
numbers = [1,2,3,4,5]
# In Python 3, filter returns a lazy iterator, so wrap it in list() to see the values
numbers_under_4 = list(filter(lambda x: x < 4, numbers))
# Now, numbers_under_4 contains [1,2,3]
print("Numbers under 4 only: ",numbers_under_4)

Numbers under 4 only:  [1, 2, 3]
Numbers under 4 only:  [1, 2, 3]
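
The same filtering can also be written as a list comprehension with an if clause, which many Python programmers find easier to read than filter with a lambda. A minimal sketch:

```python
numbers = [1, 2, 3, 4, 5]

# The if clause keeps only the elements that pass the test.
numbers_under_4 = [n for n in numbers if n < 4]
print(numbers_under_4)            # [1, 2, 3]

# Filtering and transforming can be combined in one expression.
squares_of_even = [n ** 2 for n in numbers if n % 2 == 0]
print(squares_of_even)            # [4, 16]
```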


## Dictionary comprehensions

You already know that the dict() constructor builds dictionaries directly from sequences of key-value pairs:

In [None]:
dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])

{'sape': 4139, 'jack': 4098, 'guido': 4127}

In addition, dict comprehensions can be used to create dictionaries from arbitrary key and value expressions:

In [13]:
{x: x**2 for x in (2, 4, 6)}

{2: 4, 4: 16, 6: 36}

# Operating system commands

https://docs.python.org/2/library/os.html

Operating system commands can be executed from Python. There are two standard-library modules that are good for that: os and subprocess.

In [14]:
import os
os.listdir(os.getcwd())                         

['.git',
 '.gitignore',
 '.ipynb_checkpoints',
 '0.Intro-and-Setup.ipynb',
 '0.Intro-and-Setup.slides.html',
 '1.BasicPython.ipynb',
 '1.BasicPython.slides.html',
 '2.Stack.ipynb',
 '2.Stack.slides.html',
 '3.Demos.ipynb',
 '3.Demos.slides.html',
 '__pycache__',
 'bool.png',
 'CoffeeTimeSeries.csv',
 'depts.png',
 'driver.py',
 'example.txt',
 'float.png',
 'functions1.png',
 'functions2.png',
 'int.png',
 'LICENSE',
 'liver.jpg',
 'modularity.png',
 'modularity2.png',
 'module.png',
 'name.png',
 'none.png',
 'Part0.ipynb',
 'Part1.ipynb',
 'pickled_human.p',
 'poi2.png',
 'README.md',
 'rel_ops.png',
 'scope.png',
 'sheep.csv',
 'std_lib.png',
 'typing.png',
 'word_utils.py',
 'word_utils.pyc']

In [15]:
import glob
glob.glob('*')

['0.Intro-and-Setup.ipynb',
 '0.Intro-and-Setup.slides.html',
 '1.BasicPython.ipynb',
 '1.BasicPython.slides.html',
 '2.Stack.ipynb',
 '2.Stack.slides.html',
 '3.Demos.ipynb',
 '3.Demos.slides.html',
 '__pycache__',
 'bool.png',
 'CoffeeTimeSeries.csv',
 'depts.png',
 'driver.py',
 'example.txt',
 'float.png',
 'functions1.png',
 'functions2.png',
 'int.png',
 'LICENSE',
 'liver.jpg',
 'modularity.png',
 'modularity2.png',
 'module.png',
 'name.png',
 'none.png',
 'Part0.ipynb',
 'Part1.ipynb',
 'pickled_human.p',
 'poi2.png',
 'README.md',
 'rel_ops.png',
 'scope.png',
 'sheep.csv',
 'std_lib.png',
 'typing.png',
 'word_utils.py',
 'word_utils.pyc']

In [16]:
os.mkdir("mytmp")

In [21]:
os.listdir("mytmp")  

[]

In [22]:
os.rmdir("mytmp")

In [26]:
os.system("touch mynewfile")

0

In [28]:
os.listdir("mynewfile")  # listdir expects a directory, so passing a regular file fails

NotADirectoryError: [Errno 20] Not a directory: 'mynewfile'
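
The subprocess module was mentioned above but not shown. Here is a minimal sketch of `subprocess.run`, assuming Python 3.7 or later (for the `capture_output` and `text` arguments). To stay portable, it starts a second Python interpreter rather than a shell command such as touch.

```python
import subprocess
import sys

# Run a child Python process; sys.executable is the path of the current interpreter.
completed = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,   # collect stdout and stderr instead of printing them
    text=True,             # decode the output to str
    check=True,            # raise CalledProcessError on a non-zero exit status
)

print(completed.returncode)       # 0
print(completed.stdout.strip())   # hello from a child process
```

Passing the command as a list of arguments, rather than as one string handed to a shell, avoids quoting problems.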


# File IO

This is a very broad topic. Many third-party Python libraries provide their own I/O modules.
Let us quickly look at how to read plain text, CSV and JSON files here and briefly talk about serialization and binary file formats.

## Reading text files 

In [61]:
myfile = open("files/example.txt", mode="rt", encoding="utf-8")
myfile.readline()

'This is an example text file.\n'

In [62]:
myfile.readline()

'\n'

In [63]:
myfile.readline()

'It contains three lines.\n'

In [64]:
myfile.seek(0)
myfile.readline()

'This is an example text file.\n'

Read a fixed number of characters:

In [67]:
myfile.seek(0)
myfile.read(20)

'This is an example t'

File objects support the iterator protocol:

In [60]:
myfile = open("files/example.txt", mode="rt", encoding="utf-8")
for line in myfile :
    print(line)

This is an example text file.



It contains three lines.



Maybe four.



Four, definitely four.



In [69]:
myfile = open("files/example.txt", mode="rt", encoding="utf-8")
for line in myfile.readlines():
    print(line)

This is an example text file.



It contains three lines.



Maybe four.



Four, definitely four.
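
The empty lines in the output above appear because every line read from the file still ends with "\n", and print adds a second newline. Below is a small sketch of one way around this. It assumes a temporary file with similar contents, created here only so that the example is self-contained.

```python
import os
import tempfile

text = "This is an example text file.\n\nIt contains three lines.\n"

# Create a temporary stand-in for files/example.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write(text)
    path = tmp.name

lines = []
with open(path, mode="rt", encoding="utf-8") as myfile:   # closed automatically
    for line in myfile:
        stripped = line.rstrip("\n")   # drop the trailing newline kept by the iterator
        lines.append(stripped)
        print(stripped)

os.remove(path)   # clean up the temporary file
```

An alternative is to keep the line as it is and call print(line, end="").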



## Writing to a text file

In [77]:
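import sys
mynewfile = open("files/new_example.txt",mode="wt",encoding="utf-8")
# The writes to the file are commented out in this run, so the file stays empty:
#mynewfile.write("First line")
#mynewfile.write("Second line")
#mynewfile.write("Third line")

# sys.stdout is a file object too; note that write() does not append a newline
sys.stdout.write("First line")
sys.stdout.write("Second line")
sys.stdout.write("Third line")
mynewfile.close()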

First lineSecond lineThird line

In [76]:
mynewfile = open("files/new_example.txt",mode="rt",encoding="utf-8")
for line in mynewfile.readlines():
    print(line)
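
The cell above prints nothing because no lines were actually written to the file. Here is a hedged sketch of the full write/read round trip. It uses a temporary directory instead of the files/ folder, an assumption made to keep the example self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "new_example.txt")

    # write() does not add line endings, so "\n" has to be included explicitly.
    with open(path, mode="wt", encoding="utf-8") as mynewfile:
        mynewfile.write("First line\n")
        mynewfile.write("Second line\n")
        mynewfile.writelines(["Third line\n"])   # writelines() takes an iterable of strings

    with open(path, mode="rt", encoding="utf-8") as mynewfile:
        written = mynewfile.read().splitlines()

print(written)
```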

## CSV files

CSV files are plain text files too; they just have structure. Typically, when you work with CSV files you should be using the third-party Pandas library, which we will talk about later in the course. Here I would just like to point out that CSV files can also be read with the csv module from the standard library:

https://docs.python.org/2/library/csv.html
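
A minimal sketch of reading CSV data with the standard library csv module follows. The column names and values are made up for illustration, and an in-memory io.StringIO object stands in for a real file.

```python
import csv
import io

# Hypothetical CSV content; a real file would be opened with open(path, newline="").
raw = io.StringIO("name,weight\nDolly,45.5\nShaun,38.0\n")

reader = csv.DictReader(raw)          # the first row supplies the keys
rows = []
for row in reader:
    # csv returns every field as a string, so numbers are converted by hand
    rows.append({"name": row["name"], "weight": float(row["weight"])})

print(reader.fieldnames)
print(rows)
```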

## JSON files

Finally, JSON. JSON stands for JavaScript Object Notation and it is a semi-structured data format.

Note also that file objects support the context manager protocol: the "with" statement closes the file for us automatically when the block ends.

In [34]:
import json
from pprint import pprint

with open('files/example.json') as data_file:    
    data = json.load(data_file)

pprint(data)

{'$schema': 'http://json-schema.org/draft-04/schema#',
 'definitions': {'authReq': {'properties': {'realms': {'items': {'type': 'string'},
                                                       'type': 'array'},
                                            'scheme': {'type': 'string'}},
                             'required': ['scheme'],
                             'type': 'object'},
                 'hints': {'description': 'A resource hint',
                           'properties': {'accept-patch': {'$ref': '#/definitions/mediaTypeArray',
                                                           'notes': ['spec '
                                                                     'says '
                                                                     'PATCH '
                                                                     'should '
                                                                     'be '
                                                                  

## Pickle
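
Pickle is Python's own binary serialization format. Here is a minimal sketch of a round trip with the standard library pickle module, using a made-up dictionary and an in-memory byte string rather than a file:

```python
import pickle

human = {"name": "Alice", "age": 30, "organs": ["liver", "heart"]}  # hypothetical data

blob = pickle.dumps(human)        # serialize to bytes
restored = pickle.loads(blob)     # rebuild an equal, independent object

print(type(blob))
print(restored == human, restored is human)
```

pickle.dump and pickle.load do the same with a file opened in binary mode ("wb" / "rb"). Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.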

## HDF5

# Interfacing Python with compiled code, Cython

There are numerous ways to do this:

#### C API to Python and NumPy 

This is a library of C functions and variables that can be used
to create wrapper functions that together with the targeted C code can be compiled into fast
binary Python modules. See: https://docs.python.org/3/extending/extending.html for more information.

#### ctypes module and attribute 

The ctypes module from the Python standard library and the
ctypes attribute of NumPy arrays can be used to create a Python wrapper for an existing
dynamically-loaded library written in C.

#### Cython 

This facilitates the writing of C extensions for Python.

#### weave

This allows the inclusion of C code in Python programs.

#### SWIG 

This automates the process of writing wrappers in C for C functions. SWIG is easy to
use if the argument list is to be limited to builtin Python types but can be cumbersome if
efficient conversion to NumPy arrays is desired. The difficulty is due to the need to match
C array parameters to predefined patterns in the numpy.i interface file.

#### f2py 

This is for interfacing to Fortran.

See http://www.scipy.org/Topical_Software for links to some of these. Presented here is the
use of ctypes. Unlike the use of the C API or SWIG, it permits the interface to be written in
Python.



Let us start by writing some C code. The dot product of two vectors for instance:

```C
double dot_product(double v[], double u[], int n)
{
    double result = 0.0;
    for (int i = 0; i < n; i++)
        result += v[i]*u[i];
    return result;
}
```

Next we compile it, and build a shared object (in the command line, not in the notebook):

```bash
gcc -c -Wall -Werror -fpic my_dot.c 
gcc -shared -o my_dot.so my_dot.o
```

The ctypes module of the Python standard library provides definitions of fundamental data types that can be passed to C programs. For example, assuming we import ctypes as C:


In [37]:
import ctypes as C
# These types have names like C.c_int and C.c_double.
# They can be used as constructors, e.g.,
x = C.c_double(2.71828)
# for which x.value returns the Python object.
print(type(2.71828))
print(type(x))

<class 'float'>
<class 'ctypes.c_double'>


In [39]:
# Fundamental types can be composed to get new types, e.g.,
xp = C.POINTER(C.c_double)()
xp.contents = x
print(xp)
print(x)

<__main__.LP_c_double object at 0x105797b70>
c_double(2.71828)


In [40]:
#or simply xp = C.POINTER(C.c_double)(x) . You can change the value of x using
xp[0] = 3.14159
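
To check that the pointer really refers to x, the following sketch reads the value back after writing through the pointer. It uses ctypes.pointer, a helper from the same module that creates a pointer to an existing ctypes object.

```python
import ctypes as C

x = C.c_double(2.71828)
xp = C.pointer(x)            # pointer instance referring to x

before = x.value
xp[0] = 3.14159              # write through the pointer
after = x.value              # x itself has changed

print(before, after)
print(xp.contents.value)     # dereference the pointer explicitly
```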

In [44]:
# Array types can be created by "multiplying" a ctype by a positive integer, e.g.,
ylist = [1.,2.3,4.,5.]
n = len(ylist)
y = (C.c_double*n)()
y[:] = ylist
#or simply
y = (C.c_double*n)(*ylist)
print(ylist)
print(y[0])

[1.0, 2.3, 4.0, 5.0]
1.0
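
Going in the other direction, a ctypes array can be turned back into Python objects by indexing it with an integer or a slice. A small sketch:

```python
import ctypes as C

ylist = [1., 2.3, 4., 5.]
y = (C.c_double * len(ylist))(*ylist)   # unpack the list into the array constructor

first = y[0]          # indexing with an int gives a Python float
back = y[:]           # slicing gives a Python list
y[1] = 10.0           # elements can be assigned in place

print(first, back, len(y))
print(list(y))
```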


The asterisk is a Python operator for expanding the elements of a sequence into the arguments of a
function. Convert a C array back to a Python value or list by indexing it with an int or a slice.
The ctypes module has a utility subpackage to assist in locating a dynamically-loaded library,
e.g.,

In [48]:
import ctypes.util # an explicit import is necessary
C.util.find_library('my_dot')
# searches the standard locations for a library named my_dot
# (find_library('m') would locate the C math library, for instance).
# For loading a library there are constructors, e.g.,
myDL = C.CDLL('./my_dot.so')
print(myDL)
# which makes myDL a module-like object (a CDLL object to be precise).

<CDLL './my_dot.so', handle 10560dd60 at 0x105094fd0>


Similar to a Python module, myDL has as attributes function-like objects (C function pointers to
be precise) which have the same names as the C functions in the library, e.g., myDL.dot_product. These
function-like objects themselves have an attribute restype, which must be used to declare the type
of the function's result. For a C function whose result type is void, use None.

Here is a full example:

In [8]:
import time
start = time.time()

from ctypes import CDLL, c_int, c_double
mydot = CDLL('./my_dot.so').dot_product  # same path as in the cell above
mydot.restype = c_double                 # declare the C return type once
def dot(vec1, vec2): # vec1, vec2 are lists
    n = len(vec1)
    return mydot((c_double*n)(*vec1), (c_double*n)(*vec2), c_int(n))

vec1 = [x for x in range(100000000)]
vec2 = [x for x in range(100000000)]
print(dot(vec1,vec2))
end = time.time()
print("Elapsed time: ", end-start)

3.333333283334632e+23
Elapsed time:  124.00988602638245


The arguments should be explicitly converted to the appropriate C type. 
The result is automatically converted to a regular Python type, based on the restype attribute.
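
Explicit conversion at every call can be avoided by also declaring argtypes, in which case ctypes converts plain Python numbers itself. The sketch below illustrates this with pow from the C math library rather than my_dot.so, so that it runs without compiling anything. It assumes a Unix-like system on which the C math library can be loaded this way.

```python
import ctypes
import ctypes.util

# Locate and load the C math library (assumption: a Unix-like system).
# If find_library returns None, CDLL(None) exposes the symbols already
# loaded into the running process, which typically include the math library.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

c_pow = libm.pow
c_pow.restype = ctypes.c_double                       # type of the result
c_pow.argtypes = [ctypes.c_double, ctypes.c_double]   # types of the arguments

result = c_pow(2, 10)    # the Python ints are converted to C doubles
print(result, type(result))
```

With argtypes declared, ctypes also rejects arguments that cannot be converted, instead of passing mismatched data to the C function.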

**Warning.** If you use the extension .so for the name of a file, do not make its stem the same as a
.py file in the same directory, e.g., do not have both a funcs.py and a funcs.so. One convention
is to use funcs.py and _funcs.so.


### Repeat the same in Cython

The fundamental nature of Cython can be summed up as follows: Cython is Python with C data types.
As Cython can accept almost any valid Python source file, one of the hardest things in getting started is just figuring out how to compile your extension.

Here is the bare Python implementation of the dot product of two lists/vectors:

In [25]:
start = time.time()

#def frange(x, y, jump):
#    while x < y:
#        yield x
#    x += jump
       
def dot_product(vec1,vec2):
    result = 0.0
    n = len(vec1)
    for i in range(n):
        result += vec1[i]*vec2[i]
    return result

vec1 = [x for x in range(100000000)]
vec2 = [x for x in range(100000000)]
print(dot_product(vec1,vec2))
end = time.time()
print("Elapsed time: ", end-start)

KeyboardInterrupt: 
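
A run of this size takes long enough that it was interrupted here. As a hedged sketch, the same loop can be tried on a much smaller input (the size of 100000 is an arbitrary choice), timing only the computation with time.perf_counter and comparing against the more idiomatic sum over zip:

```python
import time

def dot_product(vec1, vec2):
    result = 0.0
    for i in range(len(vec1)):
        result += vec1[i] * vec2[i]
    return result

n = 100_000                       # arbitrary, much smaller than in the cell above
vec1 = list(range(n))
vec2 = list(range(n))

start = time.perf_counter()       # clock intended for measuring short intervals
loop_result = dot_product(vec1, vec2)
elapsed = time.perf_counter() - start

# Exact integer sum over pairs of elements, converted to float at the end.
zip_result = float(sum(a * b for a, b in zip(vec1, vec2)))

print(loop_result, zip_result)
print("Elapsed time: ", elapsed)
```

Building the lists outside the timed region means that the measurement covers only the dot product itself.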

Let us take the dot_product function and put it in a .pyx file (cython_dot.pyx), adding C type declarations for the local variables:

```python
cimport cython


@cython.boundscheck(False) # Will not check indexing, so ensure indices are valid and non-negative
@cython.wraparound(False) # Will not allow negative indexing
@cython.cdivision(True) # Will not check for division by zero
def dot_product(vec1,vec2):
    cdef float result = 0.0
    cdef unsigned int n = len(vec1)

    for i in range(n):
        result += vec1[i]*vec2[i]

    return result
```

We would need a setup file in addition to that:

```python
from setuptools import setup  # distutils was removed from the standard library in Python 3.12
from Cython.Build import cythonize

setup(
  name = 'my dot',
  ext_modules = cythonize("cython_dot.pyx")
)
```

Save it in a file named cython_setup.py and build:

```bash
python cython_setup.py build_ext --inplace
```

You will now see the .so file appear in your folder.

In [18]:
start = time.time()
from cython_dot import dot_product
vec1 = [x for x in range(100000000)]
vec2 = [x for x in range(100000000)]
print(dot_product(vec1,vec2))
end = time.time()
print("Elapsed time: ", end-start)

2.4287628162912635e+23
Elapsed time:  51.588934898376465


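Note that this result differs from the one obtained with ctypes (3.33e+23). The most likely cause is the cdef float declaration: a C float is single precision, so the accumulator loses accuracy. Declaring result as cdef double avoids this.

How can we make the Cython implementation even faster? Use fewer of Python's generic data structures and more NumPy arrays!
Create a new file, cython_dot2.pyx, that looks like this: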

```python
cimport cython
import numpy as np
cimport numpy as np

DTYPE = np.float64
ctypedef np.float64_t DTYPE_t

@cython.boundscheck(False) # Will not check indexing, so ensure indices are valid and non-negative
@cython.wraparound(False) # Will not allow negative indexing
@cython.cdivision(True) # Will not check for division by zero
def dot_product(np.ndarray[DTYPE_t, ndim=1] vec1, np.ndarray[DTYPE_t, ndim=1] vec2):
    cdef float result = 0.0
    cdef unsigned int i
    cdef unsigned int n = vec1.shape[0]

    for i in range(n):
        result += vec1[i]*vec2[i]

    return result
```   

and change cython_setup.py to look like this:

```python
#!/usr/bin/env python3

from setuptools import setup  # distutils was removed from the standard library in Python 3.12
from Cython.Build import cythonize
import numpy as np

setup(
  name = 'my dot',
  ext_modules = cythonize("cython_dot2.pyx"),
  include_dirs = [np.get_include()]
)
```

Rebuild the shared object, and rerun:

In [26]:
start = time.time()
import numpy as np
from cython_dot2 import dot_product
vec1 = np.arange(100000000,dtype=float)
vec2 = np.arange(100000000,dtype=float)
print(dot_product(vec1,vec2))
end = time.time()
print("Elapsed time: ", end-start)

2.4287628162912635e+23
Elapsed time:  8.972913026809692


In [27]:
# now recompile with fast math and -O3

start = time.time()
import numpy as np
from cython_dot2 import dot_product
vec1 = np.arange(100000000,dtype=float)
vec2 = np.arange(100000000,dtype=float)
print(dot_product(vec1,vec2))
end = time.time()
print("Elapsed time: ", end-start)

2.4287628162912635e+23
Elapsed time:  0.9875469207763672


Finally, try with OpenMP
https://clang-omp.github.io

# Parallel computing with Python

https://docs.python.org/2/library/multiprocessing.html

We have mentioned that OpenMP and shared-memory programming can be used to parallelize Python code (well, Cython).

## Multiprocessing library
The **multiprocessing** package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.

### Pool class

A basic example of data parallelism using the Pool class:

In [83]:
import multiprocessing as mp

def f(x):
    return x*x

if __name__ == '__main__':
    with mp.Pool(5) as p:    # the pool is shut down when the block ends
        print(p.map(f, [1, 2, 3]))
    print(mp.cpu_count())    # number of CPU cores on this machine

[1, 4, 9]
4


### Process class

In multiprocessing, processes are spawned by creating a Process object and then calling its start() method. Process follows the API of threading.Thread. A trivial example of a multiprocess program is:

In [85]:
from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

hello bob


To show the individual process IDs involved, here is an expanded example:



In [90]:
from multiprocessing import Process
import os

def info(title):
    print(title)
    print('module name:', __name__)
    if hasattr(os, 'getppid'):  # only available on Unix
        print('parent process:', os.getppid())
    print('process id:', os.getpid())

def f(name):
    info('function f')
    print('hello', name)

if __name__ == '__main__':
    info('main line')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

main line
module name: __main__
parent process: 29401
process id: 29411
function f
module name: __main__
parent process: 29411
process id: 33018
hello bob


## Joblib library

https://pythonhosted.org/joblib/

Let us go back to our dot product example and assume that we need to perform it on a list of vectors.

In [92]:
start = time.time()
import numpy as np
from cython_dot2 import dot_product

Nvectors = 10
results = list()

for round in range(Nvectors):
    vec1 = np.arange(100000000,dtype=float)
    vec2 = np.arange(100000000,dtype=float)
    results.append(dot_product(vec1,vec2))

print(len(results))    
end = time.time()
print("Elapsed time: ", end-start)

10
Elapsed time:  10.794227838516235


In [101]:
start = time.time()
from joblib import Parallel, delayed  
import multiprocessing
import numpy as np
from cython_dot2 import dot_product

num_cores = 2 #multiprocessing.cpu_count()-1
print("Running on ", num_cores, " CPU cores")

Nvectors = 10
results = list()

def getProduct():
    vec1 = np.arange(100000000,dtype=float)
    vec2 = np.arange(100000000,dtype=float)
    return dot_product(vec1,vec2)

results = Parallel(n_jobs=num_cores)(delayed(getProduct)() for i in range(Nvectors))  

print(len(results))    
end = time.time()
print("Elapsed time: ", end-start)

Running on  2  CPU cores
10
Elapsed time:  5.442961931228638


# Exception handling

https://docs.python.org/2/tutorial/errors.html

There is a difference between syntax errors and exceptions.
Here is an example of a syntax error:

In [28]:
print "Hello world"

SyntaxError: Missing parentheses in call to 'print' (<ipython-input-28-3c090b498326>, line 1)

There is no exception to handle here; a syntax error simply has to be fixed in the code.

Let us consider a very simple example of a function, which takes an argument and converts it to an integer:

In [52]:
def convert(x):
    return int(x)    

Clearly, all numeric types are OK:

In [47]:
convert(43.)

43

In [48]:
convert(-1)

-1

Strings of digits are OK too:

In [50]:
convert("43")

43

Let us try an arbitrary longer string:

In [53]:
convert("Longer string")

ValueError: invalid literal for int() with base 10: 'Longer string'

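To handle such cases, we can catch the exception with a try/except block:

In [43]:
def convert(x):
    try:
        return int(x)
    except (ValueError, TypeError):
        # int() raises ValueError for malformed strings and TypeError for unsupported types
        print("Conversion not possible")
        return -1

In [42]:
convert([2,3,4])

Conversion not possible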


-1
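
Returning a sentinel such as -1 is ambiguous, because -1 is also a valid conversion result. Below is a minimal sketch of an alternative, assuming a hypothetical helper named safe_convert with a default argument. It also shows the optional else and finally clauses of the try statement.

```python
def safe_convert(x, default=None):
    """Return int(x), or default if x cannot be converted (hypothetical helper)."""
    log = []
    try:
        value = int(x)
    except (ValueError, TypeError) as err:
        log.append("failed: " + type(err).__name__)   # which exception was raised
        value = default
    else:
        log.append("ok")          # runs only if no exception was raised
    finally:
        log.append("done")        # runs in both cases
    return value, log

print(safe_convert("43"))
print(safe_convert("Longer string"))
print(safe_convert([2, 3, 4], default=0))
```

The log list is there only to make the order of execution visible; a real helper would simply return the value.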

# Debugging

Debugging is a very important step of Python application development. The easiest and most common way to debug among beginners is by inserting multiple print statements inside the code.

Luckily, Python has a debugger, which is available as a module called **pdb** (stands for “Python DeBugger”). This is a very simple and useful tool to learn if you are writing any Python programs.

Let us look at a simple (though not very meaningful) code below:

In [78]:
# epdb1.py -- experiment with the Python debugger, pdb
a = "aaa"
b = "bbb"
c = "ccc"
final = a + b + c
print(final)

aaabbbccc


Debugging with PDB is as simple as importing the corresponding module:

In [56]:
import pdb

Now find a spot where you would like tracing to begin, and insert the following code:

In [None]:
#pdb.set_trace()

So now our program looks like (we will copy it to the python file and run it from the command line):

In [None]:
# epdb1.py -- experiment with the Python debugger, pdb
import pdb
a = "aaa"
pdb.set_trace()
b = "bbb"
c = "ccc"
final = a + b + c
print(final)

## Exploring the program with "n" command

Start the debugging run by typing:

```bash
python3 epdb1.py
```

Execute the next statement with “n” (next)

At the Pdb prompt, press the lower-case letter “n” (for “next”) on your keyboard, and then press the ENTER key. This will tell pdb to execute the current statement. Keep doing this — pressing “n”, then ENTER.

Eventually you will come to the end of your program, and it will terminate and return you to the normal command prompt.

## Repeating the last debugging command with ENTER

This time, do the same thing as you did before. Start your program running. At the (Pdb) prompt, press the lower-case letter “n” (for “next”) on your keyboard, and then press the ENTER key.

But this time, after the first time that you press “n” and then ENTER, don’t do it any more. Instead, when you see the (Pdb) prompt, just press ENTER. You will notice that pdb continues, just as if you had pressed “n”. 

**If you press ENTER without entering anything, pdb will re-execute the last command that you gave it.**

In this case, the command was “n”, so you could just keep stepping through the program by pressing ENTER.

Notice that as you passed the last line (the line with the “print” statement), it was executed and you saw the output of the print statement (“aaabbbccc”) displayed on your screen.

## Quitting it all with “q” (quit)

The debugger can do all sorts of things, some of which you may find totally mystifying. So the most important thing to learn now — before you learn anything else — is how to quit debugging.

It is easy. When you see the (Pdb) prompt, just press “q” (for “quit”) and the ENTER key. Pdb will quit and you will be back at your command prompt. Try it, and see how it works.

## Printing the value of variables with “p” (print)

The most useful thing you can do at the (Pdb) prompt is to print the value of a variable. Here’s how to do it.

When you see the (Pdb) prompt, enter “p” (for “print”) followed by the name of the variable you want to print. And of course, you end by pressing the ENTER key.

Note that you can print multiple variables by separating their names with commas (just as in a regular Python “print” call). For example, entering p a, b, c prints the values of the variables a, b, and c.
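
The commands introduced so far can also be tried without typing at the prompt. The sketch below is an illustration under stated assumptions rather than the usual way of working with pdb: it feeds the commands n, p and q to a pdb.Pdb instance through an in-memory stream and collects what the debugger prints. The program being debugged is the body of epdb1.py passed as a string.

```python
import io
import pdb

source = 'a = "aaa"\nb = "bbb"\nc = "ccc"\nfinal = a + b + c\n'

# Three "n" commands execute the first three lines, "p" prints the variables,
# and "q" quits before the last line runs.
commands = io.StringIO("n\nn\nn\np a, b, c\nq\n")
transcript = io.StringIO()

debugger = pdb.Pdb(stdin=commands, stdout=transcript, readrc=False)
namespace = {}
debugger.run(source, namespace)    # stops before the first line, then reads the commands

print(transcript.getvalue())
print("final" in namespace)        # the program was quit before line 4 ran
```

In an interactive session the same commands are typed one at a time at the (Pdb) prompt.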


## Seeing where you are with “l” (list)

As you are debugging, there is a lot of stuff being written to the screen, and it gets really hard to get a feeling for where you are in your program. That’s where the “l” (for “list”) command comes in. 

“l” shows you, on the screen, the general area of your program’s source code that you are executing. By default, it lists 11 (eleven) lines of code. The line of code that you are about to execute (the “current line”) is right in the middle, and there is a little arrow “->” that points to it.

So a typical interaction with pdb might go like this:

1. The pdb.set_trace() statement is encountered, and you start tracing with the (Pdb) prompt.
2. You press “n” and then ENTER, to start stepping through your code.
3. You just press ENTER to step again.
4. You just press ENTER to step again.
5. You just press ENTER to step again. etc. etc. etc.
6. Eventually, you realize that you are a bit lost. You’re not exactly sure where you are in your program any more. So…
7. You press “l” and then ENTER. This lists the area of your program that is currently being executed.
8. You inspect the display, get your bearings, and are ready to start again. So…
9. You press “n” and then ENTER, to start stepping through your code.
10. You just press ENTER to step again.
11. You just press ENTER to step again. etc. etc. etc.

## Stepping into subroutines… with “s” (step into)

Eventually, you will need to debug larger programs — programs that use subroutines. And sometimes, the problem that you’re trying to find will lie buried in a subroutine. Consider the following, more involved program:

In [79]:
# epdb2.py -- experiment with the Python debugger, pdb
import pdb

def combine(s1,s2):      # define subroutine combine, which...
    s3 = s1 + s2 + s1    # sandwiches s2 between copies of s1, ...
    s3 = '"' + s3 +'"'   # encloses it in double quotes,...
    return s3            # and returns it.

a = "aaa"
pdb.set_trace()
b = "bbb"
c = "ccc"
final = combine(a,b)
print(final)

--Return--
> <ipython-input-79-508ded407db0>(10)<module>()->None
-> pdb.set_trace()
(Pdb) q


BdbQuit: 

Unlike the "n" command, which steps through your program line by line starting at the line where you inserted the *set_trace()* statement, the "s" command will step into functions and subroutines. Namely, if you press "n" on:
```python
final = combine(a,b)
```

it will just proceed to:

```python
print(final)
```

while "s" will step into the combine() function.

## Continuing to the end of the current subroutine with “r” (return)

When you use “s” to step into subroutines, you will often find yourself trapped in a subroutine. You have examined the code that you’re interested in, but now you have to step through a lot of uninteresting code in the subroutine.

In this situation, what you’d like to be able to do is just to skip ahead to the end of the subroutine. That is, you want to do something like the “c” (“continue”) command does, but you want just to continue to the end of the subroutine, and then resume your stepping through the code.

You can do it. The command to do it is “r” (for “return” or, better, “continue until return”). If you are in a subroutine and you enter the “r” command at the (Pdb) prompt, pdb will continue executing until the end of the subroutine. At that point — the point when it is ready to return to the calling routine — it will stop and show the (Pdb) prompt again, and you can resume stepping through your code.