## Writing Efficient Python Code

- How to write clean, fast, and efficient Python code
- How to profile your code for bottlenecks
- How to eliminate bottlenecks and bad design patterns

### Defining _efficient_
- Minimal completion time (fast runtime)
- Minimal resource consumption (small memory footprint)
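
Both criteria can be measured with the standard library. The following is a minimal sketch, assuming `timeit` for runtime and `sys.getsizeof` for memory. The helper names `double_with_loop` and `double_with_comprehension` are made up for illustration, and the timings will differ from machine to machine.

```python
import sys
import timeit

def double_with_loop(values):
    """Double each value using an index-based loop (illustrative helper)."""
    result = []
    for i in range(len(values)):
        result.append(values[i] * 2)
    return result

def double_with_comprehension(values):
    """Double each value using a list comprehension (illustrative helper)."""
    return [v * 2 for v in values]

data = list(range(1000))

# Runtime: total seconds for 200 calls of each version (machine-dependent).
loop_seconds = timeit.timeit(lambda: double_with_loop(data), number=200)
comp_seconds = timeit.timeit(lambda: double_with_comprehension(data), number=200)
print(f"loop: {loop_seconds:.4f}s  comprehension: {comp_seconds:.4f}s")

# Memory: getsizeof reports the size of the list object itself in bytes,
# not the size of the objects it refers to.
data_bytes = sys.getsizeof(data)
print(data_bytes)
```

In a notebook, the `%timeit` magic offers the same kind of measurement with less typing.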

### Defining _Pythonic_
- Focus on readability
- Using Python's constructs as intended

In [36]:
import this

In [7]:
numbers = [1,2,3,4,5]

In [8]:
# Non-Pythonic
doubled_numbers = []        
for i in range(len(numbers)):
    doubled_numbers.append(numbers[i]*2)

In [9]:
# Pythonic
doubled_numbers = [x * 2 for x in numbers]

In [11]:
names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']
# Suppose you wanted to collect the names in the above list that have six letters or more.

In [12]:
# Build new_list using a non-Pythonic approach (a manual index and a while loop), then print it.
i = 0
new_list = []
while i < len(names):
    if len(names[i]) >= 6:
        new_list.append(names[i])
    i += 1
print(new_list)

['Kramer', 'Elaine', 'George', 'Newman']


In [13]:
# A more Pythonic approach loops over the contents of names rather than using an index variable.
better_list = []
for name in names:
    if len(name) >= 6:
        better_list.append(name)
print(better_list)

['Kramer', 'Elaine', 'George', 'Newman']


In [15]:
# The most Pythonic way of doing this is to use a list comprehension.
best_list = [name for name in names if len(name) >= 6]
print(best_list)

['Kramer', 'Elaine', 'George', 'Newman']


### The Python Standard Library
- Built-in types : list, tuple, set, dict, and others
- Built-in functions : print(), len(), range(), round(), enumerate(), map(), zip(), and others
- Built-in modules : os, sys, itertools, collections, math, and others


In [17]:
even_nums = range(2,11,2)
even_nums_list = list(even_nums)
print(even_nums_list)

[2, 4, 6, 8, 10]


In [19]:
letters = ["a", "b"]
indexed_letters = enumerate(letters)  # Lazily yields (index, item) pairs; it is an iterator, not a list.
print(indexed_letters)
print(list(indexed_letters))

<enumerate object at 0x7fd8033b9480>
[(0, 'a'), (1, 'b')]


In [20]:
indexed_letters = enumerate(letters, start = 5)
print(list(indexed_letters))

[(5, 'a'), (6, 'b')]


In [3]:
nums = [1, 2, 3, 4, 5]
doubled_nums = map(lambda x: x * 2, nums)  # Lazily applies a function to each item of an iterable
print(doubled_nums)
print(list(doubled_nums))

<map object at 0x7fb8744b3e20>
[2, 4, 6, 8, 10]
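
The lists above also mention `zip()` and the `collections` and `itertools` modules. A short sketch of how they might be used on the `names` data from earlier follows. The variable names `name_lengths`, `pairs`, `length_counts`, and `first_pairs` are illustrative only.

```python
from collections import Counter
from itertools import combinations

names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']
name_lengths = [len(name) for name in names]

# zip() pairs up items from several iterables lazily, like enumerate() and map().
pairs = list(zip(names, name_lengths))
print(pairs)

# Counter tallies hashable items; here, how often each name length occurs.
length_counts = Counter(name_lengths)
print(length_counts[6])  # 4

# combinations() yields every unordered pair without writing nested loops.
first_pairs = list(combinations(names[:3], 2))
print(first_pairs)
```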


### The Power of NumPy Arrays

In [5]:
# Alternative to Python Lists
nums_list = list(range(5))
print(nums_list)
import numpy as np
nums_np = np.array(range(5))
print(nums_np)

[0, 1, 2, 3, 4]
[0 1 2 3 4]


In [7]:
# NumPy array homogeneity
nums_np_ints = np.array([1, 2, 3])
nums_np_ints.dtype

dtype('int64')

In [8]:
nums_np_floats = np.array([2, 3.5, 4])
nums_np_floats.dtype

dtype('float64')

In [10]:
# NumPy array broadcasting
# Python lists don't support broadcasting
nums = [1, 2, 3, 4, 5]
nums ** 2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

In [12]:
# For loop (inefficient option)
sqrd_nums = []
for num in nums:
    sqrd_nums.append(num**2)
print(sqrd_nums)

[1, 4, 9, 16, 25]


In [13]:
# List comprehension (better option but not best)
sqrd_nums = [num ** 2 for num in nums]
print(sqrd_nums)

[1, 4, 9, 16, 25]


In [14]:
# NumPy array broadcasting for the win!
nums_np = np.array([-2, -1, 0, 1, 2])
nums_np ** 2

array([4, 1, 0, 1, 4])

In [17]:
# Indexing
# Basic 2-D indexing (lists)
nums2 = [[1,2,3],[4,5,6]]    
# Basic 2-D indexing (arrays)
nums2_np = np.array(nums2)
nums2[0][1], nums2_np[0,1]

(2, 2)

In [18]:
[row[0] for row in nums2]

[1, 4]

In [20]:
nums2_np[:,0]

array([1, 4])

In [25]:
# Boolean indexing
# No boolean indexing for lists
nums2_np > 0

array([[ True,  True,  True],
       [ True,  True,  True]])

In [26]:
nums2_np[nums2_np > 2]

array([3, 4, 5, 6])

### Tricks
- Use multiple assignments.
- Do not use global variables if it is not necessary.
- Concatenate strings with join.
- Use generators.
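- Use `while 1` for infinite loops (a trick that only helps on Python 2)
- Use speed-up applications (tools that accelerate Python code, such as PyPy or Cython)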
- Use special libraries to process large datasets

In [26]:
# Use multiple assignments
a, b, c, d = 2, 3, 5, 7

In [29]:
# Do not use global variables if it is not necessary
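
A minimal sketch of this tip follows. In CPython, names local to a function are resolved faster than module-level globals, and a function that receives its inputs as parameters is also easier to test and reuse. The names `RATE`, `total_with_global`, and `total_with_local` are hypothetical and exist only for this illustration.

```python
RATE = 3  # module-level (global) variable

def total_with_global(values):
    """Relies on the global RATE, which makes the function harder to reuse."""
    total = 0
    for value in values:
        total += value * RATE
    return total

def total_with_local(values, rate):
    """Receives rate as a parameter, so every name used in the loop is local."""
    total = 0
    for value in values:
        total += value * rate
    return total

print(total_with_global([1, 2, 3]))    # 18
print(total_with_local([1, 2, 3], 3))  # 18
```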

In [32]:
# Concatenate strings with join
concatenated_string = " ".join(["Programming", "is", "fun"])
print(concatenated_string)

Programming is fun
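
For comparison, here is a sketch that builds the same sentence with repeated `+=` and with `join`. Because strings are immutable, repeated concatenation may create a new string object on each step, while `join` assembles the result in one call. The variable names are illustrative.

```python
words = ["Programming", "is", "fun"]

# Repeated += may create a new string object on every iteration.
sentence_plus = ""
for word in words:
    sentence_plus += word + " "
sentence_plus = sentence_plus.rstrip()  # drop the trailing space

# join() assembles the result in a single call over the whole sequence.
sentence_join = " ".join(words)

print(sentence_plus == sentence_join)  # True
```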


In [50]:
# Use generators. If you have a large amount of data and need each item only once, one at a time, a generator avoids building the whole list in memory.

"""It is fairly simple to create a generator in Python. It is as easy as defining a normal function, but with a yield statement instead of a return statement.

If a function contains at least one yield statement (it may contain other yield or return statements), it becomes a generator function. Both yield and return will return some value from a function.

The difference is that while a return statement terminates a function entirely, yield statement pauses the function saving all its states and later continues from there on successive calls."""

# A simple generator function
def my_gen():
    n = 1
    print('This is printed first')
    # Generator function contains yield statements
    yield n

    n += 1
    print('This is printed second')
    yield n

    n += 1
    print('This is printed at last')
    yield n



In [51]:
# It returns an object but does not start execution immediately.
a= my_gen()
a

<generator object my_gen at 0x7fd803368ac0>

In [52]:
# We can iterate through the items using next().
next(a)

This is printed first


1

In [53]:
# Once the function yields, the function is paused and control is transferred to the caller.
# Local variables and their states are remembered between successive calls.
next(a)

This is printed second


2

In [54]:
next(a)

This is printed at last


3

In [55]:
# Finally, when the function terminates, StopIteration is raised automatically on further calls.

next(a)

StopIteration: 

In [57]:
"""One interesting thing to note in the above example is that the value of variable n is remembered between each call.

Unlike normal functions, the local variables are not destroyed when the function yields. Furthermore, the generator object can be iterated only once.

To restart the process we need to create another generator object using something like a = my_gen()."""

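
The single-pass behavior can be shown in a few lines. This is a small sketch using a hypothetical generator function `count_up_to`, which is not part of the notebook above.

```python
def count_up_to(limit):
    """Yield the integers 1..limit (hypothetical helper for illustration)."""
    n = 1
    while n <= limit:
        yield n
        n += 1

gen = count_up_to(3)
first_pass = list(gen)             # consumes the generator
second_pass = list(gen)            # already exhausted, nothing left to yield
fresh_pass = list(count_up_to(3))  # a new generator object starts over

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
print(fresh_pass)   # [1, 2, 3]
```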

In [62]:
# Python Generators with a Loop
# A for loop takes an iterator and advances it with the next() function. It ends automatically when StopIteration is raised.
def rev_str(my_str):
    length = len(my_str)
    for i in range(length - 1, -1, -1):
        yield my_str[i]
# For loop to reverse the string
for char in rev_str("hello"):
    print(char)

o
l
l
e
h


In [63]:
# The major difference between a list comprehension and a generator expression is that a list comprehension produces the entire list while the generator expression produces one item at a time.
# Initialize the list
my_list = [1, 3, 6, 10]

# square each term using list comprehension
list_ = [x**2 for x in my_list]

# the same thing can be done using a generator expression
# generator expressions are surrounded by parentheses ()
generator = (x**2 for x in my_list)

print(list_)
print(generator)

[1, 9, 36, 100]
<generator object <genexpr> at 0x7fd803343890>


In [75]:
for i in my_list:
    print(i)

1
3
6
10


In [78]:
for i in (x**2 for x in my_list):
    print(i)

1
9
36
100


In [79]:
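# list(generator) is empty here, which indicates the generator was already consumed earlier in the session; a generator can be iterated only once.
print("my_list:", my_list, "generator:", list(generator))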

my_list: [1, 3, 6, 10] generator: []


In [88]:
# Generators can be implemented more clearly and concisely than their iterator class counterparts.
# The following example implements a sequence of powers of 2 using an iterator class.
class PowTwo:
    def __init__(self, max=0):
        self.n = 0
        self.max = max

    def __iter__(self):
        return self

    def __next__(self):
        if self.n > self.max:
            raise StopIteration

        result = 2 ** self.n
        self.n += 1
        return result

In [96]:
# The class above is lengthy. Now, let's do the same using a generator function.
def PowTwoGen(max=0):
    n = 0
    # Use <= so the last exponent is included, matching PowTwo above
    while n <= max:
        yield 2 ** n
        n += 1

In [97]:
print(list(PowTwo(5)))

[1, 2, 4, 8, 16, 32]


In [98]:
print(list(PowTwoGen(5)))

[1, 2, 4, 8, 16, 32]


In [99]:
"""A normal function to return a sequence will create the entire sequence in memory before returning the result. This is an overkill, if the number of items in the sequence is very large.

Generator implementation of such sequences is memory friendly and is preferred since it only produces one item at a time."""

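
The memory difference can be observed with `sys.getsizeof`. This is a rough sketch: `getsizeof` reports only the size of the container object, and the exact byte counts depend on the Python build, so no specific numbers are claimed here.

```python
import sys

n_items = 100_000

squares_list = [x ** 2 for x in range(n_items)]  # all items stored at once
squares_gen = (x ** 2 for x in range(n_items))   # items produced on demand

# The list object grows with n_items; the generator object stays small
# because it holds only its current state.
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)
print(list_bytes, gen_bytes)

# Both produce the same values when consumed.
same_total = sum(squares_list) == sum(squares_gen)
print(same_total)  # True
```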

In [100]:
# Represent Infinite Stream
"""Generators are excellent mediums to represent an infinite stream of data. 
Infinite streams cannot be stored in memory, and since generators produce only one item at a time, they can represent an infinite stream of data."""
def all_even():
    n = 0
    while True:
        yield n
        n += 2  # step by 2 so that only even numbers are produced
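
An infinite generator is only useful if the caller takes a bounded number of items from it. One way to do that, sketched here with `itertools.islice` and a hypothetical generator named `even_numbers`:

```python
from itertools import islice

def even_numbers():
    """Yield 0, 2, 4, ... without end (hypothetical helper for illustration)."""
    n = 0
    while True:
        yield n
        n += 2

# islice stops after five items, so the infinite loop is never run to exhaustion.
first_five = list(islice(even_numbers(), 5))
print(first_five)  # [0, 2, 4, 6, 8]
```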

In [1]:
# Pipelining generators
def fibonacci_numbers(nums):
    x, y = 0, 1
    for _ in range(nums):
        x, y = y, x+y
        yield x

def square(nums):
    for num in nums:
        yield num**2

print(sum(square(fibonacci_numbers(10))))

4895


In [2]:
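# Use 1 for infinite loops. On Python 2, `while 1` avoided a name lookup for True and ran slightly faster than `while True`. On Python 3, True is a keyword, so the two forms should perform the same and `while True` is the more readable choice.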
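
Whether `while 1` differs from `while True` depends on the interpreter version, so it is worth checking on your own setup. In Python 3, `True` is a keyword, which lets the compiler treat both conditions as constants. The sketch below uses the standard `dis` module to inspect the bytecode. The function names `first_with_true` and `first_with_one` are made up for illustration, and the comparison result is not guaranteed across interpreters.

```python
import dis

def first_with_true(items):
    """Return the first item using a `while True` loop."""
    iterator = iter(items)
    while True:
        return next(iterator)

def first_with_one(items):
    """Return the first item using a `while 1` loop."""
    iterator = iter(items)
    while 1:
        return next(iterator)

# Print the compiled bytecode of both functions to compare them by eye.
dis.dis(first_with_true)
dis.dis(first_with_one)

# Compare the raw bytecode directly; the result depends on the interpreter.
same_bytecode = first_with_true.__code__.co_code == first_with_one.__code__.co_code
print(same_bytecode)
```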

In [None]:
# Use special libraries to process large datasets. Compiled C/C++ code typically runs faster than equivalent pure-Python code.
# For that reason, many packages implement their core routines in C/C++ and expose them to your Python programs.
# NumPy, SciPy, and pandas are three popular examples for processing large datasets.