In [1]:
import time
import numpy as np
import pandas as pd

In this notebook we will start exploring some key functions in Python that enable more efficient (faster and cleaner) code.

We will cover the following topics:
1. Eliminating loops: list and dictionary comprehensions
2. Combining objects
3. Counting objects
4. Iterating over objects


### 1. Eliminating Loops
Extraneous loops can be inefficient and costly. Python supports all of the looping patterns listed below, but we should use them with care: because most loops are evaluated piece by piece, they are often slower than the alternatives. Let's explore some tools that can help us eliminate the need for explicit loops in our code.

Python comes with a few looping patterns that can be used when we want to iterate over an object's contents:

- For loops iterate over elements of a sequence piece-by-piece.
- While loops execute a loop repeatedly as long as some Boolean condition is met.
- Nested loops use multiple loops inside one another.
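
As a minimal illustration of the three patterns above (the data and variable names are made up for this sketch and are not part of the original notebook):

```python
# Hypothetical data, used only to illustrate the three looping patterns.
values = [3, 1, 4]

# For loop: visit each element of a sequence in turn.
doubled = []
for v in values:
    doubled.append(v * 2)

# While loop: repeat as long as a Boolean condition holds.
countdown = []
n = 3
while n > 0:
    countdown.append(n)
    n -= 1

# Nested loops: one loop inside another.
pairs = []
for a in values:
    for b in values:
        pairs.append((a, b))

print(doubled)     # [6, 2, 8]
print(countdown)   # [3, 2, 1]
print(len(pairs))  # 9
```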


####  Row sums
Given a matrix, compute the sum of each row.

In [5]:
random_data = np.random.randint(100, size=(3,4))
random_data

array([[74, 16, 37, 16],
       [54, 28, 57, 81],
       [82, 43, 74,  3]])

#### Using For Loops

In [7]:
%%timeit totals = []
## row sums (totals = [] on the line above is timeit setup code and is not timed)
# start = time.time()
for row in random_data:
    totals.append(sum(row))
# end = time.time()
# total_time = end-start
# print(f'Run time: {total_time} s')

3.93 µs ± 479 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


#### Using List comprehensions

In [8]:
%timeit totals_comp = [sum(row) for row in random_data]

3.09 µs ± 320 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
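
Because the data is a NumPy array, the row sums can also be computed without any Python-level loop. The sketch below is an addition to the original notebook: it uses a fixed copy of the matrix shown earlier (the name `data` is an assumption) so that the result is reproducible, and it is not timed here. For large arrays this vectorized form is usually faster than looping, though for a 3x4 array the difference may be negligible.

```python
import numpy as np

# Fixed example matrix (the notebook's random_data is generated randomly).
data = np.array([[74, 16, 37, 16],
                 [54, 28, 57, 81],
                 [82, 43, 74, 3]])

# axis=1 sums across the columns of each row, giving one total per row.
row_totals = data.sum(axis=1)
print(row_totals)  # [143 220 202]
```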


#### Dictionary Comprehensions

In [39]:
# Lists to represent keys and values
keys = ['Ahmed', 'Youssef', 'Mohammed']
values = [25, 27, 40]
 
# Build the dictionary with a dict comprehension over the zipped (key, value) pairs
myDict = {k: v for k, v in zip(keys, values)}
myDict

{'Ahmed': 25, 'Youssef': 27, 'Mohammed': 40}
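
A dict comprehension becomes most useful when entries need to be filtered or transformed while the dictionary is built. The following sketch reuses the same lists; the threshold of 26 and the names `over_26` and `same_dict` are arbitrary choices for illustration.

```python
keys = ['Ahmed', 'Youssef', 'Mohammed']
values = [25, 27, 40]

# Keep only the entries whose value passes a condition.
over_26 = {k: v for k, v in zip(keys, values) if v > 26}
print(over_26)  # {'Youssef': 27, 'Mohammed': 40}

# With no filtering or transformation, dict(zip(...)) builds the same mapping.
same_dict = dict(zip(keys, values))
print(same_dict)  # {'Ahmed': 25, 'Youssef': 27, 'Mohammed': 40}
```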

### 2. Combining Objects

Combining, counting, and iterating over objects can be done very efficiently in Python. Imagine we have two lists: one holds names and the other holds the age of each person. We want to combine these lists so that each name is stored next to its age. We can use either of the following to obtain the same result:

1. enumerate
2. zip (more elegant)

#### Using Enumerate:
We can iterate over the names list using enumerate and grab each name's corresponding age using the index variable.

In [15]:
%%timeit -r10 -n100
# combining objects 
names = ['Ahmed', 'Youssef', 'Mohammed']
age = [25, 27, 40]
combined = []

for i,name in enumerate(names):
    combined.append((name, age[i]))
# print(combined)

875 ns ± 116 ns per loop (mean ± std. dev. of 10 runs, 100 loops each)


#### Using zip (built-in Python function)
The name “zip” describes how this function combines objects like a zipper on a jacket (making two separate things become one). zip returns a lazy zip object, so it must be unpacked into a list (or otherwise iterated over) before its contents can be printed. Each item is a tuple of elements from the original lists.

In [10]:
# Combining objects with zip
combined_zip = zip(names, age)
print(type(combined_zip))

<class 'zip'>


In [11]:
combined_zip_list = [*combined_zip]
print(combined_zip_list)

[('Ahmed', 25), ('Youssef', 27), ('Mohammed', 40)]
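
Two properties of zip objects are easy to miss: they can be consumed only once, and they stop at the shortest input. The sketch below uses a deliberately shortened `age` list (hypothetical data, not from the notebook) to show both.

```python
names = ['Ahmed', 'Youssef', 'Mohammed']
age = [25, 27]  # deliberately one element shorter than names

pairs = zip(names, age)
first_pass = list(pairs)   # consumes the iterator
second_pass = list(pairs)  # nothing is left to yield

print(first_pass)   # [('Ahmed', 25), ('Youssef', 27)]
print(second_pass)  # []
```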


In [17]:
%%timeit -r10 -n100
names = ['Ahmed', 'Youssef', 'Mohammed']
age = [25, 27, 40]
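# Combining objects with zip
# Note: this times only the creation of the lazy zip object;
# no tuples are built until it is consumed (e.g. with list(...))
combined_zip = zip(names, age)
# print(type(combined_zip))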

271 ns ± 10.7 ns per loop (mean ± std. dev. of 10 runs, 100 loops each)
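
The timed cell above creates the zip object but never consumes it, whereas the enumerate cell builds the full list, so the two timings do not measure the same work. A like-for-like comparison could be sketched as follows, using the standard `timeit` module instead of the cell magic. The function names are hypothetical, and no timings are quoted because they depend on the machine.

```python
import timeit

names = ['Ahmed', 'Youssef', 'Mohammed']
age = [25, 27, 40]

def with_enumerate():
    combined = []
    for i, name in enumerate(names):
        combined.append((name, age[i]))
    return combined

def with_zip():
    # list(...) forces the lazy zip object to produce every pair.
    return list(zip(names, age))

# Both approaches must build the same list for the timings to be comparable.
assert with_enumerate() == with_zip()

t_enum = timeit.timeit(with_enumerate, number=10_000)
t_zip = timeit.timeit(with_zip, number=10_000)
print(f"enumerate: {t_enum:.4f} s, zip: {t_zip:.4f} s")
```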


### 3. Counting Objects
The [collections](https://www.geeksforgeeks.org/python-collections-module/) module contains specialized datatypes that can be used as alternatives to standard dictionaries, lists, sets, and tuples. A few notable specialized datatypes are:

- namedtuple: tuple subclasses with named fields
- deque: list-like container with fast appends and pops
- Counter: dict for counting hashable objects
- OrderedDict: dict that retains the order of entries
- defaultdict: dict that calls a factory function to supply missing values
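
A brief sketch of four of these datatypes in use (Counter is covered in the cells that follow). The example values and variable names are made up for illustration.

```python
from collections import namedtuple, deque, OrderedDict, defaultdict

# namedtuple: fields are accessible by name as well as by position.
Person = namedtuple('Person', ['name', 'age'])
p = Person('Ahmed', 25)
print(p.name, p[1])  # Ahmed 25

# deque: efficient appends and pops at both ends.
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.pop()
print(queue)  # deque([0, 1, 2])

# defaultdict: missing keys are created by calling the factory (here, list).
groups = defaultdict(list)
groups['setosa'].append(5.1)
print(dict(groups))  # {'setosa': [5.1]}

# OrderedDict: remembers insertion order and can reorder entries.
od = OrderedDict(a=1, b=2)
od.move_to_end('a')
print(list(od))  # ['b', 'a']
```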

The Counter class from the collections module provides a more efficient way of counting items. To illustrate this, we will use the [Iris dataset](https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html).

In [23]:
# IMPORT THE IRIS DATA FROM THE 
# SKLEARN MODULE
from sklearn.datasets import load_iris
  
# LOAD THE IRIS DATASET BY CALLING
# THE FUNCTION
iris_data = load_iris()
  
# PLACE THE IRIS DATA IN A PANDAS
# DATAFRAME
df = pd.DataFrame(data=iris_data.data, 
                  columns=iris_data.feature_names)
df['target'] = iris_data.target
# Replace the numeric target values with species names
old_names = [0, 1, 2]
new_names = ['setosa', 'versicolor', 'virginica']
df['target'] = df['target'].replace(dict(zip(old_names, new_names)))
  
# DISPLAY FIRST 5 RECORDS OF THE 
# DATAFRAME
df.head()

| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | target |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |


#### Using loops

In [32]:
%%timeit -r10 -n1000
# Counting number of setosas,versicolors,virginicas with for loop
iris_types = df['target']
type_counts = {}
for iris_type in iris_types:
    if iris_type not in type_counts:
        type_counts[iris_type] = 1
    else:
        type_counts[iris_type] += 1
# print(type_counts)

20.3 µs ± 2.31 µs per loop (mean ± std. dev. of 10 runs, 1000 loops each)


#### Using Counter 

In [31]:
%%timeit -r10 -n1000
iris_types = df['target']
# counting with collections.Counter()
from collections import Counter
type_counts = Counter(iris_types)
# print(type_counts)

13.9 µs ± 2.04 µs per loop (mean ± std. dev. of 10 runs, 1000 loops each)
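
Beyond raw counting, a Counter offers a few conveniences that a plain dict does not. The sketch below uses a short hypothetical list of labels in place of `df['target']` so that it runs without scikit-learn.

```python
from collections import Counter

# Hypothetical labels standing in for df['target'].
labels = ['setosa', 'virginica', 'setosa', 'versicolor', 'setosa', 'virginica']
counts = Counter(labels)

print(counts['setosa'])       # 3
print(counts['unknown'])      # 0 (missing keys count as zero, no KeyError)
print(counts.most_common(2))  # [('setosa', 3), ('virginica', 2)]
```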


### 4. Object Iterations
The [itertools](https://docs.python.org/3/library/itertools.html) module contains a number of functional tools for working with iterators. Some of these tools include:

- Infinite iterators: count, cycle, repeat
- Finite iterators: accumulate, chain, zip_longest, etc.
- Combination generators: product, permutations, combinations
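
The infinite and finite iterators listed above can be sketched as follows (the combination generators are shown in the cells below). The inputs and variable names are arbitrary examples. Infinite iterators never stop on their own, so `islice` is used here to take only the first few items.

```python
from itertools import count, cycle, repeat, accumulate, chain, zip_longest, islice

# Infinite iterators must be limited, for example with islice.
counted = list(islice(count(10, 5), 3))
cycled = list(islice(cycle('ab'), 5))
repeated = list(repeat('x', 3))  # repeat is finite when a count is given
print(counted)   # [10, 15, 20]
print(cycled)    # ['a', 'b', 'a', 'b', 'a']
print(repeated)  # ['x', 'x', 'x']

# Finite iterators.
running_totals = list(accumulate([1, 2, 3, 4]))
chained = list(chain([1, 2], [3]))
padded = list(zip_longest('ab', [1], fillvalue=None))
print(running_totals)  # [1, 3, 6, 10]
print(chained)         # [1, 2, 3]
print(padded)          # [('a', 1), ('b', None)]
```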

In [33]:
#Combinations with loop
iris_types = ['setosa' ,'versicolor','virginica']
combos = []
for x in iris_types:
    for y in iris_types:
        if x == y:
            continue
        if (x, y) not in combos and (y, x) not in combos:
            combos.append((x,y))
print(combos)

[('setosa', 'versicolor'), ('setosa', 'virginica'), ('versicolor', 'virginica')]


In [34]:
# combinations with itertools
iris_types = ['setosa' ,'versicolor','virginica']
from itertools import combinations
combos_obj = combinations(iris_types, 2)
print(type(combos_obj))

combos = [*combos_obj]
print(combos)

<class 'itertools.combinations'>
[('setosa', 'versicolor'), ('setosa', 'virginica'), ('versicolor', 'virginica')]


In [37]:
# using product with itertools
iris_types = ['setosa' ,'versicolor','virginica']
from itertools import product
prods_obj = product(iris_types, repeat = 2)
print(type(prods_obj))

prods = [*prods_obj]
print(prods)

<class 'itertools.product'>
[('setosa', 'setosa'), ('setosa', 'versicolor'), ('setosa', 'virginica'), ('versicolor', 'setosa'), ('versicolor', 'versicolor'), ('versicolor', 'virginica'), ('virginica', 'setosa'), ('virginica', 'versicolor'), ('virginica', 'virginica')]
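
The third combination generator mentioned earlier, permutations, is not shown in the cells above. As a brief sketch: it treats order as significant, so each unordered pair appears twice, but unlike product it never pairs an item with itself.

```python
from itertools import permutations

iris_types = ['setosa', 'versicolor', 'virginica']
perms = list(permutations(iris_types, 2))
print(len(perms))  # 6
print(perms[:2])   # [('setosa', 'versicolor'), ('setosa', 'virginica')]
```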
