# Optimizing Code: Common Books
Here's the code your coworker wrote to find the book IDs that appear in both `books_published_last_two_years.txt` and `all_coding_books.txt`, which gives a list of recent coding books.

In [1]:
import time
import pandas as pd
import numpy as np

In [2]:
with open('books_published_last_two_years.txt') as f:
    recent_books = f.read().split('\n')
    
with open('all_coding_books.txt') as f:
    coding_books = f.read().split('\n')

In [3]:
start = time.time()
recent_coding_books = []

for book in recent_books:
    if book in coding_books:
        recent_coding_books.append(book)

print(len(recent_coding_books))
print('Duration: {} seconds'.format(time.time() - start))

96
Duration: 33.59370994567871 seconds


### Tip #1: Use vector operations over loops when possible

Use NumPy's `intersect1d` function to get the intersection of the `recent_books` and `coding_books` lists.

In [4]:
help(np.intersect1d)

Help on function intersect1d in module numpy.lib.arraysetops:

intersect1d(ar1, ar2, assume_unique=False)
    Find the intersection of two arrays.
    
    Return the sorted, unique values that are in both of the input arrays.
    
    Parameters
    ----------
    ar1, ar2 : array_like
        Input arrays.
    assume_unique : bool
        If True, the input arrays are both assumed to be unique, which
        can speed up the calculation.  Default is False.
    
    Returns
    -------
    intersect1d : ndarray
        Sorted 1D array of common and unique elements.
    
    See Also
    --------
    numpy.lib.arraysetops : Module with a number of other functions for
                            performing set operations on arrays.
    
    Examples
    --------
    >>> np.intersect1d([1, 3, 4, 3], [3, 1, 2, 1])
    array([1, 3])
    
    To intersect more than two arrays, use functools.reduce:
    
    >>> from functools import reduce
    >>> reduce(np.intersect1d, ([1, 3, 4, 3], [3, 1

In [7]:
start = time.time()
# Compute the intersection of the two lists with a vectorized NumPy call
recent_coding_books = np.intersect1d(recent_books, coding_books)
print(len(recent_coding_books))
print('Duration: {} seconds'.format(time.time() - start))

96
Duration: 0.03519916534423828 seconds


### Tip #2: Know your data structures and which methods are faster
Use the set's `intersection` method to get the common elements in `recent_books` and `coding_books`.

In [13]:
start = time.time()
# Compute the intersection of the two lists via a set
recent_coding_books = set(recent_books).intersection(coding_books)
print(len(recent_coding_books))
print('Duration: {} seconds'.format(time.time() - start))

96
Duration: 0.013179302215576172 seconds
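
The gap between the loop and the set version comes from how membership is tested: `book in coding_books` on a list scans the list element by element, while a set uses hash lookups. The sketch below illustrates this on synthetic ids. The names `recent_ids` and `coding_ids` and their contents are made up for illustration and are not the contents of the data files, and timings will vary by machine.

```python
import time

# Synthetic ids for illustration only (not the contents of the data files)
recent_ids = [str(i) for i in range(0, 4000, 2)]   # even numbers below 4000
coding_ids = [str(i) for i in range(0, 6000, 3)]   # multiples of 3 below 6000

# List membership: each `in` scans coding_ids element by element
start = time.time()
common_loop = [book for book in recent_ids if book in coding_ids]
loop_seconds = time.time() - start

# Set intersection: membership tests are hash lookups
start = time.time()
common_set = set(recent_ids).intersection(coding_ids)
set_seconds = time.time() - start

print(len(common_loop), len(common_set))
print('Loop: {} seconds'.format(loop_seconds))
print('Set:  {} seconds'.format(set_seconds))
```

Note that the two approaches are only interchangeable when order and duplicates do not matter: the loop keeps the order of `recent_ids` and any repeated ids, whereas a set is unordered and holds each id once.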
