# Optimizing Code: Common Books
Here's the code your coworker wrote to find the book IDs common to `books-published-last-two-years.txt` and `all-coding-books.txt`, which gives a list of recent coding books.

In [1]:
import time
import pandas as pd
import numpy as np

In [3]:
# Read each file into a plain Python list of ID strings, one per line
with open('books-published-last-two-years.txt') as f:
    recent_books1 = f.read().split('\n')

with open('all-coding-books.txt') as f:
    coding_books1 = f.read().split('\n')

In [4]:
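# Each file holds one ID per line, so the default separator already yields a single column.
# sep='\n' is dropped here because newer pandas releases reject it as a separator.
recent_books = pd.read_csv('books-published-last-two-years.txt', header=None, names=['IDs'])
coding_books = pd.read_csv('all-coding-books.txt', header=None, names=['IDs'])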

In [5]:
coding_books.head()

       IDs
0  4140074
1  3058732
2  4181244
3  8709089
4  9097893


In [6]:
start = time.time()
recent_coding_books = []

for book in recent_books1:
    if book in coding_books1:
        recent_coding_books.append(book)

print(len(recent_coding_books))
print('Duration: {} seconds'.format(time.time() - start))

96
Duration: 11.288668870925903 seconds
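
The loop is slow because `book in coding_books1` scans the whole list for every book, so the work grows roughly with the product of the two list lengths. The sketch below reproduces that pattern on synthetic data to make the growth visible. The helper names (`common_items_loop`, `time_call`) and the generated IDs are assumptions for illustration only; they are not part of the original notebook, and the timings will vary by machine.

```python
import time


def common_items_loop(recent, coding):
    """Same approach as the loop above: list membership inside a loop."""
    common = []
    for book in recent:
        if book in coding:  # linear scan of `coding` for every `book`
            common.append(book)
    return common


def time_call(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Hypothetical synthetic IDs: even numbers vs. multiples of three.
for n in (500, 1_000, 2_000):
    recent = [str(i) for i in range(0, 2 * n, 2)]
    coding = [str(i) for i in range(0, 3 * n, 3)]
    common, elapsed = time_call(common_items_loop, recent, coding)
    print(n, len(common), f'{elapsed:.4f} s')
```

Each doubling of `n` doubles both lists, so the elapsed time should grow by roughly a factor of four, which is the behavior the two tips below avoid.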


### Tip #1: Use vector operations over loops when possible

Use NumPy's `intersect1d` function to get the intersection of the book IDs in `recent_books` and `coding_books`.

In [7]:
start = time.time()
recent_coding_books = np.intersect1d(recent_books, coding_books)  # sorted, unique IDs present in both
print(len(recent_coding_books))
print('Duration: {} seconds'.format(time.time() - start))

96
Duration: 0.011968851089477539 seconds
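
As a minimal, self-contained sketch of what `np.intersect1d` returns, the example below uses a handful of made-up IDs in place of the real files (the arrays `recent_ids` and `coding_ids` are assumptions for illustration). Note that the result is sorted and deduplicated, so the original order and any repeated IDs are not preserved.

```python
import numpy as np

# Hypothetical stand-in data; the real IDs come from the two text files.
recent_ids = np.array([4140074, 1234567, 3058732, 7654321, 3058732])
coding_ids = np.array([3058732, 4181244, 4140074, 8709089])

# intersect1d returns the sorted, unique values found in both inputs.
common = np.intersect1d(recent_ids, coding_ids)
print(common)       # [3058732 4140074]
print(len(common))  # 2
```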


### Tip #2: Know your data structures and which methods are faster
Use the set's `intersection` method to get the common elements in `recent_books` and `coding_books`.

In [11]:
start = time.time()
recent_coding_books2 = set(recent_books.IDs).intersection(coding_books.IDs)  # IDs present in both
print(len(recent_coding_books2))
print('Duration: {} seconds'.format(time.time() - start))

96
Duration: 0.02122783660888672 seconds
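
Sets are fast here because membership checks use hashing rather than scanning every element. The sketch below shows two ways to apply that with made-up IDs (the lists `recent_ids` and `coding_ids` are assumptions for illustration): a direct `set.intersection`, and a variant that keeps the original loop structure but looks each book up in a set, which preserves the order of the first list.

```python
# Hypothetical stand-in data; the real IDs come from the two text files.
recent_ids = ['4140074', '1234567', '3058732']
coding_ids = ['3058732', '4181244', '4140074']

# Option 1: set intersection. The result is an unordered set.
common = set(recent_ids).intersection(coding_ids)
print(len(common))  # 2

# Option 2: keep the loop, but test membership against a set.
# This preserves the order in which IDs appear in recent_ids.
coding_set = set(coding_ids)
common_in_order = [book for book in recent_ids if book in coding_set]
print(common_in_order)  # ['4140074', '3058732']
```

The second form is worth considering when the order of the recent books matters or when duplicates in the first list should be kept.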
