# Optimizing Code: Common Books

* **Objective:** Learn how to replace slow `for` loops with NumPy array operations and Python's built-in data structures. Both make the code simpler and faster to run.
* **Dataset:** .txt files available in the Data Science Nanodegree from Udacity

<hr />


# Table of contents
* [1) Import libraries](#import)
* [2) Read dataset](#dataset)
* [3) Find common books](#books)
    * [3.1) Long solution](#books1)
    * [3.2) Vector operation solution](#books2)
    * [3.3) Data structure solution](#books3)

## 1) Import libraries <a class="anchor" id="import"></a>

In [1]:
import time
import pandas as pd
import numpy as np

## 2) Read dataset <a class="anchor" id="dataset"></a>

In [4]:
with open("./dataset/books_published_last_two_years.txt") as f:
    recent_books = f.read().split("\n")
    
with open("./dataset/all_coding_books.txt") as f:
    coding_books = f.read().split("\n")

## 3) Find common books <a class="anchor" id="books"></a>

The goal is to find the book IDs that appear in both `books_published_last_two_years.txt` and `all_coding_books.txt`, which gives a list of recent coding books. The first solution below is the code your coworker wrote.

### 3.1) Long solution <a class="anchor" id="books1"></a>

In [5]:
start = time.time()
recent_coding_books = []
for book in recent_books:
    if book in coding_books:
        recent_coding_books.append(book)

print(len(recent_coding_books))
print("Duration: {} seconds".format(time.time()-start))

96
Duration: 11.261584520339966 seconds
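
The loop above is slow because `book in coding_books` scans the list from the start for every book, so the work grows with the product of the two list lengths. The sketch below illustrates this on made-up IDs (`recent_books_demo`, `coding_books_demo`, and `coding_lookup` are hypothetical names, not part of the dataset). It keeps the same loop logic but checks membership against a set, which is a hash lookup rather than a scan.

```python
# Hypothetical stand-in data; the notebook itself reads the IDs from .txt files.
recent_books_demo = [str(i) for i in range(0, 3000, 3)]   # multiples of 3
coding_books_demo = [str(i) for i in range(0, 3000, 5)]   # multiples of 5

# Build the set once, so each `in` check is a hash lookup instead of a list scan.
coding_lookup = set(coding_books_demo)

# Same logic as the loop solution, written as a list comprehension.
recent_coding_demo = [book for book in recent_books_demo if book in coding_lookup]

print(len(recent_coding_demo))
```

The order of `recent_books_demo` is preserved in the result, which may matter if later steps rely on it.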


### 3.2) Vector operation solution <a class="anchor" id="books2"></a>

In [6]:
# Using a NumPy array operation on the two original lists
start = time.time()
recent_coding_books = np.intersect1d(recent_books, coding_books)
print(len(recent_coding_books))
print("Duration: {} seconds".format(time.time()-start))

96
Duration: 0.02099919319152832 seconds
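
As a small illustration of what `np.intersect1d` returns, the sketch below uses invented IDs (the lists `a` and `b` are hypothetical). The result is a sorted array of unique values, and because the IDs are strings here, the sort is lexicographic rather than numeric.

```python
import numpy as np

# Hypothetical IDs stored as strings, as they would be after reading a text file.
a = ["10", "2", "7", "7"]   # note the duplicate "7"
b = ["7", "2", "10"]

common = np.intersect1d(a, b)

# Duplicates are dropped and values are sorted as strings, so "10" sorts before "2".
print(common.tolist())
```

If numeric ordering matters, one option is to convert the IDs to integers before intersecting.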


### 3.3) Data structure solution <a class="anchor" id="books3"></a>

In [7]:
# Using Python's built-in set data structure on the two original lists
start = time.time()
recent_coding_books = set(recent_books).intersection(coding_books)
print(len(recent_coding_books))
print("Duration: {} seconds".format(time.time()-start))

96
Duration: 0.005002498626708984 seconds
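
To compare the three approaches side by side without the dataset files, the following sketch runs them on synthetic IDs and checks that they agree. The data, the function names (`loop_solution`, `numpy_solution`, `set_solution`), and the use of `time.perf_counter` are assumptions made for this illustration. Timings will vary by machine, so none are shown.

```python
import time
import numpy as np

# Hypothetical synthetic IDs; kept small so the loop version finishes quickly.
recent = [str(i) for i in range(0, 6000, 2)]   # multiples of 2
coding = [str(i) for i in range(0, 6000, 3)]   # multiples of 3

def loop_solution(recent, coding):
    # Membership test against a list: scans the list for every book.
    return [book for book in recent if book in coding]

def numpy_solution(recent, coding):
    # Sorted, unique values present in both inputs.
    return np.intersect1d(recent, coding)

def set_solution(recent, coding):
    # Hash-based intersection.
    return set(recent).intersection(coding)

results = {}
for name, func in [("loop", loop_solution),
                   ("numpy", numpy_solution),
                   ("set", set_solution)]:
    start = time.perf_counter()
    found = func(recent, coding)
    elapsed = time.perf_counter() - start
    # Normalize to a set of plain strings so the results can be compared.
    results[name] = {str(book) for book in found}
    print("{}: {} books in {:.6f} seconds".format(name, len(found), elapsed))

print(results["loop"] == results["numpy"] == results["set"])
```

Note that the set and NumPy results drop duplicates, while the loop keeps them, so the counts match only when the input lists contain no repeated IDs.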
