# #readMoreCanlit | Notebook 2: Data acquisition

<center><img src='../img/readMoreCanlit.png'></center>

## Overview

To populate the corpus and app, three sets of data needed to be gathered:

> 1. information on international books (title, author, description)
> 2. information on Canadian books (title, author, description)
> 3. book cover art for Canadian books

For the international books, a broad online search was undertaken to find lists of ISBNs (International Standard Book Numbers, which publishers use to distinguish books and editions from one another). Sources at openlibrary.org and data.world together supplied over 2.7 million ISBNs; the lists contain the ISBN and no further information.

To gather the necessary metadata, the ISBNdb.com API was used to query a database of 12 million books. Progress was slow, however. First, the API was limited to 15,000 queries per day. Second, the ISBNdb.com database is incomplete: many ISBNs had no entry, and many of those that were present lacked descriptions (the primary piece of metadata the recommender system needs to work effectively). On average, about 5 percent of the returned ISBN information was usable. Over 100,000 ISBN queries were executed to produce a dataset of 6,000 international titles.
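To make the cost of these two constraints concrete, here is a back-of-the-envelope sketch using the figures quoted above. The helper names `days_needed` and `expected_usable` are hypothetical and not part of the notebook; the 5 percent rate is the approximate average mentioned in the text.

```python
import math

DAILY_LIMIT = 15_000   # ISBNdb.com queries allowed per day (from the text above)
USABLE_RATE = 0.05     # approximate share of responses with usable metadata


def days_needed(total_queries, daily_limit=DAILY_LIMIT):
    """Minimum number of days required to run total_queries under the daily cap."""
    return math.ceil(total_queries / daily_limit)


def expected_usable(total_queries, usable_rate=USABLE_RATE):
    """Rough number of usable records to expect from total_queries."""
    return round(total_queries * usable_rate)


print(days_needed(100_000))      # 7
print(days_needed(2_700_000))    # 180
print(expected_usable(100_000))  # 5000
```

These are rough figures only: the 6,000 titles reported above came from somewhat more than 100,000 queries, and the usable rate varied from batch to batch.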

Canadian titles were more easily sourced from the website 49thshelf.com. Through web scraping, a set of URLs was requested to collect title, author and description information for 8,500 Canadian fiction titles (it was not possible to differentiate the international titles by genre). The website also provided book cover images for all of the titles.


### Imports

In [2]:
# pandas and numpy
import pandas as pd
import numpy as np

pd.options.display.max_seq_items = 2000
pd.options.display.max_rows = 4000

# other imports
from bs4 import BeautifulSoup
import json
import requests
import time
import urllib.request
from datetime import datetime

## Get international book metadata

In [27]:
# Read in the ISBN list

isbn = pd.read_csv('../data/data_acquisition/international_for_download.csv')

# My for loop below requires the ISBNs to be interpreted as strings so they can be interpolated into URLs
isbn = isbn.applymap(str)

# Confirm the change
isbn.dtypes

isbn        object
title       object
authors     object
overview    object
dtype: object

In [28]:
# Reduce to a subset matching the ISBNdb.com daily limit

isbn = isbn[0:15000]
isbn

Unnamed: 0,isbn,title,authors,overview
0,9781780104089,,,
1,9781780104065,,,
2,9781780104058,,,
3,9781780104041,,,
4,9781780104034,,,
5,9781780104027,,,
6,9781780104003,,,
7,9781780103990,,,
8,9781780103983,,,
9,9781780103976,,,
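Because the lookup was repeated over many days, it can help to think of the ISBN list as a sequence of daily batches rather than a single slice. The following is a minimal sketch of that idea; `daily_batches` is a hypothetical helper that is not in the notebook, and it works on any sliceable sequence (a list here, but a dataframe slices the same way).

```python
def daily_batches(items, daily_limit=15_000):
    """Yield consecutive slices of items, each no longer than daily_limit."""
    for start in range(0, len(items), daily_limit):
        yield items[start:start + daily_limit]


# Example with a stand-in list of 32,000 placeholder values
placeholder_isbns = list(range(32_000))
batch_sizes = [len(batch) for batch in daily_batches(placeholder_isbns)]
print(batch_sizes)  # [15000, 15000, 2000]
```

Each batch could then be run on a separate day and saved to its own CSV.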


In [29]:
# Iterate through the dataframe containing the list of ISBNs,
# constructing a URL for each one to pass to requests
# along with the ISBNdb authorization key;
# parse the response as JSON
# and write the relevant fields back into the dataframe.

# Note: this process was repeated many times.
# It is commented out so I don't incur costs if all cells are run.

# header = {'Authorization': 'YOUR_ISBNDB_API_KEY'}  # key redacted; supply your own
# base_url = 'https://api2.isbndb.com/book/'

# for j in range(len(isbn)):
#     response = requests.get(base_url + isbn['isbn'][j], headers=header)

#     # ISBNs missing from the database return no 'book' entry,
#     # so fall back to an empty dict and record NaN for every field
#     try:
#         book = response.json().get('book', {})
#     except ValueError:
#         book = {}

#     for field in ['title', 'authors', 'overview']:
#         # .at sets a single cell in place and avoids chained assignment
#         isbn.at[j, field] = book.get(field, np.nan)

#     print('Info downloaded for book ' + str(j + 1) + ' of ' + str(len(isbn)) + ' books.')

#     # Pause between requests
#     time.sleep(1)
    

Info downloaded for book 1 of 3000 books.
Info downloaded for book 2 of 3000 books.
Info downloaded for book 3 of 3000 books.
...
Info downloaded for book 2973 of 3000 books.

In [30]:
# Drop the ISBN column now that it is no longer needed
# and save the current set of international book metadata out to csv
# with the file named for the current date and time.

now = datetime.now()
dt = now.strftime("%d-%m-%Y_%H-%M-%S")


isbn.drop('isbn', axis=1, inplace=True)
isbn.to_csv('../data/saved/isbn' + dt +'.csv', index = False)

# Note that this process was repeated many times to assemble the 
# international portion of the app's dataframe
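The lookup loop earlier in this section reads three fields from each JSON payload and falls back to NaN when a field is missing. That logic can be isolated in a small pure function, which makes it easy to check without calling the API. This is only a sketch: `extract_fields` is a hypothetical helper, the `payload['book'][...]` layout is taken from the loop above, and the error payload in the example is an assumed shape rather than a documented ISBNdb response.

```python
import math

FIELDS = ("title", "authors", "overview")


def extract_fields(payload):
    """Return title, authors and overview from a payload, using NaN for gaps."""
    book = payload.get("book") if isinstance(payload, dict) else None
    if not isinstance(book, dict):
        book = {}  # no 'book' entry, e.g. the ISBN is not in the database
    return {field: book.get(field, math.nan) for field in FIELDS}


# A payload with a title and authors but no overview
found = extract_fields({"book": {"title": "Example Title", "authors": ["A. Writer"]}})

# An assumed error payload for an ISBN that is not in the database
missing = extract_fields({"errorMessage": "Not Found"})

print(found["title"])                 # Example Title
print(math.isnan(found["overview"]))  # True
```

Records whose overview comes back as NaN are the ones that cannot feed the recommender, which is why so few queries produced usable titles.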

## Get Canadian book metadata

In [None]:
# Read in the list of Canadian ISBNs
canadian = pd.read_csv('../data/data_acquisition/canadian_for_download.csv')

# My for loop below passes the values in the title_url column to requests, so cast every column to strings

canadian = canadian.applymap(str)

# Confirm the change
canadian 

In [None]:
# Start an empty list to house the descriptions.
# Iterate through the Canadian book metadata dataframe (populated from a csv);
# request the URL where the book description lives and use BeautifulSoup
# to pull the og:description meta tag; at the end, write it all back to the dataframe.

description_list = []

for c in range(len(canadian)):
    # Default used when the request fails or the page has no og:description tag
    description = np.nan

    try:
        response = requests.get(canadian['title_url'][c])
        soup = BeautifulSoup(response.text, 'html.parser')
        tag = soup.find('meta', attrs={'property': 'og:description'})
        if tag is not None:
            description = tag.get('content', np.nan)
            print(description)
    except requests.RequestException:
        pass

    # Append exactly one value per book so the list lines up with the dataframe rows
    description_list.append(description)

    time.sleep(2)

canadian['description'] = description_list
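The scrape above depends on each book page exposing its description in an `og:description` meta tag. To show the extraction step in isolation, here is a minimal sketch that parses an inline HTML string. Note that it swaps in Python's standard-library `html.parser` in place of BeautifulSoup so that it runs without extra packages; `OgDescriptionParser` and `og_description` are hypothetical names, and the sample HTML is made up for illustration.

```python
from html.parser import HTMLParser


class OgDescriptionParser(HTMLParser):
    """Record the content of the first <meta property="og:description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        if attributes.get("property") == "og:description":
            self.description = attributes.get("content")


def og_description(html):
    """Return the og:description content, or None if the tag is absent."""
    parser = OgDescriptionParser()
    parser.feed(html)
    return parser.description


sample = (
    '<html><head><meta property="og:title" content="A Book">'
    '<meta property="og:description" content="A novel about winter."></head></html>'
)
print(og_description(sample))                  # A novel about winter.
print(og_description("<html><head></head>"))   # None
```

Returning a single value (or None) per page keeps one description per row, which matters when the results are written back into a dataframe column.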

In [None]:
# Drop the ISBN column now that it is no longer needed
# and save the Canadian book metadata out to csv.

canadian.drop('isbn', axis=1, inplace=True)
canadian.to_csv('../data/processed/canadian_books.csv', index = False)
canadian.shape

# Note: This process was run once to populate the Canadian book metadata dataframe

In [6]:
canadian = pd.read_csv('../data/processed/canadian_books.csv')

# Remove duplicate entries from the dataframe
canadian = canadian.drop_duplicates(subset='title', keep="first")
canadian.to_csv('../data/processed/canadian_books.csv', index = False)

## Get Canadian book cover art

In [8]:
# Read in the list of Canadian book metadata
# that contains URLs for book-cover imagery
images = pd.read_csv('../data/processed/canadian_books.csv')

# My for loop below concatenates each id into an image file path, so cast every column to strings
images = images.applymap(str)

# Confirm the change
images.dtypes

id             object
title          object
author         object
description    object
image          object
dtype: object

In [9]:
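# Download each cover image and save it under the book's id;
# report any failure (e.g. a missing or broken image URL) and keep going
for i in range(len(images)):

    try:
        urllib.request.urlretrieve(images['image'][i], '../img/books/' + images['id'][i] + '.jpg')
        print('Just captured image number ' + images['id'][i])

    except Exception:
        print('Failed to capture image number ' + images['id'][i])

    # Pause between requests to avoid overloading the server
    time.sleep(2)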


Just captured image number 0
Just captured image number 1
Just captured image number 2
Just captured image number 3
Just captured image number 4
Just captured image number 5
Just captured image number 6
Just captured image number 7
Just captured image number 8
Just captured image number 9
Just captured image number 10
Just captured image number 11
Just captured image number 12
Just captured image number 13
Just captured image number 14
Just captured image number 15
Just captured image number 16
Just captured image number 17
Just captured image number 18
Just captured image number 19
Just captured image number 20
Just captured image number 21
Just captured image number 22
Just captured image number 23
Just captured image number 24
Just captured image number 25
Just captured image number 26
Just captured image number 27
Just captured image number 28
Failed to capture image number 29
Just captured image number 30
Just captured image number 31
Just captured image number 32
Just captured im

Just captured image number 264
Just captured image number 265
Just captured image number 266
Failed to capture image number 267
Just captured image number 268
Just captured image number 269
Just captured image number 270
Just captured image number 271
Just captured image number 272
Just captured image number 273
Just captured image number 274
Just captured image number 275
Failed to capture image number 276
Just captured image number 277
Just captured image number 278
Just captured image number 279
Just captured image number 280
Just captured image number 281
Just captured image number 282
Failed to capture image number 283
Just captured image number 284
Just captured image number 285
Just captured image number 286
Just captured image number 287
Just captured image number 288
Just captured image number 289
Just captured image number 290
Just captured image number 291
Just captured image number 292
Just captured image number 293
Just captured image number 294
Just captured image number 

Just captured image number 524
Failed to capture image number 525
Just captured image number 526
Just captured image number 527
Just captured image number 528
Failed to capture image number 529
Just captured image number 530
Just captured image number 531
Just captured image number 532
Just captured image number 533
Just captured image number 534
Just captured image number 535
Just captured image number 536
Just captured image number 537
Failed to capture image number 538
Just captured image number 539
Just captured image number 540
Failed to capture image number 541
Failed to capture image number 542
Failed to capture image number 543
Failed to capture image number 544
Failed to capture image number 545
Just captured image number 546
Just captured image number 547
Just captured image number 548
Just captured image number 549
Just captured image number 550
Just captured image number 551
Failed to capture image number 552
Failed to capture image number 553
Just captured image number 554

Just captured image number 782
Just captured image number 783
Just captured image number 784
Just captured image number 785
Just captured image number 786
Failed to capture image number 787
Just captured image number 788
Just captured image number 789
Just captured image number 790
Failed to capture image number 791
Just captured image number 792
Just captured image number 793
Just captured image number 794
Failed to capture image number 795
Just captured image number 796
Just captured image number 797
Just captured image number 798
Just captured image number 799
Just captured image number 800
Just captured image number 801
Just captured image number 802
Failed to capture image number 803
Just captured image number 804
Just captured image number 805
Just captured image number 806
Just captured image number 807
Just captured image number 808
Just captured image number 809
Just captured image number 810
Just captured image number 811
Just captured image number 812
Just captured image num

Just captured image number 1040
Just captured image number 1041
Just captured image number 1042
Just captured image number 1043
Just captured image number 1044
Just captured image number 1045
Just captured image number 1046
Just captured image number 1047
Just captured image number 1048
Just captured image number 1049
Just captured image number 1050
Just captured image number 1051
Just captured image number 1052
Just captured image number 1053
Just captured image number 1054
Just captured image number 1055
Just captured image number 1056
Just captured image number 1057
Just captured image number 1058
Just captured image number 1059
Just captured image number 1060
Just captured image number 1061
Just captured image number 1062
Just captured image number 1063
Just captured image number 1064
Failed to capture image number 1065
Just captured image number 1066
Just captured image number 1067
Failed to capture image number 1068
Just captured image number 1069
Failed to capture image number 1

Just captured image number 1292
Just captured image number 1293
Just captured image number 1294
Just captured image number 1295
Failed to capture image number 1296
Just captured image number 1297
Failed to capture image number 1298
Just captured image number 1299
Just captured image number 1300
Just captured image number 1301
Just captured image number 1302
Just captured image number 1303
Just captured image number 1304
Just captured image number 1305
Just captured image number 1306
Just captured image number 1307
Failed to capture image number 1308
Failed to capture image number 1309
Just captured image number 1310
Just captured image number 1311
Just captured image number 1312
Just captured image number 1313
Failed to capture image number 1314
Just captured image number 1315
Just captured image number 1316
Just captured image number 1317
Failed to capture image number 1318
Just captured image number 1319
Failed to capture image number 1320
Just captured image number 1321
Just capture

Just captured image number 1544
Just captured image number 1545
Just captured image number 1546
Just captured image number 1547
Just captured image number 1548
Just captured image number 1549
Just captured image number 1550
Just captured image number 1551
Failed to capture image number 1552
Failed to capture image number 1553
Just captured image number 1554
Just captured image number 1555
Just captured image number 1556
Just captured image number 1557
Just captured image number 1558
Just captured image number 1559
Just captured image number 1560
Just captured image number 1561
Just captured image number 1562
Just captured image number 1563
Just captured image number 1564
Just captured image number 1565
Just captured image number 1566
Just captured image number 1567
Failed to capture image number 1568
Just captured image number 1569
Just captured image number 1570
Just captured image number 1571
Just captured image number 1572
Just captured image number 1573
Just captured image number 1

Just captured image number 1797
Just captured image number 1798
Just captured image number 1799
Just captured image number 1800
Just captured image number 1801
Just captured image number 1802
Just captured image number 1803
Just captured image number 1804
Just captured image number 1805
Just captured image number 1806
Just captured image number 1807
Failed to capture image number 1808
Just captured image number 1809
Just captured image number 1810
Just captured image number 1811
Just captured image number 1812
Just captured image number 1813
Just captured image number 1814
Just captured image number 1815
Just captured image number 1816
Just captured image number 1817
Just captured image number 1818
Just captured image number 1819
Just captured image number 1820
Just captured image number 1821
Just captured image number 1822
Just captured image number 1823
Just captured image number 1824
Just captured image number 1825
Just captured image number 1826
Just captured image number 1827
...

Just captured image number 2049
Just captured image number 2050
Just captured image number 2051
Failed to capture image number 2052
Just captured image number 2053
Failed to capture image number 2054
Just captured image number 2055
Just captured image number 2056
Just captured image number 2057
Failed to capture image number 2058
Just captured image number 2059
Just captured image number 2060
Just captured image number 2061
Just captured image number 2062
Just captured image number 2063
Just captured image number 2064
Just captured image number 2065
Just captured image number 2066
Just captured image number 2067
Just captured image number 2068
Just captured image number 2069
Just captured image number 2070
Just captured image number 2071
Failed to capture image number 2072
Just captured image number 2073
Just captured image number 2074
Just captured image number 2075
Just captured image number 2076
Just captured image number 2077
Just captured image number 2078
...

Just captured image number 2301
Just captured image number 2302
Just captured image number 2303
Just captured image number 2304
Failed to capture image number 2305
Just captured image number 2306
Just captured image number 2307
Just captured image number 2308
Just captured image number 2309
Failed to capture image number 2310
Just captured image number 2311
Failed to capture image number 2312
Just captured image number 2313
Failed to capture image number 2314
Just captured image number 2315
Just captured image number 2316
Failed to capture image number 2317
Just captured image number 2318
Just captured image number 2319
Just captured image number 2320
Just captured image number 2321
Just captured image number 2322
Just captured image number 2323
Just captured image number 2324
Just captured image number 2325
Just captured image number 2326
Just captured image number 2327
Failed to capture image number 2328
Just captured image number 2329
Just captured image number 2330
...

Failed to capture image number 2552
Just captured image number 2553
Just captured image number 2554
Just captured image number 2555
Just captured image number 2556
Just captured image number 2557
Just captured image number 2558
Failed to capture image number 2559
Just captured image number 2560
Just captured image number 2561
Just captured image number 2562
Just captured image number 2563
Just captured image number 2564
Just captured image number 2565
Just captured image number 2566
Just captured image number 2567
Just captured image number 2568
Just captured image number 2569
Just captured image number 2570
Just captured image number 2571
Just captured image number 2572
Just captured image number 2573
Just captured image number 2574
Just captured image number 2575
Just captured image number 2576
Just captured image number 2577
Just captured image number 2578
Failed to capture image number 2579
Just captured image number 2580
Just captured image number 2581
...

Just captured image number 2804
Failed to capture image number 2805
Just captured image number 2806
Just captured image number 2807
Just captured image number 2808
Just captured image number 2809
Just captured image number 2810
Just captured image number 2811
Just captured image number 2812
Failed to capture image number 2813
Just captured image number 2814
Just captured image number 2815
Just captured image number 2816
Failed to capture image number 2817
Just captured image number 2818
Just captured image number 2819
Failed to capture image number 2820
Just captured image number 2821
Failed to capture image number 2822
Just captured image number 2823
Just captured image number 2824
Just captured image number 2825
Just captured image number 2826
Just captured image number 2827
Failed to capture image number 2828
Just captured image number 2829
Just captured image number 2830
Just captured image number 2831
Just captured image number 2832
Just captured image number 2833
...

Just captured image number 3055
Just captured image number 3056
Failed to capture image number 3057
Just captured image number 3058
Just captured image number 3059
Just captured image number 3060
Just captured image number 3061
Just captured image number 3062
Just captured image number 3063
Failed to capture image number 3064
Just captured image number 3065
Just captured image number 3066
Just captured image number 3067
Just captured image number 3068
Just captured image number 3069
Just captured image number 3070
Just captured image number 3071
Just captured image number 3072
Just captured image number 3073
Just captured image number 3074
Just captured image number 3075
Just captured image number 3076
Just captured image number 3077
Just captured image number 3078
Just captured image number 3079
Just captured image number 3080
Just captured image number 3081
Just captured image number 3082
Just captured image number 3083
Just captured image number 3084
Just captured image number 3085


Just captured image number 3307
Just captured image number 3308
Just captured image number 3309
Just captured image number 3310
Just captured image number 3311
Just captured image number 3312
Just captured image number 3313
Just captured image number 3314
Just captured image number 3315
Just captured image number 3316
Just captured image number 3317
Just captured image number 3318
Just captured image number 3319
Just captured image number 3320
Just captured image number 3321
Just captured image number 3322
Just captured image number 3323
Just captured image number 3324
Just captured image number 3325
Just captured image number 3326
Just captured image number 3327
Just captured image number 3328
Just captured image number 3329
Just captured image number 3330
Just captured image number 3331
Just captured image number 3332
Just captured image number 3333
Just captured image number 3334
Just captured image number 3335
Just captured image number 3336
Just captured image number 3337
...

Just captured image number 3560
Just captured image number 3561
Just captured image number 3562
Just captured image number 3563
Just captured image number 3564
Just captured image number 3565
Just captured image number 3566
Just captured image number 3567
Just captured image number 3568
Just captured image number 3569
Just captured image number 3570
Just captured image number 3571
Failed to capture image number 3572
Just captured image number 3573
Just captured image number 3574
Just captured image number 3575
Failed to capture image number 3576
Just captured image number 3577
Just captured image number 3578
Just captured image number 3579
Just captured image number 3580
Just captured image number 3581
Just captured image number 3582
Just captured image number 3583
Just captured image number 3584
Just captured image number 3585
Just captured image number 3586
Just captured image number 3587
Just captured image number 3588
Just captured image number 3589
Just captured image number 3590


Just captured image number 3814
Just captured image number 3815
Just captured image number 3816
Just captured image number 3817
Just captured image number 3818
Failed to capture image number 3819
Just captured image number 3820
Just captured image number 3821
Just captured image number 3822
Just captured image number 3823
Just captured image number 3824
Just captured image number 3825
Just captured image number 3826
Failed to capture image number 3827
Just captured image number 3828
Just captured image number 3829
Just captured image number 3830
Failed to capture image number 3831
Failed to capture image number 3832
Just captured image number 3833
Just captured image number 3834
Just captured image number 3835
Just captured image number 3836
Just captured image number 3837
Just captured image number 3838
Just captured image number 3839
Just captured image number 3840
Just captured image number 3841
Just captured image number 3842
Just captured image number 3843
...

Just captured image number 4067
Just captured image number 4068
Just captured image number 4069
Failed to capture image number 4070
Just captured image number 4071
Just captured image number 4072
Just captured image number 4073
Just captured image number 4074
Failed to capture image number 4075
Failed to capture image number 4076
Just captured image number 4077
Failed to capture image number 4078
Just captured image number 4079
Failed to capture image number 4080
Failed to capture image number 4081
Just captured image number 4082
Failed to capture image number 4083
Just captured image number 4084
Just captured image number 4085
Failed to capture image number 4086
Just captured image number 4087
Just captured image number 4088
Just captured image number 4089
Just captured image number 4090
Just captured image number 4091
Just captured image number 4092
Just captured image number 4093
Just captured image number 4094
Just captured image number 4095
Just captured image number 4096
...

Just captured image number 4319
Just captured image number 4320
Just captured image number 4321
Just captured image number 4322
Just captured image number 4323
Failed to capture image number 4324
Failed to capture image number 4325
Just captured image number 4326
Just captured image number 4327
Just captured image number 4328
Just captured image number 4329
Failed to capture image number 4330
Just captured image number 4331
Failed to capture image number 4332
Just captured image number 4333
Just captured image number 4334
Failed to capture image number 4335
Just captured image number 4336
Just captured image number 4337
Failed to capture image number 4338
Just captured image number 4339
Just captured image number 4340
Just captured image number 4341
Failed to capture image number 4342
Just captured image number 4343
Just captured image number 4344
Just captured image number 4345
Just captured image number 4346
Failed to capture image number 4347
Just captured image number 4348
...

Just captured image number 4570
Just captured image number 4571
Just captured image number 4572
Just captured image number 4573
Just captured image number 4574
Just captured image number 4575
Failed to capture image number 4576
Failed to capture image number 4577
Just captured image number 4578
Just captured image number 4579
Failed to capture image number 4580
Just captured image number 4581
Just captured image number 4582
Just captured image number 4583
Just captured image number 4584
Failed to capture image number 4585
Failed to capture image number 4586
Just captured image number 4587
Just captured image number 4588
Just captured image number 4589
Just captured image number 4590
Just captured image number 4591
Just captured image number 4592
Just captured image number 4593
Just captured image number 4594
Just captured image number 4595
Just captured image number 4596
Just captured image number 4597
Just captured image number 4598
Failed to capture image number 4599
...

Failed to capture image number 4819
Just captured image number 4820
Just captured image number 4821
Just captured image number 4822
Just captured image number 4823
Just captured image number 4824
Just captured image number 4825
Just captured image number 4826
Just captured image number 4827
Just captured image number 4828
Just captured image number 4829
Just captured image number 4830
Just captured image number 4831
Just captured image number 4832
Just captured image number 4833
Just captured image number 4834
Just captured image number 4835
Just captured image number 4836
Failed to capture image number 4837
Just captured image number 4838
Failed to capture image number 4839
Just captured image number 4840
Just captured image number 4841
Just captured image number 4842
Failed to capture image number 4843
Failed to capture image number 4844
Just captured image number 4845
Failed to capture image number 4846
Just captured image number 4847
Failed to capture image number 4848
...

Just captured image number 5070
Failed to capture image number 5071
Just captured image number 5072
Just captured image number 5073
Failed to capture image number 5074
Just captured image number 5075
Just captured image number 5076
Just captured image number 5077
Just captured image number 5078
Just captured image number 5079
Just captured image number 5080
Just captured image number 5081
Failed to capture image number 5082
Just captured image number 5083
Just captured image number 5084
Just captured image number 5085
Just captured image number 5086
Failed to capture image number 5087
Just captured image number 5088
Just captured image number 5089
Just captured image number 5090
Just captured image number 5091
Just captured image number 5092
Failed to capture image number 5093
Just captured image number 5094
Failed to capture image number 5095
Failed to capture image number 5096
Just captured image number 5097
Failed to capture image number 5098
Just captured image number 5099
...

Failed to capture image number 5319
Just captured image number 5320
Just captured image number 5321
Failed to capture image number 5322
Just captured image number 5323
Failed to capture image number 5324
Failed to capture image number 5325
Failed to capture image number 5326
Failed to capture image number 5327
Just captured image number 5328
Just captured image number 5329
Failed to capture image number 5330
Just captured image number 5331
Failed to capture image number 5332
Just captured image number 5333
Just captured image number 5334
Just captured image number 5335
Just captured image number 5336
Just captured image number 5337
Just captured image number 5338
Just captured image number 5339
Just captured image number 5340
Just captured image number 5341
Failed to capture image number 5342
Just captured image number 5343
Just captured image number 5344
Just captured image number 5345
Just captured image number 5346
Failed to capture image number 5347
...

Just captured image number 5568
Just captured image number 5569
Just captured image number 5570
Just captured image number 5571
Just captured image number 5572
Just captured image number 5573
Just captured image number 5574
Just captured image number 5575
Just captured image number 5576
Just captured image number 5577
Failed to capture image number 5578
Failed to capture image number 5579
Just captured image number 5580
Failed to capture image number 5581
Failed to capture image number 5582
Just captured image number 5583
Just captured image number 5584
Just captured image number 5585
Failed to capture image number 5586
Just captured image number 5587
Just captured image number 5588
Just captured image number 5589
Just captured image number 5590
Failed to capture image number 5591
Just captured image number 5592
Just captured image number 5593
Just captured image number 5594
Just captured image number 5595
Just captured image number 5596
Failed to capture image number 5597
...

Failed to capture image number 5818
Just captured image number 5819
Just captured image number 5820
Failed to capture image number 5821
Just captured image number 5822
Just captured image number 5823
Just captured image number 5824
Just captured image number 5825
Just captured image number 5826
Failed to capture image number 5827
Just captured image number 5828
Just captured image number 5829
Just captured image number 5830
Just captured image number 5831
Failed to capture image number 5832
Just captured image number 5833
Just captured image number 5834
Just captured image number 5835
Failed to capture image number 5836
Just captured image number 5837
Just captured image number 5838
Just captured image number 5839
Just captured image number 5840
Just captured image number 5841
Just captured image number 5842
Just captured image number 5843
Failed to capture image number 5844
Just captured image number 5845
Failed to capture image number 5846
Just captured image number 5847
...

Just captured image number 6068
Failed to capture image number 6069
Just captured image number 6070
Just captured image number 6071
Just captured image number 6072
Just captured image number 6073
Just captured image number 6074
Failed to capture image number 6075
Just captured image number 6076
Just captured image number 6077
Just captured image number 6078
Just captured image number 6079
Just captured image number 6080
Just captured image number 6081
Just captured image number 6082
Just captured image number 6083
Just captured image number 6084
Just captured image number 6085
Failed to capture image number 6086
Failed to capture image number 6087
Just captured image number 6088
Just captured image number 6089
Just captured image number 6090
Just captured image number 6091
Failed to capture image number 6092
Just captured image number 6093
Just captured image number 6094
Just captured image number 6095
Just captured image number 6096
Just captured image number 6097
...

Just captured image number 6320
Just captured image number 6321
Just captured image number 6322
Failed to capture image number 6323
Failed to capture image number 6324
Just captured image number 6325
Failed to capture image number 6326
Failed to capture image number 6327
Just captured image number 6328
Just captured image number 6329
Just captured image number 6330
Failed to capture image number 6331
Just captured image number 6332
Just captured image number 6333
Just captured image number 6334
Just captured image number 6335
Failed to capture image number 6336
Just captured image number 6337
Failed to capture image number 6338
Failed to capture image number 6339
Just captured image number 6340
Just captured image number 6341
Just captured image number 6342
Just captured image number 6343
Just captured image number 6344
Just captured image number 6345
Just captured image number 6346
Just captured image number 6347
Just captured image number 6348
Just captured image number 6349
...

Just captured image number 6573
Just captured image number 6574
Failed to capture image number 6575
Just captured image number 6576
Just captured image number 6577
Just captured image number 6578
Just captured image number 6579
Just captured image number 6580
Just captured image number 6581
Just captured image number 6582
Just captured image number 6583
Just captured image number 6584
Failed to capture image number 6585
Just captured image number 6586
Just captured image number 6587
Just captured image number 6588
Just captured image number 6589
Just captured image number 6590
Just captured image number 6591
Just captured image number 6592
Just captured image number 6593
Just captured image number 6594
Just captured image number 6595
Failed to capture image number 6596
Just captured image number 6597
Just captured image number 6598
Just captured image number 6599
Just captured image number 6600
Just captured image number 6601
Just captured image number 6602
...

In [None]:
# remove the 'image' column from the images DataFrame, modifying it in place
images.drop(columns='image', inplace=True)
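
The capture log above mixes successes and failures, so it can be useful to tally them before deciding which covers to retry. The following is a minimal sketch, not part of the original notebook: the helper name `summarize_capture_log` and the `sample_log` string are hypothetical, and it assumes the loop's printed output has been saved as plain text with one message per line.

```python
import re
from collections import Counter

# Hypothetical helper (not from the original notebook): tally the status
# messages printed by the image-capture loop.
LOG_LINE = re.compile(r"^(Just captured|Failed to capture) image number (\d+)$")

def summarize_capture_log(log_text):
    """Return (counts, failed_ids) for well-formed log lines."""
    counts = Counter()
    failed_ids = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines and fragments that lack a number
        status, number = match.groups()
        if status == "Just captured":
            counts["captured"] += 1
        else:
            counts["failed"] += 1
            failed_ids.append(int(number))
    return counts, failed_ids

# A short sample in the same format as the log above (assumed input).
sample_log = """Just captured image number 1551
Failed to capture image number 1552
Failed to capture image number 1553
Just captured image number 1554
Just capture"""

counts, failed_ids = summarize_capture_log(sample_log)
print(counts["captured"], counts["failed"], failed_ids)
```

Note that a line cut off partway through its number would still match the pattern, so this is best applied to the complete, untruncated output rather than to a shortened display of it.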