# Collecting Data from Amazon
---
In this notebook, we scrape the missing parts of our dataset directly from Amazon using BeautifulSoup.

The dataset we want:

| ID | Review Score | Sales Rank | Category    | Title | Author | Date    | Visual Features     |
| -- | ------------ | ---------- | ----------- | ----- | ------ | ------- | ------------------- |
| |

The dataset we have, as downloaded from [here](https://github.com/uchidalab/book-dataset):

| ID | Filename | Image URL | Title | Author | Category ID | Category |
| -- | -------- | --------- | ----- | ------ | ----------- | -------- |
| |

The `ID` column can be used to access the webpage of each book by connecting to `https://www.amazon.com/dp/<book-id>`, where `<book-id>` is the book's ID. This lets us scrape any missing data directly from Amazon.

We already have the Title, Author and Category of each book ready to be used.

For everything else, there's ~~Mastercard~~ BeautifulSoup.

In [93]:
# To request data from Amazon
import requests
from bs4 import BeautifulSoup

# To open image links
import urllib.request

# To process data
import pandas as pd

# To extract information from weirdly formatted Amazon info
import re

# To create random delays to trick the Amazon bot detector
from time import sleep
from random import randint
import random

# To rotate IPs while scraping
from torrequest import TorRequest
tor = TorRequest(password='ilovecs401')

# To read data
import csv

# To check if a file is downloaded already
import os

# To print an image in the notebook programmatically
from IPython.display import Markdown

# Set data directories
ORIGINAL_DATA_DIR = 'Original Data/'
COLLECTED_DATA_DIR = 'Collected Data/'
IMAGE_DIR = COLLECTED_DATA_DIR + 'Cover Images/'
HTML_DIR = '/Users/dogatekin/Data/HTML Files/'

# Constants
DEFAULT_USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36'
USER_AGENTS = [
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0.1 Safari/605.1.15',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36',
    'Mozilla/5.0 (X11; Linux i686; rv:64.0) Gecko/20100101 Firefox/64.0',
    'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0',
    'Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16',
    'Opera/9.80 (Macintosh; Intel Mac OS X 10.14.1) Presto/2.12.388 Version/12.16'
]

## Preprocessing

Load the data:

In [74]:
header_names = ['ID', 'Filename', 'Image URL', 'Title', 'Author', 'Category ID', 'Category']

books = pd.read_csv(ORIGINAL_DATA_DIR + 'book32-listing.csv', encoding='latin1', header=None, names=header_names)
books.head()

|    | ID | Filename | Image URL | Title | Author | Category ID | Category |
| -- | -- | -------- | --------- | ----- | ------ | ----------- | -------- |
| 0 | 761183272 | 0761183272.jpg | http://ecx.images-amazon.com/images/I/61Y5cOdH... | Mom's Family Wall Calendar 2016 | Sandra Boynton | 3 | Calendars |
| 1 | 1623439671 | 1623439671.jpg | http://ecx.images-amazon.com/images/I/61t-hrSw... | Doug the Pug 2016 Wall Calendar | Doug the Pug | 3 | Calendars |
| 2 | B00O80WC6I | B00O80WC6I.jpg | http://ecx.images-amazon.com/images/I/41X-KQqs... | Moleskine 2016 Weekly Notebook, 12M, Large, Bl... | Moleskine | 3 | Calendars |
| 3 | 761182187 | 0761182187.jpg | http://ecx.images-amazon.com/images/I/61j-4gxJ... | 365 Cats Color Page-A-Day Calendar 2016 | Workman Publishing | 3 | Calendars |
| 4 | 1578052084 | 1578052084.jpg | http://ecx.images-amazon.com/images/I/51Ry4Tsq... | Sierra Club Engagement Calendar 2016 | Sierra Club | 3 | Calendars |


Inspect the categories:

In [75]:
print('\n'.join(books['Category'].unique()))

Calendars
Comics & Graphic Novels
Test Preparation
Mystery, Thriller & Suspense
Science Fiction & Fantasy
Romance
Humor & Entertainment
Literature & Fiction
Gay & Lesbian
Engineering & Transportation
Cookbooks, Food & Wine
Crafts, Hobbies & Home
Arts & Photography
Education & Teaching
Parenting & Relationships
Self-Help
Computers & Technology
Medical Books
Science & Math
Health, Fitness & Dieting
Business & Money
Law
Biographies & Memoirs
History
Politics & Social Sciences
Reference
Christian Books & Bibles
Religion & Spirituality
Sports & Outdoors
Teen & Young Adult
Children's Books
Travel


We only want the Children's Books:

In [76]:
books = books[books['Category'] == "Children's Books"].reset_index(drop=True)
# We don't need the Category or Category ID columns anymore
books.drop(columns=['Category ID', 'Category'], inplace=True)
books.head()

|    | ID | Filename | Image URL | Title | Author |
| -- | -- | -------- | --------- | ----- | ------ |
| 0 | 545790352 | 0545790352.jpg | http://ecx.images-amazon.com/images/I/51MIi4p2... | Harry Potter and the Sorcerer's Stone: The Ill... | J.K. Rowling |
| 1 | 1419717014 | 1419717014.jpg | http://ecx.images-amazon.com/images/I/61YgGsg-... | Diary of a Wimpy Kid: Old School | Jeff Kinney |
| 2 | 1423160916 | 1423160916.jpg | http://ecx.images-amazon.com/images/I/611CmvkL... | Magnus Chase and the Gods of Asgard, Book 1: T... | Rick Riordan |
| 3 | 1476789886 | 1476789886.jpg | http://ecx.images-amazon.com/images/I/51KqU7Dw... | Rush Revere and the Star-Spangled Banner | Rush Limbaugh |
| 4 | 1338029991 | 1338029991.jpg | http://ecx.images-amazon.com/images/I/61kvq74k... | Harry Potter Coloring Book | Scholastic |


Let's check how many books we have left:

In [77]:
len(books)

13605

Finally, let's fix the IDs in the dataset. For some reason, the `ID` column has its leading 0s removed (every ID should be 10 characters long), which makes the webpages inaccessible. The `Filename` column still holds the correct IDs with the correct number of leading 0s, so let's use it as the new `ID` column. We can add the `.jpg` extension back later when downloading:

In [78]:
# Strip the '.jpg' extension (the dot is escaped so it only matches a literal '.')
books['ID'] = books['Filename'].apply(lambda name: re.findall(r'(.*)\.jpg$', name)[0])
books.drop(columns='Filename', inplace=True)
books.head()

|    | ID | Image URL | Title | Author |
| -- | -- | --------- | ----- | ------ |
| 0 | 545790352 | http://ecx.images-amazon.com/images/I/51MIi4p2... | Harry Potter and the Sorcerer's Stone: The Ill... | J.K. Rowling |
| 1 | 1419717014 | http://ecx.images-amazon.com/images/I/61YgGsg-... | Diary of a Wimpy Kid: Old School | Jeff Kinney |
| 2 | 1423160916 | http://ecx.images-amazon.com/images/I/611CmvkL... | Magnus Chase and the Gods of Asgard, Book 1: T... | Rick Riordan |
| 3 | 1476789886 | http://ecx.images-amazon.com/images/I/51KqU7Dw... | Rush Revere and the Star-Spangled Banner | Rush Limbaugh |
| 4 | 1338029991 | http://ecx.images-amazon.com/images/I/61kvq74k... | Harry Potter Coloring Book | Scholastic |
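
If the `Filename` column were not available, the leading zeros could also be restored by left-padding each ID. This is only a minimal sketch, assuming (as stated above) that every valid ID is exactly 10 characters long; `restore_id` is a hypothetical helper and not part of the original notebook:

```python
def restore_id(raw_id, width=10):
    """Left-pad an ID with zeros until it is `width` characters long."""
    return str(raw_id).zfill(width)

# A numeric ID that lost its leading zero
print(restore_id(545790352))
# An alphanumeric ID that is already 10 characters long stays unchanged
print(restore_id('B00O80WC6I'))
```

On the DataFrame itself, the equivalent would be `books['ID'].astype(str).str.zfill(10)`.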


## Scraping

The columns we need to scrape are: `Review Score`, `Sales Rank` and `Date`. We also need to download the images from the URLs so that we can extract visual features from them, completing our dataset.

First we will demonstrate the scraping process for each column on an arbitrary example, then we will combine these in a function and scrape the information for all the books.

In [79]:
example_book = books.iloc[0]
example_book

ID                                                  0545790352
Image URL    http://ecx.images-amazon.com/images/I/51MIi4p2...
Title        Harry Potter and the Sorcerer's Stone: The Ill...
Author                                            J.K. Rowling
Name: 0, dtype: object

Get the webpage using the ID:

In [80]:
response = tor.get('https://www.amazon.com/dp/' + example_book['ID'], headers={'User-Agent': DEFAULT_USER_AGENT})
response.status_code

200

Now that we have access to the page content, we can turn it into a useful soup:

In [81]:
soup = BeautifulSoup(response.text, 'lxml')

#### Sales Rank and Date

We can get both of these from the product details section of the webpage, which conveniently has the id `productDetailsTable`:

In [82]:
soup.select('#productDetailsTable li b')

[<b>Age Range:</b>,
 <b>Grade Level:</b>,
 <b>Series:</b>,
 <b>Hardcover:</b>,
 <b>Publisher:</b>,
 <b>Language:</b>,
 <b>ISBN-10:</b>,
 <b>ISBN-13:</b>,
 <b>
     Product Dimensions: 
     </b>,
 <b>Shipping Weight:</b>,
 <b>Average Customer Review:</b>,
 <b>Amazon Best Sellers Rank:</b>,
 <b><a href="https://www.amazon.com/gp/bestsellers/books/3153/ref=pd_zg_hrsr_books_1_5_last/144-8355913-0477422">Friendship</a></b>,
 <b><a href="https://www.amazon.com/gp/bestsellers/books/2967/ref=pd_zg_hrsr_books_2_3_last/144-8355913-0477422">Action &amp; Adventure</a></b>,
 <b><a href="https://www.amazon.com/gp/bestsellers/books/3017/ref=pd_zg_hrsr_books_3_4_last/144-8355913-0477422">Fantasy &amp; Magic</a></b>]

We can use regex to extract the info we need from the table:

In [83]:
for li in soup.select('#productDetailsTable li'):
    # Skip list items that have no bold label
    if li.b is None:
        continue
    # We only need two of the list items
    if li.b.string == 'Amazon Best Sellers Rank:':
        # The rank is given in the format #1,234,567
        sales_rank = re.findall(r'#([\d,]+)', li.b.next_sibling)[0]
    elif li.b.string == 'Publisher:':
        # The date is in the last set of parentheses
        date = re.findall(r'\(([^()]*)\)$', li.b.next_sibling)[0]

print(f'Sales Rank: {sales_rank}\nDate: {date}')

Sales Rank: 114
Date: October 6, 2015
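
To see what the two patterns do in isolation, here is a small sketch that runs them on made-up strings. The sample strings below are assumptions about the general shape of the text next to each label; they are not copied from a real Amazon page:

```python
import re

# Illustrative inputs only (assumed shapes, not real page content)
rank_text = ' #1,234,567 in Books (See Top 100 in Books)'
publisher_text = ' Example Press; 1st edition (January 2, 2003)'

# '#' followed by one or more digits or commas
rank = re.findall(r'#([\d,]+)', rank_text)[0]

# Contents of the parentheses that close the string
date = re.findall(r'\(([^()]*)\)$', publisher_text)[0]

print(int(rank.replace(',', '')))
print(date)
```

Note that the `$` anchor means the date pattern only matches when the closing parenthesis is the very last character, so text with trailing whitespace would need a `.strip()` first.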


#### Review Score

You might have noticed there is also an item called `Average Customer Review` in the table we just used to extract the rank and date. Inside that item, the review scores are found in a table with the id `histogramTable`, which gives the percentage of users who gave each score from 1 to 5 stars.

In [84]:
reviews = soup.select('#histogramTable')[0].text
reviews

'5 star87%4 star8%3 star2%2 star1%1 star2%'

The formatting is not great, but it's nothing we can't fix by using a simple regular expression:

In [85]:
reviews = re.findall(u'(\d) star(\d+)%', reviews)
reviews

[('5', '87'), ('4', '8'), ('3', '2'), ('2', '1'), ('1', '2')]

The weighted average of these scores is our final Review Score for the given book:

In [86]:
score = 0
for pair in reviews:
    score += int(pair[0]) * int(pair[1])/100  # weights are percentages

round(score, 3)

4.77
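
The two steps above (parsing the histogram text and averaging) can be folded into one helper. This is a sketch only; `review_score` is a hypothetical name, and returning `None` for text with no matches is an assumption about how missing reviews should be treated:

```python
import re

def review_score(histogram_text):
    """Return the percentage-weighted average star rating, or None if nothing matches."""
    pairs = re.findall(r'(\d) star(\d+)%', histogram_text)
    if not pairs:
        return None
    # Sum stars * percentage as integers first, then divide once by 100
    total = sum(int(stars) * int(pct) for stars, pct in pairs)
    return round(total / 100, 3)

print(review_score('5 star87%4 star8%3 star2%2 star1%1 star2%'))
```

For the example book this gives 4.77, matching the cell above. If the displayed percentages are rounded, they may not add up to exactly 100; dividing by the sum of the percentages instead of 100 would be one way to compensate.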

#### Cover Image

The image URL of each book is available in the original dataset. Let's make a dictionary of `ID: URL` pairs:

In [87]:
urls = books[['ID', 'Image URL']].set_index('ID').to_dict()['Image URL']
example_url = urls[example_book['ID']]
example_url

'http://ecx.images-amazon.com/images/I/51MIi4p2YyL.jpg'

Have a look:

In [88]:
Markdown(f'![Example Image]({example_url})')

![Example Image](http://ecx.images-amazon.com/images/I/51MIi4p2YyL.jpg)

Let's turn it into a function:

In [89]:
def download_image(book_id):
    url = urls[book_id]
    filename = book_id + '.jpg'
    # Download only if not already downloaded
    if not os.path.isfile(IMAGE_DIR + filename):
        # Read the whole image first, so a failed download never leaves an empty file behind
        with urllib.request.urlopen(url) as downloaded_img:
            image_data = downloaded_img.read()
        # The context manager closes the file even if writing fails
        with open(IMAGE_DIR + filename, mode='wb') as f:
            f.write(image_data)

#### Raw HTML

Just in case we need some other information in the future from the webpages, let's save the raw HTML files somewhere so we don't have to scrape them from Amazon again.

In [90]:
def save_html(book_id, html_text):
    filename = book_id + '.html'
    # Save only if not already saved
    if not os.path.isfile(HTML_DIR + filename):
        # Explicit encoding so pages with non-ASCII characters are saved the same way on every platform
        with open(HTML_DIR + filename, 'w', encoding='utf-8') as html_file:
            html_file.write(html_text)
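
Both `download_image` and `save_html` follow the same "write once, skip if present" pattern. A minimal standalone sketch of that pattern is below; it uses a temporary directory so it can run anywhere, and `save_once` is a hypothetical helper rather than a function from this notebook:

```python
import tempfile
from pathlib import Path

def save_once(directory, filename, text):
    """Write `text` to directory/filename unless that file already exists."""
    path = Path(directory) / filename
    if path.is_file():
        return False  # already saved, nothing is written
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding='utf-8')
    return True

with tempfile.TemporaryDirectory() as tmp:
    first = save_once(tmp, '0545790352.html', '<html></html>')
    second = save_once(tmp, '0545790352.html', '<html>changed</html>')
    content = (Path(tmp) / '0545790352.html').read_text(encoding='utf-8')
    print(first, second, content)
```

The second call is skipped, so the file keeps its original content. This is what makes it safe to re-run the scraper after an interruption.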

#### Bringing it together

In [91]:
def scrape_info(book_id, agent=DEFAULT_USER_AGENT):
    # Initial values
    sales_rank = date = score = None
    
    # Trick the bot detector
    sleep(randint(1,3))
    
    # Get the soup of the relevant page
    response = tor.get('https://www.amazon.com/dp/' + book_id, headers={'User-Agent': agent})
    status = response.status_code
    
    if(status == 200):
        # Successfully reached webpage
        
        # Make some soup
        soup = BeautifulSoup(response.text, 'lxml')
        
        if(soup.title.string != 'Robot Check'):
            # Did not get detected 😎
        
            # Save the raw HTML in case it is needed in the future
            save_html(book_id, response.text)

            # Get sales rank and date
            for li in soup.select('#productDetailsTable li'):
                if li.b is None:
                    continue  # list item without a bold label
                if li.b.string == 'Amazon Best Sellers Rank:':
                    try:
                        sales_rank = re.findall(r'#([\d,]+)', li.b.next_sibling)[0]  # Format: #1,234,567
                        sales_rank = int(sales_rank.replace(',', ''))  # Remove the commas and convert to integer
                    except (IndexError, TypeError, ValueError):
                        sales_rank = None  # couldn't scrape
                elif li.b.string == 'Publisher:':
                    try:
                        date = re.findall(r'\(([^()]*)\)$', li.b.next_sibling)[0]  # Format: Inside last parentheses
                    except (IndexError, TypeError):
                        date = None  # couldn't scrape

            # Get average review score
            try:
                reviews = soup.select('#histogramTable')[0].text
                reviews = re.findall(r'(\d) star(\d+)%', reviews)

                score = 0
                for pair in reviews:
                    score += int(pair[0]) * int(pair[1])/100  # weights are percentages

                # An empty histogram means there is no score, not a score of 0
                score = round(score, 3) if reviews else None
            except IndexError:
                score = None  # couldn't scrape

            # Download the cover image
            download_image(book_id)
        else:
            # Got detected 😔
            status = -1
    else:
        # Could not reach webpage
        sales_rank = date = score = f'Error {status}'

    return status, book_id, sales_rank, date, score

Let's do a final test on the example book we used above:

In [67]:
scraped = scrape_info(example_book['ID'])
scraped

(200, '0545790352', 114, 'October 6, 2015', 4.77)
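
The scraped date is still a plain string such as `'October 6, 2015'`. If a real date type is needed later, it could be parsed as sketched below. This assumes every date follows the `Month day, year` pattern seen in the example and that the default English locale is in use; `parse_date` is a hypothetical helper:

```python
from datetime import datetime

def parse_date(text):
    """Parse a 'Month day, year' string into a date, or return None if it does not fit."""
    try:
        return datetime.strptime(text, '%B %d, %Y').date()
    except (TypeError, ValueError):
        # TypeError covers None, ValueError covers strings in another format
        return None

print(parse_date('October 6, 2015'))
print(parse_date('Error 404'))
```

Returning `None` for anything that does not parse keeps unscraped values and error strings from raising during later processing.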

## Completing the dataset

To be able to stop and continue at will, we will write the scraped info to a CSV file as we go along and download the cover images at the same time. Let's initialize this file with a meaningful header:

In [32]:
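# Write the header only if the file does not exist yet, so re-running this cell never adds a second header row
if not os.path.isfile(COLLECTED_DATA_DIR + 'scraped.csv'):
    with open(COLLECTED_DATA_DIR + 'scraped.csv', 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(['ID', 'Sales Rank', 'Date', 'Review Score'])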
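
Stopping and continuing depends on reading back the last row of this file and looking up its ID in the dataset. The idea can be sketched on in-memory data as follows; `next_index` and the short `ids` list are illustrative assumptions, not part of the notebook:

```python
import csv
import io

def next_index(csv_text, ids):
    """Return the position in `ids` at which scraping should resume."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if len(rows) <= 1:
        return 0  # only the header (or nothing) has been written so far
    # Resume right after the ID stored in the last written row
    return ids.index(rows[-1][0]) + 1

ids = ['0545790352', '1419717014', '1423160916']
header = 'ID,Sales Rank,Date,Review Score\r\n'

print(next_index(header, ids))
print(next_index(header + '0545790352,114,"October 6, 2015",4.77\r\n', ids))
```

With only the header present the function returns 0; after one book has been written it returns 1, the position of the next book.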

Now we go through the dataset, starting scraping from where we last left off:

In [95]:
with open(COLLECTED_DATA_DIR + 'scraped.csv', 'a+', newline='') as file:
    reader = csv.reader(file)
    writer = csv.writer(file)
    
    # Look at the last scraped book to continue from the next one in the dataset
    file.seek(0)
    last_scraped = list(reader)[-1][0]
    
    if last_scraped == 'ID':
        # Nothing was scraped yet, start from the beginning
        start = 0
    else:
        # At least one book was scraped, find the index of the last scraped book and start from the next one
        last_scraped_index = books.index[books['ID'] == last_scraped].tolist()[0]
        start = last_scraped_index + 1
        
    # Stop cleanly after the last book instead of running past the end of the dataset
    for index in range(start, len(books)):
        current_id = books.iloc[index]['ID']
        print(f'ID: {current_id} | Scrape Info: ', end='')

        scraped = scrape_info(current_id)
        
        # If we got detected, switch to a new Tor identity and user agent, then try again
        while scraped[0] == -1:
            tor.reset_identity()
            new_agent = random.choice(USER_AGENTS)
            scraped = scrape_info(current_id, new_agent)
        
        writer.writerow(scraped[1:])
        file.flush()
        print('Success')

ID: 0803737319 | Scrape Info: Success
ID: 1580892620 | Scrape Info: Success
ID: 1479571660 | Scrape Info: Success
ID: 0395963311 | Scrape Info: Success
ID: 0486273466 | Scrape Info: Success
ID: 1908714182 | Scrape Info: Success
ID: 0786855592 | Scrape Info: Success
ID: 0763654833 | Scrape Info: Success
ID: 1509101268 | Scrape Info: Success
ID: 0803732457 | Scrape Info: Success
ID: 0762404094 | Scrape Info: Success
ID: 1564584674 | Scrape Info: Success
ID: 0486283399 | Scrape Info: Success
ID: 3791372408 | Scrape Info: Success
ID: 1931414157 | Scrape Info: Success
ID: 0805063137 | Scrape Info: Success
ID: 1931414084 | Scrape Info: Success
ID: 1410942538 | Scrape Info: Success
ID: 0448487853 | Scrape Info: Success
ID: 0789401975 | Scrape Info: Success
ID: 0763626848 | Scrape Info: Success
ID: 1429622571 | Scrape Info: Success
ID: 0531137848 | Scrape Info: Success
ID: 0531210480 | Scrape Info: Success
ID: 379137043X | Scrape Info: Success
ID: 1883982421 | Scrape Info: Success
ID: 08877635

ID: 0689850417 | Scrape Info: Success
ID: 081091381X | Scrape Info: Success
ID: 0618494170 | Scrape Info: Success
ID: 1416935401 | Scrape Info: Success
ID: 1574219936 | Scrape Info: Success
ID: 006446119X | Scrape Info: Success
ID: 0760347123 | Scrape Info: Success
ID: 1590783468 | Scrape Info: Success
ID: 1597112887 | Scrape Info: Success
ID: 0688154735 | Scrape Info: Success
ID: 0688149715 | Scrape Info: Success
ID: 0439110165 | Scrape Info: Success
ID: 0064462064 | Scrape Info: Success
ID: 0439079470 | Scrape Info: Success
ID: 0763674958 | Scrape Info: Success
ID: 0547727348 | Scrape Info: Success
ID: 0688151655 | Scrape Info: Success
ID: 0064462242 | Scrape Info: Success
ID: 0802795471 | Scrape Info: Success
ID: 1936218135 | Scrape Info: Success
ID: 1563977982 | Scrape Info: Success
ID: 1580178480 | Scrape Info: Success
ID: 1847803520 | Scrape Info: Success
ID: 0142300241 | Scrape Info: Success
ID: 1481401653 | Scrape Info: Success
ID: 0763671541 | Scrape Info: Success
ID: 07660313

ID: 1616894563 | Scrape Info: Success
ID: 0531213242 | Scrape Info: Success
ID: 1423121058 | Scrape Info: Success
ID: 0810994925 | Scrape Info: Success
ID: 0531225399 | Scrape Info: Success
ID: 0805087524 | Scrape Info: Success
ID: 0811862887 | Scrape Info: Success
ID: 161689282X | Scrape Info: Success
ID: 0439680131 | Scrape Info: Success
ID: 0811837661 | Scrape Info: Success
ID: 0316515264 | Scrape Info: Success
ID: 0142414085 | Scrape Info: Success
ID: 0679886079 | Scrape Info: Success
ID: 0761463100 | Scrape Info: Success
ID: 0764155741 | Scrape Info: Success
ID: 1416936556 | Scrape Info: Success
ID: 0764162829 | Scrape Info: Success
ID: 0812065832 | Scrape Info: Success
ID: 068984624X | Scrape Info: Success
ID: 0547558643 | Scrape Info: Success
ID: 0545535832 | Scrape Info: Success
ID: 0764150316 | Scrape Info: Success
ID: 0547907257 | Scrape Info: Success
ID: 0545132916 | Scrape Info: Success
ID: 0307978931 | Scrape Info: Success
ID: 0811858030 | Scrape Info: Success
ID: 07641604

ID: 1600608981 | Scrape Info: Success
ID: 087358872X | Scrape Info: Success
ID: 0061227838 | Scrape Info: Success
ID: 0516445375 | Scrape Info: Success
ID: 0531070956 | Scrape Info: Success
ID: 1570916373 | Scrape Info: Success
ID: 0763662453 | Scrape Info: Success
ID: 0764167618 | Scrape Info: Success
ID: 0516445391 | Scrape Info: Success
ID: 1570917000 | Scrape Info: Success
ID: 0448478927 | Scrape Info: Success
ID: 1419716441 | Scrape Info: Success
ID: 0439269679 | Scrape Info: Success
ID: 1596439734 | Scrape Info: Success
ID: 1569767114 | Scrape Info: Success
ID: 0786814209 | Scrape Info: Success
ID: 053121429X | Scrape Info: Success
ID: 0516265342 | Scrape Info: Success
ID: 0544339223 | Scrape Info: Success
ID: 0062219049 | Scrape Info: Success
ID: 0516264672 | Scrape Info: Success
ID: 0062202081 | Scrape Info: Success
ID: 0375869735 | Scrape Info: Success
ID: 1467702366 | Scrape Info: Success
ID: 1467745472 | Scrape Info: Success
ID: 0516260766 | Scrape Info: Success
ID: 05164453

ID: 069811440X | Scrape Info: Success
ID: 0448401703 | Scrape Info: Success
ID: 0060576138 | Scrape Info: Success
ID: 0375803963 | Scrape Info: Success
ID: 1423104080 | Scrape Info: Success
ID: 0756649447 | Scrape Info: Success
ID: 159643998X | Scrape Info: Success
ID: 0756608341 | Scrape Info: Success
ID: 0545232562 | Scrape Info: Success
ID: 0448484285 | Scrape Info: Success
ID: 0698116461 | Scrape Info: Success
ID: 0448458551 | Scrape Info: Success
ID: 0399240403 | Scrape Info: Success
ID: 0545568838 | Scrape Info: Success
ID: 0756635284 | Scrape Info: Success
ID: 082341177X | Scrape Info: Success
ID: 0545222680 | Scrape Info: Success
ID: 0689869096 | Scrape Info: Success
ID: 0756621119 | Scrape Info: Success
ID: 1442440929 | Scrape Info: Success
ID: 1556526563 | Scrape Info: Success
ID: 0545669111 | Scrape Info: Success
ID: 0789473771 | Scrape Info: Success
ID: 1481451138 | Scrape Info: Success
ID: 0689859228 | Scrape Info: Success
ID: 0544105087 | Scrape Info: Success
ID: 08027379

ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
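
The run above ended when the remote end dropped the connection. One way to survive such transient failures is to retry the call a few times before giving up. The sketch below is not what the notebook does; `with_retries` and the simulated `flaky` function are hypothetical, and the built-in `ConnectionError` is used only to keep the example self-contained:

```python
import time

def with_retries(func, attempts=3, delay=1.0, exceptions=(ConnectionError,)):
    """Call `func` until it succeeds, re-raising after `attempts` failures."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the error
            time.sleep(delay * attempt)  # wait a little longer after each failure

# Simulated request that fails twice before succeeding
calls = {'count': 0}

def flaky():
    calls['count'] += 1
    if calls['count'] < 3:
        raise ConnectionError('simulated drop')
    return 'ok'

result = with_retries(flaky, attempts=5, delay=0)
print(result, calls['count'])
```

In this notebook the wrapped call would look like `with_retries(lambda: scrape_info(current_id), exceptions=(requests.exceptions.ConnectionError,))`, since the exception raised by `requests` is not a subclass of the built-in `ConnectionError`.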

# Rotating IPs with Tor

In [50]:
from torrequest import TorRequest

In [51]:
tor = TorRequest(password='ilovecs401')

In [52]:
response = requests.get('http://ipecho.net/plain')
print("My Original IP Address:", response.text)

My Original IP Address: 185.25.193.110


In [53]:
tor.reset_identity()  # Reset Tor
response = tor.get('http://ipecho.net/plain')
print("New Ip Address", response.text)

New Ip Address 171.25.193.25


In [62]:
# tor.reset_identity()
response = tor.get('https://www.amazon.com/dp/' + example_book['ID'], headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0.1 Safari/605.1.15'})
soup = BeautifulSoup(response.text, 'lxml')
soup.title.string

"Harry Potter and the Sorcerer's Stone: The Illustrated Edition (Harry Potter, Book 1): J.K. Rowling, Jim Kay: 9780545790352: Amazon.com: Books"

# Test Zone

In [36]:
response = requests.get('https://www.amazon.com/dp/' + example_book['ID'], headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36'})

In [39]:
soup = BeautifulSoup(response.text, 'lxml')

In [43]:
soup.title.string == 'Robot Check'

True

In [59]:
tor.reset_identity()

In [63]:
user_agents = [
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0.1 Safari/605.1.15',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36',
    'Mozilla/5.0 (X11; Linux i686; rv:64.0) Gecko/20100101 Firefox/64.0',
    'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0',
    'Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16',
    'Opera/9.80 (Macintosh; Intel Mac OS X 10.14.1) Presto/2.12.388 Version/12.16'
]