# Watch Dealer
### Programmers: Tom McKenzie, Eddy Nassif
### Class: Data Mining, Spring 2019
### Final Project



# Getting Started
Our project aims to help consumers find the best deals on a given product based on its price history and trend over the past 90 days. We will be focusing on Rolex wristwatches, and will be using a variety of classification methods to help forecast pricing and ultimately decide whether the price of a current listing is a good or bad deal. The dataset will be created by using Ebay’s API to get data on previously sold listings.

## The Data
We will be acquiring our data using Ebay's API. This API is straightforward to use and makes grabbing previous sales easy! However, no dataset is perfect, and we had to go through hefty pre-processing to get the data ready. There were three steps to our data collection: Gathering, Cleaning, and Classifying.

#### Gathering the data:
As mentioned above, we gathered the data using Ebay's API. [Here](https://developer.ebay.com/docs) is a link to the documentation if you want to investigate further.


Below is the function we used to send a request to Ebay's API. As you can see, we pass in some keywords, sort the results from newest to oldest, ask for 100 entries per page, and give it a page number along with a minimum and maximum price for previously sold listings. We use the minimum and maximum price as a screen for watch parts and accessories. By setting the minimum price at a level that no Rolex will go under, we save a lot of time in cleaning.

In [7]:
def response(Keywords, pageNum, minPrice, maxPrice, api):
    """Request one page of completed, sold listings from Ebay's Finding API."""
    result = api.execute('findCompletedItems', {
        'keywords': Keywords,
        'sortOrder': 'EndTimeLatest',  # newest sales first
        'paginationInput': {'entriesPerPage': '100',
                            'pageNumber': pageNum},
        'itemFilter': [
            # {'name': 'Condition', 'value': condition},
            {'name': 'SoldItemsOnly', 'value': True},
            # The price bounds screen out watch parts and accessories
            {'name': 'MinPrice', 'value': minPrice},
            {'name': 'MaxPrice', 'value': maxPrice}
        ]
    })
    return result
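If you want to inspect what gets sent before spending an API call, one option is to build the request dictionary separately from sending it. The sketch below is only an illustration: `build_find_request` and its `per_page` argument are hypothetical names that are not part of our notebook or of ebaysdk, and the sanity checks are our own assumption about what is worth catching early.

```python
def build_find_request(keywords, page_num, min_price, max_price, per_page=100):
    """Build the findCompletedItems payload as a plain dict (hypothetical helper)."""
    if min_price > max_price:
        raise ValueError("min_price must not exceed max_price")
    if page_num < 1:
        raise ValueError("page_num starts at 1")
    return {
        'keywords': keywords,
        'sortOrder': 'EndTimeLatest',  # newest sales first
        'paginationInput': {'entriesPerPage': str(per_page),
                            'pageNumber': page_num},
        'itemFilter': [
            {'name': 'SoldItemsOnly', 'value': True},
            # Price bounds act as the screen for parts and accessories
            {'name': 'MinPrice', 'value': min_price},
            {'name': 'MaxPrice', 'value': max_price},
        ],
    }


payload = build_find_request("Rolex Wristwatch", 1, 3000, 12000)
print(payload['itemFilter'])
# Sending it would then look like: api.execute('findCompletedItems', payload)
```

Keeping the payload as a plain dictionary means you can print it or check it without a network connection or an API key.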

As with most APIs, you need an API key. The two of us share one key, and we include it here for demonstration. If you would like to try this project for yourself, please register for your own API key.

In [22]:
from ebaysdk.finding import Connection as finding
api = finding(appid='EddyNass-Scraper-PRD-651ca6568-7ae32d61', config_file=None)
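If you do register for your own key, you may prefer to keep it out of the notebook itself. Here is a minimal sketch of one way to do that, assuming the key is stored in an environment variable. The variable name `EBAY_APP_ID` and the helper `load_app_id` are our own choices for this example, not something ebaysdk defines.

```python
import os


def load_app_id(env=None, var_name="EBAY_APP_ID"):
    """Return the App ID stored in an environment variable.

    EBAY_APP_ID is a name picked for this sketch; use whatever name you like.
    """
    env = os.environ if env is None else env
    app_id = env.get(var_name, "").strip()
    if not app_id:
        raise RuntimeError(f"Set {var_name} to your own App ID first.")
    return app_id


# Usage with the connection shown above (requires ebaysdk to be installed):
# api = finding(appid=load_app_id(), config_file=None)
```

Passing a dictionary as `env` lets you try the helper without touching your real environment.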

Now that the API connection is set up, we can send a request! Let's gather all the items on the first four pages, up to 400 items in total.

In [21]:
from bs4 import BeautifulSoup

Keywords = "Rolex Wristwatch"
minPrice = 3000
maxPrice = 12000
# Collect all items from Ebay on pages 1 through 4 (up to 100 items per page)
items = []
for pageNum in range(1, 5):
    soup = BeautifulSoup(response(Keywords, pageNum, minPrice, maxPrice, api).content, 'lxml')
    items += soup.find_all('item')
items[:1]

[<item><itemid>254217635900</itemid><title>2005 Unique Rolex Explorer 114270 Black PVD / DLC</title><globalid>EBAY-US</globalid><primarycategory><categoryid>31387</categoryid><categoryname>Wristwatches</categoryname></primarycategory><galleryurl>http://thumbs1.ebaystatic.com/m/mpdOrwj6IVbG2M_ae6sI2cg/140.jpg</galleryurl><viewitemurl>http://www.ebay.com/itm/2005-Unique-Rolex-Explorer-114270-Black-PVD-DLC-/254217635900</viewitemurl><paymentmethod>PayPal</paymentmethod><autopay>false</autopay><postalcode>33132</postalcode><location>Miami,FL,USA</location><country>US</country><shippinginfo><shippingservicecost currencyid="USD">65.0</shippingservicecost><shippingtype>Flat</shippingtype><shiptolocations>Worldwide</shiptolocations><expeditedshipping>true</expeditedshipping><onedayshippingavailable>false</onedayshippingavailable><handlingtime>2</handlingtime></shippinginfo><sellingstatus><currentprice currencyid="USD">4050.0</currentprice><convertedcurrentprice currencyid="USD">4050.0</convert

Ew! The result doesn't look too pretty! We even had to shorten it significantly so it didn't take up the whole notebook! What's happening here is that "items" is collecting all of the information that Ebay is sending us, but we can parse it out using BeautifulSoup. To get the necessary 