In [1]:
import requests
import json

### Query Params for eBay's "Completed Listings" API Endpoint

In [2]:
API_KEY = 'RobertBo-cinemaro-PRD-171ca9e35-af8bdcbb' # Enter your API Key/"App ID" Here. Mine was 40 chars long.

In [3]:
# FILM_CAMS = '15230'
ELEC_GUITARS = '33034' # Category code for electric guitars on eBay.

In [4]:
USA = 'EBAY-US' # USA Marketplace only. API seems to start returning stuff from other markets eventually anyhow

In [5]:
USED = '3000' # Just the code for item condition "used". This is the focus of this project.

In [6]:
ITEM_FILTER_0 = f'itemFilter(0).name=Condition&itemFilter(0).value={USED}' # Only used guitars
ITEM_FILTER_1 = f'itemFilter(1).name=HideDuplicateItems&itemFilter(1).value=true' # No duplicate listings
ITEM_FILTER_2 = f'itemFilter(2).name=MinPrice&itemFilter(2).value=249' # Only items that sold for > this value
ITEM_FILTER_3 = f'itemFilter(3).name=MaxQuantity&itemFilter(3).value=1' # No lots or batch sales. One axe at a time
ITEM_FILTER_4 = f'itemFilter(4).name=SoldItemsOnly&itemFilter(4).value=true' # Only looking at listings that sold.
ITEM_FILTER_5 = f'itemFilter(5).name=ListingType&itemFilter(5).value=Auction' # Only looking at auctions.
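
The hand-numbered filters above are easy to get out of sync when one is added or removed. A minimal sketch of generating the same indexed parameters from a list of pairs follows; `build_item_filters` is a hypothetical helper, not part of the eBay API or of this notebook.

```python
from urllib.parse import urlencode

def build_item_filters(filters):
    """Turn [(name, value), ...] into indexed itemFilter query parameters."""
    params = []
    for i, (name, value) in enumerate(filters):
        params.append((f'itemFilter({i}).name', name))
        params.append((f'itemFilter({i}).value', value))
    # safe='()' keeps the parentheses literal instead of percent-encoding them
    return urlencode(params, safe='()')

# Same names and values as the first three filters defined above
filters = [('Condition', '3000'), ('HideDuplicateItems', 'true'), ('MinPrice', '249')]
query = build_item_filters(filters)
print(query)
```

The resulting string could replace the separate `ITEM_FILTER_n` lines in the request URL.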

In [7]:
FIND_COMPLETED = 'findCompletedItems' # This is the eBay API endpoint service we'll be querying.

### Actual Query Function

Here we go:

In [8]:
def find_completed(PAGE):
    '''Request one page of completed listings from the eBay API and return its list of item dicts'''
    r = requests.get(f'https://svcs.ebay.com/services/search/FindingService/v1?'
                 f'OPERATION-NAME={FIND_COMPLETED}&'
                 f'X-EBAY-SOA-SECURITY-APPNAME={API_KEY}&'
                 f'RESPONSE-DATA-FORMAT=JSON&' # This value can be altered if you're not into JSON responses
                 f'REST-PAYLOAD&'
                 f'GLOBAL-ID={USA}&' # seems to prioritize the value you enter but returns other stuff too
                 f'categoryId={ELEC_GUITARS}&' # Product category goes here
                 f'descriptionSearch=true&' # Asks eBay to search descriptions as well as titles (not a verbosity setting)
                 f'{ITEM_FILTER_0}&'
                 f'{ITEM_FILTER_1}&'
                 f'{ITEM_FILTER_2}&' # Filters defined previously
                 f'{ITEM_FILTER_3}&'
                 f'{ITEM_FILTER_4}&'
                 f'{ITEM_FILTER_5}&'
                 f'paginationInput.pageNumber={str(PAGE)}&' # value to be looped through when collecting lotsa data
                 f'outputSelector=PictureURLLarge') # Why not grab the thumbnail URLs too. Could be cool
    return r.json()['findCompletedItemsResponse'][0]['searchResult'][0]['item']
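
Since the page number is the only thing that changes between requests, collecting several pages is a short loop. The sketch below is one way to do it under two assumptions: that a page with no results lacks the `'item'` key (so the lookup raises), and that a short pause between requests is polite enough. `collect_pages` is a hypothetical helper; it takes the fetch function as an argument so it can be tried without touching the network.

```python
import time

def collect_pages(fetch_page, n_pages, pause_seconds=1.0):
    """Call fetch_page(1..n_pages) and return all items in one flat list."""
    items = []
    for page in range(1, n_pages + 1):
        try:
            items.extend(fetch_page(page))
        except (KeyError, IndexError):
            # Assumption: an empty page has no 'item' key, so stop here
            break
        time.sleep(pause_seconds)
    return items

# Stand-in for find_completed: two pages of one item each, then nothing
def fake_fetch(page):
    if page > 2:
        raise KeyError('item')
    return [{'itemId': [str(page)]}]

demo_items = collect_pages(fake_fetch, 5, pause_seconds=0)
```

With the real function this would be called as `collect_pages(find_completed, 10)`.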

In [9]:
page_1 = find_completed(1)

In [10]:
axe_1 = page_1[0] # Inspecting the first item of response
axe_1

{'itemId': ['254093291269'],
 'title': ['Gibson Les Paul Studio with Gibson Hard Shell Case'],
 'globalId': ['EBAY-US'],
 'primaryCategory': [{'categoryId': ['33034'],
   'categoryName': ['Electric Guitars']}],
 'galleryURL': ['http://thumbs2.ebaystatic.com/m/mfR4RvdXH4OaV9ag0mQyo9Q/140.jpg'],
 'viewItemURL': ['http://www.ebay.com/itm/Gibson-Les-Paul-Studio-Gibson-Hard-Shell-Case-/254093291269'],
 'productId': [{'@type': 'ReferenceID', '__value__': '96926466'}],
 'paymentMethod': ['PayPal'],
 'autoPay': ['false'],
 'postalCode': ['72034'],
 'location': ['Conway,AR,USA'],
 'country': ['US'],
 'shippingInfo': [{'shippingServiceCost': [{'@currencyId': 'USD',
     '__value__': '25.0'}],
   'shippingType': ['Flat'],
   'shipToLocations': ['Worldwide'],
   'expeditedShipping': ['true'],
   'oneDayShippingAvailable': ['false'],
   'handlingTime': ['3']}],
 'sellingStatus': [{'currentPrice': [{'@currencyId': 'USD',
     '__value__': '700.0'}],
   'convertedCurrentPrice': [{'@currencyId': 'USD'

In [11]:
id_1 = axe_1['itemId'][0]
id_1 # We can pass each item's ID to another endpoint for more granular info

'254093291269'

### Which of these fields might be useful for a regression project?

* itemId, for passing to the GetSingleItem API


* NLP: Title
* NLP: SubTitle


* CNN: Thumbnail images


* scalar: Shipping Cost
* scalar: handlingTime
* scalar: Duration (derive from start/end cols)
* scalar: __convertedCurrentPrice__ - THIS IS THE DEPENDENT VAR (the regression target)


* 1-hot: Payment method
* 1-hot: Seller Country
* 1-hot: ShippingType
* 1-hot: shipToLocations
* 1-hot: expeditedShipping
* 1-hot: oneDayShippingAvailable
* 1-hot: bestOfferEnabled
* 1-hot: Day of week (derive from start/end cols)
* 1-hot: Returns Accepted
* 1-hot: isMultiVariation (? What is that)
* 1-hot: topRatedListing (? what is that)
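
Every value in the response is wrapped in a one-element list, so the fields above need unwrapping before they can go into a table. Here is a minimal sketch using only keys visible in the sample output; `first` and `flatten_listing` are hypothetical helpers. It reads `currentPrice` because the `convertedCurrentPrice` entry is cut off in the output shown, and it assumes both have the same shape.

```python
def first(container, key, default=None):
    """Return the first element of container[key], or default if absent or empty."""
    values = container.get(key) or [default]
    return values[0]

def flatten_listing(item):
    """Pull a few scalar and categorical fields out of one listing dict."""
    shipping = first(item, 'shippingInfo', {})
    selling = first(item, 'sellingStatus', {})
    ship_cost = first(shipping, 'shippingServiceCost', {})
    price = first(selling, 'currentPrice', {})
    return {
        'item_id': first(item, 'itemId'),
        'title': first(item, 'title'),
        'shipping_cost': float(ship_cost.get('__value__', 'nan')),
        'handling_time': int(first(shipping, 'handlingTime', 0)),
        'shipping_type': first(shipping, 'shippingType'),
        'price': float(price.get('__value__', 'nan')),
    }

# Trimmed copy of the listing shown in the output above
sample = {
    'itemId': ['254093291269'],
    'title': ['Gibson Les Paul Studio with Gibson Hard Shell Case'],
    'shippingInfo': [{'shippingServiceCost': [{'@currencyId': 'USD', '__value__': '25.0'}],
                      'shippingType': ['Flat'],
                      'handlingTime': ['3']}],
    'sellingStatus': [{'currentPrice': [{'@currencyId': 'USD', '__value__': '700.0'}]}],
}
row = flatten_listing(sample)
```

Missing shipping or price data comes back as `nan` rather than raising, which keeps a long loop from dying on one odd listing.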

***

### What can we only get from the single-item API (below)?

* NLP: Description text
* NLP: Detailed __Condition__ Description


* Scalar: Length of description text (non-html tags)
* Scalar: Number of images on eBay normal
* Scalar: Year of manufacture
* Scalar: Seller.FeedbackScore
* Scalar: Seller.PositiveFeedbackPercent
* Scalar: ReturnsWithin number of days
* Scalar: len(ExcludeShipToLocation) aka how many countries excluded?


* 1-hot (or NLP?): Brand
* 1-hot (or NLP?): Model
* 1-hot: Right/left handed
* 1-hot: Body color
* 1-hot: Body material
* 1-hot: String Config (6-string, etc)
* 1-hot: Body type (Solid / hollow, etc)
* 1-hot: Soundboard Style (arch / flat top etc)
* 1-hot: Country of manufacture
* 1-hot: Case
* 1-hot: NewBestOffer
* 1-hot: AutoPay

### Item-Specific Queries, for More Features

Beware, you only get 5000 queries a day, so don't loop on this too hard until you're persisting data.
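
One way to stretch that daily quota is to never ask twice for the same item. The sketch below checks the disk before calling the API; `get_specs_cached` is a hypothetical wrapper, and the `axe_specs/axe_<id>.json` layout mirrors the one used when persisting specs further down. The fetch function is passed in so the idea can be tested offline.

```python
import json
import os

def get_specs_cached(axe_id, fetch, out_dir='axe_specs'):
    """Return specs for axe_id, calling fetch only if no saved copy exists."""
    path = os.path.join(out_dir, f'axe_{axe_id}.json')
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)  # already persisted, no API call spent
    spec = fetch(axe_id)
    os.makedirs(out_dir, exist_ok=True)
    with open(path, 'w') as f:
        json.dump(spec, f)
    return spec
```

With the function defined below, a call would look like `get_specs_cached(id_1, get_specs)`.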

In [43]:
def get_specs(AXE_ID):
    '''Return the specifics of a single eBay auction. String input.'''
    r2 = requests.get('http://open.api.ebay.com/shopping?'
                    f'callname=GetSingleItem&'
                    f'responseencoding=JSON&'
                    f'appid={API_KEY}&'
                    f'siteid=0&' # USA Store
                    f'version=967&' # Shopping API version this request targets
                    f'ItemID={AXE_ID}&' # Item ID passed in by the caller
                    f'IncludeSelector=Details,ItemSpecifics,TextDescription')
    return r2.json()['Item']

Seems like there's some variability between items in which item specifics fields are present.
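
Because the set of specifics varies, it helps to flatten them into a plain dict and look fields up with a default. This sketch assumes the specifics arrive as `Item['ItemSpecifics']['NameValueList']`, a list of `{'Name': ..., 'Value': [...]}` entries; that shape is an assumption here and worth checking against a real response. `specifics_to_dict` is a hypothetical helper.

```python
def specifics_to_dict(item):
    """Flatten the (assumed) ItemSpecifics name/value list into {name: 'v1, v2'}."""
    pairs = item.get('ItemSpecifics', {}).get('NameValueList', [])
    if isinstance(pairs, dict):  # tolerate a single entry that is not wrapped in a list
        pairs = [pairs]
    flat = {}
    for pair in pairs:
        values = pair.get('Value', [])
        if isinstance(values, str):
            values = [values]
        flat[pair.get('Name')] = ', '.join(values)
    return flat

# Made-up example data in the assumed shape
demo_item = {'ItemSpecifics': {'NameValueList': [
    {'Name': 'Brand', 'Value': ['Gibson']},
    {'Name': 'Body Color', 'Value': ['Black', 'Gold']},
]}}
specs = specifics_to_dict(demo_item)

# Fields that are absent for this item simply come back as None
wanted = ['Brand', 'Body Color', 'Dexterity']
demo_row = {name: specs.get(name) for name in wanted}
```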

### Persisting some Data for Analysis

Just write the first page of listing results to .json files, one per listing:

In [27]:
page_1

[{'itemId': ['254093291269'],
  'title': ['Gibson Les Paul Studio with Gibson Hard Shell Case'],
  'globalId': ['EBAY-US'],
  'primaryCategory': [{'categoryId': ['33034'],
    'categoryName': ['Electric Guitars']}],
  'galleryURL': ['http://thumbs2.ebaystatic.com/m/mfR4RvdXH4OaV9ag0mQyo9Q/140.jpg'],
  'viewItemURL': ['http://www.ebay.com/itm/Gibson-Les-Paul-Studio-Gibson-Hard-Shell-Case-/254093291269'],
  'productId': [{'@type': 'ReferenceID', '__value__': '96926466'}],
  'paymentMethod': ['PayPal'],
  'autoPay': ['false'],
  'postalCode': ['72034'],
  'location': ['Conway,AR,USA'],
  'country': ['US'],
  'shippingInfo': [{'shippingServiceCost': [{'@currencyId': 'USD',
      '__value__': '25.0'}],
    'shippingType': ['Flat'],
    'shipToLocations': ['Worldwide'],
    'expeditedShipping': ['true'],
    'oneDayShippingAvailable': ['false'],
    'handlingTime': ['3']}],
  'sellingStatus': [{'currentPrice': [{'@currencyId': 'USD',
      '__value__': '700.0'}],
    'convertedCurrentPrice':

In [37]:
def persist_listings_to_json(PAGE):
    '''Saves a page of JSON responses to one json per axe'''
    for item in PAGE:
        with open("axe_listings/axe_%s.json" % item['itemId'][0], 'w') as f:  # one file per item ID
            json.dump(item, f)

In [38]:
persist_listings_to_json(page_1)

Now write one page of details, one JSON file per item:

In [56]:
axe_ids_1 = [item['itemId'][0] for item in page_1]

In [57]:
def persist_spec_to_json(spec):
    '''Writes the specs of a single axe to its own json'''
    with open("axe_specs/axe_%s.json" % (spec['ItemID']), 'w') as f:  # writing JSON object
        json.dump(spec, f)

Okay, careful, this is where we start to hammer the eBay API a little bit.
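
A gentler way to run that loop is to cap the number of calls and pause between them. The following is a minimal sketch; `fetch_all_specs`, `max_calls` and `pause_seconds` are hypothetical names, and the defaults are arbitrary rather than anything eBay recommends.

```python
import time

def fetch_all_specs(axe_ids, fetch, save, max_calls=100, pause_seconds=0.5):
    """Fetch and save specs for each ID, stopping after max_calls requests."""
    done = 0
    for axe_id in axe_ids:
        if done >= max_calls:
            break  # stay under the self-imposed budget
        save(fetch(axe_id))
        done += 1
        time.sleep(pause_seconds)
    return done

# Offline demo: a fake fetch and an in-memory "save"
saved = []
n_done = fetch_all_specs(['a', 'b', 'c'], lambda x: {'ItemID': x}, saved.append,
                         max_calls=2, pause_seconds=0)
```

With the functions from this notebook the call would be `fetch_all_specs(axe_ids_1, get_specs, persist_spec_to_json)`.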

In [59]:
# A generator expression in parentheses is lazy and would never run; use a plain loop:
# for axe in axe_ids_1:
#     persist_spec_to_json(get_specs(axe))