# How `trademe` searches for listings

(If you just want code examples, scroll to the bottom of the page.)

## search() and make_url()

From a user's point of view, you need to:
1. Specify your search criteria - this is done with `make_url()`.
2. Search - this is done with, uh, `search()`.

In fact, `search()` and `make_url()` are all you need to use `trademe` in Python.

`make_url()` returns a URL string that `search()` treats as the first page of a given search result. All `search()` knows how to do is paginate and process results; all `make_url()` does is stitch a bunch of criteria together into a URL that TradeMe won't reject.

Let's look at each function:

In [1]:
from trademe.search import search, make_url

In [2]:
help(search)

Help on function search in module trademe.search:

search(timeout=None, driver_arguments=['--headless=new', '--start-maximized'], *urls)
    Searches TradeMe using URLs. 
    
    For each URL, search() paginates until it can't find any more listings, 
    then returns.
    
    Note: search() uses a Chrome webdriver, so it's recommended you have 
    the relevant Chrome drivers downloaded in advance.
    
    Args:
        timeout: The implicit wait used (in seconds) for the Selenium webdriver
            under the hood.
        driver_arguments: The arguments set for the webdriver.
        *urls: URL strings to be treated as the first page of a set of search
            results, which search() will paginate over.
    
    Returns:
        A list of Listing objects.



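The most basic way of using `search()` is to pass in one or more URLs, `*args`-style. Note that, going by the signature above, `timeout` and `driver_arguments` come *before* `*urls`, so both have to be supplied positionally ahead of the URLs, e.g. `search(None, [], url)`.
`timeout` sets the webdriver's implicit wait (in seconds), and `driver_arguments` sets the driver settings.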
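
To see how arguments get bound with a signature shaped like this, here is a small sketch. `search_stub` is a stand-in written for this example (it is not `trademe`'s `search()` and opens no browser); it only copies the parameter order from the help text and reports what ended up where. The URL is a placeholder.

```python
# Stand-in with the same parameter order as the documented signature.
# It is NOT trademe's search(); it only reports how arguments get bound.
def search_stub(timeout=None, driver_arguments=("--headless=new",), *urls):
    return {
        "timeout": timeout,
        "driver_arguments": list(driver_arguments),
        "urls": urls,
    }

url = "https://example.invalid/first-page"  # placeholder URL

# Passing only a URL binds it to `timeout`, leaving `urls` empty:
wrong = search_stub(url)

# Supplying the first two parameters explicitly puts the URL where it belongs:
right = search_stub(None, ["--headless=new"], url)

print(wrong["urls"])  # ()
print(right["urls"])  # ('https://example.invalid/first-page',)
```

This is why the examples further down call `search(None, [], url)` rather than `search(url)`.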

`search()` returns a list of Listing objects. The Listing class holds the title, address, price, features, link, availability, parking, agent, and agency of a listing.
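
As a rough picture of what comes back, here is a sketch of a dataclass with the same nine field names. This is a hypothetical stand-in, not the library's actual `Listing` definition: the types and defaults are guesses, and the two listings are made-up sample data.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for trademe's Listing: the field names come from the
# description above, but the types and defaults are assumptions.
@dataclass
class Listing:
    title: Optional[str] = None
    address: Optional[str] = None
    price: Optional[str] = None
    features: Optional[str] = None
    link: Optional[str] = None
    availability: Optional[str] = None
    parking: Optional[str] = None
    agent: Optional[str] = None
    agency: Optional[str] = None

# Made-up sample data, standing in for what search() would return:
listings = [
    Listing(title="Example Flat A", price="$520 per week",
            availability="Available: Now"),
    Listing(title="Example Flat B", price="$850 per week",
            availability="Available: Fri, 29 Sep"),
]

# Fields are plain attributes, so filtering is ordinary Python:
available_now = [l for l in listings if l.availability == "Available: Now"]
print([l.title for l in available_now])  # ['Example Flat A']
```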

In [3]:
help(make_url)

Help on function make_url in module trademe.search:

make_url(sale_or_rent: str, region: str = '', district: str = '', suburb: str = '', **kwargs)
    Make URL for search() from search criteria.
    
    Note there is no data validation for region, district, or suburb names, or
    for kwargs.
    
    Also note that two-word locations should be spelled with dashes (not
    spaces), e.g. instead of suburb="Aro Valley", do suburb="aro-valley".
    Capitalisation doesn't matter.
    
    Valid kwargs:
    For rent or sale searches:
        *Integers*:
        - bathrooms_min
        - bathrooms_max
        - bedrooms_min
        - bedrooms_max
        - price_min
        - price_max
        *Other*
        - property_type; can be apartment, carpark, house, townhouse, unit
        - adjacent_suburbs; true/false
        - search_string; can be any string, e.g. "Comprende"
    For rent searches only:
        - available_now; true/false
        - pets_ok; true/false
    For sale searches onl

In short, the arguments should speak for themselves. `make_url()` does a good job of checking that the right arguments are present in the right combinations, but it won't spell-check anything.

Note that if you *do* misspell something, `search()` won't be able to find any listings, and will return an empty list.
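
Since nothing is spell-checked, it can help to normalise location names yourself before building a URL. A minimal sketch, assuming you only need lower-casing and dashes between words; `to_slug` is a hypothetical helper for this example, not part of `trademe`:

```python
def to_slug(name: str) -> str:
    """Lower-case a place name and join its words with dashes."""
    return "-".join(name.lower().split())

print(to_slug("Aro Valley"))             # aro-valley
print(to_slug("  Wellington Central "))  # wellington-central
```

The result could then be passed as the `suburb` argument, alongside whatever region and district `make_url()` expects with it. This only fixes spacing and case; a genuinely misspelled name will still produce an empty list from `search()`.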

## Threading

It's entirely possible to use threading to run multiple URLs in parallel. Note that, while `search()` *can* take multiple URLs, it does *not* do the threading for you - if you give it lots of URLs, they'll be handled sequentially, not concurrently.

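Because `search()` is I/O-bound - it spends most of its time waiting on the Chrome webdriver rather than running Python code - the GIL is not a bottleneck here: Python releases the GIL during blocking I/O, so several searches can make progress at once on separate threads.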

**Threading is a *great* idea for big searches!** E.g. if you want to search for listings in a handful of suburbs in Wellington, it might be a good idea to specify them all separately, and take advantage of multiple threads to run them concurrently.

### Single-threaded example

In [4]:
%%time
# Generate URL:
url = make_url("rent", search_string="Comprende")

# Run search:
listings = search(None, [], url)

CPU times: user 436 ms, sys: 54.4 ms, total: 490 ms
Wall time: 13.9 s


In [7]:
listings  # Comprende properties for rent, as at 18 September:

[Listing(title='503/8 Wigan Street, Wellington Central, Wellington', address=None, price='$520 per week', features='1 bedrooms. 1 bathrooms.', link='https://www.trademe.co.nz/a/property/residential/rent/wellington/wellington/wellington-central/listing/4330212692?rsqid=bd0b51044a374bc79b9947f4e84ddb25-001', availability='Available: Fri, 22 Sep', parking='n/a', agent=' Rebekah Joyce ', agency='Comprende Ltd'),
 Listing(title='65 Roseneath Terrace (upper), Roseneath, Wellington', address=None, price='$850 per week', features='4 bedrooms. 1 bathrooms.', link='https://www.trademe.co.nz/a/property/residential/rent/wellington/wellington/roseneath/listing/4330206937?rsqid=bd0b51044a374bc79b9947f4e84ddb25-001', availability='Available: Fri, 29 Sep', parking='n/a', agent=' Rebekah Joyce ', agency='Comprende Ltd'),
 Listing(title='12D/126 The Terrace, Wellington Central, Wellington', address=None, price='$490 per week', features='1 bedrooms. 1 bathrooms.', link='https://www.trademe.co.nz/a/proper

In [6]:
listings[6]

Listing(title='8 Koru Loop, Paraparaumu, Kapiti Coast', address=None, price='$730 per week', features='3 bedrooms. 1 bathrooms.', link='https://www.trademe.co.nz/a/property/residential/rent/wellington/kapiti-coast/paraparaumu/listing/4320859586?rsqid=bd0b51044a374bc79b9947f4e84ddb25-001', availability='Available: Now', parking='n/a', agent=' Rebekah Joyce ', agency='Comprende Ltd')

### Multi-threaded example

I'm a threading noob, but hopefully this conveys the general idea: running `search()` on more threads means less runtime.

In [6]:
import threading

In [7]:
%%time
# Generate URLs: using the same URL as single-thread example for fair runtime 
# comparison
_url = make_url("rent", search_string="Comprende")
urls = _url, _url, _url

# Run search: 
threads = []
for url in urls:  # Will open three threads
    t = threading.Thread(target=search, args=(None, [], url))
    t.start()
    threads.append(t)

for t in threads:
    t.join()

CPU times: user 1.36 s, sys: 129 ms, total: 1.49 s
Wall time: 37.3 s


Note that this took about 37 seconds, which is less than you would expect if you ran these URLs sequentially. (The single-threaded example took ~14s, so three sequential runs would take roughly 3*14=~42s with a single thread.)
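
One catch with the loop above: `threading.Thread` ignores whatever its target returns, so the listings found by each thread are thrown away. A sketch of one way to keep them, using `concurrent.futures.ThreadPoolExecutor` from the standard library. `search_all` is a hypothetical helper, and `fake_search` is a stand-in so the example runs without a browser; in real use you would pass `trademe`'s `search` instead.

```python
from concurrent.futures import ThreadPoolExecutor

def search_all(search_fn, urls, timeout=None, driver_arguments=()):
    """Run search_fn once per URL on its own thread; return one combined list."""
    def run_one(url):
        # Mirrors the call shape used above: search(timeout, driver_arguments, url)
        return search_fn(timeout, list(driver_arguments), url)

    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        per_url = list(pool.map(run_one, urls))  # map() preserves input order

    # Flatten the per-URL lists into a single list of results
    return [item for results in per_url for item in results]

# Stand-in for trademe's search(), so the sketch runs without Chrome:
def fake_search(timeout, driver_arguments, *urls):
    return [f"listing from {u}" for u in urls]

combined = search_all(fake_search, ["url-a", "url-b", "url-c"])
print(combined)
# ['listing from url-a', 'listing from url-b', 'listing from url-c']
```

One thread per URL matches the loop above; for a long list of URLs you would probably want to cap `max_workers`, since each search opens its own browser.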

## Turning your list of Listings into a DataFrame/CSV

In [8]:
import pandas as pd

# Converts dataclass instances (each Listing object is a dataclass) to dict:
from dataclasses import asdict  

In [9]:
# To get a DataFrame, we'll want to convert our list[Listing] into list[dict]

# Using `listings` from above, the result of searching for "rent" and 
# search_string="Comprende":
list_of_dicts = [asdict(l) for l in listings]

In [10]:
listings[0]  # Listing object

Listing(title='503/8 Wigan Street, Wellington Central, Wellington', address=None, price='$520 per week', features='1 bedrooms. 1 bathrooms.', link='https://www.trademe.co.nz/a/property/residential/rent/wellington/wellington/wellington-central/listing/4330212692?rsqid=ad4dc34d1ff64e23863bf92dfc888afe-001', availability='Available: Fri, 22 Sep', parking='n/a', agent=' Rebekah Joyce ', agency='Comprende Ltd')

In [11]:
list_of_dicts  # list[dict] is much easier to convert to a Pandas DataFrame

[{'title': '503/8 Wigan Street, Wellington Central, Wellington',
  'address': None,
  'price': '$520 per week',
  'features': '1 bedrooms. 1 bathrooms.',
  'link': 'https://www.trademe.co.nz/a/property/residential/rent/wellington/wellington/wellington-central/listing/4330212692?rsqid=ad4dc34d1ff64e23863bf92dfc888afe-001',
  'availability': 'Available: Fri, 22 Sep',
  'parking': 'n/a',
  'agent': ' Rebekah Joyce ',
  'agency': 'Comprende Ltd'},
 {'title': '65 Roseneath Terrace (upper), Roseneath, Wellington',
  'address': None,
  'price': '$850 per week',
  'features': '4 bedrooms. 1 bathrooms.',
  'link': 'https://www.trademe.co.nz/a/property/residential/rent/wellington/wellington/roseneath/listing/4330206937?rsqid=ad4dc34d1ff64e23863bf92dfc888afe-001',
  'availability': 'Available: Fri, 29 Sep',
  'parking': 'n/a',
  'agent': ' Rebekah Joyce ',
  'agency': 'Comprende Ltd'},
 {'title': '12D/126 The Terrace, Wellington Central, Wellington',
  'address': None,
  'price': '$490 per week',

In [12]:
# Make it a DataFrame:
df_listings = pd.DataFrame(list_of_dicts)

In [13]:
df_listings 

| | title | address | price | features | link | availability | parking | agent | agency |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 503/8 Wigan Street, Wellington Central, Wellin... | | $520 per week | 1 bedrooms. 1 bathrooms. | https://www.trademe.co.nz/a/property/residenti... | Available: Fri, 22 Sep | | Rebekah Joyce | Comprende Ltd |
| 1 | 65 Roseneath Terrace (upper), Roseneath, Welli... | | $850 per week | 4 bedrooms. 1 bathrooms. | https://www.trademe.co.nz/a/property/residenti... | Available: Fri, 29 Sep | | Rebekah Joyce | Comprende Ltd |
| 2 | 12D/126 The Terrace, Wellington Central, Welli... | | $490 per week | 1 bedrooms. 1 bathrooms. | https://www.trademe.co.nz/a/property/residenti... | Available: Fri, 20 Oct | | Rebekah Joyce | Comprende Ltd |
| 3 | 1602A/111 Dixon Street, Wellington Central, We... | | $440 per week | 1 bedrooms. 1 bathrooms. | https://www.trademe.co.nz/a/property/residenti... | Available: Fri, 13 Oct | | Rebekah Joyce | Comprende Ltd |
| 4 | 207/169 The Terrace, Wellington Central, Welli... | | $330 per week | 1 bedrooms. 1 bathrooms. | https://www.trademe.co.nz/a/property/residenti... | Available: Now | | Rebekah Joyce | Comprende Ltd |
| 5 | 504/169 The Terrace, Wellington Central, Welli... | | $440 per week | 1 bedrooms. 1 bathrooms. | https://www.trademe.co.nz/a/property/residenti... | Available: Fri, 29 Sep | | Rebekah Joyce | Comprende Ltd |
| 6 | 8 Koru Loop, Paraparaumu, Kapiti Coast | | $730 per week | 3 bedrooms. 1 bathrooms. | https://www.trademe.co.nz/a/property/residenti... | Available: Now | | Rebekah Joyce | Comprende Ltd |
| 7 | 5B/49 Manners Street, Wellington Central, Well... | | $410 per week | 1 bedrooms. 1 bathrooms. | https://www.trademe.co.nz/a/property/residenti... | Available: Now | | Rebekah Joyce\n | Comprende Ltd |
| 8 | 702/74 Taranaki Street, Wellington Central, We... | | $650 per week | 2 bedrooms. 2 bathrooms. | https://www.trademe.co.nz/a/property/residenti... | Available: Fri, 6 Oct | | Rebekah Joyce\n | Comprende Ltd |
| 9 | 6B/6 Ferry Road, Days Bay, Lower Hutt | | $800 per week | 3 bedrooms. 2 bathrooms. | https://www.trademe.co.nz/a/property/residenti... | Available: Now | | Rebekah Joyce\n | Comprende Ltd |


In [14]:
# Make it a CSV:
df_listings.to_csv("18_Sep_2023_Results.csv")
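
If you'd rather skip pandas, the standard library can write the CSV directly from the dataclass instances. A minimal sketch, assuming every row is an instance of the same dataclass and the list isn't empty; `MiniListing` is a hypothetical two-field stand-in for `Listing` (with made-up data) to keep the example short, and `write_csv` is a helper written for this example:

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional

# Hypothetical two-field stand-in for Listing, to keep the sketch short.
@dataclass
class MiniListing:
    title: Optional[str] = None
    agent: Optional[str] = None

def write_csv(rows, out):
    """Write dataclass instances to `out` as CSV, trimming stray whitespace."""
    names = [f.name for f in fields(rows[0])]  # assumes at least one row
    writer = csv.DictWriter(out, fieldnames=names)
    writer.writeheader()
    for row in rows:
        record = asdict(row)
        # Strip padding such as ' Rebekah Joyce ' seen in the agent field above
        writer.writerow({k: v.strip() if isinstance(v, str) else v
                         for k, v in record.items()})

# StringIO keeps the example in memory; to write a real file, use
# open("results.csv", "w", newline="") instead.
buffer = io.StringIO()
write_csv([MiniListing(title="Example Flat A", agent=" Example Agent ")], buffer)
print(buffer.getvalue())
```

Unlike `to_csv()` with its default settings, this writes no index column, so reading the file back won't produce an extra unnamed column.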