# Plane Prices
My objective is to collect plane prices as a function of time and model.
My data source is [Trade-A-Plane](https://www.trade-a-plane.com). I am interested in
the Van's RV-10, the Cessna 182, and all Maules. I want to create a table with the
following fields:

1. Year
2. Manufacturer
3. Model
4. TTAF (total time on the airframe, in hours)
5. SMOH (engine hours since major overhaul)
6. Price
7. Price-Date



In [None]:
import requests
website = 'https://trade-a-plane.com'
response = requests.get(website)
response.text

The request above fetches the site's homepage, which does not contain the listings
we want. We need to narrow the request so that it returns only the data we care about.
Let us start by searching for Cessna 182 series aircraft.

In [None]:
cessna_182_cat = 'https://www.trade-a-plane.com/search?category_level1=Single+Engine+Piston&make=CESSNA&model_group=CESSNA+182+SERIES&s-type=aircraft'
test_response = requests.get(cessna_182_cat).text
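
The search URL above is just a base path plus a handful of query parameters, so it can be built from a dictionary instead of being typed out by hand. The sketch below assumes the same parameter names work for other makes and model groups, which I have not verified; `build_search_url` is a helper name I made up for illustration.

```python
from urllib.parse import urlencode

SEARCH_URL = 'https://www.trade-a-plane.com/search'

def build_search_url(make, model_group, category='Single Engine Piston'):
    """Build a search URL from its query parameters (hypothetical helper)."""
    params = {
        'category_level1': category,
        'make': make,
        'model_group': model_group,
        's-type': 'aircraft',
    }
    # urlencode escapes each value and encodes spaces as '+'
    return SEARCH_URL + '?' + urlencode(params)

# Reproduces the hand-written Cessna 182 URL used above
cessna_182_url = build_search_url('CESSNA', 'CESSNA 182 SERIES')
```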

[Beautiful Soup](https://beautiful-soup-4.readthedocs.io/en/latest/) is a library for parsing HTML and pulling data out of it, which is exactly what we need to scrape the search results.

In [None]:
from IPython.display import HTML, display
from bs4 import BeautifulSoup

# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(test_response, 'html.parser')
# Save a readable copy of the page so it can be inspected in an editor
with open('test_page.html', 'wt') as f:
    f.write(soup.prettify())

def is_listing_result(tag):
    """
    True if the node is a result listing.

    Result listings are <div> tags whose `class` attribute includes
    both `result_listing` and `result`.
    """
    if tag.name != 'div':
        return False
    if not tag.has_attr('class'):
        return False
    classes = tag['class']
    return 'result_listing' in classes and 'result' in classes
filtered = soup.find_all(is_listing_result)
filtered[0]
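
The same filter can probably be written as a CSS selector, which is shorter than a predicate function. The sketch below runs on a small hand-written HTML sample (not copied from the site) so that it works offline; the sample markup and the `sample_` names are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Hand-written sample markup, only loosely modeled on the search page
sample_html = """
<div class="result_listing result" data-listing_id="1">
  <p class="description">A listing</p>
</div>
<div class="result_listing">missing the 'result' class</div>
<div class="result">missing the 'result_listing' class</div>
<span class="result_listing result">not a div</span>
"""
sample_soup = BeautifulSoup(sample_html, 'html.parser')

# 'div.result_listing.result' matches <div> tags that carry both classes
sample_matches = sample_soup.select('div.result_listing.result')
sample_ids = [match['data-listing_id'] for match in sample_matches]
```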

In [None]:
type(filtered[0])

In [None]:
with open('filtered_0.html', 'wt') as f:
    f.write(str(filtered[0]))

Each `result_listing` has a descendant `<p class="description">` tag. It contains a link to a detail page with more information, which we want. The link is the child `<a class="log_listing_click" href="url/to/detail/page">`. Before following it, the next two cells read the listing ID and the last-update date from the listing itself.

In [None]:
filtered[0].attrs['data-listing_id']

In [None]:
import re
from datetime import datetime
tag = filtered[0].find(name='p', class_='last-update')
text = tag.text
search_result = re.search(r'\d{2}/\d{2}/\d{4}', text).group(0)
datetime.strptime(search_result, '%m/%d/%Y')
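
The cell above raises an `AttributeError` if the text contains no date, because `re.search` returns `None` in that case. A minimal sketch of a more forgiving version follows; `parse_last_update` is a hypothetical helper name and the sample strings are made up, not taken from the site.

```python
import re
from datetime import datetime

DATE_PATTERN = re.compile(r'\d{2}/\d{2}/\d{4}')

def parse_last_update(text):
    """Return the first MM/DD/YYYY date in `text` as a datetime, or None."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    return datetime.strptime(match.group(0), '%m/%d/%Y')

# Made-up sample strings for illustration
sample_with_date = parse_last_update('Last Update: 12/10/2021')
sample_without_date = parse_last_update('Last Update: unknown')
```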

In [None]:
# Drill down to the link
# The whole description
description = filtered[0].find(name='p', class_='description')
display(type(description.text))
display(description.text)
# Just the anchor tag
detail_link = description.select('a.log_listing_click')[0]
display(detail_link)
# Just the href
detail_link['href']

Follow the link to fetch the listing's detail page.

In [None]:
website = 'https://trade-a-plane.com'
detail_url = website + detail_link['href']
print('Getting page {}'.format(detail_url))
detail_tree = BeautifulSoup(requests.get(detail_url).text, 'html.parser')
with open('aircraft-detail.html', 'wt') as f:
    f.write(detail_tree.prettify())
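
Concatenating the site root and the `href` works as long as the `href` is a root-relative path. If some links turn out to be absolute, `urllib.parse.urljoin` handles both cases. The `href` values below are made up for illustration.

```python
from urllib.parse import urljoin

BASE_URL = 'https://www.trade-a-plane.com'

# Hypothetical href values: one root-relative, one already absolute
relative_href = '/search?listing_id=2400626&s-type=aircraft'
absolute_href = 'https://www.trade-a-plane.com/search?listing_id=2400626'

# A root-relative href is resolved against the base URL
joined_relative = urljoin(BASE_URL, relative_href)
# An absolute href is returned unchanged
joined_absolute = urljoin(BASE_URL, absolute_href)
```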

In [None]:
main_info = detail_tree.find(name='div', id='main_info')
float(main_info.find(name='span', itemprop='price').text)
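
Calling `float` directly works when the price text is a bare number. I have not checked whether every listing is formatted that way, so here is a sketch of a parser that also tolerates a currency symbol, thousands separators, or a listing with no numeric price. `parse_price` and the sample strings are hypothetical.

```python
import re

def parse_price(text):
    """Return the first number in `text` as a float, or None if there is none."""
    match = re.search(r'\d[\d,]*(?:\.\d+)?', text)
    if match is None:
        return None
    # Drop thousands separators before converting
    return float(match.group(0).replace(',', ''))

# Made-up sample strings for illustration
plain_price = parse_price('15000')
formatted_price = parse_price('$115,000')
missing_price = parse_price('Call for price')
```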

In [None]:
from planefinder.data import AircraftSaleEntry
AircraftSaleEntry(url='https://www.trade-a-plane.com/search?category_level1=Single+Engine+Piston&make=CESSNA&model=182Q+SKYLANE&listing_id=2400626&s-type=aircraft',
                  price=15000,
                  make_model='CESSNA 182Q SKYLANE',
                  registration='N735GS',
                  description='1977 Cessna 182Q Skylane, 3461TT, 798 SMOH, 483 SPOH, Garmin GTN 430W, Stratus ES ADS-B Out Transponder (ADS-B In WiFI Traffic and Wx Link to IPad (Foreflight), Narco Mark 12D, Garmin GMA 340, Bendix King KI206, JPI EGT-701 Engine Monitor, Horton STOL Kit (Leading Edge Cuff, Droop Wing Tips, Stall Fences), Rosen Sun Visors, Standby Altimeter, & More!',
                  search_date=datetime(2021, 12, 12, 11, 53),
                  ttaf=0,
                  smoh=0)
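
The entry above sets `ttaf` and `smoh` to 0, although the description mentions `3461TT` and `798 SMOH`. One possible way to fill in those fields, along with the year, is to pull them out of the description with regular expressions. This is only a sketch: it assumes descriptions start with the year and use the `<number>TT` and `<number> SMOH` conventions, which will not hold for every listing. `parse_description` is a hypothetical helper and not part of `planefinder`.

```python
import re

def _first_int(pattern, text):
    """Return group 1 of the first match as an int, or None if no match."""
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return None
    return int(match.group(1).replace(',', ''))

def parse_description(description):
    """Extract year, TTAF, and SMOH from a free-text description (sketch)."""
    return {
        # A four-digit year at the very start of the description
        'year': _first_int(r'^\s*((?:19|20)\d{2})\b', description),
        # A number directly followed by TT or TTAF
        'ttaf': _first_int(r'(\d[\d,]*)\s*(?:TTAF|TT)\b', description),
        # A number directly followed by SMOH
        'smoh': _first_int(r'(\d[\d,]*)\s*SMOH\b', description),
    }

# Beginning of the description from the entry above
sample_description = '1977 Cessna 182Q Skylane, 3461TT, 798 SMOH, 483 SPOH, Garmin GTN 430W'
parsed = parse_description(sample_description)
```

Fields that cannot be found come back as `None`, which keeps a missing value distinct from a genuine zero.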