## Introduction

In this document, I explain how to use the functions provided by the package `coches_net` to obtain data on cars advertised on [coches.net](https://www.coches.net/).

In [1]:
# standard-library modules
from os import scandir  # to list files in directory
import datetime  # to pass the day of download as an argument

# my package
import coches_net  # should already be installed with pip

## How to get the HTML pages

It is as easy as following these instructions:

1. On the main page of [coches.net](https://www.coches.net/), click on any of the links that appear when you hover over 'Buscar' in the upper menu. 
2. You will be redirected to a page such as [this page](https://www.coches.net/segunda-mano/). Here, enter your search parameters.
3. Then, save each page you want to analyse as a complete webpage. Curiously, this is the only way I have found to avoid being detected as a robot.

## Ingesting the data

In the folder `original_data/`, I have stored some HTML files containing all the ads for Fiat Pandas from 2013 onwards sold by professional sellers (those were my search parameters, since I don't want a car that is more than seven years old). Let's have a look at them:

In [2]:
files_at_dir = [entry.name for entry in scandir('original_data/')
                if entry.is_file()]
sorted(files_at_dir)

['fiat_panda_profesional_1.html',
 'fiat_panda_profesional_10.html',
 'fiat_panda_profesional_2.html',
 'fiat_panda_profesional_3.html',
 'fiat_panda_profesional_4.html',
 'fiat_panda_profesional_5.html',
 'fiat_panda_profesional_6.html',
 'fiat_panda_profesional_7.html',
 'fiat_panda_profesional_8.html',
 'fiat_panda_profesional_9.html']

In fact, you can check the structure of these HTML files just by opening them in any web browser. 
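The listing above is in plain string order, which is why page 10 appears before page 2. If a numeric order is easier to read, one possible approach is to sort on the trailing page number. The helper `page_number` below is hypothetical (it is not part of `coches_net`) and assumes the `name_N.html` naming pattern seen in the listing:

```python
import re


def page_number(filename):
    """Return the trailing page number of a name like 'fiat_panda_profesional_10.html'.

    Names that do not match the assumed pattern get -1, so they sort first.
    """
    match = re.search(r'_(\d+)\.html$', filename)
    return int(match.group(1)) if match else -1


# A few names copied from the listing above.
example_files = ['fiat_panda_profesional_1.html',
                 'fiat_panda_profesional_10.html',
                 'fiat_panda_profesional_2.html']

# Sort by the numeric page suffix instead of alphabetically.
files_by_page = sorted(example_files, key=page_number)
print(files_by_page)
```

This only changes how the names are displayed; it makes no assumption about the order in which `coches_net` reads the files.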

Let us now create a dataframe from all the ads contained in these files by calling the function `coches_net.get_all_cars`. The argument `check_all_pages` is set to `True` to make sure that the function processes every page:

In [3]:
fiat_panda_ads = coches_net.get_all_cars(source_dir='original_data/',
                                        date_download=datetime.datetime(2020, 10, 19),
                                        check_all_pages=True)

There we go! Just like magic, our pandas dataframe should be ready for analysis :) Let's have a look at the first rows of this dataset:

In [4]:
fiat_panda_ads.head(3)

| | title | date | spot_price | financed_price | location | type_petrol | year | kilometrage | warranty | office | page |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | FIAT Panda 1.3 75cv Diesel 4x4 E5 5p. | 2020-09-07 | 9500.0 | 8990.0 | Guipúzcoa | Diésel | 2014.0 | 20000.0 | 1 | False | fiat_panda_profesional_10.html |
| 1 | FIAT Panda 1.2 Lounge 51kW 69CV EU6 5p. | 2020-09-07 | 6500.0 | 5900.0 | Baleares | Gasolina | 2017.0 | 55500.0 | 1 | False | fiat_panda_profesional_10.html |
| 2 | FIAT Panda 1.2 Lounge 69cv EU6 5p. | 2020-09-07 | 5500.0 | 5000.0 | Málaga | Gasolina | 2016.0 | 43000.0 | 1 | False | fiat_panda_profesional_10.html |
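As a quick sanity check on these columns, here is a minimal sketch that compares `spot_price` with `financed_price` for the three rows shown above. The rows are retyped by hand as plain dictionaries so the snippet runs without the package, and `price_gap` is a name I made up for illustration; it is not a column produced by `coches_net`, and this is not a description of what `coches_net.find_discounts` does.

```python
# The three rows displayed above, retyped by hand (only the columns needed here).
sample_ads = [
    {'title': 'FIAT Panda 1.3 75cv Diesel 4x4 E5 5p.',
     'spot_price': 9500.0, 'financed_price': 8990.0},
    {'title': 'FIAT Panda 1.2 Lounge 51kW 69CV EU6 5p.',
     'spot_price': 6500.0, 'financed_price': 5900.0},
    {'title': 'FIAT Panda 1.2 Lounge 69cv EU6 5p.',
     'spot_price': 5500.0, 'financed_price': 5000.0},
]

# Hypothetical derived field: how much lower the financed price is.
for ad in sample_ads:
    ad['price_gap'] = ad['spot_price'] - ad['financed_price']

# Rank the ads by the size of that gap, largest first.
ranked = sorted(sample_ads, key=lambda ad: ad['price_gap'], reverse=True)
for ad in ranked:
    print(f"{ad['price_gap']:>7.0f}  {ad['title']}")
```

On the full dataframe, the same calculation would be the column subtraction `fiat_panda_ads['spot_price'] - fiat_panda_ads['financed_price']`.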


### Add more observations to an existing dataframe

Here, I will give an example of how to use `coches_net.add_new_pages`.

## Finding discounts!

In this section, we will see the function `coches_net.find_discounts` in action!