# Prefect Workflow Lab

### Introduction

In this lesson, we'll build a workflow in Prefect. We'll do so by pulling used car listings from Craigslist and then loading that data into a CSV file.

### Our Use Case

There are multiple reasons why we may want to do this. One reason is simple arbitrage: we can purchase below-market cars and then sell them to someone willing to pay more.

Or perhaps we simply want to alert our users when a below-market car comes onto the market.

> We could use the [Kelley Blue Book API](http://developer.kbb.com/#!/idws/99-Swagger) to compare the expected price against the price listed on Craigslist.
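
To make the comparison concrete, here is a minimal sketch of how we might flag a below-market listing once we have both prices. The `Listing` class, the `is_below_market` helper, the 10% threshold, and the prices are all made up for illustration; the expected price would come from whichever pricing source we end up using.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    title: str
    listed_price: float    # price shown on the listing
    expected_price: float  # hypothetical market estimate, e.g. from a pricing API


def is_below_market(listing: Listing, min_discount: float = 0.10) -> bool:
    """Return True when the listed price is at least `min_discount` below the estimate."""
    if listing.expected_price <= 0:
        return False  # no usable estimate, so don't flag the listing
    discount = (listing.expected_price - listing.listed_price) / listing.expected_price
    return discount >= min_discount


# Made-up example data.
listings = [
    Listing("2015 Jeep Grand Cherokee", listed_price=12_000, expected_price=15_000),
    Listing("2012 Mazda Miata", listed_price=14_500, expected_price=15_000),
]
deals = [listing.title for listing in listings if is_below_market(listing)]
print(deals)
```

Here only the first listing is flagged, since it is priced 20% under its estimate while the second is only about 3% under.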

### Pulling the data

So how can we get this data? One way is to write a scraper using Beautiful Soup. And we can save some time by copying a snippet of the HTML we want to scrape from the [cars for sale page](https://sandiego.craigslist.org/search/cta#search=1~list~0~0).

> Below we click Inspect to pull up Chrome's developer tools, and then identify a relevant `<li>` element representing a car listing.

<img src="./edit-html.png" width="80%">

By clicking Edit as HTML, we can highlight the relevant HTML and copy it. Then we can ask ChatGPT to pull out the relevant data for us.

```md
Write a scraper using Beautiful Soup to extract the metadata and title of each list element from the following HTML: <ol><li data-pid="7664884196" class="cl-search-result cl-search-view-mode-list" title="** 2015 Jeep Grand Cherokee  ***NORTHPORT MOTORS***"><div class="result-node-wide"><button type="button" tabindex="0" class="bd-button cl-favorite-button icon-only" title="add to favorites list">
```
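
A scraper along those lines might look roughly like the sketch below. To keep it runnable without installing anything, it uses the standard library's `html.parser` rather than Beautiful Soup, so treat it as an illustration of the idea and not as ChatGPT's actual answer. The `ListingParser` name is our own, and the HTML is a trimmed version of the snippet from the prompt with the tags closed.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collect the data-pid and title attributes of each search-result <li>."""

    def __init__(self):
        super().__init__()
        self.listings = []

    def handle_starttag(self, tag, attrs):
        if tag != "li":
            return
        attributes = dict(attrs)
        # In the copied snippet, each result <li> carries the cl-search-result class.
        if "cl-search-result" in (attributes.get("class") or "").split():
            self.listings.append(
                {"pid": attributes.get("data-pid"), "title": attributes.get("title")}
            )


# Trimmed sample based on the snippet in the prompt (tags closed for this example).
html = (
    '<ol><li data-pid="7664884196" '
    'class="cl-search-result cl-search-view-mode-list" '
    'title="** 2015 Jeep Grand Cherokee  ***NORTHPORT MOTORS***">'
    '<div class="result-node-wide"></div></li></ol>'
)

parser = ListingParser()
parser.feed(html)
print(parser.listings)
```

With Beautiful Soup the same idea would be a CSS selector on `li.cl-search-result` followed by reading the same two attributes.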

ChatGPT does a good job. But an even better approach is to see if someone has already written a Craigslist scraper for us.

### The Python Craigslist Scraper

OK, [they have](https://github.com/juliomalegria/python-craigslist). There are multiple Craigslist scrapers, but this one by Julio has the most stars and looks fairly easy to use.

> <img src="./stars.png" width="60%">

Looking through the examples in the readme, we can search for cars in an area with something like the following.

```python
import pycraigslist

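# Search the cars and trucks (cta) category in the East Bay area of the SF Bay site.
autos = pycraigslist.forsale.cta(site="sfbay", area="eby", query="Mazda Miata")

# Loop over each matching listing; include_body=True also requests the post body.
for auto in autos.search_detail(include_body=True):
    print(auto)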
```

OK, so this is what we want to do.
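
Since the goal is to end up with a CSV file, here is a minimal sketch of the loading step using only the standard library. It assumes each listing arrives as a plain dictionary; the `write_listings_csv` helper, the field names, and the sample rows are made up for illustration, and the real scraper's keys may differ.

```python
import csv
import tempfile
from pathlib import Path


def write_listings_csv(listings, path):
    """Write an iterable of listing dicts to a CSV file and return the row count."""
    rows = list(listings)
    if not rows:
        return 0
    # Use the union of keys, in first-seen order, so rows missing a field still fit.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    with Path(path).open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Made-up rows standing in for whatever the scraper yields.
sample = [
    {"id": "1", "title": "2015 Jeep Grand Cherokee", "price": "$12,000"},
    {"id": "2", "title": "2012 Mazda Miata"},  # no price on this one
]
out_path = Path(tempfile.gettempdir()) / "autos.csv"
count = write_listings_csv(sample, out_path)
print(count, out_path)
```

Because the function accepts any iterable of dictionaries, we could pass it the scraper's results directly instead of the sample list.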