breck7/measurementscrawlers

import header.scroll
title MeasurementsCrawlers

Crawlers for extracting measurements from the web for Scroll datasets.

* Crawlers generally follow four steps:

1. Match - match entity ids from the source to concept ids
2. Fetch - fetch content from the source site and save to disk cache
3. Parse - parse the content into JSON objects and save to disk cache
4. Update - map the content to the measureParser and save to the concept base
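
The four steps above can be sketched as a small pipeline. This is an illustrative Python sketch, not code from this repository: every function name, the `key = value` raw format, the case-insensitive id matching, and the `measure_map` dictionary (standing in for the measureParser lookup) are assumptions made for the example. The download function is passed in so the sketch runs without network access.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def match(source_ids: List[str], concept_ids: List[str]) -> Dict[str, str]:
    """Step 1: pair source entity ids with concept ids (assumed rule: case-insensitive equality)."""
    by_key = {c.lower(): c for c in concept_ids}
    return {s: by_key[s.lower()] for s in source_ids if s.lower() in by_key}


def fetch(source_id: str, cache_dir: Path, download: Callable[[str], str]) -> Path:
    """Step 2: download raw content once and keep it in the disk cache."""
    path = cache_dir / f"{source_id}.raw"
    if not path.exists():  # a cached copy means no second request
        path.write_text(download(source_id), encoding="utf-8")
    return path


def parse(raw_path: Path) -> Path:
    """Step 3: parse raw 'key = value' lines into a JSON object cached next to the raw file."""
    fields = {}
    for line in raw_path.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without '='
            fields[key.strip()] = value.strip()
    json_path = raw_path.with_suffix(".json")
    json_path.write_text(json.dumps(fields), encoding="utf-8")
    return json_path


def update(concept_id: str, json_path: Path,
           measure_map: Dict[str, str], concept_base: Dict[str, dict]) -> None:
    """Step 4: map parsed fields to measure names and store them on the concept."""
    parsed = json.loads(json_path.read_text(encoding="utf-8"))
    concept = concept_base.setdefault(concept_id, {})
    for field, measure in measure_map.items():
        if field in parsed:
            concept[measure] = parsed[field]


def crawl(source_ids: List[str], concept_base: Dict[str, dict],
          measure_map: Dict[str, str], cache_dir: Path,
          download: Callable[[str], str]) -> Dict[str, dict]:
    """Run match, fetch, parse, and update for every matched entity."""
    matches = match(source_ids, list(concept_base))
    for source_id, concept_id in matches.items():
        json_path = parse(fetch(source_id, cache_dir, download))
        update(concept_id, json_path, measure_map, concept_base)
    return concept_base


if __name__ == "__main__":
    # Hypothetical data: an in-memory concept base and a stubbed download function.
    base = {"python": {}, "ruby": {}}
    with tempfile.TemporaryDirectory() as tmp:
        crawl(["Python"], base, {"year": "appeared"}, Path(tmp),
              lambda source_id: "year = 1991\n")
    print(base)
```

Keeping fetch and parse as separate cached steps means a parser fix can be re-run against the raw files on disk without requesting the source site again.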

permalink index.html
import footer.scroll