This repository has been archived by the owner on Dec 5, 2022. It is now read-only.
All source scrapers (both `crawl` and `scrape`) should be subject to end-to-end integration tests, in which both are exercised against the live cache or the internet.

- Crawl: if a function, it should execute and return either a valid URL or an object of the form `{ url, cookie }`.
- Scrape: it should load out of the live production cache and return a well-formed result.

If the cache misses, the integration test runner can invoke a crawl for that source and write the result to disk locally to complete the test.
ryanblock changed the title from *End to end source integration tests* to *End to end integration tests for sources* on Apr 17, 2020.
Every test scenario in that Google Doc has been added except for this:

> Every date in the live prod cache should be scrapable, and should not return any errors. We will iterate through all dates in the cache (potentially even every date/time, which identifies a data set), and the corresponding scraper should return data. Some crawlers use an array of "subregion names" (e.g. county names) to build URLs, but then change that list over time. That would result in cache misses, which must never occur.
I still think this is necessary for successful regeneration and system/data stability.