The COVID-19 homepage has a list of announcements near the bottom we can scrape (no equivalent RSS I can find).
The Public Health Department has a newsroom we can scrape. Can’t find any RSS or Atom feeds for it. :\
The Office of Public Affairs also has a newsroom of the same format with slightly broader coverage. As far as I can tell, though, the Public Health Department one pretty well covers all the coronavirus-related stuff.
There are some SOAP services linked from the COVID-19 page, but they seem to require authentication to access.
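One way to double-check for missing feeds is to look for RSS/Atom autodiscovery `<link>` tags in each page's HTML. Here's a rough sketch using only the standard library. The names `FeedLinkFinder` and `find_feed_links` are made up for illustration and aren't part of our codebase, and a page can still have a feed it doesn't advertise this way.

```python
from html.parser import HTMLParser

# MIME types conventionally used for feed autodiscovery links.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects hrefs of <link rel="alternate"> tags that advertise a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # Attribute values can be None (e.g. a bare `rel`), so default to "".
        rel = (attributes.get("rel") or "").lower().split()
        link_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel and link_type in FEED_TYPES and href:
            self.feeds.append(href)


def find_feed_links(html):
    """Return the feed URLs advertised in an HTML document (hypothetical helper)."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds
```

An empty result from `find_feed_links(page_html)` for each of the pages above would back up the conclusion that scraping is the only option.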
Unfortunately, the page we're scraping gets populated at runtime via JavaScript, so, like Alameda, I wound up using Selenium. This also fixes missing support for tags in our RSS output (the news items here have “categories,” like “press release” or “announcement,” and this code sets those as tags on each news item).
Fixes #64.
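For anyone unfamiliar with the approach, here is a minimal sketch of the two pieces described above: rendering a JavaScript-populated page with Selenium, and emitting an item's categories as RSS `<category>` elements. This is not the actual code in this PR. The function names, the headless Chrome setup, and the `wait_selector` parameter are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def render_page(url, wait_selector, timeout=10):
    """Load `url` in headless Chrome and return the HTML after scripts have run.

    `wait_selector` is a hypothetical CSS selector for an element that only
    exists once the page's JavaScript has filled in the news list.
    """
    # Imported lazily so the RSS helper below works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until the dynamically inserted content is present in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, wait_selector))
        )
        return driver.page_source
    finally:
        # Always shut the browser down, even if the wait times out.
        driver.quit()


def build_rss_item(title, link, tags):
    """Build an RSS <item> with one <category> element per tag."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    for tag in tags:
        ET.SubElement(item, "category").text = tag
    return item
```

Calling `ET.tostring(build_rss_item(title, link, tags), encoding="unicode")` serializes the item for inclusion in a feed. RSS 2.0 allows multiple `<category>` elements per item, which is why each tag gets its own element.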
News scrapers live in the news directory. You can follow the San Francisco scraper as an example.

Santa Clara County