Challenge.gov scraper

Challenge.gov is a government website that hosts prize competitions and challenges across the U.S. federal government.

All currently active challenges are listed on the homepage, and each one links to a permalink page with more details.

AFAICT, there is no RSS feed or other way to be notified when new challenges are posted. So this project is a way to get the challenges into a machine-readable format by scraping the homepage periodically.

Usage

This repo is set up to work as an automated, periodic process in the manner of Git scraping as described by Simon Willison. See .github/workflows/scrape.yml.

The scraper reads the Challenge.gov homepage, which appears to be the canonical place for the list of currently active challenges. The scraper parses the HTML and extracts details about each challenge, then serializes them to a formatted JSON document, challenges.json in the top-level directory.
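
The parse-then-serialize step described above can be sketched roughly as follows. This is a minimal illustration, not the code in main.go: the Challenge fields, the challenge-tile class name, and the helper names are all hypothetical, and the real homepage markup is likely different. To stay within the standard library it uses encoding/xml in lenient mode, which copes with reasonably tidy HTML; a dedicated HTML parser would be more robust on real pages.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Challenge holds the fields this sketch extracts. The field set is hypothetical.
type Challenge struct {
	Title string `json:"title"`
	URL   string `json:"url"`
}

// attr returns the value of the named attribute, or "" if it is absent.
func attr(el xml.StartElement, name string) string {
	for _, a := range el.Attr {
		if a.Name.Local == name {
			return a.Value
		}
	}
	return ""
}

// hasClass reports whether the element's class attribute contains class.
func hasClass(el xml.StartElement, class string) bool {
	for _, c := range strings.Fields(attr(el, "class")) {
		if c == class {
			return true
		}
	}
	return false
}

// extractChallenges collects every <a class="challenge-tile" href="..."> link
// in page. The class name is an assumed placeholder, not the real markup.
func extractChallenges(page string) ([]Challenge, error) {
	dec := xml.NewDecoder(strings.NewReader(page))
	// Lenient settings so the XML decoder tolerates typical HTML.
	dec.Strict = false
	dec.AutoClose = xml.HTMLAutoClose
	dec.Entity = xml.HTMLEntity

	var (
		out     []Challenge
		current *Challenge
		text    strings.Builder
	)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if t.Name.Local == "a" && hasClass(t, "challenge-tile") {
				current = &Challenge{URL: attr(t, "href")}
				text.Reset()
			}
		case xml.CharData:
			if current != nil {
				text.Write(t)
			}
		case xml.EndElement:
			if t.Name.Local == "a" && current != nil {
				// Collapse runs of whitespace in the link text.
				current.Title = strings.Join(strings.Fields(text.String()), " ")
				out = append(out, *current)
				current = nil
			}
		}
	}
}

// toJSON serializes the challenges as indented JSON.
func toJSON(cs []Challenge) ([]byte, error) {
	return json.MarshalIndent(cs, "", "  ")
}

func main() {
	// Made-up sample markup standing in for the fetched homepage.
	page := `<ul><li><a class="challenge-tile" href="/c/one">Example One</a></li></ul>`
	cs, err := extractChallenges(page)
	if err != nil {
		panic(err)
	}
	out, err := toJSON(cs)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Writing the JSON with stable indentation matters for this workflow, since a consistent layout keeps the diffs between scrapes limited to real changes.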

The GitHub action runs the scraper periodically and automatically checks in any differences in challenges.json, producing a diff history over time.
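
In this repo Git itself records the differences, so no extra code is needed for the history. Still, a consumer who wants to know which challenges appeared or disappeared between two revisions of challenges.json could compare them along these lines. This is only a sketch: the Challenge fields and the choice of the URL as the identifying key are assumptions, not the actual schema of challenges.json.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Challenge mirrors an assumed, simplified entry in challenges.json.
type Challenge struct {
	Title string `json:"title"`
	URL   string `json:"url"`
}

// diffSnapshots compares two JSON snapshots and returns the entries found
// only in newJSON (added) and only in oldJSON (removed), keyed by URL.
func diffSnapshots(oldJSON, newJSON []byte) (added, removed []Challenge, err error) {
	var before, after []Challenge
	if err = json.Unmarshal(oldJSON, &before); err != nil {
		return nil, nil, err
	}
	if err = json.Unmarshal(newJSON, &after); err != nil {
		return nil, nil, err
	}

	seenBefore := make(map[string]bool, len(before))
	for _, c := range before {
		seenBefore[c.URL] = true
	}

	seenAfter := make(map[string]bool, len(after))
	for _, c := range after {
		seenAfter[c.URL] = true
		if !seenBefore[c.URL] {
			added = append(added, c)
		}
	}
	for _, c := range before {
		if !seenAfter[c.URL] {
			removed = append(removed, c)
		}
	}
	return added, removed, nil
}

func main() {
	// Made-up snapshots standing in for two committed revisions.
	oldJSON := []byte(`[{"title":"A","url":"/a"},{"title":"B","url":"/b"}]`)
	newJSON := []byte(`[{"title":"B","url":"/b"},{"title":"C","url":"/c"}]`)

	added, removed, err := diffSnapshots(oldJSON, newJSON)
	if err != nil {
		panic(err)
	}
	for _, c := range added {
		fmt.Println("added:", c.Title, c.URL)
	}
	for _, c := range removed {
		fmt.Println("removed:", c.Title, c.URL)
	}
}
```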

paulsmith/challenge.gov-scraper

Folders and files

Latest commit

History

Repository files navigation

Challenge.gov scraper

Usage

About

Topics

Resources

License

Stars

Watchers

Forks

Languages