GitHub Pulse is an end-to-end analytics project for turning raw GitHub repository metadata into a reproducible public dashboard. It uses:
- Prefect for orchestration,
- ClickHouse as the warehouse,
- dbt for tested transformations,
- Evidence for the dashboard,
- and GitHub Actions/GitHub Pages for cloud publication.
Live dashboard: https://aipavlo.github.io/github-pulse/
GitHub Pulse collects public GitHub repository metadata, builds analytics tables in ClickHouse, and publishes a static analytics site with Evidence and GitHub Pages.
The project is meant to answer a few practical questions in a simple, reproducible way:
- which repositories are the most visible and active;
- how stars, forks, and activity change over time;
- which owners, languages, and topics dominate the dataset.
Target publication flow:
Prefect -> ingestion -> dbt run -> dbt test -> export static datasets -> commit datasets branch -> PR checks -> merge to main -> v*.*.* tag -> Evidence build -> GitHub Pages
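The local half of this flow (everything Prefect is responsible for) can be sketched as plain sequential steps. This is an illustrative sketch only: the command names below are assumptions, not the project's actual entry points, and in the real project these steps are wrapped as Prefect tasks. The git, tag, and Pages steps are deliberately absent, since Prefect never writes to git.

```python
import subprocess

# Hypothetical step list mirroring the flow above; the real project
# defines its own entry points and wraps these as Prefect tasks.
STEPS = [
    ["python", "-m", "ingestion"],   # hypothetical ingestion entry point
    ["dbt", "deps"],
    ["dbt", "run"],
    ["dbt", "test"],
    ["make", "export-site-data"],
]

def run_local_flow(run=subprocess.run):
    """Execute each step in order, stopping on the first failure."""
    for cmd in STEPS:
        run(cmd, check=True)
    return STEPS

# Dry run: record the commands instead of executing them.
executed = []
run_local_flow(run=lambda cmd, check: executed.append(cmd))
print(len(executed))  # -> 5
```

Injecting `run` makes the step order testable without touching dbt or the warehouse, which matches the rule that CI never connects to ClickHouse.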
Core rules:
- GitHub Actions and GitHub Pages never connect to ClickHouse.
- Prefect never commits, pushes, tags, or otherwise writes to git.
- The site is built only from files committed to the repository.
- Published datasets are replaced atomically instead of accumulating over time.
- Site build output is not committed to git.
- Everything under `evidence/sources/site_data/current/` is public data.
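The "replaced atomically" rule can be sketched with a staging directory and a single rename. The paths and file names here are illustrative, not the project's actual export code:

```python
import shutil
import tempfile
from pathlib import Path

def publish_atomically(rows: list[str], site_data: Path) -> None:
    """Sketch of atomic dataset replacement (illustrative paths):
    write the new export into a temporary sibling directory, then
    swap it in with one directory rename, so readers never observe
    a half-written current/ and old files never accumulate."""
    site_data.mkdir(parents=True, exist_ok=True)
    staging = Path(tempfile.mkdtemp(prefix="_tmp_", dir=site_data))
    (staging / "repos.csv").write_text("\n".join(rows) + "\n")

    current = site_data / "current"
    previous = site_data / "_previous"
    if current.exists():
        current.rename(previous)   # move the old snapshot aside
    staging.rename(current)        # one rename publishes everything
    if previous.exists():
        shutil.rmtree(previous)    # drop the superseded snapshot

publish_atomically(["repo,stars", "octocat/hello,42"],
                   Path("site_data"))
```

The design point is that consumers of `current/` only ever see a complete snapshot, never a directory mid-export.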
Stack:
- Python for ingestion, export, and orchestration utilities
- ClickHouse as the warehouse
- dbt for publish-ready models and data tests
- Prefect for end-to-end orchestration
- Evidence for the static site
- GitHub Actions + GitHub Pages for build and deployment
Prepare the local environment:
make env
make build
make up

Run the main local validation flow:
make qa-python
make dbt-deps
make dbt-run
make dbt-test
make export-site-data
make check-site

If you want the full orchestration run through Prefect:
make prefect-run

`make prefect-run` updates only the local public dataset directory. It does not perform git commit, push, or tag operations. `RUN_DATE` defaults to the first day of the current month.
The Evidence app lives in `evidence/` and is configured as a project site with `basePath=/github-pulse`.
Useful commands:
make evidence-install
make evidence-dev
make evidence-build
make pages-build-local

- `make evidence-dev` starts the local dev server at http://localhost:3000
- `make evidence-build` runs a strict local build
- `make pages-build-local` refreshes flat-file sources and produces a Pages-ready artifact in `evidence/build/`
Committed to git:
- ingestion, dbt, orchestration, and site code;
- public datasets and metadata in `evidence/sources/site_data/current/`;
- CI/CD configuration and tests.
Not committed to git:
- `evidence/build/`
- `evidence/.evidence/`
- `evidence/sources/site_data/_tmp/`
- `node_modules`, npm caches, and local build caches
- secrets, tokens, and ClickHouse access details
The delivery model has three explicit contours:
- Data contour: Prefect and dbt prepare data, then export refreshes `evidence/sources/site_data/current/` locally.
- Git contour: the updated public datasets are committed to a separate branch, reviewed through PR checks, and merged to `main`.
- Delivery contour: a `v*.*.*` tag, such as `v0.0.1`, triggers GitHub Actions, which reads committed datasets only, runs `npm run sources` and `npm run build:strict`, then deploys `evidence/build/` to GitHub Pages.
Release path: v*.*.* tag -> deploy Pages.
This keeps data production, git publication, and Pages deployment separate and auditable.
- Empty datasets: run `make dbt-test`, then retry `make export-site-data`; use the flow with `--fail-on-empty` when needed.
- dbt packages: `make dbt-run` and `make prefect-run` run `dbt deps` first; use `make dbt-deps` to refresh packages directly.
- `npm run sources` failures: this usually means the files in `evidence/sources/site_data/current/` are missing or invalid.
- `npm run build:strict` failures: make sure `make evidence-install` and `make check-site` were run first.
- basePath: GitHub Pages builds must use `/github-pulse`; this is fixed in `evidence/evidence.config.yaml`.
- Orphan cleanup: temporary export directories are cleaned automatically, and `make clean-site-data-tmp` is available for manual cleanup.
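A `--fail-on-empty` style guard can be sketched as a scan for datasets that carry a header but no data rows. The function name and directory layout below are assumptions for illustration, not the project's actual implementation:

```python
import tempfile
from pathlib import Path

def find_empty_datasets(current: Path) -> list[str]:
    """Hypothetical guard in the spirit of --fail-on-empty:
    report exported CSVs that contain a header row but no data."""
    return [p.name for p in sorted(current.glob("*.csv"))
            if len(p.read_text().splitlines()) <= 1]

# Demo on a throwaway directory standing in for the current/ export.
demo = Path(tempfile.mkdtemp())
(demo / "repos.csv").write_text("repo,stars\noctocat/hello,42\n")
(demo / "topics.csv").write_text("topic\n")   # header only -> empty

empty = find_empty_datasets(demo)
print(empty)  # -> ['topics.csv']
```

A real flow would raise (or exit non-zero) when this list is non-empty and the flag is set, so an empty export never reaches the datasets branch.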
Use this as the main local validation command:
make check

It runs Python QA, the dbt layer, static dataset export, and the site build validation flow.