GitHub Actions & Data Pipeline

This document outlines the setup and workflow of our GitHub Actions data pipeline. The primary goal is to manage and version-control data generated by our CI/CD processes, with a special focus on handling a SQLite database and its schema migrations.

Core Components

_data Branch

A dedicated, orphaned branch (_data) serves as the storage for data artifacts. This keeps large data files and frequent data updates out of the main source code history, making the main repository lighter and faster to clone.
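
For context, an orphaned branch like this is typically created once with git's orphan checkout. A minimal sketch, shown as a workflow step for illustration only (the exact commands used to create _data in this repository are not documented here):

```yaml
# Hypothetical one-time setup of an orphaned data branch.
- name: Create orphan data branch
  run: |
    git checkout --orphan _data   # start a branch with no parent history
    git rm -rf .                  # drop all tracked source files
    mkdir -p data
    git commit --allow-empty -m "Initialize _data branch"
    git push origin _data
```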

actions/pipeline-data ("Setup Pipeline Data Branch")

This composite action manages the interaction with the _data branch by creating a git worktree.

  • operation: setup: Checks out the _data branch into a .pipeline-data-worktree directory and uses rsync to copy the entire contents of the worktree's data directory into the main workspace's data directory.
  • operation: update: Uses rsync to sync the data directory from the main workspace to the worktree, then commits and force-pushes the changes to the _data branch.
  • operation: cleanup: Removes the .pipeline-data-worktree directory. This should be run at the end of a workflow, typically using an if: always() condition to ensure cleanup happens even if other steps fail.
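
A minimal sketch of how a workflow job might call this action, assuming it is a local composite action invoked as ./actions/pipeline-data and that operation is its only required input (other inputs, if any, are omitted):

```yaml
steps:
  - uses: actions/checkout@v4

  # Check out _data into .pipeline-data-worktree and rsync its data/
  # directory into the workspace.
  - name: Setup pipeline data
    uses: ./actions/pipeline-data
    with:
      operation: setup

  # ... steps that read and write files under data/ ...

  # rsync data/ back into the worktree, commit, and force-push to _data.
  - name: Update pipeline data
    uses: ./actions/pipeline-data
    with:
      operation: update

  # Remove the worktree even if an earlier step failed.
  - name: Cleanup pipeline data
    if: always()
    uses: ./actions/pipeline-data
    with:
      operation: cleanup
```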

actions/restore-db ("SQLite Database Operations")

This action handles the dumping and restoring of the SQLite database in a way that is compatible with our migration-based schema management.

  • operation: dump:

    1. Dumps the live SQLite database (e.g., data/db.sqlite) into a diffable format in the specified dump directory (e.g., data/dump) using sqlite-diffable.
    2. Copies the Drizzle migration journal (drizzle/meta/_journal.json) into the dump directory as _journal.json. This is a critical step that versions the database schema state along with the data itself.
  • operation: restore:

    1. Reads the latest migration number from the _journal.json file located within the dump directory.
    2. Initializes a new, empty database.
    3. Runs database migrations from the main branch's drizzle directory up to the version specified in the journal file. This creates a database with the exact schema that corresponds to the dumped data.
    4. Loads the data from the diffable dump into the database.
    5. Runs any remaining migrations from the main drizzle folder to bring the database schema fully up to date with the latest code in the main branch.
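
A minimal sketch of how a job might wrap its pipeline steps with this action, again assuming a local path of ./actions/restore-db and a single operation input, with the database and dump paths left at the defaults mentioned above:

```yaml
steps:
  # Read _journal.json from the dump, migrate a fresh database up to that
  # version, load the diffable dump, then apply any newer migrations.
  - name: Restore database
    uses: ./actions/restore-db
    with:
      operation: restore

  # ... pipeline steps that modify data/db.sqlite ...

  # Write the database back out as a diffable dump and copy the current
  # drizzle/meta/_journal.json alongside it.
  - name: Dump database
    uses: ./actions/restore-db
    with:
      operation: dump
```

The dump operation is roughly equivalent to running sqlite-diffable dump data/db.sqlite data/dump --all and then copying drizzle/meta/_journal.json into data/dump/_journal.json.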

Workflows

This repository uses several GitHub Actions workflows to automate testing, data processing, and deployment.

run-pipelines.yml ("Run Pipelines")

This is the main data processing workflow. It's responsible for fetching the latest data from sources like GitHub, processing it, and generating summaries.

  • Triggers:

    • Runs on a daily schedule (cron: "0 23 * * *").
    • Can be manually triggered (workflow_dispatch) with various options to control its behavior (e.g., forcing re-ingestion, specifying date ranges).
  • Key Jobs:

    • ingest-export:
      1. Checks out the _data branch and restores the database.
      2. Runs the ingest pipeline to fetch new data (issues, PRs, etc.).
      3. Runs the process pipeline to calculate scores and other metrics.
      4. Runs the export pipeline to save processed data.
      5. Dumps the updated database and pushes all new data artifacts to the _data branch.
    • generate-summaries:
      1. Depends on the successful completion of ingest-export.
      2. Restores the latest database from the _data branch.
      3. Uses an AI service to generate project and contributor summaries.
      4. When triggered by the daily schedule, it generates project summaries every day and contributor summaries once per week.
      5. Pushes the generated summaries and updated database state back to the _data branch.
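
A condensed sketch of the triggers and job dependency described above; the workflow_dispatch input shown here is illustrative, and the real job bodies are reduced to comments:

```yaml
name: Run Pipelines

on:
  schedule:
    - cron: "0 23 * * *"    # daily run
  workflow_dispatch:
    inputs:
      force:                # illustrative input; the real inputs may differ
        description: "Force re-ingestion of previously fetched data"
        type: boolean
        default: false

jobs:
  ingest-export:
    runs-on: ubuntu-latest
    steps:
      # set up the _data worktree, restore the DB, run ingest/process/export,
      # dump the DB, and push the artifacts back to _data
      - run: echo "pipeline steps omitted"

  generate-summaries:
    needs: ingest-export    # runs only if ingest-export succeeds
    runs-on: ubuntu-latest
    steps:
      # restore the DB, generate AI summaries, push results back to _data
      - run: echo "summary steps omitted"
```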

pr-checks.yml ("PR Checks")

This workflow runs on every pull request against the main branch to ensure code quality and prevent regressions.

  • Triggers:

    • pull_request on the main branch.
  • Key Jobs:

    • check: Lints the code and runs type-checking with TypeScript.
    • build: Ensures the Next.js application builds successfully with the PR changes. It restores the production data so the build runs against a realistic dataset.
    • test-pipelines: Runs the core data pipelines (ingest, process, export) in a test mode to verify their integrity.
    • check-migrations: If the database schema (src/lib/data/schema.ts) is modified, this job verifies that a corresponding Drizzle migration has been generated.
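
For illustration, a sketch of the trigger plus one possible implementation of the check-migrations job; the actual workflow also contains the check, build, and test-pipelines jobs, and its migration check may be implemented differently:

```yaml
name: PR Checks

on:
  pull_request:
    branches: [main]

jobs:
  check-migrations:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0    # full history so we can diff against main

      # Illustrative logic only: fail if the schema file changed but no
      # file under drizzle/ was added or modified in the same PR.
      - name: Verify schema changes come with a migration
        run: |
          git fetch origin main
          changed=$(git diff --name-only origin/main...HEAD)
          if echo "$changed" | grep -q '^src/lib/data/schema.ts$'; then
            if ! echo "$changed" | grep -q '^drizzle/'; then
              echo "schema.ts changed but no Drizzle migration was found" >&2
              exit 1
            fi
          fi
```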

deploy.yml ("Deploy to GitHub Pages")

This workflow handles the deployment of the application to GitHub Pages.

  • Triggers:

    • Manually via workflow_dispatch.
    • Automatically after the Run Pipelines workflow successfully completes on the main branch.
  • Key Steps:

    1. Restores the latest data from the _data branch.
    2. Runs any pending database migrations.
    3. Builds the Next.js application for production.
    4. Copies the data directory into the out directory to be included in the deployment.
    5. Deploys the contents of the out directory to GitHub Pages.
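
A condensed sketch of the triggers and steps outlined above, using workflow_run to chain off Run Pipelines; the Pages deployment actions shown are one common approach and may not match the repository's exact setup:

```yaml
name: Deploy to GitHub Pages

on:
  workflow_dispatch:
  workflow_run:
    workflows: ["Run Pipelines"]
    types: [completed]
    branches: [main]

permissions:
  contents: read
  pages: write
  id-token: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment:
      name: github-pages
    steps:
      - uses: actions/checkout@v4

      # restore data from _data, run pending migrations, and build the
      # Next.js app into out/ (steps omitted; see the actions described earlier)

      # Ship the data directory alongside the static export.
      - name: Include data in static output
        run: cp -R data out/data

      # Publish out/ to GitHub Pages.
      - uses: actions/upload-pages-artifact@v3
        with:
          path: out
      - uses: actions/deploy-pages@v4
```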
