
feat: Add automated Scribe-Data update and deployment workflow #40

Merged
andrewtavis merged 15 commits into scribe-org:main from DeleMike:test/toolforge on Aug 9, 2025

Conversation

@DeleMike
Collaborator

@DeleMike DeleMike commented Aug 2, 2025

Contributor checklist

  • This pull request is on a separate branch and not the main branch
  • I have run the ./pre-commit executable as well as make lint and have fixed all reported issues

Description

Automates the monthly Scribe-Data update process with a GitHub Actions workflow that runs the existing update_data.sh script and deploys SQLite files to Toolforge.

What it does

  • Triggers: Push to main, manual dispatch, monthly schedule
  • Process: Uses existing update_data.sh → packages SQLite files → deploys to Toolforge via SSH
  • Safety: Creates backups before deployment, stores artifacts.

Requires the TOOLFORGE_SSH_KEY and TOOLFORGE_USER repository secrets to be set up for deployment. The public key that matches TOOLFORGE_SSH_KEY must also be added to the user's Toolforge account.
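
For readers who want to picture the backup and packaging steps listed above, here is a minimal sketch. It is not the code from this PR: the function names `backup_existing` and `package_sqlite`, the backup suffix, and the archive layout are all assumptions made for illustration.

```shell
# Hypothetical helpers; names and paths are illustrative only.

# Copy an existing file to a timestamped backup before it is replaced.
backup_existing() (
  target=$1
  [ -e "$target" ] || exit 0   # nothing to back up on a first deploy
  cp -p "$target" "$target.bak.$(date +%Y%m%d%H%M%S)"
)

# Bundle every *.sqlite file found directly in src_dir into a gzipped tarball.
package_sqlite() (
  src_dir=$1
  archive=$2
  # Resolve a relative archive path before changing directory.
  case $archive in
    /*) ;;
    *) archive=$PWD/$archive ;;
  esac
  cd "$src_dir" || exit 1
  # An unmatched glob stays literal, so test the first result.
  set -- ./*.sqlite
  [ -e "$1" ] || { echo "no .sqlite files in $src_dir" >&2; exit 1; }
  tar -czf "$archive" "$@"
)
```

The resulting archive would then be copied to Toolforge over SSH. That transfer is left out here because the remote paths depend on how the tool is laid out.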

Related issue

- Add GitHub Actions workflow for automated Scribe-Data updates
- Triggers on push to main, manual dispatch, and monthly schedule
- Uses existing update_data.sh script to generate SQLite files
- Automatically deploys to Toolforge via SSH
- Includes backup mechanism and artifact storage
- Resolves monthly manual update process
@github-actions

github-actions Bot commented Aug 2, 2025

Thank you for the pull request! ❤️

The Scribe-Server team will do our best to address your contribution as soon as we can. If you're not already a member of our public Matrix community, please consider joining! We'd suggest that you use the Element client as well as Element X for a mobile app, and definitely join the General and Data rooms once you're in. Also consider attending our bi-weekly Saturday dev syncs. It'd be great to meet you 😊

@github-actions

github-actions Bot commented Aug 2, 2025

Maintainer Checklist

The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

  • The CHANGELOG has been updated with a description of the changes for the upcoming release and the corresponding issue (if necessary)
  • The continuous integration (CI) workflows within the PR checks do not indicate new errors in the files changed

@DeleMike
Collaborator Author

DeleMike commented Aug 2, 2025

🔧 Toolforge Deployment Testing (via GitHub Actions)

This guide explains how to test the GitHub Actions workflow that updates data and deploys to Toolforge.

1. Generate a New SSH Key (One-Time)

Create a new key pair (without passphrase) specifically for GitHub Actions and Toolforge use:

ssh-keygen -t ed25519 -C "github-actions-scribe-data" -f ~/.ssh/scribe_toolforge_deploy
  • Press Enter when asked for a passphrase (leave it empty).

2. Add Your Public Key to Toolforge

Copy the public key content:

cat ~/.ssh/scribe_toolforge_deploy.pub

Then log into Toolforge and add it:

ssh your-username@login.toolforge.org
become <your-tool-name>
echo "paste-your-public-key-here" >> ~/.ssh/authorized_keys
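
Running the `echo ... >>` line twice would add the key twice, and it assumes `~/.ssh` already exists. A slightly more defensive variant could look like the sketch below. The function name `add_authorized_key` is hypothetical, and the 700/600 permissions are the conventional restrictive values rather than something stated in this guide.

```shell
# Append a public key to an authorized_keys file only if it is not
# already there, creating the directory and file with tight permissions.
add_authorized_key() (
  pub_key=$1
  auth_file=${2:-$HOME/.ssh/authorized_keys}
  ssh_dir=$(dirname "$auth_file")
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || exit 1
  touch "$auth_file" && chmod 600 "$auth_file" || exit 1
  # -x matches whole lines only, -F treats the key as a fixed string.
  if ! grep -qxF -- "$pub_key" "$auth_file"; then
    printf '%s\n' "$pub_key" >> "$auth_file"
  fi
)
```

After `become <your-tool-name>`, it could be called as `add_authorized_key "paste-your-public-key-here"`.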

3. Add GitHub Secrets

Copy your private key content:

cat ~/.ssh/scribe_toolforge_deploy

In your GitHub repository → Settings → Secrets and variables → Actions, add the following:

Secret Name | Value
TOOLFORGE_SSH_KEY | Paste the full output of the private key
TOOLFORGE_USER | Your Toolforge username (e.g. johndoe)

REQUIRED: Also copy the public key to Toolforge (this also allows manual SSH if needed). If you don't, the login process will not work!

🔑 Visit your Toolforge account and add your public SSH key here:
https://toolsadmin.wikimedia.org/profile/settings/ssh-keys/
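
To illustrate how a workflow step might consume the TOOLFORGE_SSH_KEY secret, here is a minimal sketch that writes the key to a file only the current user can read. The function name `write_deploy_key` and the file location are assumptions; the workflow in this PR may handle the key differently.

```shell
# Write private key material to key_file with owner-only permissions
# and print the path so the caller can pass it to ssh -i.
write_deploy_key() (
  key_material=$1
  key_file=$2
  umask 077   # files created from here on are not group/world readable
  mkdir -p "$(dirname "$key_file")" || exit 1
  printf '%s\n' "$key_material" > "$key_file" || exit 1
  chmod 600 "$key_file" || exit 1   # also covers a pre-existing file
  printf '%s\n' "$key_file"
)

# Example use inside a step (not executed here):
#   key_file=$(write_deploy_key "$TOOLFORGE_SSH_KEY" "$HOME/.ssh/deploy_key")
#   ssh -i "$key_file" -o IdentitiesOnly=yes \
#     "$TOOLFORGE_USER@login.toolforge.org" 'echo connected'
```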

4. Run the Workflow

  • Go to GitHub → Actions → “Update Scribe Data and Deploy to Toolforge”
  • Click “Run workflow” → Choose branch if needed → Run workflow
  • Check logs for status and output

@axif0
Member

axif0 commented Aug 4, 2025

We need to add a feature to log which language data types were updated and how many new entries were added. Also, can we notify the Matrix data channel once the data update completes, or before it starts? 🤔

CC: @DeleMike , @andrewtavis
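
One way such a summary could be produced, sketched under heavy assumptions: if per-table row counts were dumped to a text file before and after the update (one `name count` pair per line), the two listings could be compared. The function name `summarize_changes`, the file format, and the table names in the usage note are hypothetical and not taken from Scribe-Server.

```shell
# Compare two "name count" listings and report tables that are new
# or whose count grew. Unchanged and shrunken tables are not printed.
summarize_changes() {
  awk '
    FILENAME == ARGV[1] { before[$1] = $2; next }   # counts before the update
    {
      if (!($1 in before))
        printf "%s: new (%d entries)\n", $1, $2
      else if (($2 + 0) > (before[$1] + 0))
        printf "%s: +%d entries\n", $1, $2 - before[$1]
    }
  ' "$1" "$2"
}
```

For example, with `en_nouns 100` before and `en_nouns 120` plus `fr_nouns 30` after, `summarize_changes before.txt after.txt` prints `en_nouns: +20 entries` and `fr_nouns: new (30 entries)`.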

@DeleMike
Collaborator Author

DeleMike commented Aug 5, 2025

Many thanks @axif0 ! 💯🙏🏾

I have some questions:

  1. This workflow just calls update_data.sh, which fetches all the data and then updates our db. How are we going to know which data types were added and which entries are new? Would we have to change the current migration process so that it provides this information?

  2. I think we can notify Matrix with this information once all is completed.


Do you mean a summary of what new data was added? For example, that a new language was added, or that a new data type was added for an existing language?

Could you help elaborate, please?

@andrewtavis
Member

Let's maybe make another issue for adding in the notification to this, as I'll need to add Scribe-Bot into the data channel and get the repo set up to send messages there :)

force_update:
  description: "Force data update even if no release"
  required: false
  default: "false"
Member


One thing that I'm wondering here, @DeleMike: Why is this a string value "false"? I'm getting a warning on my end that this should just be a boolean, but maybe there's a reason why it needs the quotes? Similarly above, does the description need the quotes?

If there are needed changes here, then it'd be great if they could be included in the next PR 😊
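
As a side note on the string-versus-boolean question: whatever type the input is declared with, a shell step that receives it through an environment variable sees plain text, so the check inside the script is a string comparison either way. A minimal sketch, assuming the input is exported to the step as a hypothetical FORCE_UPDATE variable:

```shell
# Succeeds only when the value is exactly the string "true".
# An empty or missing value is treated as "false".
is_force_update() {
  [ "${1:-false}" = "true" ]
}

# Example use inside a step (not executed here):
#   if is_force_update "$FORCE_UPDATE"; then
#     echo "forcing data update"
#   fi
```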

Member

@andrewtavis andrewtavis left a comment


Approving and bringing in per calls and discussions with @axif0 and @DeleMike 😊 Thanks for the initiative here, both of you! Really is great to have this pivot towards integrating GitHub Actions into Scribe-Server :)

@andrewtavis
Member

Above commit moves the great guide here into the contributing guide for Scribe-Server 😊

@andrewtavis andrewtavis merged commit 9ad7098 into scribe-org:main Aug 9, 2025
2 checks passed

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

  • Set up cron job on Toolforge for update_data.sh
  • Test update_data.sh integration on Toolforge
  • Verify API Behavior on Toolforge

3 participants