Integrate existing reusable GitHub Actions workflow? #159

Open
KonradHoeffner opened this issue Sep 21, 2022 · 1 comment

KonradHoeffner (Contributor) commented Sep 21, 2022

I would like to share a reusable GitHub Actions workflow for running pySHACL in a CI pipeline.
The repository is https://github.com/KonradHoeffner/shacl, and it is on the GitHub Marketplace as konradhoeffner/shacl@v1.
It has been really useful for our projects, and I would like to know whether it would also be useful for the pySHACL developers and user base.
If so, are you interested in integrating it into pySHACL and/or the RDFLib organization?
If you are, and it makes sense to integrate it directly into this repository, I could create a pull request.
However, I'm not sure how best to manage that: right now it installs the newest pySHACL version <2 from pip, and maybe the action's version number should be synchronized with the pySHACL version.

Example

On https://github.com/hitontology/ontology you can see an example of how it works.

Workflow

The workflow .github/workflows/shacl.yaml calls the reusable workflow konradhoeffner/shacl@master (could be @v1 as well).

name: SHACL

on:
  workflow_dispatch:
  push:
    branches:
      - dist

jobs:
  shacl:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          ref: 'dist'

      - name: Build and Validate
        uses: konradhoeffner/shacl@master
        with:
          data: all.ttl
          shacl: shacl.ttl

Status report on success

[Image: Screenshot from 2022-09-21 10-08-21]

Status report on failure from another repository

[Image: Screenshot from 2022-09-21 10-09-50]

Badge

[Image: shaclbadge]

Technical details and history

At first there was a separate workflow that built a Docker image from a Dockerfile. The image just installed pySHACL on top of a Python image and included a custom entrypoint script. That script calls pySHACL in a loop over the potentially multiple data files and prints the output strings that GitHub turns into errors, warnings, info notifications and groups.
The reason for using a Dockerfile was that GitHub Actions spend a lot of time installing dependencies and clutter the log with installation messages. I wanted a quick build and a clean log containing only the statements that matter to a user who is validating data against SHACL shapes and is not interested in logging about installing things.
However, I learned that while Docker is amazing for many things, it is not the preferred GitHub way of doing things: an action already has a "runner", which is kind of like a virtual machine / container, and using Docker on top of that does not seem to integrate that well.
For example, the idea was to build the Docker image only once and save all the setup time, but GitHub Actions would build the Docker image on every run, ruining all the potential benefits.
So I had to create another workflow inside the repository that built the Docker image and deployed it to the GitHub Container Registry.
While this was a few seconds quicker and the output was cleaner, the approach was much too convoluted, and the Docker image could get out of sync with the action itself, for example when running an older version of the workflow.
So the current version, v1, replaces the Dockerfile with a composite action that installs Python and then runs the entrypoint script directly.
That created a few problems: a workflow referencing the shared workflow from another repository couldn't access the entrypoint script, and there was no file in the target repository from which to compute a hash for the setup-python cache. Both were solvable, the first by using ${{github.action_path}} and the second by disabling the setup-python cache and using actions/cache instead.
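
To make the last two points more concrete, here is a rough sketch of what such a composite action could look like. It is not the actual action.yml from konradhoeffner/shacl: the file name entrypoint.sh, the DATA and SHACL environment variables, the static cache key and the action versions are all assumptions made for illustration. Only the input names (data, shacl), the pyshacl<2 pin and the use of actions/cache and github.action_path come from the description above.

```yaml
# action.yml: hypothetical sketch, not the actual file from konradhoeffner/shacl
name: SHACL validation (sketch)
description: Validate RDF data against SHACL shapes with pySHACL
inputs:
  data:
    description: Data file to validate
    required: true
  shacl:
    description: SHACL shapes file
    required: true
runs:
  using: composite
  steps:
    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.x'
        # No "cache: pip" here: the calling repository has no
        # requirements file that setup-python could hash.

    - name: Cache pip downloads
      uses: actions/cache@v3
      with:
        path: ~/.cache/pip
        # Static key (assumption), since there is no dependency file to hash.
        key: ${{ runner.os }}-pip-pyshacl

    - name: Install pySHACL
      shell: bash
      run: pip install --quiet 'pyshacl<2'

    - name: Validate
      shell: bash
      env:
        DATA: ${{ inputs.data }}
        SHACL: ${{ inputs.shacl }}
      # The script lives in the action's repository, not in the caller's
      # checkout, so it is addressed through github.action_path.
      # entrypoint.sh is a placeholder name; it is expected to read DATA and
      # SHACL and to print workflow commands such as "::group::<file>" and
      # "::error::<message>".
      run: bash "${{ github.action_path }}/entrypoint.sh"
```

A calling repository would reference an action like this from a step with `uses:` and pass `data` and `shacl` under `with:`, as in the workflow shown earlier. One trade-off of a static cache key is that the cache is never refreshed when a newer pySHACL release appears, so a real action might prefer a key that includes a version or date.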

aucampia (Member) commented Sep 3, 2023

Having pySHACL work with GitHub Actions also makes it a bit easier for contributors to debug issues, as they don't need their own Drone setup to make it work.
