
GitHub Action

Presidio Action

v1.1.2 Latest version


GitHub Action to execute PII checks using the Presidio CLI

Installation

Copy and paste the following snippet into your .yml file.

              

- name: Presidio Action
  uses: insightsengineering/presidio-action@v1.1.2

Learn more about this action in insightsengineering/presidio-action



GitHub Action that analyzes text for PII entities with Microsoft's Presidio framework.

Author

Insights Engineering

Inputs

  • path:

    Description: Path to verify

    Required: false

    Default: "."

  • configuration-file:

    Description: Path to custom configuration file

    Required: false

    Default: "default"

  • configuration-data:

    Description: Configuration data as an inline YAML configuration

    Required: false

    Default: ""

  • output:

    Description: Format of output

    Required: false

    Default: "auto"

  • publish:

    Description: Publish result as a PR comment

    Required: false

    Default: "true"

  • upload:

    Description: Upload results as an artifact

    Required: false

    Default: "true"

  • presidio-cli-version:

    Description: Presidio CLI version

    Required: false

    Default: "latest"

  • lang-models:

    Description: List of additional language models to install

    Required: false

    Default: ""

  • only-changed-files:

    Description: Only run checks for changed files

    Required: false

    Default: false

Outputs

The format of the results is controlled by the output parameter. The default format is auto.

Available formats:

  • standard - standard output format

    tests/conftest.py
      34:58     0.85     PERSON
      37:33     0.85     PERSON

  • github - format for GitHub Actions logs, grouped per file, similar to the diff view on GitHub

    ::group::tests/conftest.py
    ::0.85 file=tests/conftest.py,line=34,col=58::34:58 [PERSON]
    ::0.85 file=tests/conftest.py,line=37,col=33::37:33 [PERSON]
    ::endgroup::

  • colored - standard output format, but with colors

  • parsable - easy to parse automatically

    {"entity_type": "PERSON", "start": 57, "end": 62, "score": 0.85, "analysis_explanation": null}
    {"entity_type": "PERSON", "start": 32, "end": 37, "score": 0.85, "analysis_explanation": null}

  • auto - default format; switches automatically between two modes:
    • github, if run on GitHub (the environment variables GITHUB_ACTIONS and GITHUB_WORKFLOW are set)
    • colored, otherwise
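
The auto rule can be sketched in a few lines of Python. This only illustrates the behavior described in the list, not the code that presidio-cli runs. The function name resolve_output_format and its env argument are hypothetical, and "set" is assumed to mean that the variable is present in the environment.

```python
import os


def resolve_output_format(requested="auto", env=None):
    """Return the concrete format for a requested `output` value.

    Assumption: "set" means the variable exists in the environment,
    whatever its value.
    """
    env = os.environ if env is None else env
    if requested != "auto":
        # An explicit format is used as-is.
        return requested
    # Both variables must be present to count as a GitHub Actions run.
    on_github = "GITHUB_ACTIONS" in env and "GITHUB_WORKFLOW" in env
    return "github" if on_github else "colored"


if __name__ == "__main__":
    print(resolve_output_format("auto"))
```

Passing the environment as an argument keeps the rule easy to check locally without exporting variables.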

How it works

Presidio Action runs presidio-cli, which is built on presidio-analyzer from the Microsoft Presidio framework, to scan an application's code for undesirable types of data such as EMAIL_ADDRESS or PHONE_NUMBER.

For more information, see the full list of supported entities.
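
To make the idea of a finding concrete, here is a toy Python sketch that scans text line by line and reports matches in the layout of the standard format. It is not how Presidio detects entities: a single simplified regular expression stands in for a real recognizer. The names scan_text and format_standard and the fixed score of 1.0 are hypothetical. The 1-based column follows the relationship visible in the samples above (column 58 in the standard format, start 57 in the parsable format).

```python
import re

# Deliberately simplified e-mail pattern; a stand-in for a real recognizer.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scan_text(text, score=1.0):
    """Return (line, column, score, entity_type) tuples for each match.

    Lines and columns are 1-based; the score is a hypothetical constant.
    """
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in EMAIL_RE.finditer(line):
            findings.append((line_no, match.start() + 1, score, "EMAIL_ADDRESS"))
    return findings


def format_standard(path, findings):
    """Render findings in a layout modeled on the standard format sample."""
    lines = [path]
    for line_no, col, score, entity in findings:
        lines.append(f"  {line_no}:{col}     {score:.2f}     {entity}")
    return "\n".join(lines)


if __name__ == "__main__":
    source = 'x = 1\nowner = "jane@example.com"\n'
    print(format_standard("app.py", scan_text(source)))
```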

Usage

Example usage:

---
name: Presidio check

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

jobs:
  presidio-action:
    runs-on: ubuntu-latest
    name: Presidio check

    steps:
      - name: Checkout Code
        uses: actions/checkout@v3
        with:
          # fetch-depth: 0 is needed if you set `only-changed-files` to true
          # and run this check on push events
          fetch-depth: 0

      - name: Produce the presidio report
        uses: insightsengineering/presidio-action@v1
        # all parameters below are optional
        with:
          # path - path to the project.
          # If the project has no dedicated 'my-project' directory,
          # the default value '.' (current folder) is used
          path: "my-project"
          # configuration-file - path to a file with a custom configuration,
          # or the name of one of the predefined files:
          #   - default - `conf/default.yaml` file from the action repository; checks the default list of entities
          #                and ignores the content of the `.git` folder
          #   - limited - `conf/limited.yaml` file from the action repository; checks only PERSON, EMAIL_ADDRESS and CREDIT_CARD
          #                and ignores the `.git` folder and *.cfg files
          configuration-file: "my-project/conf/my-presidio-config.yaml"
          # configuration-data - configuration content as inline YAML.
          # Lets you define your own configuration without adding a file to the project.
          # Any value in this field prevents the configuration file from being used
          configuration-data: |
            entities:
              - PERSON
            threshold: 0.9
          # output - specify one of output formats
          output: "parsable"
          # only-changed-files - only run the check for files that were changed
          # NOTE: You must set fetch-depth: 0 in the actions/checkout@v3 step
          # for push events while this parameter is set to true
          only-changed-files: true
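
Because the example above selects the parsable format, a later step or a local script could post-process the findings. The Python sketch below assumes that each finding is one JSON object per line, as in the sample in the Outputs section. Whether real output contains other lines is not stated here, so lines that are not JSON objects are skipped. The function names parse_parsable and count_by_entity are hypothetical.

```python
import json
from collections import Counter

# Two lines copied from the parsable sample in the Outputs section.
SAMPLE = """\
{"entity_type": "PERSON", "start": 57, "end": 62, "score": 0.85, "analysis_explanation": null}
{"entity_type": "PERSON", "start": 32, "end": 37, "score": 0.85, "analysis_explanation": null}
"""


def parse_parsable(output):
    """Parse one JSON object per line, skipping lines that are not JSON objects."""
    findings = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("{"):
            findings.append(json.loads(line))
    return findings


def count_by_entity(findings, min_score=0.0):
    """Count findings per entity type, keeping only scores >= min_score."""
    return Counter(f["entity_type"] for f in findings if f["score"] >= min_score)


if __name__ == "__main__":
    findings = parse_parsable(SAMPLE)
    print(count_by_entity(findings))
    print(count_by_entity(findings, min_score=0.9))
```

With the sample data, both findings are counted as PERSON, and none remains once the minimum score is raised to 0.9, which mirrors the effect of the threshold value in the inline configuration.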

Example of comment added to the PR:

Screenshot with PR comment example