Align GitLab token regexes with TruffleHog and add source provenance#566

Merged
frjcomp merged 5 commits into main from copilot/update-glpat-detection-logic
Mar 18, 2026

Copilot AI commented Mar 18, 2026

Custom pipeleek GitLab token regexes diverged from TruffleHog's patterns and lacked source attribution. This aligns them and documents each regex's origin.

Changes

  • glpat v3: Use exact TruffleHog regex including \b word boundaries, pinned to commit afd5336:
    // https://github.com/trufflesecurity/trufflehog/blob/afd5336caad0f61da51750ffe39869974b27b0db/pkg/detectors/gitlab/v3/gitlab_v3.go#L34
    Regex: `\b(glpat-[a-zA-Z0-9\-=_]{27,300}.[0-9a-z]{2}.[a-z0-9]{9})\b`
  • glpat v2: Split from single glpat- rule into separate v2/v3 rules matching TruffleHog's detector versioning ({20,22} vs {27,300} with routable token suffix)
  • Character class: Aligned to [a-zA-Z0-9\-=_] across all token rules (added =)
  • Added glrtr-: Runner Registration Token prefix was missing (REGISTRATION_RUNNER_TOKEN_PREFIX)
  • Fixed naming: glrt- → "Runner Authentication Token" (not "Registration"), GR1348941 → "Runner Token (Legacy)"
  • Source URLs: Every regex now has a comment linking to its origin — TruffleHog source, GitLab source model, or GitLab docs
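
The v2/v3 split described above can be sketched in Go as follows. This is a minimal illustration, not the code from `rules.go`: the v3 pattern is copied from the description, while the v2 pattern is an assumption assembled from the stated character class and `{20,22}` length, and `classifyGlpat` is a hypothetical helper. The sample tokens are synthetic.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// v3 pattern exactly as quoted in the PR description.
	glpatV3 = regexp.MustCompile(`\b(glpat-[a-zA-Z0-9\-=_]{27,300}.[0-9a-z]{2}.[a-z0-9]{9})\b`)
	// Assumption: v2 pattern built from the shared character class and the
	// {20,22} length mentioned in the description.
	glpatV2 = regexp.MustCompile(`\b(glpat-[a-zA-Z0-9\-=_]{20,22})\b`)
)

// classifyGlpat is a hypothetical helper that reports which pattern,
// if any, matches somewhere in s. The v3 pattern is tried first.
func classifyGlpat(s string) string {
	switch {
	case glpatV3.MatchString(s):
		return "v3"
	case glpatV2.MatchString(s):
		return "v2"
	default:
		return ""
	}
}

func main() {
	// Synthetic, non-functional tokens used only to exercise the patterns.
	samples := []string{
		"token: glpat-abcdefghijklmnopqrstuvwxyz0.0a.abcdefgh1",
		"token: glpat-abcdefghij0123456789",
		"token: glpat-short",
	}
	for _, s := range samples {
		fmt.Printf("%q -> %q\n", s, classifyGlpat(s))
	}
}
```

Because of the trailing `\b`, the v2 pattern does not fire on the first 20 to 22 characters of a longer v3 token, so the two rules stay disjoint on these samples. In Go's regexp syntax an unescaped `.` matches any character except a newline, so the quoted v3 pattern accepts separators other than a literal dot.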

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • gitlab.com
    • Triggering command: /tmp/go-build2004454262/b001/engine.test /tmp/go-build2004454262/b001/engine.test -test.paniconexit0 -test.timeout=10m0s -test.v=true -test.count=1 -test.run=GitLab 742219/b094/vet.-ifaceassert ux-amd64/pkg/tool/linux_amd64/vet -p github.com/truff-atomic t ux-amd64/pkg/too-buildtags (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

This section details the original issue you should resolve

<issue_title>glpat detection</issue_title>
<issue_description>If TruffleHog verification is active for the GitLab scan command and a glpat is found, it is always verified against gitlab.com. This suppresses glpat findings for self-hosted GitLab instances.

If possible, it should be verified against the GitLab instance configured in pipeleek. If this is not possible, we should always report glpats, e.g. by adding a built-in rule for all types of GitLab tokens.</issue_description>

Comments on the Issue (you are @copilot in this section)


…f TruffleHog verification

TruffleHog verifies GitLab tokens against gitlab.com only, which suppresses
findings for self-hosted GitLab instances. These built-in regex rules ensure
all GitLab token types (glpat, gldt, glrt, glptt, glcbt, gloas, glsoat, glft,
glimt, glffct, glagent, GR1348941) are always detected by the pattern-based
scanner with high confidence.

Fixes: glpat detection issue

Co-authored-by: frjcomp <107982661+frjcomp@users.noreply.github.com>
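
As a rough illustration of the commit message above, the following Go sketch builds one pattern per listed prefix so that a token is reported by pattern matching alone, independent of any verification step. It is not the rule set from the PR: the trailing hyphen on each `gl*` prefix, the shared body `[a-zA-Z0-9\-=_]{20,}`, and the names `prefixRule`, `buildRules`, and `detectPrefixes` are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// Prefixes named in the commit message. Assumption: every gl* prefix is
// followed by a hyphen, and GR1348941 is followed directly by the token body.
var gitlabPrefixes = []string{
	"glpat-", "gldt-", "glrt-", "glptt-", "glcbt-", "gloas-",
	"glsoat-", "glft-", "glimt-", "glffct-", "glagent-", "GR1348941",
}

// Assumption: one shared token body for all prefixes; the real rules may
// use different lengths per token type.
const tokenBody = `[a-zA-Z0-9\-=_]{20,}`

// prefixRule is a hypothetical pairing of a prefix with its compiled pattern.
type prefixRule struct {
	prefix string
	re     *regexp.Regexp
}

// buildRules compiles one pattern per prefix, in list order.
func buildRules() []prefixRule {
	rules := make([]prefixRule, 0, len(gitlabPrefixes))
	for _, p := range gitlabPrefixes {
		re := regexp.MustCompile(`\b` + regexp.QuoteMeta(p) + tokenBody)
		rules = append(rules, prefixRule{prefix: p, re: re})
	}
	return rules
}

// detectPrefixes returns the prefixes whose pattern matches text,
// in the order of gitlabPrefixes.
func detectPrefixes(text string) []string {
	var hits []string
	for _, r := range buildRules() {
		if r.re.MatchString(text) {
			hits = append(hits, r.prefix)
		}
	}
	return hits
}

func main() {
	// Synthetic value; no network call or verification is involved.
	fmt.Println(detectPrefixes("CI_TOKEN=gldt-abcdefghij0123456789"))
}
```

Since nothing here contacts a GitLab host, a match is reported the same way for gitlab.com and for a self-hosted instance.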
Copilot AI changed the title from "[WIP] [issue_title] Update glpat detection to support self-hosted GitLab" to "Add built-in regex rules for GitLab token detection" on Mar 18, 2026
Copilot AI requested a review from frjcomp March 18, 2026 08:19
Copilot AI and others added 2 commits March 18, 2026 08:39
- Split glpat rule into v2 and v3 matching TruffleHog's exact regexes
- Add source URL comments above each regex (TruffleHog source or GitLab source)
- Align character class to [a-zA-Z0-9\-=_] matching TruffleHog's charset
- Add glrtr- (Runner Registration Token) which was missing
- Rename glrt- from "Runner Registration Token" to "Runner Authentication Token"
- Rename GR1348941 from "Runner Authentication Token (Legacy)" to "Runner Token (Legacy)"

Co-authored-by: frjcomp <107982661+frjcomp@users.noreply.github.com>
Co-authored-by: frjcomp <107982661+frjcomp@users.noreply.github.com>
Copilot AI changed the title from "Add built-in regex rules for GitLab token detection" to "Add built-in regex rules for GitLab token types aligned with TruffleHog patterns" on Mar 18, 2026
@frjcomp frjcomp marked this pull request as ready for review March 18, 2026 08:54
Copilot AI review requested due to automatic review settings March 18, 2026 08:54

Copilot AI left a comment

Pull request overview

This PR extends Pipeleek’s built-in scanner rules with regex-based detectors for multiple GitLab token formats so tokens are still detected even when TruffleHog verification suppresses unverified GitLab findings (notably for self-hosted GitLab instances).

Changes:

  • Added built-in regex rules for a range of GitLab token types (PAT v2/v3, runner tokens, deploy/build tokens, etc.).
  • Updated existing rules tests to account for the new built-in rules and added assertions that the new rules exist and are marked high confidence.
  • Added an engine-level test intended to validate detection behavior when TruffleHog verification is enabled.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pkg/scanner/rules/rules.go | Appends new built-in GitLab token regex rules to the secrets patterns list. |
| pkg/scanner/rules/rules_test.go | Updates rule-count expectations and adds a test ensuring all GitLab token rules exist with high confidence. |
| pkg/scanner/engine/engine_test.go | Adds a detection test for GitLab tokens while TruffleHog verification is enabled. |
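
The "rules exist with high confidence" check described for `rules_test.go` could look roughly like the sketch below. The `Rule` type, its field names, the `"high"` confidence string, and `missingHighConfidence` are hypothetical stand-ins; the actual types in `pkg/scanner/rules` are not shown in this conversation. The two rule names are taken from the PR description.

```go
package main

import "fmt"

// Rule is a hypothetical stand-in for the scanner's rule type.
type Rule struct {
	Name       string
	Confidence string
}

// missingHighConfidence returns every wanted rule name that is absent from
// rules or present only with a confidence other than "high".
func missingHighConfidence(rules []Rule, wanted []string) []string {
	high := make(map[string]bool, len(rules))
	for _, r := range rules {
		if r.Confidence == "high" {
			high[r.Name] = true
		}
	}
	var missing []string
	for _, name := range wanted {
		if !high[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	rules := []Rule{
		{Name: "Runner Authentication Token", Confidence: "high"},
		{Name: "Runner Token (Legacy)", Confidence: "medium"},
	}
	wanted := []string{"Runner Authentication Token", "Runner Token (Legacy)"}
	fmt.Println(missingHighConfidence(rules, wanted))
}
```

In a real test the returned slice would be asserted empty, so that a renamed or downgraded rule fails the build instead of silently dropping coverage.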

Comment thread pkg/scanner/rules/rules.go Outdated
Comment thread pkg/scanner/engine/engine_test.go
Update the GitLab Personal Access Token v3 regex to match the exact
pattern from TruffleHog's gitlab_v3.go detector, including \b word
boundaries and capture group. Pin the source URL to the specific commit.

Co-authored-by: frjcomp <107982661+frjcomp@users.noreply.github.com>
Copilot AI changed the title from "Add built-in regex rules for GitLab token types aligned with TruffleHog patterns" to "Align GitLab token regexes with TruffleHog and add source provenance" on Mar 18, 2026
@frjcomp frjcomp merged commit 6cd145a into main Mar 18, 2026
12 checks passed
@frjcomp frjcomp deleted the copilot/update-glpat-detection-logic branch March 18, 2026 10:11