CI/CD Security Hardening: Extract tokens and secrets from run blocks into env mappings #24743

@dagecko

Context

I've been working with a significant number of open source maintainers over the last 5 weeks to harden their projects against the active supply chain attack campaign. Several maintainers requested that I look into this project because their generated workflow files inherit patterns that they're trying to fix downstream.

Drawing on what I learned from these maintainers, and to make the work more efficient, I built an open source CI/CD pipeline scanner called Runner Guard. It performs mechanical, non-AI analysis of GitHub Actions workflows to detect supply chain vulnerability patterns. I used it to scan this repo and found 422 instances where tokens and secrets are interpolated directly in run: blocks. 411 of these come from a single line in the Go generator code; the remaining 11 are in hand-written workflow files.

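When a token is referenced with a ${{ }} expression inside a run: block, GitHub Actions substitutes the value into the script text before the shell parses it, so any shell metacharacters in the value are interpreted as syntax. Moving the expression to an env: mapping passes the value as an environment variable instead; the shell expands it as data and never parses it as syntax.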
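The difference can be reproduced locally without GitHub Actions. The sketch below is only an approximation: it imitates expression substitution by pasting a made-up hostile value into the script text handed to `sh -c`, then passes the same value through the environment. The `payload` value and the variable names are illustrative assumptions; a real `GITHUB_TOKEN` would not normally contain such characters.

```shell
#!/bin/sh
# Hypothetical hostile value, chosen only to make the effect visible.
payload='x"; echo INJECTED; echo "'

# Interpolation: the value becomes part of the script text before the
# inner shell parses it, so the embedded quote and semicolons act as syntax.
interpolated_output=$(sh -c "echo \"token=$payload\"")

# env mapping: the script text is fixed and the value arrives through the
# environment, so the inner shell only ever expands it as data.
env_output=$(TOKEN="$payload" sh -c 'echo "token=$TOKEN"')

printf 'interpolated:\n%s\n\n' "$interpolated_output"
printf 'env:\n%s\n' "$env_output"
```

In the first case the inner shell runs an extra `echo INJECTED` command; in the second the whole payload is printed verbatim as part of one string.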

Agentic Implementation Plan

Step 1: Fix the generator - pkg/workflow/git_configuration_steps.go

This is the root cause for 411 of 422 findings.

File: pkg/workflow/git_configuration_steps.go
Function: generateGitConfigurationStepsWithToken

Current code (around line 42-48):
```go
return []string{
	"      - name: Configure Git credentials\n",
	"        env:\n",
	fmt.Sprintf("          REPO_NAME: %s\n", repoNameValue),
	"          SERVER_URL: ${{ github.server_url }}\n",
	"        run: |\n",
	// ... other lines ...
	// SECURITY: token is interpolated directly into the shell script
	// If the token value contained shell metacharacters they would be
	// interpreted as commands, not data. Moving to env: treats it as data.
	fmt.Sprintf("          git remote set-url origin \"https://x-access-token:%s@${SERVER_URL_STRIPPED}/${REPO_NAME}.git\"\n", token),
}
```

Change to:
```go
return []string{
	"      - name: Configure Git credentials\n",
	"        env:\n",
	fmt.Sprintf("          REPO_NAME: %s\n", repoNameValue),
	"          SERVER_URL: ${{ github.server_url }}\n",
	// SECURITY: token moved to env mapping so the shell treats it as data
	// not syntax. Prevents shell injection if token value contains metacharacters.
	// This also fixes the pattern inherited by all downstream .lock.yml consumers.
	fmt.Sprintf("          GITHUB_TOKEN: %s\n", token),
	"        run: |\n",
	// ... other lines ...
	"          git remote set-url origin \"https://x-access-token:${GITHUB_TOKEN}@${SERVER_URL_STRIPPED}/${REPO_NAME}.git\"\n",
}
```

Why: The token parameter is ${{ github.token }} or a custom token string. Currently it is pasted directly into the shell script. After the change, it is passed as an environment variable, and the script contains only ${GITHUB_TOKEN}, a variable reference that the shell expands inside double quotes at run time, so the value is never parsed as shell syntax.

Step 2: Fix ci.yml (line 2980)

File: .github/workflows/ci.yml

Current code:
```yaml
- name: Log in to GitHub Container Registry
  run: echo "${{ github.token }}" | docker login ghcr.io -u "${{ github.actor }}" --password-stdin
```

Change to:
```yaml
- name: Log in to GitHub Container Registry
  # SECURITY: token moved to env mapping to prevent shell interpretation
  # of the token value as syntax
  run: echo "${GITHUB_TOKEN}" | docker login ghcr.io -u "${{ github.actor }}" --password-stdin
  env:
    GITHUB_TOKEN: ${{ github.token }}
```

Why: Same pattern. github.actor can stay inline because it's a GitHub-controlled string (username, no metacharacters). The token is the sensitive value that needs protection.

Step 3: Fix permissions.yml (lines 32, 61, 89, 121, 148)

File: .github/workflows/permissions.yml

There are 5 jobs that each have the same pattern:

Current code (repeated in each job):
```yaml
run: |
  # ...
  code=$(curl -sS -o resp.json -w "%{http_code}" \
    -H "Authorization: Bearer ${{ secrets.GITHUB_TOKEN }}" \
  # ...
```

Change each to:
```yaml
# SECURITY: secret moved to env mapping to prevent shell interpretation
run: |
  # ...
  code=$(curl -sS -o resp.json -w "%{http_code}" \
    -H "Authorization: Bearer ${GITHUB_TOKEN}" \
  # ...
env:
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

Why: secrets.GITHUB_TOKEN is interpolated directly into a curl command in a shell script. Moving to env prevents the value from being interpreted as shell syntax.

Step 4: Fix remaining secret extractions in specific workflow files

Search all .lock.yml and .yml files for any remaining instances of ${{ secrets.* }} or ${{ github.token }} inside run: blocks (not in env: or with: blocks). For each instance:

  1. Add the secret/token to the step's env: block
  2. Replace the ${{ }} expression in the run: block with ${ENV_VAR_NAME}
  3. Add a comment: # SECURITY: moved to env mapping to prevent shell injection

Known instances beyond Steps 1-3:

  • ${{ secrets.TAVILY_API_KEY }} - 2 occurrences
  • ${{ secrets.SENTRY_API_KEY }} - 1 occurrence
  • ${{ secrets.GH_AW_SIDE_REPO_PAT }} - 1 occurrence
  • ${{ secrets.GH_AW_PROJECT_GITHUB_TOKEN }} - 1 occurrence
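To reconcile a search against the counts listed above, a per-secret tally of inline uses can help. The sketch below is a line-based heuristic, not a YAML parser: the function name `count_inline_secrets` is hypothetical, it assumes a grep that supports `-r` and `--include` (GNU and BSD grep do), and it treats any line whose entire value is a single `${{ }}` expression (the shape of an env: or with: entry) as safe. A step written as `run: ${{ secrets.X }}` with nothing else on the line would therefore be missed.

```shell
# count_inline_secrets DIR
# Prints "<count> secrets.<NAME>" for every secrets.* expression that appears
# on a line that is not a plain "KEY: ${{ ... }}" mapping entry.
count_inline_secrets() {
  # 1. Collect every line in *.yml files that mentions secrets.* in an expression.
  grep -rhE --include='*.yml' '\$\{\{[^}]*secrets\.' "$1" |
    # 2. Drop mapping entries such as "GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}".
    grep -vE '^[[:space:]]*-?[[:space:]]*[A-Za-z_][A-Za-z0-9_-]*:[[:space:]]*"?\$\{\{[^}]*\}\}"?[[:space:]]*$' |
    # 3. Keep only the secret names and tally them, most frequent first.
    grep -oE 'secrets\.[A-Za-z0-9_]+' |
    sort | uniq -c | sort -rn
}
```

Running `count_inline_secrets .github/workflows` from the repository root before and after the edits gives a quick view of which secrets are still referenced inline; each remaining entry still needs a manual look at the surrounding step.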

Step 5: Recompile all workflows

```bash
make recompile
```

This regenerates all .lock.yml files from their .md sources. The fix from Step 1 will propagate to all 180 generated workflow files automatically.

Step 6: Verify no trailing newline issues

Ensure all modified files end with a trailing newline. Run:
```bash
find .github/workflows -name "*.yml" -exec sh -c 'test "$(tail -c1 "$1")" && echo "Missing newline: $1"' _ {} \;
```

Step 7: Run validation

```bash
make agent-finish
```

This runs build, test, lint, recompile, and format checks.

Expected outcome

  • Step 1 fixes 411 findings in all generated .lock.yml files via a single Go code change
  • Steps 2-4 fix 11 findings in hand-written workflow files
  • All 422 instances of tokens/secrets in run blocks are moved to env mappings
  • All downstream repos consuming generated workflows will get the fix on next regeneration

Verification

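After the changes, confirm no tokens or secrets remain directly in run blocks. The checks are line-based: they list every line containing the expression and then drop lines where the expression is the entire value of a YAML key, which is the form used by env: and with: entries. Filtering on run: or env: appearing on the same line would miss multi-line run: | blocks and would still report the new env: entries.
```bash
grep -rnE '\$\{\{[^}]*secrets\.' .github/workflows/ | grep -vE '^[^:]+:[0-9]+:[[:space:]]*-?[[:space:]]*[A-Za-z_][A-Za-z0-9_-]*:[[:space:]]*"?\$\{\{[^}]*\}\}"?[[:space:]]*$' | grep -v '#'
grep -rnE '\$\{\{[^}]*github\.token' .github/workflows/ | grep -vE '^[^:]+:[0-9]+:[[:space:]]*-?[[:space:]]*[A-Za-z_][A-Za-z0-9_-]*:[[:space:]]*"?\$\{\{[^}]*\}\}"?[[:space:]]*$' | grep -v '#'
```

Both commands should return no results. Any line that is still reported should be reviewed by hand.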

- Chris Nyhuis (dagecko)
