Skip to content

Comments

Add allowed domains from network config to sanitization environment variable#2138

Closed
Copilot wants to merge 7 commits intomainfrom
copilot/add-allowed-domains-env-variable
Closed

Add allowed domains from network config to sanitization environment variable#2138
Copilot wants to merge 7 commits intomainfrom
copilot/add-allowed-domains-env-variable

Conversation

Copy link
Contributor

Copilot AI commented Oct 22, 2025

Problem

The sanitize_output.cjs JavaScript file has hardcoded default allowed domains, but it should receive the combined set of:

  1. Default GitHub-related domains for sanitization
  2. Custom domains from the network frontmatter configuration

Solution

This PR provides allowed domains to the JavaScript sanitization code via the GH_AW_ALLOWED_DOMAINS environment variable by combining default GitHub domains with network permission domains.

Implementation

Domain Management in Ecosystem System:

  • Created github-sanitization ecosystem in ecosystem_domains.json containing default GitHub domains (github.com, github.io, githubusercontent.com, githubassets.com, github.dev, codespaces.new)
  • Created playwright-localhost ecosystem in ecosystem_domains.json for localhost domains
  • Moved ComputeAllowedDomainsForSanitization function to domains.go to centralize domain management
  • Removed domain constants from constants.go - all domain lists now managed through the ecosystem system

Workflow Compilation:

  • Compiler always sets GH_AW_ALLOWED_DOMAINS environment variable in generated workflows
  • Combines github-sanitization domains with network permission domains (which expand ecosystem identifiers)
  • Deduplicates and sorts domains for consistency

How It Works:

  1. ComputeAllowedDomainsForSanitization() retrieves domains from network permissions using GetAllowedDomains()
  2. Always includes domains from the github-sanitization ecosystem
  3. Network permissions can specify ecosystem identifiers (defaults, python, etc.) or custom domains
  4. All domains are deduplicated and sorted before being set in GH_AW_ALLOWED_DOMAINS

Example

For a workflow with:

network:
  allowed:
    - defaults
    - python
    - example.com

The compiled workflow includes:

env:
  GH_AW_ALLOWED_DOMAINS: "*.pythonhosted.org,anaconda.org,...,github.com,github.io,...,example.com,..."

This provides:

  • GitHub sanitization domains (always included)
  • Ecosystem "defaults" domains (certificates, Ubuntu mirrors, etc.)
  • Ecosystem "python" domains (PyPI, Conda, etc.)
  • Custom domain (example.com)

Testing

  • ✅ All unit tests pass
  • ✅ All integration tests pass
  • ✅ JavaScript tests pass
  • ✅ Compiled workflows include GH_AW_ALLOWED_DOMAINS with complete domain list
  • ✅ Code formatted with go fmt

Benefits

  • Centralized Management: All domain lists managed in ecosystem_domains.json
  • Consistent Pattern: Sanitization and Playwright domains use the same ecosystem system
  • Maintainable: Domain updates only require changes to the JSON file
  • Secure: GitHub sanitization domains always included, network permissions add additional allowed domains
Original prompt

The list of allowed domains used in the url validation JavaScript code should be provided as an environment variable by the compiler and use by the JavaScript code.

The content of the variable is the default set of domains + custom domains provided by the user in the top level "network" front matter configuration.


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI and others added 3 commits October 22, 2025 14:10
- Create computeAllowedDomainsForSanitization helper function
- Combine default GitHub domains with network permissions
- Set GH_AW_ALLOWED_DOMAINS environment variable in compiled workflows
- Add comprehensive unit and integration tests
- Recompile all workflow lock files with new allowed domains

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Add environment variable for allowed domains in URL validation Add allowed domains from network config to sanitization environment variable Oct 22, 2025
Copilot AI requested a review from pelikhan October 22, 2025 14:20
// by combining default GitHub domains with domains from network permissions
func computeAllowedDomainsForSanitization(data *WorkflowData) []string {
// Start with default GitHub-related domains that match the JavaScript defaults
defaultDomains := []string{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot use existing default domains from network.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Moved the default GitHub domains to constants.DefaultSanitizationDomains in pkg/constants/constants.go and updated the compiler and tests to use this constant instead of hardcoding the domains. This centralizes the domain list for easier maintenance. (f7801f5)

- Move DefaultSanitizationDomains to pkg/constants/constants.go
- Update computeAllowedDomainsForSanitization to use constants.DefaultSanitizationDomains
- Update tests to use the constant instead of hardcoded arrays
- Simplifies maintenance and ensures consistency across codebase

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI requested a review from pelikhan October 22, 2025 15:23

// DefaultSanitizationDomains defines the default GitHub-related domains
// that are always allowed in output sanitization
var DefaultSanitizationDomains = []string{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot

  • remove, use default domains in domains.go instead
  • make sure these domains are in the default set

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Created github-sanitization ecosystem in ecosystem_domains.json and updated ComputeAllowedDomainsForSanitization to always include these domains. (10a09a3)

@@ -60,6 +60,17 @@ const DefaultMCPStartupTimeoutSeconds = 120
// that are always allowed for Playwright browser automation
var DefaultAllowedDomains = []string{"localhost", "localhost:*", "127.0.0.1", "127.0.0.1:*"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot move to domains.go as a domain set

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Created playwright-localhost ecosystem in ecosystem_domains.json with the localhost domains. (10a09a3)

// needsGitCommands checks if safe outputs configuration requires Git commands
// computeAllowedDomainsForSanitization computes the list of allowed domains for output sanitization
// by combining default GitHub domains with domains from network permissions
func computeAllowedDomainsForSanitization(data *WorkflowData) []string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot move to domains.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Moved ComputeAllowedDomainsForSanitization to domains.go and made it use ecosystem domains. (10a09a3)

Copilot AI and others added 2 commits October 22, 2025 16:21
- Add github-sanitization ecosystem to ecosystem_domains.json
- Add playwright-localhost ecosystem to ecosystem_domains.json
- Move ComputeAllowedDomainsForSanitization from compiler.go to domains.go
- Remove DefaultSanitizationDomains and DefaultAllowedDomains from constants.go
- Update all references to use ecosystem domains instead
- Sanitization always includes github-sanitization domains plus network permissions

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…ariable

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI requested a review from pelikhan October 22, 2025 16:23
@pelikhan pelikhan closed this Oct 22, 2025
@pelikhan pelikhan deleted the copilot/add-allowed-domains-env-variable branch October 23, 2025 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants