Add generic email domain whitelist middleware with configuration file #121

Copilot · 2025-11-25T23:09:46Z

Implements a generic middleware that restricts access based on an email domain whitelist, controlled by a configuration file and the FEATURE_DOMAIN_WHITELIST_ENABLED flag. DOE/NNSA/national lab domains are provided as the default example.

Implementation

Middleware: backend/core/domain_whitelist_middleware.py
- Generic domain whitelist validation (not DOE-specific)
- Loads domains from JSON configuration file
- Supports subdomains (e.g., user@mail.sandia.gov)
- Bypasses /api/health and auth redirect endpoints
- Returns JSON 403 for API paths, redirect 302 for UI paths
Configuration Manager: backend/core/domain_whitelist.py
- Loads domains from config/defaults/domain-whitelist.json or config/overrides/domain-whitelist.json
- Optimized subdomain matching (O(log n) parent-level checking)
- Dual control: config file enabled field + environment variable
Configuration Files:
- config/defaults/domain-whitelist.json - Default config with DOE labs as example (22 domains)
- config/overrides/domain-whitelist.json - User customization (takes precedence)
- config/defaults/domain-whitelist-example.json - Template for custom configs
- config/defaults/DOMAIN_WHITELIST_README.md - Complete documentation
Feature flag: FEATURE_DOMAIN_WHITELIST_ENABLED in AppSettings
- Default: false
- Added to .env.example
- Backward compatible: FEATURE_DOE_LAB_CHECK_ENABLED still works
Integration: Added to middleware stack in main.py
- Positioned before AuthMiddleware in registration order (runs after in request flow)
- Only registered when feature flag is enabled
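
The layered lookup described above (overrides take precedence over defaults) could be sketched roughly as follows. This is only an illustration: `resolve_whitelist_path` and `CONFIG_NAME` are hypothetical names, and the real logic lives in backend/core/domain_whitelist.py and may differ.

```python
from pathlib import Path
from typing import Optional

CONFIG_NAME = "domain-whitelist.json"


def resolve_whitelist_path(config_root: Path) -> Optional[Path]:
    """Return the first existing config file, preferring overrides over defaults."""
    for layer in ("overrides", "defaults"):
        candidate = config_root / layer / CONFIG_NAME
        if candidate.is_file():
            return candidate
    # Neither layer provides a whitelist file.
    return None
```

Calling `resolve_whitelist_path(Path("config"))` would return the override file when it exists and fall back to the default one otherwise.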

Usage

Enable with default DOE configuration

# Set in config/defaults/domain-whitelist.json
"enabled": true

# Set environment variable
FEATURE_DOMAIN_WHITELIST_ENABLED=true
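
A minimal sketch of the dual control, assuming that both the config file's `enabled` field and the environment flag must be on for the check to apply (the usage steps set both, but the exact combination rule is an assumption). `whitelist_active`, `TRUTHY`, and `FLAG_NAMES` are hypothetical names; the PR reads the flag through AppSettings rather than a plain mapping.

```python
from typing import Mapping

TRUTHY = {"1", "true", "yes", "on"}
# New flag name first, then the older name that the PR keeps working.
FLAG_NAMES = ("FEATURE_DOMAIN_WHITELIST_ENABLED", "FEATURE_DOE_LAB_CHECK_ENABLED")


def whitelist_active(config: Mapping[str, object], env: Mapping[str, str]) -> bool:
    """True only when the config file and at least one env flag enable the check."""
    flag_on = any(env.get(name, "").strip().lower() in TRUTHY for name in FLAG_NAMES)
    return flag_on and config.get("enabled") is True
```

For example, `whitelist_active({"enabled": True}, os.environ)` stays false until one of the two flags is set to a truthy value.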

Custom domain configuration

Create config/overrides/domain-whitelist.json:

{
  "version": "1.0",
  "enabled": true,
  "domains": [
    {"domain": "mycompany.com", "description": "My Company"},
    {"domain": "partner.org", "description": "Partner Organization"}
  ],
  "subdomain_matching": true
}

Configuration Format

{
  "version": "1.0",
  "description": "Domain whitelist configuration",
  "enabled": true,
  "domains": [
    {
      "domain": "example.com",
      "description": "Example Corporation",
      "category": "Enterprise"
    }
  ],
  "subdomain_matching": true
}
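
As a rough illustration of how a document in this format might be turned into a lookup set, the sketch below reads the `domains` list, accepts both object entries and bare strings, lowercases them, and skips blank values. `parse_domains` is a hypothetical helper, not the PR's DomainWhitelistManager.

```python
import json
from typing import Set


def parse_domains(raw_json: str) -> Set[str]:
    """Collect normalized domains from a whitelist config document."""
    data = json.loads(raw_json)
    domains: Set[str] = set()
    for entry in data.get("domains", []):
        # Entries may be objects with a "domain" key or bare strings.
        value = entry.get("domain", "") if isinstance(entry, dict) else entry
        if isinstance(value, str):
            value = value.strip().lower()
            if value:  # skip blank entries so "" never lands in the set
                domains.add(value)
    return domains
```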

Test Coverage

8 tests covering:

Configuration loading and validation
Domain matching (exact and subdomain)
Middleware behavior (allow/deny/bypass)
Health endpoint bypassing
Disabled state handling
Custom config paths

Performance

Direct domain lookup: O(1) set membership
Subdomain matching: O(log n) parent-level checking
Example: mail.dept.sandia.gov checks 2 parent levels instead of iterating 22 domains
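
One way the parent-level check could look is sketched below. It is an assumption about the approach, not the code in the PR: the function tries the exact domain first and then each parent domain, so the number of set lookups depends on how many labels the address has rather than on the size of the whitelist. The signature (an email plus an explicit `allowed` set) is chosen for illustration only.

```python
from typing import Set


def is_domain_allowed(email: str, allowed: Set[str], subdomain_matching: bool = True) -> bool:
    """Check an email's domain, then each parent domain, against the allowed set."""
    local, sep, domain = email.partition("@")
    domain = domain.strip().lower()
    if not sep or not local or not domain:
        return False
    if domain in allowed:
        return True
    if not subdomain_matching:
        return False
    labels = domain.split(".")
    # Try dept.sandia.gov, then sandia.gov: one set lookup per parent level.
    # The loop stops before the bare top-level label ("gov").
    for i in range(1, len(labels) - 1):
        if ".".join(labels[i:]) in allowed:
            return True
    return False
```

With `allowed = {"sandia.gov"}`, the address `user@mail.dept.sandia.gov` is accepted after two parent lookups, while `user@mail.gmail.com` and `user@evilsandia.gov` are rejected.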

Documentation

Complete documentation available in config/defaults/DOMAIN_WHITELIST_README.md:

Configuration format and fields
Multiple example configurations (government, corporate, educational)
Enabling instructions
Troubleshooting guide
Logging details

Original prompt

add this DOE check behind a feature flag. make the check more compact. make code improvements.

from fastapi import FastAPI, Request, HTTPException

app = FastAPI()

def doe_lab_email_middleware(app: FastAPI):
    DOE_LAB_DOMAINS = {
        # HQ / NNSA / DOE-wide
        "doe.gov","nnsa.doe.gov","hq.doe.gov",

        # National labs (not exhaustive but broad coverage)
        "anl.gov","bnl.gov","fnal.gov","inl.gov","lbl.gov","lanl.gov","llnl.gov",
        "ornl.gov","pnnl.gov","sandia.gov","srnl.doe.gov","ameslab.gov",
        "jlab.org","princeton.edu","slac.stanford.edu","pppl.gov","nrel.gov",
        "netl.doe.gov","stanford.edu",  # SLAC/Stanford mix
    }

    @app.middleware("http")
    async def _check_doe_email(request: Request, call_next):
        # Expect email in header; adjust name if needed
        email = request.headers.get("x-user-email")
        if not email or "@" not in email:
            raise HTTPException(status_code=401, detail="User email required")

        domain = email.split("@", 1)[1].lower()
        if not any(domain == d or domain.endswith("." + d) for d in DOE_LAB_DOMAINS):
            raise HTTPException(status_code=403, detail="Access restricted to DOE / NNSA / DOE labs")

        return await call_next(request)

    return app

use it

doe_lab_email_middleware(app)

Co-authored-by: garland3 <1162675+garland3@users.noreply.github.com>

garland3 · 2025-11-26T01:35:24Z

@copilot make it more generic with the doe as an example. so an ability to white list some domains. set the config file in the config folder similar to the existing config files.

Co-authored-by: garland3 <1162675+garland3@users.noreply.github.com>

Copilot · 2025-11-26T01:42:36Z

@copilot make it more generic with the doe as an example. so an ability to white list some domains. set the config file in the config folder similar to the existing config files.

Made the middleware generic with config file support. Changes in commits 2092ede and 97bca0f:

Generic Implementation:

Renamed DOELabMiddleware → DomainWhitelistMiddleware
Created DomainWhitelistManager to load domains from config
Feature flag: FEATURE_DOMAIN_WHITELIST_ENABLED (backward compatible with old name)

Config Files:

config/defaults/domain-whitelist.json - Default config with DOE as example (22 domains)
config/overrides/domain-whitelist.json - For customization (takes precedence)
config/defaults/domain-whitelist-example.json - Template

Usage:

Create config/overrides/domain-whitelist.json with your domains
Set "enabled": true in config file
Set FEATURE_DOMAIN_WHITELIST_ENABLED=true env var

See config/defaults/DOMAIN_WHITELIST_README.md for complete documentation and examples.

Co-authored-by: garland3 <1162675+garland3@users.noreply.github.com>

backend/tests/test_domain_whitelist.py

+        manager = DomainWhitelistManager(config_path=temp_config)
+
+        assert manager.is_enabled() is True
+        assert "sandia.gov" in manager.get_domains()


backend/tests/test_domain_whitelist.py

+
+        assert manager.is_enabled() is True
+        assert "sandia.gov" in manager.get_domains()
+        assert "doe.gov" in manager.get_domains()


backend/tests/test_domain_whitelist.py

+        assert manager.is_enabled() is True
+        assert "sandia.gov" in manager.get_domains()
+        assert "doe.gov" in manager.get_domains()
+        assert "example.org" in manager.get_domains()


backend/core/domain_whitelist.py

backend/tests/test_domain_whitelist.py

github-actions · 2025-11-26T02:31:09Z

🔒 Security Scan Results

Security Scan Summary

Scan Results

Python SAST (Bandit)

⚠️ Security issues found in Python code

Recommendations

Review all SARIF files uploaded to GitHub Security tab
Address high and critical severity vulnerabilities immediately
Run npm audit fix and pip-audit locally to fix dependencies
Consider implementing additional security controls

garland3 · 2025-11-26T02:45:57Z

Confirmed by testing that this works with a test config file and the feature turned on.

The CodeQL checks are false positives.

@ktpedre This works.

Copilot

Pull request overview

This PR implements a generic email domain whitelist middleware to restrict application access based on user email domains. The feature is controlled by a configuration file (domain-whitelist.json) and a feature flag (FEATURE_DOMAIN_WHITELIST_ENABLED), making it flexible for different deployment scenarios (government labs, corporate, educational institutions). The implementation includes optimized subdomain matching (O(log n) instead of O(n)), comprehensive test coverage, and detailed documentation.

Key changes:

Generic, configurable domain whitelist middleware (not DOE-specific despite example config)
Layered configuration system with override support
Backward compatibility via FEATURE_DOE_LAB_CHECK_ENABLED alias
Proper middleware ordering to leverage request.state.user_email from AuthMiddleware

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `backend/core/domain_whitelist_middleware.py` | New middleware enforcing email domain restrictions with endpoint-specific responses (JSON 403 for API, redirect for UI) |
| `backend/core/domain_whitelist.py` | Configuration manager with optimized subdomain matching and layered config file search |
| `backend/modules/config/config_manager.py` | Added `feature_domain_whitelist_enabled` field with backward-compatible alias |
| `backend/main.py` | Registered middleware conditionally based on feature flag |
| `config/defaults/domain-whitelist.json` | Default configuration with 22 DOE/NNSA/national lab domains as example |
| `backend/tests/test_domain_whitelist.py` | Test suite covering configuration loading, domain matching, and middleware behavior |
| `docs/admin/domain-whitelist.md` | Comprehensive documentation with examples and troubleshooting guide |
| `.env.example` | Added `FEATURE_DOMAIN_WHITELIST_ENABLED` flag documentation |

Copilot · 2025-11-26T02:50:29Z

backend/tests/test_domain_whitelist.py

+    def test_invalid_email(self, temp_config):
+        """Test handling of invalid email addresses."""
+        manager = DomainWhitelistManager(config_path=temp_config)
+
+        assert manager.is_domain_allowed("notanemail") is False
+        assert manager.is_domain_allowed("") is False
+        assert manager.is_domain_allowed("no-at-sign.com") is False
+


Missing test coverage: There's no test case for when a user has an email with a subdomain that should be rejected. For example, testing user@mail.gmail.com when only sandia.gov is whitelisted, to ensure subdomain matching only works for whitelisted parent domains, not for non-whitelisted domains.

Consider adding a test case:

    def test_subdomain_of_non_whitelisted_domain(self, temp_config):
        """Test that subdomains of non-whitelisted domains are rejected."""
        manager = DomainWhitelistManager(config_path=temp_config)
        # gmail.com is not whitelisted, so mail.gmail.com should also be rejected
        assert manager.is_domain_allowed("user@mail.gmail.com") is False

Copilot · 2025-11-26T02:50:29Z

backend/tests/test_domain_whitelist.py

+    def test_middleware_with_disallowed_domain(self, temp_config, create_middleware):
+        """Test that disallowed domains are blocked."""
+        from starlette.requests import Request
+        from starlette.responses import Response
+
+        middleware = create_middleware(temp_config)
+
+        async def call_next(request):
+            return Response("OK", status_code=200)
+
+        async def test_request():
+            scope = {
+                "type": "http",
+                "method": "GET",
+                "path": "/api/test",
+                "query_string": b"",
+                "headers": [],
+                "state": {},
+            }
+            request = Request(scope)
+            request.state.user_email = "test@gmail.com"
+
+            response = await middleware.dispatch(request, call_next)
+            assert response.status_code == 403
+
+        import asyncio
+        asyncio.run(test_request())


Missing test coverage: The middleware's _unauthorized_response method returns different responses based on whether the path starts with /api/ (JSON 403) or not (redirect 302), but there's no test verifying the redirect behavior for non-API paths.

Consider adding a test case:

    def test_middleware_redirect_for_ui_path(self, temp_config, create_middleware):
        """Test that non-API paths get redirected instead of 403."""
        from starlette.requests import Request
        from starlette.responses import Response, RedirectResponse

        middleware = create_middleware(temp_config)

        async def call_next(request):
            return Response("OK", status_code=200)

        async def test_request():
            scope = {
                "type": "http",
                "method": "GET",
                "path": "/dashboard",  # Non-API path
                "query_string": b"",
                "headers": [],
                "state": {},
            }
            request = Request(scope)
            request.state.user_email = "test@gmail.com"

            response = await middleware.dispatch(request, call_next)
            assert isinstance(response, RedirectResponse)
            assert response.status_code == 302

        import asyncio
        asyncio.run(test_request())
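
For reference, the path-dependent behavior under test can be modeled without any web framework. The sketch below is a simplified stand-in for `_unauthorized_response`, returning a plain (status, headers, body) tuple; `LOGIN_PATH` and the error message are hypothetical, since the actual redirect target and wording are not shown in this conversation.

```python
import json
from typing import Dict, Tuple

LOGIN_PATH = "/auth"  # hypothetical redirect target, not taken from the PR


def unauthorized_response(path: str) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for a request from a non-whitelisted domain."""
    if path.startswith("/api/"):
        # API clients get a machine-readable 403.
        body = json.dumps({"detail": "Access restricted to whitelisted domains"})
        return 403, {"content-type": "application/json"}, body
    # Browser (UI) paths are redirected instead.
    return 302, {"location": LOGIN_PATH}, ""
```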

Copilot · 2025-11-26T02:50:29Z

backend/core/domain_whitelist.py

+                    domains.add(domain_entry.get('domain', '').lower())
+                elif isinstance(domain_entry, str):
+                    domains.add(domain_entry.lower())


Potential bug: Empty domain strings can be added to the whitelist set. If a domain entry has an empty domain field (line 77), it will add an empty string to the domains set. This could cause unexpected behavior where emails like user@ (malformed) might incorrectly match against an empty domain.

Add validation to skip empty domains:

    for domain_entry in config_data.get('domains', []):
        if isinstance(domain_entry, dict):
            domain = domain_entry.get('domain', '').lower().strip()
            if domain:  # Only add non-empty domains
                domains.add(domain)
        elif isinstance(domain_entry, str):
            domain = domain_entry.lower().strip()
            if domain:  # Only add non-empty domains
                domains.add(domain)

Suggested change

-                    domains.add(domain_entry.get('domain', '').lower())
-                elif isinstance(domain_entry, str):
-                    domains.add(domain_entry.lower())
+                    domain = domain_entry.get('domain', '').lower().strip()
+                    if domain:
+                        domains.add(domain)
+                elif isinstance(domain_entry, str):
+                    domain = domain_entry.lower().strip()
+                    if domain:
+                        domains.add(domain)

Copilot · 2025-11-26T02:50:30Z

config/defaults/domain-whitelist.json

+      "domain": "pppl.gov",
+      "description": "Princeton Plasma Physics Laboratory",
+      "category": "National Laboratory"
+    },
+    {


Duplicate domain entry: "pppl.gov" appears twice in the domains list (lines 92-95 and 102-105). This will not cause runtime errors since the domains are stored in a set, but creates unnecessary redundancy in the configuration file and could be confusing for users editing it.

Remove one of the duplicate entries.

Suggested change

-      "domain": "pppl.gov",
-      "description": "Princeton Plasma Physics Laboratory",
-      "category": "National Laboratory"
-    },
-    {

Copilot · 2025-11-26T02:50:30Z

backend/main.py

 """
 app.add_middleware(SecurityHeadersMiddleware)
 app.add_middleware(RateLimitMiddleware)
+# Domain whitelist check (if enabled) - add before Auth so it runs after


[nitpick] The comment "add before Auth so it runs after" is potentially confusing. In FastAPI/Starlette middleware, when you call app.add_middleware(), middlewares are registered in reverse order of execution. The middleware added last runs first in the request processing chain.

In this code:

SecurityHeadersMiddleware is added first → runs last (modifies response headers)

RateLimitMiddleware is added second → runs third

DomainWhitelistMiddleware is added third → runs second

AuthMiddleware is added last → runs first

So DomainWhitelistMiddleware actually runs after AuthMiddleware in the request flow, which is correct (it needs request.state.user_email set by AuthMiddleware). However, the comment says "add before Auth so it runs after" which might be confusing since we're adding it before the AuthMiddleware registration.

Consider rewording to: "Add DomainWhitelistMiddleware before AuthMiddleware registration (executes after AuthMiddleware in request flow)" for clarity.

Suggested change

-# Domain whitelist check (if enabled) - add before Auth so it runs after
+# Domain whitelist check (if enabled) - add before AuthMiddleware registration (executes after AuthMiddleware in request flow)
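
The ordering described in this comment can be illustrated with a small toy model. It is not Starlette's implementation: `ToyApp` and `wrap` are hypothetical stand-ins that only mimic the behavior stated above, where the middleware added last ends up outermost and therefore sees the request first.

```python
from typing import Callable, List

Handler = Callable[[List[str]], None]


def wrap(name: str, inner: Handler) -> Handler:
    """Wrap `inner` so that `name` is recorded before the request is passed on."""
    def handler(trace: List[str]) -> None:
        trace.append(name)
        inner(trace)
    return handler


class ToyApp:
    def __init__(self) -> None:
        self.registered: List[str] = []

    def add_middleware(self, name: str) -> None:
        # Newest registration goes to the front of the list.
        self.registered.insert(0, name)

    def handle(self) -> List[str]:
        handler: Handler = lambda trace: trace.append("endpoint")
        # Wrap from the back of the list so the front entry ends up outermost.
        for name in reversed(self.registered):
            handler = wrap(name, handler)
        trace: List[str] = []
        handler(trace)
        return trace


app = ToyApp()
for name in ("SecurityHeaders", "RateLimit", "DomainWhitelist", "Auth"):
    app.add_middleware(name)
print(app.handle())
```

In this model the request passes through Auth, then DomainWhitelist, then RateLimit, then SecurityHeaders before reaching the endpoint, which matches the order listed in the comment.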

Copilot · 2025-11-26T02:50:30Z

backend/core/domain_whitelist_middleware.py

+        # Check if domain is allowed
+        if not self.whitelist_manager.is_domain_allowed(email):
+            domain = email.split("@", 1)[1].lower()
+            logger.warning(f"Domain whitelist check failed: unauthorized domain {domain}")


[nitpick] Potential security information disclosure: The middleware logs the rejected domain at WARNING level when a domain is not whitelisted. This could expose information about which domains are attempting to access the system, which might be sensitive in some security contexts.

Consider either:

Lowering this to DEBUG level to reduce exposure in production logs

Using a more generic log message without the specific domain

Ensuring logs are properly secured if this information is intentionally captured for security monitoring

Example:

    logger.debug(f"Domain whitelist check failed: unauthorized domain {domain}")
    # or
    logger.warning("Domain whitelist check failed: unauthorized domain")

Suggested change

-            logger.warning(f"Domain whitelist check failed: unauthorized domain {domain}")
+            logger.warning("Domain whitelist check failed: unauthorized domain")

Initial plan

d9c173e

Copilot AI assigned Copilot and garland3 Nov 25, 2025

Copilot started work on behalf of garland3 November 25, 2025 23:10 View session

Add DOE lab email domain check middleware with feature flag

f0988a3

Co-authored-by: garland3 <1162675+garland3@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Add DOE email check behind a feature flag~~ Add DOE lab email domain restriction middleware with feature flag Nov 25, 2025

Copilot AI requested a review from garland3 November 25, 2025 23:24

Copilot finished work on behalf of garland3 November 25, 2025 23:24

Copilot started work on behalf of garland3 November 26, 2025 01:35 View session

Copilot AI and others added 2 commits November 26, 2025 01:41

Refactor to generic domain whitelist with config file

2092ede

Co-authored-by: garland3 <1162675+garland3@users.noreply.github.com>

Add documentation for domain whitelist configuration

97bca0f

Co-authored-by: garland3 <1162675+garland3@users.noreply.github.com>

Address code review feedback: optimize and refactor

abdb438

Co-authored-by: garland3 <1162675+garland3@users.noreply.github.com>

Copilot AI changed the title ~~Add DOE lab email domain restriction middleware with feature flag~~ Add generic email domain whitelist middleware with configuration file Nov 26, 2025

Copilot finished work on behalf of garland3 November 26, 2025 01:48

garland3 added 2 commits November 26, 2025 02:25

feat(docs): add email domain whitelist configuration documentation

b585121

chore(config): remove example domain whitelist configuration file

797b58e

github-advanced-security bot found potential problems Nov 26, 2025

fix: remove unused imports (Request, TestClient, List)

669044f

- Remove unused Request and TestClient imports from test_domain_whitelist.py
- Remove unused List import from domain_whitelist.py
- Addresses CodeQL static analysis warnings
- All tests passing

garland3 marked this pull request as ready for review November 26, 2025 02:46

Copilot AI review requested due to automatic review settings November 26, 2025 02:46

garland3 merged commit 194a23b into main Nov 26, 2025
10 of 11 checks passed

garland3 deleted the copilot/add-doe-check-feature-flag branch November 26, 2025 02:46

Copilot started reviewing on behalf of garland3 November 26, 2025 02:46 View session

Copilot finished reviewing on behalf of garland3 November 26, 2025 02:49

Copilot AI reviewed Nov 26, 2025

