
Populate Scripts/data_pipeline/.env in cloud sessions via SessionStart hook #18

Draft
lighthousemacro wants to merge 1 commit into main from claude/add-env-file-lhm-ecH1l

Conversation

@lighthousemacro
Owner

@lighthousemacro lighthousemacro commented May 1, 2026

Summary

  • Cloud sessions used to start with no Scripts/data_pipeline/.env, so FRED/BLS/BEA fetchers couldn't authenticate without manual setup. This adds a SessionStart hook that mirrors cloud-injected env vars (intended to come from GitHub repository secrets) into the .env file, and also exports them via $CLAUDE_ENV_FILE so the env-var fallback in Scripts/data_pipeline/lighthouse/config.py:11-17 works when python-dotenv isn't installed.
  • The hook is non-destructive — if a given env var isn't set in the session, the existing .env value is preserved (matters for resume/clear/compact re-fires).
  • .gitignore now tracks .claude/hooks/ and .claude/settings.json while continuing to ignore the rest of .claude/. Actual .env values remain gitignored.
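
The env-var fallback mentioned in the first bullet can be pictured roughly as below. This is a minimal sketch only: the actual contents of config.py are not shown in this PR, so `load_keys`, `REQUIRED_KEYS`, and the `environ` parameter are hypothetical names used for illustration. It assumes python-dotenv is optional and that consumers read keys from the process environment either way.

```python
import os

try:
    # python-dotenv is optional; load_dotenv() copies KEY=VALUE pairs into os.environ.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None

# Hypothetical constant: the three keys the PR describes as required.
REQUIRED_KEYS = ("FRED_API_KEY", "BLS_API_KEY", "BEA_API_KEY")


def load_keys(env_path="Scripts/data_pipeline/.env", environ=None):
    """Return the required keys, reading .env first when python-dotenv is available."""
    if environ is None:
        if load_dotenv is not None:
            load_dotenv(env_path)
        # Without python-dotenv, only variables already exported into the process are seen.
        environ = os.environ
    return {name: environ.get(name, "") for name in REQUIRED_KEYS}
```

When python-dotenv is missing, the lookup only succeeds if the keys are already in the process environment, which is the gap the $CLAUDE_ENV_FILE export is meant to close.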

How it works

On each session start, the hook walks a known key list (FRED_API_KEY, BLS_API_KEY, BEA_API_KEY, KIMI_API_KEY, TG_API_KEY, plus optional LIGHTHOUSE_DB_PATH / SANTIMENT_API_KEY / DUNE_API_KEY / COINGLASS_API_KEY). For each key:

  1. If the env var is set in the session → write that value to .env and to $CLAUDE_ENV_FILE.
  2. If not → keep whatever value is already on the corresponding line in .env.
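
The two rules above can be sketched in Python as follows. The hook itself is a bash script; this is an illustrative model of the resolution logic, not the script's code. The names `resolve` and `merge` are hypothetical, and treating an empty env var the same as an unset one is an assumption.

```python
# Key list as given in the PR description.
KEYS = [
    "FRED_API_KEY", "BLS_API_KEY", "BEA_API_KEY", "KIMI_API_KEY", "TG_API_KEY",
    "LIGHTHOUSE_DB_PATH", "SANTIMENT_API_KEY", "DUNE_API_KEY", "COINGLASS_API_KEY",
]


def resolve(name, session_env, existing):
    """Prefer the session's value; otherwise keep the value already in .env."""
    value = session_env.get(name, "")
    # Assumption: an empty string counts as "not set".
    return value if value else existing.get(name, "")


def merge(session_env, existing):
    """Resolve every known key, producing the contents of the next .env."""
    return {name: resolve(name, session_env, existing) for name in KEYS}
```

Because unset variables fall back to the stored value, running `merge` a second time with an empty session environment returns the same mapping, which is the idempotence property the test plan checks.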

To make this useful in CI/cloud, configure each name as a GitHub repository secret so that it is injected as an env var into Claude Code on the web sessions. Secret names must match the key names exactly.

Test plan

  • Hook runs idempotently — second run preserves values written by the first
  • .env file written with correct value lengths for all five user-supplied keys
  • $CLAUDE_ENV_FILE produces sourceable export lines that re-create the env vars in a shell
  • Simulated config.py consumer reads required keys (FRED/BLS/BEA) successfully via env-var fallback
  • git ls-files --others --exclude-standard shows only the hook + settings.json (no .env leakage)
  • Verify on next fresh cloud session that the hook fires and .env is auto-populated from configured GitHub secrets
  • Run python Scripts/data_pipeline/run_pipeline.py --stats against a fresh session to confirm authentication
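
The "sourceable export lines" check in the test plan can be modelled as a round trip: build the lines, then tokenize them the way a POSIX shell would. This sketch uses Python's `shlex.quote` for escaping, which is a stand-in and not the hook's own escaping routine (the reviewer's guide describes double-quoted `export key="value"` lines). The function names are hypothetical.

```python
import shlex


def export_lines(values):
    """Build `export NAME=value` lines, skipping keys whose value is empty."""
    return [
        f"export {name}={shlex.quote(value)}"
        for name, value in values.items()
        if value
    ]


def parse_export_lines(lines):
    """Recover NAME -> value by tokenizing each line with POSIX shell rules."""
    parsed = {}
    for line in lines:
        _, assignment = shlex.split(line)
        name, _, value = assignment.partition("=")
        parsed[name] = value
    return parsed
```

Values containing spaces, `$`, or quotes are the interesting cases: if the round trip returns them unchanged, the quoting is safe to source.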

https://claude.ai/code/session_016GAGsSXbdzM8QmsR2Cmgeu


Generated by Claude Code

Summary by Sourcery

Add a cloud session startup hook to sync data pipeline credentials from environment variables into Scripts/data_pipeline/.env and ensure they are available to non-dotenv consumers.

Enhancements:

  • Introduce a SessionStart shell hook that mirrors configured API key environment variables into Scripts/data_pipeline/.env for cloud sessions while preserving existing values when env vars are unset.
  • Export mirrored API keys into the session via a CLAUDE_ENV_FILE-compatible format so code paths that rely on standard environment variables still function without python-dotenv.
  • Emit warnings on session start when required data pipeline API keys remain unset to surface misconfigured or missing secrets earlier.
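
The warning step in the last bullet could look roughly like this. It is a sketch under assumptions: the function names and the message wording are hypothetical and are not taken from the hook script.

```python
import sys

# The three keys the PR treats as required.
REQUIRED_KEYS = ("FRED_API_KEY", "BLS_API_KEY", "BEA_API_KEY")


def missing_required(values):
    """Return the required key names whose resolved value is empty or absent."""
    return [name for name in REQUIRED_KEYS if not values.get(name)]


def warn_missing(values, stream=sys.stderr):
    """Print one warning per missing required key and return the missing names."""
    missing = missing_required(values)
    for name in missing:
        print(f"warning: {name} is not set; fetchers may fail to authenticate", file=stream)
    return missing
```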

Build:

  • Track .claude hook scripts and settings.json in version control while continuing to ignore sensitive .env contents and the rest of the .claude directory.

Cloud sessions previously started with no Scripts/data_pipeline/.env, so
the data pipeline could not authenticate to FRED/BLS/BEA without manual
setup. This adds a SessionStart hook that mirrors GitHub-injected
environment variables into Scripts/data_pipeline/.env (for python-dotenv
consumers) and into $CLAUDE_ENV_FILE (so env-var fallback works when
python-dotenv is not installed). The hook is non-destructive: missing
env vars preserve any existing value in .env across resume/clear/compact
events.

Update .gitignore to track .claude/hooks/ and .claude/settings.json
while keeping the rest of .claude/ ignored. Actual .env values remain
gitignored.

https://claude.ai/code/session_016GAGsSXbdzM8QmsR2Cmgeu
@sourcery-ai

sourcery-ai Bot commented May 1, 2026

Reviewer's Guide

Implements a Claude Code SessionStart hook script that syncs cloud-injected environment variables into Scripts/data_pipeline/.env and an exportable env file, while updating .gitignore and adding Claude configuration files needed for the hook to run in cloud sessions.

Sequence diagram for SessionStart hook populating data_pipeline .env

sequenceDiagram
    actor Developer
    participant GitHubSecrets
    participant ClaudeCloudSession
    participant SessionStartHook
    participant EnvFile as DataPipelineEnvFile
    participant ClaudeEnvFile as ClaudeEnvExportFile
    participant ConfigConsumer as DataPipelineConfig

    Developer->>GitHubSecrets: Configure FRED_API_KEY, BLS_API_KEY, BEA_API_KEY, etc.
    Developer->>ClaudeCloudSession: Start cloud session
    GitHubSecrets-->>ClaudeCloudSession: Inject env vars into session

    ClaudeCloudSession->>SessionStartHook: Trigger SessionStart

    alt Running in remote Claude Code
        SessionStartHook->>SessionStartHook: Check CLAUDE_CODE_REMOTE == true
        SessionStartHook->>EnvFile: Read existing .env values (if file exists)
        loop For each key in KEYS
            SessionStartHook->>ClaudeCloudSession: Read env var for key
            alt Env var set
                SessionStartHook->>EnvFile: Write key=value from session env
            else Env var not set
                SessionStartHook->>EnvFile: Preserve existing key=value
            end
        end
        SessionStartHook->>EnvFile: Atomically replace .env and chmod 600

        opt CLAUDE_ENV_FILE set
            loop For each key in KEYS
                SessionStartHook->>ClaudeEnvFile: Append export key="value"
            end
        end

        SessionStartHook->>SessionStartHook: Check required keys FRED_API_KEY, BLS_API_KEY, BEA_API_KEY
        alt Any required key missing
            SessionStartHook-->>Developer: Print warning to stderr
        end

        SessionStartHook-->>Developer: Print path to written .env
    else Local session
        SessionStartHook-->>ClaudeCloudSession: Exit immediately
    end

    Developer->>ConfigConsumer: Run data pipeline scripts
    ConfigConsumer->>EnvFile: Load .env via python dotenv (if available)
    ConfigConsumer->>ClaudeCloudSession: Fallback to process env vars when dotenv is absent

Flow diagram for SessionStart hook .env synchronization logic

flowchart TD
    A["Start session-start.sh"] --> B{CLAUDE_CODE_REMOTE == true}
    B -->|No| Z["Exit without changes"]
    B -->|Yes| C["Set ENV_DIR and ENV_FILE paths"]
    C --> D["Create Scripts/data_pipeline directory"]
    D --> E{ENV_FILE exists?}

    E -->|Yes| F["Load existing key=value pairs into map existing"]
    E -->|No| G["Initialize empty existing map"]

    F --> H["Define KEYS array
    FRED_API_KEY, BLS_API_KEY, BEA_API_KEY,
    KIMI_API_KEY, TG_API_KEY,
    LIGHTHOUSE_DB_PATH, SANTIMENT_API_KEY,
    DUNE_API_KEY, COINGLASS_API_KEY"]
    G --> H

    H --> I["Create temporary file"]
    I --> J["Write header comments to temp file"]
    J --> K["For each name in KEYS"]

    K --> L["Resolve value from session env or existing map"]
    L --> M["Append name=value line to temp file"]
    M --> N{More KEYS?}
    N -->|Yes| K
    N -->|No| O["Move temp file to ENV_FILE
    and chmod 600"]

    O --> P{CLAUDE_ENV_FILE set?}
    P -->|No| S["Skip export file population"]
    P -->|Yes| Q["For each name in KEYS"]

    Q --> R["Resolve value, skip if empty,
    escape and append export line to CLAUDE_ENV_FILE"]
    R --> T{More KEYS?}
    T -->|Yes| Q
    T -->|No| S

    S --> U["Check required keys
    FRED_API_KEY, BLS_API_KEY, BEA_API_KEY"]
    U --> V{Any required key empty?}
    V -->|Yes| W["Print warnings to stderr"]
    V -->|No| X["No warnings"]

    W --> Y["Print success message with ENV_FILE path"]
    X --> Y
    Y --> AA["End script"]

File-Level Changes

Change: Add a SessionStart shell hook that keeps Scripts/data_pipeline/.env in sync with cloud session environment variables and exports them into a sourceable env file.
Details:
  • Introduce a bash script that exits early for non-remote sessions using CLAUDE_CODE_REMOTE guard.
  • Define a fixed list of API/database-related keys and load any existing key-value pairs from the current .env file into an associative array.
  • Implement a resolve helper that prefers current process environment variables but falls back to previously stored .env values.
  • Rewrite Scripts/data_pipeline/.env on each session start with resolved values for all known keys, preserving existing values when the corresponding env var is unset.
  • Set restrictive permissions on the regenerated .env file and append properly escaped export lines for non-empty keys to the path referenced by CLAUDE_ENV_FILE, when provided.
  • Emit warnings to stderr when required keys (FRED_API_KEY, BLS_API_KEY, BEA_API_KEY) remain empty and log a completion message showing the written .env path.
Files: .claude/hooks/session-start.sh

Change: Add Claude configuration file and adjust gitignore to track Claude hook/settings while keeping secrets untracked.
Details:
  • Create a .claude/settings.json file to enable Claude Code to recognize and run the session-start hook in cloud sessions.
  • Update .gitignore so that .claude/hooks/ and .claude/settings.json are tracked while the rest of the .claude/ directory, including any secret-bearing files, remains ignored.
Files: .claude/settings.json, .gitignore
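
The read-then-rewrite cycle described for session-start.sh (load existing pairs, write a temp file, replace .env, restrict permissions) can be illustrated in Python. This is a sketch, not a translation of the script: `read_env` and `write_env_atomic` are hypothetical names, the parser handles only plain KEY=VALUE lines, and the sketch sets mode 600 before the rename, whereas the flow diagram lists chmod after the move.

```python
import os
import tempfile


def read_env(path):
    """Parse simple KEY=VALUE lines; comments and blank lines are skipped."""
    existing = {}
    if not os.path.exists(path):
        return existing
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            existing[name.strip()] = value
    return existing


def write_env_atomic(path, values):
    """Write to a temp file in the target directory, then rename it over the target."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".env.")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            for name, value in values.items():
                fh.write(f"{name}={value}\n")
        # Owner read/write only, applied before the file becomes visible as .env.
        os.chmod(tmp_path, 0o600)
        # os.replace is an atomic rename when source and target share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem, so a reader never sees a half-written .env.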



2 participants