Skip to content

fix: marketplace build respects GITHUB_HOST for GHE repos#1009

Draft
sergio-sisternes-epam wants to merge 1 commit intomainfrom
fix/1008-marketplace-build-ghe
Draft

fix: marketplace build respects GITHUB_HOST for GHE repos#1009
sergio-sisternes-epam wants to merge 1 commit intomainfrom
fix/1008-marketplace-build-ghe

Conversation

@sergio-sisternes-epam
Copy link
Copy Markdown
Collaborator

Description

apm marketplace build hardcoded github.com in four places, so GITHUB_HOST had no effect on ref resolution, token lookup, or metadata fetch. This PR threads the existing default_host() / build_https_clone_url() / AuthResolver pattern (already used by apm install) through the marketplace build pipeline.

Fixes #1008

Changes

ref_resolver.py -- RefResolver accepts an optional host parameter (defaults to GITHUB_HOST or github.com). Both list_remote_refs() and resolve_ref_sha() use build_https_clone_url() instead of hardcoded github.com.

builder.py -- MarketplaceBuilder stores a normalised host and HostInfo:

  • _resolve_github_token() resolves against the configured host, not "github.com"
  • _fetch_remote_metadata() uses the GitHub REST API for GHES/GHE Cloud (since raw.githubusercontent.com is github.com-only), skips metadata for non-GitHub hosts, and short-circuits tokenless GHE Cloud requests
  • AuthResolver import moved to top of try block to fix a scoping issue when auth_resolver is pre-injected

Tests -- 8 new tests covering GHE host resolution, metadata fetch paths (GHES REST API, non-GitHub skip, GHE Cloud no-token skip), and builder host env. Existing URL assertions updated to use urlparse.

Docs -- CHANGELOG entry, marketplace-authoring guide GHES section, apm-usage skill resource update.

Type of change

  • Bug fix
  • New feature
  • Documentation
  • Maintenance / refactor

Testing

  • Tested locally
  • All existing tests pass (6,549 passed)
  • Added tests for new functionality (8 new tests)

Thread the existing default_host() / build_https_clone_url() / AuthResolver
pattern (used by apm install) through the marketplace build pipeline.

Changes:
- RefResolver: accept optional host parameter, use build_https_clone_url()
  instead of hardcoded github.com for git ls-remote URLs
- MarketplaceBuilder: resolve tokens against configured host, use REST API
  for metadata fetch on GHES/GHE Cloud (raw.githubusercontent.com is
  github.com-only), skip metadata for non-GitHub hosts
- Fix AuthResolver import scoping so classify_host() works when
  auth_resolver is pre-injected
- Add GHE Cloud early-exit when no token (avoids pointless 401)

Tests:
- Update URL assertions to use urlparse (test convention)
- Add 4 RefResolver GHE host tests
- Add 3 metadata fetch path tests (GHES REST API, non-GitHub skip,
  GHE Cloud no-token skip)
- Add builder host env test

Docs:
- CHANGELOG: Fixed entry under [Unreleased]
- marketplace-authoring guide: GHES section
- apm-usage authentication skill: marketplace build example

Closes #1008

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@sergio-sisternes-epam sergio-sisternes-epam added the panel-review Trigger the apm-review-panel gh-aw workflow label Apr 27, 2026
@github-actions
Copy link
Copy Markdown

APM Review Panel Verdict

Disposition: APPROVE (two minor pre-merge suggestions; neither is a blocker)


Per-persona findings

Python Architect:

This is a routine bug-fix PR: two existing classes (MarketplaceBuilder, RefResolver) receive a host parameter; no new abstractions, no hierarchy changes. One class diagram + one flow diagram applies.

OO / class diagram

classDiagram
    direction LR
    class MarketplaceBuilder {
        <<Builder>>
        +_host: str
        +_host_info: Optional[object]
        +_github_token: Optional[str]
        +_get_resolver() RefResolver
        +_resolve_github_token() Optional[str]
        +_fetch_remote_metadata(pkg) Optional[Dict]
        +build() BuildResult
    }
    class RefResolver {
        <<Service>>
        +_host: str
        +list_remote_refs(owner_repo) List[RemoteRef]
        +resolve_ref_sha(owner_repo, ref) str
    }
    class AuthResolver {
        <<Strategy>>
        +classify_host(host) HostInfo
        +resolve(host) AuthContext
    }
    class HostInfo {
        <<ValueObject>>
        +kind: str
        +api_base: str
    }
    class AuthContext {
        <<ValueObject>>
        +token: str
        +source: str
    }
    class default_host {
        <<Pure>>
        +default_host() str
        +build_https_clone_url(host, repo) str
    }
    class ResolvedPackage {
        <<ValueObject>>
        +source_repo: str
        +sha: str
        +subdir: Optional[str]
    }
    MarketplaceBuilder *-- RefResolver : creates lazily
    MarketplaceBuilder ..> AuthResolver : classify_host and resolve
    MarketplaceBuilder ..> HostInfo : stores as _host_info
    MarketplaceBuilder ..> ResolvedPackage : reads in _fetch_remote_metadata
    MarketplaceBuilder ..> default_host : reads host at init
    RefResolver ..> default_host : reads host at init
    AuthResolver ..> HostInfo : returns
    AuthResolver ..> AuthContext : returns
    class MarketplaceBuilder:::touched
    class RefResolver:::touched
    classDef touched fill:#fff3b0,stroke:#d47600
Loading

Execution flow diagram

flowchart TD
    A["apm marketplace build\ncli.py"] --> B["MarketplaceBuilder.__init__()\nbuilder.py\n_host = default_host() or 'github.com'"]
    B --> C["_prefetch_metadata(resolved)\nbuilder.py:589"]
    C --> D["_resolve_github_token()\nbuilder.py:547\nsets _host_info AND _github_token"]
    D --> E["[NET] AuthResolver.classify_host(self._host)\nsrc/apm_cli/core/auth.py:134\nreturns HostInfo(kind, api_base)"]
    D --> F["[NET] resolver.resolve(self._host)\nreturns AuthContext.token"]
    F --> G["pool.submit(_fetch_remote_metadata, pkg)\nbuilder.py:553\nfor each resolved package"]
    G --> H{"host_kind?"}
    H -->|"not github/ghe_cloud/ghes"| I["logger.debug skip\nreturn None"]
    H -->|"ghe_cloud and no token"| J["logger.debug skip\nreturn None"]
    H -->|"self._host == 'github.com'"| K["[NET] raw.githubusercontent.com\n/{source_repo}/{sha}/{path}/apm.yml\nurllib.request.urlopen"]
    H -->|"ghes or ghe_cloud+token"| L["[NET] {api_base}/repos/{source_repo}\n/contents/{file}?ref={sha}\nAccept: application/vnd.github.raw\nurllib.request.urlopen"]
    K --> M["yaml.safe_load(raw) -> dict"]
    L --> M
    M --> N["return metadata dict"]
Loading

Design patterns

  • Used in this PR: Lazy initialization -- _host_info is populated as a side effect of _resolve_github_token() (called once before the thread pool in _prefetch_metadata()). _get_resolver() already used this pattern; _host_info extends it consistently.
  • Pragmatic suggestion: Move AuthResolver.classify_host(self._host) into __init__() (directly after self._host is set) rather than as a side effect of _resolve_github_token(). classify_host is a pure, cheap operation and its placement inside a method named "resolve token" is surprising. No new abstraction needed -- one line moved up. This eliminates the Optional[object] None-guard in _fetch_remote_metadata and makes the class invariant explicit: _host_info is always populated after __init__.

CLI Logging Expert: All new log calls use logger.debug() at library layer -- correct. No _rich_* or CommandLogger calls introduced in builder or resolver. The two new debug messages ("Skipping metadata fetch for %s (non-GitHub host: %s)" and "Skipping metadata fetch for %s (GHE Cloud requires auth)") follow the "named thing, reason" style and pass the verbose-mode "So What?" test. No user-visible output changes. No issues.


DevX UX Expert: This is a silent behavior fix -- no new flags, no command surface changes. The key UX property preserved: GITHUB_HOST set once, all apm commands obey it. Previously apm install respected it but apm marketplace build did not; now the mental model is consistent. The marketplace-authoring.md addition is clean: 2-line runnable example, cross-link to the authentication docs. The apm-usage/authentication.md skill update is correctly scoped (one line showing apm marketplace build uses the same env convention). No new flags to document in cli-commands.md (no surface change). No blocking issues.


Supply Chain Security Expert: Reviewed against the threat model:

  • Identity: Metadata URLs use pkg.source_repo and pkg.sha from the already-resolved ResolvedPackage (sourced from the lockfile). The SHA-pinning integrity model is untouched.
  • Integrity: apm.yml metadata is informational enrichment only; it does not affect install integrity decisions.
  • Token scope: resolver.resolve(self._host) routes through AuthResolver for the configured host -- correct. Token appears only in the Authorization header, not in any URL or log line.
  • Fail closed: Non-GitHub hosts return None (skip); GHE Cloud without a token returns None (skip). Both fail closed without error -- appropriate since metadata is optional enrichment.
  • api_base fallback construction: f"https://{self._host}/api/v3" when api_base is not set. self._host comes from os.environ.get("GITHUB_HOST", "github.com"). This is not a path traversal vector (network, not filesystem). No path security guard is needed here.
  • One observation: the _host_info is None fallback in _fetch_remote_metadata defaults host_kind to "github", meaning a non-GitHub host could reach the URL-construction branch if called out of sequence. In the production path this cannot happen (the call order is guaranteed via _prefetch_metadata). Moving classify_host to __init__ (per the Python Architect suggestion) would make this invariant structural rather than relying on call order.

No new supply-chain surface opened.


Auth Expert: Activated -- the PR changes resolver.resolve("github.com") to resolver.resolve(self._host) and introduces AuthResolver.classify_host(self._host).

  • Token resolution: Correct fix. Previously token resolution always happened against github.com even with GITHUB_HOST=corp.ghe.com. Now it routes through the full precedence chain (GITHUB_APM_PAT_{ORG} -> GITHUB_APM_PAT -> GITHUB_TOKEN -> GH_TOKEN -> git credential fill) for the configured host, consistent with resolve_for_dep().
  • AuthResolver import scoping: The lazy from ..core.auth import AuthResolver was moved to the top of the try block. This is the correct fix for the scoping issue when auth_resolver is pre-injected -- classify_host() can now be called regardless of whether a custom resolver was provided.
  • Thread safety: _host_info is set with if self._host_info is None: guard. Since _resolve_github_token() is called from the main thread before executor.submit(), there is no TOCTOU risk in the current implementation.
  • One concern (minor): self._host_info: Optional[object] should be Optional["HostInfo"] (with a TYPE_CHECKING import). The weak object type means type checkers cannot catch incorrect attribute access on this field. Suggested fix: add from typing import TYPE_CHECKING block with if TYPE_CHECKING: from ..core.auth import HostInfo and annotate _host_info: Optional["HostInfo"].
  • AuthResolver precedence invariant: Unchanged. The precedence diagram in docs/getting-started/authentication.md does not need updating -- the PR adds a call site, not a new strategy.

OSS Growth Hacker: This fix completes APM's GHES story: apm install already respected GITHUB_HOST; now apm marketplace build does too. For enterprise teams that use APM on GHES, this removes a silent failure mode that was invisible until build time. The CHANGELOG entry is clean and story-shaped. The doc addition gives enterprise users a self-contained 2-line recipe. Side-channel to CEO: once merged, this is worth a release-note beat that frames APM's complete GHE support posture -- "APM now fully supports GitHub Enterprise across install and marketplace build workflows" -- with a concrete GITHUB_HOST example. This is directly on the enterprise conversion surface and should be included in the next v0.10.x or v0.11.0 release announcement.


CEO arbitration

Specialists agree: this is a correct, well-tested, well-documented bug fix from an external contributor. The two minor suggestions (move classify_host to __init__, fix Optional[object] type annotation) are non-blocking quality improvements. Neither changes behavior in the production path -- _prefetch_metadata() guarantees _resolve_github_token() runs before _fetch_remote_metadata() -- but both make the code easier to reason about and type-safe. The PR is in draft; the author should address these before marking ready. No specialist disagreements to arbitrate. The Growth Hacker's release-beat note is filed for the maintainer's release planning (not a merge gate). Disposition: APPROVE pending the two suggestions below.


Required actions before merge

  1. Move classify_host to __init__ (src/apm_cli/marketplace/builder.py): After self._host: str = default_host() or "github.com" (line ~154), add self._host_info = AuthResolver.classify_host(self._host) (requires the lazy import to become an eager import, or use TYPE_CHECKING for the type hint and keep the lazy import). Remove the if self._host_info is None: guard in _resolve_github_token() and the if self._host_info else "github" fallback in _fetch_remote_metadata(). This makes the class invariant explicit and eliminates silent fallback behavior.

  2. Fix _host_info type annotation (src/apm_cli/marketplace/builder.py, line ~155): Change self._host_info: Optional[object] = None to self._host_info: Optional["HostInfo"] = None with a TYPE_CHECKING guard: if TYPE_CHECKING: from ..core.auth import HostInfo. This is a one-line change and ensures type checkers can validate _host_info.kind and _host_info.api_base access.


Optional follow-ups

  • Once merged, include a release-note beat for the next release that tells the complete GHE support story across apm install and apm marketplace build -- the Growth Hacker flags this as a concrete enterprise conversion surface.
  • Future: if a third host-routing branch is added to _fetch_remote_metadata (e.g., Bitbucket Server), consider a small HostMetadataStrategy abstraction; at 3 branches inline is still appropriate.
  • The _resolve_github_token() method now does two things (token resolution + host classification). If it grows further, consider splitting into _init_host_info() and _resolve_github_token(). Not needed now.

Generated by PR Review Panel for issue #1009 · ● 641.3K ·

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

panel-review Trigger the apm-review-panel gh-aw workflow

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] apm marketplace build fails for GitHub Enterprise Server repositories

1 participant