
Route checkpoints to provider host when origin protocol is non-derivable #1279

Merged
pjbgf merged 3 commits into main from mirror
May 27, 2026

@pjbgf
Member

@pjbgf pjbgf commented May 27, 2026

https://entire.io/gh/entireio/cli/trails/435

When a checkpoint_remote is configured but the origin/push remote uses a
protocol that can't be mapped to a git transport (e.g. entire://, file://),
PushURL/FetchURL fell back to pushing checkpoints to the origin remote. For
an entire:// origin this meant the entire/checkpoints/v1 branch was pushed
to the entire:// remote helper, which doesn't host it — the push failed
non-fast-forward, the recovery sync hit its 2-minute timeout, and the
git-remote-entire helper wedged on shutdown.

The token path already handled this correctly via providerHost; the no-token
path did not. Add providerCheckpointURL as a shared fallback so a configured
checkpoint_remote with a known provider (github/gitlab) routes to that
provider's canonical host over HTTPS regardless of the origin scheme.

Plain local-path origins (no parseable URL) still fall back to origin: they
fail at ParseURL, an earlier branch this change does not touch. Fork
detection (owner mismatch) still runs before the fallback.

When available, remote.<name>.entiredb-original-url is used to determine the correct authentication method.


Note

Medium Risk
Changes where checkpoint git pushes/fetches target and how auth scheme is chosen; mistakes could misroute or break checkpoint sync for mirrored entire:// setups.

Overview
When checkpoint_remote is set but origin/push uses a non-git transport (entire://, file://, etc.), FetchURL and PushURL no longer fall back to that remote for checkpoints. They call resolveProviderCheckpointURL, which builds the configured provider repo URL using this transport precedence: the saved remote.<name>.entiredb-original-url, then ENTIRE_CHECKPOINT_TOKEN → HTTPS on the provider host, then an existing remote on that host, else SSH. Unknown providers still fall back to origin.

ENTIRE_CHECKPOINT_TOKEN coercion to HTTPS now runs only for ssh/https remotes so entire:// hosts are not misused; non-derivable push remotes use the same provider fallback (push stays enabled when it applies).

gitremote.ParseURL understands entire:// by stripping the leading forge segment from the path so owner/repo parse correctly. Tests cover entire:// and file:// routing, token behavior, and pre-mirror URL vs other remotes.
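
The forge-segment stripping described above can be sketched roughly as follows. This is an illustration only, not the code in gitremote.ParseURL: the names parseEntireURL and repoInfo are hypothetical, and the sketch assumes the path always has the shape /<forge>/<owner>/<repo>.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// repoInfo is a hypothetical stand-in for the parsed-remote type.
type repoInfo struct {
	Host, Owner, Repo string
}

// parseEntireURL parses entire://<host>/<forge>/<owner>/<repo> and drops the
// leading forge segment so Owner and Repo name the real repository.
// Assumption: exactly three path segments; other layouts are rejected.
func parseEntireURL(raw string) (repoInfo, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return repoInfo{}, err
	}
	if u.Scheme != "entire" {
		return repoInfo{}, fmt.Errorf("not an entire:// URL: %q", raw)
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) != 3 {
		return repoInfo{}, fmt.Errorf("expected /<forge>/<owner>/<repo>, got %q", u.Path)
	}
	// parts[0] is the forge segment (e.g. "gh"); it must not be read as the owner.
	return repoInfo{
		Host:  u.Hostname(),
		Owner: parts[1],
		Repo:  strings.TrimSuffix(parts[2], ".git"),
	}, nil
}

func main() {
	info, err := parseEntireURL("entire://aws-eu-central-1.entire.io/gh/entireio/cli")
	if err != nil {
		panic(err)
	}
	fmt.Println(info.Owner, info.Repo) // prints: entireio cli
}
```

With the owner parsed as "entireio" rather than "gh", an owner comparison against the checkpoint repository sees the true owner.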

Reviewed by Cursor Bugbot for commit 904f279.

pjbgf added 3 commits May 27, 2026 17:18
When a checkpoint_remote is configured but the origin/push remote uses a
protocol that can't be mapped to a git transport (e.g. entire://, file://),
PushURL/FetchURL fell back to pushing checkpoints to the origin remote. For
an entire:// origin this meant the entire/checkpoints/v1 branch was pushed
to the entire:// remote helper, which doesn't host it — the push failed
non-fast-forward, the recovery sync hit its 2-minute timeout, and the
git-remote-entire helper wedged on shutdown.

The token path already handled this correctly via providerHost; the no-token
path did not. Add providerCheckpointURL as a shared fallback so a configured
checkpoint_remote with a known provider (github/gitlab) routes to that
provider's canonical host over HTTPS regardless of the origin scheme.

Plain local-path origins (no parseable URL) still fall back to origin: they
fail at ParseURL, an earlier branch this change does not touch. Fork
detection (owner mismatch) still runs before the fallback.

Assisted-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Paulo Gomes <paulo@entire.io>
Entire-Checkpoint: c8416fcad5be
…origins

With ENTIRE_CHECKPOINT_TOKEN set, PushURL coerced the push remote to HTTPS
while keeping its host. For an entire:// origin that produced a bogus target
(https://app.entire.io/<owner>/checkpoints.git) — the entire cluster host
isn't a usable HTTPS host. FetchURL already avoided this by targeting the
provider's canonical host.

Only coerce derivable (ssh/https) protocols to HTTPS; leave non-derivable
schemes (entire://, file://) untouched so they fall through to the
providerCheckpointURL fallback (github.com/gitlab.com over HTTPS). Coercion
for real git transports is unchanged, so enterprise installations keep
pushing to their own host.
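
A minimal sketch of the guard described above, under stated assumptions: remoteInfo, isDerivable and coerceToHTTPS are hypothetical names, protocols are modelled as plain strings, and git.example.com stands in for an enterprise host. It only illustrates that ssh/https remotes are coerced on their own host while other schemes are left for the provider fallback.

```go
package main

import "fmt"

// remoteInfo is a hypothetical parsed push remote.
type remoteInfo struct {
	Protocol, Host, Owner, Repo string
}

// isDerivable reports whether the protocol is a real git transport that can
// be rewritten to HTTPS on the same host.
func isDerivable(protocol string) bool {
	return protocol == "ssh" || protocol == "https"
}

// coerceToHTTPS returns an HTTPS URL on the remote's own host for derivable
// protocols. For anything else it returns ok=false so the caller can fall
// through to the provider-host fallback.
func coerceToHTTPS(r remoteInfo) (string, bool) {
	if !isDerivable(r.Protocol) {
		return "", false
	}
	return fmt.Sprintf("https://%s/%s/%s.git", r.Host, r.Owner, r.Repo), true
}

func main() {
	// Enterprise SSH remote: coerced, host preserved.
	fmt.Println(coerceToHTTPS(remoteInfo{"ssh", "git.example.com", "acme", "checkpoints"}))
	// entire:// remote: left untouched, no bogus HTTPS target is produced.
	fmt.Println(coerceToHTTPS(remoteInfo{"entire", "app.entire.io", "acme", "checkpoints"}))
}
```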

Assisted-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Paulo Gomes <paulo@entire.io>
Entire-Checkpoint: 044a43137ce2
Real-world failure: after `entire-repo mirror use` switched origin to
entire://aws-eu-central-1.entire.io/gh/entireio/cli, `git push` hung on
"Pushing entire/checkpoints/v1 to origin". ParseURL read the /gh/ forge prefix
as the owner ("gh"), so PushURL's fork-detection saw owner "gh" != checkpoint
owner "entireio" and bailed checkpoints to the entire:// origin.

Two changes:

1. ParseURL strips the forge/namespace prefix on entire:// URLs
   (/gh/<owner>/<repo>, /et/<project>/<repo>, legacy /git/...), so Owner/Repo
   reflect the real repository and fork-detection compares the true owner.

2. The provider fallback (used when the remote protocol isn't a git transport)
   now picks the checkpoint transport from what's already configured for the
   endpoint, resolved via go-git (no git shell-out):
     1. remote.<name>.entiredb-original-url — the URL the remote had before
        `entire-repo mirror use` switched it to entire://. Reused verbatim
        (host + scheme + port); the most faithful record of the user's auth.
     2. ENTIRE_CHECKPOINT_TOKEN set -> HTTPS on the provider host.
     3. an existing remote on the provider host -> reuse its scheme.
     4. otherwise SSH.

   Replaces the prior HTTPS-only providerCheckpointURL and the exec-based
   ListRemoteNames / credential-helper probe.

For the reported repo this routes checkpoints to git@github.com:entireio/
cli-checkpoints.git over SSH (matching the saved pre-mirror URL) instead of
hanging against the entire:// remote.
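
The four-step transport precedence above can be sketched as follows. This is a simplified illustration, not the resolver in the change: all names (fallbackInput, transportOf, buildURL, resolveCheckpointURL) are hypothetical, inputs are passed in as plain values instead of being read through go-git or the environment, and host matching ignores ports.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// fallbackInput carries everything the sketch needs; all fields are assumptions.
type fallbackInput struct {
	SavedOriginalURL string   // value of remote.<name>.entiredb-original-url, if any
	Token            string   // value of ENTIRE_CHECKPOINT_TOKEN, if any
	RemoteURLs       []string // URLs of the other configured remotes
	ProviderHost     string   // e.g. github.com
	Owner, Repo      string   // checkpoint repository
}

// transportOf extracts scheme ("ssh" or "https") and host[:port] from a remote URL.
func transportOf(raw string) (scheme, host string, ok bool) {
	if !strings.Contains(raw, "://") {
		// SCP-style SSH: git@host:owner/repo.git
		at, colon := strings.Index(raw, "@"), strings.Index(raw, ":")
		if at >= 0 && colon > at {
			return "ssh", raw[at+1 : colon], true
		}
		return "", "", false
	}
	u, err := url.Parse(raw)
	if err != nil || (u.Scheme != "ssh" && u.Scheme != "https") {
		return "", "", false
	}
	return u.Scheme, u.Host, true
}

// buildURL renders the checkpoint URL for the chosen transport.
func buildURL(scheme, host, owner, repo string) string {
	if scheme == "https" {
		return fmt.Sprintf("https://%s/%s/%s.git", host, owner, repo)
	}
	if strings.Contains(host, ":") { // SSH with an explicit port
		return fmt.Sprintf("ssh://git@%s/%s/%s.git", host, owner, repo)
	}
	return fmt.Sprintf("git@%s:%s/%s.git", host, owner, repo)
}

func resolveCheckpointURL(in fallbackInput) string {
	// 1. Saved pre-mirror URL: reuse its scheme, host and port.
	if scheme, host, ok := transportOf(in.SavedOriginalURL); ok {
		return buildURL(scheme, host, in.Owner, in.Repo)
	}
	// 2. Token set: HTTPS on the provider host.
	if in.Token != "" {
		return buildURL("https", in.ProviderHost, in.Owner, in.Repo)
	}
	// 3. An existing remote on the provider host: reuse its scheme.
	for _, r := range in.RemoteURLs {
		if scheme, host, ok := transportOf(r); ok && host == in.ProviderHost {
			return buildURL(scheme, host, in.Owner, in.Repo)
		}
	}
	// 4. Otherwise SSH.
	return buildURL("ssh", in.ProviderHost, in.Owner, in.Repo)
}

func main() {
	fmt.Println(resolveCheckpointURL(fallbackInput{
		SavedOriginalURL: "git@github.com:entireio/cli.git", // assumed pre-mirror URL
		ProviderHost:     "github.com",
		Owner:            "entireio",
		Repo:             "cli-checkpoints",
	})) // prints: git@github.com:entireio/cli-checkpoints.git
}
```

Because the saved pre-mirror URL is checked first, it wins even when a token is also set.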

Assisted-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Paulo Gomes <paulo@entire.io>
Entire-Checkpoint: a5408ab33453
Copilot AI review requested due to automatic review settings May 27, 2026 16:20
@pjbgf pjbgf requested a review from a team as a code owner May 27, 2026 16:20
Contributor

Copilot AI left a comment


Pull request overview

Fixes checkpoint remote resolution when the origin/push remote uses a non-derivable scheme (e.g. entire://, file://) by routing checkpoint push/fetch to the configured provider’s canonical host instead of falling back to the origin helper URL. This aligns the non-token path with the existing token-based provider-host behavior and reduces sync failures for mirrored entire:// origins.

Changes:

  • Teach gitremote.ParseURL to parse entire:// remotes by stripping the forge prefix segment (e.g. /gh/...).
  • Add a provider-host fallback (resolveProviderCheckpointURL) for PushURL/FetchURL when checkpoint URL derivation fails due to non-derivable origin/push protocols.
  • Add tests covering entire:// and file:// origins and scheme selection precedence (saved pre-mirror URL, token, existing remote, default SSH).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
cmd/entire/cli/gitremote/gitremote.go Adds entire:// parsing support via forge-prefix stripping.
cmd/entire/cli/gitremote/gitremote_test.go Adds coverage for entire:// parsing behavior.
cmd/entire/cli/checkpoint/remote/util.go Adds provider-host fallback resolution and transport selection logic for non-derivable remotes.
cmd/entire/cli/checkpoint/remote/util_test.go Adds regression tests for entire:// and file:// origins and transport precedence.
Comments suppressed due to low confidence (1)

cmd/entire/cli/checkpoint/remote/util.go:460

  • deriveTokenOriginURL() will now successfully parse entire:// remotes (because ParseURL strips the forge prefix), and then synthesize an https://<host>/<owner>/<repo>.git URL. When ENTIRE_CHECKPOINT_TOKEN is set but checkpoint_remote is missing/unknown (or settings load fails), this can cause PushURL/FetchURL fallback behavior to misroute from the original entire:// remote helper to a likely-nonexistent HTTPS endpoint on the Entire host.
func deriveTokenOriginURL(originURL string) (string, bool) {
	info, err := gitremote.ParseURL(originURL)
	if err != nil {
		return "", false
	}
	if info.Host == "" || info.Owner == "" || info.Repo == "" {
		return "", false
	}
	// Keep the port only when the source was already HTTPS. SSH ports
	// (e.g., :2222) don't map to HTTPS ports on the same host.
	hostPort := info.Host
	if info.Protocol == ProtocolHTTPS {
		hostPort = info.HostPort()
	}
	return fmt.Sprintf("https://%s/%s/%s.git", hostPort, info.Owner, info.Repo), true
}

@pjbgf pjbgf enabled auto-merge May 27, 2026 16:41
@pjbgf pjbgf merged commit 7867e54 into main May 27, 2026
11 checks passed
@pjbgf pjbgf deleted the mirror branch May 27, 2026 18:16