Support explicit-port GitHub URLs #139

Open
turian wants to merge 1 commit into nephila:master from turian:github-port-support

turian commented Mar 18, 2026

Summary

  • parse() returns incomplete metadata for GitHub URLs with explicit ports (e.g. https://github.com:8443/owner/repo.git): the port causes a domain mismatch, so the URL falls through to BasePlatform and loses owner/name extraction
  • GitHub HTTPS, SSH, and git regex patterns now capture the port separately from the domain
  • Added port_colon format helper in result.py for correct URL reconstruction (:8443 when port is set, empty otherwise)
  • Ports are preserved across protocol rewrites (url2ssh, url2https, url2git)
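
The idea of capturing the port separately from the domain, and of re-emitting it through a port_colon helper, can be sketched as follows. This is a minimal illustration under stated assumptions: HTTPS_PATTERN, parse_https and to_https are hypothetical names, and the regex is not the one used in giturlparse. Only the port_colon behaviour (":8443" when a port is set, empty otherwise) comes from the description above.

```python
import re

# Hypothetical pattern for illustration; not the regex used in giturlparse.
HTTPS_PATTERN = re.compile(
    r"^https://"
    r"(?:(?P<access_token>[^@/]+)@)?"  # optional user:token@ prefix
    r"(?P<domain>github\.com)"         # domain is matched without the port
    r"(?::(?P<port>\d+))?"             # optional numeric port, captured separately
    r"/(?P<owner>[^/]+)"
    r"/(?P<name>[^/]+?)(?:\.git)?$"
)


def port_colon(port):
    """Return ':<port>' when a port is set, otherwise an empty string."""
    return f":{port}" if port else ""


def parse_https(url):
    """Return a dict of URL parts, or None if the URL does not match."""
    match = HTTPS_PATTERN.match(url)
    if match is None:
        return None
    parts = match.groupdict()
    parts["port"] = parts["port"] or ""  # normalise a missing port to ""
    return parts


def to_https(parts):
    """Rebuild an HTTPS URL, keeping the port only when one was parsed."""
    return "https://{domain}{pc}/{owner}/{name}.git".format(
        pc=port_colon(parts["port"]), **parts
    )
```

Because the port group only accepts digits, a URL such as https://github.com:abc/owner/repo.git does not match this sketch at all, which mirrors the "malformed ports don't match GitHub" case in the test plan.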

Minimal reproduction

from giturlparse import parse

portless = parse("https://github.com/owner/repo.git")
with_port = parse("https://github.com:8443/owner/repo.git")

print("portless:", portless.valid, portless.owner, portless.name)
# portless: True owner repo

print("with_port:", with_port.valid, with_port.owner, with_port.name)
# Before fix: True  owner/repo.git   (falls through to BasePlatform)
# After fix:  True owner repo        (correctly parsed as GitHub)
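
To see why an explicit port can cause a domain mismatch, it may help to look at how the standard library splits the same two URLs. This is only an illustration using urllib.parse; giturlparse matches platforms with its own regex patterns, and host_and_port is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def host_and_port(url):
    """Return (hostname, port) with the port split off the authority part."""
    parts = urlsplit(url)
    return parts.hostname, parts.port


for url in (
    "https://github.com/owner/repo.git",
    "https://github.com:8443/owner/repo.git",
):
    parts = urlsplit(url)
    # netloc keeps the port attached; hostname and port are separated
    print(parts.netloc, parts.hostname, parts.port)
# github.com github.com None
# github.com:8443 github.com 8443
```

Any comparison against the bare string "github.com" succeeds for the first URL's authority but fails for the second unless host and port are separated first.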

Test plan

  • All existing tests pass (no regressions)
  • HTTPS with port: https://github.com:8443/Org/Repo.git → github, port=8443
  • GIT with port: git://github.com:9418/Org/Repo.git → github, port=9418
  • SSH URL-style with port: ssh://git@github.com:2222/Org/Repo.git → github, port=2222
  • SCP-style without port unchanged: git@github.com:Org/Repo.git → github, port=""
  • Blob/tree paths with port preserve path extraction
  • Access token + port: https://user:token@github.com:8443/Org/Repo.git
  • gist.github.com with port stays platform=github
  • Malformed ports (github.com:abc) don't match GitHub
  • Protocol rewrites preserve port across https↔ssh↔git
  • Component rewrites (name, owner) preserve port
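
The SSH cases in the plan differ in what the colon means: in ssh://git@github.com:2222/Org/Repo.git it introduces a port, while in SCP-style git@github.com:Org/Repo.git it introduces the path. A rough sketch of telling the two apart is shown below, assuming hypothetical patterns (SSH_URL_STYLE, SCP_STYLE) and a hypothetical split_ssh helper that are not taken from giturlparse.

```python
import re

# Hypothetical patterns for illustration only; not taken from giturlparse.
SSH_URL_STYLE = re.compile(
    r"^ssh://(?:(?P<user>[^@/]+)@)?(?P<domain>[^:/]+)"
    r"(?::(?P<port>\d+))?/(?P<path>.+)$"
)
SCP_STYLE = re.compile(
    r"^(?:(?P<user>[^@/]+)@)?(?P<domain>[^:/]+):(?P<path>[^/].*)$"
)


def split_ssh(url):
    """Return (domain, port, path), with port "" when absent, or None."""
    match = SSH_URL_STYLE.match(url)
    if match is not None:
        return match["domain"], match["port"] or "", match["path"]
    match = SCP_STYLE.match(url)
    if match is not None:
        # SCP-style has no port slot: the colon separates host and path
        return match["domain"], "", match["path"]
    return None
```

Trying the URL-style pattern first keeps the ssh:// scheme from being misread as an SCP-style host, and the digits-only port group rejects values such as github.com:abc.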

🤖 Generated with Claude Code

GitHub URL patterns now correctly parse URLs with explicit ports like
https://github.com:8443/owner/repo.git, which previously fell through
to BasePlatform and lost owner/name extraction.

Changes:
- GitHub HTTPS/SSH/git regex patterns now capture port separately from domain
- Added port_colon format helper in result.py for URL reconstruction
- GitHub FORMATS updated to preserve ports across protocol rewrites
- Tests: 11 new parse cases, 9 rewrite cases, component rewrite with port

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
protoroto (Member) commented

@turian Hi, thanks for opening this PR! To make the CI pass, may I ask you to:

  • open an issue (if it's not already opened)
  • add a changes/<number-of-issue-created>.feature file with a brief explanation of what's in this PR, so that the message appears in the changelog once this is merged and released
    Thanks again!
