Skip to content

Conversation

@MrFlynn
Copy link
Contributor

@MrFlynn MrFlynn commented Mar 29, 2025

Previously this action was limited to HTTP-style URLs (e.g. https://example.com/repo.git), but Git supports a variety of URL schemas across different protocols, like SSH, a popular alternative method supported by Github. This PR seeks to allow this action to work on URLs in .gitmodules with many different URL formats.

This PR is a follow on from #74. It replaces the existing validation that strictly validates HTTP-style URLs with the same logic that Git uses to validate URLs.

Additionally, because URLs no longer follow a strict format, the logic used to extract the remote name was changed. Due to URLs now being so varied, it uses a more dynamic, "best guess" method of identifying the remote name. This component has been tested fairly extensively and it passed all preexisting tests, but it's worth calling out as it is a non-trivial logic change.

MrFlynn added 2 commits March 27, 2025 22:14
These aren't strictly "URLs" in the web sense. They may be a variety of
formats and don't strictly conform to HTTP-style URLs, like SSH style
URLs. This change aligns the validation with Git's own URL validation.
While the validation doesn't necessarily gaurantee that the URL is
actually valid, calling the checkout action (which would be necessarily
to use this action anyway) would fail before this if it couldn't clone a
submodule due to a bad URL.

- Switched from `.url()` to `.regex()` string validator using the regex
  present in url.c[1] in Git's source code.
- Added unit test and test fixture to check compatibility with SSH-style
  URLs.

[1]:
https://github.com/git/git/blob/77bd3ea9f54f1584147b594abc04c26ca516d987/url.c#L6-L12
Since the action no longer accepts HTTP-style URLs, we need to account
for more cases. This commit adds a function that attempts a "best guess"
remote name extraction. It looks for specific characters in the URL that
probably demarcate the remote name. It's not a perfect solution, but it
works well in the vast majority of cases.

- Added `getRemoteName` function to get remote name.
- Added unit tests with a bunch of example cases that cover a gamut of
  different URL formats. The test cases were taken from this Stack
  Overflow post: https://stackoverflow.com/questions/31801271/what-are-the-supported-git-url-formats
@sgoudham sgoudham changed the title Expand URL Schema Support fix: support multiple URL protocols Mar 29, 2025
Copy link
Owner

@sgoudham sgoudham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the wait on this.

I was quite hesitant on accepting the changes to the remoteName logic but it has an extensive test suite which I'm happy with.

I've made some nitpicky modifications to the test setup so hope you don't mind that. All in all, it looks good! Thanks for taking the time to contribute, really appreciated!

@sgoudham sgoudham merged commit 2ffe311 into sgoudham:main Apr 1, 2025
1 check passed
@sgoudham
Copy link
Owner

sgoudham commented Apr 2, 2025

Actually, I might have to revert this one since I've just tested from the main branch and it's failing on the validation 🤔

https://github.com/sgoudham/update-git-submodules-test/actions/runs/14208637405/job/39811626312

Arguably should have carried out this test before merging but hindsight is 20/20!

Ignore me, totally forgot that GitHub Actions need to publish their compiled code into dist. Rebased the commit on main to also add the changes into dist.

sgoudham added a commit that referenced this pull request Apr 2, 2025
* Loosened validation of submodule URL.

These aren't strictly "URLs" in the web sense. They may be a variety of
formats and don't strictly conform to HTTP-style URLs, like SSH style
URLs. This change aligns the validation with Git's own URL validation.
While the validation doesn't necessarily gaurantee that the URL is
actually valid, calling the checkout action (which would be necessarily
to use this action anyway) would fail before this if it couldn't clone a
submodule due to a bad URL.

- Switched from `.url()` to `.regex()` string validator using the regex
  present in url.c[1] in Git's source code.
- Added unit test and test fixture to check compatibility with SSH-style
  URLs.

[1]:
https://github.com/git/git/blob/77bd3ea9f54f1584147b594abc04c26ca516d987/url.c#L6-L12

* Improved remote name extraction.

Since the action no longer accepts HTTP-style URLs, we need to account
for more cases. This commit adds a function that attempts a "best guess"
remote name extraction. It looks for specific characters in the URL that
probably demarcate the remote name. It's not a perfect solution, but it
works well in the vast majority of cases.

- Added `getRemoteName` function to get remote name.
- Added unit tests with a bunch of example cases that cover a gamut of
  different URL formats. The test cases were taken from this Stack
  Overflow post: https://stackoverflow.com/questions/31801271/what-are-the-supported-git-url-formats

* chore: delete helper function & rename test

---------

Co-authored-by: sgoudham <sgoudham@gmail.com>
@MrFlynn
Copy link
Contributor Author

MrFlynn commented Apr 2, 2025

No problem. I realize that was a pretty complicated change, but the old logic didn't really work once other URL formats get introduced. It's not something I had anticipated which is why I added a bunch of tests for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants