Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize Github URLs before comparing #365

Merged
merged 6 commits into from
Jan 13, 2023

Conversation

Leonidas-from-XIV
Copy link
Member

This PR tries to normalize the dev-repo URLs somewhat. @TheLortex reported in #118 that the .git is differing between the repos so this PR tries to normalize the URLs:

  1. Always have .git in the repo path
  2. Always use git+https as scheme
  3. Always strip fragments (No more #main)

This is just a workaround since there is no way to make sure the URLs are equivalent without visiting them (especially the .git addition is a bit dodgy), but it should at least cover some common cases and in the worst case the canonicalization gives up and returns the input, thus equivalent to what was there before.

Also turns out there was a Uri_utils.has_git_extension which I assume is somewhat related to the issue at hand, but it was unused so I removed it.

Closes #118

Copy link
Collaborator

@gridbugs gridbugs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does opam-monorepo make the assumption that all dev-repos are git repos?

lib/uri_utils.ml Outdated
let fpath =
match Fpath.has_ext ".git" fpath with
| true -> fpath
| false -> Fpath.add_ext ".git" fpath
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that the uri refers to a non-git repo?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory it can, but

  1. Most dev-repos these days are git and even on github.
  2. It doesn't matter all that much because this is only to make the URLs canonical for comparison, so https://hghub.com/foo/bar.hg would be rewritten to https+git://hghub.com/foo/bar.hg.git which might not be a valid URL but will compare equal to itself.

The canonicalization process adds .git and rewrites schemes to git+https but if you'd rather have it remove .git and rewrite git+https:// and git:// to https:// that's also totally ok with me.

@Leonidas-from-XIV
Copy link
Member Author

I've decided to wrap the normalization in an abstract type to which the conversion only goes one way, to make sure the canonical URLs (whatever they might be) aren't used for other purposes.

Copy link
Collaborator

@emillon emillon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be on the safe side, I think it's better to only do this canonicalization for known hosts (for example we know that github routes user/repo.git and user/repo to the same place but that's not necessarily the case for all hosts. Actually an alternative technique that might be easier (rather than canonicalization) is to parse the dev-repo field into Github of user:string;repo:string|...|Other of Uri.t.

This makes sure that the conversion is a one-way street and the
canonical URLs, whatever they might be will not be used for any other
purposes but `pp` and (more importantly) `equal`.
@Leonidas-from-XIV
Copy link
Member Author

@emillon You're right, that's a good point. I've simplified the whole thing to just parse user/repo out of github.com URLs and keep the rest the same.

It might make sense to remove the hash/branch, but in practice I doubt someone would specify a git repo with a hash as dev-repo.

@Leonidas-from-XIV Leonidas-from-XIV changed the title Normalize to a canonical URL before comparing URLs Normalize Github URLs before comparing Jan 12, 2023
@Leonidas-from-XIV Leonidas-from-XIV merged commit 5557a32 into tarides:main Jan 13, 2023
@Leonidas-from-XIV Leonidas-from-XIV deleted the canonical-url branch January 13, 2023 08:49
Leonidas-from-XIV added a commit to Leonidas-from-XIV/opam-repository that referenced this pull request May 3, 2023
CHANGES:

### Added

- Display warning when a package to be locked is missing a `dev-repo` field and
  is being skipped because of it (tarides/opam-monorepo#341, tarides/opam-monorepo#362, @kit-ty-kate, @Leonidas-from-XIV)
- Add option `--keep-symlinked-dir` to preserve symlinks in `duniverse/`, which
  can be useful for local development. (tarides/opam-monorepo#348, tarides/opam-monorepo#366, @hannesm,
  @Leonidas-from-XIV)

### Changed

- Canonicalize the URLs of the OPAM `dev-repo` fields to be able to detect more
  semantically equivalent URLs, this should reduce the risk of build failures
  due to duplicate code pulled (tarides/opam-monorepo#118, tarides/opam-monorepo#365 @TheLortex, @Leonidas-from-XIV)

- Simple the error message printed when dependencies don't use dune as their
  build system. The opam-0install diagnostic message is no longer printed in
  this case and the message has been reformatted and reworded to make the
  salient information easier to see. (tarides/opam-monorepo#384, @gridbugs)

### Fixed

- Error in case where multiple packages with different dev-repo fields would be
  placed in the same duniverse directory (tarides/opam-monorepo#377, @gridbugs)

- Fix a failure when using opam-monorepo with an opam 2.2 root
  (tarides/opam-monorepo#379, @kit-ty-kate)

- Fix assertion failure when prefix of "lock" subcommand is used (tarides/opam-monorepo#381,
  @gridbugs)

- Treat packages without build commands as virtual only if also lack install
  commands, as some non-virtual packages might only have install commands.
  (tarides/opam-monorepo#376 @Leonidas-from-XIV, @gridbugs)
Leonidas-from-XIV added a commit to Leonidas-from-XIV/opam-repository that referenced this pull request Sep 11, 2023
CHANGES:

### Added

- Display warning when a package to be locked is missing a `dev-repo` field and
  is being skipped because of it (tarides/opam-monorepo#341, tarides/opam-monorepo#362, @kit-ty-kate, @Leonidas-from-XIV)
- Add option `--keep-symlinked-dir` to preserve symlinks in `duniverse/`, which
  can be useful for local development. (tarides/opam-monorepo#348, tarides/opam-monorepo#366, @hannesm,
  @Leonidas-from-XIV)
- Adopt the OCaml Code of Conduct (tarides/opam-monorepo#391, @rikusilvola)
- Add solver tests (tarides/opam-monorepo#394, @samoht)

### Changed

- Canonicalize the URLs of the OPAM `dev-repo` fields to be able to detect more
  semantically equivalent URLs, this should reduce the risk of build failures
  due to duplicate code pulled (tarides/opam-monorepo#118, tarides/opam-monorepo#365 @TheLortex, @Leonidas-from-XIV)

- Simple the error message printed when dependencies don't use dune as their
  build system. The opam-0install diagnostic message is no longer printed in
  this case and the message has been reformatted and reworded to make the
  salient information easier to see. (tarides/opam-monorepo#384, @gridbugs)

- Encode `dev-repo` constraints in the opam solver - this allows to resolve
  more involved version constraints that were failing before (tarides/opam-monorepo#396, @samoht)

### Deprecated

### Fixed

- Error in case where multiple packages with different dev-repo fields would be
  placed in the same duniverse directory (tarides/opam-monorepo#377, @gridbugs)

- Fix a failure when using opam-monorepo with an opam 2.2 root
  (tarides/opam-monorepo#379, @kit-ty-kate)

- Fix assertion failure when prefix of "lock" subcommand is used (tarides/opam-monorepo#381,
  @gridbugs)

- Treat packages without build commands as virtual only if also lack install
  commands, as some non-virtual packages might only have install commands.
  (tarides/opam-monorepo#376 @Leonidas-from-XIV, @gridbugs)

- Improve the ordering of package candidates by putting broken packages at
  then end of the list (tarides/opam-monorepo#395, tarides/opam-monorepo#397, @samoht)

### Removed

### Security
Leonidas-from-XIV added a commit to Leonidas-from-XIV/opam-repository that referenced this pull request Sep 11, 2023
CHANGES:

### Added

- Display warning when a package to be locked is missing a `dev-repo` field and
  is being skipped because of it (tarides/opam-monorepo#341, tarides/opam-monorepo#362, @kit-ty-kate, @Leonidas-from-XIV)
- Add option `--keep-symlinked-dir` to preserve symlinks in `duniverse/`, which
  can be useful for local development. (tarides/opam-monorepo#348, tarides/opam-monorepo#366, @hannesm,
  @Leonidas-from-XIV)
- Adopt the OCaml Code of Conduct (tarides/opam-monorepo#391, @rikusilvola)
- Add solver tests (tarides/opam-monorepo#394, @samoht)

### Changed

- Canonicalize the URLs of the OPAM `dev-repo` fields to be able to detect more
  semantically equivalent URLs, this should reduce the risk of build failures
  due to duplicate code pulled (tarides/opam-monorepo#118, tarides/opam-monorepo#365 @TheLortex, @Leonidas-from-XIV)

- Simple the error message printed when dependencies don't use dune as their
  build system. The opam-0install diagnostic message is no longer printed in
  this case and the message has been reformatted and reworded to make the
  salient information easier to see. (tarides/opam-monorepo#384, @gridbugs)

- Encode `dev-repo` constraints in the opam solver - this allows to resolve
  more involved version constraints that were failing before (tarides/opam-monorepo#396, @samoht)

### Deprecated

### Fixed

- Error in case where multiple packages with different dev-repo fields would be
  placed in the same duniverse directory (tarides/opam-monorepo#377, @gridbugs)

- Fix a failure when using opam-monorepo with an opam 2.2 root
  (tarides/opam-monorepo#379, @kit-ty-kate)

- Fix assertion failure when prefix of "lock" subcommand is used (tarides/opam-monorepo#381,
  @gridbugs)

- Treat packages without build commands as virtual only if also lack install
  commands, as some non-virtual packages might only have install commands.
  (tarides/opam-monorepo#376 @Leonidas-from-XIV, @gridbugs)

- Improve the ordering of package candidates by putting broken packages at
  then end of the list (tarides/opam-monorepo#395, tarides/opam-monorepo#397, @samoht)

### Removed

### Security
nberth pushed a commit to nberth/opam-repository that referenced this pull request Jun 18, 2024
CHANGES:

### Added

- Display warning when a package to be locked is missing a `dev-repo` field and
  is being skipped because of it (tarides/opam-monorepo#341, tarides/opam-monorepo#362, @kit-ty-kate, @Leonidas-from-XIV)
- Add option `--keep-symlinked-dir` to preserve symlinks in `duniverse/`, which
  can be useful for local development. (tarides/opam-monorepo#348, tarides/opam-monorepo#366, @hannesm,
  @Leonidas-from-XIV)
- Adopt the OCaml Code of Conduct (tarides/opam-monorepo#391, @rikusilvola)
- Add solver tests (tarides/opam-monorepo#394, @samoht)

### Changed

- Canonicalize the URLs of the OPAM `dev-repo` fields to be able to detect more
  semantically equivalent URLs, this should reduce the risk of build failures
  due to duplicate code pulled (tarides/opam-monorepo#118, tarides/opam-monorepo#365 @TheLortex, @Leonidas-from-XIV)

- Simple the error message printed when dependencies don't use dune as their
  build system. The opam-0install diagnostic message is no longer printed in
  this case and the message has been reformatted and reworded to make the
  salient information easier to see. (tarides/opam-monorepo#384, @gridbugs)

- Encode `dev-repo` constraints in the opam solver - this allows to resolve
  more involved version constraints that were failing before (tarides/opam-monorepo#396, @samoht)

### Deprecated

### Fixed

- Error in case where multiple packages with different dev-repo fields would be
  placed in the same duniverse directory (tarides/opam-monorepo#377, @gridbugs)

- Fix a failure when using opam-monorepo with an opam 2.2 root
  (tarides/opam-monorepo#379, @kit-ty-kate)

- Fix assertion failure when prefix of "lock" subcommand is used (tarides/opam-monorepo#381,
  @gridbugs)

- Treat packages without build commands as virtual only if also lack install
  commands, as some non-virtual packages might only have install commands.
  (tarides/opam-monorepo#376 @Leonidas-from-XIV, @gridbugs)

- Improve the ordering of package candidates by putting broken packages at
  then end of the list (tarides/opam-monorepo#395, tarides/opam-monorepo#397, @samoht)

### Removed

### Security
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Semantically equivalent but different "dev-repo" fields lead to build errors
3 participants