Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework of dependency and requirements #354

Merged
merged 3 commits into from
Apr 17, 2020

Conversation

waj
Copy link
Member

@waj waj commented Apr 16, 2020

This is a refactor of dependency requirements (specifications of which versions are suitable for each dependency).

Currently, each dependency has different fields (version, branch, tag, commit) that declares the requirement. But this model is imprecise (allows invalid states), difficult to reason in code (version is sometimes a pattern, others an exact version) and requires constant parsing of the version field (for example to differentiate patterns, or exact version).

With this change, each Dependency now has only three fields: name, resolver and requirement. Also note that the relationship between Dependency and Resolver is now inverted. (In a future refactor I also plan to remove the name from Resolver making it totally independent from the dependency that it pretends to install).

Requirement is actually a union of these types (structs):

  • Version: represents an exact version already resolved. Used in dependencies read from lock files and internally during dependency resolution.
  • VersionReq: a version pattern. This corresponds to the version field entered in the shard.yml
  • Ref: an abstract struct representing any reference specific for the resolver
  • Any: a wildcard, specifying that any version is suitable

Currently we only have only one resolver (git) that defines subclasses of Ref:

  • GitBranchRef: corresponds to the branch attribute in shard.yml
  • GitTagRef: corresponds to the tag attribute in shard.yml
  • GitCommitRef: corresponds to the commit attribute in shard.yml
  • GitHeadRef: used internally when there is no version specified and no tags

The parsing of Ref is delegated to the resolver.
Also, printing of exact versions is delegated to resolvers, thus producing better output without introducing ad-hoc code.

spec/unit/spec_spec.cr Show resolved Hide resolved
src/config.cr Show resolved Hide resolved
src/requirement.cr Show resolved Hide resolved
src/resolvers/git.cr Show resolved Hide resolved
src/resolvers/path.cr Show resolved Hide resolved
@straight-shoota
Copy link
Member

straight-shoota commented Apr 16, 2020

The internal refactoring looks nice.

But I'm hesitant about the changes to parsing dependencies. They violate quite a few properties that
have been established in #306 and previous discussions. For example, it was agreed that extraneaous attributes in dependency mappings should be ignored to enable forward compatibility. And that YAML parsing should be isolated and not depend on resolvers. Requiring the resolver url to be the first entry in the dependency mapping is also a huge breaking change. I can't see how we could possibly accept that. The SPEC does not define a specific order of keys, so the implementation must not assume any.
Now this breaks with all of that, and I don't see any immediate benefit wich would justify deriving from the previous decisions.
Maybe you can shed some light on that? Is it necessary to break behaviour and why?

I've looked at the dependency mappings available on shardbox.org to get some stats: There are 720 dependencies (about 11%) where the resolver is not the first k/v pair. These would be broken with this change. When taking only dependencies defined by latest releases into account, the number is still 122 broken dependencies (but that's even 20%).

Additional, undefined properties on the other hand are less frequent. There are only 35 instances (about 0,5 %) and they all seem to be misspellings of version. So in this regard, a breaking change would be acceptable. But it would break forward compatibility in case we intend to introduce additional properties at some point.

The stats only take dependencies into account. I didn't look at dev_dependencies as I don't think additional data is necessary.

@waj
Copy link
Member Author

waj commented Apr 16, 2020

Big oops! Thank you @straight-shoota for being right there. I thought it could be acceptable to break compatibility because fixing the yml would be an easy thing, but that would also break the parsing of previous versions which is not acceptable. I don't know what I was thinking... sorry and thanks.

I'll change the parsing to keep it compatible. Probably by parsing each dependency attributes to a hash, figure out the resolver type and pass that hash to the resolver to define the requirement.

Now, what do we win by allowing the resolver key to not be the first entry in each map? Maybe it's late to change that, but I think it improves readability of the spec.yml.

@RX14
Copy link
Contributor

RX14 commented Apr 16, 2020

I just want to say having shardbox to give statistics about these kinds of changes has been amazing and affected real data-driven change to shards. Thank you so much @straight-shoota!

@waj
Copy link
Member Author

waj commented Apr 16, 2020

Absolutely, shardbox is awesome!

I just pushed the changes that revert the (in)compatibility of the parsing.

I also replaced the Dependency.new(pull) with Dependency.from_yaml because I didn't like that little game with tuples to workaround issues with nilables.

private def git_refs(version)
case version
def parse_requirement(params : Hash(String, String)) : Requirement
params.each do |key, value|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we raise an Ambiguous dependency requirement for ... if more than one branch, tag, commit is specified instead of honoring the first one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, pick the most specific one version > tag > branch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually a tag could be more specific than a version with a pattern. But they also could be conflicting if a version is not reachable in a branch.

If anything, I would say we could raise instead of considering just the first one. @straight-shoota do you think this would be a breaking change?

We could implement a stricter parser if we want, for the current shard.yml, but be more relaxed when the solver is scanning released versions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say we could define this later. There is currently not defined behavior with conflicting attributes and this PR is not changing that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I’m out of feedback :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JFTR: Shardbox database shows no dependencies with more more than one requirement restriction. So breaking things is not a concern.

@bcardiff
Copy link
Member

I've just noticed the lock file generated after this refactor as follows. Which is semantically fine, but just wondering if the change is intended or not.

version: 1.0
shards:
  molinillo:
---    github: crystal-lang/crystal-molinillo
+++    git: https://github.com/crystal-lang/crystal-molinillo.git
    version: 0.1.0

@straight-shoota
Copy link
Member

@bcardiff I guess it doesn't matter much either way. Using explicit git is a bit more verbose, but might have a slight advantage towards forward compatibility. If we add more shortcut keys, older shards version would still be able to install from the lock file.

@bcardiff
Copy link
Member

Yes, I agree it can still work and it might even be a good choice. But since it wasn't stated as an intended change, I wanted to double-check.

@waj
Copy link
Member Author

waj commented Apr 16, 2020

Actually that changed since #306, not my fault 😇

I think it's ok, if we think the lock file is saving a normalized definition of the dependencies

@waj waj merged commit 4395d37 into crystal-lang:master Apr 17, 2020
@waj waj deleted the rework-dependency-requirement branch April 17, 2020 04:57
@straight-shoota
Copy link
Member

I would have appreciated to get a chance at properly reviewing the updated changes :/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants