Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(RE-6200) URI parsing, Source handling, Git cloning #394

Merged
merged 22 commits into from
Aug 4, 2016
Merged

(RE-6200) URI parsing, Source handling, Git cloning #394

merged 22 commits into from
Aug 4, 2016

Conversation

mckern
Copy link
Contributor

@mckern mckern commented Jul 12, 2016

I started refactoring how Vanagon::Component::Source works... and wound up making sweeping revisions to source handling in general. There's probably more to come, but I wanted to get people looking at this before it gets even larger and more difficult to grok. Spec tests pass, but coverage is incomplete so I'd like to start testing this out before anyone even considers merging it.

@@ -317,11 +317,11 @@ def build_dir(path)

# This will add a source to the project and put it in the workdir alongside the other sources
#
# @param url [String] url of the source
# @param uri [String] uri of the source
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update docs here

@stahnma
Copy link
Contributor

stahnma commented Jul 12, 2016

time bundle exec build puppet-agent el-7-x86_64
Error loading project 'puppet-agent' using '/home/stahnma/puppet-agent/configs/projects/puppet-agent.rb':
undefined method `git_version' for Vanagon::Utilities:Module
/home/stahnma/vanagon/lib/vanagon/project/dsl.rb:147:in `version_from_git'
/home/stahnma/vanagon/lib/vanagon/project.rb:169:in `block in load_project'
/home/stahnma/vanagon/lib/vanagon/project/dsl.rb:26:in `call'
/home/stahnma/vanagon/lib/vanagon/project/dsl.rb:26:in `project'
/home/stahnma/vanagon/lib/vanagon/project.rb:28:in `load_project'
/home/stahnma/vanagon/lib/vanagon/project.rb:28:in `instance_eval'
/home/stahnma/vanagon/lib/vanagon/project.rb:28:in `load_project'
/home/stahnma/vanagon/lib/vanagon/driver.rb:26:in `initialize'
/home/stahnma/vanagon/bin/build:27:in `new'
/home/stahnma/vanagon/bin/build:27:in `block in '
/home/stahnma/vanagon/bin/build:25:in `each'
/home/stahnma/vanagon/bin/build:25:in `'
/home/stahnma/.gem/ruby/bin/build:23:in `load'
/home/stahnma/.gem/ruby/bin/build:23:in `'
/home/stahnma/vanagon/lib/vanagon/project/dsl.rb:147:in `version_from_git': undefined method `git_version' for Vanagon::Utilities:Module (NoMethodError)
    from /home/stahnma/vanagon/lib/vanagon/project.rb:169:in `block in load_project'
    from /home/stahnma/vanagon/lib/vanagon/project/dsl.rb:26:in `call'
    from /home/stahnma/vanagon/lib/vanagon/project/dsl.rb:26:in `project'
    from /home/stahnma/vanagon/lib/vanagon/project.rb:28:in `load_project'
    from /home/stahnma/vanagon/lib/vanagon/project.rb:28:in `instance_eval'
    from /home/stahnma/vanagon/lib/vanagon/project.rb:28:in `load_project'
    from /home/stahnma/vanagon/lib/vanagon/driver.rb:26:in `initialize'
    from /home/stahnma/vanagon/bin/build:27:in `new'
    from /home/stahnma/vanagon/bin/build:27:in `block in '
    from /home/stahnma/vanagon/bin/build:25:in `each'
    from /home/stahnma/vanagon/bin/build:25:in `'
    from /home/stahnma/.gem/ruby/bin/build:23:in `load'
    from /home/stahnma/.gem/ruby/bin/build:23:in `'
real    0m0.581s
user    0m0.528s
sys 0m0.050s

I also had to update the vanagon.gemspec to include fuistigit

@mckern mckern changed the title URI parsing, Source handling, Git cloning (RE-6200) URI parsing, Source handling, Git cloning Jul 15, 2016
def add_source(url, ref: nil, sum: nil)
@component.sources << OpenStruct.new(:url => url, :ref => ref, :sum => sum)
def add_source(uri, **options)
@component.sources << OpenStruct.new(options)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate that we're using OpenStruct here, because it means that there's really no documentation about what kind of object (or structure) we SHOULD be pushing into @component.sources here. I had to fish around for #add_source to figure out what is expected of each source we push into @component.sources.

As a prelude to fixing RE-6200 (Unable to use https source of a github repo),
Vanagon::Component::Source is being restructured.
- Changed class variable to class instance variable
- Added class accessor for @rewrite_rules
- Updated spec test to use the accessor instead of
  directly calling the class var
- Changed from self. method declaration to reopening the class
  (In preparation for adding private helper methods)
This should help clarify intent, and is done in preparation
for rewriting the `git` utility.
This makes them easier to use from within classes without having
to `include Vanagon::Utilities` to get access to them.
I don't know why we didn't do this sooner but this is a convenience
alias. It's less typing and it's a name we'd expect that functionality
to be exposed under,
In preparation for moving the Git functionality into the
Vanagon::Component::Source::Git class, here's some new and
interesting Error classes that will be used for meaningful
error messages.

* Vanagon::GitError
* Vanagon::InvalidRepo
* Vanagon::UnknownRef
* Vanagon::UnknownSha
* Vanagon::CheckoutFailed
`git` methods have been removed from Vanagon::Utilities, and confined
to the Vanagon::Component::Source::Git class (nothing outside of the
Source subclass uses them).
@mckern
Copy link
Contributor Author

mckern commented Jul 28, 2016

Still needs

  • Documentation
  • Allow setting git clone depth through the Source DSL
  • Source#absolute_certainty needs to be written
  • :type keywords for component initialization

I've decided that we don't need the type functionality, so those two line-items are removed from the scope on this PR.

def valid_remote?(url)
!!::Git.ls_remote(url)
rescue ::Git::GitExecuteError
nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're making it a boolean (in the yard, and explicitly above), should this return false instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes some sense.

@haus
Copy link
Contributor

haus commented Jul 29, 2016

did you consider using refinements instead of monkey patching?

@mckern
Copy link
Contributor Author

mckern commented Jul 29, 2016

Ruby 2.2/2.3 docs state:

Refinements only modify classes, not modules so the argument must be a class.

They might work for this particular use case but it's a crapshoot.

Edit

None of the monkey-patched methods are overwriting existing behaviors in this PR. It's worth investigating porting them to refinements, but they should be generally safe.

This is part of a work-in-progress for Vanagon::Component::Source, which
will remove the rewrite functionality. This involves stronger use of
keyword arguments but should preserve the DSL.

This commit also introduces a dependency on the Fustigit gem, which is
a utility that makes parsing triplet-style git URIs viable.
Lower-case 'e' will only work on case-insensitive filesystems.
Uppder-case 'E' is the case you want.
Fustigit has been updated, and moved to the Gemspec. Basic comments
have been added to both the Gemspec and the Gemfile to clarify intent
a little bit.
Gems should be in EITHER Gemfile or the Gemspec, but not both.
Ruby-Git might not be the One True Git Library that solves all of our needs
but it's generally better than abstracting out "%x()"-wrapped calls to Git
ourselves.

I've extended it slightly to support Git submodules and unshallowing repos
that were shallow-cloned but we're now using it basically as-is. In theory,
this will potentially make porting to a different Git library (like Rugged)
easier, since we've just got a handful of composed functions instead of an
entire implementation to port.
Disable Metrics/AbcSize for a handful of functions, and add a target Ruby version
to .rubocop.yml (Ruby 2.1 API compatibility is our target, by the way).
@stahnma
Copy link
Contributor

stahnma commented Jul 29, 2016

@mckern are you planning to squash/clean commits more than this or is this what you want for final review?

@mckern
Copy link
Contributor Author

mckern commented Jul 29, 2016

@stahnma I want to add a doc commit today but I wasn't planning to squash/rebase any more than this. These commits have a fair amount of churn in them so the best I could probably do would be to flatten it into one commit. I think we lose some important history here but I am open to being persuaded.

@mcdonaldseanp
Copy link
Contributor

FWIW I really don't think we should squash most of these. In the near future we may need to fix/update major parts to fit our needs and having the commit history will allow us to understand the original iterations and thought process so we can avoid the process of making a decision Ryan already realized was not going to work.

@mckern
Copy link
Contributor Author

mckern commented Jul 29, 2016

s/realized/assumed/

@stahnma
Copy link
Contributor

stahnma commented Jul 29, 2016

Trying to build el-7-x86_64 off of PA stable.

https://gist.github.com/stahnma/9df32ad6a8a510abaa03eb98a9469ab2

@haus
Copy link
Contributor

haus commented Jul 29, 2016

@mckern yea, monkey patching the gem is pretty safe. if it were touching anything from ruby core i'd be much more interested in limiting the scope of the effect.

@mckern
Copy link
Contributor Author

mckern commented Jul 29, 2016

@stahnma thanks for helping test things off the happy path. You caught a thing I'd missed. Fixed and testing now.

Local sources have had some work done around how archives
are handled and local files are moved around. HTTP and Local
sources have a lot of overlap in their workflow, since HTTP
sources are basically just Local sources with a remote
fetch in the mix. As HTTP now inherits from Local, this should
reduce a substantial amount of code duplication while letting
HTTP sources benefit from the refactoring.
The original plan was to have a "best guess or specific definition" for
initializing a new Source object. But the "best guess" functionality now
relies on the Source subclasses providing some sort of "is this valid?"
functionality that was not part of the original plan. With this new functionality
the "best guess or specific definition" plan is abandoned.
The rewrite engine has a glaring flaw: it previous treated
'http' and 'https' sources as equivilent and applied all
'http' rules to any 'https' sources. This is dumb, but it's
behavior we have to preserve until the engine is gone.
It's harder than you think it is.
# is not smart, and it has very poor support for submodule options.
# For example, you can't pass any arguments to the handful of submodule
# options that accept them (like --depth). We may extend it later,
# but for rev. 0001, simply initializing them will suffice
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be useful to have an example of how this method will be used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Added a little yardoc to the extension for the Ruby-Git library.
@mckern mckern removed the don't merge label Aug 3, 2016
@mckern
Copy link
Contributor Author

mckern commented Aug 3, 2016

High Level Breakdown

Git

  • Our custom %x{git do-a-thing} commands have been offloaded to an external library. This is an under-the-hood swap, and the existing API (such as it is) has been preserved while hopefully making it easier to replace the Git library when/if needed.
  • Git functionality is expanded, supporting virtually all variants of acceptable Git URIs (including the all-too-common Triplet format of <user>@<host>:<repo>.git).
  • I tried (and failed) to make Shallow Cloning a viable thing, but it turns out that shallow cloning has a TON of edge cases. libgit2 doesn't even support them, so that should give you a reasonable idea of why I abandoned it. The commits and code related to shallow cloning have been left here to preserve the notes about why it was abandoned.

HTTP source handling

  • HTTP sources now support redirects. Remote sources are now more likely to come down correctly the first time as a result of the new redirection support.
  • Additionally, duplicate code between Local sources and HTTP sources has been cleaned up so that the Local class does the heavy lifting.
  • Reduced code in the HTTP class will hopefully make it easier to replace/extend in the future if we have to move to a more performant HTTP backend than the native Net::HTTP implementation in Ruby.

Local source handling

  • Archive types have been expanded greatly, with support for bzip2, xz, rar, etc.

Component::Source definition

  • Component::Source.source now uses a cleaner heuristic to determine what sort of object to return: it asks each subclass if a given URI is valid and returns the first one that says "yes". In order, it tries Git, then HTTP/HTTPS, and then the Local filesystem before giving up.
  • Refactoring use of class variables to class instance variables, which should make reasoning about scope a little easier as we move towards a better defined object model for Vanagon's guts.

Misc

  • Use of Ruby's English library, allowing us to use more descriptive values for common operations, e.g. checking $CHILD_STATUS.success? instead of $?.success?
  • More status messages and error messages. Vanagon still exits with a stack dump if operations don't succeed but at least now the output is slightly more meaningful.
  • Updated tests for code paths that were touched during this refactoring.

@mcdonaldseanp
Copy link
Contributor

👍 on everything, I believe with spec tests and all, this is ready for merge

@ScottGarman ScottGarman merged commit a92c16e into puppetlabs:master Aug 4, 2016
shrug pushed a commit that referenced this pull request Aug 11, 2016
(MAINT) Update CHANGELOG for PR #394
@mckern mckern deleted the RE-6200/you-are-i branch April 13, 2017 22:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants