New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Allow HEAD-upgrades #584

Closed
wants to merge 12 commits into
base: master
from

Conversation

Projects
None yet
5 participants
@vladshablinsky
Copy link
Contributor

vladshablinsky commented Jul 25, 2016

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same change?
  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your changes? Here's an example.
  • Have you successfully run brew tests with your changes locally?

This is a copy of #274, but not filled with old comments and suggestions regarding the project in general, so the maintainers can leave their feedback.

The further work on GSoC project will be continued here.

@vladshablinsky

This comment has been minimized.

Copy link
Contributor

vladshablinsky commented Jul 25, 2016

edit: the use of short and long hashes for git was addressed here before, now the problem is solved and the previous description of the problem lives here.

@vladshablinsky vladshablinsky changed the title [WIP][RFC] Copy of Allow HEAD-upgrades [WIP][RFC] Allow HEAD-upgrades Jul 25, 2016

@UniqMartin

This comment has been minimized.

Copy link
Contributor

UniqMartin commented Jul 27, 2016

  • Store long hashes in Tab. Cons: complicate.

I think this would be the cleanest solution. I haven't spent much time thinking about how much this complicates the whole design, so take this with a grain of salt.

  • Use short commits and use == for VCSDownloadStrategy and start_with? for GitHubDownloadStrategy. Cons: there is a possibility we get outdated version (see the example).

I don't think this is an unreasonable strategy. While this won't be perfect, the likelihood of this failing would be close to 0% if the shortened hash isn't overly short, e.g. 16 characters. Even if this wasn't sufficient to uniquely identify all commits, it would only affect a few commits and in extremely big Git repositories with millions if not billions of commits and objects.

  • Store long hashes everywhere. Cons: long hashes in outdated outputs and long prefixes.

This could be partially addressed by using a more efficient encoding that uses fewer characters at the expense of no longer having hex digits and thus adding an additional layer of abstraction that might complicate matters. E.g. by re-encoding the full commit hash in Base32, its length could be shortened from 40 to 32 characters. (Just wanted to mention this theoretical possibility. I don't think this is a practical solution due to its complexity.)

@xu-cheng

This comment has been minimized.

Copy link
Contributor

xu-cheng commented Jul 27, 2016

@vladshablinsky can you give a list of example prefix for different VCS strategies? I think there is no much advantage to short git commit if we cannot short commit for other VCS installations.

return false unless head && head.downloader.is_a?(VCSDownloadStrategy)
downloader = head.downloader
downloader.shutup! unless ARGV.verbose?
version.version.commit != downloader.fetch_last_commit

This comment has been minimized.

@xu-cheng

xu-cheng Jul 27, 2016

Contributor

You probably need to cache the upstream last commit.

This comment has been minimized.

@vladshablinsky

vladshablinsky Jul 27, 2016

Contributor

Do I cache it in Formula or in VCSDownloadStrategy?

@xu-cheng

This comment has been minimized.

Copy link
Contributor

xu-cheng commented Jul 27, 2016

Also, @vladshablinsky shouldn't this PR be ready to be merged(after review)? If so, let's remove WIP tag in the title.

@xu-cheng xu-cheng added the discussion label Jul 27, 2016

@xu-cheng

This comment has been minimized.

Copy link
Contributor

xu-cheng commented Jul 27, 2016

@Homebrew/maintainers This PR should be the final one for @vladshablinsky's GSoC project. The only remaining problem is how we should store VCS commit in the prefix, as mentioned in the above.

All of implementations for better supporting HEAD have been done. Please give your valuable comments. Thanks.

Finally thanks for @vladshablinsky to finish his project way before schedule.

@xu-cheng xu-cheng changed the title [WIP][RFC] Allow HEAD-upgrades [RFC] Allow HEAD-upgrades Jul 27, 2016

formulae.select(&:outdated?).each do |f|
formulae.select do |f|
f.outdated?(:fetch_head => fetch_head)
end.each do |f|

This comment has been minimized.

@UniqMartin

UniqMartin Jul 27, 2016

Contributor

I have to admit that I'm not a big fan of chaining multiple multi-line do blocks. Would the following be also acceptable to you?

outdated_formulae = formulae.select do |f|
  f.outdated?(:fetch_head => fetch_head)
end

outdated_formulae.each do |f|
  …

This comment has been minimized.

@UniqMartin

UniqMartin Jul 27, 2016

Contributor

Also this pattern of checking for outdated formulae appears in at least three places in this diff. I wonder if there's a good way to refactor this to avoid repeating this three-line select code snippet everywhere.

if outdated_versions.any? && f.head? && !fetch_head
puts "#{f.full_name} (#{outdated_versions*", "}) HEAD update available"
else
puts "#{f.full_name} (#{outdated_versions*", "}) < #{f.pkg_version}"

This comment has been minimized.

@UniqMartin

UniqMartin Jul 27, 2016

Contributor

Now that you're modifying this anyway: Can you please replace *", " with .join(", ") here and two lines above? The former is a lot more obscure.

@@ -980,7 +998,7 @@ def migration_needed?
end

# @private
def outdated_versions
def outdated_versions(options={})

This comment has been minimized.

@UniqMartin

UniqMartin Jul 27, 2016

Contributor

Please use spaces around the assignment (i.e. options = {}).

end
end

# @private
def outdated?
outdated_versions.any?
def outdated?(options={})

This comment has been minimized.

@UniqMartin

UniqMartin Jul 27, 2016

Contributor

Please use spaces around the assignment (i.e. options = {}).

end

def setup_tab_for_prefix(prefix, tap_string=nil)
def setup_tab_for_prefix(prefix, options={})

This comment has been minimized.

@UniqMartin

UniqMartin Jul 27, 2016

Contributor

Please use spaces around the assignment (i.e. options = {}).

"stable" => formula.stable ? formula.stable.version.to_s : nil,
"devel" => formula.devel ? formula.devel.version.to_s : nil,
"head" => formula.head ? formula.head.version.to_s : nil
}

This comment has been minimized.

@UniqMartin

UniqMartin Jul 27, 2016

Contributor

I think in addition to this you will also need to populate attributes["source"]["versions"] in Tab.from_file_content. Otherwise Tab#versions will return nil for existing tabs that were created before this change and this will cause Tab#stable_version and Tab#devel_version to fail.

This comment has been minimized.

@vladshablinsky

vladshablinsky Jul 27, 2016

Contributor

👍 Already implemented it locally. Haven't pushed the changes yet, but I will.

UPD Done.


def devel_version
Version.create(versions["devel"]) if versions["devel"]
end

This comment has been minimized.

@UniqMartin

UniqMartin Jul 27, 2016

Contributor

Is there a reason there's both Tab#stable_version and Tab#devel_version but noTab#head_version? This feels like a surprising omission to me.

@vladshablinsky vladshablinsky force-pushed the vladshablinsky:ds_last_commit_copy branch 2 times, most recently from ec3ec58 to 9802b0c Jul 27, 2016

@vladshablinsky

This comment has been minimized.

Copy link
Contributor

vladshablinsky commented Jul 27, 2016

shouldn't this PR be ready to be merged(after review)? If so, let's remove WIP tag in the title.

I have three questions left regarding the project:

    1. Seems like curl_output never fails, so we probably need something like this:
  def fetch_last_commit
    return unless head?
    commit = github_last_commit || super
    version.update_commit(commit)
    commit
  end
    1. Do we need to check that last fetched commit not nil, maybe raise an Error or something?
      * 3. How do I change print_outdated_json behaviour if --fetch-HEAD flag passed.

Also, what if we change --fetch-HEAD to --fetch-head?

UPD:

And probably the last one. We might want HEAD with nil commit to be outdated without any other checks.

can you give a list of example prefix for different VCS strategies?

Mercurial:
HEAD-c0f87ece5d4766de9f269edce7c6ac506bb32899 for sdl_rtf

Subversion:
HEAD-412 for smpeg2

Bazaar:
HEAD-14764 for squid

Fossil:
HEAD-a25269bd3d8df10ff5d871dd227a504ce16431c4_2 for libspatialite

@UniqMartin

This comment has been minimized.

Copy link
Contributor

UniqMartin commented Jul 27, 2016

It does, but you have to check the status yourself after invoking it, e.g.:

output, errors, status = curl_output(…)
handle_failure unless status.success?

(See GitHub.open for a practical example where this is used to handle the error case.)

@vladshablinsky vladshablinsky force-pushed the vladshablinsky:ds_last_commit_copy branch 2 times, most recently from b2f439f to d90e81b Jul 27, 2016

@vladshablinsky

This comment has been minimized.

Copy link
Contributor

vladshablinsky commented Jul 29, 2016

Still can't understand why it fails on Maverics. I thought maybe the problem was in parentheses, so I installed ruby 1.8.7 and tried some silly stuff like this:

dummy_f = "f"

def dummy_f.outdated_versions(options = {})
  [self, options.keys]
end

def dummy_refute_predicate o1, op, msg = nil
  raise "Raised with msg: #{msg}" if o1.__send__(op)
end

dummy_refute_predicate dummy_f.outdated_versions(:fetch_head => true), :empty?

But it all works good, so, I don't know what is the reason of the fail.


refute_predicate f.outdated_versions(:fetch_head => true), :empty?

tab_a["versions"] = { "stable" => f.version }

This comment has been minimized.

@UniqMartin

UniqMartin Jul 29, 2016

Contributor

The test failure is actually happening here. (You cannot fully trust the line numbers in the test bot output because your PR is being merged on top of an up-to-date master, so things might move a little bit.) The problem is that OpenStruct in Ruby 1.8.7 doesn't have a []= method. If you change it to the following, it will work:

tab_a.versions = { "stable" => f.version }

This comment has been minimized.

@vladshablinsky

vladshablinsky Jul 29, 2016

Contributor

Thanks!

@@ -980,7 +998,7 @@ def migration_needed?
end

# @private
def outdated_versions
def outdated_versions(options = {})
@outdated_versions ||= begin

This comment has been minimized.

@UniqMartin

UniqMartin Jul 29, 2016

Contributor

I wonder if it is still appropriate to cache the return value in a single instance variable, as the result can change based on the given options. I would expect the following sequence to produce different results, but it probably won't because the first one will have already populated the cache, so the options on the second invocation don't have any effect:

f.outdated_versions
f.outdated_versions(:fetch_head => true)

This comment has been minimized.

@vladshablinsky

vladshablinsky Jul 29, 2016

Contributor

What should I do about it? We can cache outdated versions in two instance variables, but I think we shouldn't give up caching.

This comment has been minimized.

@UniqMartin

UniqMartin Jul 29, 2016

Contributor

I think a simple solution would be to cache based on the given options, though I assume this will continue to be slightly incorrect, as I think :fetch_head => true means “fetch HEAD now (possibly again) and then return a result based on this most recent fetch”.

A cleaner solution would be to think about what doesn't change during the lifetime of a Homebrew process and only cache that. Then refactor the method to compute the final result from the information in the cache and add to that the information that can potentially change on every invocation of this method.

This comment has been minimized.

@vladshablinsky

vladshablinsky Jul 29, 2016

Contributor

I'm not sure but I think that we don't need to fetch remote repository twice when we already have outdated versions with :fetch_head => true cached somewhere.

As I can see we only use Formula#outdated and Formula#outdated_versions in cmd/upgrade and cmd/outdated. For outdated we don't need to reassign anything (at least now). For upgrade we need outdated only to understand that it's time to upgrade.

Also, supposing we don't cache anything at all, is it possible that outdated_versions returns something different from what it returns on its first call? It seems that no, otherwise we would have the same caching problem we have now. And assuming that our further behaviour depends only on what is returned by outdated_versions on its first call, if outdated_versions(:fetch_head => true) is not empty on its first call it won't change from what it is on first call. If outdated_versions(:fetch_head => true) is empty on its first call, I think it's even worse if it gets not empty somewhere in the middle of the process. So, I think having outdated_versions with and without the flag isn't that bad.

@vladshablinsky vladshablinsky force-pushed the vladshablinsky:ds_last_commit_copy branch 3 times, most recently from 148c8e6 to 755ebd7 Jul 29, 2016

puts "#{f.full_name} (#{f.outdated_versions*", "} < #{f.pkg_version})"
outdated_versions = f.outdated_versions(:fetch_head => fetch_head)
if outdated_versions.any? && f.head? && !fetch_head
puts "#{f.full_name} (#{outdated_versions.join(", ")}) HEAD update available"

This comment has been minimized.

@MikeMcQuaid

MikeMcQuaid Aug 2, 2016

Member

Let's go for < latest HEAD

head_version = latest_head_version
if head_version
head_version_outdated?(head_version, options) ? all_versions.sort! : []
else
all_versions.sort!

This comment has been minimized.

@xu-cheng

xu-cheng Aug 3, 2016

Contributor

A nitpick: I would prefer to using all_version.sort in this line and the line above. After all, you are using the return value.

This comment has been minimized.

@bfontaine

bfontaine Aug 3, 2016

Member

Am I missing something or could it be simplified?

# before
if head_version
  head_version_outdated?(head_version, options) ? all_versions.sort! : []
else
  all_versions.sort!
end

# after
if head_version && !head_version_outdated?(head_version, options)
  []
else
  all_versions.sort!
end

This comment has been minimized.

@vladshablinsky

vladshablinsky Aug 3, 2016

Contributor

@bfontaine looks good, thanks. Probably it was similar to your suggestion at some point locally.

@vladshablinsky vladshablinsky force-pushed the vladshablinsky:ds_last_commit_copy branch 3 times, most recently from a0bca72 to 94f54d4 Aug 3, 2016

@vladshablinsky

This comment has been minimized.

Copy link
Contributor

vladshablinsky commented Aug 4, 2016

@xu-cheng there is some code uncovered in cmd/outdated, so maybe some new integration tests should be added, shouldn't they? Other things seems to be covered where it's possible. GitHubDownloadStrategy is not covered and it's probably expected, but most of the other code is covered.

Note about CVS:

I spent some time trying to get last revision (commit hash) in the CVS repository, it seems there is no way to do that just running one command and not scanning through all the data CVS outputs for each file. So, I think it'll be better to just have timestamps as a unique identifier for CSV.


return true if stable && tab.stable_version && tab.stable_version < stable.version
return true if devel && tab.devel_version && tab.devel_version < devel.version
end

This comment has been minimized.

@xu-cheng

xu-cheng Aug 5, 2016

Contributor

I would avoid to introduce this method, as tab_version_outdated? should be implementation details that should not be exposed.

Can we merge it into head_version_outdated?? The only place I see you use tab_version_outdated? is in outdated? in which case, you can just use head_version_outdated?(version, :fetch_head => false). Actually, you don't need to call head_version_outdated? in outdated version at all. See my comment in cmd/outdated

downloader = head.downloader
downloader.shutup! unless ARGV.verbose?
downloader.commit_outdated?(version.version.commit)
end

This comment has been minimized.

@xu-cheng

xu-cheng Aug 5, 2016

Contributor

a nitpick, can we have a else false end here. A method with ? in name should only return true or false, but not nil.


outdated = outdated_formulae.each do |f|
outdated_versions = f.outdated_versions(:fetch_head => fetch_head)
current_version = if f.tab_version_outdated?(f.latest_head_version)

This comment has been minimized.

@xu-cheng

xu-cheng Aug 5, 2016

Contributor

It's bad practice to use f.latest_head_version, because it's only referred to one of version and has different semantical meaning with outdated_versions.

This can(should) be if f.head? && outdated_versions.any? { |v| v.to_s == f.pkg_version.to_s } . The idea is that:
(1) You should check whether it's a head formula.
(2) If f.pkg_version is the same as one of installed versions, it means that you didn't fetch latest HEAD.

puts "#{f.full_name} (#{outdated_versions.join(", ")}) < latest HEAD"
else
puts "#{f.full_name} (#{outdated_versions.join(", ")}) < #{f.pkg_version}"
end

This comment has been minimized.

@xu-cheng

xu-cheng Aug 5, 2016

Contributor

This part is overcomplicated, you should use the same logic as below print_outdated_json, i.e.

      current_version = if f.head? && outdated_versions.any? { |v| v.to_s == f.pkg_version.to_s }
        "latest HEAD"
      else
        f.pkg_version.to_s
      end
      puts "#{f.full_name} (#{outdated_versions.join(", ")}) < #{current_version}"
@xu-cheng

This comment has been minimized.

Copy link
Contributor

xu-cheng commented Aug 5, 2016

there is some code uncovered in cmd/outdated, so maybe some new integration tests should be added, shouldn't they? Other things seems to be covered where it's possible. GitHubDownloadStrategy is not covered and it's probably expected, but most of the other code is covered.

It is preferable to have the test as well. But it can be done in follow up PR.


I have tested this PR. Once you fix the problem I mentioned in cmd/outdated I think we can merge PR and improve other matters in follow up PR.

@vladshablinsky vladshablinsky force-pushed the vladshablinsky:ds_last_commit_copy branch from 94f54d4 to 268d2ef Aug 5, 2016

@vladshablinsky vladshablinsky force-pushed the vladshablinsky:ds_last_commit_copy branch from 268d2ef to 3d9db36 Aug 6, 2016

vladshablinsky added some commits Aug 6, 2016

Update upgrade/outdated methods for head versions
Introduce `--fetch-HEAD` option. Without this option upgrade and
outdated never fetch latest upstream commit to detect if HEAD is
outdated -- tabs are used instead. However, if option is passed,
we fetch commit from upstream, which is more time consuming,
but we can be sure that version is up-to-date or outdated.

@vladshablinsky vladshablinsky force-pushed the vladshablinsky:ds_last_commit_copy branch from 3d9db36 to 0a6d73c Aug 6, 2016

@xu-cheng xu-cheng closed this in 072e5df Aug 6, 2016

@xu-cheng

This comment has been minimized.

Copy link
Contributor

xu-cheng commented Aug 6, 2016

Thanks @vladshablinsky. Congratulation on finishing the GSoC project.

@UniqMartin

This comment has been minimized.

Copy link
Contributor

UniqMartin commented Aug 6, 2016

Great work! 🎉

@MikeMcQuaid

This comment has been minimized.

Copy link
Member

MikeMcQuaid commented Aug 8, 2016

Great work on this project @vladshablinsky!

@Homebrew Homebrew locked and limited conversation to collaborators May 3, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.