Fork Support

olly edited this page Aug 7, 2010 · 26 revisions

We’re going with subdomains! Read more here, and hop in #gemcutter on Freenode or check out issues if you want to help with implementing it.

We’re trying to figure out how to deal with forked or ‘namespaced’ gems. Add your own ideas, or hop in #gemcutter on Freenode if you’d like more immediate feedback.

Some more links/research on this:

Options

Gem Name: The current way with GitHub Gems. Basically, tack the ‘namespace’ onto the gem name but leave everything else as is.
Namespace: Modify the gemspec to contain a ‘namespace’ field.
Source: Allow users to register a subdomain of gemcutter.org and push forked gems to it. The root domain remains the trusted/real repository.
Version: Shove the ‘namespace’ into Gem::Version, so something like gem 'jekyll', 'qrush-1.0.0'

The Lineup

Evaluating some pros/cons of each solution.

Options Gem Name Namespace Source Version
Pros
Doesn’t change workflow for the primary use case X X X X
Ability for gems to specify forks as dependencies X X X X
Cons
Requires modifying Rubygems X X X
Cumbersome installation of forks X
No polymorphism of gems (i) X
Won’t be accepted into Rubygems (ii) X

(i) If another gem or a Rails app explicitly requires qrush-gemcutter, then only that gem satisfies the dependency rather than any ‘gemcutter’ fork that supports the same API.
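
To make footnote (i) concrete, here is a minimal sketch using RubyGems' own `Gem::Dependency`. The version number and the `drnic-gemcutter` fork name are made up for illustration; the point is only that a dependency is matched by exact name, so a fork published under a different name never satisfies it.

```ruby
require "rubygems"

# A dependency declared against the fork's name (version requirement is illustrative).
FORK_DEPENDENCY = Gem::Dependency.new("qrush-gemcutter", ">= 0")

# Matching is done on the exact gem name plus the version requirement,
# so API compatibility between forks plays no part in it.
%w[qrush-gemcutter gemcutter drnic-gemcutter].each do |gem_name|
  satisfied = FORK_DEPENDENCY.match?(gem_name, "0.1.0")
  puts "#{gem_name} satisfies the dependency: #{satisfied}"
end
```

Only the gem named exactly `qrush-gemcutter` is reported as satisfying the dependency.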

Opinions

Feel free to add your thoughts below this. -@qrush


Another issue worth considering is whether to encourage or discourage gem forks in the first place. Ruby itself lacks namespacing support, so using forked gems in these ways can make it impossible to load other gems that depend on the non-fork version or on a different fork. A much better option, if the code is not going to get pulled back into the original code base, is to release a new gem with its code in a new namespace. Opening up repositories to reputable contributors (following clear rules, like writing tests / keeping patches clean) should also be encouraged to aid in faster release cycles where appropriate. Faster release cycles also require a keener understanding of versioning semantics. -@raggi


raggi: I’m all for this, but I feel that GitHub has given the Ruby community a new model of development that can’t easily be backed out of. The ability to “fork” a gem and publish it seems like a feature that has worked well since gems.github.com started, and if they start preferring to point users towards gemcutter instead we’ll need to address this issue somehow. -@qrush


qrush: Alas, I’ve been against that from the start because of this issue. @tmm1 can attest to this, since I went off on a massive rant about it at him a long time ago. Moving forward I think we need to make it easier for people to do the right thing, rather than maintain ease for the wrong thing – but of course, that’s what this discussion is about. My point is, I think there are two aspects under discussion: first, what is the right way? And second, how do we help people do it the right way more easily? -@raggi


I’ve added “no polymorphism between forks with the same API” to the cons list. – @drnic


(ii) The version proposal above won’t be accepted into RubyGems. However, alphabetic portions can be appended to versions right now, which marks the gems as “prerelease”. (gem "rake", "0.8.7.drbrain") – @drbrain
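
A small sketch of what footnote (ii) means in practice, using `Gem::Version` from a reasonably recent RubyGems. The two version strings come from this page; everything else is just illustration.

```ruby
require "rubygems"

# The literal proposal puts the fork owner in front of the version number.
# RubyGems treats that string as malformed.
NAMESPACED_IS_VALID = Gem::Version.correct?("qrush-1.0.0") ? true : false

# A trailing alphabetic segment is accepted and flags the version as a prerelease.
FORKED_VERSION = Gem::Version.new("0.8.7.drbrain")

puts "qrush-1.0.0 is a valid version:   #{NAMESPACED_IS_VALID}"
puts "0.8.7.drbrain is a prerelease:    #{FORKED_VERSION.prerelease?}"
puts "0.8.7.drbrain sorts before 0.8.7: #{FORKED_VERSION < Gem::Version.new('0.8.7')}"
```

One consequence worth keeping in mind: because the prerelease sorts below 0.8.7, a plain `>= 0.8.7` requirement would not pick up the fork.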


My rubygems+github proposal can be considered similar to the namespace proposal in that the “namespace” is contained in the signing keys. – @drbrain


My thought is that a dependency can have a ‘host’ added to it, like s.dependency['gemname', 'version', 'host']
Then the dependencies should at least resolve sanely :) @rogerdpack


I recently had a “huge” discussion about this kind of issue with @raggi. I raised an issue for gems to be able to specify their source for dependencies.

I’ve actually thought some more about it, and “come around” to some extent. The ‘source’ of the gem is not part of the gem dependency. The ‘namespace’ (and I use the term namespace to mean any of the above solutions) might be, but the place that the gem was downloaded from (i.e. the source) is not. rogerdpack’s suggestion (and mine) certainly seems to be a nice way to go at first, but is not actually correct (Note: I actually got this idea from Rails’ `config.gem` syntax, which has also misunderstood the problem). What is required is a way to differentiate forked gems and specify a specific fork. I no longer think that a single namespace is a good idea. As qrush stated, GitHub has opened a new way of thinking, and if people want a specific ‘namespace’ of a gem then they should specify it. If you want the publicly available (normal) version then don’t supply a ‘namespace’.

Looks like I might be advocating namespacing … ;)

Definitely don’t want to go back to single ‘namespacing’ though for sure.

Thanks for opening discussion. @adamsalter


The more I think about it the more I like the suggestion “Modify the gemspec to contain a ‘namespace’ field”. If rubyforge (or gemcutter) were to stay the canonical (i.e. default) gem source, both `gem install activerecord` and `gem install rails/activerecord` would still work. Only gem sources that hosted multiple gem forks would have to change anything in the short term. Similarly for gem dependencies.
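
As a rough sketch of how a `namespace/name` reference could be read, assuming nothing about how RubyGems would really implement it: the helper name `parse_gem_reference` is hypothetical, and a missing namespace is taken to mean "the canonical gem from the default source".

```ruby
# Hypothetical helper, not part of RubyGems: split an install argument such as
# "rails/activerecord" into [namespace, name]. A bare name has a nil namespace.
def parse_gem_reference(reference)
  first, second = reference.split("/", 2)
  second.nil? ? [nil, first] : [first, second]
end

%w[activerecord rails/activerecord adamsalter/activerecord].each do |reference|
  namespace, name = parse_gem_reference(reference)
  puts "#{reference}: namespace=#{namespace.inspect} name=#{name.inspect}"
end
```

With a shape like this, existing bare names keep working unchanged, and only sources that host several forks would need to look at the namespace part.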

Also wanted to point out that one side effect of forking (one we are already seeing) is that it becomes quite difficult to debug `require 'activerecord'`. This is not a new problem, since any two gems could previously have had similar file names, but forking obviously multiplies the problem (a good thing in the long term, but a hassle at the moment ;) With namespacing, `require 'rails/activerecord'` is not adequate for obvious reasons. I’m not sure the require syntax or functionality should change, but the problem then is how to manage the load paths logically. Otherwise I can foresee multiple gems with multiple namespace dependencies quickly conflicting with each other and causing unforeseen errors. E.g. I have two gems installed, ‘adamsalter/activerecord’ and ‘rails/activerecord’. My gem uses a heavily modified version of ‘activerecord’, and the default rails gem requires ‘activerecord’: does my gem fork break rails? (Note that there’s nothing stopping this already; namespacing is just bringing the problem to the fore.)

@adamsalter


If a dependency has a host attached to it and the host goes down the gem can no longer be installed. Hostnames aren’t durable enough (over the course of years) to be used for namespaces.

Kernel#require shouldn’t be modified to support forks. It’s Kernel#gem’s job to manage the load path.

Simply adding namespace support to RubyGems is not good enough. It’ll quickly run into the same problem platforms had before platform auto-install was added. RubyGems needs to be able to resolve across namespaces to do the right thing. (Install the newest version that the user trusts.)

@drbrain
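
A minimal sketch of “install the newest version that the user trusts”, assuming each available gem is tagged with a namespace. The `Candidate` struct, the `newest_trusted` helper, the version numbers and the `stranger` namespace are all hypothetical; only `Gem::Version` is real RubyGems API.

```ruby
require "rubygems"

# Hypothetical record for one available gem in some namespace.
Candidate = Struct.new(:namespace, :name, :version)

# Pick the highest version of `name` among the namespaces the user trusts.
# Returns nil when no trusted namespace offers the gem.
def newest_trusted(candidates, name, trusted_namespaces)
  candidates
    .select { |c| c.name == name && trusted_namespaces.include?(c.namespace) }
    .max_by { |c| Gem::Version.new(c.version) }
end

CANDIDATES = [
  Candidate.new("rails", "activerecord", "2.3.4"),
  Candidate.new("adamsalter", "activerecord", "2.3.5"),
  Candidate.new("stranger", "activerecord", "2.3.10")
].freeze

pick = newest_trusted(CANDIDATES, "activerecord", %w[rails adamsalter])
puts "#{pick.namespace}/#{pick.name} #{pick.version}"
```

Here the untrusted `stranger` fork is ignored even though it carries the highest version, so the choice falls to the newest gem among the trusted namespaces.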


How do folks feel about the approach yum repositories use? You may have many repositories, and you can have include/exclude rules for packages based upon repository (yum.conf). You may also set priorities by saying that repository B may not overwrite packages in other repositories (yum-priorities). I’m not suggesting this as a solution, just a model to look at for ideas.

Along with this, instead of namespacing gems, the repository becomes the namespace. If I wanted to fork gems and publish them to gemcutter, or anywhere else, I would have my repository at repos.gemcutter.org/copiousfreetime and I could have all my forks there. I then give it a higher priority than gemcutter.org, which has a higher priority than rubyforge.org.

@copiousfreetime
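
The repository-as-namespace idea could be sketched roughly like this. The `Repository` struct, the `resolve_source` helper, the priority numbers and the gem lists are all invented for illustration, and in this sketch a larger number simply means a higher priority.

```ruby
# Hypothetical model of a gem source with a user-assigned priority.
Repository = Struct.new(:url, :priority, :gems)

# Return the highest-priority repository that offers the gem, or nil if none does.
def resolve_source(repositories, gem_name)
  repositories
    .select { |repo| repo.gems.include?(gem_name) }
    .max_by(&:priority)
end

REPOSITORIES = [
  Repository.new("rubyforge.org", 10, %w[serialport rake]),
  Repository.new("gemcutter.org", 20, %w[serialport rake]),
  Repository.new("repos.gemcutter.org/copiousfreetime", 30, %w[serialport])
].freeze

%w[serialport rake].each do |gem_name|
  puts "#{gem_name} would come from #{resolve_source(REPOSITORIES, gem_name).url}"
end
```

The fork in the personal repository wins for `serialport`, while gems that have not been forked still fall through to the next repository in priority order.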


Some thoughts on namespacing versus source

Namespacing:

Install

gem install toholio-serialport
gem install jeffrafter-serialport
gem install serialport

Gem Dependencies

...

Use

config.gem "jeffrafter-serialport", :lib => "serialport" #rails
gem("jeffrafter-serialport") # is there something I need to worry about here
require 'serialport' # which one does it choose
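
On the “which one does it choose” question, a sketch of plain Ruby load-path behaviour may help. It builds two throwaway directories standing in for the two forks, each with a file of the same name, and shows that `require` loads the copy whose directory comes first in `$LOAD_PATH`. The file name `forkdemo_serialport` and the helper are made up so the demo cannot clash with a real gem; RubyGems adds its own lookup for gems that have not been activated yet, which this sketch does not model.

```ruby
require "tmpdir"
require "fileutils"

# Returns the owner of whichever stand-in fork a plain `require` ends up loading.
def owner_chosen_by_require
  Dir.mktmpdir do |root|
    lib_dirs = %w[toholio jeffrafter].map do |owner|
      lib = File.join(root, "#{owner}-serialport", "lib")
      FileUtils.mkdir_p(lib)
      File.write(File.join(lib, "forkdemo_serialport.rb"),
                 "FORKDEMO_SERIALPORT_OWNER = #{owner.inspect}\n")
      lib
    end

    begin
      # unshift prepends, so the jeffrafter directory ends up ahead of toholio's.
      lib_dirs.each { |dir| $LOAD_PATH.unshift(dir) }
      require "forkdemo_serialport"
    ensure
      lib_dirs.each { |dir| $LOAD_PATH.delete(dir) }
    end

    Object.const_get(:FORKDEMO_SERIALPORT_OWNER)
  end
end

puts "require picked the copy from: #{owner_chosen_by_require}"
```

This is why activating the intended fork with `gem(...)` before the `require` matters: whichever lib directory is first on the load path is the one that gets loaded.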

Different sources (ignoring version differences):

Install

# Note, you can only have one serialport gem installed at a time (except for version difference), so these just overwrite each other
gem install serialport --source=toholio.gemcutter.org
gem install serialport --source=jeffrafter.gemcutter.org
gem install serialport

Confusion will ensue if someone screws up and does something like

gem sources --add thoughtbot.gemcutter.org

Gem Dependencies

rubygems doesn’t keep track of where a gem comes from… you can only depend on the name
this is considered an advantage, but will be confusing if you are developing a gem and need both installed

Use
# rails, but keep in mind it will only install the gem from there if it isn’t already installed (regardless of source)
config.gem "serialport", :source => "toholio.gemcutter.org"

# as you can only have one serialport gem installed at a time there is no ambiguity here
gem("serialport") # is there something I need to worry about here
require 'serialport' # which one does it choose

@jeffrafter