New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Find a place for `/pkg` #32989

Open
vieux opened this Issue May 3, 2017 · 15 comments

Comments

7 participants
@vieux
Copy link
Collaborator

vieux commented May 3, 2017

In an effort of splitting Moby and Docker, we are reorganizing the repo to avoid confusion (see #32871)

We decided to not move some of the packages at first, so we won't break projects depending on those.

Item 1 - Move pkgs with single consumers to a subpackage of that consumer

(see https://gist.github.com/dnephin/35dc10f6b6b7017f058a71908b301d38)

  • github.com/docker/docker/pkg/aaparser
  • github.com/docker/docker/pkg/broadcaster
  • github.com/docker/docker/pkg/devicemapper
  • github.com/docker/docker/pkg/filenotify
  • github.com/docker/docker/pkg/gitutils (@dnephin #33477)
  • github.com/docker/docker/pkg/jsonlog (move or duplicate RFC3339NanoFixed first)
  • github.com/docker/docker/pkg/listeners
  • github.com/docker/docker/pkg/loopback
  • github.com/docker/docker/pkg/namesgenerator
  • github.com/docker/docker/pkg/pidfile
  • github.com/docker/docker/pkg/platform
  • github.com/docker/docker/pkg/pubsub
  • github.com/docker/docker/pkg/registrar
  • github.com/docker/docker/pkg/tailfile
  • github.com/docker/docker/pkg/tlsconfig
  • github.com/docker/docker/pkg/truncindex

Item 2 - Split packages with poor cohesion

@vieux vieux added the area/project label May 3, 2017

@vieux vieux added this to the splitting_moby_and_docker milestone May 3, 2017

@vieux vieux referenced this issue May 3, 2017

Open

[META] Splitting moby and docker #32867

4 of 14 tasks complete
@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented May 4, 2017

Here's my proposal for dealing with pkg/:

Summary

github.com/moby/moby/pkg should be split up as following:

  1. packages that only have a single consumer should be moved to a sub-package of that consumer
  2. packages with heavy coupling should be grouped together and moved to a new repo
  3. packages that have poor cohesion should be split up and moved to one of the above locations (new repo or single consumer)
  4. in a few rare cases parts of very small packages should be duplicated between the two consumers

The full list of moves is rather long, so I've captured it in this gist: full pkg moves proposal. This gist includes the full list of pkg dependencies and consumers, as well as the script I used to generate the list.

New repos

Below is a rough plan for the new repos that will be created. This may change as we work on the split, but it should be approximately on this scale. These are the github urls, but are not necessarily the "import urls". We will likely use our own urls with a service like http://labix.org/gopkg.in. These repo names are purely a suggestion and can easily be changed.

  • github.com/docker/plugins
  • github.com/moby/go-archive
  • github.com/moby/go-fsutils
  • github.com/moby/go-httputils
  • github.com/moby/go-ioutils
  • github.com/moby/go-sync
  • github.com/moby/go-random
  • github.com/moby/go-system
  • github.com/moby/go-testutils

(Note: fsutils and archive might be combined)

Many of these would be created by combining multiple pkg/ packages.

Others

16 packages will be moved to their single consumer.
4 packages need to be split up before they are moved.
3 packages are small enough they can be copied to where they are used
3 packages will be moved to existing repos

@unclejack

This comment has been minimized.

Copy link
Contributor

unclejack commented May 4, 2017

@dnephin What would be the purpose of having the code split up into so many repositories? There are some other projects out there which make the life of a potential contributor a nightmare. The more processes and procedures are put in place to make a single contribution, the less likely it is for someone to go through the process to make that contribution.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented May 4, 2017

The general idea is that each of these new repos are functionally very different and are not called from the same components. Any pkgs with high coupling were merged into a single package to minimize the overhead of contributing. We're going from 54 ./pkg packages down to 10 repos.

It should be extremely unlikely that any contribution to a moby/moby component would need to modify more than one of these. In many cases not a single one of these packages would be modified. If you see any cases where these packages would have to be modified together, please do let me know, and I can update my proposal.

My rational for not moving everything into a single "random-pkg-stuff" repo is that it would actually be considerably worse for contributors. If multiple PRs require modification to the same repo it is very difficult to coordinate the vendoring of those changes. If we had every pkg in a single repo then the frequency of conflicting changes would be much higher.

A single repo of random utilities is also difficult to describe, and has no clear goal or purpose. This makes it hard to find owners for the entire repo, and results in quality issues over time.

I think we have to find a good balance between too many repos, and too few. I think the best way to do that is to look at package dependencies and consumers, and group things together based on their coupling to other packages. That is what I've done here.

@cpuguy83

This comment has been minimized.

Copy link
Contributor

cpuguy83 commented May 5, 2017

system, mount, and fs seem fairly related.

Maybe instead of github.com/moby/go-locker maybe a more generic repo for concurrency primitives?

@cpuguy83

This comment has been minimized.

Copy link
Contributor

cpuguy83 commented May 5, 2017

btw, I wrote locker and always hated the name, just no one suggested a different one.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented May 5, 2017

Ya go-locker is the smallest of them. I couldn't find any other packages for concurrency primitives. I guess over time we could add more, so renaming it to something more general sgtm.

Combining system, mount, and fs could work, although there are also dependencies between fs and archive I think.

This proposal was just an approximation, as the work is done to actually split it out we might discover a better fit, but I think the general plan remains the same.

@thaJeztah thaJeztah added the roadmap label May 5, 2017

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented May 8, 2017

I would suggest that we favor using the internal package rather than assuming broad reuse of the packages within docker. Taking a package from being useful in Docker to broad use cases takes a lot of time and consideration. Often, the goals of making an independent package and separating docker functionality might be at odds.

A great example is pkg/httputils. There are really only a few exports that are just utility functions that are specific to the way docker uses them.

Dare I invoke this relationship, but I fear this effort will result in a serious case of leftpad-itis.

If we do find that a package is broadly useful, we should create a process for identifying those candidates and eliminate other options before going this route.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented May 8, 2017

The goal of this effort is support breaking out components. Any pkg that is only used by a single component could be moved to an internal, or some other subpackage. I've categorized those as "16 packages will be moved to their single consumer". I don't think these are really "internal" worthy, but it could work if we really want to hide them.

For the other 38 pkg internal will not work. They are used by multiple components.

I agree many of these packages will have a suboptimal interface initially.

Trying to refactor everything right now will greatly delay the work to split out components. Instead of trying to fix everything now we can improve the interface of these packages over time.

That said, I don't think we want to rush this and create 10 new repos right away. This is just a roadmap for when we hit shared components. Maybe as we extract things we'll realize that we don't need some of these repos.

Since you mentioned pkg/httputils, you'll see from my proposal that I have suggested splitting it up into the parts that are only used by one consumer, and the parts that are shared with multiple packages.

The go-httputils repo will need to be shared between whatever component provides the http api for backwards compatibility and the docker/cli repo, so we know there will be at least two consumers.

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented May 9, 2017

Creating components is the process of finding the correct interfaces. Splitting these out into packages without refactoring is going to create unforeseen problems.

If we can, it would be best to eliminate as many of these packages as possible. If we do find there are real shared patterns, we can propose and design the package with respect to requirements.

Either way, one less package for you to worry about: #33097.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented May 9, 2017

Creating components is the process of finding the correct interfaces

Yes the interface for the component, not the interface for every library used by a component.

If we can, it would be best to eliminate as many of these packages as possible.

Yes, that is part of the plan.

Either way, one less package for you to worry about: #33097.

Not exactly. As you can see in the plan go-random was going to be created from pkg/random and pkg/stringid (and possibly pkg/namegenerator). So this change is moving the code in the same direction as the plan, but we still need a repo for pkg/stringid, which is used in what will becoming a few different components.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented May 9, 2017

Adding some more details based on a discussion during slack standup.

This proposal is only a guideline to follow as we hit pkg that are needed by components. We should prioritize the split out of components first, and address pkg libraries as part of each component extraction.

Before creating any new repos the following steps should be performed:

  1. check for functions with single callers and move them to the consuming package or a subpackage of it
  2. if a function is sufficiently small, and has a small number of consumers, duplicate it
  3. finally, if 1 and 2 are not appropriate, either create a proposal for a new repo with a clear purpose, or add it to an existing repo if the functionality matches the purpose of the repo

Given these steps, the following work can be done at any time:

  • 16 packages moved to their single consumer
  • 4 packages split up before they are moved
  • 3 packages can be copied to where they are used

These packages and operations are listed in the full proposal

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented May 10, 2017

@dnephin Consider moving mount/mountinfo to containerd dependencies. We can maintain them there and that won't require another package. Please see containerd/containerd#813.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented May 10, 2017

Perfect, that one was an outlier so moving it to containerd would be great. I've updated the proposal and the summary.

@vdemeester

This comment has been minimized.

Copy link
Member

vdemeester commented Aug 8, 2017

pkg/devicemapper is going to be a little bit a pain to move as it seems to be used a bit in some plugins and stuff (https://github.com/search?utf8=%E2%9C%93&q=%22github.com%2Fdocker%2Fdocker%2Fpkg%2Fdevicemapper%22&type=Code)

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Aug 8, 2017

@vdemeester It looks like most of those references are forks of docker or vendored dependencies. Are there any that have direct references?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment