Consolidating .NET GitHub repos #119

Open
stephentoub opened this issue Aug 14, 2019 · 0 comments
stephentoub commented Aug 14, 2019

To discuss these plans, please comment on the corresponding issue at https://github.com/dotnet/coreclr/issues/26175.

Over the next few months, we plan to consolidate several of the foundational repositories of .NET Core, including dotnet/coreclr and dotnet/corefx.

.NET’s repository structure on GitHub was initially created in a fairly fine-grained manner, with the aim of enabling runtime agility and increased developer productivity. However, this separation has led to a variety of complications for contributors and maintainers alike. For example:

  • Confusion about where issues should be opened. It’s not always clear in which repo an issue should be reported, complicated by the fact that some implementations actually span repositories (e.g. a type might be exposed through a reference assembly and have its tests in corefx but actually be implemented in System.Private.CoreLib in coreclr).
  • Difficulty sharing source. System.Private.CoreLib is intricately connected to the rest of the runtime, and thus has lived with the runtime code. That, however, means that although corefx has been intended as the place to share as much core library code as possible across any underlying runtime, we’ve needed to keep a significant body of code in System.Private.CoreLib, which in turn has needed to be “mirrored” to each runtime repo (coreclr, corert, mono) that consumes it as source. Similarly, corefx has required access to much of that source, for example to use the same interop DllImports that are employed elsewhere in the runtimes. The automatic mirror that shares this source needs to be maintained, and even when it has been, it has often lagged. It has also complicated developer processes: validating a change in one repo generally requires manually mirroring the code to other local repos to ensure that merging into one repo won’t break the others.
  • PRs spanning multiple repos. Because the runtime, CoreLib, and the core libraries are all intricately linked, changing runtime behaviors, adding new APIs, or changing various build processes (e.g. improving static analysis) often requires multiple PRs carefully staged across time and multiple repos. For example, to add a new method to a type like Dictionary<>, a developer must first make the source changes in her local coreclr repo and the test and reference assembly changes in her local corefx repo. When satisfied with the fix, the developer must submit a PR to coreclr as well as a PR to corefx, the latter of which will fail. Eventually the coreclr PR will be merged, and the changes will mirror to the other runtimes that also require the update. Eventually those mirror PRs will be merged, and builds will be created containing the fix for each runtime. At some point later those builds will be consumed into the corefx repo, after which point the original corefx PR can be re-validated and eventually merged. That’s the best case; things get more complicated in situations where there are bidirectional dependencies.
  • Building an installable runtime. Self-hosting a custom built .NET requires intricate knowledge of how all the repos work and interact. A developer can’t just clone a single repo, make a desired change, and easily produce an installer.
  • Consistency. One goal we had for more fine-grained repos was to enable isolation and independence for teams working within each repo’s confines, but this has led to non-trivial duplication of effort on things like build systems and CI, and then the resulting lack of consistency as each system ends up diverging from the others.

The issues go beyond the runtime. For example, the ASP.NET maintainers and community did a great job in the past year or so consolidating from ~55 repos down to ~5 repos, but that’s still more repos than is desirable, leading to similar issues as cited above for the runtime. On top of that, these ASP.NET repositories are in the aspnet GitHub org, which adds an additional set of issues, for example:

  • Complications moving issues between repos. If a developer opens an issue in aspnet/aspnetcore and it’s determined that the cause of the issue is actually in dotnet/corefx, there is no GitHub mechanism to enable easily moving that issue across the aspnet to dotnet organizational boundary.
  • Permissions. Each organization’s permissions end up needing to be managed and maintained separately.

The issues extend into tooling as well. For example, we currently have multiple repositories that all logically make up the dotnet CLI, but actually creating a working installer spans multiple repos.

Plan

To address these issues, we’re planning to make some changes to our repository structure:

  • dotnet/platform. We plan to combine dotnet/coreclr, dotnet/corefx, dotnet/corert, dotnet/core-setup, and the relevant portions of mono/mono into a new dotnet/platform repo. Everything needed to build and produce the Microsoft.NETCore.App shared framework will be in this repo. We will no longer suffer from the complications of source mirroring, and features like changing runtime behavior or adding an API will no longer require a complicated dance across multiple repos.
  • dotnet/aspnetcore. We plan to move the existing aspnet/aspnetcore repository into the dotnet organization. Along with that, we aim to combine a variety of the other aspnet repositories into aspnetcore, such as aspnet/blazor. Whether repos like entityframeworkcore stay separate or are combined remains an open question. A goal is that, just as dotnet/platform will be responsible for the creation of Microsoft.NETCore.App, dotnet/aspnetcore will be responsible for the creation of Microsoft.AspNetCore.App.
  • dotnet/cli. We plan to combine dotnet/toolset and dotnet/sdk into the dotnet/cli repo.
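
As a rough illustration only, the mapping described in the plan above could be written down as a small lookup table. The names CONSOLIDATION_PLAN and target_repo are hypothetical and not part of any .NET tooling; mono/mono is left out on purpose, since only portions of it are copied and the repo itself stays where it is.

```python
from typing import Optional

# Hypothetical table derived from the plan above: new repo -> repos folded into it.
# mono/mono is intentionally omitted: only the relevant portions are copied,
# and the mono/mono repo itself continues to exist.
CONSOLIDATION_PLAN = {
    "dotnet/platform": [
        "dotnet/coreclr",
        "dotnet/corefx",
        "dotnet/corert",
        "dotnet/core-setup",
    ],
    "dotnet/aspnetcore": [
        "aspnet/aspnetcore",
        "aspnet/blazor",
    ],
    "dotnet/cli": [
        "dotnet/toolset",
        "dotnet/sdk",
    ],
}


def target_repo(old_repo: str) -> Optional[str]:
    """Return the consolidated repo an old repo is planned to move into, if any."""
    for new_repo, sources in CONSOLIDATION_PLAN.items():
        if old_repo in sources:
            return new_repo
    return None  # not covered by the plan as announced


if __name__ == "__main__":
    for repo in ("dotnet/corefx", "dotnet/sdk", "mono/mono"):
        print(repo, "->", target_repo(repo))
```

Repos not listed in the plan, or whose fate is still an open question (e.g. entityframeworkcore), simply return None here.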

FAQ

Why are we merging repos?

We believe we can significantly improve several aspects of .NET on GitHub by combining repos, benefiting both maintainers and contributors. These improvements will manifest in a variety of ways, such as in better issue management, much easier models of contribution, and easier and faster ways to build and install the resulting bits.

What will happen to the aspnet org?

The aspnet org as a separate entity is legacy and artificial. In time we hope to absorb it into the dotnet org and sunset the aspnet org.

Will the coreclr and corefx repos go away?

With the help of GitHub, we plan to migrate all issues from these repos to the new dotnet/platform repo, and then lock down creation of additional issues, such that the repos will no longer be used for issue management. We will also inhibit the creation of PRs to the master branch, which will effectively become an archive for read-only review of history. In this sense, these repos will be archived, and no active development will happen with them. However, we plan to continue servicing previous .NET Core releases out of these repos, so the various release branches (e.g. release/2.1, release/3.0, etc.) will continue to see (limited) activity.

Will the mono repo go away?

No. mono/mono contains the source for the full mono stack and will continue to live on happily. We will simply be moving the managed source associated with System.Private.CoreLib and copying the relatively small amount of source that makes up the native mono runtime to dotnet/platform. We may choose to then use some mirroring technology to keep the runtime copy in sync (this will, however, not suffer from the same mirroring issues we currently experience, as we would not be mirroring between two components required to build the same binaries), or we may choose to let them diverge and manually sync only those changes deemed relevant to both implementations.

Will the corert repo go away?

Yes. We plan to retire/archive the corert repo. Some of the technology in the corert repo will be migrated to the master branch of dotnet/platform, where it will be productized as part of .NET. Other portions of the corert repo will be migrated to feature branches of dotnet/platform, where the experimentation can continue. In this way, we will use feature branches to continue experimenting with the corert technology, while making it easier to share portions with its shipping counterpart and also graduate functionality into master if/when it’s ready.

Does this mean there will be a single repo for all of .NET?

No. We will be reducing the number of repos that contribute to .NET, but currently we do not believe that going all the way down to one is the right answer.

Doesn’t this mean that issue and PR tracking will now be overwhelming?

It is already the case that the vast majority of issues in a given repo are not relevant to any individual developer, and with several thousand open issues in each repo and on the order of a hundred open PRs in each, we already need systems (e.g. labels) to successfully manage issues and PRs. As such, we don’t believe the merging will have a significant impact on this aspect of developer productivity. If it turns out to have an unexpectedly large negative impact, we will work with the community to find ways to mitigate the problem. However, there are already a multitude of successful open source projects on GitHub with at least an order of magnitude more issues.

What will happen to existing issues?

With the help of GitHub functionality, we plan to migrate all issues from old repos (e.g. dotnet/coreclr, dotnet/corefx) to the new repos (e.g. dotnet/platform). We may also use this as a forcing function to revisit stale issues and either close those that are no longer relevant or reinvigorate those that demand more immediate attention.

What about git history?

In general, we plan to keep history, such that history from each constituent repo will be a part of the new repo. However, we have made some mistakes in the past (e.g. large binaries, multitudes of automated PRs for flowing bits and source between repos, etc.), and we plan to rewrite history to correct those mistakes wherever possible and impactful. Some rough calculations suggest this could end up significantly reducing the size of the repos as well as the time it takes to clone, which should not only help developers approaching the project but also CI. This will end up meaning that SHAs may be different in the new repo than they were in the old repo; as previously mentioned, however, any references to the old SHAs in the old repos will continue to work, as those repos will remain accessible.
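
To see why rewriting history changes SHAs, here is a minimal sketch of how git derives object IDs under its default SHA-1 object format. It is a simplification, not how the migration itself will be done: the helper names are hypothetical, a blob ID stands in for a real tree ID, and the commit body omits the author and committer lines a real commit carries. The point is only that a commit's ID covers its parent's ID, so changing an early commit changes every descendant.

```python
import hashlib
from typing import Optional


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object the way git does: SHA-1 over '<kind> <size>\\0' + body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree_id: str, parent_id: Optional[str], message: str) -> str:
    """Simplified commit hash (assumption: author/committer lines are omitted)."""
    lines = [f"tree {tree_id}"]
    if parent_id is not None:
        lines.append(f"parent {parent_id}")
    body = "\n".join(lines) + "\n\n" + message + "\n"
    return git_object_id("commit", body.encode())


# Stand-ins for tree IDs: a first snapshot with and without a large binary,
# and a second snapshot that is identical in both histories.
snapshot_with_binary = git_object_id("blob", b"source code" + b"\x00large binary")
snapshot_clean = git_object_id("blob", b"source code")
snapshot_next = git_object_id("blob", b"source code v2")

# Original history: the first commit contains the binary.
old_first = commit_id(snapshot_with_binary, None, "Initial commit")
old_second = commit_id(snapshot_next, old_first, "Second commit")

# Rewritten history: the binary is removed from the first commit only.
new_first = commit_id(snapshot_clean, None, "Initial commit")
new_second = commit_id(snapshot_next, new_first, "Second commit")

if __name__ == "__main__":
    print("first commit changed: ", old_first != new_first)
    print("second commit changed:", old_second != new_second)
```

The second commit's content is the same in both histories, yet its ID differs because its parent's ID differs.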

Will I still show up in the contributor list due to previous contributions?

Yes. We plan to merge all such history.

Will this break debugging with SourceLink?

No. The existing repos will continue to be accessible, and commit SHAs there will remain unchanged.

Might plans change?

Sure. Part of the goal of posting this announcement is to hear from you, hear about additional benefits you're excited about, and hear about additional concerns we may not have considered. We will include such ideas in our planning and course correct as necessary.

Discussion

To discuss these plans, please comment on the corresponding issue at https://github.com/dotnet/coreclr/issues/26175.

@dotnet dotnet locked and limited conversation to collaborators Aug 14, 2019