Unified Build design - Managing multiple SDK bands in the VMR #13720

Merged
merged 48 commits into from Jul 13, 2023
48 commits
88ba649
Port first half into markdown
premun May 18, 2023
6b4d4b3
Port second half without tables
premun May 18, 2023
1f37d9e
Add a list of repos
premun May 22, 2023
1fb81bd
Use mermaid for everything
premun May 23, 2023
9da28b6
Port new code flow diagrams
premun May 23, 2023
86f6ecb
Add parallel sections
premun May 23, 2023
f56f347
Remove initial state
premun May 23, 2023
f0b15c0
Small fixes
premun May 23, 2023
7dc1634
Describe band life cycle
premun May 24, 2023
3451a66
Import missing tables
premun May 24, 2023
3855bbe
Add proposal docs
premun May 24, 2023
9022607
Add Maestro flow changes
premun May 24, 2023
21b6d35
Fix a note
premun May 24, 2023
82fd9c9
Move code flow diagrams
premun May 25, 2023
2cc9718
Add comments for code flow diagrams
premun May 25, 2023
d1f8697
Add WIPs
premun May 25, 2023
14f0f83
Add more side-by-side
premun May 25, 2023
66b5cc4
Add more SDK branches
premun May 25, 2023
4983074
Add the new proposal
premun May 26, 2023
5b547d8
Remove v1
premun May 26, 2023
7e587a3
Port changes from the PR
premun May 26, 2023
1ae8359
Improve main document
premun May 26, 2023
060cba6
Small notes
premun May 29, 2023
edc3fbd
Merge main
premun May 29, 2023
946cc50
Update proposals with ideas around using intermediates
premun May 29, 2023
22a1c61
Add Release and Implementation complexity sections
premun May 29, 2023
60fca84
More fixes
premun May 29, 2023
0d0aa92
Improve the release section
premun May 30, 2023
0949999
More cosmetics
premun May 30, 2023
fb3aa70
Release section in proposals
premun May 30, 2023
9a5cfaf
Fix TODOs
premun May 30, 2023
7a55e96
Adjust priorities
premun May 30, 2023
ef9e7e1
Apply suggestions from Matt
premun Jun 7, 2023
fee509f
"Rolling" build
premun Jun 7, 2023
83c2cc0
Fix diagram for branches
premun Jun 8, 2023
cf3f2b7
Adjust comparison summary
premun Jun 8, 2023
755a6f4
Fill out maintenance
premun Jun 8, 2023
57ea7c4
Add a note explaining "at least 3 bands"
premun Jun 8, 2023
0947549
Explain why we want shared components only once
premun Jun 12, 2023
2b24e26
Resolve few more comments
premun Jun 12, 2023
917c1e1
Add band life cycles
premun Jun 12, 2023
d13d224
Update Documentation/UnifiedBuild/VMR-Managing-SDK-Bands.md
premun Jun 12, 2023
8d1fb7c
Intermediates -> Build output packages
premun Jun 16, 2023
d6f2d82
Mark SDK branches as the winner of the build area
premun Jun 16, 2023
dc2c797
Explain shared components
premun Jun 16, 2023
7df4aa4
Add conclusion
premun Jun 16, 2023
c847034
Fix typo
premun Jun 16, 2023
d40a1b5
Fix a typo
premun Jun 26, 2023
184 changes: 184 additions & 0 deletions Documentation/UnifiedBuild/VMR-Managing-SDK-Bands-SDK-branches.md
@@ -0,0 +1,184 @@
> Note: This is a proposal for a strategy to build, manage and release multiple SDK bands of .NET. The proposal is part of the [Unified Build](./README.md) effort. For more context about the problem this design is trying to solve see the [Managing SDK Bands](./VMR-Managing-SDK-Bands.md) document.

# Managing SDK Bands - "SDK branches" proposal

This proposal follows closely how we organize SDK band branches today. The bottom line is that we'd just keep using SDK branches in the VMR the same way we have them in other repositories. This is, in fact, what we’re currently already doing with today’s read-only VMR-lite where we synchronize the SDK branches of `dotnet/installer`.

This document describes the end-to-end process from developing to shipping multiple SDK bands using this model.

## Layout

For simplicity, let's consider we are synchronizing the repositories `dotnet/arcade`, `dotnet/runtime`, `dotnet/roslyn` and `dotnet/sdk` where `dotnet/runtime` and `dotnet/arcade` are the shared components.

The layout of files will stay almost the same as today's VMR-lite:

```sh
└── src
    ├── arcade
    ├── roslyn
    ├── runtime
    └── sdk
```

The problem with this is that each SDK branch would contain source code for all shared components, and keeping those copies synchronized across branches would be difficult.
Furthermore, we don't even want this behavior: for instance, the preview band always stays locked to the last released version of the shared components until right before the release happens.

To work around that, we'd have to make an adjustment. This adjustment would require a feature in Source Build where we could specify whether a component is built from source or restored from its build output package.
This functionality actually already exists and each repository already references its dependencies via `eng/Version.Details.xml` so that it can build inside of its individual repository.
Considering we have this capability, we'd then change the VMR contents so that the SDK branches of other bands than the first one (`1xx`) would not contain the sources of the shared components.
Member

@tmds tmds Jun 22, 2023

For source-build, it's likely maintainers will only build one SDK feature branch, and it would be nice if they can just work with a single branch from the vmr for doing so. This requires each SDK branch to include all shared components so it is self-contained.

If a maintainer would like to provide all SDK feature branches, it would be preferable to only have to build the runtime once. This could be implemented through a top-level flag causing only the sdk to be built. This will effectively build the same things as this proposal has for building under the non 1xx branches.

The proposed solution for eliminating the shared components from the non 1xx branches seems to be for dealing with:

> This will give us more flexibility such as locking down the version of the shared components in the preview band to the last released version.

I don't fully understand the issue mentioned here, but I assume it could be solved by adding the necessary things to dotnet/installer branches.

Member Author

@premun premun Jun 22, 2023

This is good feedback, thanks. We still need to discuss the "Release" section in depth but my personal expectation is that with the product becoming a bit more complex, the process of how we release and deliver the sources might be affected as it will need to address these new options.
The solution to this, in my opinion, can range from having a "super VMR" which would reference the released commits of the VMR via submodules, through us preparing a special release branch in the VMR, to just compiling the release tarball for all bands. But I think we should discuss this further and propose options here.

> I don't fully understand the issue mentioned here, but I assume it could be solved by adding the necessary things to dotnet/installer branches.

This is mentioned in the parent document, in the Band life cycle section:

> For this setup, we'd say the 100th band is in servicing and the 200th band is in preview. It is important to also note that while a band is in preview, it uses the most recently released .NET runtime while the servicing band revs with the 7.0 channel.

But I can maybe pull this up into its own subsection to make it more visible.

Member

My main point of feedback is that for source-build maintainers the UX is better if branches are self-contained and can be built directly.

A key feature of the vmr is that it bundles all sources to build .NET in a single place.
That is definitely a big improvement from how we were building previously. We should aim to preserve this model.

Member

cc @dotnet/distro-maintainers

Member

@tmds tmds Jun 23, 2023

In dotnet/source-build#3524 (reply in thread) @mmitche said:

> - Currently, there is no guarantee that you can use a 1xx SDK to build the newer SDK bands. It's likely that newer SDK bands take dependencies on new features that are introduced in those bands. However, barring unexpected bugs, The N SDK should always build with N-1 of the same SDK band. So 8.0.202 with 8.0.201.
> - Currently there is no guarantee that you can use a 2xx+ SDK to build the runtime. I think it's likely that it will work most of the time, and should work with small modifications (e.g. disable a warning or analyzer), but we don't guarantee it today. The runtime will always build with N-1 of the 1xx band though. We always want to be building it with a supported toolset, and the 1xx band is supported as long as the runtime is in support.

This means that if a 2xx band were to include the runtime, the runtime would need a 1xx SDK band, and the sdk would need a 2xx band. These are conflicting build dependencies.

The current proposal respects the guarantees by having in each branch only the sources that are expected to build with their corresponding SDK band.

To be able to build a runtime, distro maintainers need to maintain a 1xx band.
It's not possible to only build and ship the latest band.

Member

@tmds The 2xx+ bands would not include the runtime. So to build the 2xx band, you'd build 1xx, which would build the 1xx SDK and the shared runtime. Then those source-built outputs would be fed into the 2xx, 3xx, etc. SDK builds. Each branch would not be independently buildable such that each included the shared components.

It is pretty important to not view the SDK and runtime as a single product unit. They're essentially separate products, where the SDK happens to redist the runtime. In addition, the shared bits must be built once for N SDK bands. Also remember that the SDK is not the only distribution mechanism of the runtime bits. The aspnetcore hosting bundle, the runtime installers/zips, etc. also redist the runtime.

To illustrate this more, let's say you made each SDK band independently buildable. If a distro decides to make available the 1xx and 2xx bands, they would check out the 1xx SDK branch, build the runtime and SDK, and then check out the 2xx branch, again building the runtime and SDK. Now you have two copies of the shared components, built from different sources. When you deploy the runtime package, you need to choose one. You make an arbitrary choice, let's say 1xx. But now for the 2xx sdk, the runtime redisted in the SDK layout (which would be used in certain build scenarios) is not the same as the one used for 1xx. This would cause at best confusion, and at worst, installation issues.

So basically we're stuck in a place where either at most one band must contain the runtime sources, or no bands contain the runtime sources, and the runtime and SDK are built separately. In either case, the output of the build that creates the runtime binaries serves as an input to other builds. We think that having the 1xx band as the home of the runtime provides the least amount of friction.

Instead, they would flow in the branches via a package dependency flow where the branches would reference the build output packages that would be built from the `1xx` branch. This will give us more flexibility such as locking down the version of the shared components in the preview band to the last released version.
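
As a rough illustration of the package dependency flow, the sketch below reads dependency entries from a fragment shaped like `eng/Version.Details.xml`. The fragment is simplified and made up for this example: the package name, version and SHA are placeholders, and `read_dependencies` is a hypothetical helper, not part of Source Build or Maestro.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up fragment in the general shape of eng/Version.Details.xml.
# The package name, version and SHA are placeholders, not real build data.
VERSION_DETAILS = """\
<Dependencies>
  <ProductDependencies>
    <Dependency Name="Example.Runtime.Package" Version="9.0.0-example.1">
      <Uri>https://github.com/dotnet/runtime</Uri>
      <Sha>0000000000000000000000000000000000000000</Sha>
    </Dependency>
  </ProductDependencies>
</Dependencies>
"""


def read_dependencies(xml_text):
    """Return (name, version, source repo URI) for every declared dependency."""
    root = ET.fromstring(xml_text)
    return [
        (dep.get("Name"), dep.get("Version"), dep.findtext("Uri"))
        for dep in root.iter("Dependency")
    ]


for name, version, uri in read_dependencies(VERSION_DETAILS):
    print(f"{name} {version} <- {uri}")
```

In a non-1xx branch, entries like this would be the only place where the shared components appear, so locking the preview band to the last release amounts to not updating these versions.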

The complete layout would then look like this:

```sh
# release/9.0.1xx branch
└── src
    ├── arcade
    ├── roslyn
    ├── runtime
    └── sdk

# release/9.0.2xx and other non-1xx branches
└── src
    ├── roslyn # references the runtime and arcade build output packages instead of sources
    └── sdk    # references the runtime and arcade build output packages instead of sources
```

To summarize the characteristics:

- The VMR has SDK branches, e.g. `release/9.0.1xx` and `release/9.0.2xx`.
- Each repository is a folder under `src/` in the `1xx` branch of the VMR.
- Each non-1xx branch of each SDK-specific repository maps to a folder under `src/` in a matching branch of the VMR.
- Each commit of the `1xx` branch produces a single runtime and a single SDK. The non-1xx branches, however, do not contain all of the code.
- Commits of the non-1xx branches produce SDKs only, and their shared components are referenced as packages built from the `1xx` branch.
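
The characteristics above can be sketched as a small model. This is only an illustration under the assumptions of the Layout example (four repositories, with `arcade` and `runtime` shared); the function names and the `"source"`/`"package"` labels are hypothetical.

```python
SHARED = {"arcade", "runtime"}  # shared components from the Layout example
ALL_REPOS = ["arcade", "roslyn", "runtime", "sdk"]


def is_first_band(branch):
    """True for the 1xx branch, which is the home of the shared components."""
    return branch.endswith(".1xx")


def branch_layout(branch):
    """Map each repository to 'source' or 'package' for a given VMR branch."""
    layout = {}
    for repo in ALL_REPOS:
        if repo in SHARED and not is_first_band(branch):
            layout[repo] = "package"  # referenced via build output packages
        else:
            layout[repo] = "source"   # folder under src/
    return layout


def build_outputs(branch):
    """What a commit of the branch produces."""
    return ["runtime", "sdk"] if is_first_band(branch) else ["sdk"]


for branch in ("release/9.0.1xx", "release/9.0.2xx"):
    print(branch, branch_layout(branch), build_outputs(branch))
```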

## Band life cycle

- **Product preview time**
The preview time is when most of the development happens and the VMR would contain a single band only. For this time, we only have the 1xx branch in the VMR and everything works the same way as now.

- **Band preview time**
The band that was created most recently and is to be released next is called the preview band. Except for the 1xx band, each preview band is locked down to use the latest released version of the shared components during its development. Since this proposal won't put the sources of the shared components in the non-1xx branches, it will be quite obvious that the dependencies come from packages.

- **Band snap**
To create a new band, it would be easiest to do the snap in the VMR, from where it would flow to the appropriate branches in the individual repositories:

1. Create the new branch based off of the current one.
E.g. `src/sdk/9.0.1xx` to `src/sdk/9.0.2xx`
2. Remove sources of shared components in the `2xx` branch. Adjust package versions and point the new band to the build output packages of shared components from the last release.
3. Configure Maestro subscriptions between new VMR bands and their individual repository counterparts.
4. If there are at least 3 bands, configure subscriptions of the currently released band to consume the build output packages of the `1xx` band.
> Note: We need at least 3 bands because the first one exists from the beginning, so the first band to leave preview is the second one, and that only happens when we snap the third one.
5. Maestro flows the changes from the VMR and creates the appropriate branches in the individual repositories.

This makes sure that the new (preview) band is locked down to use the latest released shared components and that newly released bands start getting the newest shared components built in the `1xx` branch.
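
The snap steps can be sketched as a state change on a list of bands. This is a minimal model, not an implementation: the `Band` type, the `shared_from` labels and the `snap` function are hypothetical, and the Maestro subscription setup and branch creation in the individual repositories (steps 3 and 5) are not modeled.

```python
from dataclasses import dataclass

SHARED = {"arcade", "runtime"}  # shared components from the Layout example


@dataclass
class Band:
    name: str         # "1xx", "2xx", ...
    sources: set      # folders present under src/ in this VMR branch
    shared_from: str  # "sources", "last-release" or "1xx-latest" (hypothetical labels)


def snap(bands):
    """Add a new preview band to `bands` in place and return it."""
    current = bands[-1]
    # Steps 1-2: branch off the current band, drop the shared component sources
    # and pin the new band to the last released build output packages.
    new_band = Band(f"{len(bands) + 1}xx", set(current.sources) - SHARED, "last-release")
    # Step 4: the band that just left preview (never the 1xx) starts consuming
    # the newest packages built from 1xx. This first happens with the third band.
    if current.name != "1xx":
        current.shared_from = "1xx-latest"
    bands.append(new_band)
    return new_band


bands = [Band("1xx", {"arcade", "roslyn", "runtime", "sdk"}, "sources")]
snap(bands)  # creates 2xx as the preview band
snap(bands)  # creates 3xx as the preview band; 2xx leaves preview
for band in bands:
    print(band.name, sorted(band.sources), band.shared_from)
```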

## Working with the code

The proposed layout has some problematic implications. Let's consider the following scenarios:

1. A developer needs to make a cross-repo change to `src/runtime` and `src/sdk` in the `3xx` band.
Member

@jkotas jkotas May 30, 2023

This is rare scenario. The bar for dotnet/runtime is servicing in this situation. We are not doing any dotnet/runtime feature work during servicing.

Do you have an example of a change that was done in dotnet/runtime servicing and that had to be synchronized with SDK change?

Member Author

Thanks for this feedback. I lack knowledge in what kind of changes we do and how we work with the repositories during servicing. Maybe I have been optimizing for scenarios that do not happen.

Member

@jkotas I think it's rare for us to do real feature work in shared components during servicing timeframes. However, cross repo changes, for example infra changes, do happen in servicing and may be better suited to working in the VMR. That said, it doesn't mean you can't do the VMR-only commit for 1xx, then port the SDK changes for 2xx+.

2. A distro maintainer needs to build the `3xx` band from source.

We need to make sure both of these scenarios are easy to do but the layout of the sources doesn't allow that out of the box.
Member

@jkotas jkotas May 30, 2023

The first one does not need to be easy. It is ok for it to be more complicated, on purpose.

The second one we do not have a customer for today.


It seems that to make this work, we'd need to be able to tell Source Build to easily swap between using the sources and the build output packages of the shared components.
For anyone interested in these flows, we should have a mechanism to also check out the sources and reference them during the build. There are a couple of possibilities:

1. The `src/` folder of non-1xx branches would contain submodules pointing to the original individual repositories. These would not be used in most flows but could be activated. When we'd be flowing changes from the 1xx branch, we could also change where the submodule points.
2. Have a script that would check the components out into `src/`, at the same location where they are placed in the 1xx branch. It would also create a dummy file to signal that Source Build should build the sources instead of restoring the build output packages listed in `Version.Details.xml`. The `src/` locations and the signal file would be ".gitignored". The dev would then have to backport their changes from within the folders to either the 1xx branch or the individual repositories.
3. We could just expect the individual repositories to be checked out somewhere else on the developer's disk (e.g. next to the VMR itself) and Source Build would know to find and build those instead (again through an invisible signal file, for instance).

The first option seems quite straightforward but the individual repository doesn't necessarily have to have the same contents as its counterpart in the VMR which might be problematic.
The second flow solves the scenario of .NET distro maintainers fully as we'd easily create a source tarball that would match the layout of the 1xx branch. It doesn't work well for the developer scenario though.
Member

Things are still unclear for me on what this layout looks like for the source tarball. Are there multiple tarballs, one for each SDK band? If so, how are they managed to have shared components? If not, what does the layout look like for multiple SDK bands?

Member

I want to explore having multiple tarballs. 1 runtime+SDK product, then N SDK-only products. That opens up the possibility of additional non-SDK products without implying that we will steadily expand source-build over time to always be the "whole world".

Member Author

It's a good question. There's also a question of how much is this an "implementation detail" and how much it influences the decisions made at this point.

The third option seems to be the best for the developer but doesn't solve the distro maintainer scenario fully.

It seems that we could have a mixture of `2.` and `3.` where Source Build would have a feature of "know to look elsewhere" and distro maintainers would build from a tarball assembled as described in `2.` and developers would have the option to use the sources from their local individual repositories as shown in `3.`.
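
A minimal sketch of how such a "know to look elsewhere" lookup could behave in a non-1xx branch, combining options `2.` and `3.`. Everything here is an assumption made for illustration: the signal file name `.use-local-sources`, the `resolve_component` helper and the sibling-folder convention are hypothetical and do not describe an existing Source Build feature.

```python
from pathlib import Path

SIGNAL_FILE = ".use-local-sources"  # hypothetical name of the git-ignored signal file


def resolve_component(vmr_root, component):
    """Decide how a shared component is obtained in a non-1xx branch.

    Returns ("source", path) when local sources should be built, or
    ("package", None) when the build output package should be restored.
    """
    vmr_root = Path(vmr_root)
    # Without the signal file, the default flow applies: restore the packages.
    if not (vmr_root / SIGNAL_FILE).exists():
        return ("package", None)
    candidates = (
        vmr_root / "src" / component,  # option 2: checked out or unpacked into src/
        vmr_root.parent / component,   # option 3: repository cloned next to the VMR
    )
    for candidate in candidates:
        if candidate.is_dir():
            return ("source", candidate)
    return ("package", None)
```

With this shape, a distro maintainer's tarball would place the sources under `src/` together with the signal file, while a developer would only create the signal file and keep working in a regular clone next to the VMR.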

## Code flow

To re-iterate what the planned code flow looks like for .NET 9 (with full VMR back flow) – the individual repositories only receive and send updates from/to the VMR and not between each other. A regular forward flow with changes going to the VMR only would look like this:

```mermaid
sequenceDiagram
autonumber

participant runtime as dotnet/runtime<br />release/9.0
participant SDK_1xx as dotnet/sdk<br />release/9.0.1xx
participant SDK_2xx as dotnet/sdk<br />release/9.0.2xx
participant VMR_1xx as VMR<br />release/9.0.1xx
participant VMR_2xx as VMR<br />release/9.0.2xx

runtime->>runtime: New change ➡️ RUN_2

runtime->>VMR_1xx: Flow of 📄 RUN_2
Note over VMR_1xx: 📦 VMR_2 build output packages are built
VMR_1xx->>VMR_2xx: Flow of 📦 VMR_2<br />(runtime build output packages)
Note over VMR_2xx: 📦 VMR_3 build output packages are built

Note over VMR_2xx: ✅ Coherent state<br />VMR 1xx and 2xx have 📄 RUN_2

par Parallel backflow of build output packages
VMR_1xx->>SDK_1xx: Backflow of 📦 VMR_2
and
VMR_2xx->>SDK_2xx: Backflow of 📦 VMR_3
end
```

The situation gets more interesting for breaking changes. Let’s imagine a situation where a change is needed in one of the bands that requires a breaking change in a shared component:

```mermaid
sequenceDiagram
autonumber

participant runtime as dotnet/runtime<br />release/9.0
participant SDK_1xx as dotnet/sdk<br />release/9.0.1xx
participant SDK_2xx as dotnet/sdk<br />release/9.0.2xx
participant VMR_1xx as VMR<br />release/9.0.1xx
participant VMR_2xx as VMR<br />release/9.0.2xx

runtime->>runtime: Change in runtime ➡️ RUN_2

runtime->>VMR_1xx: PR with source change to 📄 RUN_2 is opened
activate VMR_1xx
Note over VMR_1xx: ❌ Requires a change in SDK
VMR_1xx->>VMR_1xx: Change needed in src/sdk<br />Creating 📄 SDK_1.2
deactivate VMR_1xx
Note over VMR_1xx: 📦 VMR_2 build output packages are built

VMR_1xx->>SDK_1xx: Flow of 📄 SDK_1.2, 📦 VMR_2

VMR_1xx->>VMR_2xx: Flow of 📦 VMR_2
activate VMR_2xx
Note over VMR_2xx: ❌ Requires a change in SDK
VMR_2xx->>VMR_2xx: Change needed in src/sdk<br />Creating 📄 SDK_2.2
deactivate VMR_2xx
Note over VMR_2xx: 📦 VMR_3 build output packages are built
VMR_2xx->>SDK_2xx: Flow of 📄 SDK_2.2, 📦 VMR_3

Note over VMR_2xx: ✅ Coherent state<br />VMR 1xx and 2xx both use 📄 RUN_2
```

The diagram shows:

1. A change was made in `dotnet/runtime`.
2. The change is flown to VMR's `1xx` branch where a PR with the source change is opened.
3. The PR build fails and more changes are needed under the `src/sdk` folder. PR is merged.
Official VMR build publishes build output packages for each repository.
4. New sources of the `1xx` band, together with the new runtime build output package, are flown back to `dotnet/sdk`.
5. Build output packages of shared components are flown to VMR's 2xx branch.
6. The PR build fails and, similarly to the `1xx` branch PR, more changes are needed under the `src/sdk` folder. PR is merged.
Official VMR build publishes build output packages for each repository.
7. New sources of the `2xx` band, together with the new runtime build output package, are flown back to `dotnet/sdk`.

After the last step, the `1xx` VMR branch has the sources of `dotnet/runtime` that are packaged and used by the `2xx` branch which means they're coherent.
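
The coherency condition can be expressed as a small check. This is a sketch that reuses the labels from the diagrams (`RUN_2`, `VMR_2`); the earlier `RUN_1`/`VMR_1` state and the `is_coherent` helper are assumptions added for illustration.

```python
def is_coherent(runtime_in_1xx, package_contents, consumed_by_band):
    """True when every non-1xx band consumes packages built from the
    runtime sources currently present in the 1xx branch."""
    return all(
        package_contents.get(package) == runtime_in_1xx
        for package in consumed_by_band.values()
    )


# Which runtime sources each set of 1xx build output packages was built from.
package_contents = {"VMR_1": "RUN_1", "VMR_2": "RUN_2"}

# Before the package flow reaches 2xx: 1xx already has RUN_2, 2xx still uses VMR_1.
print(is_coherent("RUN_2", package_contents, {"2xx": "VMR_1"}))

# After the flow: 2xx consumes VMR_2, which was built from RUN_2.
print(is_coherent("RUN_2", package_contents, {"2xx": "VMR_2"}))
```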

## Release

The release has three main phases:

1. **Figuring out what to release** - We need to make sure the SDK branches are coherent. This means that the most recently published build output packages from the `1xx` branch have flown to all of the other SDK band branches. For that to happen, we need to enable the package flow for the preview band and consume the newest bits to validate everything.

2. **Compiling the binary release** - Since the shared components were built only once and stored inside of the build output packages, we can assemble the packages from all band branches and release them together, similarly to how we do it today. The staging pipeline could assemble the build products from the official builds similarly to how we do it today.

3. **Publishing and communicating the release of the sources** - Publishing of sources so they are easily consumed by 3rd party partners would differ based on whether the consumer cares about one or all bands. For a single SDK band release, only the 1xx band branch would contain the sources in such a way that you could build directly. The non-1xx band branches do not contain the source code of the shared components and only reference them as build output packages. This means that we'd need to compile the sources by restoring them from the 1xx band branch. For releases of multiple SDKs together, we'd also need to compile the full set of sources by bringing the branches together.