Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add a HLDD draft for Unified Open Networking Hardware Support via Yocto #1

Draft
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

jklare
Copy link
Collaborator

@jklare jklare commented Dec 18, 2023

No description provided.

@@ -0,0 +1,163 @@
# Unified Open Networking Hardware Support

[[_TOC_]]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does github MD support creating a TOC like this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out, it actually does not (GitLab does). I will leave this here for now as a placeholder.

baseline-network-linux-hld.md Outdated Show resolved Hide resolved
baseline-network-linux-hld.md Outdated Show resolved Hide resolved
baseline-network-linux-hld.md Outdated Show resolved Hide resolved
maturity. Although given those differences, all of the currently available
and open network operating systems are utilising a mainline Linux kernel and
need to build and maintain platform specific abstractions (e.g out of tree
drivers) for peripheral switch hardware.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A definition and/or examples should be given for "peripheral switch hardware".

amount of work needed for keeping drivers building and working with newer
kernels stays manageable.

These will also help vendors quickly updating their drivers when submitting new

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "these"? "Coccinelle patches"? It's better to be repetitive than to be unclear.

Copy link
Collaborator

@KanjiMonster KanjiMonster Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced this with "semantic patches"

baseline-network-linux-hld.md Outdated Show resolved Hide resolved
via scripts from SONiC, initial installer
- M3: import missing platforms from ONL and DentOS, ONLP adapter(?), extensible
installer
- M4: get accepted as official layer

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accepted by whom? Yocto?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- M4: get accepted as official layer
- M4: get accepted as official Yocto layer

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the text to say Yocto layer and added the link (thanks for it)

baseline-network-linux-hld.md Outdated Show resolved Hide resolved

- Provide ASIC / dataplane support via e.g. generic SAI implementations.
- Provide a common API / userspace utilities for accessing various sensors and
data regardless of their specific API (PDDF, ONLP, S3IP).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds suspiciously like https://xkcd.com/927/

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have experience with ONLP and PDDF. PDDF has been very useful in bringing up interfaces via SAI in minimal Linux builds such as ONIE. I suggest that we stick with one, no reason to create a new standard, but it would be useful to have a shim to provide output as current users expect.


## Scope

The current open network operating system (NOS) market is fragmented and has
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we call out what PAS/NOSes are included in this list? Possibly just the fully open source ones ex: community SONiC as SONiC has many bespoke Enterprise versions from different vendors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, an explicit list sounds like a good idea

- TBD: organization of platform support
- TBD: platform initialization

In the first iteration, the platform support is fine as-is. Future iterations

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which platform support is "fine as-is"? Multiple options from multiple sources are mentioned: "base networking hardware support combines hardware support from multiple sources". Does the project start with one source, adding more, or is the first step using all of them "as-is"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewrote the text a bit. Does it sound better now?

@jklare
Copy link
Collaborator Author

jklare commented Dec 18, 2023

I will accept a bunch of comments via the GitHub Web UI for now and squash those later.

largely aligned with and adapted from the
[OCP OpenNetworkLinux Project](https://github.com/opencomputeproject/OpenNetworkLinux) (which has been moved to maintenance mode, pending archival).
This proposal specifically seeks an agreement between the developers and
maintainers of the aforementioned network operating system communities, to be

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we've mentioned about any NOSes yet.

[OCP OpenNetworkLinux Project](https://github.com/opencomputeproject/OpenNetworkLinux) (which has been moved to maintenance mode, pending archival).
This proposal specifically seeks an agreement between the developers and
maintainers of the aforementioned network operating system communities, to be
able to share common implementations to support open networking hardware.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are not much commonality between different NOSes. Maybe yo mean "a baseline" or something similar. You should clarify.

able to share common implementations to support open networking hardware.

Support for open networking hardware platforms is currently distributed over
various different projects. Many platforms, especially older ones, are supported

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing. First there are many platforms using ONL [platform as in HW?]. Then there is SONiC [platform means the OS]. And finally DentOS [it also uses ONL but but it is a different platform?]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to write a more detailed definition for "platform". In general, i think the idea is to talk about "platform" == "whitebox" == "open switch hardware" here.


### Base Open Networking Hardware Support

The base networking hardware support combines hardware support from multiple

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still have not defined what the proposal is. I know there is something called "base networking HW support" but I still don't know what that is.

- TBD: organization of platform support
- TBD: platform initialization

In the first iteration, the platform support is fine as-is. Future iterations

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will be the first platform supported?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we discussed the EdgeCore AS 4630 in previous calls as a widely adapted and pretty useful one, so we might want to make this one our "demo platform". Since we are focussing on a BSP Layer here, it might be worth evaluating how many "low hanging fruits" from the ONL and SONiC side we can find and then add them in one go. In addition to the current code status, we might want to continue the discussion with whitebox vendors to see what they think and what would be useful for them.


To allow quick testing and easy extension for other users, an example
implementation and support layer for Yocto will be provided. It will contain all
basic building blocks for creating an installable image:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Installable but not usable (i.e. no control plane.) We should explicitly state what the first deliverable is (and the next and so on...)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be a misunderstanding. IMHO an installable image like this would be quite useful for evaluating platform support as well as the hardware validation by vendors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be a good idea to write a section about the target "customers" for this BSP layer to emphasize that the intention here is not to provide a "plug and play NOS", but rather a layer (similar to ONL) on which other full blown NOS can be build on (e.g. dentOS or SONiC).


### ASIC/dataplane support

ASIC/dataplane support is out of scope for now, but may be provided by a future
Copy link

@taskin0003 taskin0003 Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would reword this. We are proposing a NOS build infra, which does not support neither the data nor the control plane. Doesn't sound like it would be that useful.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ONL has shown that providing a bootable NOS with support for the aforementioned hardware peripherals is quite useful as a building block (e.g for OpenOLT, dentOS and BISDN-Linux) even without providing a data plane abstraction for the control plane.

Target Yocto version is scarthgarp, which is the next LTS release, planned for
April 2024.

The initial Kernel version target could be 5.15 (minimum kernel version of Yocto)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"will be 5.15"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably will need to update this once we all agree on the details here and can pinpoint the best target kernel a bit better.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, my idea is to start out the lowest supported kernel, since will pose the least amount of work to do to get someting running, but then quickly switch to a more recent LTS, or maybe even the most recent.

Which one we target is probably something that will move if we do try to become an official Yocto layer.

installer
- M4: get accepted as official layer

## Future Extensions

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with the "future extensions", proposal does not sound like there is a roadmap to provide a full network switch support including data and control planes for an OS.

KanjiMonster and others added 6 commits January 11, 2024 15:46
Co-authored-by: Jan Klare <jan.klare@bisdn.de>
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Co-authored-by: ideaship <ideaship@users.noreply.github.com>
Co-authored-by: ideaship <ideaship@users.noreply.github.com>
Co-authored-by: ideaship <ideaship@users.noreply.github.com>
Co-authored-by: ideaship <ideaship@users.noreply.github.com>
Co-authored-by: ideaship <ideaship@users.noreply.github.com>
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
We want the core of the platform support to be OS agnostic for easier
reuse by other projects, so be explicit about it.

Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Extend the milestones with sub items for clearer work splits and extend
the possible future extensions with more options and more text

Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Update the graphic showing three layers, with the bottom one being the
actual platform support. Explain the different layers a bit more.

Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Signed-off-by: Jan Klare <jan.klare@bisdn.de>
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Comment on lines +121 to +124
- patches make required changes due to upstream changes more obvious (provide a
“conflict diff”) and easier to review
- branches are closer to the upstream approach, but may be harder to track
changes needed due to upstream changes
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest maintaining separate feature branches and rebase them on each new upstream kernel change. This way, each commit in a topic branch can be submitted upstream as soon as possible and the remaining downstream changes are obvious.

A rebase can be reviewed easily by using git range-diff.

Copy link
Collaborator

@KanjiMonster KanjiMonster Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reviewability concern was about reviewing pull requests or submitted patches - you can't submit a rebase for being reviewed by others.

So AFAICT if you want the rebase to a newer kernel reviewable, you need to create a pull request for the changes (or submit a patchset), then compare the diff with the diff of the old branch to ensure no important changes were dropped, or wrong rebase decisions were taken in merge conflicts.

A patch updating patches makes this more obvious to me, but then again I have an OpenWrt background, so I might be a bit biased there ;-)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reviewability concern was about reviewing pull requests or submitted patches - you can't submit a rebase for being reviewed by others.

So AFAICT if you want the rebase to a newer kernel reviewable, you need to create a pull request for the changes (or submit a patchset), then compare the diff with the diff of the old branch to ensure no important changes were dropped, or wrong rebase decisions were taken in merge conflicts.

In my experience, GitHub reviews work well for incremental changes being merged into the main branch (if the PR is targeted at the upstream repo). But for kernel patches, we have a different situation: While they are used in a downstream kernel (as in this case), they need to be rebased again and again.

When merging in new upstream releases (as in dentproject/linux#4) one after the other, cherry-picking any commit for upstreaming becomes harder and harder because the necessary changes are hidden in the merge commits.

So the only sustainable way is to keep rebasing the set of out-of-tree patches for each upstream release, thereby updating each patch to be compatible again. Ideally as topic branches, to avoid introducing unnecessary dependencies between patches (which would make upstreaming harder).

But if you keep these only as a set of patches in git, you don't have the necessary context during PR review when looking at diffs of patches on that repository. Renames or removals of patches easily create so much noise that the real differences are hidden.

At my day-job, we use umpf to build a linear series (in git and/or as patches) from the topic branches. When updating the combined patch series in a BSP layer, we then add the range-diff as a comment to the PR, as that's much easier to review. The individual topic branch rebases can be reviewed with range-diffs outside of PRs if needed. There's a blog post with more background. I can't guarantee the tool would be a good fit here, but I think one should take a look. ;)

A patch updating patches makes this more obvious to me, but then again I have an OpenWrt background, so I might be a bit biased there ;-)

:)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reviewability concern was about reviewing pull requests or submitted patches - you can't submit a rebase for being reviewed by others.
So AFAICT if you want the rebase to a newer kernel reviewable, you need to create a pull request for the changes (or submit a patchset), then compare the diff with the diff of the old branch to ensure no important changes were dropped, or wrong rebase decisions were taken in merge conflicts.

In my experience, GitHub reviews work well for incremental changes being merged into the main branch (if the PR is targeted at the upstream repo). But for kernel patches, we have a different situation: While they are used in a downstream kernel (as in this case), they need to be rebased again and again.

When merging in new upstream releases (as in dentproject/linux#4) one after the other, cherry-picking any commit for upstreaming becomes harder and harder because the necessary changes are hidden in the merge commits.

So the only sustainable way is to keep rebasing the set of out-of-tree patches for each upstream release, thereby updating each patch to be compatible again. Ideally as topic branches, to avoid introducing unnecessary dependencies between patches (which would make upstreaming harder).

But if you keep these only as a set of patches in git, you don't have the necessary context during PR review when looking at diffs of patches on that repository. Renames or removals of patches easily create so much noise that the real differences are hidden.

At my day-job, we use umpf to build a linear series (in git and/or as patches) from the topic branches. When updating the combined patch series in a BSP layer, we then add the range-diff as a comment to the PR, as that's much easier to review. The individual topic branch rebases can be reviewed with range-diffs outside of PRs if needed. There's a blog post with more background. I can't guarantee the tool would be a good fit here, but I think one should take a look. ;)

Interesting tool, I will need to take a closer look at it.

I guess my concern with rebasing the out-of-tree changes into a new branch (and then updating the recipe to build from the new branch/commit) is that this doubles the work needed for updating the kernel - once for the new branch, then for updating the recipe.

This also has the side effect that all pending PRs for the kernel will now have to be retargeted to the new branch, which can be painful, depending on how good you are with git.

But hopefully we won't have too many local changes, and push hard for upstream first, even for backporting fixes to stable versions (kernel maintainers are also only human, and easily overlook fixes worth backporting).

Comment on lines 196 to 199
Currently there are several competing ways of providing access to sensor data
and other values. To avoid forcing users to know which one is used for their
current platform, it would be nice to have a generic/common utilities for
querying the data, hiding the implementation details.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be useful to list (and perhaps link to) the competing ways here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, we should explicitly list them (ONLP, SONiC PDDF, S3IP) and maybe even explain their differences somehwere.

Maybe even provide some in-kernel interfaces or "common" userspace daemons that implement similar functionality as well (which frankly should be where the effort should go instead of adding yet another abstraction layer ...)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard kernel interface for this is hwmon. For iio sensors there is an in-kernel translation layer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. ONLP etc are often about providing semantic information to the existing sensors (e.g. where in the chassis they are).

But for some components there are no (usable) upstream kernel interfaces yet. E.g. SFP ports (AFAICT) rely on being tied to ethernet interfaces, which don't exist for those devices where the whole ASIC access is hidden behind a binary blob interface.

And there is no way to describe SFP ports outside of devicetree.

Or the PSUs that provide multiple different sensors and EEPROM access via I²C, but their presence is instead described in a register in a separate CPLD.

Also ONIE enabled devices have an I²C connected EEPROM with contains various TLV entries describing the device (model name, serial number, mac addresses etc). To read this out you will need to know which I²C bus to access, at which address. And there is no way to autodetect that.

So these utilities also provide system setup at boot by instantiating the appropriate I²C devices, so things become visible.

Yes there is a certain catch-22 - you need to know the device you are booting on to know on which I²C bus and address the EEPROM is to know which device you are booting on.

Obviously this problem doesn't exist on devicetree devices, but it exists on the many x86 based platforms.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a list of competing implementations with links. Also tried to summarize their differences a bit.

Comment on lines +209 to +215
The base platform support will be already provided by the base platform layer
(i.e. custom hardware drivers, platform initialization, etc).

Since DentOS uses a customized version of the Prestera driver that contains
several features not included in the upstream driver, the custom Prestera
driver will be provided as an external kernel module package, while the
in-kernel driver will be disabled.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, selecting between the out-of-tree and mainline drivers via different Yocto machines (or just a config option) would allow easy comparison and testing of the different feature sets. That would in turn make improving the mainline driver easier.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A feature should be (hopefully) easy to do and would make switching between both easier than switching between different machines (AFAICT).

Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants