Opaque Types #799

Closed
ghost opened this issue Dec 13, 2019 · 13 comments

Comments

@ghost

ghost commented Dec 13, 2019

An important aspect of switch configuration is the creation of objects, such as groups, interfaces, nexthops, tunnels, vrfs, ports, etc.

When the switch pipeline refers to these objects, it usually does so using integer ids, because integer ids are efficiently implementable in the hardware. There are several downsides to that approach:

  • Integer ids are not very readable for humans. For example, it is much harder to figure out what the following set of flows does:
if dst_ip in 10.0.0.0/8 then use group 4231
group 4231 chooses between nexthop 5643 and 4023
if nexthop = 5643 set interface to 92 and neighbor to 123
if nexthop = 4023 set interface to 12 and neighbor to 456
if interface = 92 set egress port to 5 and src-mac to "00:00:00:00:00:01"
if interface = 12 set egress port to 517 and src-mac to "00:00:00:00:00:02"
if neighbor = 123 set dst-mac to "00:00:00:00:01:01"
if neighbor = 456 set dst-mac to "00:00:00:00:01:02"

than what this equivalent set of flows does, where objects are referenced by name (i.e. by string):

if dst_ip in 10.0.0.0/8 then use group "distribute among middle block nodes"
group "distribute among middle block nodes" chooses between nexthop "mb1" and "mb2"
if nexthop = "mb1" set interface to "int-5/1" and neighbor to "10.0.0.1"
if nexthop = "mb2" set interface to "int-5/2" and neighbor to "10.0.0.2"
if interface = "int-5/1" set egress port to "port-5/1" and src-mac to "00:00:00:00:00:01"
if interface = "int-5/2" set egress port to "port-5/2" and src-mac to "00:00:00:00:00:02"
if neighbor = "10.0.0.1" set dst-mac to "00:00:00:00:01:01"
if neighbor = "10.0.0.2" set dst-mac to "00:00:00:00:01:02"
  • For efficiency, it makes sense to choose the minimum bit-width for integer ids, e.g. a switch with just 32 ports would choose a 5-bit port id, and a switch with a 1K neighbor table would choose a 10-bit neighbor id. This, however, makes it hard to operate a heterogeneous fleet, as the controller needs to deal with different bit-widths on different targets.

  • The state of the art for object naming is sufficiently bad that PSA and P4Runtime introduced ad-hoc ways to deal with the translation between various representations of ports and other metadata (see here).

To fix the above problems, we propose the introduction of an opaque type opaque<type, int> to the P4 language. Semantically, an opaque type opaque<T, N> behaves just like the type T. However, the operations on opaque types are limited to equality checks and assignment, which allows the compiler/switch to remap values of T to a different physical representation. The switch only needs to handle N distinct values of type T, and the compiler can use N to choose the physical representation (e.g. if N = 1024, it might make sense to choose a 10-bit integer as the physical representation).

Currently, P4 only allows the string type in annotations. We propose to also allow it in the opaque type.

Example

A user wants to have human-readable names for nexthops on a switch that supports up to 1024 nexthops. The user defines nexthop names as opaque<string, 1024>. The P4 program would then look as follows:

struct local_metadata_t {
  opaque<string, 1024> nexthop;
}

table nexthop {
  key = {
    local_metadata.nexthop : exact;
  }
  size = 1024;
}

action set_nexthop(opaque<string, 1024> nexthop) {
  local_metadata.nexthop = nexthop;
}

table l3 {
  key = {
    header.ip_addr : lpm;
  }
  actions = {
    set_nexthop;
  }
}

Because representing nexthops as strings is prohibitively expensive in hardware, the compiler decides to remap the nexthop names to 10-bit integers. This means that the nexthop table can be efficiently implemented with 1024 entries of SRAM, where the nexthop id indexes into that SRAM.

When the controller installs a flow on the switch, the switch software dynamically maps the string name to a 10-bit integer. For example:

  • The controller installs a nexthop flow that matches on "mb0". The switch software allocates the nexthop id 0 for that name, and installs the nexthop.
  • The controller installs a route pointing to the nexthop "mb0". The switch software translates the name to the nexthop id 0.
  • The controller installs a nexthop flow that matches on "mb1". The switch software allocates the nexthop id 1 for that name, and installs the nexthop.
  • The controller removes the nexthop flow that matches on "mb0". The switch software releases the nexthop id 0.
  • The controller installs a nexthop flow that matches on "mb3". The switch software reuses the nexthop id 0 for that name, and installs the nexthop.
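
The allocation sequence above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping the switch software would need, not a proposed implementation: the class name NameIdAllocator and its methods are hypothetical, and the choice to hand out the lowest free id first is an assumption made so that the sketch reproduces the ids in the list.

```python
import heapq


class NameIdAllocator:
    """Maps names to small integer ids and reuses ids that were released."""

    def __init__(self, capacity):
        self.capacity = capacity   # N: maximum number of distinct names in use
        self.name_to_id = {}       # names that currently hold an id
        self.free_ids = []         # min-heap of released ids
        self.next_unused = 0       # smallest id that was never handed out

    def allocate(self, name):
        """Return the id for `name`, allocating a new one if needed."""
        if name in self.name_to_id:
            return self.name_to_id[name]
        if self.free_ids:
            new_id = heapq.heappop(self.free_ids)   # prefer a released id
        elif self.next_unused < self.capacity:
            new_id = self.next_unused
            self.next_unused += 1
        else:
            raise RuntimeError("all %d ids are in use" % self.capacity)
        self.name_to_id[name] = new_id
        return new_id

    def lookup(self, name):
        """Translate an installed name; raises KeyError if it is unknown."""
        return self.name_to_id[name]

    def release(self, name):
        """Free the id held by `name` so a later name can reuse it."""
        heapq.heappush(self.free_ids, self.name_to_id.pop(name))


alloc = NameIdAllocator(capacity=1024)
trace = [
    alloc.allocate("mb0"),  # nexthop flow for "mb0"
    alloc.lookup("mb0"),    # route pointing at "mb0"
    alloc.allocate("mb1"),  # nexthop flow for "mb1"
]
alloc.release("mb0")                  # nexthop flow for "mb0" removed
trace.append(alloc.allocate("mb3"))   # "mb3" reuses the released id
print(trace)                          # [0, 0, 1, 0]
```

Because only equality and assignment are allowed on the opaque type, nothing in the pipeline can observe which integer a name was given, which is what makes this kind of reuse safe.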

The operations on an opaque type are restricted to equality checks and assignment. In particular, ternary matches cannot be implemented efficiently, as the bit pattern of T might be very different from that of the type that implements T. Stefan Heule's proposals of using optional and set (both of which work well with opaque types) would allow a user to get back much of the practically relevant ternary match functionality.

@jafingerhut
Collaborator

Just to check my understanding here: You want to pass larger P4Runtime messages between servers and clients?

@vgurevich
Contributor

@konne-google -- is there a reason the type construct doesn't work for you in terms of creating a restricted type?

Currently the control plane representation of the type is a little bit outside of P4 purview, but you are free to have annotations associated with the type that will describe the control plane representation of your restricted type. Given the diversity of the control planes, it looks to me that keeping this information as annotation might be better for now.

If you ask me, my biggest quibble is that type only produces restricted types :)

@ghost
Author

ghost commented Dec 16, 2019

jafingerhut@ yes that's right, the message in P4RT would contain the string (the mapping to int would happen on the switch). For us, the improvements in debuggability and portability would far outweigh the cost of a larger message.

vgurevich@ that's a great point. So instead of opaque<string, 1024>, you would prefer we wrote:

@max_distinct_values(1024)
type string nexthop;

That sounds great to me!

Two questions:

Would a string in that position already be supported, or would we need to extend the spec for that?

The spec says:

While similar to typedef, the type keyword introduces in fact a
new type, which is not a synonym with the original type: values of the
original type and the newly introduced type cannot be mixed in
expressions.

Does this imply that no operations are allowed on that type, other than checking for equality and assignment?

@jafingerhut
Collaborator

A type defined via the type keyword, if it has a base type of bit<W> or int<W>, can be cast to or from that base type, in addition to supporting equality and assignment.

Vladimir has also proposed recently in the LDWG that one of those directions of casting should be unnecessary and happen automatically. I am not trusting my memory right now on which direction that is.

@jafingerhut
Collaborator

P4_16 supports a string type only in positions where compile time constants are permitted, e.g. for a log extern intended only for debugging programs, or in annotations. There is no current plan to support strings as the types of run-time variables in P4_16, and I suspect there would be active resistance, and/or extremely careful delimiting of it such that it would really have to be an integer under the hood.

@vgurevich
Contributor

@konne-google ,

Not quite.

As a data plane designer you are supposed to define the exact type that will be used in the data plane, e.g.

type bit<10> nexthop_t;

Separately (and that's where things are not yet standardized) you should be able to specify how you want that type to be visible at the control plane. This can be achieved through any kind of annotation, e.g.

type bit<10> nexthop_t @representation("my_nexthop_string");

It will be up to your control plane layers to convert from bit<10> to your string and vice versa.

@ghost
Author

ghost commented Dec 18, 2019

Thanks vgurevich@, I think I understand that better now. So basically, instead of writing opaque<T, N> you would write:

type bit<round_up_to_nearest_power_of_two(N)> nexthop @representation("T")

This seems like we could use it.

One problem I see with that is that not everything is a power of two, e.g. the number of ports is usually not a power of two (because there are management ports etc. in addition to the dataplane ports). Combined with a feature like the set match-kind proposed by Stefan (#795), that would waste resources, because the bitmap would be larger than necessary. We could get around that by adding another annotation, max_distinct_values, as follows:

type bit<round_up_to_nearest_power_of_two(N)> nexthop @representation("T") @max_distinct_values(N)

It's getting to be a mouthful, but does that sound reasonable?
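
To put numbers on the wasted space, here is a small Python sketch. It reads round_up_to_nearest_power_of_two(N) as "the smallest width W with 2^W >= N", which is an assumption about the intent of the snippets above; the helper names id_width and unused_values are made up for illustration.

```python
def id_width(max_distinct_values):
    """Smallest W such that 2**W >= max_distinct_values (at least 1 bit)."""
    if max_distinct_values < 1:
        raise ValueError("need at least one value")
    # (n - 1).bit_length() is ceil(log2(n)) for n >= 2, computed exactly.
    return max(1, (max_distinct_values - 1).bit_length())


def unused_values(max_distinct_values):
    """How many bit patterns of the chosen width never name an object."""
    return 2 ** id_width(max_distinct_values) - max_distinct_values


# 32 dataplane ports plus 1 management port (hypothetical numbers).
for n in (32, 33, 1024):
    print(n, id_width(n), unused_values(n))
```

With 33 ports the id needs 6 bits, so a bitmap sized from the bit width alone would have 64 entries, 31 of them unused. A max_distinct_values annotation would let the target size it at 33 instead.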

jafingerhut@, I don't understand the cast to uint. Doesn't that completely defeat the point of introducing the type keyword in the first place? If you can extract the value, then it can't be renamed by the switch. Where is this cast specified?

Thanks everyone for the very helpful comments, very much appreciated!

@vgurevich
Contributor

@konne-google ,

It is quite common (e.g. in the case of ports) that the value in the data plane is N bits wide, but not all values are representable. Nevertheless, we need to carry these N bits in the data plane, since we can't carry less :) and simply be careful.

I cannot say much about the proposed set match type, but it seems to require some specialized hardware. As a result, you might need to associate a certain extern with it and be able to specify the range in the declaration of that extern (essentially the width of the bitmask). All the values outside of the range will cause a miss anyway.

As a more well known example, a number of architectures allow one to declare an indirect counter using the Counter() extern, for example:

Counter<bit<10>>(999, CounterType_t.Packets_and_Bytes) my_counter;

where the index type is bit<10>, but the total number of instances is not 2^10, but less (999 here).

Good architectures typically specify that the attempt to count using the counter instance outside of this range is a no-op.
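
As a rough model of that behaviour, the Python sketch below mimics an indirect counter whose index type is wider than its number of instances. It is not how any particular architecture implements Counter(); the class IndirectCounter and its fields are hypothetical, and treating out-of-range indices as a silent no-op is simply the convention described above.

```python
class IndirectCounter:
    """Toy model of a counter array indexed by a W-bit value."""

    def __init__(self, size, index_width):
        if size > 2 ** index_width:
            raise ValueError("index type too narrow for the requested size")
        self.size = size
        self.index_width = index_width
        self.packets = [0] * size
        self.bytes = [0] * size

    def count(self, index, packet_len):
        # Values that do not fit in the index type cannot occur in the data plane.
        if not 0 <= index < 2 ** self.index_width:
            raise ValueError("index does not fit in bit<%d>" % self.index_width)
        # Representable but beyond the declared size: silently ignored.
        if index >= self.size:
            return
        self.packets[index] += 1
        self.bytes[index] += packet_len


my_counter = IndirectCounter(size=999, index_width=10)
my_counter.count(5, 64)      # counted
my_counter.count(1000, 64)   # fits in bit<10>, but 1000 >= 999: no-op
print(my_counter.packets[5], sum(my_counter.packets))  # 1 1
```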

Your control plane might know what exactly you are supposed to represent and, through the corresponding annotation, impose the necessary restrictions. Note that this is still not bulletproof: since P4Runtime passes integers, it is still possible to pass a number outside of your desired range.

@jafingerhut
Collaborator

@konne-google Please ignore my comment above: "Vladimir has also proposed recently in the LDWG that one of those directions of casting should be unnecessary and happen automatically. I am not trusting my memory right now on which direction that is."

I was mixing that up in my memory with an actual proposal that has really been made, but has nothing to do with the type keyword in P4_16. It does have to do with serializable enum types in P4_16: #793

@cc10512
Contributor

cc10512 commented Dec 21, 2019

@konne-google Sorry to chime in so late. Let me see if I understand correctly what you need: a symbolic (string) value that you want to pass on the user/control-plane facing side of the programming, while still represented as integers in the data plane. Seems to me that you are really describing a serializable enum. As long as we can publish the symbolic names of the serializable values in the enum, you get the integer representation as well as the stringified name. And you don't need a new type. Am I missing something?

@jafingerhut
Collaborator

@cc10512 Hopefully what I say here will be accurate, but @konne-google can correct me if I go off the rails.

I see at least two differences between this proposal and using serializable enums:

(1) With serializable enums, the P4Runtime API still passes integer values over the gRPC connection, not strings as this proposal would do. With serializable enums, the enum member name to integer mapping would be in the P4Info file for a P4 program, so it would be fairly straightforward for some debug-level tool to do the number->name mapping, when a name exists for a numeric value.

(2) With serializable enums, the name<->number mapping is fixed at the time of compiling the P4 program. With this proposal, the string value<->number mapping is dynamically created by the switch driver software / P4Runtime server software, and the set of names exists nowhere in the P4 source code. The set of names in use can change over time, without changing the P4 program.
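
For difference (1), the debug-level number->name lookup could look roughly like the Python sketch below. The dictionary ENUM_MEMBERS is a hypothetical stand-in for the member list a tool would read from the P4Info file (the real file is a protobuf message, not a dict), and the member names are invented.

```python
# Hypothetical enum members, fixed when the P4 program is compiled.
ENUM_MEMBERS = {"MB1": 1, "MB2": 2}


def describe(value, members):
    """Render a numeric value with its enum member name, if it has one."""
    by_value = {number: name for name, number in members.items()}
    name = by_value.get(value)
    if name is None:
        return str(value)              # no member has this value
    return "%s (%d)" % (name, value)


print(describe(2, ENUM_MEMBERS))   # MB2 (2)
print(describe(7, ENUM_MEMBERS))   # 7
```

The table here cannot change without recompiling the program, which is exactly difference (2): under the proposal, the equivalent table would live in the switch software and change as flows are installed and removed.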

I do not know if those differences are important enough to whoever will be making the decision on this proposal, but wanted to at least point out that these differences are there.

@ghost
Author

ghost commented Dec 22, 2019 via email

@ghost
Author

ghost commented Feb 3, 2020

I'm closing this bug. No change to the language is needed, because type already supports, in principle, what we need. There are some follow-up bugs to make type more useful:

@ghost ghost closed this as completed Feb 3, 2020