Opaque Types #799

ghost · 2019-12-13T21:29:24Z

An important aspect of switch configuration is the creation of objects, such as groups, interfaces, nexthops, tunnels, vrfs, ports, etc.

When the switch pipeline refers to these objects, it usually does so using integer ids, because integer ids are efficiently implementable in the hardware. There are several downsides to that approach:

Integer ids are not very readable for humans. For example, it is much harder to figure out what the following set of flows does:

if dst_ip in 10.0.0.0/8 then use group 4231
group 4231 chooses between nexthop 5643 and 4023
if nexthop = 5643 set interface to 92 and neighbor to 123
if nexthop = 4023 set interface to 12 and neighbor to 456
if interface = 92 set egress port to 5 and src-mac to "00:00:00:00:00:01"
if interface = 12 set egress port to 517 and src-mac to "00:00:00:00:00:02"
if neighbor = 123 set dst-mac to "00:00:00:00:01:01"
if neighbor = 456 set dst-mac to "00:00:00:00:01:02"

when compared to this set of flows, where objects are referenced by name (aka string):

if dst_ip in 10.0.0.0/8 then use group "distribute among middle block nodes"
group "distribute among middle block nodes" chooses between nexthop "mb1" and "mb2"
if nexthop = "mb1" set interface to "int-5/1" and neighbor to "10.0.0.1"
if nexthop = "mb2" set interface to "int-5/2" and neighbor to "10.0.0.2"
if interface = "int-5/1" set egress port to "port-5/1" and src-mac to "00:00:00:00:00:01"
if interface = "int-5/2" set egress port to "port-5/2" and src-mac to "00:00:00:00:00:02"
if neighbor = "10.0.0.1" set dst-mac to "00:00:00:00:01:01"
if neighbor = "10.0.0.2" set dst-mac to "00:00:00:00:01:02"

For efficiency, it makes sense to choose the minimum bit-width for integer ids, e.g. a switch with just 32 ports would choose a 5-bit port id, and a switch with a 1K neighbor table would choose a 10-bit neighbor id. This, however, makes it hard to operate a heterogeneous fleet, as the controller needs to deal with different bit-widths on different targets.
The state of the art for object naming is sufficiently bad that PSA and P4Runtime introduced ad-hoc ways to deal with the translation between various representations of ports and other metadata (see here).

To fix the above problems, we propose the introduction of an opaque type opaque<type, int> to the P4 language. Semantically, an opaque type opaque<T, N> behaves just like the type T. However, the operations on opaque types are limited to equality checks and assignment, which allows the compiler/switch to remap values of T to a different physical representation. The switch only needs to handle N distinct values of type T, and it can use N to influence the choice of the physical representation (e.g. if N = 1024 it might make sense to choose a 10-bit integer as the physical representation).
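
As a rough illustration of how N could drive the choice of physical representation, the sketch below computes the smallest unsigned width that can hold N distinct values. The helper name `min_id_width` is made up for this example; a real compiler or target may well pick a wider representation for other reasons.

```python
def min_id_width(num_values: int) -> int:
    """Smallest bit width W such that 2**W >= num_values (hypothetical helper)."""
    if num_values < 1:
        raise ValueError("need at least one distinct value")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1; use at least 1 bit.
    return max(1, (num_values - 1).bit_length())


print(min_id_width(32))    # 32 ports
print(min_id_width(1024))  # 1K neighbor table
print(min_id_width(999))   # a count that is not a power of two
```

For the numbers used in this proposal, this gives 5 bits for 32 ports and 10 bits for 1024 values; 999 values also need 10 bits.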

Currently, P4 only allows the string type in annotations. We propose to also allow it in the opaque type.

Example

A user wants human-readable names for nexthops on a switch that supports up to 1024 nexthops. The user defines nexthop names as opaque<string, 1024>. The P4 program would then look as follows:

struct local_metadata_t {
  opaque<string, 1024> nexthop;
}

table nexthop {
  key = {
    local_metadata.nexthop : exact;
  }
  size = 1024;
}

action set_nexthop(opaque<string, 1024> nexthop) {
  local_metadata.nexthop = nexthop;
}

table l3 {
  key = {
    header.ip_addr : lpm;
  }
  actions = {
    set_nexthop;
  }
}

Because representing nexthops as strings is prohibitively expensive in hardware, the compiler decides to remap the nexthop names to 10-bit integers. This means that the nexthop table can be efficiently implemented with 1024 entries of SRAM, where the nexthop id indexes into that SRAM.

When the controller installs a flow on the switch, the switch software dynamically maps the string name to a 10-bit integer. For example:

The controller installs a nexthop flow that matches on "mb0". The switch software allocates the nexthop id 0 for that name, and installs the nexthop.
The controller installs a route pointing to the nexthop "mb0". The switch software translates the name to the nexthop id 0.
The controller installs a nexthop flow that matches on "mb1". The switch software allocates the nexthop id 1 for that name, and installs the nexthop.
The controller removes the nexthop flow that matches on "mb0". The switch software releases the nexthop id 0.
The controller installs a nexthop flow that matches on "mb3". The switch software reuses the nexthop id 0 for that name, and installs the nexthop.
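
The sequence above can be sketched in Python as a small allocator that the switch software might keep per opaque type. This is only a sketch under assumptions: the class `NameAllocator` and its methods are invented for illustration, the lowest free id is always reused (the proposal does not prescribe a policy), and there is no reference counting, so a route that still points at a released name is not handled.

```python
import heapq


class NameAllocator:
    """Maps opaque string names to small integer ids (hypothetical sketch)."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._next_unused = 0   # ids >= this have never been handed out
        self._free = []         # min-heap of released ids
        self._ids = {}          # name -> id currently in use

    def allocate(self, name):
        """Return the id for name, allocating the lowest free id if needed."""
        if name in self._ids:
            return self._ids[name]
        if self._free:
            new_id = heapq.heappop(self._free)
        elif self._next_unused < self._capacity:
            new_id = self._next_unused
            self._next_unused += 1
        else:
            raise RuntimeError("table full: no free id")
        self._ids[name] = new_id
        return new_id

    def lookup(self, name):
        """Translate a name that must already be installed."""
        return self._ids[name]

    def release(self, name):
        """Free the id held by name so that it can be reused."""
        heapq.heappush(self._free, self._ids.pop(name))


alloc = NameAllocator(capacity=1024)
print(alloc.allocate("mb0"))  # nexthop "mb0" is installed
print(alloc.lookup("mb0"))    # a route pointing at "mb0" is translated
print(alloc.allocate("mb1"))
alloc.release("mb0")
print(alloc.allocate("mb3"))  # reuses the id released by "mb0"
```

A min-heap is used for the free list only so that the example reproduces the id reuse described above; any free-list policy would satisfy the equality-and-assignment semantics.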

The operations on an opaque type are restricted to equality checks and assignment. In particular, ternary matches cannot be implemented efficiently, as the bit pattern of T might be very different from that of the type that implements T. Stefan Heule's proposals of using optional and set (both of which work well with opaque types) would allow a user to get back much of the practically relevant ternary match functionality.

jafingerhut · 2019-12-13T21:36:06Z

Just to check my understanding here: You want to pass larger P4Runtime messages between servers and clients?

vgurevich · 2019-12-14T02:58:36Z

@konne-google -- is there a reason the type construct doesn't work for you in terms of creating a restricted type?

Currently the control plane representation of the type is a little bit outside of P4 purview, but you are free to have annotations associated with the type that will describe the control plane representation of your restricted type. Given the diversity of the control planes, it looks to me that keeping this information as annotation might be better for now.

If you ask me, my biggest quibble is that type only produces restricted types :)

ghost · 2019-12-16T18:05:43Z

jafingerhut@ yes that's right, the message in P4RT would contain the string (the mapping to int would happen on the switch). For us, the improvements in debuggability and portability would far outweigh the cost of a larger message.

vgurevich@ that's a great point. So instead of opaque<string, 1024>, you would prefer we wrote:

@max_distinct_values(1024)
type string nexthop;

That sounds great to me!

Two questions:

Would a string in that position already be supported, or would we need to extend the spec for that?

The spec says:

While similar to typedef, the type keyword introduces in fact a
new type, which is not a synonym with the original type: values of the
original type and the newly introduced type cannot be mixed in
expressions.

Does this imply that no operations are allowed on that type, other than checking for equality and assignment?

jafingerhut · 2019-12-16T18:20:28Z

A type defined via the type keyword, if it has a base type of bit<W> or int<W>, can be cast to or from that base type, in addition to supporting equality and assignment.

Vladimir has also proposed recently in the LDWG that one of those directions of casting should be unnecessary and happen automatically. I am not trusting my memory right now on which direction that is.

jafingerhut · 2019-12-16T18:23:06Z

P4_16 supports a string type only in positions where compile time constants are permitted, e.g. for a log extern intended only for debugging programs, or in annotations. There is no current plan to support strings as the types of run-time variables in P4_16, and I suspect there would be active resistance, and/or extremely careful delimiting of it such that it would really have to be an integer under the hood.

vgurevich · 2019-12-16T22:53:04Z

@konne-google ,

Not quite.

As a data plane designer you are supposed to define the exact type that will be used in the data plane, e.g.

type bit<10> nexthop_t;

Separately (and that's where things are not yet standardized) you should be able to specify how you want that type to be visible at the control plane. This can be achieved through any kind of annotation, e.g.

type bit<10> nexthop_t @representation("my_nexthop_string");

It will be up to your control plane layers to convert from bit<10> to your string and vice versa.

ghost · 2019-12-18T23:36:31Z

Thanks vgurevich@, I think I understand that better now. So basically, instead of writing opaque<T, N> you would write:

type bit<round_up_to_nearest_power_of_two(N)> nexthop @representation("T")

This seems like we could use it.

One problem I see with that is that not everything is a power of two, e.g. the number of ports is usually not a power of two (because there are management ports etc, in addition to the dataplane ports). That would be a problem when combined with a feature like the set match-kind proposed by Stefan (#795), which would waste resources in that case (because the bitmap would be larger than necessary). We could get around that by adding another annotation, max_distinct_values, as follows:

type bit<round_up_to_nearest_power_of_two(N)> nexthop @representation("T") @max_distinct_values(N)

It's getting to be a mouthful, but does that sound reasonable?

jafingerhut@, I don't understand the cast to uint. Doesn't that completely defeat the point of introducing the type keyword in the first place? If you can extract the value, then it can't be renamed by the switch. Where is this cast specified?

Thanks everyone for the very helpful comments, very much appreciated!

vgurevich · 2019-12-19T00:14:28Z

@konne-google ,

It is quite common (e.g. in the case of ports) that the value in the data plane is N bits wide, but not all values are representable. Nevertheless, we need to carry these N bits in the data plane, since we can't carry less :) and simply be careful.

I cannot say much about the proposed set match type, but it seems to require some specialized hardware. As a result, you might need to associate a certain extern with it and be able to specify the range in the declaration of that extern (essentially the width of the bitmask). All the values outside of the range will cause a miss anyway.

As a more well known example, a number of architectures allow one to declare an indirect counter using the Counter() extern, for example:

Counter<bit<10>>(999, CounterType_t.Packets_and_Bytes) my_counter;

where the index type is bit<10>, but the total number of instances is not 2^10, but less (999 here).

Good architectures typically specify that an attempt to count using a counter index outside of this range is a no-op.

Your control plane might know what exactly you are supposed to represent and, through the corresponding annotation, impose the necessary restrictions. Note that this is still not bulletproof: since P4Runtime passes integers, it is still possible to pass a number outside of your desired range.
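
To make the counter example concrete, here is a toy Python model of an indirect counter whose index type is 10 bits wide but which has only 999 instances. The class and method names are invented for illustration and do not correspond to any architecture's API; the no-op behaviour for out-of-range indexes is the one described above as typical.

```python
class IndirectCounterModel:
    """Toy model of an indirect packet counter (not a real extern API)."""

    def __init__(self, index_width, num_instances):
        if num_instances > (1 << index_width):
            raise ValueError("more instances than the index type can address")
        self.index_width = index_width
        self.packets = [0] * num_instances

    def count(self, index):
        """Count one packet; indexes beyond the instance range are a no-op."""
        if not (0 <= index < (1 << self.index_width)):
            raise ValueError("index does not fit in the index type")
        if index < len(self.packets):
            self.packets[index] += 1


counter = IndirectCounterModel(index_width=10, num_instances=999)
counter.count(5)      # valid instance
counter.count(1000)   # fits in bit<10> but is >= 999, so it is ignored
print(counter.packets[5], sum(counter.packets))
```

The same pattern would apply to a type with fewer valid values than its bit width allows: the data plane carries all the bits, and values outside the valid range simply have no effect.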

jafingerhut · 2019-12-20T02:12:50Z

@konne-google Please ignore my comment above: "Vladimir has also proposed recently in the LDWG that one of those directions of casting should be unnecessary and happen automatically. I am not trusting my memory right now on which direction that is."

I was mixing that up in my memory with an actual proposal that has really been made, but has nothing to do with the type keyword in P4_16. It does have to do with serializable enum types in P4_16: #793

cc10512 · 2019-12-21T00:00:54Z

@konne-google Sorry to chime in so late. Let me see if I understand correctly what you need: a symbolic (string) value that you want to pass on the user/control-plane facing side of the programming, while still represented as integers in the data plane. Seems to me that you are really describing a serializable enum. As long as we can publish the symbolic names of the serializable values in the enum, you get the integer representation as well as the stringified name. And you don't need a new type. Am I missing something?

jafingerhut · 2019-12-21T01:38:41Z

@cc10512 Hopefully what I say here will be accurate, but @konne-google can correct me if I go off the rails.

I see at least two differences between this proposal and using serializable enums:

(1) With serializable enums, the P4Runtime API still passes integer values over the gRPC connection, not strings as this proposal would do. With serializable enums, the enum member name to integer mapping would be in the P4Info file for a P4 program, so it would be fairly straightforward for some debug-level tool to do the number->name mapping, when a name exists for a numeric value.

(2) With serializable enums, the name<->number mapping is fixed at the time of compiling the P4 program. With this proposal, the string value<->number mapping is dynamically created by the switch driver software / P4Runtime server software, and the set of names exists nowhere in the P4 source code. The set of names in use can change over time, without changing the P4 program.

I do not know if those differences are important enough to whoever will be making the decision on this proposal, but wanted to at least point out that these differences are there.
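
The debug-level number->name mapping mentioned in (1) could look roughly like the following. The table contents are hypothetical stand-ins for a compile-time enum-to-integer mapping and are not taken from any real P4Info file; the point is only that the table is fixed when the program is compiled and that some values may have no name.

```python
# Hypothetical member names and values, standing in for a compile-time
# enum-to-integer table; not taken from any real P4Info file.
STATIC_ENUM_NAMES = {0: "NEXTHOP_A", 1: "NEXTHOP_B"}


def describe(value):
    """Render a numeric value with its enum member name when one exists."""
    name = STATIC_ENUM_NAMES.get(value)
    return f"{value} ({name})" if name is not None else str(value)


print(describe(1))  # a value that has a name
print(describe(7))  # a value without a name is shown as a bare number
```

In contrast to (2), nothing in this table can change without recompiling the P4 program.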

ghost · 2019-12-22T00:17:52Z

I think the second part of Andy's response is what I care about the most. The P4 program doesn't know what neighbors, vrfs, etc. the control plane will have to create, so we can't enumerate them. That said, if `type` is indeed opaque, then we can use that without introducing anything new in the language, and just add an annotation that's used by P4RT to do the translation.


ghost · 2020-02-03T21:21:44Z

I'm closing this bug. No change to the language is needed, because type in principle already supports what we need. There are some follow-up bugs to make type more useful:

support type as exact key: Support exact match on values of type types p4c#2177
support type as action parameter: Support action parameter values of type types in P4Runtime p4runtime#263
add new match kind optional: New match kind: optional #794
add new match kind set: New match kind: set #795

ghost closed this as completed Feb 3, 2020
