Internet Draft -- Expires Nov. 20, 1992
PRELIMINARY DRAFT:
Pip: The `P' Internet Protocol
Paul F. Tsuchiya
Bellcore
tsuchiya@thumper.bellcore.com
May 19, 1992
Status
This document is an Internet Draft. Internet Drafts are working documents
of the Internet Engineering Task Force (IETF), its Areas, and its Working
Groups. Note that other groups may also distribute working documents as
Internet Drafts.
Internet Drafts are draft documents valid for a maximum of six months.
Internet Drafts may be updated, replaced, or obsoleted by other documents
at any time. It is not appropriate to use Internet Drafts as reference
material or to cite them other than as a "working draft" or "work in
progress."
Please check the I-D abstract listing contained in each Internet Draft
directory to learn the current status of this or any other Internet Draft.
Disclaimer:
This text version does not contain the figures from the postscript
version. As such, it is missing information essential to the
paper, and so it is strongly suggested that the postscript version
be read.
1.0 Purpose of this draft
Pip is an IP protocol that scales, encodes policy, and is high speed. The
purpose of this draft is to explain the basic concepts behind Pip so that
people can start thinking about potential pitfalls. I am proposing Pip as an
alternative to the two "medium term" proposals that emerged from the
Road (Routing and Addressing) group to deal with the dual IP problems
of scaling and address depletion. Because this proposal, which represents
new ideas, is competing with old (and therefore well thought-out) ideas, I
wish to circulate it (and get the process started) as quickly as possible,
albeit in not as complete a form as I would like. I expect to have a
complete proposal by the beginning of September. There will be a plenary
presentation and a BOF covering this material at the Boston meeting of
IETF.
2.0 Pip General
Pip has the following features:
1. Pip carries multiple address types in a common format. As such, it is
beneficial for transition from one address to another, and for future
evolution (of routing techniques as well as of addressing schemes).
2. The Pip address is completely general (multiple levels of hierarchy,
expands to any number of systems).
3. The Pip address is compact: it grows with the number of systems.
4. The Pip address efficiently encodes policy (source-based) routes, both
in "long form" (explicit path) and "short form" (path identifier).
5. Because the Pip address can be a path identifier (multi-layer if
desired, like the ATM VCI/VPI), Pip can be used in a connection-oriented
fashion (this paper only briefly touches on mechanisms for
controlling connections).
6. The Pip address includes multicasting (potentially substantially more
sophisticated than what is possible with IP multicast numbers; for
instance, hierarchical multicast).
7. Pip efficiently encodes QOS (Quality-of-Service) information.
8. The routing table lookup with Pip is well-bounded (by the depth of
the address hierarchy).
9. Pip accommodates "multiple defaults" routing from (multi-homed)
stub domains.
10. Pip allows intra-domain routing and hosts to operate with no notion
of the "inter-domain" parts of their address, if desired. This is
equivalent to current IP hosts and intra-domain routers not needing to
know their own network number.
11. Pip accommodates tunneling across transit domains.
12. By virtue of 8 and 9, Pip accommodates separation of interior and
exterior routing.
13. Pip simplifies handling mobile systems (by having flat network layer
identifiers).
In short, Pip is a "next generation" protocol, intended to allow the internet
to evolve over the foreseeable future.
One of the design philosophies behind Pip is that it encodes all "routing"
information (what is traditionally spread over the address and QOS fields)
in a single structure (the Routing Directive). The rules for parsing the
structure are simple on one hand, but provide a rich set of routing
functions. Therefore, it is possible to build a single forwarding engine that
will accommodate many different types of routing styles, including
traditional hierarchical addresses, policy, source route, and virtual circuit.
This way, the forwarding engine can be built in hardware and can remain
constant even while internet routing evolves.
Another design philosophy behind Pip is that it delays the definition of
how an internet packet should be composed and interpreted. The meaning of
addresses and QOS information is dynamically determined by
information in Directory Services, distributed protocols such as routing
protocols, and MIBs, rather than in a protocol specification. Current
internet protocols have continuously been moving towards this
philosophy, but with header formats that are not conducive to late
semantic definition. Pip facilitates late semantic definition of the internet
protocol header. This on one hand makes it easier to evolve the internet
incrementally, but requires that all systems (hosts, routers, and directory
servers) be a little smarter, and that algorithms be a little more complex.
This, in a nutshell, is the trade-off being made by Pip.
3.0 Transition Approach
Like IP, Pip by itself is nothing more than a header format and some rules
about how to forward the header. It is nothing without routing and
addressing and related algorithms behind it. But since Pip can encode the
semantics of existing internet headers (addresses, QOS, etc.), it can take
advantage of existing routing protocols and addressing schemes. This is
one of the main virtues of the proposal to move to CLNP [OSI2], namely that it
takes advantage of an existing body of work. However, Pip will allow us
to move forward into advanced features that CLNP will not handle, while
still allowing us to take advantage of existing work (although not as easily
as moving to CLNP will).
Since Pip can encode backbone-oriented "addresses" that are
semantically equivalent to NSAP addresses, transition to Pip will be
almost identical to the transition to CLNP already described by Callon
[Ref]. Once most of IP has disappeared (and therefore scaling and address
depletion are no longer concerns), we can evolve advanced features into
the internet (policy, mobility, flow control) without having to change the
internet protocol. (Of course not having to change the internet protocol
doesn't mean not having to change routers. But not having to change the
internet protocol is still better than having to change it, especially because
it facilitates piece-wise evolution).
In the following sections, I show how Pip works outside of the context of
interoperation with existing addressing and routing schemes.
4.0 Pip Header Structure
Figure 1 shows the Pip header structure. The Pip header has 5 parts not
found (at least in this form) in current internet protocols. They are the
Handling Directive (HD), the Tunnel, the Logical Router (LR), the
Routing Hints (RH), and the IDs. While these parts are fundamental to
Pip, the details of their layout, and the layout of other fields, are open
to change.
The IDs field contains flat (non-hierarchical) values that do nothing more
than identify the source and destination of a Pip packet. The Routing
Directive (RD), which consists of the Tunnel, the LR and the RH,
contains routing information. Either the Tunnel or the RH is used, but
not both. The RH holds routing information such as (hierarchical)
addressing, source-route (including policy), and virtual circuit
information. The Tunnel simply marks entry and exit points of a domain,
and is used to temporarily override the RH. The LR holds route-affecting
QOS information (such as routing metrics), plus various information
needed to make the RH operate properly. The HD holds
non-route-affecting QOS information, such as queueing directives,
congestion avoidance and control, and priority.
This packet structure better represents internet protocol functions than
traditional internet protocols. For instance, traditional internet protocols
combine the functions of identification and routing into the address fields.
Doing this generally limits the flexibility of the protocol. For instance,
host mobility is harder when the address combines these two functions.
Traditional internet protocols also split the routing function over multiple
fields (the address and the QOS fields). While this doesn't necessarily
limit functionality, it generally complicates the routing table lookup
function, or more accurately, it generally results in router
implementations that ignore the QOS fields, thus making it harder to add
QOS routing to the existing infrastructure.
Traditional internet protocols must use self-encapsulation in order to
tunnel through groups of routers. Pip has a specific field for this purpose,
thus eliminating the overhead of replicating the entire header.
No Pip header checksum is shown in Figure 1. I am undecided as to
whether or not one is necessary, particularly since the HD, Hop Count,
Tunnel, and RH fields will commonly change values from router to router.
In fact, of the first 5+ (32-bit) words, only the first word will potentially
not be modified. No fragmentation/reassembly fields are shown. I am
strongly inclined to leave these out, and just depend on dynamic
MaxPDU discovery to handle this. Finally, no version number field is
shown. Protocol identification (at the previous layer) can serve this
function.
The following sections cover the various parts of the Pip header in detail.
4.1 Boring Parts
The "boring" parts of the Pip header are the ID Type field (4 bits), the
Options Length field (4 bits), the Total Length field (24 bits), the
Protocol field (8 bits), and the Hop Count field (8 bits).
The ID Type describes the length and type of the Source and Destination
IDs. The IDs can be 0, 4, 6, or 8 octets each (the actual types, which are
not so boring, are described in the separate section on IDs below). The
Options Length field gives the number of 32-bit options that come after
the RD. The Total Length field gives the total length of the Pip packet,
including the Pip header, in octets. The maximum size Pip packet is 2^24 =
16,777,216 octets. This is substantially larger than the corresponding
fields in IP or CLNP, both of which allow for maximum packet sizes of
65536 octets. These fields comprise the first 32-bit word.
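As an illustration only, the first word can be packed and unpacked as in
the following Python sketch. The bit order (ID Type in the most
significant 4 bits, then Options Length, then Total Length) is an
assumption, since Figure 1 is not part of the text version, and the
function names are hypothetical.

```python
# Sketch of the first 32-bit word of the Pip header.
# ASSUMPTION: fields appear most-significant-first in the order
# ID Type (4 bits), Options Length (4 bits), Total Length (24 bits).
# The real order is defined by Figure 1, which the text version lacks.


def pack_first_word(id_type, options_len, total_len):
    """Combine the three fields into one 32-bit integer."""
    if not 0 <= id_type < (1 << 4):
        raise ValueError("ID Type must fit in 4 bits")
    if not 0 <= options_len < (1 << 4):
        raise ValueError("Options Length must fit in 4 bits")
    if not 0 <= total_len < (1 << 24):
        raise ValueError("Total Length must fit in 24 bits")
    return (id_type << 28) | (options_len << 24) | total_len


def unpack_first_word(word):
    """Split a 32-bit word into (ID Type, Options Length, Total Length)."""
    return (word >> 28) & 0xF, (word >> 24) & 0xF, word & 0xFFFFFF


if __name__ == "__main__":
    word = pack_first_word(id_type=0b0101, options_len=2, total_len=1500)
    print(hex(word), unpack_first_word(word))
```

Because the three fields fill exactly one word, a router can read all of
them with a single fetch.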
The Protocol field indicates the higher layer protocol, and is equivalent to
the IP Protocol field. The Hop Count field counts down the number of
hops before the packet should be dropped. It is the same size as the
corresponding fields in IP or CLNP, allowing for 256 hops. The Hop Count
field falls on a 32-bit (and 64-bit) boundary, making it convenient to
modify.
4.2 Tunnel and Routing Directive (RD)
The RD is the most novel and powerful aspect of Pip. The RD is general,
compact, and fast. It is general in that it can accommodate any address
type and any routing algorithm type, including source-based routing. It is
compact in that it encodes hierarchical addresses efficiently. And, it is fast
because 1) the number of steps required for the forwarding function is
small, even in the worst case, and 2) the same steps are used for
forwarding all types of routing, so an efficient and general forwarding
engine can be built.
The RD is composed of three parts, the Tunnel, the Logical Router (LR),
and the Routing Hints (RH).
Because a router can be playing multiple roles, Pip models a router as
multiple "Logical Routers". For instance, a router may be operating at
multiple levels of the hierarchy, may be participating in multiple routing
algorithms, including multicast, may be operating with multiple routing
metrics, and so on. While the function of logical routers is for most
purposes a feature, it is required to make the RH mechanism work
properly, as is described below.
The basic algorithm for finding a route is to 1) determine the forwarding
table index, 2) determine which forwarding table to use (that is, which
logical router is active for this packet), 3) index directly into the
forwarding table (no search technique such as hashing or tree search is
necessary) and retrieve the routing information, 4) modify the RD for the
next-hop router. This is explained in more detail below (see Section
4.2.3, Forwarding Algorithm).
Tunnel
The 32-bit Tunnel is composed of two 16-bit fields, the Source Exit ID
(SEI) and the Destination Exit ID (DEI). The DEI comes after the SEI,
and so falls on the least significant bits of a word boundary.
When the DEI is 0, the Tunnel is ignored and the RH is used to route
the packet. Otherwise, the RH is ignored and the Tunnel is used.
The purpose of the Tunnel is as follows. Consider two routers, X and Y,
both of which understand the RH (at the level at which the RH is
operating). Between X and Y are a series of routers that do not understand
the RH (at that level). Assume that a Pip packet (with a NULL Tunnel)
arrives at X and should be routed to Y. In order to get the packet to Y, X
fills the DEI field with a value that is understood by the intermediate
routers to mean "route to Y". X fills the SEI field with a value that is
understood by the intermediate routers to mean "route to X". The purpose
of the SEI field is to handle the case where a return packet (an error packet
or control packet of some sort) needs to be sent (either to X or to the
original source host). When Y receives the packet, it recognizes the
Tunnel as terminating at itself, writes the Tunnel field to 0, and forwards
based on the RH.
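The tunnel entry and exit steps described above can be sketched in
Python as follows. The field positions follow the text (SEI in the upper
16 bits, DEI in the lower 16 bits); the function names and the exit ID
values are hypothetical.

```python
# Sketch of the 32-bit Tunnel field. The text places the DEI on the least
# significant bits of the word, so the SEI occupies the upper 16 bits.
# Function names and example exit IDs are illustrative assumptions.


def pack_tunnel(sei, dei):
    """Build the Tunnel word from the Source and Destination Exit IDs."""
    return ((sei & 0xFFFF) << 16) | (dei & 0xFFFF)


def unpack_tunnel(tunnel):
    """Return (SEI, DEI) from a Tunnel word."""
    return (tunnel >> 16) & 0xFFFF, tunnel & 0xFFFF


def tunnel_active(tunnel):
    """A DEI of 0 means the Tunnel is ignored and the RH is used."""
    return (tunnel & 0xFFFF) != 0


def enter_tunnel(exit_id_of_x, exit_id_of_y):
    """Router X: SEI means 'route to X', DEI means 'route to Y'."""
    return pack_tunnel(exit_id_of_x, exit_id_of_y)


def leave_tunnel():
    """Router Y: write the Tunnel field to 0, then forward on the RH."""
    return 0


if __name__ == "__main__":
    tunnel = enter_tunnel(exit_id_of_x=7, exit_id_of_y=9)
    print(hex(tunnel), tunnel_active(tunnel), tunnel_active(leave_tunnel()))
```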
Tunneling is traditionally useful for preventing external routing
information from being required internally. It is also used by the ISIS
routing protocol for repairing area partitions. Pip tunneling can be used
for both of these purposes. Because of the way "addresses" (called RH
Numbers in Pip) are assigned in Pip, however, tunneling turns out to be
necessary just to make Pip work.
There are no nested tunnels in Pip (that is, tunnels cannot have tunnels).
While nested tunnels could be of some use, it seems that the usefulness of
tunneling diminishes with the number of nested levels. By having only
one level of tunneling, the packet format is simplified (and the size kept
small). To make nested tunneling work, it would be necessary to either
modify the size of the packet en route (to add and delete tunnels), or for
the originating host to put in enough Tunnel fields for the deepest nesting.
The former case is difficult because it requires changing the packet size,
which doesn't work for instance with (cut-through) ATM switching. The
latter requires extra complexity and overhead in informing the originating
host how many Tunnel fields to include in the packet. For these reasons, I
have chosen to limit tunneling to one level.
4.2.1 Logical Router (LR)
As described above, the LR field indicates which of multiple forwarding
tables should be used when routing a packet. The many uses of the LR
will become clear throughout the coming examples.
Note that in theory one can always use different indexing values, rather
than different forwarding tables, as a means of distinguishing logical
routers. This, however, couples "addressing" (RH numbering) between
different logical domains, thus generally complicating things. For
instance, one could use different RH values to indicate different QOSs
(cost, delay, etc.), but that would require that each system have an RH
Number indicating cost, another indicating delay, and so on. So, unless
such coupling is convenient, it is best to decouple RH numbering using
the LR field.
Even though the LR field can be treated as a flat field by a router, the
individual bits have specific meaning. My goal is that most or all of the
bits' meaning be determined dynamically (via system management or the
routing protocol or some other distributed protocol), and not be specified
in a standards document. This allows for the maximum flexibility in
evolving the protocol (adding new features, purging old ones). For
instance, upon booting, a host should, as part of its configuration process,
contact a local router and learn the meaning of each bit of the LR field. A
network debugger, even, could query attached routers for these
definitions, so that meaningful information could be logged and
displayed.
The following bits are likely to be required:
1. Level. This indicates what level of a hierarchical RH Number is being
routed on at a given time. This use of the LR field is only necessary if
hierarchical RH Numbers are being used.
2. Multicast. If multicast is used, at least one bit may be needed to
indicate whether the packet should be multicast or unicast. If several
multicast algorithms are in use, multiple bits may be needed.
3. Route-affecting QOS. This would be any QOS type that influences
the route chosen, such as cost or high-bandwidth. Note that QOS need
not be route-affecting. For instance, a QOS type of low delay might
only influence how packets are queued (given priority in the queue),
but not influence how they are routed. In this case, the HD would
have certain bits set aside for "low delay" (actually, priority
queueing), but the LR would not. In other cases, a given QOS might
affect both routing and handling.
4.2.2 Routing Hints (RH)
The RH is the most interesting and novel aspect of Pip. It holds what is
normally thought of as the "address" in a traditional internet header. It can
also hold many other kinds of routing information, such as policy
information.
The RH consists of the RH Descriptor and the Routing Hint Fields (RHF,
see Figure 2). The RH Descriptor tells how to interpret the RHFs. The
RHFs are a series of fields, listed in the order that they will be required by
the routers in the path from source to destination. This should not be taken
to assume that the RHFs necessarily specify a source route, in some
conventional sense of the term. Most normally, the RHFs will simply
contain a hierarchical source and destination RH Number, where each
RHF denotes one level of the hierarchical RH Number. This and other
uses of the RHFs (such as virtual circuit or path identifiers, true source
routes, and Sirpent- or Paris-style source routes) are given later.
Each pair of RHFs is separated by an RHF Relator (RHFR). The RHFR
is a two-bit field that shows the relationship between the field before it
and the field after. It has three values, up, down, and none. If down, the
previous RHF is hierarchically above the subsequent RHF. If up, the
previous RHF is hierarchically below the subsequent RHF. If none, the
two RHFs are not hierarchically related.
The RH Descriptor and RH are parsed as follows. The 6-bit RHF Offset
field determines which RHF is currently active. The RHF Length field
indicates the size of each RHF (all of which are the same length). The
RHF sizes represented by each RHF Length value are given in the
following table:
After this is a series of 1 or more RHFs. Where the actual values needed
in the RHFs vary greatly (some small, some large), this structure will
result in a larger RH than seems necessary. I don't know how to shrink
each RHF to its smallest size and still make the header parsing simple
(and therefore fast).
After the RHFs comes enough padding to make the RD fall on a 32-bit
word boundary.
The combined 10-bit RHF Offset/RHF Length, then, is used to isolate the
current RHF that a router should be routing on. A typical implementation
on a common CPU/RAM processor would be to use the full 10 bits as a
direct index into an array of size 1024, each entry of which contains data
on how to isolate the current field. For instance, if RHF Offset = 3 and
RHF Length = 8 (meaning each RHF/RHFR is 14 bits long), the data
would instruct the processor to fetch the first (32-bit) word of the RH,
shift left 10, mask with 0x00003c00, fetch the second word, shift right 22,
mask with 0x000003ff, and OR the two results. In this example, the RHF/
RHFR straddled 32-bit word boundaries, and so two fetches were needed.
(The RHF Relator should also be saved off at this time to be used later.)
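The field isolation in this example can be sketched in Python. The
sketch rests on one reading of the text, which is an assumption because
Figure 2 and the RHF Length table are absent from the text version: the
RHF/RHFR fields are packed most-significant-bit first starting at the
first word of the RH, and RHF Offset = 3 selects the third 14-bit field,
which begins at bit 28. Under that reading the shifts and masks come out
exactly as stated above. Splitting the 14 bits into a 12-bit RHF followed
by a 2-bit RHFR is likewise assumed.

```python
# Sketch: isolate one fixed-width field from an array of 32-bit words.
# ASSUMPTIONS: fields are packed MSB-first with no gaps, field_index is
# zero-based (RHF Offset = 3 in the text is taken to mean index 2), and
# each 14-bit unit is a 12-bit RHF followed by a 2-bit RHFR.

WORD_BITS = 32


def isolate_field(rh_words, field_index, field_bits):
    """Return the field_bits-wide field at zero-based field_index."""
    start = field_index * field_bits
    word_i, bit_in_word = divmod(start, WORD_BITS)
    end = bit_in_word + field_bits
    mask = (1 << field_bits) - 1
    if end <= WORD_BITS:
        # The field lies inside one word: a single fetch, shift, and mask.
        return (rh_words[word_i] >> (WORD_BITS - end)) & mask
    # The field straddles two words: two fetches, then OR the pieces.
    spill = end - WORD_BITS                      # bits in the second word
    low_mask = (1 << spill) - 1
    high = (rh_words[word_i] << spill) & (mask & ~low_mask)
    low = (rh_words[word_i + 1] >> (WORD_BITS - spill)) & low_mask
    return high | low


def split_rhf_rhfr(unit):
    """Split a 14-bit unit into (RHF, RHFR); the order is an assumption."""
    return unit >> 2, unit & 0b11


if __name__ == "__main__":
    words = [0x0000000A, 0xFFC00000]
    unit = isolate_field(words, field_index=2, field_bits=14)
    print(hex(unit), split_rhf_rhfr(unit))
```

For index 2 and 14-bit fields, the straddling branch reduces to "shift
left 10, mask with 0x00003c00" on the first word and "shift right 22,
mask with 0x000003ff" on the second, matching the text. A table-driven
implementation would precompute these shift and mask values for each of
the 1024 RHF Offset/RHF Length combinations.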
Once the RHF is isolated, it is used as a direct index into a forwarding
table. The forwarding table can be well populated because (as is discussed
later in this paper) the RHF values are chosen not based on how many
things might have to be encoded at a given level of the hierarchy, but on
how many things are actually encoded at a given level. In other words,
the "address" that is ultimately carried in packets is, unlike current
internet protocol addresses, well-utilized.
In addition to the information in the forwarding table described above, the
forwarding table entry must also indicate whether the RHF Offset needs
to be decremented. The RHF Offset is usually decremented when a packet
crosses a hierarchical boundary. For instance, if the packet was being
forwarded based on the equivalent of "network number" through a
backbone, the router bordering the indicated network would decrement
the RHF Offset so that the next router (the router in the indicated
network) would automatically look at the "subnet number" field. Often a
single router is acting at two or more levels of the hierarchy, for instance a
level 2 router in the ISIS routing protocol. In this case, the forwarding
table entry and RHFR would indicate that, instead of routing the packet to
another router, the next RHF should also be examined (and, another
forwarding table used). It would be unusual to find a router operating at
more than three levels of the hierarchy. Further, address hierarchies are
shallow. Telephone numbers in the USA have only 4 levels of hierarchy
(including the international code). Therefore, the number of iterations of
this search is well-bounded.
Note that this "field indexing" style of lookup is not just a cute
optimization. Pip derives most of its routing flexibility from it, and
wouldn't be general without it.
4.2.3 Forwarding Algorithm
This section describes the algorithm for forwarding a packet, based on the
contents of the Tunnel and the RD (see Figure 3). For expository reasons,
the unicast algorithm is defined, followed by the modifications needed for
multicast. This same algorithm is used no matter what kind of routing
algorithm is being used (hierarchical, policy, source, virtual circuit).
Getting the appropriate behavior, according to the routing algorithm used,
requires configuring the tables shown in Figure 3 correctly.
1. If the Tunnel Field is not 0, index into the Tunnel Table using the
value in the Tunnel Field, and go to step 2. Otherwise (the Tunnel Field
is 0), index into the Logical Router Table (LR Table) with the value in
the LR Field, and go to step 3.
2. If the Information column contains forwarding info, then modify the
Tunnel Field value according to the instructions in the Information
column, and forward the packet. Otherwise, if it contains a pointer to
the LR Table, set the Tunnel Field to 0 and go to step 1. Otherwise, if
it contains a pointer to a forwarding table, then go to step 4.
3. If the Information column contains forwarding info, then modify the
LR Field and Tunnel Field values according to the instructions in the
Information column, and forward the packet accordingly. Otherwise,
if it contains a pointer to a forwarding table, then go to step 4.
4. Using the RH Descriptor (RHF Offset/RHF Length), isolate the
correct RHF and RHFR. Using the RHF, index into the correct
forwarding table (determined by the pointer in the previous step). If the
Information column contains forwarding info, then modify the RHF
Offset field, the value of the isolated RHF, the Tunnel Field, and the LR
Field value according to the instructions in the Information column,
and forward the packet accordingly. Otherwise, if it contains a pointer
to another forwarding table, modify the isolated RHF field value
according to the instructions in the Information column, and repeat step
4 (using the new forwarding table).
If tunneling is being used, and the router receiving the Pip packet is not
the last router of the tunnel, then the router will find the forwarding
information in the Tunnel Table, and not index any other tables. If the
router is the last router of the tunnel, and the Tunnel Field has not been set
to zero by the previous router, then the router will find a pointer in the
Tunnel Table, and forward according to the RH.
If tunneling is not being used, the router receiving the packet will
normally find a pointer in the Logical Router Table. When a router finds a
pointer in a forwarding table (thus pointing it to another forwarding
table), it is normally the result of "routing down the hierarchy". That is,
the router is operating at multiple levels of the hierarchy, and is parsing
the hierarchical RH Number.
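The unicast steps above can be sketched in Python. This is a simplified
model under stated assumptions, not the layout of Figure 3 (which the
text version lacks): tables are dictionaries, the RH is a list of
already-isolated RHF values with "offset" naming the active one, RHF
value rewriting is omitted, and the entry types Forward, ToLRTable, and
ToTable are hypothetical names. Moving the offset when a pointer is
followed is also an assumed way of modelling "routing down the
hierarchy".

```python
# Simplified sketch of the unicast forwarding steps. All class names,
# table shapes, and the packet representation are illustrative assumptions.
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Forward:
    """Forwarding info: where to send the packet and how to edit the RD."""
    next_hop: str
    new_tunnel: Optional[int] = None   # None means "leave unchanged"
    new_lr: Optional[int] = None
    offset_delta: int = 0              # e.g. -1 when crossing a boundary


@dataclass
class ToLRTable:
    """Pointer from the Tunnel Table to the LR Table."""


@dataclass
class ToTable:
    """Pointer to a forwarding table indexed by the active RHF."""
    table: Dict[int, object]
    offset_delta: int = 0              # applied before indexing `table`


def forward(packet, tunnel_table, lr_table):
    """Return the next hop for `packet`, editing its RD fields in place."""
    # Step 1: a non-zero Tunnel selects the Tunnel Table, else the LR Table.
    if packet["tunnel"] != 0:
        entry = tunnel_table[packet["tunnel"]]
        if isinstance(entry, ToLRTable):
            # Step 2: the tunnel ends here; clear it and redo step 1.
            packet["tunnel"] = 0
            entry = lr_table[packet["lr"]]
    else:
        entry = lr_table[packet["lr"]]

    # Step 4, repeated: follow pointers through the forwarding tables.
    while isinstance(entry, ToTable):
        packet["offset"] += entry.offset_delta
        active_rhf = packet["rhfs"][packet["offset"]]
        entry = entry.table[active_rhf]

    # Forwarding info found (step 2, 3, or 4): edit the RD and send.
    if entry.new_tunnel is not None:
        packet["tunnel"] = entry.new_tunnel
    if entry.new_lr is not None:
        packet["lr"] = entry.new_lr
    packet["offset"] += entry.offset_delta
    return entry.next_hop


if __name__ == "__main__":
    subnet_table = {3: Forward("if-2")}
    network_table = {7: ToTable(subnet_table, offset_delta=-1)}
    lr_table = {0: ToTable(network_table)}
    tunnel_table = {5: Forward("if-3"), 6: ToLRTable()}
    pkt = {"tunnel": 0, "lr": 0, "rhfs": [3, 7], "offset": 1}
    print(forward(pkt, tunnel_table, lr_table), pkt)
```

In the usage example the router acts at two levels of the hierarchy: the
LR Table points to a "network" table, whose entry for 7 points in turn to
a "subnet" table, so two lookups are made before the packet is sent.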
Section 5 gives examples of the algorithm described above.
Multicast Algorithm
For multicast, the tables in Figure 3 are modified such that the
Information column in each table contains a set of information blocks,
each one being a pointer or forwarding info. When there are multiple
forwarding info blocks (either in the same table entry, or by virtue of
multiple pointers reaching multiple tables), then multiple packets are
transmitted. Each packet may have the Tunnel or RD fields modified
differently, so each information block contains these instructions.
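A minimal Python sketch of the multicast variant follows. It assumes a
simplified representation in which each table entry is a list of
information blocks, written here as tuples ("fwd", next_hop, edits) or
("ptr", table); these shapes and names are hypothetical. For brevity the
sketch uses the same active RHF for every pointer it follows.

```python
# Sketch of multicast forwarding: every information block in an entry is
# processed, and each forwarding block yields its own copy of the packet.
# The tuple-based block format is an illustrative assumption.
import copy


def forward_multicast(packet, blocks):
    """Return a list of (next_hop, packet_copy) pairs, one per fwd block."""
    out = []
    for block in blocks:
        if block[0] == "fwd":
            _, next_hop, edits = block
            clone = copy.deepcopy(packet)   # each copy is edited separately
            clone.update(edits)             # per-block Tunnel/RD changes
            out.append((next_hop, clone))
        else:
            _, table = block
            active_rhf = packet["rhfs"][packet["offset"]]
            out.extend(forward_multicast(packet, table[active_rhf]))
    return out


if __name__ == "__main__":
    table = {4: [("fwd", "if-1", {"lr": 1}), ("fwd", "if-2", {})]}
    blocks = [("fwd", "if-0", {}), ("ptr", table)]
    pkt = {"tunnel": 0, "lr": 0, "rhfs": [4], "offset": 0}
    for hop, copy_of_pkt in forward_multicast(pkt, blocks):
        print(hop, copy_of_pkt)
```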
4.3 Handling Directive (HD)
The HD is something of a catch-all field for any packet handling
mechanisms that don't influence the route taken by a packet. Typical
handling types would be queueing directives, such as priority queueing,
security directives, such as encryption, and so on.
The meaning of the specific bits is meant to be handled in the same way as
the LR; that is, the meaning of the bits is defined dynamically through
system management or configuration protocols, not through hard-coded
definition in a standards document.
Each domain autonomously determines what meaning is assigned to each
bit. When different domains use different bits for the same purpose, the
value of the HD must be modified when a packet crosses domain borders
so that the next domain may correctly interpret the meaning of the HD.
The border router determines the proper translation via protocol exchange
with the neighboring domain or via system management.
Packing all of the handling bits together makes possible an
implementation style in which the HD is used as a direct index into RAM,
thus retrieving the appropriate handling mechanisms and values.
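The border translation and the direct-index style can be illustrated
together in Python. The bit assignments for the two domains, the 2-bit HD
width, and the function names are all hypothetical; the draft leaves
these to be defined dynamically.

```python
# Sketch of HD translation at a domain border, plus a precomputed table
# that uses the HD value as a direct index. Bit positions, handling names,
# and the HD width are made-up values for illustration only.

DOMAIN_A_BITS = {"priority": 0, "encrypt": 1}   # handling name -> bit position
DOMAIN_B_BITS = {"priority": 3, "encrypt": 0}


def translate_hd(hd, src_bits, dst_bits):
    """Re-express each handling bit set in `hd` using dst_bits positions."""
    out = 0
    for name, src_pos in src_bits.items():
        if hd & (1 << src_pos) and name in dst_bits:
            out |= 1 << dst_bits[name]
    return out


def build_translation_table(src_bits, dst_bits, width):
    """Precompute the translation for every HD value of `width` bits."""
    return [translate_hd(hd, src_bits, dst_bits) for hd in range(1 << width)]


if __name__ == "__main__":
    table = build_translation_table(DOMAIN_A_BITS, DOMAIN_B_BITS, width=2)
    incoming_hd = 0b11                  # priority and encrypt set in domain A
    print(bin(table[incoming_hd]))      # one array lookup at the border
```

Precomputing the table moves the per-bit work out of the forwarding
path, so the border router translates each packet with one indexed read.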
This paper does not further discuss the HD. Most notably, it does not
discuss how a dynamic routing protocol would propagate HD
information.
4.4 IDs
When an ID is present, it alone is used to identify the source and
destination hosts. However, IDs can be mapped to the associated RH, so
that the RH implies a certain ID. The ID therefore need not be carried in
most packets. This works as follows. When a packet is first sent from a
source host X to a destination host Y, the ID is included. The destination
host Y, upon receiving the packet, associates the source ID with the
"Source RH Number". These are the RHFs that describe the "source
address" of the source host (see example 1). When Y returns a packet to
X, it writes X's ID in the destination ID field, and X's Source RH Number
in the RH (as the Destination RH Number). This indicates to X that Y has
recorded the mapping between X's source RHs and X's ID, and
subsequent packets from X that contain the same source RH need not
include the ID field.
If the host is mobile, and changes RH Numbers while communicating
with another host, then it includes the ID when it uses a new RH Number.
This lets the destination host associate another Source RH Number with
the ID, so that subsequent packets can again leave the ID off. An out-of-
band message can be used to de-associate no-longer-valid RH Numbers.
(If both hosts are mobile, then some kind of third party server will be
necessary, so that current RH Numbers can be determined, in case both
hosts get new RH Numbers simultaneously.) If the hosts get new RH
Numbers often, then the ID can simply be included in every packet.
The ID Type field is interpreted as follows. The first two bits indicate the
type (and length) of the source ID, and the second two bits indicate the
type of the destination ID. The meanings of the four values are: 0 = no IDs;
1 = 32-bit IP number; 2 = 48-bit IEEE 802 number; 3 = 64-bit number.
The 64-bit number can have multiple interpretations, including X.121
number, E.164 number, and so on. While the ID field never influences
routing, the IP-type ID can be used during transition from IP to Pip to
determine how to fill in parts of the RD as the packet traverses the
internet.
The ID field is padded out to a 32-bit boundary. It may make sense to pad
out to a 64-bit boundary, given the introduction of processors with
64-bit words.
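A minimal sketch of decoding the ID Type field and computing the padded
ID field length follows. It assumes that "first two bits" means the two
most significant bits of the 4-bit field, and that padding applies to
the combined source and destination IDs; neither point is stated
explicitly above.

```python
# Bit lengths for the four ID type codes listed in the text.
ID_BITS = {0: 0, 1: 32, 2: 48, 3: 64}


def decode_id_type(id_type):
    """Split the 4-bit ID Type into (source type, destination type).

    Assumption: the source type occupies the two most significant bits.
    """
    return (id_type >> 2) & 0b11, id_type & 0b11


def id_field_bits(id_type, boundary=32):
    """Length of the ID field in bits, padded up to the given boundary."""
    src, dst = decode_id_type(id_type)
    raw = ID_BITS[src] + ID_BITS[dst]
    return -(-raw // boundary) * boundary   # ceiling division
```

For example, a 32-bit source ID with a 48-bit destination ID needs 80
bits, which pads to 96 bits on a 32-bit boundary and to 128 bits on a
64-bit boundary.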
4.5 Options
No options are defined at this time. In the future there might be options to
establish virtual paths in lieu of policy routes, reserve bandwidth, manage
mobile hosts, manage multicast lists, or whatever. In general, I would
assume that, if options are present, the packet leaves the normal
forwarding code (or hardware) path for special (and slower) processing.
Options are not further discussed in this paper.
4.6 Messages
Pip requires the following "ICMP"-type messages:
    Use/don't use tunneling message
    Incorrect RH message (usually means not enough levels of RH Number given)
    Max PDU exceeded notification
    Received ID incorrect (used to flush old RH Number from sending host)
    Normal redirect
    Tunnel redirect
    ARP
The use of these messages is explained by the following examples.
5.0 Examples
Following are descriptions of how various routing and addressing styles
are used with Pip. These will further explain the use of the RD.
5.1 Example 1: IP-style Hierarchical RH Numbers (Addresses)
The examples in this section are primarily for the purpose of introducing
the various concepts of Pip, particularly the RD. None of the examples
gives the complete algorithm, but they get successively more complex and
complete. Later examples (Examples 2 and on) will be complete.
Consider the network of Figure 4. The RH Numbers shown correspond to
IP-style addressing.
The Pip analogue to existing IP and CLNP addressing styles is
hierarchical RH Numbers. When plain hierarchical RH Numbers (plain
means with no QOS or policy information) are used, the RHFs (and
RHFRs) are structured as shown in Figure 5. The first group of RHFs are
called the "Source RHFs". These are separated by "up" RHFRs, and are
roughly equivalent to the source address in a traditional IP packet. The
second (and last) group of RHFs are called the "Destination RHFs".
These are separated by "down" RHFRs, and are roughly equivalent to the
destination address in a traditional IP packet.
The Source RHFs are listed in order of lowest level of the hierarchy first.
That is, this field will come in on the wire first. The Destination RHFs are
listed in order of highest level of the hierarchy first. Note that this is the
order in which the fields (specifically the Destination RHFs in this case)
will be used by routers. The RHFR between the source and destination
RH Number indicates "none".
5.1.1 Example 1.1: No tunneling, no default routing.
Assume that no tunneling is needed, and that default routing is not being
used. In other words, the forwarding tables of the routers within the
network have network numbers for other networks. The Tunnel Table for
router x consists of one entry, indicating that all non-zero tunnel values
are invalid. If a Pip packet with a non-zero Tunnel was received, the
"Don't use tunneling" message would be sent to the sender.
The LR table for router x is as follows:
LR table = [ <LR.level=3, use FT3> <LR.level=2, use FT2>
<LR.level=1, ambiguous> ]
For these examples, the only information in the LR Table is that
concerning the hierarchical level at which the packet is operating. Since
the bits denoting this do not necessarily need to be in the least significant
positions of the LR Field, the "LR.level=X" notation implies the index
into the LR table.
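To make the "LR.level=X" notation concrete, the sketch below extracts
the level bits from an LR field and uses them to index router x's LR
table. The bit position and width of the level within the LR field are
hypothetical; the text says only that they need not be the least
significant bits.

```python
# Hypothetical layout: assume the level occupies bits 4-5 of the LR field.
LEVEL_SHIFT = 4
LEVEL_MASK = 0b11

# Router x's LR table from this example, keyed by hierarchical level.
LR_TABLE = {3: "FT3", 2: "FT2", 1: "ambiguous"}


def lr_level(lr_field):
    """Isolate the hierarchical level bits from the LR field."""
    return (lr_field >> LEVEL_SHIFT) & LEVEL_MASK


def lr_lookup(lr_field):
    """Index the LR table by level; unknown levels are an error."""
    return LR_TABLE.get(lr_level(lr_field), "error")
```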
The reason the LR.level=1 is ambiguous is that router x is attached to two
level 1 areas (subnets), and therefore wouldn't know which level 1 table
(FT1a or FT1b) to use. As seen from x's forwarding tables below, FT2
must first be indexed to determine whether FT1a or FT1b should be used.
The forwarding tables for router x are as follows:
These tables are simplified in that they do not show, for pedagogical
reasons, information relating to the RHF Relators. This will be shown in
later examples.
Example 1.1a: From 2.2.1 to 2.2.2
First consider a packet from 2.2.1 to 2.2.2. Host 2.2.1 would initially
make a directory service query and get back an RH Number in the
following form: <level 3 = 2; level 2 = 2; level 1 = 2>. By comparing its
own RH Number with that for the destination, 2.2.1 would conclude that
they share the same level 3 and level 2 (that is, are in the same network
and subnet). 2.2.1 would then compose the following RD:
RD = < Tunnel = 0; LR.level = 1; RHF Offset = 2; RH = 1 (none) 2 >,
where "LR.level" indicates the bits in the LR field indicating the
hierarchical level, and "RH = 1 (none) 2" means that the first RHF is
value 1, the second RHF is value 2, and the RHFR between them is
"none".
The source knows to set Tunnel = 0 because of a local parameter
indicating that tunneling is not in effect. Normally, a host will assume that
tunneling is not in effect unless told otherwise (either by a configuration
message or by a "Use tunneling" error message).
The source host initially sets LR.level = 1 because that is the highest
uncommon level between source and dest (and therefore a level at which
routing must take place). The RH contains the level 1 value from the
source (1) followed by the level 1 value from the destination (2). Because
the host is setting the LR.level to 1, the host doesn't have to include any
RH Number components higher than that in the RH. Since neither value is
hierarchically above the other, the RHFR is set to "none". Finally, the
RHF Offset is set to point to the beginning of the Destination RHF of the
RH (value 2). In all examples, the RHF being pointed to by the RHF
Offset will be printed in bold type.
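The rules just described for choosing LR.level, filling in the RH, and
setting the RHF Offset can be sketched as follows. The function name,
the dictionary layout, and the min_level parameter are illustrative
assumptions; the sketch also assumes that source and destination RH
Numbers have the same number of levels.

```python
def compose_rd(src, dst, tunnel=0, min_level=1):
    """Build an RD for plain hierarchical RH Numbers.

    src and dst are tuples listed highest level first, e.g. (2, 2, 1).
    min_level is a hypothetical knob for a first-hop router that cannot
    accept packets below a given level.
    """
    depth = len(src)
    # Count the leading (highest-level) components the two numbers share.
    common = 0
    while common < depth - 1 and src[common] == dst[common]:
        common += 1
    level = max(depth - common, min_level)

    src_part = list(reversed(src[depth - level:]))  # lowest level first
    dst_part = list(dst[depth - level:])            # highest level first

    rh = []
    for i, value in enumerate(src_part):
        rh.append(value)
        rh.append("up" if i < level - 1 else "none")
    for i, value in enumerate(dst_part):
        rh.append(value)
        if i < level - 1:
            rh.append("down")

    return {"tunnel": tunnel, "lr_level": level,
            "rhf_offset": level + 1,  # 1-based index of first Destination RHF
            "rh": rh}
```

Called with (2, 2, 1) and (2, 2, 2), this yields LR.level = 1, RHF
Offset = 2, and RH = 1 (none) 2, matching the RD above.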
If the host knew that strict subnet-per-LAN IP-style RH Numbering were
being used, it could deduce that the destination host is on the same LAN
as itself, and ARP for the destination. But assuming that the source host
doesn't know this, the source host would send the packet to its "default"
router, which is x.
When router x receives the packet, it goes into the LR table with
LR.level=1, and determines that the LR is ambiguous in this case. It
therefore sends an "LR ambiguous" message to the host. The host would
label router x as being ambiguous at level 1, so that future packets (even
to different destinations) would start at level 2. Normally, a configuration
message from router x (as part of router discovery) would have prevented
the need for the error message.
The host composes another RD, this time with level 2 included:
RD = < Tunnel = 0; LR.level = 2; RHF Offset = 3; RH = 1 (up) 2
(none) 2 (down) 2>.
Now, the bottom two levels of the source RH Number (2.1) occupy the
first two RHFs (but in reverse order), and the bottom two levels of the
destination RH Number (2.2) occupy the last two RHFs.
When router x received this packet, it would index the LR table with
LR.level=2, and determine that forwarding table FT2 should be used.
Using the RHF Offset, router x would isolate the third RHF (value 2)
from the RH. Router x would index 2 into forwarding table FT2, and
retrieve a result indicating that it needs to move to level 1, using
forwarding table FT1a. Router x would increment RHF Offset, isolate the
fourth RHF (value 2) from the RH, use this as an index into FT1a, and
determine that the destination is on subnet 2.2. It would then use an ARP
function to discover the LAN RH Number of 2.2.2.
Router x would also redirect host 2.2.1. After the redirect, packets from
2.2.1 would go directly to 2.2.2, and would use an RH with only level 1.
To form a return packet, 2.2.2 would reverse the order of the RHFs, and
calculate the values of LR.level and RHF Offset similarly to the way that
2.2.1 calculated them. As such, 2.2.2 would copy the level of the
incoming packet into the return packet.
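One way to picture the construction of the return RD is sketched below.
The text says only that the RHFs are reversed; mirroring the relators
("up" becomes "down" and vice versa) and deriving the level from the
number of RHFs are assumptions made so that the result has the same
shape as an RD composed from scratch.

```python
SWAP = {"up": "down", "down": "up", "none": "none"}


def reverse_rh(rh):
    """Reverse an RH given as [RHF, relator, RHF, ...].

    Assumption: relators are mirrored along with the RHFs, so that the
    replier's own components end up joined by "up" relators.
    """
    return [SWAP[item] if isinstance(item, str) else item
            for item in reversed(rh)]


def return_rd(rh, tunnel=0):
    """Build the RD for a reply to a packet that carried the given RH.

    Assumption: the RH holds equally many source and destination RHFs.
    """
    n_fields = (len(rh) + 1) // 2
    level = n_fields // 2
    return {"tunnel": tunnel, "lr_level": level,
            "rhf_offset": level + 1, "rh": reverse_rh(rh)}
```

For the level 2 packet above, the reply would carry RH = 2 (up) 2
(none) 2 (down) 1 with LR.level = 2 and RHF Offset = 3.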
Note that the RH for level 1 packets (after the redirect) would only be 1
word long. Only putting as much of the RH Number in the RH as needed
is one reason that Pip is compact. Since most traffic is local, most packets
will be able to take advantage of this particular optimization.
Example 1.1b: From 2.2.1 to 2.1.3
For a packet from 2.2.1 to 2.1.3, the directory service query would return
<level 3 = 2; level 2 = 1; level 1 = 3>. By comparing its own RH Number
with that for the destination, 2.2.1 would conclude that they share the
same level 3 (network), but not the same level 2 or level 1. 2.2.1 would
then compose the following RD:
RD = < Tunnel = 0; LR.level = 2; RHF Offset = 3; RH = 1 (up) 2
(none) 1 (down) 3>.
The bottom two levels of the source RH Number (2.1) occupy the first
two RHFs (but in reverse order), and the bottom two levels of the
destination RH Number (1.3) occupy the last two RHFs. When router x
receives this packet, it would parse the packet as described above, go into
FT2 with index 1, then go into FT1b with index 3, and route the packet to
subnet 2.1.
Example 1.1c: From 2.2.1 to 1.5.11
For a packet from 2.2.1 to 1.5.11 (a host in Net 1), host 2.2.1 would
determine that there is no common level, and so would form an RD
starting at level 3:
RD = <Tunnel = 0; LR.level = 3; RHF Offset = 4; RH = 1 (up) 2 (up)
2 (none) 1 (down) 5 (down) 11>.
The full three levels of the source address (2.2.1) occupy the first three
RHFs (but in reverse order), and the full three levels of the destination
address (1.5.11) occupy the last three RHFs. When router x receives this
packet, it would go to forwarding table FT3 (based on the LR.level of 3)
with an index of 1, and forward the packet to router z without
incrementing the RHF Offset or changing the LR.level.
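Router x's handling of Examples 1.1a through 1.1c can be sketched as a
table walk. The table contents below are inferred from the prose of
those examples only; the actual forwarding tables may hold more entries,
and the tuple layout is an illustrative assumption.

```python
# Entries inferred from the walk-throughs in Examples 1.1a-1.1c.
LR_TABLE = {3: "FT3", 2: "FT2", 1: "ambiguous"}

FT = {
    "FT3": {1: ("forward", "router z")},
    "FT2": {2: ("descend", "FT1a"), 1: ("descend", "FT1b")},
    "FT1a": {2: ("deliver", "subnet 2.2")},
    "FT1b": {3: ("deliver", "subnet 2.1")},
}


def rhfs(rh):
    """Strip relators: RHFs sit at the even positions of the RH list."""
    return rh[0::2]


def forward(rd):
    """Return (action, target) for an RD arriving at router x."""
    table = LR_TABLE[rd["lr_level"]]
    if table == "ambiguous":
        return ("error", "LR ambiguous")
    fields = rhfs(rd["rh"])
    offset = rd["rhf_offset"]              # 1-based, as in the text
    while True:
        action, target = FT[table][fields[offset - 1]]
        if action != "descend":
            return (action, target)
        table = target                     # move down one level
        offset += 1                        # and on to the next RHF
```

The "descend" action models the step in which router x moves from FT2
to FT1a or FT1b and increments the RHF Offset.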
5.1.2 Example 1.2: With default routing, no tunneling
In the previous examples, the level 3 table (FT3), at least in the IP case,
would be very large, because it must hold all active network numbers.
One way to reduce forwarding table size in general is to use default
routing. With current IP networks, default routing works best if there is
only one exit point: since there is only one path out of a private
network, default routing doesn't degrade the quality of paths found. If
default routing to multiple exits is used, then sometimes a non-optimal
exit point can be chosen.
With Pip, tunneling would normally be used to handle default routing
with multiple exits. For pedagogical purposes, we give an example here
where default routing is used without tunneling (again from the network
of Figure 4). The level 1 and 2 forwarding tables for router x (FT1a,
FT1b, and FT2) are the same as for Example 1.1. The forwarding table for
level 3 (FT3), however, has a single entry of:
FT3 (level 3, tunnel=0) = [ *, y, 3 ],
where * means all possible index values, y means next hop router y, and 3
means the transmitted packet should operate at level 3 (LR.level = 3, RHF
Offset = unchanged).
Assume the same host pair as Example 1.1c above (2.2.1 to 1.5.11). Host
2.2.1 would form the same RD as shown in Example 1.1c. Upon
receiving this packet, router x would not even need to isolate the RHF,
because it knows that all packets at level 3 are routed to y. Assuming that
y defaults level 3 packets to Backbone 1, the packet would take a longer
path than necessary.
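The single-entry FT3 can be pictured as a lookup with a wildcard
fallback. The dictionary representation is an assumption; the entry
itself is the [ *, y, 3 ] entry given above.

```python
WILDCARD = "*"

# Single-entry level 3 table from the text: [ *, y, 3 ].
FT3 = {WILDCARD: ("y", 3)}   # index -> (next hop, outgoing LR.level)


def lookup(table, index):
    """Exact match first, then fall back to the wildcard entry."""
    if index in table:
        return table[index]
    return table[WILDCARD]
```

With only the wildcard present, every destination network resolves to
next hop y at level 3, which is why router x need not isolate the RHF.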
5.1.3 Example 1.3: With default routing and tunneling
Now, we consider the case where tunneling is in use. The level 1 and 2
forwarding tables (FT1a, FT1b, and FT2) for router x are the same as in
the first example. There is no level 3 forwarding table. The Tunnel Table
(TT) is shown below:
Note that there is a new column in the table (the 4th column). This is the
value the Tunnel field gets written to upon transmission. Note that the
Tunnel Table is small (just two entries, one for each exit point). Router x's
LR table is modified as follows (to indicate the lack of a level 3
forwarding table):
LR table = [ <LR.level=3; error (send "Use tunneling" message)>
<LR.level=2; use FT2>
<LR.level=1; ambiguous> ]
The Tunnel Table and level 3 Forwarding Table for router y are as
follows:
Example 1.3a: From 2.2.1 to 1.5.11, host fails to use tunnel
Normally hosts would be configured to use or not use tunnels as
appropriate (via some router-to-host configuration protocol). Assume for
this example though that host 2.2.1 has somehow not been informed to
use tunnels for inter-domain (level 3) traffic.
Host 2.2.1 would generate an RD as shown in Example 1.1c. When router
x receives this packet, it goes to the LR Table entry for LR.level=3. This
results in the error shown. Router x sends an error message to 2.2.1
indicating that it must use tunneling for level 3 traffic.
Example 1.3b: From 2.2.1 to 1.5.11, host uses tunnel value
Now assume that either because of proper configuration or the error
message of the previous example, host 2.2.1 knows to use a tunnel for
level 3 traffic. Now, host 2.2.1 generates the following RD:
RD = <Tunnel = 1; LR.level = 3; RHF Offset = 4; RH = 1 (up) 2 (up)
2 (none) 1 (down) 5 (down) 11>.
In general, a host will know which Tunnel values are valid, via a
configuration message. Barring this, it probably makes sense to have a
convention where, lacking better information, a host simply chooses
value 1. The routing algorithm could treat this value to mean "route to
closest exit point", so that a single exit point doesn't get overloaded with
default-tunneled packets.
In this example, host 2.2.1 arbitrarily picks a Tunnel value of 1. Upon
receiving this packet, router x indexes into TT by 1 (the Tunnel value),
and forwards the packet to router y with no changes in the RD. When y
receives the packet, it indexes 1 into its Tunnel Table TT. The resulting
entry indicates that the appropriate exit point has been reached (which is y
for Tunnel value 1), and that the level 3 (inter-domain) forwarding table
FT3 should be consulted. (Alternatively, router x could have written the
Tunnel Field to 0 upon transmission to y. In this case, y would go directly
to the RH).
For this, router y isolates the appropriate RHF in the RH, which is the 4th
RHF (destination network number), value 1. The first entry in FT3
reveals that the appropriate exit point is actually z. Therefore, y puts z's
tunnel value (2) in the Tunnel field and forwards the packet to z. Router y
also sends a "Tunnel Redirect" message to 2.2.1, indicating that for this
particular level 3 value (network number 1), the appropriate tunnel value
is 2. As a result, subsequent packets from 2.2.1 to 1.*.* (where "*" means
"anything") will go via z.
Discussion
The "Tunnel Redirect" described in Example 1.3b, combined with use of
the Tunnel Field, is what makes multiple defaults routing work. With
multiple defaults routing, the host's relationship with the exit border
routers is analogous to a host's relationship with its directly connected
(next-hop) routers. In the latter case, the connected router sends a
conventional redirect to the host to get it to use an alternate router attached
to the same network. In the former case, the Tunnel Redirect serves the
same purpose with respect to an alternate border router attached to the
same stub domain. This is a powerful technique useful for isolating the
internal stub routing from external routing.
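The interaction between a host's tunnel choice and a border router's
Tunnel Redirect can be sketched as follows. Router y's entries are
inferred from the prose of this section, and the direct method call
stands in for an actual Tunnel Redirect message; all names are
hypothetical.

```python
class TunnelHost:
    DEFAULT_TUNNEL = 1   # convention suggested in the text

    def __init__(self):
        self.tunnel_for_net = {}   # destination network -> tunnel value

    def pick_tunnel(self, dest_net):
        return self.tunnel_for_net.get(dest_net, self.DEFAULT_TUNNEL)

    def on_tunnel_redirect(self, dest_net, tunnel):
        self.tunnel_for_net[dest_net] = tunnel


# Router y's view, inferred from the prose: tunnel 1 terminates at y.
Y_TUNNEL = 1
Y_FT3 = {1: ("z", 2), 3: ("Backbone 1", 0)}  # net -> (next hop, new tunnel)


def router_y(tunnel, dest_net, host):
    """Return (next hop, Tunnel value written on transmission)."""
    if tunnel == 2:
        return ("z", 2)                      # still inside z's tunnel
    if tunnel != Y_TUNNEL:
        raise ValueError("unexpected tunnel value")
    next_hop, new_tunnel = Y_FT3[dest_net]
    if new_tunnel not in (0, tunnel):
        # A different exit is better: tell the host (stands in for the
        # Tunnel Redirect message).
        host.on_tunnel_redirect(dest_net, new_tunnel)
    return (next_hop, new_tunnel)
```

After one redirected packet to network 1, the host in this sketch picks
tunnel value 2 for that network while still defaulting to 1 elsewhere.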
A few more comments about router y's level 3 forwarding tables are called
for. Note first that if router y receives an RD with a tunnel of 2 (FT3a,
second entry), it will forward that packet onto z. This would be necessary,
for instance, if a host on subnet 2.4 tunneled a packet to z.
If a packet is tunneled to y destined for network 3, y would write the
tunnel to 0 (assuming that it didn't subsequently have to tunnel through
backbone 1), and forward the packet onto Backbone 1 (FT3b, third entry).
As with router x, router y should never receive an RD at level 3 with a
NULL tunnel (except from a mis-configured host). When router y
receives a packet from Backbone 1, the RD should indicate level 2, as y's
neighbor router in Backbone 1 would know to decrement the LR.level
(and increment the RHF Offset) before forwarding a packet to y.
5.1.4 Example 1.4: Using tunneling for policy
This example shows how tunneling can be used as a limited policy
mechanism. Later examples will show how full policy information can be
encoded in the RD.
For this example, assume that x's and y's level 3 forwarding tables are as
shown in example 1.3, and that z's level 3 forwarding tables are structured
similarly to y's, except that z uses Backbone 2 to get to Network 1, uses y
to get to Network 3, and uses Backbone 2 to get to Network 4. Therefore,
there are two ways to get to Network 4, either via Backbone 1 (via y), or
via Backbone 2 (via z).
Assume that Host 2.2.1 has a packet to send to a host on Network 4. If
the host uses a tunnel value of 1, then the packet will travel via Backbone 1. If
the host uses a tunnel value of 2, then the packet will travel via Backbone
2. In this manner, the tunnel value acts as a policy mechanism.
Although it is not the best method for getting policy, note that, with the
topology of Figure 4, it could be possible for Host 2.2.1 to choose
between Backbone 1 and 2 even for sending packets to Networks 1 or 3.
This could be done, for instance, by modifying y's and z's routing tables
so that they didn't send tunnel redirects, but instead blindly forwarded the
packet onto their connected backbones. (This is assuming that Network 2
does not advertise itself as a transit network, and therefore packets would
not be routed back to 2, thus causing a loop.)
A variation on this would be to define a bit in the LR to mean "force
indicated tunnel", so that if this bit was off, the border routers (y or z)
would pick the best path, but if this bit were on, it would override the
router's better judgement and force the packet directly onto the backbone
as described in the last paragraph. As with all host-initiated policy
mechanisms, this requires that the host (or policy server) be
knowledgeable about the route it is choosing.
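The "force indicated tunnel" variation might look like the following at
a border router. The bit position within the LR field and the function
signature are hypothetical.

```python
FORCE_TUNNEL_BIT = 0x80   # hypothetical position within the LR field


def border_next_hop(lr_field, dest_net, best_exit, connected_backbone):
    """Choose where a border router sends a tunneled packet.

    best_exit maps a destination network to the router's preferred next
    hop; connected_backbone is the backbone this router attaches to.
    """
    if lr_field & FORCE_TUNNEL_BIT:
        # The host insists on this exit: skip the router's own choice.
        return connected_backbone
    return best_exit.get(dest_net, connected_backbone)
```

For router y, with z as the preferred exit for Network 1, a packet with
the bit off goes to z, while the same packet with the bit on goes
straight onto Backbone 1.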
5.2 Example 2: Backbone-oriented Hierarchical RH Numbers
It is well-known that IP-style addresses do not scale well. NSAP
addresses (at least as defined by RFC 1237 [CGC]) scale better because
the addresses are rooted at the backbones.
Figure 6 shows an example topology and backbone-oriented RH Numbers
for use with this and subsequent examples. Each backbone has its own
number, which is advertised in routing updates to all other backbones.
(Hierarchically grouped backbones, for instance, where all backbones in a
country are given the same RH Number prefix, are possible, but are not
shown in Figure 6.) Note that stub network X has two levels of hierarchy
internally, while stub Y only has one.
One of the outstanding problems with the address assignment technique
of RFC 1237 is how to handle stub networks that are attached to more
than one backbone. One solution is to have multiple RH Numbers, one
per attached backbone. This type of solution can be used for Pip. For
instance, stub X (and its hosts) is shown to have two RH Number prefixes
(1.14 and 26.81), one reflecting its attachment to A and the other its
attachment to D. The negative aspects of the multiple addresses solution
are not as bad with Pip as with CLNP. Indeed, with Pip, hosts can be
completely isolated from inter-domain RH Numbering conventions.
One reason that multiple RH Number prefixes are easier with Pip is the
simple fact that "inter-domain" levels of the RH Number are not included
in intra-domain RDs. For instance, the RD for a packet from host w to
host y would be:
RD = < Tunnel = 0; LR.level = 2; RHF Offset = 3; RH = 9 (up) 27
(none) 12 (down) 58>.
Neither of the prefixes for stub domain X (1.14 or 26.81) is in the
packet. Internal communications are not affected by backbone RH
Numbering conventions. Hosts may (or may not) need to know their
backbone RH Numbers for inter-domain traffic, and so the functions for
reconfiguring these parts of all host RH Numbers may be required. This
would be done alongside other host configuration (such as how to use
tunnels, etc.), and is not particularly difficult.
Another reason why multiple RH Numbers are less of a problem with Pip is
that the transport protocol uses only the ID field for the purpose of
labeling connections. This means that the RH Number prefix (or any other
part of the RD) can change arbitrarily during a transport connection
without affecting the connection.
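A transport demultiplexer keyed only on IDs illustrates why a change of
RH Number prefix does not disturb a connection. The port fields and the
packet layout are assumptions borrowed from conventional transports, and
the sketch assumes the IDs have already been recovered for packets that
omit them.

```python
# Hypothetical connection table keyed by IDs and ports, never by RH.
connections = {}


def on_packet(pkt):
    """Find (or create) the connection a packet belongs to."""
    key = (pkt["src_id"], pkt["dst_id"], pkt["src_port"], pkt["dst_port"])
    conn = connections.setdefault(key, {"packets": 0})
    conn["packets"] += 1
    return conn
```

Two packets that differ only in their RH (for instance, one sent with
prefix 1.14 and one with prefix 26.81) land on the same connection
entry.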
Appendix A shows the forwarding tables for various routers in Figure 6.
5.2.1 Example 2.1: Inter-domain communications without backbone
selection (with tunneling)
For these examples, host x wishes to send a packet to host z, and does not
care which backbone (A or D) is used, but would like the routers to
choose the best path. Assume that routing will find D as the best backbone
for reaching Y from X.
Example 2.1a: Complete host isolation from external RH Numbering
conventions.
This example describes a mode of operation where hosts (or internal
routers) do not need to know the "inter-domain" components of their RH
Numbers (although directory systems still must). This is the extreme case
of isolating internal network operation from external influences.
At a minimum, the host must initially know 1) that the stub-domain
border routers will handle the inter-domain RH Numbers, and 2) which
bit in the LR Field determines that so-called RH-Tunneling will be used
to find exit routers. The host must eventually know 1) how many levels of
inter-domain RH Number there are, and 2) the minimum RHF length for
these levels.
Initially, the host makes its best guess at the number of levels and the
minimum RHF length. For example, if host x thought that there was only
one level of RH Number above the stub domain, it might create the
following RD: