Circuit Monitoring

Introduction

The following outlines the proposed architecture for monitoring circuits, and describes what has been implemented so far.

Overview

Constraints

Various Measurement Metrics

Users have expressed a desire to retrieve various metrics about the circuits they use. These metrics include up/down status, achievable bandwidth, loss rate, jitter, and utilization. These statistics allow users to verify that the circuit fulfills their requirements. Domains have expressed interest in exporting lower-level metrics in such a way that the information can be correlated with the dynamic circuits.

How these metrics are described can vary significantly depending on the metric being measured, the method of collection, and the layer at which the metric is being measured. For example, end-to-end metrics captured by IP-level tools like iperf, owamp, ping and traceroute use IP addresses as the unique identifier for elements, and a source/destination pair to define a given test. sFlow and NetFlow are similar, using a combination of IP address, protocol and protocol port to define a unique element, and a pair of these elements to define a given flow. Statistics collected via TL1 or SNMP, on the other hand, use a combination of host address and interface or facility name to define each unique measurable element, and, for the most part, there is only a single element for each measured metric.

The above metrics are only a first cut at the types that might be offered. Lower-level statistics like light levels will likely be added to the framework at some point. The monitoring infrastructure must be extensible enough to describe the measurement data examples above, while allowing new types of measurement data to be integrated without requiring substantial protocol changes.

Metric Collection Techniques

There are two broad methods of collecting circuit statistics: passive monitoring, and active monitoring. Active monitoring involves injecting traffic onto the circuit and measuring how that traffic is affected by traversing the circuit. Passive monitoring avoids injecting traffic onto the circuit by taking measurements of the elements along the circuit's path. Most passive circuit monitoring involves querying the switches and routers for the measurement counters or flow information that they maintain. Passive monitoring could also involve mirroring traffic to a monitoring device or inserting a tap into the circuit to permit a monitoring device to measure circuit statistics.

As some statistics may be impossible to gather using only active or only passive measurements, domains may need to use a mixture of active and passive measurements to export relevant circuit statistics. The circuit monitoring infrastructure should not constrain the method, passive or active, that domains must employ to collect circuit statistics.

End-To-End vs. Domain-Specific Statistics

There are two granularities of circuit statistics that users are interested in: high-level statistics conveying the end-to-end performance, and more granular statistics that allow clients to more easily debug performance issues. One example of this split is the E2EMon tool, which retrieves the up/down state of each domain's segment of a circuit and then calculates an end-to-end status. Users can then easily look at the overall state of their circuit, as well as dig down to find the domain at issue if a problem occurs. This separate definition of end-to-end statistics and lower-level statistics becomes more important when active measurements are collected that span the end-to-end circuit while passive measurements are collected that capture per-segment information. The circuit monitoring infrastructure should not mandate either end-to-end or segment-specific statistics, but should accommodate statistics gathered at both levels of granularity.
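
The split between per-segment and end-to-end status can be illustrated with a minimal sketch. The aggregation rule used here (any "down" segment makes the circuit "down", and anything short of all "up" is "unknown") is an assumption for illustration and is not taken from E2EMon; the function names are hypothetical.

```python
from typing import Dict, List


def end_to_end_status(segment_status: Dict[str, str]) -> str:
    """Collapse per-domain segment states into one end-to-end state.

    Assumed rule (not E2EMon's documented algorithm): one failed segment
    breaks the circuit; missing or unrecognized reports yield "unknown".
    """
    states = set(segment_status.values())
    if "down" in states:
        return "down"
    if states == {"up"}:
        return "up"
    return "unknown"


def domains_at_issue(segment_status: Dict[str, str]) -> List[str]:
    """Return the domains whose segment is not reported as up."""
    return sorted(d for d, s in segment_status.items() if s != "up")


status = {"internet2.edu": "up", "es.net": "down"}
print(end_to_end_status(status), domains_at_issue(status))
```

A client would show the first value as the overall circuit state and use the second to decide which domain's segment statistics to inspect.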

Continuous vs. On-Demand Data Collection

Beyond active vs. passive collection, there is another dimension to the collection methodology: whether the monitoring happens continuously, is initiated in response to circuit creation, or is initiated at the request of the end user. Many passive metrics can be monitored continuously since the elements being monitored will only exist when the circuits are available, or, for more static elements, the metrics are more generally useful outside of per-circuit monitoring. Active measurements are more likely to be initiated in response to circuit creation or user request since the data collection would interfere with production traffic if it were to run constantly.

To allow for the addition of measurement data that may need to be collected using any of these approaches, the dynamic circuit monitoring infrastructure must not mandate a specific type of data collection.

Centralized vs. Decentralized Measurement Archives

There are two envisioned paradigms for storing the measurement data to allow clients to retrieve them: centralized and decentralized. In the centralized model, each institution has a single service that holds all measurement data. This has the advantage of allowing for a single well known location from which all data can be obtained. It also matches the current circuit reservation model which expects that clients will only interact with a single entity, the inter-domain controller, in a given institution. However, differences between circuit reservation and measurement collection and provision mean that this model will not work for all institutions.

In the circuit reservation model, there needs to be one entity whose role it is to arbitrate access to the various resources. This leads naturally to the single inter-domain controller model. For measurement, there can be numerous different agents collecting network statistics. As a domain makes more and more network measurements, a single locus of measurement data becomes infeasible. An example is an institution like ESnet, which is monitoring both OWAMP and SNMP data across their domains. The wide range of network elements contributing data to each archive means that the machine collecting each type of data is heavily loaded. Combining both types of data collection onto a single machine would put heavy strain on that machine, necessitating a much more expensive machine and hard drive configuration. In order to scale effectively, domains like ESnet need the ability to split their measurement archives based on geographic location, type of measurement data, or some other criterion.

The above does not mean to imply that all domains must behave in a decentralized fashion. For domains that can handle having a single locus of measurement data, the infrastructure must not prevent them from doing so. However, the circuit monitoring infrastructure must be able to handle a more decentralized approach for the domains that need or desire it.

Existing Monitoring Infrastructure

Most domains that will be deploying dynamic circuit monitoring infrastructure will have an existing monitoring infrastructure in place. They are likely already collecting data such as SNMP statistics or alarm information. It is unlikely that domains will want to deploy a parallel monitoring infrastructure to gather the same information that they are already gathering. The SNMP MA provides a good example of easy integration. This measurement archive is able to integrate with existing Cacti, MRTG and Cricket installations with little work required of the site administrator. This ease of installation, and integration with existing monitoring infrastructure makes it more likely that a given domain will deploy the software.

Proposed High Level Architecture

The architecture makes use of 3 general types of services: a service to allow clients to find services and relevant measurement data, perfSONAR Lookup Service (LS), a service to allow clients to look up topological information, perfSONAR Topology Service (TS), and services that provide measurement data for clients, perfSONAR Measurement Archives (MA).

The clients in this architecture can include both central monitoring systems like E2EMon as well as individual applications running on end-site clients.

In the model, a circuit, identified by a circuit identifier, is described by a circuit descriptor which contains the properties applicable to the entire end-to-end circuit, including bandwidth, duration, and other similar metadata. This descriptor is added to a TS. Clients query the LS using the circuit identifier to find which TS contains the circuit descriptor for that circuit.

The circuit descriptor contains a list of segment identifiers for the domain-specific segments that comprise the circuit. Each domain, or an entity working on its behalf, produces a segment descriptor that contains information about their segment of the circuit, along with as much topological detail as they desire to publish. These segment descriptors are then registered with a TS. Clients query the LS using the segment identifiers in the circuit descriptor to find which TSes contain the segment descriptors for each domain's segment.

Once the client has the circuit descriptors and the segment descriptors, they can use the LS along with information contained within the descriptors to find measurements that are applicable to each circuit segment, or the entire end-to-end circuit.

Since clients will often make repeated queries to obtain updated performance statistics, they may cache any of the information discovered about the circuit, including the various descriptors, the location of topology servers and measurement archives. This caching allows them to short circuit much of the discovery process described above.
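
The caching behaviour described above can be sketched as a thin wrapper around whatever discovery routine a client uses. This is a minimal illustration under stated assumptions: `CachingResolver` and `fake_discover` are hypothetical names, and the discovery callable stands in for the LS and TS queries, which are not modelled here.

```python
from typing import Callable, Dict, List


class CachingResolver:
    """Remember identifier -> descriptor results so repeat queries skip discovery."""

    def __init__(self, discover: Callable[[str], dict]):
        self._discover = discover          # hypothetical LS + TS round trip
        self._cache: Dict[str, dict] = {}

    def descriptor(self, identifier: str) -> dict:
        if identifier not in self._cache:
            self._cache[identifier] = self._discover(identifier)
        return self._cache[identifier]

    def invalidate(self, identifier: str) -> None:
        """Drop a cached entry, e.g. after a failed query against a stale location."""
        self._cache.pop(identifier, None)


calls: List[str] = []


def fake_discover(identifier: str) -> dict:
    calls.append(identifier)               # records each real discovery
    return {"id": identifier}


resolver = CachingResolver(fake_discover)
resolver.descriptor("urn:glif:internet2.edu:gri-123456")
resolver.descriptor("urn:glif:internet2.edu:gri-123456")   # served from cache
print(len(calls))
```

When to invalidate is left to the client; one plausible policy is to discard an entry whenever a cached service location stops answering.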

Benefits Of This Architecture

The architecture allows each domain to control the information being made available about the circuits traversing its network. Each domain controls what information it adds to the segment descriptor, and how the segment descriptor is made available. Since the segment descriptor contains the information that the client uses to find measurement data, domains can control how much information the client knows is relevant to their circuits.

The above architecture also allows for a more decentralized approach to making measurement data available. Instead of requiring a single well known network oracle that must know everything about a circuit, the architecture uses the LS to find the services containing relevant circuit information. This permits domains to organize their data repositories in the way that best suits their needs.

The architecture also permits pooling of resources by allowing domains to share LSes, TSes or MAs.

Architecture

Circuit/Segment Descriptions

There are 4 separate elements involved in describing circuits:

Circuit Identifier: Each circuit is identified by a unique identifier that clients use when looking up information about the end-to-end circuit. The format of this identifier is a URN containing a domain name portion and a domain-specific portion, the format agreed upon by the GLIF organization.
Segment Identifier: Each domain's segment of the circuit is identified by a unique identifier that clients use when looking up information about that segment of the circuit. Like the circuit identifier, the format of this identifier is a URN containing a domain name portion and a domain-specific portion, the format agreed upon by the GLIF organization.
Circuit Descriptor: The circuit descriptor is a high-level description of the end-to-end circuit itself, including things like the bandwidth, source, destination, and any other information specific to the end-to-end circuit. The circuit descriptor contains an ordered list of segment identifiers for the domain-specific segments that comprise the end-to-end circuit.
Segment Descriptor: The segment descriptor provides the domain-specific aspects of the reservation. This can include things like bandwidth, ingress point, egress point and any other information specific to that segment of the circuit.

The separation of elements is key to the architecture. It permits clients to find relevant information about the circuits while enabling domains to retain control over the information that clients can receive.

The circuit descriptor must be created by the source domain and must be registered into a Topology Service. The circuit descriptor must contain the segment identifiers for all the domains along the path; therefore, the source domain must know these segment identifiers. This could be accomplished through tighter integration with the circuit provisioning service, or by using an algorithmic approach for generating the identifiers. For example, one algorithm that might be used would take the circuit identifier and use it as the domain-specific portion of the segment identifier. A circuit identifier "urn:glif:internet2.edu:gri-123456" might have an ESnet segment identifier of "urn:glif:es.net:circuit_internet2.edu:gri-123456". Using an algorithm to generate the segment identifiers allows distributed domains to work together without necessitating communication every time a circuit is set up.
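
One way to express the example algorithm is the following minimal sketch. The function name, the `circuit_` label and the fixed "urn:glif" prefix are assumptions drawn from the single example above, not from an agreed specification.

```python
def segment_identifier(circuit_id: str, segment_domain: str,
                       prefix: str = "urn:glif") -> str:
    """Derive a domain's segment identifier from the circuit identifier.

    The circuit identifier, minus its URN prefix, becomes the domain-specific
    portion of the segment identifier (hypothetical scheme for illustration).
    """
    if not circuit_id.startswith(prefix + ":"):
        raise ValueError(f"unexpected identifier format: {circuit_id!r}")
    domain_specific = circuit_id[len(prefix) + 1:]   # "internet2.edu:gri-123456"
    return f"{prefix}:{segment_domain}:circuit_{domain_specific}"


print(segment_identifier("urn:glif:internet2.edu:gri-123456", "es.net"))
# prints urn:glif:es.net:circuit_internet2.edu:gri-123456
```

Because the derivation is deterministic, any domain that knows the circuit identifier and the list of domains on the path can compute every segment identifier without contacting the other domains.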

The segment descriptor can be created by the domain providing that segment of the circuit, or could be created by an entity acting on behalf of that domain. This descriptor must be registered into a Topology Service, either one provided by the domain, or one that has agreed to store the descriptor for the domain. The domain can provide more detail in the segment descriptor for trusted clients than they do for other clients. This could be done by having the segment descriptor contain an abstract path through the network, and allowing trusted clients to query to find out the infrastructure that underlies that abstract path. However, if the internal infrastructure is abstracted, the domain must provide a way for clients to discover measurement data using the abstract path.

Topology Service Infrastructure

Circuit descriptors and segment descriptors need to be loaded into a Topology Service to allow clients to find them. The Topology Service registers with the Lookup Service the circuit identifiers and segment identifiers for the circuit and segment descriptors it contains. The registration ensures that clients can query the Lookup Service using a circuit or segment identifier and discover which Topology Service contains the corresponding circuit or segment descriptor. This indirection allows the Topology Service infrastructure to be centralized, decentralized or a hybrid.
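
The registration indirection can be sketched with two small in-memory classes. This is only an illustration of the data flow: the class and method names are hypothetical, the service URL is a placeholder, and no perfSONAR message format is modelled.

```python
from typing import Dict, List, Set


class LookupService:
    """Maps identifiers to the services that hold data about them."""

    def __init__(self) -> None:
        self._index: Dict[str, Set[str]] = {}

    def register(self, identifier: str, service_url: str) -> None:
        self._index.setdefault(identifier, set()).add(service_url)

    def find(self, identifier: str) -> List[str]:
        return sorted(self._index.get(identifier, ()))


class TopologyService:
    """Stores descriptors and announces their identifiers to the LS."""

    def __init__(self, url: str, lookup: LookupService) -> None:
        self.url = url
        self._lookup = lookup
        self._descriptors: Dict[str, dict] = {}

    def add_descriptor(self, identifier: str, descriptor: dict) -> None:
        self._descriptors[identifier] = descriptor
        self._lookup.register(identifier, self.url)   # only the identifier is shared

    def get(self, identifier: str) -> dict:
        return self._descriptors[identifier]


ls = LookupService()
ts = TopologyService("https://ts.internet2.example/", ls)   # placeholder URL
ts.add_descriptor("urn:glif:internet2.edu:gri-123456", {"capacity": 1000000000})
print(ls.find("urn:glif:internet2.edu:gri-123456"))
```

Since the Lookup Service stores only identifiers and locations, the same client logic works whether one shared Topology Service or many per-domain ones hold the descriptors.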

A centralized approach to topology infrastructure would have a single Topology Service with which all participating domains register their segment descriptions, circuit descriptions and network topologies. This would introduce a single point of failure for the participating domains, but would ease the administrative burden.

A completely decentralized approach would have each domain provide their own Topology Service. They would register their segment and circuit descriptions with their domain's Topology Service. Clients would query the Lookup Service to discover each domain's Topology Service. This increases the administrative overhead, but increases the scalability of the overall system while removing any single point of failure.

There are numerous gradations between the completely centralized approach and the completely decentralized one. For example, only certain domains could deploy Topology Service infrastructure, and the other domains could simply piggy-back off either a centralized Topology Service or another domain's Topology Service.

Lookup Service Infrastructure

The perfSONAR Lookup Service Infrastructure ties the architecture together. This infrastructure allows clients to discover the Topology Services for obtaining circuit and segment descriptors, as well as the various Measurement Archives containing measurements. Similar to the Topology Service Infrastructure, the Lookup Service Infrastructure can be either centralized or decentralized, with a similar set of upsides and downsides for each.

Measurement Infrastructure

Measurement Collectors

The monitoring infrastructure must include agents whose role is to collect measurements of the circuit, using either passive or active techniques. This can be accomplished via constant collection, on-demand collection or some combination.

It may be reasonable to monitor certain network elements whether circuits are enabled or not. For example, many devices will only have certain elements configured when a circuit is enabled. If a measurement collector is constantly looking for these configured elements, and measuring them when available, the collector can run constantly without negative impact.

Some administrators may not wish to have a collector constantly polling all elements of a given type from their hosts. In these cases, at circuit setup time, the measurement collectors will need to be informed of the new network elements that need to be monitored, and at circuit teardown time, will need to be told to stop monitoring those elements. In this case, the domain will need to have an agent, possibly human, that can enable monitoring of the appropriate network elements when a circuit is brought up.

Constant monitoring and on-demand monitoring could be combined to allow certain elements to be monitored constantly, while others are only measured on-demand. As long as the appropriate measurement collectors are notified, this hybrid monitoring is feasible.
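
A hybrid collector of this kind might track its targets as in the following minimal sketch. The class, its methods and the element names are hypothetical; actual polling is omitted, and only the bookkeeping for setup and teardown notifications is shown.

```python
from typing import Dict, Iterable, Set


class MeasurementCollector:
    """Tracks which elements to poll: a fixed set plus per-circuit additions."""

    def __init__(self, always_on: Iterable[str] = ()) -> None:
        self._always_on: Set[str] = set(always_on)
        self._on_demand: Dict[str, Set[str]] = {}   # element -> circuits using it

    def circuit_up(self, circuit_id: str, elements: Iterable[str]) -> None:
        """Called at circuit setup with the elements the circuit traverses."""
        for element in elements:
            self._on_demand.setdefault(element, set()).add(circuit_id)

    def circuit_down(self, circuit_id: str) -> None:
        """Called at teardown; stop polling elements no circuit still needs."""
        for element in list(self._on_demand):
            self._on_demand[element].discard(circuit_id)
            if not self._on_demand[element]:
                del self._on_demand[element]

    def monitored(self) -> Set[str]:
        return self._always_on | set(self._on_demand)


collector = MeasurementCollector(always_on=["port_packrat_eth0"])
collector.circuit_up("gri123456", ["port_packrat_eth0.3000"])
during = collector.monitored()
collector.circuit_down("gri123456")
after = collector.monitored()
print(sorted(during), sorted(after))
```

Keeping a set of circuits per element means an element shared by two circuits stays monitored until both have been torn down.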

Measurement Archives

The measurement data for circuits needs to be made available in perfSONAR Measurement Archives, which must register with the Lookup Service Infrastructure so that clients can find them.

Required Metrics

There are a number of metrics that are generally useful, but can be hard for clients to discern using technology-specific measurements. These metrics include operational and administrative status and utilization. The method for discerning these characteristics is highly dependent on the infrastructure used to allocate the circuits, and any client wishing to monitor this information would need to understand the intricacies of the wide array of networking infrastructure (e.g. Ethernet and VLAN, MPLS tunnels, or SONET) in order to accurately calculate those metrics.

To ease client development, domains must make some metrics available such that clients do not need to know how to interpret technology-specific measurement data. These metrics include, at a minimum, operational status, administrative status and utilization. The easiest way to make these metrics available is to have one or more services that clients can query using the segment identifier, and that respond with the metrics. These metrics could be collected by a special measurement collector that collects and stores the data in a Measurement Archive along with the segment identifier. They could also be made available by a service which accepts the queries, collects the relevant technology-specific data, and transforms it into the specified technology-agnostic metric.
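
As an illustration of transforming technology-specific data into a technology-agnostic metric, the sketch below derives utilization from two readings of an octet counter. It assumes SNMP-style monotonically increasing counters that wrap at most once between readings; the function name and parameters are hypothetical.

```python
def utilization(octets_start: int, octets_end: int, interval_s: float,
                capacity_bps: float, counter_bits: int = 64) -> float:
    """Fraction of capacity used between two octet-counter readings.

    Assumes the counter wrapped at most once during the interval.
    """
    if interval_s <= 0 or capacity_bps <= 0:
        raise ValueError("interval and capacity must be positive")
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    return (delta_octets * 8) / (interval_s * capacity_bps)


# 18.75 GB transferred in 300 s on a 1 Gbps circuit is half of capacity.
print(utilization(0, 18_750_000_000, 300, 1_000_000_000))
```

A service answering queries by segment identifier could run a computation like this internally and return only the resulting fraction, so clients never need to know which counters or devices were involved.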

Service Interactions

Circuit Provisioning Time

The Circuit Provisioning Service contacts the various routers and switches and sets up the new circuit.
The Circuit Provisioning Service contacts the circuit monitoring agent, and informs it about the new circuit, including the internal path and other circuit metadata.
The Circuit Monitoring Agent creates the circuit and segment descriptors, and registers them with the Topology Service.
The Topology Service tells the Lookup Service about the circuit and segment descriptors it now contains.
The Circuit Monitoring Agent contacts the Measurement Collectors and informs them of any new elements that they need to measure or new information that they need to make available.
The Measurement Collectors begin storing data into Measurement Archives.
The Measurement Archives tell the Lookup Service about the measurement data they can make available.

Circuit Monitoring

The Measurement Collector queries the routers and switches for circuit statistics.
The Measurement Collector stores the results into the Measurement Archive.
The Measurement Archive tells the Lookup Service about the measurement data it has.

Client Querying

The client uses the circuit identifier, and asks the Lookup Service where it can find the circuit descriptor. The Lookup Service directs the client to the Topology Service.
The client contacts the Topology Service and retrieves the circuit descriptor.
The client finds the segment identifier for the red domain, and asks the Lookup Service where it can find that segment descriptor. The Lookup Service directs the client to the red domain's Topology Service.
The client contacts the red domain's Topology Service and retrieves the segment descriptor.
The client finds the segment identifier for the purple domain, and asks the Lookup Service where it can find that segment descriptor. The Lookup Service directs the client to the purple domain's Topology Service.
The client contacts the purple domain's Topology Service and retrieves the segment descriptor.
The client asks the Lookup Service which Measurement Services can return performance statistics about the red domain's segment of the circuit. The lookup infrastructure directs the client to one or more Measurement Services in the red domain.
The client requests the performance statistics from the Measurement Services, which may retrieve the statistics directly from a known database, compute the statistics on the fly using existing collected statistics, or collect the statistics on the fly and return them.
The client asks the Lookup Service which Measurement Services can return performance statistics about the purple domain's segment of the circuit. The lookup infrastructure directs the client to one or more Measurement Services in the purple domain.
The client requests the performance statistics from the Measurement Services, which may retrieve the statistics directly from a known database, compute the statistics on the fly using existing collected statistics, or collect the statistics on the fly and return them.
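
The client steps above can be traced with a minimal in-memory sketch. Everything here is a stand-in: the lookup table keyed by (kind, identifier), the service names, the "red.example" and "purple.example" identifiers and the dictionary-shaped descriptors are assumptions made for illustration, not perfSONAR interfaces.

```python
from typing import Dict, List, Tuple

Lookup = Dict[Tuple[str, str], List[str]]
Services = Dict[str, Dict[str, dict]]


def query_circuit(circuit_id: str, lookup: Lookup, services: Services) -> dict:
    """Follow the discovery steps: circuit descriptor, segments, then statistics."""
    ts = lookup[("descriptor", circuit_id)][0]          # ask LS for the TS
    circuit = services[ts][circuit_id]                  # fetch circuit descriptor
    report = {}
    for segment_id in circuit["segments"]:
        seg_ts = lookup[("descriptor", segment_id)][0]  # LS -> domain's TS
        segment = services[seg_ts][segment_id]          # fetch segment descriptor
        stats: dict = {}
        for ma in lookup.get(("measurement", segment_id), []):
            stats.update(services[ma][segment_id])      # query each MA found
        report[segment_id] = {"descriptor": segment, "stats": stats}
    return report


circuit = "urn:glif:red.example:gri-1"
red_seg = "urn:glif:red.example:seg-1"
purple_seg = "urn:glif:purple.example:seg-1"
lookup: Lookup = {
    ("descriptor", circuit): ["ts.red"],
    ("descriptor", red_seg): ["ts.red"],
    ("descriptor", purple_seg): ["ts.purple"],
    ("measurement", red_seg): ["ma.red"],
    ("measurement", purple_seg): ["ma.purple"],
}
services: Services = {
    "ts.red": {circuit: {"segments": [red_seg, purple_seg]},
               red_seg: {"domain": "red.example"}},
    "ts.purple": {purple_seg: {"domain": "purple.example"}},
    "ma.red": {red_seg: {"status": "up"}},
    "ma.purple": {purple_seg: {"status": "up"}},
}
report = query_circuit(circuit, lookup, services)
print(list(report))
```

The loop body is identical for every domain, which is why the red and purple steps in the list mirror each other.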

Possible Information Model For Circuit/Segment Descriptors

As part of the circuit provisioning process, a circuit descriptor is created that describes, from a very high level, the path that the circuit is taking.

<link id="urn:nml:internet2.edu:gri123456_downstream"> <!-- The name of the circuit corresponds to the URI for the identifier -->
  <source>urn:nml:internet2.edu:hostA_eth0</source> <!-- the source can be a URN or a full description of a domain, node, port or link -->
  <destination>urn:nml:es.net:hostZ_eth1</destination> <!-- the destination can be a URN or a full description of a domain, node, port or link -->

  <!-- Human-Readable Description of the Circuit -->
  <description>Phoebus Circuit</description>

  <!-- Capacity of the circuit in bps. --> 
  <capacity>1000000000</capacity>  <!-- 1Gbps -->

  <!-- Period when the circuit will be active for. The start and end elements
       are unix timestamps. There may be other ways to describe time, but it is
       reasonable to specify a single representation for timestamps -->
  <lifetime>
     <start>1234177754</start>
     <end>1234567890</end>
  </lifetime>

  <reverse-link>urn:nml:internet2.edu:gri123456_upstream</reverse-link>

  <relation type="over">
      <path>
          <hop id="internet2-1">
              <linkIdRef>urn:nml:internet2.edu:gri123456_downstream_segment</linkIdRef>
              <nextHop>esnet-1</nextHop>
          </hop>
          <hop id="esnet-1">
              <linkIdRef>urn:nml:es.net:gri123456_downstream_segment</linkIdRef>
          </hop>
      </path>
  </relation>
</link>

<link id="urn:nml:internet2.edu:gri123456_upstream">
  <source>urn:nml:internet2.edu:hostA_eth0</source> <!-- the source can be a URN or a full description of a domain, node, port or link -->
  <destination>urn:nml:es.net:hostZ_eth1</destination> <!-- the destination can be a URN or a full description of a domain, node, port or link -->

  <!-- Human-Readable Description of the Circuit -->
  <description>Phoebus Circuit</description>

  <!-- Capacity of the circuit in bps. --> 
  <capacity>1000000000</capacity>  <!-- 1Gbps -->

  <!-- Period when the circuit will be active for -->
  <lifetime>
     <start>1234177754</start>
     <end>1234567890</end>
  </lifetime>

  <reverse-link>urn:nml:internet2.edu:gri123456_downstream</reverse-link>

  <relation type="over">
      <path>
          <hop id="internet2-1">
              <linkIdRef>urn:nml:internet2.edu:gri123456_upstream_segment</linkIdRef>
              <nextHop>esnet-1</nextHop>
          </hop>
          <hop id="esnet-1">
              <linkIdRef>urn:nml:es.net:gri123456_upstream_segment</linkIdRef>
          </hop>
      </path>
  </relation>
</link>
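
A client could recover the ordered segment identifiers from such a descriptor by following the nextHop references. The sketch below uses Python's standard xml.etree.ElementTree on a trimmed copy of the downstream descriptor; the helper name is hypothetical, and the element names are taken from the example rather than from a published schema.

```python
import xml.etree.ElementTree as ET
from typing import List

DESCRIPTOR = """
<link id="urn:nml:internet2.edu:gri123456_downstream">
  <relation type="over">
    <path>
      <hop id="internet2-1">
        <linkIdRef>urn:nml:internet2.edu:gri123456_downstream_segment</linkIdRef>
        <nextHop>esnet-1</nextHop>
      </hop>
      <hop id="esnet-1">
        <linkIdRef>urn:nml:es.net:gri123456_downstream_segment</linkIdRef>
      </hop>
    </path>
  </relation>
</link>
"""


def ordered_segment_ids(xml_text: str) -> List[str]:
    """Return segment identifiers in path order by walking nextHop links."""
    link = ET.fromstring(xml_text)
    hops = {hop.get("id"): hop for hop in link.iter("hop")}
    referenced = {hop.findtext("nextHop") for hop in hops.values()}
    starts = [hop_id for hop_id in hops if hop_id not in referenced]
    if len(starts) != 1:
        raise ValueError("path must have exactly one first hop")
    ordered, seen, current = [], set(), starts[0]
    while current is not None:
        if current in seen or current not in hops:
            raise ValueError(f"broken nextHop chain at {current!r}")
        seen.add(current)
        ordered.append(hops[current].findtext("linkIdRef"))
        current = hops[current].findtext("nextHop")   # None on the last hop
    return ordered


print(ordered_segment_ids(DESCRIPTOR))
```

Walking the chain rather than relying on document order keeps the result correct even if a producer emits the hop elements out of sequence.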

During the circuit provisioning process, each domain SHOULD set up their monitoring services to respond to requests for the ID specified for their domain. For example, the status MA for Internet2 could be set up to respond to requests for "urn:nml:internet2.edu:gri123456_segment", and have it return the up/down status of the circuit. This would, obviously, only make sense for the monitored properties that span the entire circuit like up/down status, packets/s, bytes/s, etc. The monitored data SHOULD correspond to just those counters/status for the circuit.

However, a domain can also choose to provide clients with more information. They could create a descriptor for their segment of the circuit. The level of detail provided would be up to the individual domain. They would then register this descriptor with a Topology Service. Interested clients could then use the segment identifier given in the circuit descriptor to look up the segment descriptor.

An example is provided in Figure 2 that looks similar to the circuit descriptor given above insofar as it is a path with element identifiers in each hop. Unlike the circuit descriptor, this segment descriptor contains hops for each port and link. The port and link ids in this descriptor might correspond to physical elements OR they might reference virtual elements created during circuit provisioning. Internet2 could configure its monitoring services to return information for each identifier inside its segment descriptor.

<link id="urn:nml:internet2.edu:gri123456_segment">
  <relation type="over">
      <bidirectionalPath>
          <path direction="downstream">
              <hop id="0">
                  <portIdRef>urn:nml:internet2.edu:port_packrat_eth0</portIdRef>
                  <nextHop>1</nextHop>
              </hop>
              <hop id="1">
                  <portIdRef>urn:nml:internet2.edu:port_packrat_eth1</portIdRef>
                  <nextHop>2</nextHop>
              </hop>
              <hop id="2">
                  <linkIdRef>urn:nml:internet2.edu:link_packrat_newy</linkIdRef>
                  <nextHop>3</nextHop>
              </hop>
              <hop id="3">
                  <portIdRef>urn:nml:internet2.edu:port_newy_1-A-4-1</portIdRef>
                  <nextHop>4</nextHop>
              </hop>
	      <!-- there's a pseudo-"link" between these two hops, the
		   cross-connect. It's omitted here for brevity, but could be
		   added in for completeness. -->
              <hop id="4">
                  <portIdRef>urn:nml:internet2.edu:port_newy_1-A-4-5</portIdRef>
              </hop>
          </path>

          <path direction="upstream" />
      </bidirectionalPath>
  </relation>
</link>

If Internet2 wished to make even more information available to a client, it could create descriptors of each of the elements and register those into a Topology Service. Its other option would be to include those descriptors in place of the identifiers. Even if it were to include them, the domain should register them into the Topology Service. An example of this form is available in Figure 3. Here, the path is identical to what it was in the previous segment descriptor, but the elements are now fleshed out somewhat.

<link id="urn:nml:internet2.edu:gri123456_segment">
  <relation type="over">
      <bidirectionalPath>
          <path direction="downstream">
              <hop id="0">
                  <port id="urn:nml:internet2.edu:port_packrat_eth0">
			<name>eth0</name>
			<address type="ipv4">207.75.164.10</address>
			<description>Commodity connection</description>
                  </port>
                  <nextHop>1</nextHop>
              </hop>
              <hop id="1">
                  <port id="urn:nml:internet2.edu:port_packrat_eth1">
			<name>eth1</name>
			<address type="ipv4">207.75.164.8</address>
			<description>Connection to NEWY Switch</description>
                  </port>

                  <nextHop>2</nextHop>
              </hop>
              <hop id="2">
                  <link id="urn:nml:internet2.edu:link_packrat_newy">
			<name>PACKRAT_NEWY-1</name>
			<capacity>1000000000</capacity>
			<description>Link connecting Packrat to the NEWY Router</description>
		  </link>
                  <nextHop>3</nextHop>
              </hop>
              <hop id="3">
                  <port id="urn:nml:internet2.edu:port_newy_1-A-4-1">
			<relation type="management">
				<address type="dns">mss.newy32aoa.net.internet2.edu</address>
			</relation>
			<name>1-A-4-1</name>
			<description>Connection to Packrat</description>
                  </port>
                  <nextHop>4</nextHop>
              </hop>
	      <!-- there's a pseudo-"link" between these two hops, the
		   cross-connect. It's omitted here for brevity, but could be
		   added in for completeness. -->
              <hop id="4">
                  <port id="urn:nml:internet2.edu:port_newy_1-A-4-5">
			<relation type="management">
				<address type="dns">mss.newy32aoa.net.internet2.edu</address>
			</relation>
			<name>1-A-4-5</name>
			<description>Connection to ESnet</description>
                  </port>
              </hop>
          </path>

          <path direction="upstream" />
      </bidirectionalPath>
  </relation>
</link>

Virtual Elements

During circuit provisioning, the domain might create virtual elements that correspond to just that subsegment of a port or link that is traversed by the circuit. These elements SHOULD be "over" physical hardware. If the domain makes status, counters or other information available corresponding to these elements, they SHOULD correspond to just those portions used by the circuit.

<layer2:link id="urn:nml:internet2.edu:gri123456_segment_0">
  <vlan>3000</vlan>

  <relation type="over">
      <linkIdRef>urn:nml:internet2.edu:link_packrat_to_CHIC</linkIdRef>
  </relation>

  <relation type="upstreamPort">
      <portIdRef>urn:nml:internet2.edu:port_packrat_eth0.3000</portIdRef>
  </relation>
  <relation type="downstreamPort">
      <portIdRef>urn:nml:internet2.edu:port_NEWY_eth3.3000</portIdRef>
  </relation>
</layer2:link>

<layer2:port id="urn:nml:internet2.edu:port_packrat_eth0.3000">
  <name>eth0.3000</name>
  <relation type="over">
      <portIdRef>urn:nml:internet2.edu:port_eth0</portIdRef>
  </relation>

  <relation type="child">
      <linkIdRef>urn:nml:internet2.edu:link_WASH_TO_NEWY</linkIdRef>
  </relation>

  <relation type="parent">
      <nodeIdRef>urn:nml:internet2.edu:node_packrat</nodeIdRef>
  </relation>
</layer2:port>

The domain can provide even more information, if desired, by adding their static hardware definitions into a Topology Service. This hardware should also be monitored so that users can query the element's state independent of any circuits traversing it.

<link id="urn:nml:internet2.edu:link_packrat_to_CHIC">
  <relation type="parent">
      <portIdRef>urn:nml:internet2.edu:port_packrat_eth0</portIdRef>
  </relation>
  <name>The Link From Washington D.C. to New York</name>
</link>

<layer2:port id="urn:nml:internet2.edu:port_packrat_eth0">
  <name>eth0</name>
  <address type="mac">00:11:43:34:E0:23</address>
  <relation type="under">
      <portIdRef>urn:nml:internet2.edu:port_2001:468:1420:0:211:43ff:fe34:e023</portIdRef>
      <portIdRef>urn:nml:internet2.edu:port_207.75.164.10</portIdRef>
  </relation>
  <relation type="parent">
      <nodeIdRef>urn:nml:internet2.edu:node_packrat</nodeIdRef>
  </relation>

  <relation type="child">
      <linkIdRef>urn:nml:internet2.edu:link_WASH_TO_NEWY</linkIdRef>
  </relation>
</layer2:port>

<layer3:port id="urn:nml:internet2.edu:port_207.75.164.10">
  <address type="ipv4">207.75.164.10</address>
  <relation type="sibling">
      <portIdRef>urn:nml:internet2.edu:port_2001:468:1420:0:211:43ff:fe34:e023</portIdRef>
  </relation>
  <relation type="over">
      <portIdRef>urn:nml:internet2.edu:port_packrat_eth0</portIdRef>
  </relation>
  <relation type="parent">
      <nodeIdRef>urn:nml:internet2.edu:node_packrat</nodeIdRef>
  </relation>
</layer3:port>

<layer3:port id="urn:nml:internet2.edu:port_2001:468:1420:0:211:43ff:fe34:e023">
  <address type="ipv6">2001:468:1420:0:211:43ff:fe34:e023</address>
  <relation type="sibling">
      <portIdRef>urn:nml:internet2.edu:port_207.75.164.10</portIdRef>
  </relation>
  <relation type="over">
      <portIdRef>urn:nml:internet2.edu:port_packrat_eth0</portIdRef>
  </relation>
  <relation type="parent">
      <nodeIdRef>urn:nml:internet2.edu:node_packrat</nodeIdRef>
  </relation>
</layer3:port>

<layer3:port id="urn:nml:internet2.edu:port_packrat_lo">
  <name>lo</name>
  <address type="ipv4">127.0.0.1</address>

  <relation type="parent">
      <nodeIdRef>urn:nml:internet2.edu:node_packrat</nodeIdRef>
  </relation>
</layer3:port>

<base:node id="urn:nml:internet2.edu:node_packrat">
  <name>packrat</name>

  <relation type="child">
      <portIdRef>urn:nml:internet2.edu:port_207.75.164.10</portIdRef>
      <portIdRef>urn:nml:internet2.edu:port_2001:468:1420:0:211:43ff:fe34:e023</portIdRef>
      <portIdRef>urn:nml:internet2.edu:port_packrat_eth0</portIdRef>
      <portIdRef>urn:nml:internet2.edu:port_packrat_lo</portIdRef>
  </relation>
</base:node>

Possible Development/Deployment Roadmap

Step 1

Start at the coarsest level of granularity. There are no Segment Descriptors; each domain makes information available for its own circuit segment. The domain SHOULD provide status information.

A GUI would be written to either find or accept a circuit descriptor and look up status or other information for it. A domain that does not provide status information would be listed as "unknown". Because these lookups take time, the GUI would need to cache the results.
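
The caching behaviour described for the GUI could look roughly like the following Python sketch: a small time-to-live cache that reports "unknown" when a domain provides no status or the lookup fails. The class name StatusCache, the lookup callable, the 30 second lifetime, and the sample circuit and domain names are all assumptions for illustration; nothing here is an existing interface.

```python
import time


class StatusCache:
    """Cache per-domain segment status for a fixed time-to-live."""

    def __init__(self, lookup, ttl_seconds=60.0, clock=time.monotonic):
        # lookup is a hypothetical callable: (circuit_id, domain) -> status or None
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # (circuit_id, domain) -> (expiry_time, status)

    def status(self, circuit_id, domain):
        key = (circuit_id, domain)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and cached[0] > now:
            return cached[1]  # still fresh, skip the slow lookup
        try:
            result = self._lookup(circuit_id, domain)
        except Exception:
            result = None  # treat a failed lookup like missing status
        status = result if result is not None else "unknown"
        self._entries[key] = (now + self._ttl, status)
        return status


# Stand-in for the real (slow) status lookup; records which domains were queried.
calls = []


def fake_lookup(circuit_id, domain):
    calls.append(domain)
    return {"internet2.edu": "up"}.get(domain)


cache = StatusCache(fake_lookup, ttl_seconds=30.0)
print(cache.status("gri123456", "internet2.edu"))  # up
print(cache.status("gri123456", "es.net"))         # unknown
```

Caching the "unknown" result as well keeps the GUI from repeatedly querying a domain that publishes nothing, at the cost of a delay of up to one lifetime before newly published status shows up.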

Step 2

Either each domain deploys its own TS, or a central one is provided. Each domain could register a segment descriptor for its segment, with as much detail as that domain wanted to expose. If a domain did not register a segment descriptor, the GUI would simply treat its segment as opaque.

The GUI would be modified to look up the segment descriptors in the TS. It would then allow users to click on segments that have a corresponding segment descriptor, taking them to a page containing that segment's status.
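
The distinction between opaque and clickable segments can be sketched in Python as follows. The registry is modelled as a plain dictionary standing in for a TS query, and the SegmentView type, the build_views helper, and the descriptor identifier are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SegmentView:
    domain: str
    descriptor_id: Optional[str]  # None when the domain registered nothing

    @property
    def clickable(self) -> bool:
        """A segment is clickable only if a descriptor was registered for it."""
        return self.descriptor_id is not None


def build_views(domains, registered):
    """Pair each domain on the circuit path with its registered descriptor, if any."""
    return [SegmentView(d, registered.get(d)) for d in domains]


# Hypothetical registry contents; a real GUI would query the TS instead.
registered = {"internet2.edu": "urn:example:segment-descriptor-1"}
views = build_views(["internet2.edu", "es.net"], registered)
for view in views:
    print(view.domain, "clickable" if view.clickable else "opaque")
```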

Step 3

Software would be written (or reused) to collect and publish whatever counters are available, for example via TL1 or SNMP.

The GUI would be modified to collect this information when available.

Step 4

The above software would be deployed and configured in the various domains.

Step 5

An agent would be written that would listen to OSCARS NB events. This agent would construct the segment descriptors for a given domain, and an end-to-end segment descriptor using the naming convention used in the above examples.

Initially, it would only work with predefined physical hardware that is always being monitored.
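
A rough Python sketch of the descriptor-building part of such an agent is shown below. The identifier pattern is inferred from the single example id urn:nml:internet2.edu:gri123456_segment_0 used earlier on this page, and the event dictionary is a made-up stand-in, since the actual OSCARS event format is not described here. The function names are hypothetical, and the layer2 namespace prefix is omitted to keep the output self-contained.

```python
import xml.etree.ElementTree as ET


def segment_link_id(domain, gri, index):
    """Build an id shaped like urn:nml:internet2.edu:gri123456_segment_0."""
    return f"urn:nml:{domain}:{gri}_segment_{index}"


def build_virtual_link(domain, gri, index, vlan, physical_link_id):
    """Return a <link> element for one virtual link "over" a physical link."""
    link = ET.Element("link", {"id": segment_link_id(domain, gri, index)})
    ET.SubElement(link, "vlan").text = str(vlan)
    over = ET.SubElement(link, "relation", {"type": "over"})
    ET.SubElement(over, "linkIdRef").text = physical_link_id
    return link


# Hypothetical event payload; the real notification format is an assumption.
event = {
    "gri": "gri123456",
    "domain": "internet2.edu",
    "vlan": 3000,
    "links": ["urn:nml:internet2.edu:link_packrat_to_CHIC"],
}

links = [
    build_virtual_link(event["domain"], event["gri"], i, event["vlan"], phys)
    for i, phys in enumerate(event["links"])
]
print(ET.tostring(links[0], encoding="unicode"))
```

Because the physical links are referenced only by id, this matches the initial restriction to predefined hardware that is already being monitored.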

Step 6

The agent and monitoring software would be modified so that the agent could signal the monitoring software to start collecting information about a given circuit's virtual links. The agent would then build a circuit descriptor consisting of these virtual links.
