This repository has been archived by the owner on Apr 22, 2024. It is now read-only.

Merge 2d3e1bd into 591d9b0
josemauro committed May 31, 2021
2 parents 591d9b0 + 2d3e1bd commit 5fa0c7e
Showing 1 changed file with 334 additions and 0 deletions.
334 changes: 334 additions & 0 deletions docs/blueprints/EP023.rst
@@ -0,0 +1,334 @@
:EP: 23
:Title: Kytos Pathfinder Filter Paths by Metadata
:Authors:
- David Ramirez <drami102@fiu.edu>
- Marvin Torres <mtorr068@fiu.edu>
:Created: 2020-10-05
:Kytos-Version:
:Status: Draft

***************************************************
EP023 - Kytos Pathfinder Filter Paths by Metadata
***************************************************

Abstract
========

This blueprint proposes adding support for filtering paths by metadata through Kytos Pathfinder's root `POST /` endpoint. In particular,
`shortest_path`, the method decorated with `@rest('/vX', methods=['POST'])`, should get the user's edge attribute
requirements, remove best paths with edges that fail to meet those requirements, then return the remaining paths.
The blueprint covers the test that determines whether a path is preserved, flexibility, the dependencies that must be added, and
the verification requirements.

Motivation
==========

Currently, Kytos Pathfinder accepts the following parameters:

- Source
- Destination
- Desired links
- Undesired links
- Parameter

*Desired links* and *undesired links* are noteworthy. They showcase Kytos Pathfinder's current filtering ability.
Namely, they cause Kytos Pathfinder to discard paths that lack any of the desired links or that contain any of the undesired ones.

In its current state, this list of parameters is insufficient for filtering out paths based on metadata information.
Even *Parameter*, the only one that holds metadata information, can only hold one attribute name.

In addition, Kytos Pathfinder's pathfinding ability is limited in three ways. First, while it can use weights, it assumes that the
higher the weight, the worse its associated edge. While this is true for some attributes like delay and utilization,
it is false for others like bandwidth. So finding the path with the highest bandwidth is currently impossible. Second,
it can only optimize for one attribute at a time. It cannot find, for example, a path with ideal bandwidth
and delay. Lastly, it finds the absolute best paths in the graph when the best might not be required. A user might
look for a path that is good enough for their needs.

Rationale
=========

These changes would add the following to the backend of Kytos Pathfinder:

- **REST Endpoint**: The endpoint `/vX` (where X is an integer) that accepts the source, destination, and constraints, and outputs the constrained shortest path.
- **Filter Functionality**: The ability to find the subgraph of the *graph* instance variable, namely one that contains only qualifying edges.

Specification
=============

This section discusses how a path is tested and how these changes are verified.

Testing a Path
--------------

The path test assumes the following:

Constraints were given by the user. For example:

- Bandwidth must be at least 34 units.
- Delay must be at most 45 units.
- Owner must be Joe.
- Reliability can be at least 3 units, or utilization can be at most 32 units, or both.
- Path must have links A and B.
- Path must not have link C.

These are required because the test matches the attribute names in these constraints against the
attribute names in an edge's metadata and checks whether the associated attribute values are
high or low enough.

Also, it assumes that **all** edges in a graph have the following attributes:

- Bandwidth
- Delay
- Reliability
- Utilization
- Priority
- Ownership

Each attribute is compared against a minimum, a maximum, or an exact value:

Minimum:

- Bandwidth
- Priority
- Reliability

Maximum:

- Delay
- Utilization

Equal:

- Ownership

The test *passes* an edge if its metadata meets the constraints; otherwise it *fails* the edge.
Assume that a user wishes to check all edge attributes (checking all attributes is optional). First, it obtains
the constraint as a JSON object:

.. code-block:: JSON

    "mandatory_filters": {
        "bandwidth": "A",
        "delay": "B",
        "utilization": "C",
        "reliability": "D",
        "priority": "E",
        "ownership": "F"
    }

where **A** through **E** are floating point numbers or integers and **F** is a string.

Then it retrieves an edge in the graph and performs the following test, which is a series of questions:

- bandwidth(edge) at least A?
- delay(edge) at most B?
- utilization(edge) at most C?
- reliability(edge) at least D?
- priority(edge) at least E?
- ownership(edge) equal to F?

If YES to all questions, then it **passes** the edge.

For example, assume that it receives the user constraint *Delay must be at most 45 units and bandwidth must be at least 30 units*.
Then the incoming JSON object will be as follows:

.. code-block:: JSON

    "mandatory_filters": {
        "bandwidth": 30,
        "delay": 45
    }

Then it will retrieve an edge and check if its bandwidth is at least 30 units and its delay
is at most 45 units. If it does not have both qualities then it fails. Then it will repeat
this for every subsequent edge it finds.

A returned path will consist only of passing edges and will contain none of the failing ones.

To summarize, the input required is the user's constraint as a JSON object and the output is
the set of paths that meets that constraint.
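
The per-edge test described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Kytos Pathfinder's actual code: the names `COMPARATORS` and `edge_passes` are hypothetical, and edge metadata is assumed to be a plain dictionary keyed by attribute name.

.. code-block:: python

    import operator

    # Assumed mapping from attribute name to its implicit comparison:
    # minimum -> ">=", maximum -> "<=", equal -> "==".
    COMPARATORS = {
        'bandwidth': operator.ge,
        'priority': operator.ge,
        'reliability': operator.ge,
        'delay': operator.le,
        'utilization': operator.le,
        'ownership': operator.eq,
    }


    def edge_passes(metadata, mandatory_filters):
        '''Return True if the edge answers YES to every question.'''
        return all(COMPARATORS[name](metadata[name], required)
                   for name, required in mandatory_filters.items())


    filters = {'bandwidth': 30, 'delay': 45}
    good_edge = {'bandwidth': 40, 'delay': 20, 'ownership': 'Joe'}
    slow_edge = {'bandwidth': 40, 'delay': 60, 'ownership': 'Joe'}
    print(edge_passes(good_edge, filters))  # True
    print(edge_passes(slow_edge, filters))  # False

Only the attributes named in the filter are checked, which matches the statement that checking all attributes is optional.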

Flexibility in Theory
---------------------

Recall in the previous section that the test is a series of questions, and that a YES to all questions
means a **passing result**.

However, a user might accept a YES to only some of the questions. Support for flexibility can meet those needs.

Flexible metrics are JSON objects, just like inflexible ones:

.. code-block:: JSON

    "optional_filters": {
        "bandwidth": "A",
        "delay": "B",
        "utilization": "C",
        "reliability": "D",
        "priority": "E",
        "ownership": "F"
    }

If a flexible part is included, then for each edge the test needs to see if it can answer YES to a minimum number of
questions. Such edges would be marked as passing, while the rest would be marked as failing. This way,
paths will have edges that meet the minimum requirements of the user.

This requires finding the set of k-sized combinations from a set of n flexible metrics, where k is at least
the minimum number of YES answers. For example, assume the user specifies the following constraint:

.. code-block:: JSON

    "optional_filters": {
        "bandwidth": 30,
        "delay": 40,
        "utilization": 50
    }

If they wish to find a path that has one of those qualities (e.g. "bandwidth is at least 30 units"), then
the set will have to be split into c(3, 1) = 3 tests, each with a single question:

- bandwidth(edge) at least 30?
- delay(edge) at most 40?
- utilization(edge) at most 50?

If it can pass at least one test, then it passes overall.
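
As a rough sketch of this per-edge flexible test, the k-sized combinations can be generated with `itertools.combinations`. The function name `edge_passes_flexible`, the `COMPARATORS` table, and the dictionary form of the metadata are assumptions made for illustration only.

.. code-block:: python

    import operator
    from itertools import combinations

    # Assumed implicit comparison for each attribute used in this example.
    COMPARATORS = {
        'bandwidth': operator.ge,
        'delay': operator.le,
        'utilization': operator.le,
    }


    def edge_passes_flexible(metadata, optional_filters, minimum):
        '''Return True if some combination of `minimum` questions is all YES.'''
        def yes(name):
            return COMPARATORS[name](metadata[name], optional_filters[name])

        # Each combination of attribute names is one test on the edge.
        return any(all(yes(name) for name in combo)
                   for combo in combinations(optional_filters, minimum))


    optional = {'bandwidth': 30, 'delay': 40, 'utilization': 50}
    edge = {'bandwidth': 10, 'delay': 35, 'utilization': 80}
    print(edge_passes_flexible(edge, optional, 1))  # True (delay qualifies)
    print(edge_passes_flexible(edge, optional, 2))  # False

With a minimum of 1 the loop runs the three single-question tests listed above, and the edge passes as soon as one of them succeeds.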

Softening constraints to find more paths than usual is the main idea of flexibility. In practice, the approach must be more nuanced than shown
here to meet user needs. Kytos Pathfinder will not simply test each edge, mark the ones that pass, and find the shortest
paths using those marked edges. While this is the easiest way to be flexible, it presents a few issues:

1. Edges along a path might pass by having different qualities. For example, if a path has edges
   A and B, A might have sufficient delay and bandwidth, while B might have sufficient utilization and ownership.
   This means edges in a path will likely have no common wanted traits, which the user might need.
2. With edges failing to have common wanted traits, a path has a chance of having an edge with an unwanted trait.
   So users will have to filter paths manually by finding such edges in their paths.

To produce useful results, Kytos Pathfinder will have to find paths with edges that share common traits.

Flexibility in Kytos Pathfinder
-------------------------------

Kytos Pathfinder applies multiple tests to the whole graph instead of applying one test to each edge. In particular,
if a constraint has *n* "questions" and the minimum number of YES answers is *k*, then the entire graph will be tested
c(n, k) + c(n, k + 1) + ... + c(n, n) times, where 1 <= k <= n.

Assume that a constraint is *{A, B, C}* (for simplicity's sake). What Kytos Pathfinder will do is
find the *power set* of that constraint minus the empty set:

{A, B, C}, {A, B}, {B, C}, {A, C}, {A}, {B}, {C}

This totals up to 1 + 3 + 3 = 7 tests on the graph.

Kytos Pathfinder will find passing edges with constraint {A, B, C}, find paths using those edges,
find new passing edges with constraint {A, B}, find paths using those new edges, and so on. Finally,
it collects all the paths found. This will soften constraints while providing paths with guaranteed
common traits among their edges, *e.g.* edges with sufficient bandwidth.

The *minimum number* still applies here. It determines the smallest size of combinations to use.
So if the constraint was {A, B, C} and the minimum number was 2, then tests for {A}, {B}, and {C}
would not exist.
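
The set of tests can be enumerated with `itertools.combinations`. The sketch below assumes a hypothetical helper named `constraint_subsets`; it yields the largest combinations first, mirroring the order {A, B, C}, {A, B}, ... used in this section.

.. code-block:: python

    from itertools import combinations


    def constraint_subsets(names, minimum=1):
        '''Yield every combination of at least `minimum` names, largest first.'''
        for size in range(len(names), minimum - 1, -1):
            yield from combinations(names, size)


    print(len(list(constraint_subsets(['A', 'B', 'C']))))  # 7
    print(list(constraint_subsets(['A', 'B', 'C'], minimum=2)))
    # [('A', 'B', 'C'), ('A', 'B'), ('A', 'C'), ('B', 'C')]

With a minimum of 2 the single-element tests disappear, leaving c(3, 3) + c(3, 2) = 4 tests.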

Flexibility with Inflexibility
------------------------------

A constraint can be split into two parts: *flexible* and *inflexible*. These two constraints can
work together to produce smarter searches than if they were mutually exclusive.

For example, users can specify a zero (0) minimum. This would let all edges pass. This could be used with
inflexible metrics to specify optional qualities on top of required ones. So it can, for example, find paths
from flexible metrics as long as the paths are owned by "Joe".
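
One way to picture the combined search is the sketch below. It is an illustration under several assumptions rather than the proposed implementation: the graph is a plain dictionary of undirected edges, a breadth-first search by hop count stands in for Kytos Pathfinder's real path search, and the names `passes`, `shortest_path`, and `constrained_paths` are hypothetical.

.. code-block:: python

    import operator
    from collections import deque
    from itertools import combinations

    COMPARATORS = {'bandwidth': operator.ge, 'delay': operator.le,
                   'ownership': operator.eq}


    def passes(metadata, filters):
        '''Return True if the metadata satisfies every filter.'''
        return all(COMPARATORS[n](metadata[n], v) for n, v in filters.items())


    def shortest_path(edges, source, destination):
        '''Breadth-first search over undirected edges; None if unreachable.'''
        neighbors = {}
        for u, v in edges:
            neighbors.setdefault(u, []).append(v)
            neighbors.setdefault(v, []).append(u)
        queue, seen = deque([[source]]), {source}
        while queue:
            path = queue.popleft()
            if path[-1] == destination:
                return path
            for node in neighbors.get(path[-1], []):
                if node not in seen:
                    seen.add(node)
                    queue.append(path + [node])
        return None


    def constrained_paths(graph, source, destination,
                          mandatory, optional, minimum):
        '''Search once per combination of optional filters, largest first.'''
        results = []
        for size in range(len(optional), minimum - 1, -1):
            for combo in combinations(optional, size):
                # Mandatory filters always apply; optional ones vary.
                wanted = {**mandatory, **{n: optional[n] for n in combo}}
                edges = [e for e, meta in graph.items() if passes(meta, wanted)]
                path = shortest_path(edges, source, destination)
                if path is not None:
                    results.append((combo, path))
        return results


    graph = {
        ('A', 'B'): {'bandwidth': 40, 'delay': 10, 'ownership': 'Joe'},
        ('B', 'D'): {'bandwidth': 40, 'delay': 60, 'ownership': 'Joe'},
        ('A', 'C'): {'bandwidth': 20, 'delay': 10, 'ownership': 'Joe'},
        ('C', 'D'): {'bandwidth': 20, 'delay': 10, 'ownership': 'Joe'},
        ('A', 'D'): {'bandwidth': 50, 'delay': 5, 'ownership': 'Ann'},
    }
    found = constrained_paths(graph, 'A', 'D', {'ownership': 'Joe'},
                              {'bandwidth': 30, 'delay': 40}, minimum=0)
    for combo, path in found:
        print(combo, path)

In this example the direct link A-D is never used because it is not owned by "Joe". The zero minimum adds a final search that applies only the mandatory filter.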

Implicit Versus Explicit Constraint Setup
-----------------------------------------

In the previous sections, each attribute in a constraint is shown with implicit boundaries:

Minimum:

- Bandwidth
- Priority
- Reliability

Maximum:

- Delay
- Utilization

Equal:

- Ownership

JSON objects do not include relational operators such as "<", ">", or "=". In the following example,
paths are expected to have at least 30 bandwidth, at most 40 delay, and at most 50 utilization:

.. code-block:: JSON

    "mandatory_filters": {
        "bandwidth": 30,
        "delay": 40,
        "utilization": 50
    }

An extended version of this object could use explicit relational operators:

.. code-block:: JSON

    "mandatory_filters": {
        "bandwidth": ">=30",
        "delay": ">40",
        "utilization": "=50"
    }

These paths are expected to have at least 30 bandwidth, greater than 40 delay, and exactly 50 utilization.

In general, a user requirement defined in the extended constraint will have the following template:

- "<attribute name>":"[< > <= >= =]<required value>"
- Exactly one relational operator must be specified for each requirement.
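
A requirement string that follows this template could be parsed as sketched below. The names `parse_requirement` and `meets` are hypothetical, and tolerating whitespace around the operator is an assumption that the template itself does not state.

.. code-block:: python

    import operator
    import re

    OPERATORS = {
        '<': operator.lt,
        '>': operator.gt,
        '<=': operator.le,
        '>=': operator.ge,
        '=': operator.eq,
    }
    # Two-character operators come first so ">=" is not read as ">".
    REQUIREMENT = re.compile(r'^\s*(<=|>=|<|>|=)\s*(.+?)\s*$')


    def parse_requirement(text):
        '''Split "<operator><value>" into a comparison function and a value.'''
        match = REQUIREMENT.match(text)
        if match is None:
            raise ValueError(f'no relational operator in {text!r}')
        symbol, raw = match.groups()
        try:
            value = float(raw)
        except ValueError:
            value = raw  # non-numeric values such as an owner name
        return OPERATORS[symbol], value


    def meets(actual, requirement):
        '''Return True if `actual` satisfies the requirement string.'''
        compare, required = parse_requirement(requirement)
        return compare(actual, required)


    print(meets(35, '>=30'))  # True
    print(meets(40, '>40'))   # False
    print(meets(50, '=50'))   # True

A string without an operator, such as "30", raises `ValueError`, which enforces the rule that each requirement carries exactly one operator.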

Dependencies
------------

- **The combinations function of itertools** (`itertools.combinations`): To find the set of k-sized combinations from the set of n flexible metrics. These sets will serve as the tests done on edges.

Verification
------------

One way to verify the test is to obtain two sets of links, one that passed and the other that failed.
All the passing links should be present in the first set, and all the failing links should be present
in the second.
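
A check along these lines might look like the following sketch. The helper `partition_links` and the sample test `at_least_30_bandwidth` are hypothetical names, and the link metadata is assumed to be a dictionary keyed by link name.

.. code-block:: python

    def partition_links(links, test):
        '''Split link names into those that pass `test` and those that fail.'''
        passed = {name for name, metadata in links.items() if test(metadata)}
        failed = set(links) - passed
        return passed, failed


    def at_least_30_bandwidth(metadata):
        '''Hypothetical test used only for this example.'''
        return metadata['bandwidth'] >= 30


    links = {
        'A': {'bandwidth': 40},
        'B': {'bandwidth': 10},
        'C': {'bandwidth': 30},
    }
    passed, failed = partition_links(links, at_least_30_bandwidth)
    print(sorted(passed))  # ['A', 'C']
    print(sorted(failed))  # ['B']

The two sets are disjoint and together cover every link, so comparing them against hand-computed expectations verifies the test.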

Possible Implementation
=======================

The best implementation for this is editing the currently decorated method because the constraints are optional parameters. If a new method is created and users
specify only a source and a destination, then its logic becomes equivalent to the current endpoint's. So in such cases a new endpoint
would be redundant.

A positive to this is that other network applications can still call the same endpoint. However, one downside to this
is the work required to inform current Kytos Pathfinder users about the parameter changes.

NetworkX is robust enough to have what is needed for the constrained shortest path algorithm to work.

The algorithm requires a graph that holds link metadata, which NetworkX supports:

.. code-block:: python

    import networkx as nx


    def create_graph_with_metadata():
        '''Create a graph with preset edge metadata.'''
        edges = [('A', 'B', {'bandwidth': 25, 'delay': 20, 'ownership': 'A'}),
                 ('A', 'C', {'bandwidth': 20, 'delay': 25, 'ownership': 'B'}),
                 ('C', 'D', {'bandwidth': 15, 'delay': 10, 'ownership': 'A'})]
        G = nx.Graph()
        G.add_edges_from(edges)
        return G

It also requires the creation and traversal of subgraphs. NetworkX also supports this:

.. code-block:: python

    import networkx as nx

    G = nx.path_graph(5)
    H = G.edge_subgraph([(0, 1), (3, 4)])  # read-only view of the chosen edges

The only caveat to creating subgraphs in this way is that they are read-only. This will not present an
issue since traversing a graph is a read-only process.

References
==========

- Constrained Shortest Path Computation:
- http://www.cs.ust.hk/~dimitris/PAPERS/SSTD05-CSP.pdf
