Circuit Redundancy Groups #7025

Open
tardoe opened this issue Aug 24, 2021 · 9 comments
Labels
status: needs milestone Awaiting prioritization for inclusion with a future NetBox release type: feature Introduction of new functionality to the application

Comments

@tardoe

tardoe commented Aug 24, 2021

NetBox version

v2.11.12

Feature type

New functionality

Proposed functionality

This proposal adds a new model for defining the relationship between circuits to track a designed redundancy between them.

A Circuit Redundancy Group (CRG) should contain two or more circuits that provide redundancy to one another. For example, if two circuits run from A to B and both are members of the same CRG, it becomes easy to determine the impact of various failure scenarios. Optionally, circuits can be given a priority to indicate their primary, secondary, tertiary, etc. status within the group.

Use case

A common use case is for dual circuits between two sites or devices. E.g.:

NYC1 -- DAL1 (provider A)
   |      |
NYC2 -- DAL2 (provider B)

In this case the two circuits linking the DAL and NYC sites would be added to the same CRG to indicate that they protect each other. One of the most useful applications of this information is evaluating maintenance notifications: if both circuits have overlapping maintenance events, the user knows to expect an outage. Such scenarios can be identified well ahead of time, provided the designed redundancy is documented.
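
As an illustration of the maintenance check described above, here is a minimal sketch in plain Python. It is not NetBox code: the `MaintenanceWindow` record and the `outage_starts` helper are hypothetical names, and the circuit names and dates are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MaintenanceWindow:
    # Hypothetical record for a provider maintenance notification.
    circuit: str
    start: datetime
    end: datetime  # treated as exclusive: the window is [start, end)


def outage_starts(group_circuits, windows):
    """Return the times at which every circuit in the group is under maintenance.

    The set of active windows only changes at window boundaries, so it is
    enough to test each window start time.
    """
    members = set(group_circuits)
    relevant = [w for w in windows if w.circuit in members]
    hits = []
    for t in sorted({w.start for w in relevant}):
        down = {w.circuit for w in relevant if w.start <= t < w.end}
        if down == members:
            hits.append(t)
    return hits


# Example data (assumed): two circuits in one CRG with overlapping windows.
crg = ["NYC1-DAL1", "NYC2-DAL2"]
windows = [
    MaintenanceWindow("NYC1-DAL1", datetime(2021, 9, 1, 1), datetime(2021, 9, 1, 5)),
    MaintenanceWindow("NYC2-DAL2", datetime(2021, 9, 1, 4), datetime(2021, 9, 1, 6)),
]
overlaps = outage_starts(crg, windows)
```

With this data the only moment returned is 04:00 on 1 September, when the second window begins while the first is still open.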

Database changes

Add a CRG model and any associated models required to track a many-to-many relationship between CRG and Circuit. It might also be worth considering CRG nesting (similar to how the Location model operates) for cases where the redundancy design is more complicated.
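
To make the shape of this concrete, here is a rough sketch using plain Python dataclasses rather than Django models. Every name in it (`RedundancyGroup`, `Membership`, `priority`, `parent`) is an assumption for illustration only, not an agreed schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Membership:
    # Stand-in for the "through" record of the many-to-many link
    # between a CRG and a circuit (hypothetical).
    circuit_id: str
    priority: Optional[int] = None  # 1 = primary, 2 = secondary, ...; None = unranked


@dataclass
class RedundancyGroup:
    name: str
    parent: Optional["RedundancyGroup"] = None  # optional nesting
    members: List[Membership] = field(default_factory=list)

    def add(self, circuit_id, priority=None):
        self.members.append(Membership(circuit_id, priority))

    def ordered_circuits(self):
        """Circuit IDs sorted by priority; unranked members sort last."""
        ranked = sorted(
            self.members,
            key=lambda m: (m.priority is None, m.priority or 0),
        )
        return [m.circuit_id for m in ranked]


# Example (assumed circuit IDs): one group, three members.
nyc_dal = RedundancyGroup("nyc-dal")
nyc_dal.add("provider-b-circuit", priority=2)
nyc_dal.add("unranked-circuit")
nyc_dal.add("provider-a-circuit", priority=1)
```

Keeping the priority on the membership record rather than on the circuit itself would let the same circuit carry a different priority in each group it belongs to.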

External dependencies

None

@tardoe tardoe added the type: feature Introduction of new functionality to the application label Aug 24, 2021
@jeremystretch jeremystretch added the status: under review Further discussion is needed to determine this issue's scope and/or implementation label Aug 27, 2021
@tardoe
Author

tardoe commented Sep 3, 2021

Having given this some further thought, we can likely add a field titled "minimum circuits" to the CRG model to indicate the minimum number of circuits required to call the CRG "healthy". This will enable a few use cases such as:

  • CRGs with more than two parallel links where more than two are required at any one time.
  • Circuits that are related but cannot be modelled with terminations, such as back-to-back connected circuits in uncontrolled spaces (e.g. a long-haul circuit and a last-mile circuit that are directly connected without a device in between), which would be considered a single logical circuit. Such a CRG would have a "minimum circuits" value of 2, meaning both circuits are required.
  • Nesting can be used to group these back-to-back CRGs (as above) with other single circuits that provide redundancy to one another.
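
A minimal sketch of how a "minimum circuits" check might work, assuming the operational state of each circuit is already known. The function name `crg_is_healthy` and the circuit names are hypothetical.

```python
def crg_is_healthy(circuit_up, minimum_circuits):
    """Return True if at least `minimum_circuits` members are up.

    `circuit_up` maps a circuit name to True (up) or False (down).
    """
    up_count = sum(1 for state in circuit_up.values() if state)
    return up_count >= minimum_circuits


# Back-to-back case: both segments are required, so the minimum is 2.
back_to_back = {"long-haul": True, "last-mile": False}
back_to_back_ok = crg_is_healthy(back_to_back, minimum_circuits=2)

# Three parallel links of which any two must be up.
parallel = {"link-a": True, "link-b": False, "link-c": True}
parallel_ok = crg_is_healthy(parallel, minimum_circuits=2)
```

Here the back-to-back group is unhealthy because one of its two required segments is down, while the parallel group still meets its minimum.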

@github-actions

github-actions bot commented Nov 2, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. NetBox is governed by a small group of core maintainers which means not all opened issues may receive direct feedback. Please see our contributing guide.

@github-actions github-actions bot added the pending closure Requires immediate attention to avoid being closed for inactivity label Nov 2, 2021
@jeremystretch jeremystretch added status: needs milestone Awaiting prioritization for inclusion with a future NetBox release and removed status: under review Further discussion is needed to determine this issue's scope and/or implementation pending closure Requires immediate attention to avoid being closed for inactivity labels Nov 2, 2021
@jeremystretch jeremystretch added status: accepted This issue has been accepted for implementation and removed status: needs milestone Awaiting prioritization for inclusion with a future NetBox release labels Dec 15, 2021
@jeremystretch jeremystretch added this to the v3.3 milestone Dec 15, 2021
@jeremystretch jeremystretch added this to Data Model in Roadmap Dec 15, 2021
@jeremystretch jeremystretch removed this from Data Model in Roadmap Jun 9, 2022
@jeremystretch jeremystretch removed this from the v3.3 milestone Jun 9, 2022
@jeremystretch jeremystretch added status: under review Further discussion is needed to determine this issue's scope and/or implementation and removed status: accepted This issue has been accepted for implementation labels Jun 9, 2022
@jeremystretch
Member

After revisiting this, it needs a more detailed implementation proposal. What specifically should this model look like? @tardoe can you provide some more detail?

@tardoe
Author

tardoe commented Jul 5, 2022

@jeremystretch What additional details do you require that have not been specified in the original outline?

I'd expect the CRG model to be similar to the Location model, likely with an added field for "failover type". This could either be user-defined or a predefined list, e.g. "active-passive", "active-active", "at-least-one", etc.

@jeremystretch
Member

What fields does the proposed CRG model have? Which of those are required? What are its unique constraints and ordering logic?

@github-actions

github-actions bot commented Sep 6, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. NetBox is governed by a small group of core maintainers which means not all opened issues may receive direct feedback. Do not attempt to circumvent this process by "bumping" the issue; doing so will result in its immediate closure and you may be barred from participating in any future discussions. Please see our contributing guide.

@github-actions github-actions bot added the pending closure Requires immediate attention to avoid being closed for inactivity label Sep 6, 2022
@cybarox
Contributor

cybarox commented Sep 7, 2022

My 2ct on this:

I would like to show an example where a CRG could help determine what impact an outage would have.

Our branches depend on cloud software that has to be available at all times.
Therefore, we are planning three circuits per site for high availability.
We use two ISPs because they have different backbones. Our main ISP offers an LTE/4G backup option, but this should only be used if our two main circuits are down.
If I translated our design into CR groups in NetBox, it would look something like this:

                  ________________________
                 [________site-crg________]
                 [ prio = 0               ]
                 [ type = parent          ]
                 [ mode = active-passive  ]
                 [ min_child = 1          ]
                 [ child_type = crg       ]
                 [________________________]
                   /                    \
     _____________/________      ________\_____________
    [_______main-crg_______]    [______backup-crg______]
    [ parent = site-crg    ]    [ parent = site-crg    ]
    [ prio = 1             ]    [ prio = 2             ]
    [ mode = active-active ]    [ mode = passive       ]
    [ min_child = 1        ]    [ min_child = 1        ]
    [ child_type = circuit ]    [ child_type = circuit ]
    [______________________]    [______________________]
      |  __________________       |  __________________
      |-[__circuit1-isp1___]      \-[__circuit3-isp1___]
      | [ prio = 1         ]
      | [ type = VDSL      ]        [ type = LTE/4G    ]
      | [ provider = isp1  ]        [ provider = isp1  ]
      | [ crg = main-crg   ]        [ crg = backup-crg ]
      | [__________________]        [__________________]
      |  __________________
      \-[__circuit2-isp2___]
        [ prio = 2         ]
        [ type = VDSL      ]
        [ provider = isp2  ]
        [ crg = main-crg   ]
        [__________________]

I would define a higher-level CRG for the site. In our case, two further CRGs are subordinate to it.
One of the groups combines our two main circuits. At least one circuit must be working here.
The other group contains the circuits that act as backup for the main circuits (e.g. if an excavator hits the main cable).
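
The nested layout in the diagram can be evaluated recursively. The following is only a sketch under assumptions: the `Group` class and `is_up` function are hypothetical, `min_child` is taken from the diagram, and the `prio` and `mode` attributes are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Group:
    name: str
    min_child: int
    # A child is either another Group or a circuit name (str).
    children: List[Union["Group", str]] = field(default_factory=list)


def is_up(node: Union[Group, str], circuit_up: Dict[str, bool]) -> bool:
    """A circuit is up if its state says so; a group is up if at least
    `min_child` of its children are up. Unknown circuits count as down."""
    if isinstance(node, str):
        return circuit_up.get(node, False)
    working = sum(is_up(child, circuit_up) for child in node.children)
    return working >= node.min_child


# The tree from the diagram above.
site = Group("site-crg", 1, [
    Group("main-crg", 1, ["circuit1-isp1", "circuit2-isp2"]),
    Group("backup-crg", 1, ["circuit3-isp1"]),
])

# Both main circuits are down, but the LTE backup is still up.
state = {"circuit1-isp1": False, "circuit2-isp2": False, "circuit3-isp1": True}
site_up = is_up(site, state)
```

In this state `main-crg` is down and `backup-crg` is up, so `site-crg` still satisfies `min_child = 1`.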

We also have custom fields to identify information such as physical location of the circuit handover point, ID of the electrical box, access contacts, etc. Today we fill in these fields for each circuit. It would make more sense to assign these fields to the CRG to maintain them in one place only.

I hope this helps when considering a solution.

@jeremystretch jeremystretch added status: needs milestone Awaiting prioritization for inclusion with a future NetBox release and removed status: under review Further discussion is needed to determine this issue's scope and/or implementation pending closure Requires immediate attention to avoid being closed for inactivity labels Sep 16, 2022
@ryanmerolle
Contributor

This could be a good candidate for 3.6

@tardoe
Author

tardoe commented May 11, 2023

Adding some more thoughts on this while I make time to actually implement it:

  • The issue with @cybarox's proposal is that you need three instances to describe a scenario that could really be covered by one. To lower complexity I would simply create a single CRG with all three circuits. The two main ISP circuits would have equal weight and the LTE connection a lower weight. Using those weights you can easily tell whether a site is "degraded" based on the weights of all available circuits. You can also assess the impact of any set of outages offline.

  • Given a "site" for an enterprise can be treated very differently to a service provider or a campus (e.g. site == building??) I feel that a site-level CRG is probably unnecessary.
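
The weight idea from the first bullet above could be sketched as follows. The weight values, the status names and the `crg_status` function are all assumptions chosen for illustration; the thread does not define how "degraded" would be computed.

```python
def crg_status(weights, up):
    """Classify a CRG from the weights of the circuits that are currently up.

    `weights` maps circuit name to weight; `up` is the set of circuits that are up.
    Returns a (status, fraction of total weight available) tuple.
    """
    total = sum(weights.values())
    available = sum(w for circuit, w in weights.items() if circuit in up)
    if available == 0:
        return "down", 0.0
    if available == total:
        return "healthy", 1.0
    return "degraded", available / total


# Assumed weights: the two main circuits weigh the same, LTE much less.
weights = {"circuit1-isp1": 10, "circuit2-isp2": 10, "circuit3-isp1": 1}

all_up = crg_status(weights, set(weights))
lte_only = crg_status(weights, {"circuit3-isp1"})
all_down = crg_status(weights, set())
```

Because the function only needs a set of circuits assumed to be up, it can be run against any hypothetical set of outages without touching live data.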

I'd like to have a go at implementing this in the coming weeks, time permitting. I feel that an initial, basic cut will give me a better idea of how this model might work.
