Enable correct representation of components in stretched layer 2 VLAN #1593

Closed
hawko2600 opened this issue Oct 14, 2017 · 12 comments

Comments

@hawko2600

Issue type

[✓] Feature request
[ ] Bug report
[ ] Documentation

Environment

  • Python version: 2.7.6
  • NetBox version: 2.2.1

Description

Whilst netbox has the capability to represent "global" vlans, it lacks the ability to correctly represent a stretched layer 2 network, i.e. where one or more vlans live in one or more datacenters, and may share a vlan id with other local or stretched vlans from the same or other tenants at the same site (in totally physically segregated networks, of course).

This limitation extends to the components living in the datacentres, e.g. it's not possible to represent that a virtual machine lives in a cluster that exists in multiple sites on a stretched layer 2 vlan; instead you have to use workarounds like duplicating resources at each site or other permutations/combinations of entering fake data. The location of a VM, the cluster it lives on, the IP network it lives on, the VLAN it's in, etc. all permit only a 1:1 mapping with "site" and "tenant", so you're forced to duplicate entries to work around the problem.

The "global" vlan described by #235 doesn't exist in practical use (but would be supportable under this issue anyway), and unfortunately other attempts like #852 were incorrectly closed as a duplicate of the former. The core difference from #235 is the 1:many relationship with sites: it's not just one site or "all sites". For geolocation and other reasons this problem also extends to tenants.

TL;DR: sites and tenants should accept 1:many relationships when used in VMs, clusters, VLANs, prefixes, ...

@hawko2600
Author

ping @stefanjagger @Armadill0 @Censored3 @candlerb @rdujardin (and anyone else with more than one datacentre ;)

@candlerb
Contributor

The fact that things can only be assigned to a single "Site" or "Tenant" (or to none) hasn't been a major stumbling block for me. I would choose to create a single object associated with none, rather than duplicate the object.

Right now I am using "Tenant" to represent "major service group", and there are cases where one device/VM is part of two specific services, but not global. Hence I could make use of many-to-many Tenant relations if they existed.

However for me, it would be much more useful if "Role" could change into something like a tag or label, where I could assign multiple Role labels to the same Device or VM. There are a ton of uses I could find for that (e.g. which major services are affected by this device; what monitoring tools to configure for it...) and I'd no longer need to mis-use Tenant. I don't know if there's a ticket for that idea.

I would not use many-to-many relations between (VLAN|Prefix) and Site. In fact, in the current model, it annoys me that VLANs and Prefixes both have to be separately associated with Site and Tenant, so there's lots of duplicate entry already, which would only get worse if this became many-to-many.

I would prefer that the Prefix can be associated just with a Tenant, and the VLAN can be associated just with Site(s). But that's a different issue.

@hawko2600
Author

The trouble with not associating is that you can't drill down from a site level or perform any kind of automation based on that grouping via the API. It defeats the purpose of functionality like http://netbox.readthedocs.io/en/latest/data-model/extras/#export-templates

I have multiple sites, two of which have some spanned layer 2 networks. I need to be able to represent that certain vlans are the same vlan at those two sites (and, ergo, all the other objects that exist thereon). However, I can only duplicate everything at those sites, or use a "global" vlan and thereby add them to many other sites they don't exist at. Both are ugly hacks that break automation.

The Role => tag idea is solid; it's only used for search so it makes sense, especially where you have a physical server host that might run segregated roles in containers. I'd use this too!

For gateways and hosting providers, multiple tenants can be at the same site and duplicate vlan IDs and IP networks (RFC1918 addressing) with another tenant as the physical network is segregated; this is why there's that superset. It's not a duplicate for the same reason another person called Brian is not a duplicate of you, even though in smaller circles you can be the only Brian in the room.

@candlerb
Contributor

> multiple tenants can be at the same site and duplicate vlan IDs and IP networks (RFC1918 addressing)

That's what VRFs are for: to distinguish distinct instances of the same prefix. Duplicate vlan IDs are already supported.

@lampwins
Contributor

@darthmdh would #150 solve your issues? You talk about assigning vlans to sites for the purpose of automation. For me, I only care that a vlan is assigned to a port. Being assigned to at least one port means the vlan is at the site. Prior to that assignment, I do not care whether or not it is "at the site" from an automation standpoint. There is nothing meaningful I can do from an automation perspective by knowing strictly that a vlan is assigned to a site.

@jeremystretch
Member

@darthmdh To be clear, a global VLAN is one which is available at all sites. Your request is to support the assignment of a VLAN to a set of specific sites. How do you propose extending the data model to facilitate this?

@candlerb
Contributor

Related to the discussion on multiple tags: #132

@hawko2600
Author

@lampwins unfortunately that's not related. The reason this issue fails automation is that you're currently forced to create fake entries - iterating via the REST API will then have the orchestration system attempt to access / create / modify / document things that don't actually exist (or take actions multiple times unnecessarily on the thing that does exist). I do support issue #150 however as it would be useful to relate a device to vlans its interfaces are on if for nothing else than documentation / CMDB.

@candlerb my problem is the reverse; I know I can create duplicate vlan IDs and I explicitly do NOT want to do this when it is the same vlan. Netbox needs to support a 1:1 relationship between a vlan object created in netbox and an actual vlan that exists. Right now I'm forced to create duplicate vlan objects because of this issue where netbox will not let me associate the vlan object with the specific sites it exists at. There is no vrf, this is not a distinct instance of the same vlan id at one or more sites, it is literally the exact same vlan. There is a DCI circuit between the sites and multiple layer 2 networks are stretched between them. You can arp devices at both sites on that vlan from either side. It is not a "global vlan" as there are other sites which do not have these vlans. They might have a distinct instance (or ten) of the same vlan id, or not, but that's already supported as we both know.

@jeremystretch Going on the assumption Sites is already 1:many with VLAN Groups, Prefixes and VLANs, the relationship table needs to be extended to many:many so that these items can exist at multiple sites. I should probably mock this up in a test instance.
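As a rough sketch of that many:many extension (not NetBox's actual schema; the `site`, `vlan`, and `vlan_site` tables and both helper functions are hypothetical names invented for illustration), a junction table lets a single VLAN object be attached to exactly the sites it spans, while an unrelated VLAN elsewhere can still reuse the same VLAN ID:

```python
import sqlite3

# Hypothetical schema for illustration only; not NetBox's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE vlan (id INTEGER PRIMARY KEY, vid INTEGER NOT NULL, name TEXT NOT NULL);
    -- Junction table: one row per (vlan, site) pair makes the relation many:many.
    CREATE TABLE vlan_site (
        vlan_id INTEGER NOT NULL REFERENCES vlan(id),
        site_id INTEGER NOT NULL REFERENCES site(id),
        PRIMARY KEY (vlan_id, site_id)
    );
""")

conn.executemany("INSERT INTO site (id, name) VALUES (?, ?)",
                 [(1, "dc-a"), (2, "dc-b"), (3, "dc-c")])
# One stretched VLAN object, plus an unrelated local VLAN reusing the same VID.
conn.executemany("INSERT INTO vlan (id, vid, name) VALUES (?, ?, ?)",
                 [(10, 1234, "stretched-app"), (11, 1234, "local-lab")])
conn.executemany("INSERT INTO vlan_site (vlan_id, site_id) VALUES (?, ?)",
                 [(10, 1), (10, 2), (11, 3)])


def vlans_at_site(site_name):
    """Return (vlan id, vid, name) for every VLAN present at the given site."""
    rows = conn.execute(
        """
        SELECT v.id, v.vid, v.name
        FROM vlan v
        JOIN vlan_site vs ON vs.vlan_id = v.id
        JOIN site s ON s.id = vs.site_id
        WHERE s.name = ?
        ORDER BY v.id
        """,
        (site_name,),
    )
    return rows.fetchall()


def sites_of_vlan(vlan_id):
    """Return the names of all sites a single VLAN object spans."""
    rows = conn.execute(
        """
        SELECT s.name FROM site s
        JOIN vlan_site vs ON vs.site_id = s.id
        WHERE vs.vlan_id = ?
        ORDER BY s.name
        """,
        (vlan_id,),
    )
    return [name for (name,) in rows]
```

With this shape the stretched VLAN is one row that resolves to two sites, so iterating per site never yields a fake duplicate object.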

Prefixes probably annoy me the most: I'm unable to associate them properly as the layer 3 network exists on a stretched layer 2 at multiple sites (but not every site), so the site and vlan fields are left blank and there's no ability to navigate up and down the association hierarchy.

@candlerb
Contributor

> @candlerb my problem is the reverse; I know I can create duplicate vlan IDs and I explicitly do NOT want to do this when it is the same vlan.

I didn't say you should. I said: "I would choose to create a single object associated with none, rather than duplicate the object."

The comment about duplicate IP subnets in different VRFs was in response to:

> multiple tenants can be at the same site and duplicate vlan IDs and IP networks (RFC1918 addressing) with another tenant as the physical network is segregated

Therefore, in this thread you have been talking about different things at different times:

  1. a single vlan ID 1234 with subnet 10.11.12.0/24, which spans multiple sites
  2. vlan ID 1234 with subnet 10.11.12.0/24 for tenant 1, and a separate instance of vlan ID 1234 with subnet 10.11.12.0/24 for tenant 2. These are separate broadcast domains, isolated from each other.

In Netbox, the first is modelled as a single VLAN object and single prefix object. The second is modelled as two VLANs (both with ID 1234) and two prefixes (both with 10.11.12.0/24, but with different VRF IDs to distinguish them as being different subnets)
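A minimal sketch of those two cases, using plain Python dataclasses rather than NetBox's real models (the `Vlan` and `Prefix` classes, the VRF names, and `distinct_subnets` are all hypothetical and exist only for this illustration). The point it shows is that a subnet's identity is the pair (VRF, network), so the same network in two VRFs counts as two subnets:

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import Optional


@dataclass(frozen=True)
class Vlan:
    vid: int
    name: str
    tenant: Optional[str] = None


@dataclass(frozen=True)
class Prefix:
    network: str
    vrf: Optional[str]  # None stands for the global routing table
    vlan: Vlan

    def key(self):
        # Two prefixes are the same subnet only if both VRF and network match.
        return (self.vrf, ip_network(self.network))


# Case 1: one broadcast domain spanning several sites -> one VLAN, one prefix.
stretched = Vlan(vid=1234, name="stretched")
case1 = [Prefix("10.11.12.0/24", vrf=None, vlan=stretched)]

# Case 2: two isolated broadcast domains reusing the same VID and subnet.
vlan_t1 = Vlan(vid=1234, name="t1-lan", tenant="tenant1")
vlan_t2 = Vlan(vid=1234, name="t2-lan", tenant="tenant2")
case2 = [
    Prefix("10.11.12.0/24", vrf="tenant1-vrf", vlan=vlan_t1),
    Prefix("10.11.12.0/24", vrf="tenant2-vrf", vlan=vlan_t2),
]


def distinct_subnets(prefixes):
    """Count subnets, treating the same network in different VRFs as distinct."""
    return len({p.key() for p in prefixes})
```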

> Netbox needs to support a 1:1 relationship between a vlan object created in netbox and an actual vlan that exists. Right now I'm forced to create duplicate vlan objects because of this issue where netbox will not let me associate the vlan object with the specific sites it exists at.

No, you're not forced to do that. What I suggest is you create a "global" VLAN (associated with no sites, or all sites, however you look at it), even though it may not propagate to all sites. Then you put in the comments "this VLAN exists in site X and Y only".

It is not perfect, but in my opinion it is better than creating duplicate objects representing the same thing.

The long-term solution I think you want is to be able to associate a VLAN with an explicit list of sites. I wouldn't really object to it, but for me it's a low priority. You explicitly pinged me to ask for my opinion, and so here it is.

However this idea bears some similarity to the idea of tagging (#132) where an arbitrary set of tags can be associated with an object, and that would be much more useful to me.

The main problem I see with associating a VLAN with a list of sites, is that for consistency you would also need to associate a prefix with a list of sites, and that results in more duplicate data entry and hence increases the likelihood of inconsistencies.

To avoid this, as I said before, I think that a prefix should be associated with a tenant (or tenants), and a VLAN should be associated with a site (or sites). The VLAN is the thing which has physical expression, in the sense of where its broadcast domain propagates. Given a prefix which has been assigned to a VLAN, you can find out which sites it exists in by seeing where its associated VLAN propagates.
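To make that lookup concrete, here is a small sketch assuming a model in which only the VLAN carries a site list and the prefix carries a tenant (the `Vlan` and `Prefix` classes and `sites_for_prefix` are hypothetical names, not NetBox's API). A prefix with no VLAN assigned simply yields no sites under this assumption:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Vlan:
    vid: int
    sites: frozenset = frozenset()  # where the broadcast domain propagates


@dataclass(frozen=True)
class Prefix:
    network: str
    tenant: Optional[str] = None
    vlan: Optional[Vlan] = None


def sites_for_prefix(prefix):
    """Sites where the prefix exists, derived from where its VLAN propagates."""
    if prefix.vlan is None:
        # No layer 2 domain modelled, so nothing can be derived.
        return frozenset()
    return prefix.vlan.sites


stretched = Vlan(vid=1234, sites=frozenset({"X", "Y"}))
on_vlan = Prefix("10.11.12.0/24", tenant="tenant1", vlan=stretched)
no_vlan = Prefix("192.168.0.0/24", tenant="tenant1")
```

Here `sites_for_prefix(on_vlan)` returns the VLAN's two sites without the prefix storing any site data of its own.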

@hawko2600
Author

> The main problem I see with associating a VLAN with a list of sites, is that for consistency you would also need to associate a prefix with a list of sites, and that results in more duplicate data entry and hence increases the likelihood of inconsistencies.

Where is the duplicate data, though? If the prefix 10.11.12.0/24 exists in separate vlans at separate sites (or at the same site), you simply have multiple prefixes that share the same name but are otherwise completely different. It is perfectly logical for multiple tenants to use RFC1918 addressing however they see fit.

> To avoid this, as I said before, I think that a prefix should be associated with a tenant (or tenants), and a VLAN should be associated with a site (or sites). The VLAN is the thing which has physical expression, in the sense of where its broadcast domain propagates. Given a prefix which has been assigned to a VLAN, you can find out which sites it exists in by seeing where its associated VLAN propagates.

Prefixes can already be associated with a tenant, and I'm simply asking to fix the fact that a vlan cannot be associated with multiple sites, as per the commonplace stretched layer 2 vlan method. The design philosophy clearly states that netbox is to accurately reflect a real-world network. You can have prefixes that do not exist on any vlan (as the network is untagged), so it's not accurate to map ip -> prefix -> vlan to discover the site (and the workaround for this would be to create a gazillion duplicate "vlan 1" objects, which is just horrid).

> What I suggest is you create a "global" VLAN (associated with no sites, or all sites, however you look at it), even though it may not propagate to all sites. Then you put in the comments "this VLAN exists in site X and Y only".

Then you're not able to accurately navigate in the UI nor via the REST API. In practical usage there's no such thing as a "global" vlan (though you could argue untagged id 1 fits the bill on a technicality, just to be argumentative). Look at #329; it's one of the many issues here wrongly closed as a duplicate of #235 before #235 was actually implemented in a manner that does not permit accurate reflection of stretched layer 2 networks.

I know what you're saying about random tags in #132 (this comment is killer!) and at first glance it would seem you could lump sites and tenants into this, but this would make it difficult to expose a sane api as the containing object would be /api/dcim/tag/<random_name> instead of /api/dcim/sites (and this would have a knock-on effect with the UI). Whilst that's perfectly fine with arbitrary labels like "role" or "platform", it doesn't lend itself well to something that physically exists like a location.

@candlerb
Contributor

> Where is the duplicate data, though?

Sigh.

  1. VLAN 1234 is associated with sites A, B, C, X and Y (which is what you propose).
  2. Prefix 10.11.12.0/24 is associated with VLAN 1234
  3. Prefix 10.11.12.0/24 is associated with sites A, B, C, X and Y (because prefixes can be assigned to sites)

Therefore, points (1) and (3) require duplication of the site linkage information.
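A short sketch of why that duplication matters, using plain dictionaries as a stand-in for the data model (the function name, the dictionary layout, and the extra site "Z" are assumptions made up for this example; the subnet is the 10.11.12.0/24 example from earlier in the thread). Once the site list is stored in two places, the copies can drift apart and something has to detect it:

```python
def site_mismatches(vlan_sites, prefix_vlan, prefix_sites):
    """Return prefixes whose own site list disagrees with their VLAN's site list."""
    mismatches = {}
    for prefix, vid in prefix_vlan.items():
        expected = vlan_sites.get(vid, set())
        actual = prefix_sites.get(prefix, set())
        if expected != actual:
            mismatches[prefix] = {
                "missing": expected - actual,  # on the VLAN but not the prefix
                "extra": actual - expected,    # on the prefix but not the VLAN
            }
    return mismatches


# Points (1), (2) and (3) from the list above, entered consistently.
vlan_sites = {1234: {"A", "B", "C", "X", "Y"}}
prefix_vlan = {"10.11.12.0/24": 1234}
prefix_sites = {"10.11.12.0/24": {"A", "B", "C", "X", "Y"}}

in_sync = site_mismatches(vlan_sites, prefix_vlan, prefix_sites)

# Someone later extends the VLAN to site Z but forgets to update the prefix.
vlan_sites[1234].add("Z")
drifted = site_mismatches(vlan_sites, prefix_vlan, prefix_sites)
```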

> You can have prefixes that do not exist on any vlan (as the network is untagged)

The prefix exists on a layer two domain. If that domain passes through a switch, the switch will assign it a VLAN ID - even if it's untagged on all ports on that switch. So to me, it makes sense to model all layer two domains as "VLANs". But you're right, people today may be assigning prefixes directly to sites without bothering to model the layer two domain.

@jeremystretch
Member

Closing this out as it never really went anywhere, and I'm not interested in upending the current data model at this point.

@lock lock bot locked as resolved and limited conversation to collaborators Jan 17, 2020