-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issues with current ACL implementation and solution proposal #311
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense. I think it has been like this since the usecases of the maintainers has currently been 100% personal usage and we just have not ran into this. But I can see how it is better.
My other comments should probably be read before this, as this is the last section:
However my main concern is the migration path.
I am contemplating two relativeliy different one:
- Change the logic
By this I mean, implement the new proposed solution, allowing access between all namespaces and fixing the ACL system.
As a part of allowing access, remove the shared node code, table, tests etc.
It would be a breaking change, but we are not 1.0.0, so that "should be fine" ™️.
- "rename" namespace to user
This is a "even more" breaking change, while I think it would make sense to clean up the naming to avoid confusion and bring it closer to the upstream lingo.
Basically this would:
- Implement the new suggestion as User (to reflect what it actually is)
- Remove share node feature
- Remove the namespace (cli, table, code, etc).
This would make the user/group keywords in ACLs make more sense aswell.
Before doing either, we should make sure we "get it right", I can't see any concerns right now, but would like to get some more eyes on it.
@juanfont @madjam002 @ohdearaugustin @cure
Edit:
I also want to write the ACLs in YAML (as well as HUJSON), but this should be trivial.
docs/acls.md
Outdated
|
||
For personal users the default behavior could either allow all communications between all namespaces (like tailscale) or dissallow all communications between namespaces (current behavior). | ||
|
||
For businesses and organisations, viewing a headscale instance a single tailnet would allow users (namespace) to talk to each other with the ACLs. As described in tailscale's documentation [[1]], a server should be tagged and personnal devices should be tied to a user. Translated in headscale's terms each user can have multiple devices and all those devices should be in the same namespace. The servers should be tagged and used as such. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like allow all as a default seem desirable, and then you add ACLs to limit it. That seem to be the thing that would align the most with Tailscales upstream behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will be in the favor for the current behavior, as it implies more security by default. This is probably a typical black/whitelist problem.
Being more align with the Tailscale upstream behaviour is a good point to consider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm personally more in favor of using ACL's to filter users behavior and all open by default. Users would not have to worry that nodes cannot talk to each other when they start their Headscale instance. If this is a needed feature it's quite easy to implement with the ACLs. The biggest issue is to properly understand Tailscale's ACLs that are quite difficult to grasp in my opinion.
I really like this and is basically what I've done with my Headscale instance. Removing sharing and just relying on ACLs to control access seems like a good solution. One thing to note - I'm pretty sure I read somewhere on the Tailscale docs that peers are only sent to people who have permission to access them via ACLs (I can't find the page now where I read it). Perhaps this is something that Headscale could implement in the future, it doesn't really help security as a connection can't be made if an ACL doesn't allow it, but it just means that |
docs/acls.md
Outdated
|
||
For personal users the default behavior could either allow all communications between all namespaces (like tailscale) or dissallow all communications between namespaces (current behavior). | ||
|
||
For businesses and organisations, viewing a headscale instance a single tailnet would allow users (namespace) to talk to each other with the ACLs. As described in tailscale's documentation [[1]], a server should be tagged and personnal devices should be tied to a user. Translated in headscale's terms each user can have multiple devices and all those devices should be in the same namespace. The servers should be tagged and used as such. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will be in the favor for the current behavior, as it implies more security by default. This is probably a typical black/whitelist problem.
Being more align with the Tailscale upstream behaviour is a good point to consider.
This adaptation of the ACL would need breaking changes, but I also see no problem with that as the project is still in constant developed and we won't need a backwards compatibility. Furthermore if we complete rework the ACL maybe we also should add the ACL configuration to a API. So a more dynamical configuration is possible. As for now we would need to change the configuration file, which will need a restart/reload of the server. All in all I think that is a great proposal. I also would love to write the ACL in Yaml. Not important just nice to have. |
I consider the ACL configuration as an API a very good thing to do. Restart Headscale to reload ACLs is not quite good. I didn't think of YAML to write configuration but it should be nice and quite easy to implement ! |
Thank you for you extensive review of this ! I agree with all points raised here. All those features are quite too much in my opinion for a single pull request. For migration path, we could either go slow and not change too much default behavior. Either by adding a flag to enable the new version of the ACLs or only enabling behavior if ACLs are enabled or completely break everything since it's a pre 1.0 release. Your second part of the migration path seems to me to be the natural next step of this rework. Removing the |
I think that this document should be moved into
Then I think that we should make a new proposal for this, because it does require some design and concern:
I recon it would make sense to rename it in any case, for non-OIDC setups aswell, my argument for this is to make it more aligned with the upstream, allowing us to scrap extra documentation to explain what stuff means. And you could probably argue that when all machines can talk to all machines, it makes more sense to associate them with a user, rather than namespace. User is also used in the ACL IIRC? This would probably also make sense to have in the proposal folder.
I agree, lets do it gradually, and do it from 0.14.0, since 0.13.0 is quite a big release already.
I will personally vote for full change rather than "flag guarding" it, as it just means we need to maintain more logic. |
Ok, I've moved the content into a proposals folder. Do you want me to try and write some other proposals for new As for this format, I think we will loose the comments. I could modify this document to try and integrate your comments, but I think we will still loose informations. |
Let us do that in separate PRs, one (big) problem at the time. |
A hidden thing was implied in this document is that each person should have his own namespace. Hidden information in spicification isn't good. Thank's @kradalby for pointing it out.
I started to look at the PR and it looks sensible, but I dont have the mental capacity to start commenting today. However I want to write this down before I forget, I think a release plan like this would make sense: 0.14.0: #320 which enables the new behaviour when ACLs are active, with a Changelog warning that this will be the default for everything in the consecutive release. This is not a defacto release plan, but a suggestion to allow some breathing room. Smaller changes like:
Can be added from 0.15.0 with this plan. We need to communicate during this process, so we dont do duplicate work. Probably having all features created as issues and assigning it would be a good start. |
This pull request is not really meant to be merged, but to try and explain the issues with the current ACL's and propose an alternative. This was discussed with @kradalby on discord.
I hope that the issue is correctly explained.