Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with current ACL implementation and solution proposal #311

Merged
merged 9 commits into from
Mar 8, 2022

Conversation

restanrm
Copy link
Contributor

@restanrm restanrm commented Feb 8, 2022

This pull request is not really meant to be merged, but to try and explain the issues with the current ACL's and propose an alternative. This was discussed with @kradalby on discord.

I hope that the issue is correctly explained.

Copy link
Collaborator

@kradalby kradalby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense. I think it has been like this since the usecases of the maintainers has currently been 100% personal usage and we just have not ran into this. But I can see how it is better.

My other comments should probably be read before this, as this is the last section:

However my main concern is the migration path.

I am contemplating two relativeliy different one:

  1. Change the logic

By this I mean, implement the new proposed solution, allowing access between all namespaces and fixing the ACL system.
As a part of allowing access, remove the shared node code, table, tests etc.

It would be a breaking change, but we are not 1.0.0, so that "should be fine" ™️.

  1. "rename" namespace to user

This is a "even more" breaking change, while I think it would make sense to clean up the naming to avoid confusion and bring it closer to the upstream lingo.

Basically this would:

  • Implement the new suggestion as User (to reflect what it actually is)
  • Remove share node feature
  • Remove the namespace (cli, table, code, etc).

This would make the user/group keywords in ACLs make more sense aswell.

Before doing either, we should make sure we "get it right", I can't see any concerns right now, but would like to get some more eyes on it.

@juanfont @madjam002 @ohdearaugustin @cure

Edit:
I also want to write the ACLs in YAML (as well as HUJSON), but this should be trivial.

docs/acls.md Outdated Show resolved Hide resolved
docs/acls.md Outdated Show resolved Hide resolved
docs/acls.md Outdated

For personal users the default behavior could either allow all communications between all namespaces (like tailscale) or dissallow all communications between namespaces (current behavior).

For businesses and organisations, viewing a headscale instance a single tailnet would allow users (namespace) to talk to each other with the ACLs. As described in tailscale's documentation [[1]], a server should be tagged and personnal devices should be tied to a user. Translated in headscale's terms each user can have multiple devices and all those devices should be in the same namespace. The servers should be tagged and used as such.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like allow all as a default seem desirable, and then you add ACLs to limit it. That seem to be the thing that would align the most with Tailscales upstream behaviour.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will be in the favor for the current behavior, as it implies more security by default. This is probably a typical black/whitelist problem.
Being more align with the Tailscale upstream behaviour is a good point to consider.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm personally more in favor of using ACL's to filter users behavior and all open by default. Users would not have to worry that nodes cannot talk to each other when they start their Headscale instance. If this is a needed feature it's quite easy to implement with the ACLs. The biggest issue is to properly understand Tailscale's ACLs that are quite difficult to grasp in my opinion.

docs/acls.md Outdated Show resolved Hide resolved
docs/acls.md Outdated Show resolved Hide resolved
docs/acls.md Outdated Show resolved Hide resolved
@madjam002
Copy link
Contributor

I really like this and is basically what I've done with my Headscale instance. Removing sharing and just relying on ACLs to control access seems like a good solution.

One thing to note - I'm pretty sure I read somewhere on the Tailscale docs that peers are only sent to people who have permission to access them via ACLs (I can't find the page now where I read it). Perhaps this is something that Headscale could implement in the future, it doesn't really help security as a connection can't be made if an ACL doesn't allow it, but it just means that tailscale status / Windows tray menu would only show nodes that are allowed for that particular user/device via ACLs.

docs/acls.md Outdated Show resolved Hide resolved
docs/acls.md Outdated

For personal users the default behavior could either allow all communications between all namespaces (like tailscale) or dissallow all communications between namespaces (current behavior).

For businesses and organisations, viewing a headscale instance a single tailnet would allow users (namespace) to talk to each other with the ACLs. As described in tailscale's documentation [[1]], a server should be tagged and personnal devices should be tied to a user. Translated in headscale's terms each user can have multiple devices and all those devices should be in the same namespace. The servers should be tagged and used as such.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will be in the favor for the current behavior, as it implies more security by default. This is probably a typical black/whitelist problem.
Being more align with the Tailscale upstream behaviour is a good point to consider.

docs/acls.md Outdated Show resolved Hide resolved
docs/acls.md Outdated Show resolved Hide resolved
docs/acls.md Outdated Show resolved Hide resolved
docs/acls.md Outdated Show resolved Hide resolved
@ohdearaugustin
Copy link
Collaborator

This adaptation of the ACL would need breaking changes, but I also see no problem with that as the project is still in constant developed and we won't need a backwards compatibility.

Furthermore if we complete rework the ACL maybe we also should add the ACL configuration to a API. So a more dynamical configuration is possible. As for now we would need to change the configuration file, which will need a restart/reload of the server.

All in all I think that is a great proposal.

I also would love to write the ACL in Yaml. Not important just nice to have.

@restanrm restanrm closed this Feb 14, 2022
@restanrm restanrm reopened this Feb 14, 2022
@restanrm
Copy link
Contributor Author

This adaptation of the ACL would need breaking changes, but I also see no problem with that as the project is still in constant developed and we won't need a backwards compatibility.

Furthermore if we complete rework the ACL maybe we also should add the ACL configuration to a API. So a more dynamical configuration is possible. As for now we would need to change the configuration file, which will need a restart/reload of the server.

All in all I think that is a great proposal.

I also would love to write the ACL in Yaml. Not important just nice to have.

I consider the ACL configuration as an API a very good thing to do. Restart Headscale to reload ACLs is not quite good.

I didn't think of YAML to write configuration but it should be nice and quite easy to implement !

@restanrm
Copy link
Contributor Author

I think this makes sense. I think it has been like this since the usecases of the maintainers has currently been 100% personal usage and we just have not ran into this. But I can see how it is better.

My other comments should probably be read before this, as this is the last section:

However my main concern is the migration path.

I am contemplating two relativeliy different one:

1. Change the logic

By this I mean, implement the new proposed solution, allowing access between all namespaces and fixing the ACL system. As a part of allowing access, remove the shared node code, table, tests etc.

It would be a breaking change, but we are not 1.0.0, so that "should be fine" tm.

2. "rename" namespace to user

This is a "even more" breaking change, while I think it would make sense to clean up the naming to avoid confusion and bring it closer to the upstream lingo.

Basically this would:

* Implement the new suggestion as User (to reflect what it actually is)

* Remove share node feature

* Remove the namespace (cli, table, code, etc).

This would make the user/group keywords in ACLs make more sense aswell.

Before doing either, we should make sure we "get it right", I can't see any concerns right now, but would like to get some more eyes on it.

@juanfont @madjam002 @ohdearaugustin @cure

Edit: I also want to write the ACLs in YAML (as well as HUJSON), but this should be trivial.

Thank you for you extensive review of this ! I agree with all points raised here. All those features are quite too much in my opinion for a single pull request. For migration path, we could either go slow and not change too much default behavior. Either by adding a flag to enable the new version of the ACLs or only enabling behavior if ACLs are enabled or completely break everything since it's a pre 1.0 release.

Your second part of the migration path seems to me to be the natural next step of this rework. Removing the namespace cli seems a lot because not every users would be using OIDC. I think we should rename this cli but keeping it is a must have. For a personal instance OIDC is often too much a burden to setup.

@kradalby
Copy link
Collaborator

I think that this document should be moved into docs/proposals or something, and maybe have a 001- in front.
Proposal is a bit more versatile than Issues since it is easier to comment on the actual text.

I consider the ACL configuration as an API a very good thing to do. Restart Headscale to reload ACLs is not quite good.

Then I think that we should make a new proposal for this, because it does require some design and concern:

  • How do we maintain the current setup, allowing "config as code"
  • were do we store/update changes and how do we treat it comparing to the already defined as code

Removing the namespace cli seems a lot because not every users would be using OIDC. I think we should rename this cli but keeping it is a must have. For a personal instance OIDC is often too much a burden to setup.

I recon it would make sense to rename it in any case, for non-OIDC setups aswell, my argument for this is to make it more aligned with the upstream, allowing us to scrap extra documentation to explain what stuff means.

And you could probably argue that when all machines can talk to all machines, it makes more sense to associate them with a user, rather than namespace. User is also used in the ACL IIRC?

This would probably also make sense to have in the proposal folder.

All those features are quite too much in my opinion for a single pull request. For migration path, we could either go slow and not change too much default behavior.

I agree, lets do it gradually, and do it from 0.14.0, since 0.13.0 is quite a big release already.

Either by adding a flag to enable the new version of the ACLs or only enabling behavior if ACLs are enabled or completely break everything since it's a pre 1.0 release.

I will personally vote for full change rather than "flag guarding" it, as it just means we need to maintain more logic.

@restanrm
Copy link
Contributor Author

Ok, I've moved the content into a proposals folder. Do you want me to try and write some other proposals for new user command and acl command or should we do this in another pull request ?

As for this format, I think we will loose the comments. I could modify this document to try and integrate your comments, but I think we will still loose informations.

@kradalby
Copy link
Collaborator

Ok, I've moved the content into a proposals folder. Do you want me to try and write some other proposals for new user command and acl command or should we do this in another pull request ?

Let us do that in separate PRs, one (big) problem at the time.

A hidden thing was implied in this document is that each person should have his own namespace.
Hidden information in spicification isn't good.
Thank's @kradalby for pointing it out.
@kradalby kradalby mentioned this pull request Feb 18, 2022
@kradalby
Copy link
Collaborator

I started to look at the PR and it looks sensible, but I dont have the mental capacity to start commenting today. However I want to write this down before I forget,

I think a release plan like this would make sense:

0.14.0: #320 which enables the new behaviour when ACLs are active, with a Changelog warning that this will be the default for everything in the consecutive release.
0.15.0: Make all namespaces talk to everything, regardless of ACL settings. Adds a warning that namespace will be renamed to user in next release.
0.16.0: Rename namespace -> user

This is not a defacto release plan, but a suggestion to allow some breathing room.

Smaller changes like:

  • YAML ACL
  • Remove shared nodes
  • other things

Can be added from 0.15.0 with this plan.

We need to communicate during this process, so we dont do duplicate work. Probably having all features created as issues and assigning it would be a good start.

@kradalby kradalby merged commit 62d7fae into juanfont:main Mar 8, 2022
@restanrm restanrm deleted the docs-acl-modifications branch March 8, 2022 19:07
@kradalby kradalby mentioned this pull request Jun 10, 2022
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants