Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ACL management via API #582

Open
samson4649 opened this issue May 12, 2022 · 29 comments
Open

Add ACL management via API #582

samson4649 opened this issue May 12, 2022 · 29 comments
Labels
enhancement New feature or request might-come

Comments

@samson4649
Copy link

Feature request
Add management of the headscale ACLs via the API. This will allow for ACLs to be updated and current ACLS to be fetched from the headscale instance during runtime.

Method Purpose Comments
GET Get currently active ACL Data returned as JSON object
PUT Set new active ACL JSON blob provided to overwrite existing

Aiming to get this started with JSON input and output to start.z

As well as making this a programmable interface for automating ACLs, this will also make integration tests for PR #581 easier to achieve.

@samson4649 samson4649 added the enhancement New feature or request label May 12, 2022
@restanrm
Copy link
Contributor

Hello ! I love this idea and also wanted to work on this. I tried to make a proposal doc to describe this feature but didn't release it. As discussed in #492 it's not easy. We want to keep config as code principle as much as possible, but if the ACLs are in DB it's not the case. Also, the API modifying files on system is not ideal and would not work on k8s…

@samson4649
Copy link
Author

I've implemented a working update mechanism however im waiting merge for pr #581 (issue #492)

~ # ./headscale acl list
An updated version of Headscale has been found (0.16.0-beta4 vs. your current v0.15.0-389-g5d4dcb6-dirty). Check it out https://github.com/juanfont/headscale/releases
2022-06-20T07:11:37Z DBG Setting timeout timeout=5000
2022-06-20T07:11:37Z DBG HEADSCALE_CLI_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale/headscale.sock
{
	"groups": {
		"group:admin": [
			"host-1"
		],
		"group:boss": [
			"subnet-1"
		]
	},
	"hosts": {
		"host-1": "10.64.0.2/32",
		"subnet-1": "10.64.0.0/10"
	},
	"acls": [
		{
			"action": "accept",
			"src": [
				"boss"
			],
			"dst": [
				"boss:*"
			]
		},
		{
			"action": "accept",
			"src": [
				"admin"
			],
			"dst": [
				"dev1:*"
			]
		}
	]
}
~ # ./headscale acl update --file /etc/headscale/acl_2.hujson 
An updated version of Headscale has been found (0.16.0-beta4 vs. your current v0.15.0-389-g5d4dcb6-dirty). Check it out https://github.com/juanfont/headscale/releases
2022-06-20T07:11:43Z DBG Setting timeout timeout=5000
2022-06-20T07:11:43Z DBG HEADSCALE_CLI_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale/headscale.sock

~ # ./headscale acl list
An updated version of Headscale has been found (0.16.0-beta4 vs. your current v0.15.0-389-g5d4dcb6-dirty). Check it out https://github.com/juanfont/headscale/releases
2022-06-20T07:11:47Z DBG Setting timeout timeout=5000
2022-06-20T07:11:47Z DBG HEADSCALE_CLI_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale/headscale.sock
{
	"hosts": {
		"host-1": "10.64.0.2/32",
		"lol-net": "69.69.69.0/24",
		"subnet-1": "10.64.0.0/10"
	},
	"acls": [
		{
			"action": "accept"
		}
	]
}

@routerino
Copy link

Any word on what needs to happen to progress now that 16.0 is out? I'm waiting on API access to progress work on an ACL management interface.

@restanrm
Copy link
Contributor

restanrm commented Aug 8, 2022

I think the PR #581 is ready to be merged. Someon needs to take the update ACL subject. If I find some time I'll try to tackle it. I still have some issues on the ACL's that I need to handle. Time is hard to find. If someone else want to try on this please do !

@juanfont juanfont added this to the v0.17.0 milestone Aug 12, 2022
@kradalby
Copy link
Collaborator

kradalby commented Sep 8, 2022

While this is making good progress with a proposal, I will push it to 0.19.0 as a target, we have a lot of stuff for 0.17.0 and want to do a code reorg for 0.18.0.

@kradalby kradalby modified the milestones: v0.17.0, v0.19.0 Sep 8, 2022
@kradalby kradalby removed this from the v0.19.0 milestone May 10, 2023
@VaalaCat
Copy link

Is there any progress on this?

@kradalby
Copy link
Collaborator

There is currently no planned work to implement this.

Might happen in the future, but we have a lot planned for the time being.

Copy link
Contributor

This issue is stale because it has been open for 180 days with no activity.

@github-actions github-actions bot added the stale label Nov 17, 2023
Copy link
Contributor

github-actions bot commented Dec 1, 2023

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed Dec 1, 2023
@PizzaProgram
Copy link

At the end did someone merged this?
Or still waiting after 1.5 years?

@samson4649
Copy link
Author

At the end did someone merged this? Or still waiting after 1.5 years?

Nope.

@pallabpain
Copy link
Contributor

pallabpain commented Jan 27, 2024

I believe we can write a much simpler implementation by following what Tailscale is doing. https://github.com/tailscale/tailscale/blob/main/api.md#policy-file

The APIs can directly deal with the policy JSON structure and expect the user to fetch and modify it. All we will need to do is store is as jsonb in the DB somewhere and that's it.

If no one is working on this, I'd like to give this a try since it'd be a pretty useful feature for many.

@mathmaniac43
Copy link

@juanfont I think this issue was automatically closed by the bot by mistake. It seems to me like it should be re-opened since there is still interest in this feature and it looks like @pallabpain wants to take a swing at it. Thanks!

@juanfont juanfont reopened this Feb 21, 2024
@github-actions github-actions bot removed the stale label Feb 22, 2024
@pallabpain
Copy link
Contributor

I have a draft PR out at the moment which only introduces the bare minimum APIs. I'm still working on it at the moment.

#1792

While it is rather simple to set and get the policy via the APIs, we currently can load the ACLs using a file. I don't know how to handle both cases where it may be possible to load the ACL from the file, store it in the DB, and serve it to the API consumers. However, changes made to the policy via the APIs may not be written back to the file (if the policy was initially loaded from a file). This will make things inconsistent.

Considering restarts, it may be possible to refer to the policy stored in the db as the source of truth. But let's say the users update the policy file and expect headscale to use that. Then it will again make things inconsistent.

@kradalby What do you think? Does it make sense to have only one way of updating the policy, i.e. via the API?

@PizzaProgram
Copy link

... some ideas:

1. IMHO it would not be a big problem, if the ACL changes would be accepted only by the API.

  • Simply store the current one in the DB,
  • and just save a backup JSON file in a '../ACL/..' path whenever it changes.
  • Adding a timestamp to the file, acl2024-02-26_112233.old
  • deleting oldest, if file count > 1000 .
  • This helps security wise too, because it will make it visible, how many times it was changed, load back prev. working one, etc.

I know, docker is not officially supported, still many sysadmins are using it.
Editing files inside a docker container is a pain anyway, so it's safer to store in the DB, which would be preserved even if the image gets replaced / updated.

2. Lock

_But in case really considering allowing modifying it via files, HeadScale could:
Store the ACL in a simple JSON file, which would be locked during run

  • so nobody could edit it / delete it manually.

3. In case it would accept a new file edited manually:

There would be an ACL directory, containing 3 files:

  • Readme_ACL.md
  • acl.conf.new.sample
  • acl.conf (Locked)

The Readme would explain everything, including how to rename the .samle file into:

  • acl.conf.new

HS would check for this file once every 10 seconds, or watch for "file-change event" on Linux.

  1. if it detects a acl.conf.new file present,
  2. it tries to load it,
  3. if succeding, it would:
    • rename the old acl.conf to acl2024-02-26_112233.old
    • rename acl.conf.new to acl.conf + Lock it (so it becomes read-only)
  4. on failed loading:
    • create a error2024-02-26_112233.log

4. Security / important:

  • If the new ACL failing to load properly, (JSON parsing error, wrong keyword, duplicate entry, etc.)
  • it would be essential to disabling every access, rather than allowing everything.
  • But it's better to keep the "last working copy" in the DB too, and if there's a problem, load the old ACL instead!

@pallabpain
Copy link
Contributor

pallabpain commented Feb 26, 2024

@PizzaProgram Here's how I have thought about the ACL management, so far:

APIS

  1. Returns the current policy JSON
GET /api/v1/acl
  1. Updates the policy
PUT /api/v1/acl

The input payload shall be parsed using the existing helper function such that it undergoes the required set of parsing for it to be valid. Only a valid payload will be admitted and it shall overwrite the existing payload. This will ensure that the existing working copy is not corrupted by an invalid payload.

In addition to that, since we are storing it in the DB, we will have access to the updated_at timestamp.

I think we can also incorporate the suggestion of storing the update counter if it'd be helpful.

I think one way to allow file-based configs may be to have commands to get and set the policy which effectively stores it in the DB, thus making the DB the source of truth at all times.

# headscale acl get

# headscale acl set PATH_TO_FILE

These APIs are very much in-line with the original issue description.
#582 (comment)

@PizzaProgram
Copy link

PizzaProgram commented Feb 26, 2024

As far as I can tell, this sounds perfect.

Although I still recommend strongly to:

Keep a previous version of the ACL in DB too !

It would allow to:

  • get it
  • compare it with current one
  • check the date of that one
  • restore it

There are millions of cases, where something can go wrong. So keeping at least "one backup" would be crucial.

# headscale acl get-backup
# headscale acl restore-backup

I'm sure these 2 little words could save someone from a disaster, undoing an error with one easy command.
(Imagine a 1000+ client config and 1 tiny typing error, or a wrong API-call miss-click. )

Summarized:

it shall overwrite the existing payload.

So instead of overwrite, I recommend to simply:

  • insert a new line in the DB (generating a new Auto-sequence number)
  • point the "actual config ID" to that line

It is nearly the same amount of work, but opens up many possibilities to work with ACLs in a much safer way.

For example:

# headscale acl show-id  // shows the active config ID
# headscale acl get-by-id 5998 // get the whole config
# headscale acl restore-by-id 5996

@kradalby
Copy link
Collaborator

I think this discussion is going in a generally positive direction, I have not yet had any time to look at the PR, so apologise if this is addressed already, but I would like to add that the current read-from-disk-on-startup behaviour must be preserved. It is a non-negotiable, that does not mean that this feature cannot be added, it just need to find a way to work together with the current behaviour.

@PizzaProgram
Copy link

the current read-from-disk-on-startup behaviour must be preserved.

I do not think that a Database open / read would affect this in any way at startup.

We are talking about a "secondary" / optional possibility to save/push an ACL config:

  • to/from a file
  • while the server is running,
    instead of just by "virtual / API" payload only.

The config will be loaded from the database during startup. No actual "files" involved.

@pallabpain Am I summarizing it correctly ?

@pallabpain
Copy link
Contributor

pallabpain commented Feb 26, 2024

I think this discussion is going in a generally positive direction, I have not yet had any time to look at the PR, so apologise if this is addressed already, but I would like to add that the current read-from-disk-on-startup behaviour must be preserved. It is a non-negotiable, that does not mean that this feature cannot be added, it just need to find a way to work together with the current behaviour.

@kradalby The PR is still a work in progress. However, I intend to reach a point where both read-from-disk and APIs exist at the same time and that reading from disk too updates the ACLs in the database. That should give us a working prototype to discuss further.

The only thing that bothers me is the co-existence of two update mechanisms that might lead to inconsistencies due to either human error or incorrect usage.

Here's what I aim to achieve in the PR:

  1. APIs to get and set the policy
  2. CLI commands to get and set the policy
  3. Load policy from a file on startup and store in DB (also during reloads)

@PizzaProgram Yes, storing every modification as an entry is something that we can easily achieve. My only concern with such an implementation is what makes a particular config the last working config. Since the last working config is subjective to headscale users' intended usage we are already making sure that the input payloads are at least validated for correctness.

I believe that instead of storing history in the DB, users can employ any means for maintaining histories and versions, Gitops, etc.

What do you think?

@Yoshino-s
Copy link

Maybe add a new configuration like

# acl_policy_mode: db|[file]

acl_policy_mode: file
acl_policy_path: /path/to/acl.json

---

acl_policy_mode: db
# acl_policy_path will be ignore

If acl_policy_mode is file, all behaviors will keep same as the old version, and the acl api will be unavailable.
If acl_policy_mode is db, more new feature can be introduced, including acl api and other cli options?

so we can keep the the current read-from-disk-on-startup behaviour and also add what we want now

@pallabpain
Copy link
Contributor

pallabpain commented Mar 4, 2024

@Yoshino-s This can be one way to solve this. We can still let the GET API open which will return the policy. But in file mode, the PUT API will not set anything.

Or we can simply add a disclaimer saying that when there's an ACL file specified in the config, it will override the ACL in the DB on every startup. What do you think?

@Yoshino-s
Copy link

Yoshino-s commented Mar 4, 2024

@pallabpain Maybe simply ignoring it is a more logical option for users, thinking about db means data is from db, so the file is nothing about.

I made a draft implement at https://github.com/yoshino-s/headscale . Feel free to use it to merge into your pr, or just do your own :P

@evenh
Copy link
Contributor

evenh commented Mar 4, 2024

I think that the proposed acl_policy_mode should be respected with regards to the principle of least surprise. Having that be file by default means that one can maintain backwards compatibility.

Exposing the current ACL (backed by file or database) in the API and blocking modification when backed by file sounds like a good idea.

@kradalby
Copy link
Collaborator

kradalby commented Mar 4, 2024

Or we can simply add a disclaimer saying that when there's an ACL file specified in the config, it will override the ACL in the DB on every startup. What do you think?

I agree with Even on this one, having it go back to file everytime might be quite confusing, a solid toggle doing on or the other makes sense to me.

Making the file available with GET API also sounds reasonable.

@pallabpain
Copy link
Contributor

Collecting everyone's inputs,

  • ACL APIs:

    • Introduce ACL APIs to allow fetch and update operations, and persist in the database.
  • Configuration:

    • Users can configure ACLs to reside in a file or the database using a designated attribute.
      • In the 'file' mode, API access is restricted for ACL updates.
      • In the 'db' mode, APIs are imperative for dynamic ACL modifications.
  • CLI:

    • Implement CLI commands to get and set the ACLs per the configuration.

@pallabpain
Copy link
Contributor

Collecting everyone's inputs,

  • ACL APIs:

    • Introduce ACL APIs to allow fetch and update operations, and persist in the database.
  • Configuration:

    • Users can configure ACLs to reside in a file or the database using a designated attribute.

      • In the 'file' mode, API access is restricted for ACL updates.
      • In the 'db' mode, APIs are imperative for dynamic ACL modifications.
  • CLI:

    • Implement CLI commands to get and set the ACLs per the configuration.

@Yoshino-s @kradalby @evenh I have updated my PR considering the above requirements. Please take a look. Thanks. 🙇🏼

@kradalby
Copy link
Collaborator

kradalby commented Mar 6, 2024

Will take a look tomorrow.

@almereyda
Copy link

almereyda commented Mar 26, 2024

After reviewing the ongoing discussion on #1792, I am left with wondering if the implementation could be (optionally) made wire-compatible with the upstream without too much effort.

gitops-pusher.go has a client implementation that pushes HuJSON files to upstream and is used in CI for implementing the GitOps pattern.

It appears this even accepts a custom apiServer implementing the expected endpoints.

Would the gitops-pusher maybe provide enough documentation of a suitable API surface for more generic upstream compatibility?

We would probably only have to agree on a default string for the tailnet variable or make it configurable, e.g. if dns_config.magic_dns: true is set through dns_config.base_domain.

This compatibility work could also provide the foundation to implementing a more complete set of endpoints from the Tailscale HTTP v2 API, eventually providing OAuth compatibility for fine-grained access control, as suggested in the previously rejected #1202.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request might-come
Projects
None yet
Development

No branches or pull requests