Error: This error results from an error during password verification #86

Closed
kaysond opened this issue Nov 24, 2021 · 24 comments · Fixed by #94
Labels
bug Something isn't working

Comments


kaysond commented Nov 24, 2021

If it's not obvious by now, I've been slowly trying to get this up and running all night XD

I'm now stuck on the following error: Error: This error results from an error during password verification

I couldn't find that error text in the repo, so I guess it's coming from a dependency? I'm wondering if it's a permissions problem, because the image is built assuming it runs as uid 10001.

My compose file is:

---
version: '3.8'

services:
  lldap:
    hostname: "{{ .Service.Name }}.{{ .Task.Slot }}"
    image: nitnelave/lldap:latest@sha256:0af94a81b3e5d55967e5012298d81bb5eceaa3c9670fbc39d2ba63bada7ffcf1
    volumes:
      - /var/lib/homelab/lldap/config:/data
    networks:
      - public
    healthcheck:
      interval: 30s
      retries: 3
      start_period: 60s
      test: exit 0
      timeout: 5s
    deploy:
      mode: replicated
      replicas: 1
      restart_policy:
        condition: on-failure
        delay: 15s
        max_attempts: 3
        window: 60s
      update_config:
        delay: 15s
        failure_action: rollback
        max_failure_ratio: 0
        monitor: 30s
        order: start-first
        parallelism: 1
      rollback_config:
        delay: 15s
        failure_action: pause
        max_failure_ratio: 0
        monitor: 30s
        order: start-first
        parallelism: 1
      labels:
        traefik.enable: 'true'
        traefik.http.routers.lldap.entrypoints: websecure
        traefik.http.routers.lldap.middlewares: local-only@file
        traefik.http.services.lldap.loadbalancer.server.port: 17170
        traefik.subdomain: lldap
      resources:
        limits:
          memory: 1G

networks:
  public:
    attachable: false
    ipam:
      config:
      - subnet: 172.21.6.0/24

and my config is

## Default configuration for Docker.
## All the values can be overridden through environment variables, prefixed
## with "LLDAP_". For instance, "ldap_port" can be overridden with the
## "LLDAP_LDAP_PORT" variable.

## The port on which to have the LDAP server.
ldap_port = 3890

## The port on which to have the HTTP server, for user login and
## administration.
http_port = 17170

## The public URL of the server, for password reset links.
http_url = "https://lldap.homelab.lan"

## Random secret for JWT signature.
## This secret should be random, and should be shared with application
## servers that need to consume the JWTs.
## Changing this secret will invalidate all user sessions and require
## them to re-login.
## You should probably set it through the LLDAP_JWT_SECRET environment
## variable from a secret ".env" file.
## You can generate it with (on linux):
## LC_ALL=C tr -dc 'A-Za-z0-9!"#%&'\''()*+,-./:;<=>?@[\]^_{|}~' </dev/urandom | head -c 32; echo ''
jwt_secret = "sEa5%:8auaXV!JbQ%|Uwh2hKtNYg)E:J"

## Base DN for LDAP.
## This is usually your domain name, and is used as a
## namespace for your users. The choice is arbitrary, but will be needed
## to configure the LDAP integration with other services.
## The sample value is for "example.com", but you can extend it with as
## many "dc" as you want, and you don't actually need to own the domain
## name.
ldap_base_dn = "DC=homelab,DC=lan"

## Admin username.
## For the LDAP interface, a value of "admin" here will create the LDAP
## user "cn=admin,ou=people,dc=example,dc=com" (with the base DN above).
## For the administration interface, this is the username.
ldap_user_dn = "admin"

## Admin password.
## Password for the admin account, both for the LDAP bind and for the
## administration interface. It is only used when initially creating
## the admin user.
## It should be minimum 8 characters long.
## You can set it with the LLDAP_LDAP_USER_PASS environment variable.
## Note: you can create another admin user for user administration, this
## is just the default one.
ldap_user_pass = "C3lPplboew17lmqN9fYdAwgJhUZivme0wst+VfaM"

## Database URL.
## This encodes the type of database (SQlite, Mysql and so
## on), the path, the user, password, and sometimes the mode (when
## relevant).
## Note: Currently, only SQlite is supported. SQlite should come with
## "?mode=rwc" to create the DB if not present.
## Example URLs:
##  - "postgres://postgres-user:password@postgres-server/my-database"
##  - "mysql://mysql-user:password@mysql-server/my-database"
##
## This can be overridden with the DATABASE_URL env variable.
database_url = "sqlite:///data/users.db?mode=rwc"

## Private key file.
## Contains the secret private key used to store the passwords safely.
## Note that even with a database dump and the private key, an attacker
## would still have to perform an (expensive) brute force attack to find
## each password.
## Randomly generated on first run if it doesn't exist.
key_file = "/data/private_key"

kaysond commented Nov 24, 2021

So it was sort of a permissions issue: originally I was getting a different error, that lldap couldn't write to /data/private_key, because the directory wasn't writable by uid 10001. So I did a touch private_key && chmod a+w private_key. Apparently the library (opaque?) doesn't like it if the private_key file exists but is empty!

@nitnelave (Member)

I see. And you didn't have issues with the users.db, users.db.shm and related files?

I'll try to add a more explicit (and fatal) error if we can't write to the server key.
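
A minimal sketch (plain std Rust, not the actual lldap change; the function name and message are illustrative) of the kind of explicit, fatal error being described:

use std::fs;
use std::path::Path;
use std::process::exit;

// Sketch only: fail loudly and immediately if the server key cannot be written,
// instead of letting a later, unrelated-looking error surface.
fn write_server_key_or_die(path: &Path, key: &[u8]) {
    if let Err(e) = fs::write(path, key) {
        eprintln!(
            "FATAL: could not write the generated server key to {}: {} (check that the directory is writable by the lldap user)",
            path.display(),
            e
        );
        exit(1);
    }
}

fn main() {
    // Example usage with a placeholder key; /data/private_key is the path
    // from the config quoted above.
    write_server_key_or_die(Path::new("/data/private_key"), &[0u8; 32]);
}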

@nitnelave nitnelave added the bug Something isn't working label Nov 24, 2021
@nitnelave nitnelave self-assigned this Nov 24, 2021

kaysond commented Nov 24, 2021

I see. And you didn't have issues with the users.db, users.db.shm and related files?

No. The order of events was:

  1. Run container without write permissions on /data
  2. Get "Error: Could not write the generated server setup to file /data/private_key"
  3. Run touch private_key && chmod a+w private_key
  4. Get "Error: This error results from an error during password verification"
  5. Remove private_key, and do chown 10001 /data

Then everything starts up just fine because it can create files in /data, and there's no empty private_key.


kaysond commented Nov 24, 2021

So now I'm trying to run it as a different user (via docker-compose user:) so I can use the existing permissions setup of my storage, and I get the following:

Loading configuration from /data/lldap_config.toml
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Could not write the generated server setup to file `server_key`

Caused by:
    Permission denied (os error 13)', server/src/infra/configuration.rs:218:49
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

It's interesting because it says it's loading the configuration, and it has permission to read that file, but it's using what appears to be the default value for key_file.

Edit: it looks like the issue here is that get_server_setup automatically writes to the location given by key_file if the path doesn't exist, and since the first call to ConfigurationBuilder is used to build the defaults, it's trying to write to /app/server_key, which my user doesn't have permission for:
https://github.com/nitnelave/lldap/blob/ba7848d043da20041c14312bd4d0ea24107b8623/server/src/infra/configuration.rs#L217-L222
https://github.com/nitnelave/lldap/blob/ba7848d043da20041c14312bd4d0ea24107b8623/server/src/infra/configuration.rs#L80-L93
https://github.com/nitnelave/lldap/blob/ba7848d043da20041c14312bd4d0ea24107b8623/server/src/infra/configuration.rs#L105-L121


kaysond commented Nov 24, 2021

This also explains the original error I was seeing: the problem is that get_server_setup sees that the private_key file exists, so it loads it, but then provides a totally empty config to the rest of the program. I think there probably needs to be some validation to ensure there are no problems with the file contents before continuing.
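
A rough sketch of the kind of validation being suggested; the helper name is made up and this is not the actual lldap code, but it shows the idea of rejecting an existing-but-empty key file with a clear message instead of letting it surface later as a password-verification error:

use std::fs;
use std::path::Path;

// Sketch: load an existing key file only if its contents look usable;
// otherwise return a descriptive error instead of passing bad bytes along.
fn load_existing_server_key(path: &Path) -> Result<Vec<u8>, String> {
    let bytes = fs::read(path)
        .map_err(|e| format!("Could not read server key file {}: {}", path.display(), e))?;
    if bytes.is_empty() {
        return Err(format!(
            "Server key file {} exists but is empty; delete it so a new key can be generated",
            path.display()
        ));
    }
    // In the real code this is where deserialization into the OPAQUE server
    // setup would happen, and a deserialization failure should also be fatal.
    Ok(bytes)
}

fn main() {
    match load_existing_server_key(Path::new("/data/private_key")) {
        Ok(_) => println!("server key loaded"),
        Err(e) => eprintln!("Error: {}", e),
    }
}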


JaneJeon commented Nov 24, 2021

I, too, am stuck on the

Get "Error: Could not write the generated server setup to file /data/private_key"

thing (I tried not just touching the file, but also creating a valid private key and mounting it directly, in which case it complains that it can't read it). I'm pretty sure there's some deep underlying permissions issue at play here.

Tried like 50000 different things (including different directories and permissions) :/

@JaneJeon

And given how severe this issue is (it's preventing me from even spinning up the container in the first place), and given that nobody apparently complained about this before us, I'm guessing it's a somewhat recent regression. If that's the case, what would be the last "safe" image?


kaysond commented Nov 24, 2021

@JaneJeon what you are describing seems like a configuration/permissions issue, not a container issue. Once I set proper write permissions on the data directory (e.g. chown -R 10001 /data && chmod -R u+w /data), the container started up with no issues.

I do think, though, that the way the container is set up is prone to confuse users because permissions handling is fairly opaque. It would probably be better overall to make it compatible with docker-compose user: or allow for PUID and PGID env vars to set the process user/group.


JaneJeon commented Nov 25, 2021

As I mentioned above, I have run the commands you mentioned and it still doesn't work. I ran them inside the container, and I thought maybe the issue was that I'm mounting the data directory from the outside (i.e. mounting a host folder onto /data in the container), but even after running the commands on the host folder I'm mounting, it still spits out the same error.

Either way, I'm ready to throw in the towel. I host like 7 different docker containers that I mount a host directory onto, and lldap seems to be the only one absolutely screeching at it.

@JaneJeon

btw @nitnelave I'm pretty sure I found where the actual error message came from: https://docs.rs/opaque-ke/0.5.0/opaque_ke/errors/enum.ProtocolError.html#variant.VerificationError

Dunno if it helps, but


kaysond commented Nov 25, 2021

btw @nitnelave I'm pretty sure I found where the actual error message came from: https://docs.rs/opaque-ke/0.5.0/opaque_ke/errors/enum.ProtocolError.html#variant.VerificationError

Dunno if it helps, but

Is that the error message you're seeing? Your comment mentioned the other error. If you're getting the password verification error, it's probably the same thing I saw: whatever you've put in the private_key file is invalid.

I also mount a host directory (/var/lib/lldap/data) to /data in the container. If I empty the directory (rm /var/lib/lldap/data/*) then give the default user perms (chown 10001 + chmod /var/lib/lldap/data), it works.


JaneJeon commented Nov 25, 2021

Okay, just checked again. I had generated a valid PEM key, but it looks like having any key in there triggers the error you mentioned above.

Deleting it seems to work, but it's still... idk, quite a dirty solution imho.

@nitnelave (Member)

Yeah, I'm aware that the permissions stuff is not super well handled right now. I think I'll build on #89 and introduce better user support, including the docker-compose user, UID and so on. I'll have to look up how it's done elsewhere, unless someone wants to take it up.


kaysond commented Nov 25, 2021

I'm happy to work on that next after we close #89!

@nitnelave (Member)

That would be of great help, thanks!


kaysond commented Nov 26, 2021

So what linuxserver.io does in their containers is create a user called abc, and in their base image init scripts, run usermod -o -u "$PUID" abc (and similar for groups). They then do exec s6-setuidgid abc:abc in the container to drop privileges for the process.

You could do something similar here, though you don't have busybox setuidgid in Alpine, or s6 (you could add them to the container, though). I think you should be able to just do it with su, but I haven't tried that yet.

But if we fix the initial server_key write bug, you could just use Docker's built-in user parameter. What I'd propose is to change the default for key_file to be /data/server_key, and database_url to sqlite:///data/users.db?mode=rwc (which would match the config template anyway). Then in the container we make a /data directory with 0777 permissions.

This way, the initial config can get written there no problem if you don't set the user, or if you set key_file to another location. If you mount a host directory, then the container will see the permissions of the host directory.

I did a quick test and it seems to work just fine, and by passing --user I can set the process uid to whatever I want.

@nitnelave (Member)

Rather than change the defaults in the source code, how about this: the start script checks if /data/lldap_config.toml exists, otherwise it creates it (copying from the template). That way we can make sure that the folder is writeable even before we start lldap.
WDYT?


kaysond commented Nov 26, 2021

Rather than change the defaults in the source code, how about this: the start script checks if /data/lldap_config.toml exists, otherwise it creates it (copying from the template). That way we can make sure that the folder is writeable even before we start lldap.
WDYT?

I think that's a good idea in general, but it doesn't quite solve the permissions problem. The problem is that you're running the lldap binary from inside /app. So when configuration.rs tries to write the initial key file to server_key, it actually writes it to /app/server_key, which is owned by the app user (uid 10001). If you set --user, then it can't write there and it crashes. It always tries to write there on container startup, when it loads the defaults.

Maybe the "right" thing to do here is actually fix the "bug" - we don't actually want get_server_setup to write the server config to file until after the defaults are loaded, the toml is merged, and the env vars are merged

https://github.com/nitnelave/lldap/blob/ba7848d043da20041c14312bd4d0ea24107b8623/server/src/infra/configuration.rs#L115

@nitnelave (Member)

But... it does wait until the config is loaded. It's just that in the cases mentioned above, there was no (readable) config to load, so it reverted to the defaults. I think it's better to make sure there's a config, and if it's copied from the template it'll have the correct location for the server_key file.


kaysond commented Nov 26, 2021

I'm new to Rust, so it's very possible I'm reading the code wrong, but this is my understanding of configuration.rs:

init gets called
https://github.com/nitnelave/lldap/blob/5b5395103ae56ebbea841be76c000b8b243895dc/server/src/infra/configuration.rs#L206

This creates the configuration, which starts with a call to ConfigurationBuilder::default().build().unwrap() on L128, and afterwards merges in the config file and env vars:

https://github.com/nitnelave/lldap/blob/5b5395103ae56ebbea841be76c000b8b243895dc/server/src/infra/configuration.rs#L217-L222

The initial ConfigurationBuilder::build() calls get_server_setup(self.key_file.as_deref().unwrap_or("server_key"))?; (L82), but at this point self.key_file has not yet been set from the config file, so it's using "server_key" from either the unwrap_or or from the macro (again, Rust noob, so not sure which, but they're the same value!)

https://github.com/nitnelave/lldap/blob/5b5395103ae56ebbea841be76c000b8b243895dc/server/src/infra/configuration.rs#L80-L84

Macro:

https://github.com/nitnelave/lldap/blob/5b5395103ae56ebbea841be76c000b8b243895dc/server/src/infra/configuration.rs#L63-L64

So the result is that get_server_setup("server_key") is called, and since /app/server_key doesn't exist, it tries to write there. If the process uid is not 10001, this will always fail.

https://github.com/nitnelave/lldap/blob/5b5395103ae56ebbea841be76c000b8b243895dc/server/src/infra/configuration.rs#L105-L121

After the defaults are loaded as described above, the user-supplied values get merged in, and then there's another call to get_server_setup, but this time it's with the correctly loaded value of key_file (assuming you can even get here).

https://github.com/nitnelave/lldap/blob/5b5395103ae56ebbea841be76c000b8b243895dc/server/src/infra/configuration.rs#L228


This lines up with the behavior I observed: the config file was properly mounted and readable just fine, but if I changed --user, it would fail trying to write to server_key, even though in the config file I set key_file to /data/private_key.

Please let me know if my rust understanding is off :D
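
For illustration, here is a heavily simplified, self-contained sketch of the call order described above. The types and the body of get_server_setup are stand-ins, not the actual lldap code:

use std::fs;
use std::path::Path;

// Stand-in for the OPAQUE server setup that lldap stores in the key file.
struct ServerSetup(Vec<u8>);

struct Configuration {
    server_setup: ServerSetup,
}

// Reads the key file if it exists; otherwise generates one and writes it out.
// With the default relative path, the write lands next to the binary
// (/app/server_key in the container).
fn get_server_setup(path: &str) -> Result<ServerSetup, String> {
    let path = Path::new(path);
    if path.exists() {
        Ok(ServerSetup(fs::read(path).map_err(|e| e.to_string())?))
    } else {
        let generated = vec![0u8; 32]; // placeholder for a freshly generated key
        fs::write(path, &generated).map_err(|e| {
            format!("Could not write the generated server setup to file {:?}: {}", path, e)
        })?;
        Ok(ServerSetup(generated))
    }
}

#[derive(Default)]
struct ConfigurationBuilder {
    key_file: Option<String>,
}

impl ConfigurationBuilder {
    fn build(&self) -> Result<Configuration, String> {
        // This runs before the TOML file and LLDAP_* env vars are merged in,
        // so key_file is still None, the relative default "server_key" is
        // used, and the unwanted write is triggered.
        let server_setup = get_server_setup(self.key_file.as_deref().unwrap_or("server_key"))?;
        Ok(Configuration { server_setup })
    }
}

fn main() {
    // Step 1: build the defaults. With --user set and the working directory
    // not writable, this is the call that panics in practice.
    let _defaults = ConfigurationBuilder::default().build().unwrap();
    // Step 2 (not shown): merge lldap_config.toml and env vars, then call
    // get_server_setup again with the configured key_file path.
}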

@nitnelave (Member)

Don't sell yourself short, you're one up on me on this one :D

get_server_setup should not be called from the build, indeed. Right now, it tries to read/write 2 server keys, one next to the binary and one in the correct place.

I'll fix that when I get the chance.
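
A sketch of the general direction being described (assumed helper names; the actual fix landed in #94): resolve key_file only after the defaults, the TOML, and the env vars have been merged, and only then read or create the key file once, at the configured location.

use std::fs;
use std::path::Path;

struct ServerSetup(Vec<u8>);

// Read the key if present, otherwise generate and write it (same behavior as before).
fn get_server_setup(path: &Path) -> Result<ServerSetup, String> {
    if path.exists() {
        Ok(ServerSetup(fs::read(path).map_err(|e| e.to_string())?))
    } else {
        let key = vec![0u8; 32]; // placeholder for a generated key
        fs::write(path, &key).map_err(|e| e.to_string())?;
        Ok(ServerSetup(key))
    }
}

// Called once, with the key_file value taken from the fully merged configuration,
// so nothing is written next to the binary during the defaults build.
fn load_server_setup(merged_key_file: Option<&str>) -> Result<ServerSetup, String> {
    let path = merged_key_file.unwrap_or("server_key");
    get_server_setup(Path::new(path))
}

fn main() {
    // e.g. key_file = "/data/private_key" from the merged configuration above
    match load_server_setup(Some("/data/private_key")) {
        Ok(_) => println!("server key ready"),
        Err(e) => eprintln!("Error: {}", e),
    }
}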


kaysond commented Nov 26, 2021

get_server_setup should not be called from the build, indeed. Right now, it tries to read/write 2 server keys, one next to the binary and one in the correct place.

I'll fix that when I get the chance.

Awesome! I bet that, along with maybe some more explicit info on permissions in the docs, should solve most of the issues.

@nitnelave (Member)

Just pushed a fix, feel free to try it: you can add user: 1000:1000 (or whatever your uid/gid is) in the docker compose.


kaysond commented Nov 29, 2021

This is fixed on my end. Thanks!
