Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: package should be able to ask for less sandboxing #4674

Closed
lefessan opened this issue May 20, 2021 · 9 comments
Closed

Feature request: package should be able to ask for less sandboxing #4674

lefessan opened this issue May 20, 2021 · 9 comments

Comments

@lefessan
Copy link
Contributor

Some packages, like pgocaml and other packages depending on pgocaml, require to have access to external services. In the case of pgocaml and its children, it needs to be able to connect to the local PostgresQL database (either unix socket or localhost in read/write mode) to check the correctness ocaml types with regards to db schemas. Since sandboxing prevents such accesses, using opam to distribute executables depending on pgocaml is difficult, you must explain to the user how he should disable sandboxing and re-enable it afterwards, and to do that everytime the switch is updated. It would be much better if such packages could ask for permission to disable sandboxing, and opam would prompt the user to ask if he is ok with it or not for the given packages.

It comes with multiple benefits:

  • sandboxing would be disabled on a per-package basis, i.e. sandboxing would still be working for other packages installed in the same batch (which is not the case with the other approach)
  • no complex configuration: the user can just reply Y when prompted, or use a specific option, like --disable-sandboxing-for=pgocaml, to make it automatically
  • tooling can detect this flag in the opam package and warn opam repository maintainers so that they can verify that there is a good reason for it

Of course, given that sandboxing is currently performed through hooks, an early solution would be to disable hooks instead of disabling sandboxing. So a package would just add:

disable-hooks: true

(it would also disable opam-bin, but that's fair, because opam-bin is not supposed to work on packages performing side-effects on the system).

@AltGr
Copy link
Member

AltGr commented May 20, 2021

I am pretty concerned about fast-tracks to disable security features in general; especially if they can be specified at the same place where the security issue might come from.
This said, I agree that this would be much better than the current "ask the user to disable sandboxing" solution, which is 😬.

With this in mind, in lack of a better solution, I would be in favour of a command-line option --disable-sandboxing-for rather than a package flag, although the fact that it would be needed at every build, and for every dependent package, seems quite annoying still. As you pointed out already, it would also be a bit tricky to implement due to the way sandboxing is implemented through generic hooks.

Now side-stepping the feature request per se, and focusing on the pgocaml package itself: I get the feeling that the issue stems from the fact that the package breaks one of opam's base assumptions: that everything should be contained within the switch prefix. You mention "packages performing side-effects on the system" which in theory shouldn't be allowed to exist, do you have examples in mind ?

However, I am not sure that this is actually the case for pgocaml: please correct me if I am wrong, but doesn't it just dialog with a pgsql server just to get its configuration info, without actual side-effects ? (Otherwise wouldn't that mean that you are required to run the binaries on the machine where you compiled them ?)

I will look at pgocaml, with some hope that we can find a way to make it work without r/w access to the system, therefore making it fit inside the sandbox.

Another track could be to check the minimal set of permissions pgocaml needs within the sandbox, and discussing if that could be acceptable in general.

@dra27
Copy link
Member

dra27 commented May 20, 2021

pgocaml infers the types of SQL expressions by connecting to a server which a database with the required schema and preparing those statements against that database (preparing types the statement, but doesn't execute it). It does also have mechanisms for making changes while doing that (for example, if you create temporary tables then you need to do that at compile-time in order to infer the types for SELECT on them). So it does potentially violate the sandbox.

Two other points, though: the opam sandbox scripts don't at present restrict network access, so this can be solved (as you would likely be doing on Windows anyway) by using a TCP port on loopback to communicate with pgsql. It should also be possible to configure a Unix socket in the switch.

@avsm
Copy link
Member

avsm commented May 20, 2021

I am strongly against any mechanism which would allow something within the sandbox to turn off the protection afforded by the sandbox. This would be an obvious way for any chained exploit to trivially bypass our protections.

opam does allow filesystem read access and network access as @dra27 notes -- either of those should be sufficient to solve the pgocaml usecase described in the original report.

@lefessan
Copy link
Contributor Author

I am strongly against any mechanism which would allow something within the sandbox to turn off the protection afforded by the sandbox. This would be an obvious way for any chained exploit to trivially bypass our protections.

The mechanism is not _within the sandbox, it is a declaration in the opam file, so that opam can ask the user before starting any installation of the proposed solution. That's exactly what installers do on mobiles, they ask the user whether he allows the application to bypass the sandbox to access other resources. I would assume that Google and Apple have very good security experts who didn't see it as a security breach.

opam does allow filesystem read access and network access as @dra27 notes -- either of those should be sufficient to solve the pgocaml usecase described in the original report.

Are you implying that, if a package can silently read my $HOME/.ssh/id_rsa and share it with other users on my computer through the localhost network (maybe even on other computers ?), it's not considered as a potential security risk, but that a package that explicitely requests permission to disable some sandboxing creates a bigger risk ?

@dra27
Copy link
Member

dra27 commented May 20, 2021

It is absolutely is within the sandbox - the package is what you're about to run within the sandbox and so reading anything from the package which tells/advises you to turn off the sandbox is within the sandbox.

The .ssh argument is a strawman. The point is that you can configure either the system (by using TCP) or the switch (by providing a unix domain socket within the switch itself, which can be writable). In the specific case of pgocaml, there're already a lot of assumptions being made about the environment in which the package is installed anyway; having to configure the system further doesn't feel like opam's domain. Even more specifically for pgocaml, I'd suggest distributing pre-processed files for a package (a la autoconf, etc.)

The sandbox is intentionally course-grained and difficult to disable. If you start prompting the user, what then happens for unattended scripts? A non-interactive prompt now automatically answers yes to sandbox disabling? 😱 Let's not forget that the reason the sandbox was added is literally because a package author made a mistake, and quite a few people lost files...

@lefessan
Copy link
Contributor Author

The .ssh argument is a strawman. The point is that you can configure either the system (by using TCP) or the switch (by providing a unix domain socket within the switch itself, which can be writable). In the specific case of pgocaml, there're already a lot of assumptions being made about the environment in which the package is installed anyway; having to configure the system further doesn't feel like opam's domain. Even more specifically for pgocaml, I'd suggest distributing pre-processed files for a package (a la autoconf, etc.)

The whole point of opam is to make it easy for developers to share their work, and for newbies or external users to be able to use packages easily. All what you are saying goes against those principles.

I am not saying my solution is the best or the only one, but asking the user to change the system or the switch is clearly not the good way to go. Neither is putting more burden on the developer, to understand how to turn automatically generated files by dune and ppxs into source files...

Moreover, the proposed feature is generic, it would be useful for pgocaml today, but it will clearly be also very useful for other packages in the future, that neither of us can think about.

The sandbox is intentionally course-grained and difficult to disable. If you start prompting the user, what then happens for unattended scripts? A non-interactive prompt now automatically answers yes to sandbox disabling?

Haha, who said disabling sandboxing would be the default ? There could be a lot of ways to make unattended scripts work perfectly, either by configuring opam or even the switch with a list of packages that are allowed to disable sandboxing... or not.

Let's not forget that the reason the sandbox was added is literally because a package author made a mistake, and quite a few people lost files...

Currently, we have to ask our users to completely disable sandboxing in opam configuration, for all packages, to be able to install these packages. So, they are actually taking a higher risk that the one they would take by accepting to disable it only for one or two packages.

@dra27
Copy link
Member

dra27 commented May 21, 2021

Neither is putting more burden on the developer, to understand how to turn automatically generated files by dune and ppxs into source files...

I apologise if I'm misunderstanding this, but rather than developers understand their build system and alter the package to work in harmony with the sandbox, opam should add a feature which allows the package just to switch off the sandbox?

Moreover, the proposed feature is generic, it would be useful for pgocaml today, but it will clearly be also very useful for other packages in the future, that neither of us can think about.

A feature being generic does not say anything about its merit. I'm not convinced (also being a PGO'Caml user) that this is useful for using pgocaml today. I'm curious to know how these packages are using postgres without any configuration requirement at the moment apart from disabling the sandbox - is it possible to see them?

@lefessan
Copy link
Contributor Author

I apologise if I'm misunderstanding this, but rather than developers understand their build system and alter the package to work in harmony with the sandbox, opam should add a feature which allows the package just to switch off the sandbox?

Yes. I have always been on the side of making the gap as small as possible for developers to adopt OCaml. And opam is supposed to be one of the tools designed to make this gap smaller. Expecting all developers to understand build systems, compilers, tooling, etc. is what makes OCaml remain in the small circle of PhD holders and friends.

We are not discussing about switching off the sandbox for all packages. Just to provide a simple and safer way to do it when it is needed.

A feature being generic does not say anything about its merit. I'm not convinced (also being a PGO'Caml user) that this is useful for using pgocaml today.

Do you have an opam package for your own work on PGOCaml ? Maybe it will help me solve my problem if this feature is not added...

I'm curious to know how these packages are using postgres without any configuration requirement at the moment apart from disabling the sandbox - is it possible to see them?

Yes, of course:
https://github.com/OCamlPro/ocp-opam-repository/tree/master/packages

If you try to build ft-crawler, it will build all the freeton_crawler_ packages (_versions provides the DB schemas, _updater creates the database, _lib calls the updater to create the database and builds the code with it).

@dra27
Copy link
Member

dra27 commented Jul 2, 2021

The original issue's fixed, so closing this one for now.

@dra27 dra27 closed this as completed Jul 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants