Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safety of Private/Isolated FFI Dependencies #1850

Closed
damncabbage opened this issue Jan 25, 2016 · 15 comments
Closed

Safety of Private/Isolated FFI Dependencies #1850

damncabbage opened this issue Jan 25, 2016 · 15 comments
Milestone

Comments

@damncabbage
Copy link
Contributor

[This is a side-discussion springing out of https://github.com//issues/631]

Regardless what approach we take with a package manager, we'll need to decide how to treat FFI library dependencies. In this case, I've chosen to focus on JS (as the current backend), but I believe this applies equally to any C++/Lua/C FFI that may happen.

As I see it, we have two options for how we treat FFI dependencies when we resolve/install PureScript packages:

  • Globally, like PureScript packages themselves, or
  • Isolated / privately, in their own bubbles, not interfering with each other.

An example:

  • There is an npm library, quux.
  • purescript-foo is a package that depends on v1 of quux.
  • purescript-bar is a package that depends on v2 of quux.

Global resolution says that both purescript-foo and purescript-bar can't be installed at the same time, as there can only be one quux installed at any one time, and …-foo and …-bar would need both quux v1 AND quux v2, so it bails.

"Private" / Isolated resolution says that purescript-foo and purescript-bar can be installed at the same time. There is a separate quux package installed under …-foo and …-bar, because those PureScript package's JS FFI dependencies are resolved separately from each other.

So.

What I'm trying to determine is:

  1. Is this unsafe?
  2. As a distant second: does this cause other problems?

I'm after specific examples if at all possible. It can be as simple as a two-liner "Foo exports a type from Quux v1, Bar exports a type from Quux v2; therefore...", but I'm trying to nail down concrete cases, as I'm struggling to think of any that are unsafe specifically.

@damncabbage
Copy link
Contributor Author

I can think of a specific case where this would be annoying, but not unsafe:

  • Foo exposes a foreign type Thing representing an object from the quux JS library.
  • Bar exposes a foreign type Thing from a different version of the quux library.
  • A consumer of both packages can't take the result from a Int -> Foo.Thing function and give it to a Bar.Thing -> String function, as the foreign types won't unify.

This would also happen with two PureScript libraries exposing the same library, though, so I don't see this as a problem with isolated dependencies as unifying two types defined in different libraries.

@garyb
Copy link
Member

garyb commented Jan 25, 2016

I think the way private dependencies are intended to work is they are only allowed when their entire interface is used internally by another package: it should be impossible to tell from the outer module that the inner/private module is used at all. That means no types or classes defined in the inner module can appear in the exports for the outer module, and no members from the inner module can be re-exported by the outer module.

The reason it's unsafe is that the Thing export in your examples above would resolve internally to Quux.Thing from both modules, re-exports do this so we don't have constant conflicts with duplicate imports. Of course we could add more information to the qualification internally, so we could see that it's Quux.Thing [quux-v1.0.0] and Quux.Thing [quux-v2.0.0], but that's not how it works right now, and also it's not really "private" if types leak out like that. I think the proper thing would be to use a newtype for Thing in both Foo and Bar if they wanted to expose the type while maintaining quux as a private dependency.

@garyb garyb added this to the Discussion milestone Jan 25, 2016
@damncabbage
Copy link
Contributor Author

The reason it's unsafe is that the Thing export in your examples above would resolve internally to Quux.Thing from both modules, …

I'm not sure I've made myself clear on this, sorry. There is no Quux PureScript module, only the quux library; I'm talking about a file-structure that looks like this:

|-- purescript-foo/
|   |-- package.json
|   |-- node_modules/
|   |   '-- quux/...
|   '-- src/
|       |-- Foo.purs
|       '-- Foo.js
'-- purescript-bar/
    |-- package.json
    |-- node_modules/
    |   '-- quux/...
    '-- src/
        |-- Bar.purs
        '-- Bar.js

With Foo.purs (for example), looking like:

module Foo where
...
foreign import data Thing :: *
foreign import thing :: forall eff. Eff (console :: CONSOLE | eff) Thing

And Foo.js like:

/* global exports */
"use strict";
// module Foo
exports.thing = function() { return require('quux'); };

(Please excuse me if I've misunderstood your reply.)

Edit: Corrected the Foo example; was missing the function to create a Thing.

@garyb
Copy link
Member

garyb commented Jan 25, 2016

Ahh, sorry, I see what you mean now!

I think your second comment here covers the issue pretty well, since the types would be seen as different they would appear to be incompatible anyway. There is still some trickiness to be solved in regards to Foo and Bar potentially having required different versions of quux, but I guess the npm model of having a private node_modules for each would work for that, as you outlined above.

This is almost separate from the private dependencies issue, as "FFI dependencies" should probably always be local to the package that requires them, in an ideal world.

I can still think of cases where there are issues here, say we wrapped some browser routing library (or in fact any library that inserts itself into the window namespace) then things are probably not going to work too well, but I'm not sure there's any way we can prevent that in the compiler, as we wouldn't have the information to make that determination... unless that was covered in the library manifest somehow, but I don't know.

@damncabbage
Copy link
Contributor Author

This is almost separate from the private dependencies issue, as "FFI dependencies" should probably always be local to the package that requires them, in an ideal world.

Got it. The point of this ticket is to see if we can get consensus on whether this approach is a good idea. There's no point putting a bunch of effort into making this happen (in stop-gap form or otherwise) if it's not anything that we actually want to do.

I can still think of cases where there are issues here, say we wrapped some browser routing library (or in fact any library that inserts itself into the window namespace) …

Yeah. You can do some crazy things. Back to JS-library-quirks, this is what I think could be arguments against this private-dependency thing:

  • A library uses window.someGlobal within itself; with two versions of the library being run, they may not use that global in the same way.
  • A library relies on require(…)-caching to determine whether this is its "first load"; two versions means two "loading" instances.

If we want private dependencies, we need to figure out where we want to draw a line, if we want to draw a line.

@hdgarrood
Copy link
Contributor

Another potential issue I can imagine seeing is if some JS module at some particular version creates a data type, which is given the Foreign type in PureScript, and then that value is passed into a different version of the same module, which then throws an error because the structure was different to what it was expecting. Of course, opting to use the FFI means you lose lots of guarantees and have to be much more careful anyway, so maybe we can say this ought to be the user's responsibility at this point? It seems unlikely enough that I'm tempted to wait and see if it actually becomes a problem.

The other thing that occurred to me was, I realise the JS world is now somewhat accustomed to nested dependencies, but outside that world, it's pretty rare, right? Other languages usually have something closer to the Bower model, I think? So it might not necessarily be feasible to isolate FFI dependencies in this way when people are using languages other than JS? Perhaps the answer to this question is just "wait and see what happens" right now.

@garyb
Copy link
Member

garyb commented Jan 25, 2016

Hmm, yep, Foreign could indeed summon all kinds of issues... it's potentially a problem now even, if someone makes a prototype with the name Document foreign coercions in purescript-dom will accept it, since we just use runtime tag reading.

Sometimes that's a benefit, as instanceof checks don't work when you have different sandboxes in the iframe/web worker kind of cases, but it also opens the door to all kinds of nonsense. 😕

@bodil
Copy link

bodil commented Jan 25, 2016

Here's why private dependencies are a bad idea in JS (disregarding PS entirely) - I'll use the real world example that finally sold me on this:

I've got a parser combinator library A, whose parsers return result objects which are instances of either ParseResult or ParseError. To figure out which you got, you do a simple instanceof check.

I've got two libraries of parsers (B and C) built using A, which I use in application D to do some actual parsing. Both B and C depend on A. In D, I combine parsers from B and C to create my final top level parser.

I developed this using npm 3, and everything was fine. Then I started getting bug reports where my application would give weird parse errors for no apparent reason, and I couldn't understand why. A week later, I finally realised that these only happened if you installed application D using npm 2, where no attempt was made to flatten the dependency tree, thus ending up with two distinct copies of A: the one included by B, and the one included by C. They were the same version, so there were no API differences, but the instanceof checks were failing, because ParseError from A-in-B and ParseError from A-in-C were not derived from the same prototype. The only way to get around this issue safely was to not use instanceof at all, and the reason why was extremely non-obvious.

This is why, in my ideal JS package manager, every module must be promoted to the top level and dependencies with non-overlapping version ranges would cause a fatal error.

This is, on the other hand, very much not how npm works, and it's completely possible that there are many popular npm packages which can't even be installed if such constraints exist, because they are likely to have several subdependencies with conflicting version constraints.

Which is to say that while I'm aware of this very real problem, I'm honestly not sure which approach I'd advocate.

@robotlolita
Copy link

@bodil those libraries are already broken at that point (conflicting dependencies) if the objects from one side ever cross to another side, anyway.

The conservative approach (abort on version conflicts, potentially allowing the user to choose one of the versions) is the safe approach, and you're guaranteed that if the dependencies can be resolved at all your program will work (which is a very important thing to me).

The problem is potentially worse in languages whose modules are not first-class/or where you can't provide a path to search for things to link against your application. It would be a lot of work to get it working on the JVM, for example.

As for PureScript libraries themselves, PS uses nominal typing, so you'll get incompatible types for two different copies of purescript-foo, even if they're the same library. And a kind of non-nominal equality for types would be unsafe if type constraints aren't very detailed, and I think that'd break common Haskell/PS idioms with type classes too.

The PS part of the problem might be solved with separating implementations from interfaces, though, granted that only one interface version conflicts cause dependency resolution to fail. A system like Haskell's backpack would make that possible. This does complicate the process of writing new libraries, though (you'd need to publish an interface AND an implementation), and I'm not sure how it would work with decentralised dependencies.

@damncabbage
Copy link
Contributor Author

@hdgarrood / @garyb:
Good point about Foreign; I'd managed to complete miss that. And anyone else could write something similar in their SomeModule.js exports, albeit with less community support.

Again, I'm not sure where we draw the line on this, but nested dependencies sure makes it harder to guess what version of something you're accepting as Foreign and converting to your own local representation. It feels like a bad bet to be taking all 'round, though. :(

On this bit from @bodil's post:

This [flat-dependency method] is, on the other hand, very much not how npm works, and it's completely possible that there are many popular npm packages which can't even be installed if such constraints exist, because they are likely to have several subdependencies with conflicting version constraints.

To me personally, that's the deal-breaker for flat dependencies for JS FFI. We may be able to get fancy and have flat deps for C++ / Java / C FFI and nested deps for JS, but I think we need to stick with the dependency structure for the packages we're wrapping, otherwise we're going to hit that mismatch in how permissively these sub-dependencies can be specified (eg. allowing multiple versions per single JS dependency tree).

I think it may be down to what we're prioritising. My own personal preference is to prefer compatibility with JS (that is, being able to use most packages), with the unfortunate drawbacks of a larger footgun when passing things between two PS libraries wrapping the same JS library, or having local trees of packages resulting from an install that wouldn't have worked in JS or in PS. We as a group may differ on this; I'd love to see what the consensus on this ends up being.

(I've said my piece a couple of times now, so I'll stay out of this for a while to reduce discussion noise.)

@hdgarrood
Copy link
Contributor

I think we have a third option in addition to the two that have been suggested.

We currently have:

  1. @damncabbage's original proposal, isolated FFI dependencies
  2. @bodil's ideal JS package manager, where everything is installed at the top level

What I think could work, which is sort of in between these two, would be that our package manager would walk the dependency tree, collecting all JS dependencies after solving for PS ones, and install all these JS deps into one "universe" in the normal npm way, ie, by using npm install.

Assume we have successfully solved all the PureScript dependencies of some PureScript package. We would then collect details of all of those PS packages' JS dependencies, resulting in something like a Map PSPackageName (Map JSPackageName VersionRange). Then, we do this to it:

Map.elems >>> Map.unionsWith intersectVersionRange
  where
  intersectVersionRange :: VersionRange -> VersionRange -> VersionRange
  intersectVersionRange = [...]

So this gives us a Map JSPackageName VersionRange. If any of the resulting VersionRange values end up empty, then we have an incompatibility, and we abort. Otherwise, we pass the whole lot to npm install, and so everything goes into one node_modules directory at the project root.

I think this reduces the potential footgun danger considerably, while also remaining compatible with the way things are done in JS land. If an incompatibility between JS deps does arise, it will be, in some sense, amongst PureScript packages, rather than JS ones. This is a much weaker requirement than what @bodil (and I, and probably others too) would ideally like, namely: everything at the top level, but I think it's about as good as we can really get. It's also, essentially, an automated version of what everyone is doing now. We might also (warn unless|require that) npm is at least at version 3, which ought to further reduce this risk.

@robotlolita
Copy link

@hdgarrood this does sound like a good compromise. Should be similar to what Nix does with its dependencies, I think.

I'd still vote on warning the user about conflicts by default, but allow them to choose whether to resolve the conflict themselves, or install it npm's way, so they can be aware that the conflict might be the reason of any issues that arise later on.

@ttbodil
Copy link

ttbodil commented Jan 26, 2016

I'm tempted to go even further: all dependencies are installed flattened by default, but you can flag specific misbehaving dependencies as needing to be non-flat in package.json or wherever.

@paf31
Copy link
Contributor

paf31 commented Sep 24, 2016

I think flattening is a requirement here for the same reason as discussed in @hdgarrood's Bower/NPM blog post.

Relevant: #631

@hdgarrood
Copy link
Contributor

Given the lack of movement for 2+ years, and also the move to drop package management from the set of things the compiler concerns itself with, I think this can be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants