Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passing compiler Functions/Mixins across different compilations should not be allowed #2542

Open
ntkme opened this issue Mar 10, 2025 · 16 comments · May be fixed by #2544
Open

Passing compiler Functions/Mixins across different compilations should not be allowed #2542

ntkme opened this issue Mar 10, 2025 · 16 comments · May be fixed by #2544
Labels

Comments

@ntkme
Copy link
Contributor

ntkme commented Mar 10, 2025

See the following code as an illustration of the issue. On sass this "worked correctly", because callables from other function registries with in the same dart/js VM memory space can be directly accessed and called as functions/mixins are never serialized. However, on sass-embedded this is broken, as only the function id is being passed, which is not a real serialization of the function/mixin content, thus it will refer to a function in the current register even if we passed the function from a different registry.

const sass = require('sass-embedded');
// or
// const sass = require('sass');

const outer = `
  @use 'sass:meta';

  @function foo($n) {@return $n + 1}
  a {b: meta.call(bar(meta.get-function('foo')), 0); }
`

const inner = `
  @use 'sass:meta';

  @function foo($n) {@return $n + 2}
  a {b: meta.call(bar(meta.get-function('foo')), 0); }
`

sass.compileString(outer, {
  functions: {
    'bar($arg)': (args) => {
      const outerFn = args[0]
      console.log(sass.compileString(inner, {
        functions: {
          'bar($arg)': (args) => {
            const innerFn = args[0]
            console.log(outerFn.equals(innerFn))
            return outerFn
          }
        },
      }).css)
      return outerFn
    }
  }
})
@ntkme
Copy link
Contributor Author

ntkme commented Mar 10, 2025

This is currently "undefined behavior" that works in dart-sass, but probably it shouldn't even be allowed. There are a few things need to be done to address this:

  1. In dart-sass, if user defined callable returns a callable/mixins that has an environment that is not the current environment attached, throw an error.
  2. In embedded host protofier, attach the "function registry" (use it as "environment" tracker) as an attribute, and throw an error if user attempts to passing callables across compilation.
  3. In embedded host's compiler-function/mixin, the equals function need to compare the attached environment in additional to just id.

@nex3 nex3 added the bug label Mar 10, 2025
@nex3
Copy link
Contributor

nex3 commented Mar 10, 2025

The title refers to host functions, but outerFn in your example is a compiler function. This isn't an issue for host functions because they're protofied separately each time they're passed to the compiler.

I don't think we necessarily need to make this as complicated as attaching registries to function objects. All we really need is some tag (that can be completely opaque to the host) indicating to the compiler which compilation a given function belongs to. We could even reserve a few bits in the existing ID to use for this.

Either way we should definitely make it explicit in the JS API specification that passing a function object between compilations is an error.

@ntkme ntkme changed the title Passing "host" functions/mixins across different compilations has inconsistent behavior between sass and sass-embedded Passing compiler functions/mixins across different compilations has inconsistent behavior between sass and sass-embedded Mar 10, 2025
@ntkme ntkme changed the title Passing compiler functions/mixins across different compilations has inconsistent behavior between sass and sass-embedded Passing compiler functions/mixins across different compilations should not be allowed Mar 10, 2025
@ntkme
Copy link
Contributor Author

ntkme commented Mar 10, 2025

The title refers to host functions, but outerFn in your example is a compiler function.

Sorry, that was a typo.

All we really need is some tag (that can be completely opaque to the host)

Exactly. Anything that has 1-1 mapping with each compilation would work. In ruby implementation I just create a blank BasicObject for this purpose.

@nex3
Copy link
Contributor

nex3 commented Mar 10, 2025

What I mean is, it doesn't necessarily need to be the host's concern at all. The compiler itself should provide this guarantee.

@ntkme
Copy link
Contributor Author

ntkme commented Mar 10, 2025

I'm not sure if that can be done on compiler side reliably. In theory, the host side can retain a compiler function object in memory indefinitely, so that no matter what kind of bits we add, there is always a risk of collision at some point.

@ntkme ntkme changed the title Passing compiler functions/mixins across different compilations should not be allowed Passing compiler Functions/Mixins/ArgumentList across different compilations should not be allowed Mar 10, 2025
@ntkme ntkme changed the title Passing compiler Functions/Mixins/ArgumentList across different compilations should not be allowed Passing compiler Functions/Mixins across different compilations should not be allowed Mar 10, 2025
@ntkme
Copy link
Contributor Author

ntkme commented Mar 10, 2025

In additional, I found the compiler has special logic for argument list that may cause problem as well:

case Value_Value.argumentList:
if (value.argumentList.id != 0) {
return _argumentListForId(value.argumentList.id);
}

In theory, I should be able to save an ArgumentList from compilation A, and then pass it to compilation B. However, currently this will break. - Probably the host should check set the id to 0 before passing argument list to a different compilation.

@nex3
Copy link
Contributor

nex3 commented Mar 11, 2025

You're right that if we reserve (for example) eight bytes of the ID for a compilation ID, there will be a chance that you can get overlap once you have 256+ compilations running concurrently. But even then passing functions across compilation contexts will only work randomly and only very infrequently, so there's no real risk that it would hide errors from the user. Even then, we could still handle it on the compiler side by adding a separate CompilerFunction.compilation_id field—we're already effectively assuming that there won't be 4 billion concurrent compilations.

For ArgumentLists, I think the embedded protocol is fine as-is, although we could add a bit more explicit wording around how non-0 IDs should only be allocated by the compiler. The JS API should probably what happens to argument lists passed between compilations, though (the best option there is probably to just mark it as accessed).

@ntkme
Copy link
Contributor Author

ntkme commented Mar 11, 2025

It’s not just concurrent compilations. Even if I do them sequentially, I can save a “compiler function” to a local variable, and then return it in a different compilation much later. Also, in implementation like the ruby host, the next compilation id is reset to 1 every time the compiler becomes idle, it’s not safe even if it’s a separate field.

@ntkme
Copy link
Contributor Author

ntkme commented Mar 11, 2025

Note in the following example, both compilations have the same compilation id 0 with or without using persisted compiler due to implementation detail on the host side:

https://github.com/sass/embedded-host-node/blob/1ec3fbdf241267ce3c55eae4a4fb6d374ed370d9/lib/src/compiler/sync.ts#L136-L139

So relying on compilation id to determine if the variable comes from the same compilation is not going to work.

const sass = require('sass-embedded');
// or
// const sass = require('sass');

const first = `
  @use 'sass:meta';

  @function foo($n) {@return $n + 1}
  a {b: meta.call(bar(meta.get-function('foo')), 0); }
`

const second = `
  @use 'sass:meta';

  @function foo($n) {@return $n + 2}
  a {b: meta.call(bar(meta.get-function('foo')), 0); }
`

let fn
sass.compileString(first, {
  functions: {
    'bar($arg)': (args) => {
      fn = args[0]
      return fn
    }
  }
})

console.log(sass.compileString(second, {
  functions: {
    'bar($arg)': (args) => {
      const fn2 = args[0]
      console.log(fn.equals(fn2))
      return fn
    }
  },
}).css)

@ntkme
Copy link
Contributor Author

ntkme commented Mar 12, 2025

I've put up PRs to fix this:

Note that for best performance the approaches are a bit different in dart-sass and embedded-host-node. In embedded-host-node the check of whether function/mixin belongs to the current compilation is enforced eagerly in protofier. In dart-sass, the check is enforced lazily when function/mixin is invoked. This is to avoid having to recursively check JS function's return value every single time. It means the two implementation may throw the error on different span, but I think it's a fair compromise to avoid a larger overhead in dart-sass.

@nex3
Copy link
Contributor

nex3 commented Mar 12, 2025

The ID used to disambiguate compilations for first-class callables doesn't have to (and probably shouldn't) be the same as the compilation ID in the packet. As far as the host is concerned, it's fully opaque, so it should just be used for disambiguating.

Before we actually land any implementation, we're going to have to go through the language change process to update the embedded protocol and JS API spec.

@ntkme
Copy link
Contributor Author

ntkme commented Mar 12, 2025

This has nothing to do with compilation ID. The concern is that in order to make sure any opaque ID is not going to conflict with potentially retained opaque ID on the host side, this ID needs to be globally unique on compiler side AND host side for all compilations. For embedded compiler this is a huge challenge because it need to be globally unique across isolations. I just don't see compiler side opaque ID as a reliable solution.

In addition, I can even retain a reference of a compiler function from embedded compiler process 1, and then send the reference to a different compiler process 2, and any kind of ID can have collision across different dart vm process.

@ntkme
Copy link
Contributor Author

ntkme commented Mar 12, 2025

With a single dart VM (dart-sass) we just need a single opaque ID (in my PR it's just in the form of a opaque Object()), but with embedded-host I believe the only reliable way it to track the construction of compiler function reference with another opaque id on the host side itself. Trying to pass the compiler side opaque id to host and then pass it back won't work.

@nex3
Copy link
Contributor

nex3 commented Mar 13, 2025

I don't think we need the ID to be globally unique, I think we just need it to be unique enough to cause an error the vast majority of the time. Even just the output of the library random number generator would probably be sufficient, although if we wanted to get really deep we could allocate a certain number of bits for stuff like "the current time in nanoseconds" and "the current process ID". But this doesn't need to be cryptographically secure, it just needs to prevent users from checking in code that'll blow up in other contexts.

@ntkme
Copy link
Contributor Author

ntkme commented Mar 13, 2025

Having an opaque tag on the compiler side AND another opaque tag on the host side can avoid this 100% time, and I don’t see a good reason to not do that given the implementation is actually quite simple.

@nex3
Copy link
Contributor

nex3 commented Mar 17, 2025

All right, after talking it over with the team, we've decided to follow your suggestion and only make this part of the host. It'll still need to go through the JS API proposal process before implementing, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants