Expose files to wasm extensions #22557

Open
anuraaga opened this issue Aug 4, 2022 · 15 comments
Labels
enhancement Feature requests. Not bugs or questions. help wanted Needs help!

Comments

@anuraaga
Contributor

anuraaga commented Aug 4, 2022

Title: Allow files to be specified when enabling a wasm plugin which will be accessible via wasi.

Description:

It can be useful to load large-ish data files from a wasm plugin, for example complex WAF rules or neural network parameters used to classify requests. While WASI defines an interface for reading files, Envoy does not expose it to wasm plugins yet. Plugins should not automatically be able to access any file on the host; the configuration should decide which folders or files are exposed. The configuration could look like:

{
  "vm_id": ...,
  "runtime": ...,
  "code": {...},
  "configuration": {...},
  "allow_precompiled": ...,
  "nack_on_code_cache_miss": ...,
  "environment_variables": {...},
  "filesystem": [
    "/some/directory/",
    "/some/other/directory/file.dat"
  ]
}

[optional Relevant Links:]
This is similar to #14958 but for files instead of environment variables.
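
For illustration, here is a minimal sketch of what the plugin side could look like once a path is exposed. It assumes a Rust plugin built for a WASI target, where `std::fs` is backed by the WASI file calls, so the read only succeeds if the host has made the path available. The `load_rules` helper and the one-rule-per-line file format are hypothetical, not part of any existing SDK.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: loads one rule per line from a data file,
/// skipping blank lines and lines starting with `#`.
///
/// In a WASI build, `std::fs` goes through the WASI file interface,
/// so this fails unless the host exposes `path` to the plugin.
fn load_rules(path: &Path) -> io::Result<Vec<String>> {
    let text = fs::read_to_string(path)?;
    Ok(text
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect())
}
```

With the configuration above, a plugin would call something like `load_rules(Path::new("/some/other/directory/file.dat"))` during startup and treat an error as "file not exposed".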

@anuraaga anuraaga added enhancement Feature requests. Not bugs or questions. triage Issue requires triage labels Aug 4, 2022
@mathetake
Member

@PiotrSikora, this one is in real demand, as we've seen in the corresponding issues in cpp-host, the SDK, and the Envoy Slack. We (at Tetrate) would love to work on the implementation once the API is agreed upon. I know wasi-2 will come, but I imagine it will take a few more years to stabilize, considering that it comes with the component model in the Wasm spec. I believe the semantics of accessing files won't be different, and therefore the API in Envoy won't be either. Could you tell us your current thoughts on this?

@PiotrSikora
Contributor

PiotrSikora commented Aug 4, 2022

Indeed, this is needed. I have a hacky WIP, but I've dropped the ball on this one numerous times, so if I don't have it out in the next 2 days, then feel free to implement it yourself.

Regarding xDS API, I think it should be pretty similar to the one for environment variables, i.e.

message HostPath {
  // The path on the host filesystem.
  string host_path = 1;
  // The path visible in Wasm plugins, if different than the path on the host filesystem.
  string wasm_path = 2;
}

message WasmFilesystem {
  // Paths (files or directories) from the host filesystem, that will be available for reading
  // in Wasm plugins.
  repeated HostPath host_paths = 1;

  // Pre-populated file contents available for reading in Wasm plugins.
  // The key represents the file path visible in Wasm plugins.
  // The value contains the content of that file.
  map<string, string> file_contents = 2;
}

I can push this proto for review if it looks reasonable to you.
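
To make the intended semantics concrete, here is a rough sketch of how a host might resolve a path requested by a plugin against this proto. It is only an illustration under assumptions: the Rust types mirror the proposed messages, while `Resolved`, `resolve`, the precedence of `file_contents` over `host_paths`, and the rejection of `..` are hypothetical choices, not anything Envoy implements.

```rust
use std::collections::HashMap;
use std::path::{Component, Path, PathBuf};

/// Mirrors the proposed `HostPath` message (hypothetical Rust shape).
struct HostPath {
    host_path: String,
    /// Empty means "same as `host_path`".
    wasm_path: String,
}

/// Mirrors the proposed `WasmFilesystem` message.
struct WasmFilesystem {
    host_paths: Vec<HostPath>,
    file_contents: HashMap<String, String>,
}

/// What a plugin-visible path resolves to (hypothetical).
#[derive(Debug, PartialEq)]
enum Resolved {
    /// Contents supplied inline in the configuration.
    Inline(String),
    /// A path on the host filesystem to open read-only.
    Host(PathBuf),
}

/// Resolves a plugin-visible path; returns `None` for anything not exposed.
/// Note: this does not guard against symlinks that point outside a mapping.
fn resolve(fs: &WasmFilesystem, guest: &str) -> Option<Resolved> {
    let guest_path = Path::new(guest);
    // Reject relative paths and `..` so a plugin cannot escape a mapping.
    if !guest_path.has_root()
        || guest_path.components().any(|c| c == Component::ParentDir)
    {
        return None;
    }
    // Assumption: inline contents take precedence over host mappings.
    if let Some(contents) = fs.file_contents.get(guest) {
        return Some(Resolved::Inline(contents.clone()));
    }
    for hp in &fs.host_paths {
        let visible = if hp.wasm_path.is_empty() {
            &hp.host_path
        } else {
            &hp.wasm_path
        };
        // `strip_prefix` matches whole components: "/data" does not match "/database".
        if let Ok(rest) = guest_path.strip_prefix(visible) {
            let host = PathBuf::from(&hp.host_path);
            return Some(Resolved::Host(if rest.as_os_str().is_empty() {
                host
            } else {
                host.join(rest)
            }));
        }
    }
    None
}
```

The default-deny return value reflects the requirement from the issue description that plugins see only what the configuration exposes.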

@anuraaga
Contributor Author

anuraaga commented Aug 4, 2022

Thanks @PiotrSikora - I'm wondering whether HostPath should have a writable flag. While the initial implementation would probably be read-only, when adding support for writes we would presumably still want to configure the filesystem with the same proto.

@PiotrSikora
Contributor

@anuraaga what exactly would be the use case for writeable filesystem in Proxy-Wasm?

@anuraaga
Contributor Author

anuraaga commented Aug 4, 2022

One is to write logs to a separate file. For example with WAF, from what I understand it's fairly standard to write audit logs separate from access or proxy debug logs.

@PiotrSikora
Contributor

I agree that's a valid use case, but I don't think that direct file write access is the best way to support it, especially considering multiple workers that might be writing concurrently.

For something like audit logs (i.e. strictly append only), we should definitely have a dedicated API. We could either add a new API or extend the existing logging API with either "source" or "destination", which could be associated with either a local file or remote log service in Envoy's configuration.
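
As a rough sketch of the "destination" idea: the plugin names a logical destination, and the host decides which sink (local file, remote service) backs it. Everything here is hypothetical, including `LogDestinations`, `register`, `append`, and the destination name `waf_audit`; it is not an existing Proxy-Wasm or Envoy API.

```rust
use std::collections::HashMap;
use std::io::{self, Write};

/// Hypothetical host-side registry mapping a destination name to an
/// append-only sink. `W` could be a file opened in append mode or a
/// buffer feeding a remote log service.
struct LogDestinations<W: Write> {
    sinks: HashMap<String, W>,
}

impl<W: Write> LogDestinations<W> {
    fn new() -> Self {
        Self { sinks: HashMap::new() }
    }

    /// Called by the host while loading configuration.
    fn register(&mut self, name: &str, sink: W) {
        self.sinks.insert(name.to_string(), sink);
    }

    /// Appends one record as a single line. The plugin only ever names
    /// a destination; it never sees or chooses a file path.
    fn append(&mut self, destination: &str, record: &str) -> io::Result<()> {
        let sink = self.sinks.get_mut(destination).ok_or_else(|| {
            io::Error::new(
                io::ErrorKind::NotFound,
                format!("unknown log destination: {destination}"),
            )
        })?;
        sink.write_all(format!("{record}\n").as_bytes())
    }

    /// Read-only access to a sink, mainly useful for inspection in tests.
    fn sink(&self, name: &str) -> Option<&W> {
        self.sinks.get(name)
    }
}
```

Because the host owns every sink, it can also decide how writes from multiple workers are serialized, which is the concern raised above about direct file writes.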

cc @mpwarres

@anuraaga
Contributor Author

Thanks @PiotrSikora - I filed #22669 to separate out the issue of log destinations.

Were you able to make progress on this issue or would it be good for us to help with it? Let us know what works best for you.

@mathetake
Member

ping @PiotrSikora

@PiotrSikora
Contributor

This is up for grabs, thanks!

@github-actions

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Sep 25, 2022
@github-actions

github-actions bot commented Oct 2, 2022

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

@github-actions github-actions bot closed this as completed Oct 2, 2022
@kyessenov kyessenov reopened this Oct 6, 2022
@kyessenov kyessenov added the help wanted Needs help! label Oct 6, 2022
@github-actions github-actions bot removed the stale stalebot believes this issue/PR has not been touched recently label Oct 7, 2022
@acarlson0000

Hi @PiotrSikora et al.

We have been attempting to write a new Envoy WASM filter that reads a file from the filesystem (it does not need write access), and ran into this issue: it is not supported!

Initially, we compiled our WASM filters for wasm32-unknown-unknown, but had to migrate to wasm32-wasi to include some external libraries that need the WASI time functionality. Is WASI the supported compilation target going forward? (We've seen some other discussion around this, but wanted to confirm explicitly.)

[As an aside, I can see that you had a WIP back in August 2022. Are you able to share this implementation at all? As you can expect (and see from the comments above), it is definitely a useful feature for all Envoy clients.]

@PiotrSikora
Contributor

@acarlson0000 I agree there are valid use cases for read-only file/blob access, and it would be good to have it. Unfortunately, my WIP implementation was forever lost when I left Google, so I cannot share it anymore. @anuraaga @mathetake did you make any progress on this?

Re wasm32-wasi - that's the preferred target (at least, as long as it's still preview1).

@kyessenov
Contributor

cc @mpwarres

@anuraaga
Contributor Author

Sorry, I got as far as a very early, unit-test-based WIP at https://github.com/anuraaga/proxy-wasm-cpp-host/tree/wasi-fs but haven't been able to find cycles to continue it yet. We're currently discussing more active work on proxy-wasm, and I'm hoping we'll be able to put more time into this and other issues.
