
Frontend <> Backend communication #20

Closed
cristianberneanu opened this issue Jun 9, 2021 · 6 comments
Comments

@cristianberneanu
Contributor

We need to agree on the way the Frontend communicates with the Backend.

Since transpiling the reference code to JS resulted in poor performance, the anonymization code will stay in dotnet.
Furthermore, I don't think it is a good idea to manually build the query AST in JS land. It couples the Frontend and Backend internals too much. Sending a SQL statement feels cleaner.

As input we send: filename, query statement, anonymization settings.
As output we get: query result or an error.

Option 1: anonymize using the CLI.

We pass the input as command-line arguments and get back the query result (as either CSV or JSON) on the stdout stream, or an error on the stderr stream.

PROs:

  • We don't need to have .NET code in the publisher repository;
  • It keeps the reference code separate from the GUI and free of pollution with Frontend concerns;
  • Allows for easy automation, as all the functionality is easily accessible from the CLI;
  • Makes sure the reference tool works as intended from the CLI (since the Frontend depends on it).

CONs:

  • We won't have live progress reports (unless we get a bit hacky);
  • We pay the CLR startup cost for each anonymization call;
  • Functionality will be limited to what the CLI provides.

Option 2: anonymize using IPC.

We will need an additional .NET project in this repository that loads the core reference library and dispatches anonymization requests to it. We pass the input as a JSON object and get back a JSON object with the result or error. We need to decide whether to use a socket or the process's stdio streams for message exchange.
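If we went with stdio streams, one common framing is newline-delimited JSON. A minimal sketch of the Frontend side (the `type`/`payload` message shape is an assumption, not a settled protocol):

```javascript
// Encode one request as a newline-delimited JSON message.
function encodeMessage(type, payload) {
  return JSON.stringify({ type, payload }) + "\n";
}

// Split a chunk read from the backend's stdout into complete messages.
// A trailing partial line is returned as the remainder, to be prepended
// to the next chunk before decoding again.
function decodeMessages(buffer) {
  const lines = buffer.split("\n");
  const remainder = lines.pop();
  return {
    messages: lines.filter(Boolean).map((line) => JSON.parse(line)),
    remainder,
  };
}
```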

PROs:

  • We can add functionality not supported by the CLI;
  • JSON messages are more expressive than invoking a CLI application;
  • Lower latency, since the CLR is kept loaded.

CONs:

  • Additional .NET code added to this repository;
  • CLI might become stale, since it will be rarely used;
  • Tighter coupling between the publisher and reference repositories;
  • Reference code will get polluted with Frontend concerns (like progress reports).

I am slightly in favor of Option 1 (I don't consider its drawbacks too significant).

@sebastian
Contributor

I don't think it is a good idea to manually build the query AST in JS land. It couples the Frontend and Backend internals too much. Sending a SQL statement feels cleaner.

Yes, building the AST in JS only made sense as long as the AST could immediately be executed there too.

@sebastian
Contributor

sebastian commented Jun 9, 2021

I vote for Option 1 too.

I additionally vote for using JSON as the output format, since it's easier to consume in the Frontend than parsing CSV output.

We can live without progress reports, and if we need it later we can get hacky then.

@edongashi
Member

Do we drop the JS CSV parser? If yes, do we use the backend to figure out the shape when we load a file?
If not, we need to use 2 different CSV libraries where each may have their own tiny differences.

@sebastian
Contributor

Do we drop the JS CSV parser? If yes, do we use the backend to figure out the shape when we load a file?
If not, we need to use 2 different CSV libraries where each may have their own tiny differences.

Good point, @edongashi.

We either need another parser for the GUI or need to extend the Reference with an endpoint that returns a schema...
In either case, as long as we want to support CSV, it seems the CLI interface must be extended to support providing a schema as part of the input too!?

@cristianberneanu
Contributor Author

I say we do the CSV parsing only in the backend/reference tool.
To load the initial raw data (including the schema) the frontend could issue a standard SELECT * FROM 'file_name' query.
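For illustration, assuming the backend returns query results as an array of JSON row objects, the Frontend could derive a simple schema from that initial query's result along these lines (the function and type names are hypothetical):

```javascript
// Given the JSON result of `SELECT * FROM 'file_name'` (assumed to be
// an array of row objects), derive a column-name/type schema from the
// first row. Crude on purpose: just enough shape for the GUI.
function inferSchema(rows) {
  if (rows.length === 0) return [];
  return Object.keys(rows[0]).map((name) => ({
    name,
    type: typeof rows[0][name] === "number" ? "number" : "text",
  }));
}
```

This would keep all CSV parsing in the backend, avoiding the two-parsers problem raised above.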

@cristianberneanu
Contributor Author

cristianberneanu commented Jun 11, 2021

This seems settled (at least for now).
