Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove JS Zed Query Parser #2972

Merged
merged 47 commits into from
Mar 27, 2024
Merged

Remove JS Zed Query Parser #2972

merged 47 commits into from
Mar 27, 2024

Conversation

jameskerr
Copy link
Member

@jameskerr jameskerr commented Jan 17, 2024

Overview

Removes our dependency on the JS version of the Zed Query Parser.

Fixes #2843

Behavioral Changes:

  • Infinite Scrolling works on non-summarized results
  • Infinite Scrolling does not work on summarized results
  • Summarized results only return the first 500 rows

Technical Changes

  • Query Info is fetched from the current lake's compile endpoint asyncronously and saved in the store.
  • Removed LIMIT status from results as well as aggregation and aggregationLimit values
  • Created AsyncTasks domain model
  • Refactored runHistogramQuery to use AsyncTasks

Side Effects

  • Zui Tests are more robust by waiting for the results associated with the app's current location key which is now rendered in a data-attribute.
  • Major re-organization of the zed-wasm repo and testing frameworks. Tests run on CI now
  • Removed unused components/status-bar folder and containing files

Tradeoffs
* A query is always pushed into history, even if there is an error in the query text.

Do Later

@mattnibs
Copy link
Contributor

@jameskerr I think we should back out the changes to models/Program then merge this.

test('parse a simple zed string', async ({ page }) => {
await page.fill('input', 'zed := "world"');
await page.click('button');
await expect(page.locator('pre')).toContainText('"kind": "OpAssignment"');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the zed side of things we avoid not having tests that introspect the contents of the ast since this is often subject to change. I think I'd just have a test here that looked for a parser error and verifies the error message- something like put x :=.

@jameskerr
Copy link
Member Author

I've updated the display of parsing errors.

CleanShot 2024-02-26 at 11 01 05@2x

@philrz

This comment was marked as resolved.

@philrz

This comment was marked as resolved.

@mattnibs
Copy link
Contributor

@jameskerr re @philrz's last comment. Instead of using the error message field returned by compile endpoint, I was thinking we could use info field from the response error message e.g.:
"info":{"parse_error_offset":6}

It tells you where in the source the parse error occurred and we could pass that to Monaco and have it do the error message natively.

@philrz
Copy link
Contributor

philrz commented Feb 26, 2024

When going over this PR on a Zoom earlier with @jameskerr I drilled down a bit on how this branch currently leverages:

...the "/compile" endpoint from the default lake which runs locally in Zui.

An ongoing concern I've had for some time is that as more users start to use remote Zed lakes they may encounter difficult-to-debug problems when their local Zui and the remote Zed service have some kind of language/AST incompatibilities. I'm sure there's multiple ways we'll go about addressing this over time (e.g., I assume stricter API versioning could help here) but I've always got my eyes open to ways we can improve incrementally in the meantime.

As @jameskerr confirmed, up until this change Zui has always depended on the local Zed parser that shipped with the app, so relying on the /compile endpoint of the Zed that shipped with the app would not be making things worse. However, it seems like we have an opportunity now to use the /compile endpoint for a remote lake when one is in use, since this has the potential to catch some of the difficult-to-debug problems I cited above. In brainstorming about possible down sides to doing this, the increased latency of calling /compile over the network (which might be a WAN) was the one we could think of, so I set out to quantify this a bit.

I set up an EC2 instance in AWS us-east-2, started a Zed lake service on it, and connected to it from my home desktop in SF. I made a branch based on the one in this PR to which I added a patch @jameskerr provided to call out to /compile on whatever lake the user is currently connected to (diffs) as well as a timer to log how long the parsing step takes. After loading some sample data to both the local and remote lakes, I set out to test the experience with needing to parse some simple Zed (just count()) and some sophisticated Zed (the attached 164-line shaper sample.zed.gz).

The summary of the findings:

Lake location count() sample.zed
Local ~4 ms ~75 ms
Remote ~150 ms ~300 ms

The video shows the user experience in terms of visible delay in the app:

Demo.mp4

My conclusion: I think the change is worth making. Yes, speed is important, but I'd argue that correctness in catching/reporting errors is even more important, especially as we start adding more features over time that actually might change the contents of a pool. In terms of how I imagine remote lakes will be used, I'm guessing what I'm doing here with accessing a lake on the other side of the country will be less common, and users that do it will likely become accustomed to other unavoidable delays. I suspect more remote lakes will be "local" in nature, e.g., like our community zync users that I think are all in the same geographic region. Finally, if the observed latency gets to be a problem I could picture fancy optimizations we could do, e.g., have Zui compare what's returned by the /version endpoint in both the local and remote lakes and if they match it could use the local /compile for parsing (though I'm not advocating doing any of that right now).

@mattnibs: Do you have any opinions to add here?

@jameskerr
Copy link
Member Author

I am fine with using the current lake for the /compile endpoint. I can change the order of some of these operations to improve the speed if I know we are going to loose between 150 and 300 milliseconds on parsing.

Also, would it be possible (or does this already happen?) to return the parsing error message and location information from the /query endpoint when you submit a bad query?

I could also parse the query async to improve speed. The only reason for parsing first is to prevent adding to the history stack an invalid query entry. However, I could be hopeful that it will work, and if it does not, remove the entry from the history stack.

@jameskerr jameskerr changed the title WASM Parser Remove JS Zed Query Parser Mar 5, 2024
@philrz

This comment was marked as resolved.

@philrz

This comment was marked as resolved.

@philrz

This comment was marked as resolved.

@@ -81,13 +81,29 @@ _Main Process Initializers_

Code that needs to be run one time before the app starts up can be put in an initializer. An initializer is a file that lives in the folder `src/electron/initializers/`. It must export a function named _initialize(main)_ that takes the Main Object as its only argument. See the FAQ for an example of creating a new initializer.

_Query Session_

This is a type of page that a tab can hold it the app. The page contains an editor pane, a results pane, and an inspector with various tabs related to querying data. These are session history, global history, data details, columns, and more.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This is a type of page that a tab can hold it the app. The page contains an editor pane, a results pane, and an inspector with various tabs related to querying data. These are session history, global history, data details, columns, and more.
This is a type of page that a tab can hold in the app. The page contains an editor pane, a results pane, and an inspector with various tabs related to querying data. These are session history, global history, data details, columns, and more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Intermittent failure of pool-loads.spec.ts: ".zed-table__cell" resolved to 2 elements
3 participants