Faster workspace execution #220

Merged: darabos merged 11 commits into main from darabos-faster-sql on Feb 3, 2022

@darabos (Contributor) commented Feb 2, 2022

I'm addressing multiple issues here:

  • Persistent outputs are slower because they write to disk. It's just a bad default.
  • The default LIMIT is annoying and no longer necessary if we don't persist the table. (Remove "limit 10" from the default SQL queries #151)
  • Each SQL box created 6 VertexToEdgeAttribute operations per input vertex attribute. With 100 attributes and 20 SQL boxes this was 12,000 operations on every setAndGetWorkspace request. That takes around 3 seconds. That's long enough that you start clicking on more things, creating an increasing backlog of requests. In my test I got 20–30 second response times easily.

[Image: Screenshot 2022-02-01 at 19 31 43]

For the third issue, 4 of the 6 VertexToEdgeAttribute operations were caused unnecessarily by the defaultTableName code. That's easy to fix. But the other two are unavoidable if we execute the box.
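A quick check of the arithmetic, as a sketch. The 6 and 4 come from the description above; the "after" figure is only what those numbers imply if the count keeps scaling the same way, not a measurement. The function name is made up for illustration.

```python
# Figures from the description: 6 VertexToEdgeAttribute operations per input
# vertex attribute per SQL box, 4 of which came from the defaultTableName code.
OPS_PER_ATTRIBUTE_BEFORE = 6
OPS_FROM_DEFAULT_TABLE_NAME = 4


def operations_per_request(attributes: int, sql_boxes: int, ops_per_attribute: int) -> int:
    """Hypothetical helper: operations created on one setAndGetWorkspace request."""
    return attributes * sql_boxes * ops_per_attribute


before = operations_per_request(100, 20, OPS_PER_ATTRIBUTE_BEFORE)
# Assumption: removing the defaultTableName overhead leaves 2 operations per attribute.
after = operations_per_request(
    100, 20, OPS_PER_ATTRIBUTE_BEFORE - OPS_FROM_DEFAULT_TABLE_NAME
)

print(before)  # 12000
print(after)   # 4000
```

So the defaultTableName fix alone would still leave thousands of operations per request whenever the boxes are executed.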

So I've added a "BoxCache" to avoid unnecessary box executions. This is a huge improvement. Even if you only moved a box, we used to execute all the boxes in the workspace. But with the cache we don't execute anything and it's super fast. When you change something, we only execute that one box.
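To illustrate the idea, here is a minimal Python sketch. It is not the code from this PR: `Box`, `BoxCache`, `run_workspace`, and `fake_execute` are hypothetical stand-ins, and the cache key (operation, parameters, and input states, with the box position left out) is an assumption about what makes two executions equivalent.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Box:
    box_id: str
    operation: str
    parameters: Tuple[Tuple[str, str], ...]
    inputs: Tuple[str, ...] = ()  # ids of upstream boxes
    x: int = 0  # position on the canvas; deliberately not part of the cache key
    y: int = 0


class BoxCache:
    """Hypothetical cache: remembers box outputs so unchanged boxes are not re-run."""

    def __init__(self, execute):
        self._execute = execute
        self._outputs = {}
        self.executions = 0

    def get_or_execute(self, box, input_outputs):
        # Assumption: a box's output depends only on these three things.
        key = (box.operation, box.parameters, input_outputs)
        if key not in self._outputs:
            self.executions += 1
            self._outputs[key] = self._execute(box, input_outputs)
        return self._outputs[key]


def run_workspace(boxes, cache):
    """Runs boxes in the given order, which is assumed to be topological."""
    results = {}
    for box in boxes:
        inputs = tuple(results[i] for i in box.inputs)
        results[box.box_id] = cache.get_or_execute(box, inputs)
    return results


def fake_execute(box, input_outputs):
    # Stand-in for the expensive part (creating the operations).
    return f"{box.operation}{dict(box.parameters)}<-{list(input_outputs)}"


cache = BoxCache(fake_execute)
workspace = [
    Box("a", "import", (("table", "people"),)),
    Box("b", "sql", (("query", "select * from input"),), inputs=("a",)),
    Box("c", "filter", (("age", ">30"),), inputs=("b",)),
]

run_workspace(workspace, cache)
first_run = cache.executions  # every box is executed once

# Moving a box changes only its position, so nothing is re-executed.
moved = [replace(workspace[0], x=100, y=50)] + workspace[1:]
run_workspace(moved, cache)
after_move = cache.executions - first_run

# Changing the parameters of the last box re-executes only that box.
edited = workspace[:2] + [replace(workspace[2], parameters=(("age", ">40"),))]
run_workspace(edited, cache)
after_edit = cache.executions - first_run - after_move

print(first_run, after_move, after_edit)  # 3 0 1
```

In this sketch, editing an upstream box would also miss the cache for any downstream box whose input changes as a result. Whether the real BoxCache behaves the same way is not stated here.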

I went with this universal solution rather than try to improve the SQL box code, because it's not the only box that scales linearly with the number of attributes. E.g. a filter box will pull over all attributes. It's a bit faster than the SQL box, but I still saw 1–2 second latencies with 100 attributes and 20 filter boxes. The cache fixes this for all boxes.

Do you see any potential issues with it?

@xandrew (Contributor) left a comment

This is cool! I like that you don't try to rebuild a new layer of lineage-based caching!

@darabos changed the title from "Faster SQL boxes" to "Faster workspace execution" on Feb 3, 2022

@darabos (Contributor, Author) commented Feb 3, 2022

Thanks!

@darabos merged commit d9f9b19 into main on Feb 3, 2022
@darabos deleted the darabos-faster-sql branch on February 3, 2022 15:41