Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Perf] Massively reduce time spent parsing dataframes by Quiver #4463

Merged
merged 31 commits into from
Mar 10, 2022

Conversation

AnOctopus
Copy link
Contributor

@AnOctopus AnOctopus commented Mar 1, 2022

馃摎 Context

  • What kind of change does this PR introduce?

    • Bugfix
    • Feature
    • Refactoring
    • Other, please describe:
      performance improvement

馃 Description of Changes

  • This is a breaking API change
  • This is a visible (user-facing) change

馃И Testing Done

  • Screenshots included
  • Added/Updated unit tests
  • Added/Updated e2e tests

In my testing, a 150MB dataframe goes from taking 3s to process on the frontend down to 125ms, which is primarily the time spent by arrow turning the byte array into a table.

@AnOctopus AnOctopus requested a review from tconkling March 1, 2022 16:58
@AnOctopus AnOctopus requested a review from jroes as a code owner March 1, 2022 16:58
@AnOctopus AnOctopus changed the title Perf/quiver direct arrow [Perf] Massively reduce time spent parsing dataframes by Quiver Mar 1, 2022
Copy link
Contributor

@tconkling tconkling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good (to my very untrained eyes)! Some additional documentation would be great - I added a few notes where I was confused while reading the code.

frontend/jsdom-polyfill-env.js Outdated Show resolved Hide resolved
frontend/src/lib/Quiver.ts Outdated Show resolved Hide resolved
frontend/src/lib/Quiver.ts Outdated Show resolved Hide resolved
frontend/src/lib/Quiver.ts Show resolved Hide resolved
@AnOctopus AnOctopus requested a review from kantuni March 8, 2022 17:39
Copy link
Contributor

@jroes jroes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved for open source licensing perspective. No net new dependencies, just changing versions.

Copy link
Collaborator

@kantuni kantuni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 馃憤

Can you please also include the performance measurements in the PR description?

@AnOctopus AnOctopus merged commit 02cbb22 into streamlit:develop Mar 10, 2022
@AnOctopus AnOctopus deleted the perf/quiver-direct-arrow branch March 10, 2022 18:12
tconkling added a commit to tconkling/streamlit that referenced this pull request Mar 14, 2022
* develop:
  [Perf] Massively reduce time spent parsing dataframes by Quiver (streamlit#4463)
  Remove legacy "`add_rows` coalescing" from ForwardMsgQueue (streamlit#4485)
tconkling added a commit that referenced this pull request Mar 14, 2022
* develop:
  [Perf] Massively reduce time spent parsing dataframes by Quiver (#4463)
  Remove legacy "`add_rows` coalescing" from ForwardMsgQueue (#4485)
@jeffreyrwatkins
Copy link

This is a really nice improvement!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants