Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add color support to charts, and refactor things a bit. #7022

Merged
merged 32 commits into from Aug 12, 2023

Conversation

sfc-gh-tteixeira
Copy link
Contributor

@sfc-gh-tteixeira sfc-gh-tteixeira commented Jul 16, 2023

Describe your changes

Add support for color argument in charts. As part of this work, I also refactored how charts work a bit, in order to prepare the land for st.scatter_chart, which is coming in a subsequent PR.

This is PR 3 in the work to:

  • Add latitude, longitude, color, and size to to st.map. This was PR 2.
  • Add a color argument to our native charts 👈👈👈 THAT'S THIS PR!
  • Add st.scatter_chart. This is coming next! It's going to be a short one, don't worry ;)

(PR 1 was a no-op that added the color_util.py library, which is used in this and the next PRs)

What else this does:

  • Refactors the heart of Streamlit's chart generation code, which had grown quite messy. Since this PR and the next would only make the code even more messy, I felt it would have been worse to simply tack-on the new code. The biggest change is in _generate_chart(), and next is _maybe_melt(). The good news is the new code is much more readable 😉
  • Fixes MyPy typing all over the code. I couldn't avoid doing this, or the PR wouldn't pass tests.
  • Improves how we name columns that are automatically created by our code, to reduce chances of a collision with existing columns. So now we use "random" names, and hide them from the viewer using Vega-Lite's title attributes.
  • Makes add_rows do the exact same data pre-processing as _generate_chart. This is just the right thing, and may even address some undiscovered bugs.
  • Adds lots of tests we were missing.

How to review this

The main file you need to understand is lib/streamlit/elements/arrow_altair.py, in particular the _generate_chart() function. It's a bit of a behemoth. Luckily, everything else (especially outside that file) is just supporting code that should be quite straight-forward to understand once you grok that function.

GitHub Issue Link (if applicable)

n/a

Testing Plan

  • N/A - Explanation of why no additional tests are needed

  • ✅ Unit Tests (JS and/or Python)

  • ✅ E2E Tests

  • Any manual testing needed?

    Yes. I'll request this from the QA team soon.


Contribution License Agreement

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

@@ -305,14 +303,7 @@ def generate_chart(chart_type, data, width: int = 0, height: int = 0):
data = {"": []}

if isinstance(data, pa.Table):
raise errors.StreamlitAPIException(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this to its own error type. See bottom of the new code.

lib/streamlit/elements/arrow_altair.py Fixed Show fixed Hide fixed
lib/streamlit/elements/arrow_altair.py Fixed Show fixed Hide fixed
lib/streamlit/elements/arrow_altair.py Fixed Show fixed Hide fixed
lib/streamlit/elements/arrow_altair.py Fixed Show fixed Hide fixed
lib/streamlit/elements/dataframe_selector.py Fixed Show fixed Hide fixed
lib/streamlit/type_util.py Dismissed Show resolved Hide resolved
lib/streamlit/type_util.py Dismissed Show resolved Hide resolved
@@ -558,16 +597,16 @@ def convert_anything_to_df(


@overload
def ensure_iterable(obj: Iterable[V_co]) -> Iterable[V_co]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just me fixing some pre-existing MyPy typing problems that were blocking this PR from passing tests.

else:
return _melt_data(df=data, last_index=last_index)
out_data, *_ = prep_data(df, **add_rows_metadata.columns)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Properly handle the data in add_rows() by calling here the exact same prep_data step we use in _generate_chart()

MELTED_Y_COLUMN_NAME = MELTED_Y_COLUMN_TITLE + PROTECTION_SUFFIX
MELTED_COLOR_COLUMN_NAME = MELTED_COLOR_COLUMN_TITLE + PROTECTION_SUFFIX

# Name we use for a column we know doesn't exist in the data, to address a Vega-Lite rendering bug
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a link to this vega-lite bug?

@willhuang1997
Copy link
Collaborator

This PR is pretty long so will review again later tonight or tomorrow morning.

Also Cypress test (4) is failing and I tried rerunning it. Can you fix that?

@sfc-gh-tteixeira
Copy link
Contributor Author

PR ready! I just sent it out for QA. Waiting for results before I can merge.

@kasim-inan kasim-inan added the QA-Needed QA is Required label Jul 31, 2023
@kasim-inan kasim-inan added QA-Done QA is Complete and removed QA-Needed QA is Required labels Aug 11, 2023
@sfc-gh-tteixeira sfc-gh-tteixeira merged commit 8efcfa6 into streamlit:develop Aug 12, 2023
48 checks passed
@sfc-gh-tteixeira sfc-gh-tteixeira deleted the chart_colors branch August 12, 2023 01:22
@sfc-gh-tteixeira sfc-gh-tteixeira mentioned this pull request Aug 12, 2023
@sfc-gh-tteixeira sfc-gh-tteixeira restored the chart_colors branch August 19, 2023 00:02
@sfc-gh-tteixeira sfc-gh-tteixeira deleted the chart_colors branch August 19, 2023 00:02
eric-skydio pushed a commit to eric-skydio/streamlit that referenced this pull request Dec 20, 2023
* Add color support to charts, and refactor things a bit.

* Fix tests and types.

* Fix types in type_util

* Removin unnecessary DF cloning, and add test to ensure we're not mutating the orig DF.

* Fix tests in latest Pandas and Arrow.

* Revert st_arrow_add_rows test to avoid too many changes in this PR. Will land in another PR.

* Remove snaps that should be regenerated.

* Don't show labels when no x or y specified (to match existing behavior).

* Update snapshots (need to regenerate)

* Fix bug where x axis was sometimes quantitative rather than temporal.

* Fix x enc when no x_column.

* Fix tests.

* Revert to melting data by hand since Snowflake doesn't include Altair 4.2

* Improve error messages.

* Fix MyPy types

* Update snapshots

* Fix empty check

* Revert setup.py

* Remove dead code.

* simple rename

* Remove more dead code

* Address comments.

* Fix snapshots

* Fix stupid copy/paste bug >_<

* Small cleanup. Always use str() around dataset ids. Should be a no-op since JSON stringifies keys anyways, but this is cleaner.
zyxue pushed a commit to zyxue/streamlit that referenced this pull request Mar 22, 2024
* Add color support to charts, and refactor things a bit.

* Fix tests and types.

* Fix types in type_util

* Removin unnecessary DF cloning, and add test to ensure we're not mutating the orig DF.

* Fix tests in latest Pandas and Arrow.

* Revert st_arrow_add_rows test to avoid too many changes in this PR. Will land in another PR.

* Remove snaps that should be regenerated.

* Don't show labels when no x or y specified (to match existing behavior).

* Update snapshots (need to regenerate)

* Fix bug where x axis was sometimes quantitative rather than temporal.

* Fix x enc when no x_column.

* Fix tests.

* Revert to melting data by hand since Snowflake doesn't include Altair 4.2

* Improve error messages.

* Fix MyPy types

* Update snapshots

* Fix empty check

* Revert setup.py

* Remove dead code.

* simple rename

* Remove more dead code

* Address comments.

* Fix snapshots

* Fix stupid copy/paste bug >_<

* Small cleanup. Always use str() around dataset ids. Should be a no-op since JSON stringifies keys anyways, but this is cleaner.
zyxue pushed a commit to zyxue/streamlit that referenced this pull request Apr 16, 2024
* Add color support to charts, and refactor things a bit.

* Fix tests and types.

* Fix types in type_util

* Removin unnecessary DF cloning, and add test to ensure we're not mutating the orig DF.

* Fix tests in latest Pandas and Arrow.

* Revert st_arrow_add_rows test to avoid too many changes in this PR. Will land in another PR.

* Remove snaps that should be regenerated.

* Don't show labels when no x or y specified (to match existing behavior).

* Update snapshots (need to regenerate)

* Fix bug where x axis was sometimes quantitative rather than temporal.

* Fix x enc when no x_column.

* Fix tests.

* Revert to melting data by hand since Snowflake doesn't include Altair 4.2

* Improve error messages.

* Fix MyPy types

* Update snapshots

* Fix empty check

* Revert setup.py

* Remove dead code.

* simple rename

* Remove more dead code

* Address comments.

* Fix snapshots

* Fix stupid copy/paste bug >_<

* Small cleanup. Always use str() around dataset ids. Should be a no-op since JSON stringifies keys anyways, but this is cleaner.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants