
Much faster FormatStringEncoding #5315

Merged 5 commits into napari:main on Nov 10, 2022
Conversation

@jeylau (Contributor) commented Nov 8, 2022

Description

Loading a large pose estimation data file with a simple "{id}–{label}" format (using the napari-deeplabcut plugin; the resulting feature table has >6M rows and four columns) took over 3 minutes.
While I first thought it had something to do with rendering the keypoints, a bit of profiling (see pink lines below) indicated that 92% of the time it took to load the annotations (177 s!) was spent in the TextManager, specifically in _get_feature_row.

[Screenshots: profiling results showing time spent in TextManager._get_feature_row]

I substituted df.iloc with df.itertuples, and loading now takes only about 5 s (a ~35x speedup).
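The swap described above can be sketched on a toy table (column names made up; this mirrors the idea, not the PR's exact code):

```python
import pandas as pd

# Toy stand-in for the pose-estimation feature table (column names made up).
features = pd.DataFrame({"id": [0, 1, 2], "label": ["nose", "ear", "tail"]})
fmt = "{id}-{label}"

# Before: one df.iloc lookup per row, which builds a Series each time.
slow = [fmt.format(**features.iloc[i].to_dict()) for i in range(len(features))]

# After: itertuples yields plain tuples, avoiding per-row Series creation.
feature_names = features.columns.to_list()
fast = [
    fmt.format(**dict(zip(feature_names, row)))
    for row in features.itertuples(index=False, name=None)
]

assert slow == fast
print(fast)  # ['0-nose', '1-ear', '2-tail']
```

The per-row Series construction (with its dtype handling) is what dominates the slow path; the fast path does one column-name lookup up front and then only tuple unpacking per row.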

Type of change

  • New feature (non-breaking change which adds functionality)

References

How has this been tested?

  • example: the test suite for my feature covers cases x, y, and z
  • example: all tests pass with my change
  • example: I checked that my changes work with both PySide and PyQt backends,
    as there are small differences between the two Qt bindings

Final checklist:

  • My PR is the minimum possible work for the desired functionality
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • If I included new strings, I have used trans. to make them localizable.
    For more information see our translations guide.

@andy-sweet (Member)

Thanks for the contribution and detailed write up! And sorry about the performance problems! I should be able to take a look either today or tomorrow.

@andy-sweet (Member) left a comment

Fantastic speedup here! Approving with just a minor comment about a variable name.

One thing to watch out for is that Layer.features will not necessarily be a pandas DataFrame in the future. Currently we always coerce it to be exactly a pandas DataFrame, but in the future we may want to support other dataframe libraries (dask.dataframe, cuDF) using a Protocol.

I've currently done a half-baked job of indicating that with the docstring and typing. But ideally, we can define exactly what attributes and methods we expect to find in a dataframe Protocol.

Previously we needed the dataframe to define iloc and support len, but now we need itertuples. I checked dask's and cudf's dataframe APIs and they both provide itertuples, so there's no problem here.
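Such a contract could be pinned down with typing.Protocol; a rough sketch follows (the name DataFrameLike and the member set are assumptions for illustration, not napari's actual definition):

```python
from typing import Any, Iterator, Protocol, runtime_checkable

import pandas as pd


@runtime_checkable
class DataFrameLike(Protocol):
    """Hypothetical minimal contract for Layer.features (illustration only)."""

    columns: Any  # sequence of column labels

    def itertuples(self, index: bool = ..., name: Any = ...) -> Iterator[tuple]:
        ...

    def __len__(self) -> int:
        ...


# pandas satisfies the protocol structurally; no inheritance required.
assert isinstance(pd.DataFrame({"a": [1]}), DataFrameLike)
```

Any dataframe library exposing these three members would then work with the string encoding code without napari naming it explicitly.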

napari/layers/utils/string_encoding.py (review suggestion, resolved)
Co-authored-by: Andy Sweet <andrew.d.sweet@gmail.com>
@andy-sweet (Member)

Also, if you have the profile data, or even just a screenshot of snakeviz for the optimized version, that would be ideal. I'm curious what takes up the rest of those 5 s.

@jeylau (Contributor, Author) commented Nov 8, 2022

I was unclear above: the 5 s are spent in the TextManager. The ColorManager accounts for another 9 s, but that may be something I have to improve at the level of the plugin; I'll take a look.

[Screenshot: updated snakeviz profile]

@andy-sweet (Member)

I was unclear above: the 5 s are spent in the TextManager. The ColorManager accounts for another 9 s, but that may be something I have to improve at the level of the plugin; I'll take a look.

Thanks for clarifying. The optimization here should just go in as is. There was an effort to replace ColorManager, currently on hold, but if we pick it back up we might be able to look for an optimization there. Feel free to look for one yourself too.

@andy-sweet (Member)

Also, FYI, I ran the relevant ASV benchmark for the format string case and we also see a big speedup there. At least 10x on my machine, though we only go up to 65536 (2^16) elements and timings are not linear.

(napari-dev) ➜  napari git:(pr/jeylau/5315) ✗ asv run --python=same --bench "TextManagerSuite.time_create"
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] ·· Benchmarking existing-py_Users_asweet_software_miniconda3_envs_napari-dev_bin_python
[ 50.00%] ··· Running (benchmark_text_manager.TextManagerSuite.time_create--).
[100.00%] ··· benchmark_text_manager.TextManagerSuite.time_create                                                                                                                                                                                                                                                                                                       ok
[100.00%] ··· ======= =========================================
              --                        string                 
              ------- -----------------------------------------
                 n     {string_property}: {float_property:.2f} 
              ======= =========================================
                 16                    829±50μs                
                 64                    855±20μs                
                256                  1.14±0.06ms               
                1024                 2.23±0.05ms               
                4096                 6.15±0.06ms               
               16384                  21.5±0.2ms               
               65536                  84.9±0.9ms               
              ======= =========================================

(napari-dev) ➜  napari git:(pr/jeylau/5315) ✗ git switch main          
M	napari/benchmarks/benchmark_text_manager.py
Switched to branch 'main'
Your branch is ahead of 'origin/main' by 4 commits.
  (use "git push" to publish your local commits)
(napari-dev) ➜  napari git:(main) ✗ asv run --python=same --bench "TextManagerSuite.time_create"
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] ·· Benchmarking existing-py_Users_asweet_software_miniconda3_envs_napari-dev_bin_python
[ 50.00%] ··· Running (benchmark_text_manager.TextManagerSuite.time_create--).
[100.00%] ··· benchmark_text_manager.TextManagerSuite.time_create                                                                                                                                                                                                                                                                                                       ok
[100.00%] ··· ======= =========================================
              --                        string                 
              ------- -----------------------------------------
                 n     {string_property}: {float_property:.2f} 
              ======= =========================================
                 16                   1.04±0.1ms               
                 64                  1.87±0.08ms               
                256                   5.81±0.6ms               
                1024                   19.9±2ms                
                4096                   78.9±8ms                
               16384                   290±7ms                 
               65536                  1.15±0.01s               
              ======= =========================================

@andy-sweet andy-sweet added the performance Relates to performance label Nov 8, 2022
@jeylau (Contributor, Author) commented Nov 8, 2022

Sweet!

@andy-sweet (Member)

I'll merge this after 48 hours unless anyone objects.

@brisvag (Contributor) left a comment

Awesome! Small suggestion, but otherwise approving.

Comment on lines +168 to 172

-        values = [
-            self.format.format(**_get_feature_row(features, i))
-            for i in range(len(features))
-        ]
+        feature_names = features.columns.to_list()
+        values = [
+            self.format.format(**dict(zip(feature_names, row)))
+            for row in features.itertuples(index=False, name=None)
+        ]

I think we should add a comment explaining the code here. Before, the function's name at least explained something, but now it's rather cryptic.

@JoOkuma (Contributor) commented Nov 9, 2022

The functionality is very similar to pd.DataFrame.to_dict("records"), which does essentially the same thing with a few extra checks.
This PR is still 2x faster than to_dict("records").

This PR:

[100.00%] ··· ======= =========================================
              --
              ------- -----------------------------------------
                 n     {string_property}: {float_property:.2f}
              ======= =========================================
                 16                    676±40μs
                 64                    864±80μs
                256                  1.02±0.05ms
                1024                  2.00±0.1ms
                4096                  5.67±0.1ms
               16384                  19.6±0.2ms
               65536                   78.6±2ms
              ======= =========================================

with values = [self.format.format(**row) for row in features.to_dict("records")]

[100.00%] ··· ======= =========================================
             --
             ------- -----------------------------------------
                n     {string_property}: {float_property:.2f}
             ======= =========================================
                16                    759±30μs
                64                    803±30μs
               256                  1.17±0.05ms
               1024                 2.90±0.03ms
               4096                  9.54±0.3ms
              16384                  34.4±0.4ms
              65536                   144±4ms
             ======= =========================================
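For what it's worth, the two variants produce identical row dictionaries; a quick illustrative check (toy data, not from the PR):

```python
import pandas as pd

df = pd.DataFrame(
    {"string_property": ["a", "b"], "float_property": [1.234, 5.678]}
)

# Variant 1: pandas builds the row dictionaries.
records = df.to_dict("records")

# Variant 2: the PR's zip-over-itertuples approach.
feature_names = df.columns.to_list()
zipped = [
    dict(zip(feature_names, row))
    for row in df.itertuples(index=False, name=None)
]

assert records == zipped

fmt = "{string_property}: {float_property:.2f}"
print([fmt.format(**row) for row in zipped])  # ['a: 1.23', 'b: 5.68']
```

The ~2x gap in the benchmarks above comes from the extra type boxing to_dict performs, not from any difference in the dictionaries produced.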

@jeylau (Contributor, Author) commented Nov 9, 2022

@JoOkuma, df.itertuples's memory footprint will also be very low, since it returns an iterator.
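A quick illustration of that point (toy data, not from the PR): itertuples hands back an iterator of plain tuples, so rows are produced one at a time rather than materialized all at once:

```python
import collections.abc

import pandas as pd

# Toy stand-in for a features table.
df = pd.DataFrame({"id": range(3), "label": list("abc")})

rows = df.itertuples(index=False, name=None)

# Rows are generated lazily; nothing is materialized until consumed.
assert isinstance(rows, collections.abc.Iterator)
assert next(rows) == (0, "a")
```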

@beckernick

Hi! I came across this PR due to the cuDF mention. Performance gains look fantastic with this change!

I wanted to add some context about cuDF and itertuples:

Previously we needed the dataframe to define iloc and support len, but now we need itertuples. I checked dask's and cudf's dataframe APIs and they both provide itertuples, so there's no problem here.

cuDF doesn't support this kind of row-based iteration via itertuples or iterrows. Iterating one row at a time from raw Python would cause a GPU->CPU transfer for each row, which is very inefficient due to the transfer overhead. If there's interest in cuDF support in the future, it may be fine to jump back and forth between cuDF and pandas here, as one bulk transfer (pdf = gdf.to_pandas(); ...; gdf = cudf.from_pandas(...)) will not be as slow.

I also wanted to mention that cuDF is adding support for DataFrame.to_dict, which will essentially do the pandas conversion for you and then call pd.DataFrame.to_dict. to_dict provides a similar list-of-row-dictionaries output.

import cudf

df = cudf.datasets.randomdata(nrows=3)
pdf = df.to_pandas()
print(pdf.to_dict(orient="records"))

# from the function in the PR
feature_names = pdf.columns.to_list()
print([dict(zip(feature_names, row)) for row in pdf.itertuples(index=False, name=None)])
[{'id': 1023, 'x': -0.3437607026238143, 'y': 0.4419788553645101}, {'id': 1012, 'x': 0.2760312523846038, 'y': 0.8273451449034162}, {'id': 1013, 'x': -0.35755297004454434, 'y': -0.13542889747873632}]
[{'id': 1023, 'x': -0.3437607026238143, 'y': 0.4419788553645101}, {'id': 1012, 'x': 0.2760312523846038, 'y': 0.8273451449034162}, {'id': 1013, 'x': -0.35755297004454434, 'y': -0.13542889747873632}]

Happy to chat further if helpful.

@andy-sweet (Member)

cuDF doesn't support this kind of row-based iteration via itertuples or iterrows. Iterating one row at a time from raw Python would cause a GPU->CPU transfer for each row, which is very inefficient due to the transfer overhead. If there's interest in cuDF support in the future, it may be fine to jump back and forth between cuDF and pandas here, as one bulk transfer (pdf = gdf.to_pandas(); ...; gdf = cudf.from_pandas(...)) will not be as slow.

Thanks for the very useful information!

napari doesn't yet support different types of tables/dataframes, but we can at least imagine that it could in the not-too-distant future. We'll definitely reference this information then and may pick your brains more. I imagine support would look something like defining a core functional protocol, with specialized implementations for specific types where needed (e.g. doing something special here if it's a cuDF dataframe).
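A specialized implementation along those lines might just bulk-convert up front; here is a sketch under the assumption that cuDF is an optional dependency (the helper names are hypothetical, not part of napari):

```python
import pandas as pd


def _as_pandas(features):
    """Bulk-convert a GPU dataframe to pandas before row iteration.

    cudf is treated as an optional dependency; this helper and its name
    are hypothetical, for illustration only.
    """
    try:
        import cudf
    except ImportError:
        return features
    if isinstance(features, cudf.DataFrame):
        # One GPU->CPU transfer for the whole table, instead of one per row.
        return features.to_pandas()
    return features


def format_rows(features, fmt):
    """Apply a format string to every row (same pattern as the PR)."""
    features = _as_pandas(features)
    feature_names = features.columns.to_list()
    return [
        fmt.format(**dict(zip(feature_names, row)))
        for row in features.itertuples(index=False, name=None)
    ]


print(format_rows(pd.DataFrame({"id": [1], "x": [0.5]}), "{id}: {x:.1f}"))
# ['1: 0.5']
```

With pandas input the adapter is a no-op, so the fast path above is unchanged; only GPU-backed tables pay the single bulk-transfer cost.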

@exactlyallan

FYI, RAPIDS viz works closely with HoloViews and has done similar cuDF implementations with success, for example.

@andy-sweet (Member) commented Nov 10, 2022

Merging after 24 but before 48 hours since we have multiple approvals here.

@andy-sweet andy-sweet merged commit cd5c314 into napari:main Nov 10, 2022
@jeylau jeylau deleted the fast_format_encoding branch November 11, 2022 11:30
@Czaki Czaki mentioned this pull request Jun 7, 2023
@Czaki Czaki added this to the 0.4.18 milestone Jun 13, 2023
Czaki pushed a commit that referenced this pull request Jun 16, 2023
* Much faster iteration over a DataFrame's rows

* Remove unused import

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update napari/layers/utils/string_encoding.py

Co-authored-by: Andy Sweet <andrew.d.sweet@gmail.com>

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Andy Sweet <andrew.d.sweet@gmail.com>
Czaki pushed further commits with the same message referencing this pull request on Jun 17, 18, 19, and 21, 2023.
Labels: enhancement, performance
7 participants