Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mapper visualisation refactor: fix bugs, add summary statistics in hovertext, improve opacity, remove matplotlib dependency, add node_scale kwarg, add clone_pipeline kwarg to interactive plots,restructure/rename plotly_kwargs, improve code #406

Merged
merged 40 commits into from
May 31, 2020

Conversation

ulupo
Copy link
Collaborator

@ulupo ulupo commented May 12, 2020

Reference issues/PRs
Fixes #399, #403 and #404 (thanks @miltminz for reporting these). It is an alternative to #405 by @MikeG112, developed in parallel to that solution. It also contains an implementation of a part of #382.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Description
This is a thorough refactoring of gtda.mapper, It achieves the following:

  • It makes the code clearer and shorter, and removes several redundancies.
  • It fixes issues [BUG] Mapper colormap for large range variables #399 and Scaled color_variable in plot_static_mapper #403 concerning node coloring. Concerning [BUG] Mapper colormap for large range variables #399, there was a simple bug in the code responsible to scaling between 0 and 1. However, [0, 1]-scaling has been completely removed from the code. This is because plotly colorscales are capable of handling arbitrary floats. Concerning Scaled color_variable in plot_static_mapper #403, the color bar now always shows the true range, without rescaling to [0, 1].
  • It adds the node summary statistics data to the node hovertext, rounding it to a user-specified number of significant figures controlled by a new parameter n_sig_figs. This addresses feature request Display value of colouring attribute on nodes of plot_static_mapper_graph #404.
  • It renames plotly_kwargs to plotly_params, slightly changes the specifications for it, and slightly changes the logic for updating plotly figures according to this parameter.
  • It removes the need for matplotlib's rgb2hex function and hence the need for the matplotlib requirement in giotto-tda altogether. This is achieved via a new helper function _get_colors_for_vals performing colorscale-based interpolations.
  • It adds a node_scale kwarg to the static and interactive plotting functions and makes set_node_sizeref private.
  • It adds a clone_pipeline kwarg to plot_interactive_mapper_graph, overlapping with [WIP] Add option to modify the mapper pipeline #382.
  • It improves the overall look of the Mapper graph, particularly by increasing the opacity of node colors so that edges do not hide them.

Checklist

  • I have read the guidelines for contributing.
  • My code follows the code style of this project. I used flake8 to check my Python changes.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed. I used pytest to check this on Python tests.

@ulupo ulupo requested a review from wreise May 12, 2020 08:53
@ulupo ulupo marked this pull request as draft May 12, 2020 08:53
@ulupo ulupo added the bug Something isn't working label May 12, 2020
@ulupo ulupo linked an issue May 12, 2020 that may be closed by this pull request
@ulupo ulupo changed the title WIP: Refactor mapper submodules bug fixing and code cleanliness WIP: Refactor mapper submodules for bug fixing and code cleanliness May 12, 2020
@ulupo
Copy link
Collaborator Author

ulupo commented May 12, 2020

@MonkeyBreaker, as perhaps you expected the Azure C++ errors we say in pyflagser seem to have finally appeared in the CI for this project too.

@ulupo ulupo changed the title WIP: Refactor mapper submodules for bug fixing and code cleanliness WIP: Refactor mapper visualisation submodules for bug fixing and code cleanliness May 12, 2020
@MonkeyBreaker
Copy link
Collaborator

Looking at the error, It actually seems to be same same problem as in pyflagser.

We could apply the same fix set(CMAKE_CXX_STANDARD 14) into CMakeLists.txt.
Would you want me to do a PR or can we integrate the fix into this PR ?

@ulupo
Copy link
Collaborator Author

ulupo commented May 12, 2020

Would you want me to do a PR or can we integrate the fix into this PR ?

Thanks for the quick reply. I think it would be better to place the fix in another PR. Thanks!

@MonkeyBreaker
Copy link
Collaborator

You want to do it or do I proceed ?

@ulupo ulupo linked an issue May 12, 2020 that may be closed by this pull request
@ulupo
Copy link
Collaborator Author

ulupo commented May 12, 2020

@MonkeyBreaker, feel free to proceed!

@ulupo ulupo self-assigned this May 12, 2020
Copy link
Collaborator

@wreise wreise left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (i tested it briefly from a user perspective), just two things:

  1. The plotly arguments were useful, but maybe they need to be compromised,
  2. Is it possible to format the strings in summary statstics to display only a few digits?

@ulupo
Copy link
Collaborator Author

ulupo commented May 15, 2020

Thanks @wreise!

Is it possible to format the strings in summary statstics to display only a few digits?

I thought about it, my issue with that is this: imagine all numbers are in some small range, e.g. (0, 0.0001). If you simply used number of decimals and e.g. set it to 3, all numbers would be shown as 0.

@ulupo
Copy link
Collaborator Author

ulupo commented May 15, 2020

@wreise and @miltminz, I've now implemented a rounding which defaults to 3 significant figures (not decimals, to avoid the kind of issue I mentioned above).

Umberto added 2 commits May 18, 2020 01:09
…down

- Minor code improvements
- Revert relative orders of node_trace and edge_trace
- Docstring improvements
- Linting
@ulupo
Copy link
Collaborator Author

ulupo commented May 21, 2020

Thanks @lewtun, I see no strange behaviour in the 3D plots, but I haven't tested extensively. But e.g. I haven't observed situations where everything becomes monochromatic. Do you think you could reconstruct a relevant example?

Edit: I misread "3D" for "interactive". I don't think I have checked the situation in 3D at all actually. Thanks, I'll look now.

Copy link
Collaborator

@lewtun lewtun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

gtda/mapper/pipeline.py Outdated Show resolved Hide resolved
@lewtun
Copy link
Collaborator

lewtun commented May 21, 2020

sure, let me checkout this branch and see if i can break it 😈

@lewtun
Copy link
Collaborator

lewtun commented May 21, 2020

Screen Shot 2020-05-21 at 17 45 19

Unfortunately the rgb2hex really seems to be necessary for the 3D plots 😭. Check out the screenshot, where the color of the tooltip (i.e. info box) does not match the corresponding node.

For reference, here's what the expected behaviour is like on master:

Screen Shot 2020-05-21 at 17 49 50

@ulupo
Copy link
Collaborator Author

ulupo commented May 21, 2020

Hmm, I see. Let me think about whether I can find a workaround. In any case, the argument for having the info box the same colour as the node is a wee bit weaker now that the statistic is displayed in the box.

@lewtun
Copy link
Collaborator

lewtun commented May 21, 2020

Good point! If you can't find a neat workaround, I suggest we use your new changes since the info box colour appears to be constant anyway, so less likely to cause confusion.

@ulupo
Copy link
Collaborator Author

ulupo commented May 22, 2020

I wasn't able to find a workaround. I get the feeling that the actual interpolation according to the colorscale (which we achieve by hand via get_cmap) is done in javascript (i.e. by plotly.js). I asked for help in https://community.plotly.com/t/hover-background-color-on-scatter-3d/9185/5?u=umberto.lupo.

Umberto added 3 commits May 25, 2020 17:58
… to plot_interactive_mapper_graph

- Update docstrings
- Update quickstart notebook
…of plotly_params

Static and interactive plots display the expect hoverlabel colors in 3D, or white if things go wrong. Thanks to @lewtun for pointing out that this was needed.
@ulupo
Copy link
Collaborator Author

ulupo commented May 25, 2020

@lewtun the latest commits bring a speedy vectorised version of the hoverlabel color calculator back on the table. Feel free to try to break it again!

@ulupo
Copy link
Collaborator Author

ulupo commented May 25, 2020

@wreise additionally, I introduced a version of #382 in 018392f and it seems to work. But I can remove it if you think it's not going to work. Otherwise, would you object if #382 continued being about testing this behaviour?

@ulupo ulupo changed the title Mapper visualisation refactor: fix bugs, add summary statistics in hovertext, improve opacity, remove matplotlib dependency, restructure/rename plotly_kwargs, improve code Mapper visualisation refactor: fix bugs, add summary statistics in hovertext, improve opacity, remove matplotlib dependency, add node_scale kwarg, add clone_pipeline kwarg to interactive plots,restructure/rename plotly_kwargs, improve code May 25, 2020
Copy link
Collaborator

@lewtun lewtun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on making the 3D hoverlabel colours work! Overall the PR LGTM and can be merged!

f'Data points: {X[node_elements[0]]}')
>>> node_id = mapper_graph['node_metadata']['node_id']
>>> node_elements = mapper_graph['node_metadata']['node_elements']
>>> print(f"Node Id: {node_id[0]}, Node elements: {node_elements[0]}, "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be "Node ID" for consistency elsewhere in our docs?

@giotto-ai giotto-ai deleted a comment from azure-pipelines bot May 31, 2020
@ulupo ulupo merged commit 23a113f into giotto-ai:master May 31, 2020
@ulupo ulupo mentioned this pull request May 31, 2020
@ulupo ulupo deleted the mapper/fix_mapper_colors branch May 31, 2020 10:21
ulupo pushed a commit to ulupo/giotto-tda that referenced this pull request May 31, 2020
ulupo added a commit that referenced this pull request May 31, 2020
* Fix issues with mapper docs following #406
ulupo added a commit that referenced this pull request Jun 2, 2020
* Fix issue with docstring example and document reshaping of 1D outputs (#396)

* Toc (#394)

* Update P landscapes

Signed-off-by: ammedmar <anibal@medina-mardones.com>

* Add distances, inner products and kernels glossary entry

Signed-off-by: ammedmar <anibal@medina-mardones.com>

* Remake vectorization changes

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Change [] for \lbrack \rbrack

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update after W's comments

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update after W's comments

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update afte W's second comments

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update after Umbe's comments

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update after Umbe's second comments

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update after Umbe's third comments

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update after Umbe's 4th comments

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Remove concept k-skeleton

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Add table of content

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Add Lp & lp. Update landscape

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update TOC indentation

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Add heat vectorizations entry

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update indentation of TOC

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Fix extra spacing in bibliography

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update bibliography hack for caps

Signed-off-by: ammedmar <ammedmar@gmail.com>

* After Umbe's Comments

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update after issue #398

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update after Umbe's comments

Signed-off-by: ammedmar <ammedmar@gmail.com>

* Update after Umbe's 2nd comments

Signed-off-by: ammedmar <ammedmar@gmail.com>

Co-authored-by: ammedmar <anibal@medina-mardones.com>

* Speedup windows pipeline (#402)

* Improve boost location for azure pipeline on windows

The boost version installed in the pipeline is now used

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>
Co-authored-by: Umberto Lupo <46537483+ulupo@users.noreply.github.com>

* Make bindings public (#395)

* Make bindings public

Signed-off-by: Guillaume Tauzin <guillaumetauzin.ut@gmail.com>

* Fix pipeline on Mac (#407)

* Refresh ccache

* Enforce CXX standard to 14 on each module

* Change variable name to comply with E741

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>
Co-authored-by: Umberto Lupo <46537483+ulupo@users.noreply.github.com>

* Mapper visualisation refactor: fix bugs, add summary statistics in hovertext, improve opacity, remove matplotlib dependency, add node_scale kwarg, add clone_pipeline kwarg to interactive plots,restructure/rename plotly_kwargs, improve code (#406)

* Refactor of `mapper/visualisation.py` and `mapper/utils/visualisation.py`
- Removal of color scaling
- More modularity
- Variable and function name changes
- Remove matplotlib functions used for hoverlabel background color
- Remove cmin and cmax
- Change return signature of some functions

* Add test-output.xml to .gitignore

* Replace "text" key in plot_options["node_trace"] with "hovertext"

* Improve width and opacity of nodes and edges

* Display summary statistics in hovertext with significant figure rounding via new n_sig_figs kwarg

* Rename plotly_kwargs to plotly_params

* Remove matplotlib installation requirement

* Fix mapper notebook

* Add small comment on use of matplotlib in voids_on_the_plane

* Place matplotlib in examples requirement in setup.py

* Improve docstring of make_mapper_pipeline

* Improve examples for make_mapper_pipeline and create one for plot_static_mapper_graph

* Hide set_node_sizeref, add node_scale kwarg, add clone_pipeline kwarg to plot_interactive_mapper_graph

- Update docstrings
- Update quickstart notebook

* Add helper function for colorscale-based interpolations, improve use of plotly_params

Static and interactive plots display the expect hoverlabel colors in 3D, or white if things go wrong. Thanks to @lewtun for pointing out that this was needed.

* Improve docstrings

* Hide visualization module in mapper/utils

* Add pip install matplotlib to notebook tests in manylinux job (#410)

* Fix mapper docstring issues following #406 (#411)

* Fix issues with mapper docs following #406

* Create giotto-tda version 0.2.2 (#413)

* Turn CODE_OWNERS and CODE_AUTHORS into an rst file

* Bump version number to 0.2.2

* Add release notes for v0.2.2

Co-authored-by: Anibal M. Medina-Mardones <ammedmar@gmail.com>
Co-authored-by: ammedmar <anibal@medina-mardones.com>
Co-authored-by: REDS institute <reds-heig@users.noreply.github.com>
Co-authored-by: Guillaume Tauzin <guillaumetauzin.ut@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
4 participants