Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore cmap to config; simplify config of cmap/dpi/image format #777

Merged
merged 9 commits into from
Jan 25, 2024

Conversation

ns-rse
Copy link
Collaborator

@ns-rse ns-rse commented Jan 12, 2024

Closes #776

TL;DR

I would really appreciate feedback on the documentation as without clear documentation it is perhaps confusing how the components interact and work and can be modified and getting this as clear as possible will be really helpful. This is the configuration.md file. Ignore the changes to the table at the top and please review and feedback on the section ### Further customisation.

It won't be added to/rendered to the website until this PR is merged so I can't easily provide it as a web-page for review.

Details

  • Aligns command line and configuration field names with those in matplotlibrc files.
  • Restores the cmap configuration option to default_config.yaml and introduces savefig_dpi option.
  • Adds command line options for setting DPI (--savefig-dpi), Colormap (--cmap)and Output file format (--savefig-format).
  • Expands documentation on how to use custom configuration files or command line options to set the DPI/Colormap/Output format.
  • Updates the header to topostats.mplstyle to explain how to use it as typically users will have created a copy of the file (after the convenience function topostats create-matplotlibrc was introduced with Adds entry point to create matplotlibrc parameters file #773).
  • To achieve this the dictionary config["plotting"] needed explicitly updating as the update_config() function doesn't update nested configurations (since this is the first PR that introduces command line options that modify any of the values in the nested dictionaries).
  • Updates options for topostats toposum`` to align with savefig_format` and adds flag to entry point so output format is consistent.
  • Updates and expands the configuration documentation explaining how to use these conveniences.

As a consequence quite a few files are touched to ensure that validation and processing functions all have variables that align with those in the configuration.

Testing

If users could test this it would be very much appreciated, if you use the Git installed version something like the following would switch branches and allow you test it.

conda create --name topostats-config # Create and activate a virtual env specific to this
conda activate topostats-config
cd ~/path/to/TopoStats
git pull
git checkout ns-rse/776-config-jigging
pip install -e .
topostats process --output-dir base
topostats create-config test_config.yaml   # Create test_config.yaml to try changing parameters
topostats process --config test_config.yaml --output-dir test1
topostats process  --output-dir test2 --savefig-dpi 10 --cmap rainbow --savefig-format svg
topostats process --config test_config.yaml  --output-dir test3 --savefig-dpi 80 --cmap viridis --savefig-format pdf

Each invocation of topostats process will save output to its own directory (either base, test1, test2 and test3) for comparison. There should be differences between the scan plots under each run, i.e. base v test_config.yaml and saved under test1 and those under test2 and test3 should also differ.

Copy link

codecov bot commented Jan 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (7dee515) 84.30% compared to head (8bfa700) 84.36%.
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #777      +/-   ##
==========================================
+ Coverage   84.30%   84.36%   +0.06%     
==========================================
  Files          21       21              
  Lines        3128     3134       +6     
==========================================
+ Hits         2637     2644       +7     
+ Misses        491      490       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Closes #776

+ Aligns command line and configuration field names with those in matplotlibrc files.
+ Restores the `cmap` configuration option to `default_config.yaml` and introduces `savefig_dpi` option.
+ Adds command line options for setting DPI (`--savefig-dpi`), Colormap (`--cmap`)and Output file
  format (`--savefig-format`).
+ Expands documentation on how to use custom configuration files or command line options to set the DPI/Colormap/Output
  format.
+ Updates the header to `topostats.mplstyle` to explain how to use it as typically users will have created a copy of the
  file (after the convenience function `topostats create-matplotlibrc` was introduced with #773).
+ To achieve this the dictionary `config["plotting"]` needed explicitly updating as the `update_config()` function
  doesn't update nested configurations (since this is the first PR that introduces command line options that modify any
  of the values in the nested dictionaries).
+ Updates options for `topostats toposum`` to align with `savefig_format` and adds flag to entry point so output format
  is consistent.
+ Updates and expands the configuration documentation explaining how to use these conveniences.

As a consequence quite a few files are touched to ensure that validation and processing functions all have variables
that align with those in the configuration.

If users could test this it would be very much appreciated, if you use the Git installed version something like the
following would switch branches and allow you test it.

```
conda create --name topostats-config # Create and activate a virtual env specific to this
conda activate topostats-config
cd ~/path/to/TopoStats
git pull
git checkout ns-rse/776-config-jigging
pip install -e .
topostats process --output-dir base
topostats create-config test_config.yaml   # Create test_config.yaml to try changing parameters
topostats process --config test_config.yaml --output-dir test1
topostats process  --output-dir test2 --savefig-dpi 10 --cmap rainbow --savefig-format svg
topostats process --config test_config.yaml  --output-dir test3 --savefig-dpi 80 --cmap viridis --savefig-format pdf
```

Each invocation of `topostats process` will save output to its own directory (either `base`, `test1`, `test2` and
`test3`) for comparison. There should be differences between each `base` the values used in `test_config.yaml` and
saved under `test1` and those under `test2` and `test3` should also differ.

I would really appreciate feedback on the documentation as without clear documentation it is perhaps confusing how the
components interact and work and can be modified and getting this as clear as possible will be really helpful.
@derollins
Copy link
Collaborator

derollins commented Jan 12, 2024

Tested this branch as suggested above, it worked as described and both the updated config file and the new command line options all behave as expected.

Used :

topostats process --output-dir base
topostats create-config -f test_config.yaml  
topostats process --config test_config.yaml --output-dir test1
topostats process  --output-dir test2 --savefig-dpi 10 --cmap rainbow --savefig-format svg
topostats process --config test_config.yaml  --output-dir test3 --savefig-dpi 80 --cmap viridis --savefig-format pdf

Saved the outputs in 4 folders with the different format options.

@ns-rse
Copy link
Collaborator Author

ns-rse commented Jan 12, 2024

Thanks for checking @derollins and glad the tests work for you.

Did you edit the savefig_dpi/savefig_format/cmap fields in the generated test_config.yaml so that output differed from base output?

Is this the desired functionality for changing these options (i.e. three methods firstly via a custom Matplotlibrc style file; secondly changing a custom_config.yaml; thirdly via command line options)?

Does the documentation make sense? (If you want a rendered view of it you can copy and paste this into here then scroll down to the Further Customisation section).

Is the header to topostats/topostats.mplstyle which is generated by topostats create-matplotlibrc less worrying as it no longer includes DO NOT EDIT and are the instructions contained therein concise and clear?

I've spent 4-5hrs digging through this and making the changes and would like to round it off whilst its fresh in my mind if its not right so I can get back to debugging some other work I'm developing with my TopoStats time next week.

@derollins
Copy link
Collaborator

Did you edit the savefig_dpi/savefig_format/cmap fields in the generated test_config.yaml so that output differed from base output?

Yes, changed all three of these in the generated test config file. So it is different from all the other folders.

This is a good improvement in functionality, it's really helpful to have all the most used parameters in one place (the main config file) and accessible from the command line for routine analysis and image generation. It's still helpful to have the matplotlib style file for more detailed changes and for matching presentation or journal style guides etc.

The documentation seems clear, the only small thing I thought of was making it clear that the plotting dictionary is not a file to copy/ generate then edit like the config file and matplotlib style file but a single file to edit. Although I don't know if that would actually overcomplicate.

The new header on the style file is very clear and much less intimidating! I'm no longer scared I will break everything haha.

Everything seems to be working how it should and this is a really helpful change.

@ns-rse
Copy link
Collaborator Author

ns-rse commented Jan 12, 2024

Thanks for the clarification @derollins

What is most used might change between users and what they are trying to achieve. I've always seen TopoStats as a batch processing tool that produces standardised output for work in progress and expected that where bespoke images need customisation they would be plotted outside of this batch processing framework.

Glad you find the changes useful.

I'll await feedback/approval from other users before merging.

Closes #776

+ Aligns command line and configuration field names with those in matplotlibrc files.
+ Restores the `cmap` configuration option to `default_config.yaml` and introduces `savefig_dpi` option.
+ Adds command line options for setting DPI (`--savefig-dpi`), Colormap (`--cmap`)and Output file
  format (`--savefig-format`).
+ Expands documentation on how to use custom configuration files or command line options to set the DPI/Colormap/Output
  format.
+ Updates the header to `topostats.mplstyle` to explain how to use it as typically users will have created a copy of the
  file (after the convenience function `topostats create-matplotlibrc` was introduced with #773).
+ To achieve this the dictionary `config["plotting"]` needed explicitly updating as the `update_config()` function
  doesn't update nested configurations (since this is the first PR that introduces command line options that modify any
  of the values in the nested dictionaries).
+ Updates options for `topostats toposum`` to align with `savefig_format` and adds flag to entry point so output format
  is consistent.
+ Updates and expands the configuration documentation explaining how to use these conveniences.

As a consequence quite a few files are touched to ensure that validation and processing functions all have variables
that align with those in the configuration.

If users could test this it would be very much appreciated, if you use the Git installed version something like the
following would switch branches and allow you test it.

```
conda create --name topostats-config # Create and activate a virtual env specific to this
conda activate topostats-config
cd ~/path/to/TopoStats
git pull
git checkout ns-rse/776-config-jigging
pip install -e .
topostats process --output-dir base
topostats create-config test_config.yaml   # Create test_config.yaml to try changing parameters
topostats process --config test_config.yaml --output-dir test1
topostats process  --output-dir test2 --savefig-dpi 10 --cmap rainbow --savefig-format svg
topostats process --config test_config.yaml  --output-dir test3 --savefig-dpi 80 --cmap viridis --savefig-format pdf
```

Each invocation of `topostats process` will save output to its own directory (either `base`, `test1`, `test2` and
`test3`) for comparison. There should be differences between each `base` the values used in `test_config.yaml` and
saved under `test1` and those under `test2` and `test3` should also differ.

I would really appreciate feedback on the documentation as without clear documentation it is perhaps confusing how the
components interact and work and can be modified and getting this as clear as possible will be really helpful.
@derollins
Copy link
Collaborator

What is most used might change between users and what they are trying to achieve. I've always seen TopoStats as a batch processing tool that produces standardised output for work in progress and expected that where bespoke images need customisation they would be plotted outside of this batch processing framework.

This is true, hopefully the upcoming training sessions may help identify the needs of the 'general' users which may feedback to usability development moving forward.

@ns-rse
Copy link
Collaborator Author

ns-rse commented Jan 19, 2024

Would be great if we could get a few more eyes on this to checkout that it works as people would like it to please.

Copy link
Collaborator

@MaxGamill-Sheffield MaxGamill-Sheffield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my testing, I see that the defaults are 'nulls' for the configuration field output to the terminal which could be confusing. Could we move the references to mpl.rcParams from plottingfuncs.py into run_topostats.py or the utils.update_plotting_config() so that the dictionaries are updated earlier, and thus the defaults are reflected in terminal output?
Screenshot 2024-01-22 at 10 26 31

Additionally, none of the matplotlib colourmaps are available due to the validation restriction. I propose we swap this to just str, make it clear in the config file of any additional ones we've added + mpl ones, and let mpl deal with the error when it hits a plot . I'll add these suggestions during the code review. The alternative is to label all mpl cmaps here and keep up-to-date with any additional ones they add which may be tedious.

Also I think we should mention in the config file that the savfig_dpi affects all figures and thus having them too low may affect the traces seen on the dnatracing images (maybe one for after the cats branch merge though).

Other than that it all seems to work well, thanks Neil.

topostats/topostats.mplstyle Outdated Show resolved Hide resolved
topostats/validation.py Outdated Show resolved Hide resolved
docs/configuration.md Outdated Show resolved Hide resolved
ns-rse and others added 4 commits January 22, 2024 11:52
Co-authored-by: Max Gamill <91465918+MaxGamill-Sheffield@users.noreply.github.com>
Co-authored-by: Max Gamill <91465918+MaxGamill-Sheffield@users.noreply.github.com>
Co-authored-by: Max Gamill <91465918+MaxGamill-Sheffield@users.noreply.github.com>
@ns-rse
Copy link
Collaborator Author

ns-rse commented Jan 22, 2024

Thanks for the feedback @MaxGamill-Sheffield

I hadn't thought about None being listed in the completion_message() for DPI/Output Format/cmap and appreciate this
would be confusing so thanks for highlighting that. The solution I've gone for (removing the additions that reported
these) is different from that suggested (update the config dictionary with parameters from mpl.rcParams early in
processing). My reasoning being...

  • Previously we didn't report these, no one has ever asked to see them in the logging output.
  • We write configuration options to YAML file via write_yaml() at the end of processing. Its a verbatim copy of that
    which was used (either user specified or default_config.yaml)and it contains the settings used. If a user didn't
    specify DPI/cmap/format I'm not sure we should alter this. It could be argued it is useful to provide them but then
    that would also require writing all other configuration/plotting options to be consistent should the default values
    ever change in the future.

Currently only a handful of parameters are read from topostats.mplstyle and this is conditional on
whether any of these are being over-ridden or not when instantiating the Images() class. Currently we

plottingfuncs.Images() > load_mplstyle() > Set DPI/cmap/format based on arguments to Images()

It is certainly possible to change this process as suggested and..

load_mplstyle() > Update DPI/cmap/format > plottingfuncs.Images()

...but that is a larger amount of work to undertake and introduces scope drift to this PR. If it is desirable to report this information in logging and/or ensure the configuration file that is written contains the default parameters from topostats.mplstyle then we can address that as a separate issue.

One challenge there might be is that as TopoStats evolves the configuration files will become dated (e.g. new options may be added, some may be removed, defaults might change). That said specific versions of TopoStats should always work with an associated configuration file, so perhaps we need to give this more consideration and include TopoStats versions in configs too to ensure results are reproducible. 🤔 🤯

Thanks for the feedback @MaxGamil-Sheffield this commit...

+ Loosens validation of `cmap` so that any Matplotlib color map can be used.
+ Removes reporting of DPI/output format/cmap from early logging stages and output of `completion_message()`.

I hadn't thought about `None` being listed in the `completion_message()` for DPI/Output Format/cmap and appreciate this
would be confusing so thanks for highlighting that. The solution I've gone for (removing the additions that reported
these) is different from that suggested (update the `config` dictionary with parameters from `mpl.rcParams` early in
processing). My reasoning being...

+ Previously we didn't report these, no one has ever asked to see them in the logging output.
+ We write configuration options to YAML file via `write_yaml()` at the end of processing. Its a verbatim copy of that
  which was used (either user specified or `default_config.yaml`)and it contains the settings used. If a user didn't
  specify DPI/cmap/format I'm not sure we should alter this. It could be argued it is useful to provide them but then
  that would also require writing _all_ other configuration/plotting options to be consistent should the default values
  ever change in the future.

Currently only a handful of parameters are read from `topostats.mplstyle` and this is conditional on
whether any of these are being over-ridden or not when instantiating the `Images()` class. Currently we

```
plottingfuncs.Images() > load_mplstyle() > Set DPI/cmap/format based on arguments to Images()
```

It is certainly possible to change this process as suggested and..

```
load_mplstyle() > Update DPI/cmap/format > plottingfuncs.Images()
```

...but that is a larger amount of work to undertake and introduces scope drift to this PR. If it is desirable to report
this information in logging and/or ensure the configuration file that is written contains the default parameters from
`topostats.mplstyle` then we can address that as a separate issue.
Copy link
Collaborator

@MaxGamill-Sheffield MaxGamill-Sheffield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good thanks Neil!

Just a suggestion to the docs and 2 other things I've just forgotten while writing this. Lmk what you think

topostats/validation.py Outdated Show resolved Hide resolved
topostats/validation.py Show resolved Hide resolved
docs/configuration.md Outdated Show resolved Hide resolved
+ Correctly details in validation the values for `figure` in plotting.
+ Details what the `core` set outputs.

I've not added the request to add links in validation output to Matplotlib cmap as links already exist in the
documentation and I would expect if someone wishes to use a particular colormap here they would already be aware of what
the options are.
Copy link
Collaborator

@MaxGamill-Sheffield MaxGamill-Sheffield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go! And sorry I didn't see the Newsletter comment until just now.

@ns-rse ns-rse added this pull request to the merge queue Jan 25, 2024
Merged via the queue into main with commit 6900c06 Jan 25, 2024
13 checks passed
@ns-rse ns-rse deleted the ns-rse/776-config-jigging branch January 25, 2024 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move some params back to config file
3 participants