Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move some params back to config file #776

Closed
MaxGamill-Sheffield opened this issue Jan 11, 2024 · 4 comments · Fixed by #777
Closed

Move some params back to config file #776

MaxGamill-Sheffield opened this issue Jan 11, 2024 · 4 comments · Fixed by #777
Labels
configuration enhancement New feature or request Low Priority Plotting Issues pertaining to the plotting class

Comments

@MaxGamill-Sheffield
Copy link
Collaborator

Is your feature request related to a problem?

Some commonly used parameters are found in the "intimidating" .mplstyle file and it can be hard to locate these due to it's size.

Describe the solution you would like.

Users have requested that the image colourmap, savefig.format and savefig.dpi return back to the original config file

Even better might be if the dpi can be moved to the plotting config so it can be done on a per image basis making the plotting quicker as very high dpi's only need to be done for certain images e.g. dnatracing images.

Describe the alternatives you have considered.

No response

Additional context

No response

@MaxGamill-Sheffield MaxGamill-Sheffield added the enhancement New feature or request label Jan 11, 2024
@ns-rse
Copy link
Collaborator

ns-rse commented Jan 11, 2024

The easy way to navigate large files is Ctrl + f to "find" the term you want to modify, takes a bit of getting used to but is much quicker than having to read everything.

Users have requested that the image colourmap, savefig.format and savefig.dpi return back to the original config file

  • Only the colourmap was removed in the PR that added this.
  • savefig.format is still in the default_config.yaml under plotting.save_format. Perhaps better alignment of values in default_config.yaml with their equivalent values in topostats.mplstyle might be useful and maybe this isn't currently working as a consequence.
  • The DPI was never configurable in the plotting section of default_config.yaml. The DPI settings are actually set on a per-image basis via plotting.

Reference changes to default_config.yaml

@ns-rse ns-rse added Low Priority Plotting Issues pertaining to the plotting class labels Jan 11, 2024
@MaxGamill-Sheffield
Copy link
Collaborator Author

MaxGamill-Sheffield commented Jan 11, 2024

The easy way to navigate large files is Ctrl + f to "find" the term you want to modify, takes a bit of getting used to but is much quicker than having to read everything.

Yep but the "color" being americanised and different to cmap which was used before made this counter intuitive for the users, making the search function unhelpful and lots of scrolling required.

  • Only the colourmap was removed in the PR that added this.
  • savefig.format is still in the default_config.yaml under plotting.save_format. Perhaps better alignment of values in default_config.yaml with their equivalent values in topostats.mplstyle might be useful and maybe this isn't currently working as a consequence.

My bad, I was checking for uncommented lines in the .mplstyle file. I haven't yet tested if these are overrided when set differently in the config file, but I think it would make sense to have them in either file not both, for clarity in this situation.

  • The DPI was never configurable in the plotting section of default_config.yaml. The DPI settings are actually set on a per-image basis via plotting.

This is what I meant, apologies if I didn't make that clear. I also hadn't realised this change had been made so great job :)

@ns-rse
Copy link
Collaborator

ns-rse commented Jan 11, 2024

Not much we can do about Americanisation, perhaps make a PR against Matplotlib to include multiple options (the R package ggplot supports both American and Anglican spellings so its not beyond the realm of possibility but considerable work and probably discussed many times already).

I think it would make sense to have them in either file not both, for clarity in this situation.

I'm of the opposite mind, sensible defaults in the .mplstyle that can be over-written, just as we can over-ride individual settings in default_config.yaml via command line options if users specify them. Making these defaults modifiable by both the default_config.yaml and a command line option is possible.

Part of the reason for including a complete Matplotlib style file was so that everything was there up-front as initially it was just a couple of font sizes, and I could forsee over time more and more tweaks/modifications being requested/added and so it made sense to put everything there and make it configurable.

I think it was when you were setting high-DPI values for plotting traces that the impact on speed arose and we opted to set on a per-image basis.

@ns-rse
Copy link
Collaborator

ns-rse commented Jan 12, 2024

Investigation...

DPI

In #707 DPI settings untouched and they are loaded from topostats/plotting_dictionary.yaml and appended to the dictionary config["plotting"]["plot_dict"]. The image specific settings have the global parameters (which come from the plotting section of topostats/default_config.yaml) added to their dictionary (see the [call]( config["plotting"] = update_plotting_config(config["plotting"]) to the topostats.utils.update_plotting_config()).

When processing each image that is to saved (controlled by the option image_set in topostats/default_config.yaml) uses the specific configuration deriving from topostats/plotting_dictionary.yaml which included the DPI and has been updated with all other arguments. These are passed in to each stages class (Filter, Grains, Grainstats, Tracing) as a dictionary and used whenever the topostats.plottingfuncs.Images() class is instantiated so that the image specific options are passed in correctly.

This class has many options when it is initialised (see __init__). #707 changed these from defaults explicitly defined in the class definition to be None and if they remain None the values from topostats/topostats.mplstyle are used instead.

        cmap: str | None = None,
        ...
        save_format: str = None,
        ...
        dpi: str | float | None = None,

The values are

Lines 194, 203 and 206 read...

        cmap = mpl.rcParams["image.cmap"] if cmap is None else cmap
        ...
        self.save_format = mpl.rcParams["savefig.format"] if save_format is None else save_format
        ...
        self.dpi = mpl.rcParams["savefig.dpi"] if dpi is None else dpi

For DPI this means that because the values from topostats/plotting_dictionary.yaml are loaded and passed as these the values in the image specific dictionary that are passed in as **kwargs.

To which end I think...

Even better might be if the dpi can be moved to the plotting config so it can be done on a per image basis making the plotting quicker as very high dpi's only need to be done for certain images e.g. dnatracing images.

...is already the case, although I've not explicitly tested this yet (but will in due course).

This leads us on to the other options...

savefig.format / save_format

This wasn't removed from the options in #707 and as per above it should carry through from the topostats/default_config.yaml to the image specific settings that are passed to topostats.plottingfuncs.Image() class where, because they are explicitly included in the dictionary for each image that is passed in as **kwargs and used in favour over the values defined in topostats/topostats.mplsytle as per...

        self.save_format = mpl.rcParams["savefig.format"] if save_format is None else save_format

Which leaves us with...

cmap

This was removed in #707 (along with histogram_bins) where it has used the variable name cmap (rather than colourmap or colormap) for some time.

As with DPI and savefig.format/save_format it can be reinstated and because of the "magic" of **kwargs taking an updated dictionary should then make this configurable.

General thoughts/comments

Reinstating cmap returns configuration to be on a parity with the previous state though and for convenience I think it would be relatively easy to add a command line option for these three parameters as the topostats/default_config.yaml is already updated on loading with command line options.

Having all options available in a matplotlibrc file gives the maximum flexibility without the need to write what could easily grow into a large amount of boilerplate code and I'm wary of making everything configurable via such options as it increases the code base and defeats the intention of introducing the topostats/topostats.mplstyle file for doing so. Plots/figures have so many options that they are often most suited to specific plotting and saving in Notebooks, particularly when something bespoke for publications is required (although with tools like Quarto a literate approach to producing slides and papers can be taken so that images are updated inline of manuscripts/slides).

Documentation

An attempt to document a lot of this complexity was introduced in #773 which was merged last week and is available on the documentation website for the latest version under Configuration : Matplotlib Style.

I'll revise this to try and make it clearer in light of the above investigations.

ns-rse added a commit that referenced this issue Jan 12, 2024
Closes #776

+ Aligns command line and configuration field names with those in matplotlibrc files.
+ Restores the `cmap` configuration option to `default_config.yaml` and introduces `savefig_dpi` option.
+ Adds command line options for setting DPI (`--savefig-dpi`), Colormap (`--cmap`)and Output file
  format (`--savefig-format`).
+ Expands documentation on how to use custom configuration files or command line options to set the DPI/Colormap/Output
  format.
+ Updates the header to `topostats.mplstyle` to explain how to use it as typically users will have created a copy of the
  file (after the convenience function `topostats create-matplotlibrc` was introduced with #773).
+ To achieve this the dictionary `config["plotting"]` needed explicitly updating as the `update_config()` function
  doesn't update nested configurations (since this is the first PR that introduces command line options that modify any
  of the values in the nested dictionaries).
+ Updates options for `topostats toposum`` to align with `savefig_format` and adds flag to entry point so output format
  is consistent.
+ Updates and expands the configuration documentation explaining how to use these conveniences.

As a consequence quite a few files are touched to ensure that validation and processing functions all have variables
that align with those in the configuration.

If users could test this it would be very much appreciated, if you use the Git installed version something like the
following would switch branches and allow you test it.

```
conda create --name topostats-config # Create and activate a virtual env specific to this
conda activate topostats-config
cd ~/path/to/TopoStats
git pull
git checkout ns-rse/776-config-jigging
pip install -e .
topostats process --output-dir base
topostats create-config test_config.yaml   # Create test_config.yaml to try changing parameters
topostats process --config test_config.yaml --output-dir test1
topostats process  --output-dir test2 --savefig-dpi 10 --cmap rainbow --savefig-format svg
topostats process --config test_config.yaml  --output-dir test3 --savefig-dpi 80 --cmap viridis --savefig-format pdf
```

Each invocation of `topostats process` will save output to its own directory (either `base`, `test1`, `test2` and
`test3`) for comparison. There should be differences between each `base` the values used in `test_config.yaml` and
saved under `test1` and those under `test2` and `test3` should also differ.

I would really appreciate feedback on the documentation as without clear documentation it is perhaps confusing how the
components interact and work and can be modified and getting this as clear as possible will be really helpful.
ns-rse added a commit that referenced this issue Jan 12, 2024
Closes #776

+ Aligns command line and configuration field names with those in matplotlibrc files.
+ Restores the `cmap` configuration option to `default_config.yaml` and introduces `savefig_dpi` option.
+ Adds command line options for setting DPI (`--savefig-dpi`), Colormap (`--cmap`)and Output file
  format (`--savefig-format`).
+ Expands documentation on how to use custom configuration files or command line options to set the DPI/Colormap/Output
  format.
+ Updates the header to `topostats.mplstyle` to explain how to use it as typically users will have created a copy of the
  file (after the convenience function `topostats create-matplotlibrc` was introduced with #773).
+ To achieve this the dictionary `config["plotting"]` needed explicitly updating as the `update_config()` function
  doesn't update nested configurations (since this is the first PR that introduces command line options that modify any
  of the values in the nested dictionaries).
+ Updates options for `topostats toposum`` to align with `savefig_format` and adds flag to entry point so output format
  is consistent.
+ Updates and expands the configuration documentation explaining how to use these conveniences.

As a consequence quite a few files are touched to ensure that validation and processing functions all have variables
that align with those in the configuration.

If users could test this it would be very much appreciated, if you use the Git installed version something like the
following would switch branches and allow you test it.

```
conda create --name topostats-config # Create and activate a virtual env specific to this
conda activate topostats-config
cd ~/path/to/TopoStats
git pull
git checkout ns-rse/776-config-jigging
pip install -e .
topostats process --output-dir base
topostats create-config test_config.yaml   # Create test_config.yaml to try changing parameters
topostats process --config test_config.yaml --output-dir test1
topostats process  --output-dir test2 --savefig-dpi 10 --cmap rainbow --savefig-format svg
topostats process --config test_config.yaml  --output-dir test3 --savefig-dpi 80 --cmap viridis --savefig-format pdf
```

Each invocation of `topostats process` will save output to its own directory (either `base`, `test1`, `test2` and
`test3`) for comparison. There should be differences between each `base` the values used in `test_config.yaml` and
saved under `test1` and those under `test2` and `test3` should also differ.

I would really appreciate feedback on the documentation as without clear documentation it is perhaps confusing how the
components interact and work and can be modified and getting this as clear as possible will be really helpful.
ns-rse added a commit that referenced this issue Jan 12, 2024
Closes #776

+ Aligns command line and configuration field names with those in matplotlibrc files.
+ Restores the `cmap` configuration option to `default_config.yaml` and introduces `savefig_dpi` option.
+ Adds command line options for setting DPI (`--savefig-dpi`), Colormap (`--cmap`)and Output file
  format (`--savefig-format`).
+ Expands documentation on how to use custom configuration files or command line options to set the DPI/Colormap/Output
  format.
+ Updates the header to `topostats.mplstyle` to explain how to use it as typically users will have created a copy of the
  file (after the convenience function `topostats create-matplotlibrc` was introduced with #773).
+ To achieve this the dictionary `config["plotting"]` needed explicitly updating as the `update_config()` function
  doesn't update nested configurations (since this is the first PR that introduces command line options that modify any
  of the values in the nested dictionaries).
+ Updates options for `topostats toposum`` to align with `savefig_format` and adds flag to entry point so output format
  is consistent.
+ Updates and expands the configuration documentation explaining how to use these conveniences.

As a consequence quite a few files are touched to ensure that validation and processing functions all have variables
that align with those in the configuration.

If users could test this it would be very much appreciated, if you use the Git installed version something like the
following would switch branches and allow you test it.

```
conda create --name topostats-config # Create and activate a virtual env specific to this
conda activate topostats-config
cd ~/path/to/TopoStats
git pull
git checkout ns-rse/776-config-jigging
pip install -e .
topostats process --output-dir base
topostats create-config test_config.yaml   # Create test_config.yaml to try changing parameters
topostats process --config test_config.yaml --output-dir test1
topostats process  --output-dir test2 --savefig-dpi 10 --cmap rainbow --savefig-format svg
topostats process --config test_config.yaml  --output-dir test3 --savefig-dpi 80 --cmap viridis --savefig-format pdf
```

Each invocation of `topostats process` will save output to its own directory (either `base`, `test1`, `test2` and
`test3`) for comparison. There should be differences between each `base` the values used in `test_config.yaml` and
saved under `test1` and those under `test2` and `test3` should also differ.

I would really appreciate feedback on the documentation as without clear documentation it is perhaps confusing how the
components interact and work and can be modified and getting this as clear as possible will be really helpful.
ns-rse added a commit that referenced this issue Jan 12, 2024
Closes #776

+ Aligns command line and configuration field names with those in matplotlibrc files.
+ Restores the `cmap` configuration option to `default_config.yaml` and introduces `savefig_dpi` option.
+ Adds command line options for setting DPI (`--savefig-dpi`), Colormap (`--cmap`)and Output file
  format (`--savefig-format`).
+ Expands documentation on how to use custom configuration files or command line options to set the DPI/Colormap/Output
  format.
+ Updates the header to `topostats.mplstyle` to explain how to use it as typically users will have created a copy of the
  file (after the convenience function `topostats create-matplotlibrc` was introduced with #773).
+ To achieve this the dictionary `config["plotting"]` needed explicitly updating as the `update_config()` function
  doesn't update nested configurations (since this is the first PR that introduces command line options that modify any
  of the values in the nested dictionaries).
+ Updates options for `topostats toposum`` to align with `savefig_format` and adds flag to entry point so output format
  is consistent.
+ Updates and expands the configuration documentation explaining how to use these conveniences.

As a consequence quite a few files are touched to ensure that validation and processing functions all have variables
that align with those in the configuration.

If users could test this it would be very much appreciated, if you use the Git installed version something like the
following would switch branches and allow you test it.

```
conda create --name topostats-config # Create and activate a virtual env specific to this
conda activate topostats-config
cd ~/path/to/TopoStats
git pull
git checkout ns-rse/776-config-jigging
pip install -e .
topostats process --output-dir base
topostats create-config test_config.yaml   # Create test_config.yaml to try changing parameters
topostats process --config test_config.yaml --output-dir test1
topostats process  --output-dir test2 --savefig-dpi 10 --cmap rainbow --savefig-format svg
topostats process --config test_config.yaml  --output-dir test3 --savefig-dpi 80 --cmap viridis --savefig-format pdf
```

Each invocation of `topostats process` will save output to its own directory (either `base`, `test1`, `test2` and
`test3`) for comparison. There should be differences between each `base` the values used in `test_config.yaml` and
saved under `test1` and those under `test2` and `test3` should also differ.

I would really appreciate feedback on the documentation as without clear documentation it is perhaps confusing how the
components interact and work and can be modified and getting this as clear as possible will be really helpful.
ns-rse added a commit that referenced this issue Jan 12, 2024
Closes #776

+ Aligns command line and configuration field names with those in matplotlibrc files.
+ Restores the `cmap` configuration option to `default_config.yaml` and introduces `savefig_dpi` option.
+ Adds command line options for setting DPI (`--savefig-dpi`), Colormap (`--cmap`)and Output file
  format (`--savefig-format`).
+ Expands documentation on how to use custom configuration files or command line options to set the DPI/Colormap/Output
  format.
+ Updates the header to `topostats.mplstyle` to explain how to use it as typically users will have created a copy of the
  file (after the convenience function `topostats create-matplotlibrc` was introduced with #773).
+ To achieve this the dictionary `config["plotting"]` needed explicitly updating as the `update_config()` function
  doesn't update nested configurations (since this is the first PR that introduces command line options that modify any
  of the values in the nested dictionaries).
+ Updates options for `topostats toposum`` to align with `savefig_format` and adds flag to entry point so output format
  is consistent.
+ Updates and expands the configuration documentation explaining how to use these conveniences.

As a consequence quite a few files are touched to ensure that validation and processing functions all have variables
that align with those in the configuration.

If users could test this it would be very much appreciated, if you use the Git installed version something like the
following would switch branches and allow you test it.

```
conda create --name topostats-config # Create and activate a virtual env specific to this
conda activate topostats-config
cd ~/path/to/TopoStats
git pull
git checkout ns-rse/776-config-jigging
pip install -e .
topostats process --output-dir base
topostats create-config test_config.yaml   # Create test_config.yaml to try changing parameters
topostats process --config test_config.yaml --output-dir test1
topostats process  --output-dir test2 --savefig-dpi 10 --cmap rainbow --savefig-format svg
topostats process --config test_config.yaml  --output-dir test3 --savefig-dpi 80 --cmap viridis --savefig-format pdf
```

Each invocation of `topostats process` will save output to its own directory (either `base`, `test1`, `test2` and
`test3`) for comparison. There should be differences between each `base` the values used in `test_config.yaml` and
saved under `test1` and those under `test2` and `test3` should also differ.

I would really appreciate feedback on the documentation as without clear documentation it is perhaps confusing how the
components interact and work and can be modified and getting this as clear as possible will be really helpful.
ns-rse added a commit that referenced this issue Feb 6, 2024
Closes #776

+ Aligns command line and configuration field names with those in matplotlibrc files.
+ Restores the `cmap` configuration option to `default_config.yaml` and introduces `savefig_dpi` option.
+ Adds command line options for setting DPI (`--savefig-dpi`), Colormap (`--cmap`)and Output file
  format (`--savefig-format`).
+ Expands documentation on how to use custom configuration files or command line options to set the DPI/Colormap/Output
  format.
+ Updates the header to `topostats.mplstyle` to explain how to use it as typically users will have created a copy of the
  file (after the convenience function `topostats create-matplotlibrc` was introduced with #773).
+ To achieve this the dictionary `config["plotting"]` needed explicitly updating as the `update_config()` function
  doesn't update nested configurations (since this is the first PR that introduces command line options that modify any
  of the values in the nested dictionaries).
+ Updates options for `topostats toposum`` to align with `savefig_format` and adds flag to entry point so output format
  is consistent.
+ Updates and expands the configuration documentation explaining how to use these conveniences.

As a consequence quite a few files are touched to ensure that validation and processing functions all have variables
that align with those in the configuration.

If users could test this it would be very much appreciated, if you use the Git installed version something like the
following would switch branches and allow you test it.

```
conda create --name topostats-config # Create and activate a virtual env specific to this
conda activate topostats-config
cd ~/path/to/TopoStats
git pull
git checkout ns-rse/776-config-jigging
pip install -e .
topostats process --output-dir base
topostats create-config test_config.yaml   # Create test_config.yaml to try changing parameters
topostats process --config test_config.yaml --output-dir test1
topostats process  --output-dir test2 --savefig-dpi 10 --cmap rainbow --savefig-format svg
topostats process --config test_config.yaml  --output-dir test3 --savefig-dpi 80 --cmap viridis --savefig-format pdf
```

Each invocation of `topostats process` will save output to its own directory (either `base`, `test1`, `test2` and
`test3`) for comparison. There should be differences between each `base` the values used in `test_config.yaml` and
saved under `test1` and those under `test2` and `test3` should also differ.

I would really appreciate feedback on the documentation as without clear documentation it is perhaps confusing how the
components interact and work and can be modified and getting this as clear as possible will be really helpful.
ns-rse added a commit that referenced this issue Feb 6, 2024
Closes #776

+ Aligns command line and configuration field names with those in matplotlibrc files.
+ Restores the `cmap` configuration option to `default_config.yaml` and introduces `savefig_dpi` option.
+ Adds command line options for setting DPI (`--savefig-dpi`), Colormap (`--cmap`)and Output file
  format (`--savefig-format`).
+ Expands documentation on how to use custom configuration files or command line options to set the DPI/Colormap/Output
  format.
+ Updates the header to `topostats.mplstyle` to explain how to use it as typically users will have created a copy of the
  file (after the convenience function `topostats create-matplotlibrc` was introduced with #773).
+ To achieve this the dictionary `config["plotting"]` needed explicitly updating as the `update_config()` function
  doesn't update nested configurations (since this is the first PR that introduces command line options that modify any
  of the values in the nested dictionaries).
+ Updates options for `topostats toposum`` to align with `savefig_format` and adds flag to entry point so output format
  is consistent.
+ Updates and expands the configuration documentation explaining how to use these conveniences.

As a consequence quite a few files are touched to ensure that validation and processing functions all have variables
that align with those in the configuration.

If users could test this it would be very much appreciated, if you use the Git installed version something like the
following would switch branches and allow you test it.

```
conda create --name topostats-config # Create and activate a virtual env specific to this
conda activate topostats-config
cd ~/path/to/TopoStats
git pull
git checkout ns-rse/776-config-jigging
pip install -e .
topostats process --output-dir base
topostats create-config test_config.yaml   # Create test_config.yaml to try changing parameters
topostats process --config test_config.yaml --output-dir test1
topostats process  --output-dir test2 --savefig-dpi 10 --cmap rainbow --savefig-format svg
topostats process --config test_config.yaml  --output-dir test3 --savefig-dpi 80 --cmap viridis --savefig-format pdf
```

Each invocation of `topostats process` will save output to its own directory (either `base`, `test1`, `test2` and
`test3`) for comparison. There should be differences between each `base` the values used in `test_config.yaml` and
saved under `test1` and those under `test2` and `test3` should also differ.

I would really appreciate feedback on the documentation as without clear documentation it is perhaps confusing how the
components interact and work and can be modified and getting this as clear as possible will be really helpful.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
configuration enhancement New feature or request Low Priority Plotting Issues pertaining to the plotting class
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants