Normalized Taylor Diagram #214

Merged
merged 28 commits into from
Jun 19, 2024

Conversation

coxipi
Contributor

@coxipi coxipi commented May 28, 2024

Pull Request Checklist:

  • This PR addresses an already opened issue (for bug fixes / features)
    • This PR fixes #xyz
  • (If applicable) Documentation has been added / updated (for bug fixes / features).
  • (If applicable) Tests have been added.
  • CHANGELOG.rst has been updated (with summary of main changes).
    • Link to issue (:issue:number) and pull request (:pull:number) has been added.

What kind of change does this PR introduce?

  • A new type of Taylor diagram with normalized standard deviation is added

Does this PR introduce a breaking change?

  • Yes, fg.taylordiagram would return fig, floating_ax, legend instead of just floating_ax

Other information:

See fig.6 Cannon, 2018

Since the std is transformed to be unitless (divided by the reference std), we can now compare taylor diagrams which have a different reference. The code accepts datasets with supplementary dimensions (max. 2).
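As a rough illustration of the normalization described above, the sketch below divides both standard deviations by the reference standard deviation, so the reference always sits at 1 and points from different references share one axis. The function name and the sample numbers are made up for the example; this is not the figanos implementation.

```python
def normalize_taylor_stats(ref_std, sim_std, corr):
    """Return (ref_std, sim_std, corr) with both stds divided by ref_std.

    Hypothetical helper for illustration only.
    """
    if ref_std <= 0:
        raise ValueError("ref_std must be positive to normalize")
    # The correlation is already unitless, so it is left untouched.
    return 1.0, sim_std / ref_std, corr


# Two made-up comparisons with very different reference scales.
temperature = normalize_taylor_stats(ref_std=4.0, sim_std=5.0, corr=0.9)
precipitation = normalize_taylor_stats(ref_std=0.002, sim_std=0.0015, corr=0.6)

print(temperature)
print(precipitation)
```

After normalization both points have a reference at 1, so they can be drawn on the same diagram even though the raw standard deviations differ by orders of magnitude.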

Regarding the breaking change, it could be avoided. But, as things are right now, it's hard to modify the plot with floating_ax alone, because fig.legend() is used to generate the legend. I think it makes sense for fg.taylordiagram to return more objects, regardless of fg.normalized_taylordiagram.

Also, fg.normalized_taylordiagram could simply be an option of fg.taylordiagram. But I wanted to allow a new kind of DataArray with this plotting scheme, that is, a da with dimensions taylor_param, [dim_1], [[dim_2]], so up to two additional dimensions can be given. I just didn't want to add another type of data and more options to the simpler method fg.taylordiagram, but it might be better to keep this all in one place.


github-actions bot commented May 28, 2024

Welcome, new contributor!

It appears that this is your first Pull Request. To give credit where it's due, we ask that you add your information to the AUTHORS.rst and .zenodo.json:

  • The relevant author information has been added to AUTHORS.rst and .zenodo.json.

Please make sure you've read our contributing guide. We look forward to reviewing your Pull Request shortly ✨

@coxipi
Contributor Author

coxipi commented May 28, 2024

A small artificial example:

import figanos.matplotlib as fg
import xarray as xr
from xclim import sdba
from xclim.testing import open_dataset

ds = open_dataset("sdba/CanESM2_1950-2100.nc").sel(time=slice("1950", "2013"))
ds2 = open_dataset("sdba/ahccd_1950-2013.nc")
ds["pr"] = ds2["tasmax"] 
ds2["pr"] = ds["tasmax"] 
da = sdba.stack_variables(ds.rename({"pr":"tasmax_2"}))
da2 = sdba.stack_variables(ds2.rename({"pr":"tasmax_2"}))
out = sdba.measures.taylordiagram(ref=da, sim=da2, dim="time")
fg.normalized_taylordiagram(out, markers_dim="location", colors_dim="multivar")

image

or without any particular indication for markers/colors, which is more similar to current fg.taylordiagram, with fg.normalized_taylordiagram(out, markers_dim="location")

image

The distribution of the data looks weird because of the artificial example I took; I can show a more realistic one later.

Comment on lines +2027 to +2036
# points.append(ct_line)
ct_line = ax.plot(
[0],
[0],
ls=contours_kw["linestyles"],
lw=1,
c="k" if "colors" not in contours_kw else contours_kw["colors"],
label="rmse",
)
points.append(ct_line)
points.append(ct_line[0])
Contributor Author


Without this change, the text appears in the legend but, oddly, it can't be accessed with methods like .get_legend_handles_labels.
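For context, a small standalone sketch of the matplotlib behaviour that the ct_line[0] indexing relies on: ax.plot returns a list of Line2D objects, so the legend handle is the first element of that list rather than the list itself. The figure and legend here are a stand-in for illustration, not the figanos code.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

fig, ax = plt.subplots()
ct_line = ax.plot([0], [0], ls="--", lw=1, c="k", label="rmse")

# ax.plot returns a list, even when a single line is drawn.
returned_is_list = isinstance(ct_line, list)
first_is_line = isinstance(ct_line[0], Line2D)

# Passing the Line2D itself as a handle lets the legend read its label.
handles = [ct_line[0]]
legend = fig.legend(handles=handles)
legend_labels = [text.get_text() for text in legend.get_texts()]

plt.close(fig)
```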

@sarahclaude
Collaborator

Thanks Eric for the new plotting function!

Do you think it would be possible to incorporate normalized_taylordiagram into taylordiagram by adding an argument normalized=True/False and allowing multi-dimensional data in taylordiagram, or are the two functions too different?

@coxipi
Contributor Author

coxipi commented May 30, 2024

  • If there are extra dimensions (other than taylor_param), fg.taylordiagram creates more entries in the dictionary: every distinct coordinate gets its own key. E.g. with data = {"da1":da1, "da2":da2}, where da1 and da2 have an extra loc dim, this becomes:
data = {
    "da1_loc1": da1.sel(loc="loc1"),
    "da1_loc2": da1.sel(loc="loc2"),
    ...
    "da2_loc1": da2.sel(loc="loc1"),
    "da2_loc2": da2.sel(loc="loc2"),
    ...
}

and if it's not normalized, then we simply get the same error as before, stating that ref_std needs to be equal.

  • colors_key and markers_key indicate what the colors and markers distinguish. Each can be either a dimension in the arrays, e.g. markers_key="loc", or an attribute in the datasets (for instance, if the model is stored in an attribute such as da.attrs['cat:driving_model'], you can simply specify colors_key="cat:driving_model"). If nothing is specified, then each key in data gets its own marker, as before.
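The key expansion described in the first bullet can be sketched with plain dictionaries standing in for DataArrays. The helper name expand_entries and the tuples of (ref_std, sim_std, corr) values are hypothetical; the real function works on xarray objects and may build its keys differently.

```python
def expand_entries(data, dim_coords):
    """Flatten {name: {coord: value}} into {"name_coord": value}.

    Plain-dict stand-in for selecting each coordinate of an extra dimension.
    """
    expanded = {}
    for name, per_coord in data.items():
        for coord in dim_coords:
            expanded[f"{name}_{coord}"] = per_coord[coord]
    return expanded


# Made-up (ref_std, sim_std, corr) triplets for two arrays and two locations.
data = {
    "da1": {"loc1": (1.0, 0.8, 0.9), "loc2": (1.0, 1.1, 0.7)},
    "da2": {"loc1": (1.0, 0.9, 0.95), "loc2": (1.0, 1.3, 0.6)},
}
expanded = expand_entries(data, ["loc1", "loc2"])
print(list(expanded))
```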

remove normalized_taylordiagram mentions
@coxipi
Contributor Author

coxipi commented May 31, 2024

There are many ways to use the function now to achieve similar results:

import figanos.matplotlib as fg
from xclim.testing import open_dataset
from xclim import sdba
import xclim
import numpy as np
ds = open_dataset("sdba/CanESM2_1950-2100.nc").sel(time=slice("1950", "2013"))
ds2 = open_dataset("sdba/ahccd_1950-2013.nc")
for v in ds.data_vars: 
    ds2[v] = xclim.core.units.convert_units_to(ds2[v], ds[v], context="hydro")
da = sdba.stack_variables(ds)
da2 = sdba.stack_variables(ds2)
out = sdba.measures.taylordiagram(ref=da, sim=da2, dim="time")
# normalization
out[{"taylor_param":[0,1]}] = out[{"taylor_param":[0,1]}]/ out[{"taylor_param":0}]
# precip gives negative correlations; take the absolute value just for plotting purposes
out[{"taylor_param":2}] = np.abs(out[{"taylor_param":2}])

# 1, old way of organizing points, with dimensions
fg.taylordiagram(out)

# 2, new way of organizing points, with dimensions
fg.taylordiagram(out, colors_key="multivar", markers_key="location")

# 3,  old way of organizing points, with dict
outs = out.stack(dimm=list(set(out.dims) - set(["taylor_param"])))
dd = {"-".join(v): outs.sel(dimm=v) for v in outs.dimm.values}
fg.taylordiagram(dd)

# 4,  new way of organizing points, with dict
outs = out.stack(dimm=list(set(out.dims) - set(["taylor_param"])))
dd = {"-".join(v): outs.sel(dimm=v) for v in outs.dimm.values}
for k in dd.keys():
    dd[k].attrs["attr_multivar"] = dd[k]["multivar"].values.item()
    dd[k].attrs["attr_location"] = dd[k]["location"].values.item()
fg.taylordiagram(dd, colors_key="attr_multivar", markers_key="attr_location")

# 5,  old way of organizing points, mix dims/dict
dd = {v: out.sel(multivar=v) for v in out.multivar.values}
fg.taylordiagram(dd)
# 6,  new way of organizing points, mix dims/dict
dd = {v: out.sel(multivar=v) for v in out.multivar.values}
for k in dd.keys():
    dd[k].attrs["attr_multivar"] = dd[k]["multivar"].values.item()
fg.taylordiagram(dd, colors_key="attr_multivar", markers_key="location")

which all give:

old way of organizing points

image

new way of organizing points

image

@coxipi
Contributor Author

coxipi commented May 31, 2024

I know there's a "spiro.mplstyle" in figanos; should it be used by default for figanos plots? I just coded the color dimensions as "C0", "C1", ... but they seem to use the default matplotlib colors and not the default figanos colors.

@sarahclaude
Collaborator

sarahclaude commented May 31, 2024

If you set the ouranos matplotlib stylesheet at the start of your code, it should automatically use the figanos colors, as it overrides the matplotlib defaults unless you write the color directly in your code.

Call fg.utils.set_mpl_style('ouranos') before calling other fg functions.
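A short sketch of why this works for colors coded as "C0", "C1", ...: matplotlib resolves those names against the active rcParams["axes.prop_cycle"] at the time the color is converted, so a style that changes the cycle changes what they mean. The custom cycle below is made up for the example; it is not the figanos palette, and whether the ouranos style sets the cycle this way is an assumption.

```python
from cycler import cycler
from matplotlib import rc_context
from matplotlib.colors import to_hex

# Made-up colours standing in for a style sheet's colour cycle.
custom_cycle = cycler(color=["#ff0000", "#00ff00", "#0000ff"])

with rc_context({"axes.prop_cycle": custom_cycle}):
    # "C0" and "C1" are looked up in the cycle that is active right now.
    first_inside = to_hex("C0")
    second_inside = to_hex("C1")

print(first_inside, second_inside)
```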

@sarahclaude
Collaborator

You could also add your new example to the docs instead of the old taylordiagram!

@coxipi
Contributor Author

coxipi commented Jun 4, 2024

I've realized that the scaling of the angular axis is not consistent with what is shown in the literature (unrelated to my PR). For instance, in the literature a larger portion of the graph is devoted to values near correlation=1.

image

I used the same reference mentioned in our code, but I followed it more closely. We were using a relation like $\text{corr} = 1 - \frac{\theta}{\pi/2}$; now I'm using $\text{corr} = \cos{\theta}$.

Was this on purpose? I can say that for my own work currently, the latter formulation is more useful.
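To make the difference between the two relations concrete, here is a small sketch comparing the angle each one assigns to a given correlation. It also checks the standard law-of-cosines property of Taylor diagrams: with corr = cos(theta), the distance on the plot between a point and the reference equals the centred RMS difference. The function names are illustrative and not taken from figanos.

```python
import math


def theta_linear(corr):
    """Angle from the earlier relation corr = 1 - theta / (pi / 2)."""
    return (1.0 - corr) * math.pi / 2


def theta_arccos(corr):
    """Angle from the relation corr = cos(theta)."""
    return math.acos(corr)


def centered_rmse(ref_std, sim_std, corr):
    """Centred RMS difference from the law of cosines."""
    return math.sqrt(ref_std**2 + sim_std**2 - 2 * ref_std * sim_std * corr)


def polar_distance(ref_std, sim_std, theta):
    """Euclidean distance between (sim_std, theta) and the reference (ref_std, 0)."""
    dx = sim_std * math.cos(theta) - ref_std
    dy = sim_std * math.sin(theta)
    return math.hypot(dx, dy)


for corr in (0.0, 0.5, 0.9, 0.99, 1.0):
    linear_deg = math.degrees(theta_linear(corr))
    arccos_deg = math.degrees(theta_arccos(corr))
    print(f"corr={corr:4.2f}  linear={linear_deg:6.2f} deg  arccos={arccos_deg:6.2f} deg")
```

With the arccos mapping, correlations between 0.9 and 1 span a much wider angle than with the linear one, which matches the stretched region near correlation=1 seen in published diagrams.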

# plot
pt = ax.scatter(
plot_corr, da.sel(taylor_param="sim_std").values, **plot_kw[key]
np.arccos(da.sel(taylor_param="corr").values),
Contributor Author

@coxipi coxipi Jun 4, 2024


completes the comment with (*) above

@sarahclaude
Collaborator

No, I don't remember discussing it with Alexis when he made the function.

@coxipi
Contributor Author

coxipi commented Jun 5, 2024

I have a cheap way of putting multiple Taylor diagrams together: allow fig as an input, and also allow defining subplot_num. I don't think it's a nice way to do things, but I still wanted to showcase what a small tweak can achieve. I was using this for my project:

for iloc, loc in enumerate(out.location.values):
    # after first run, use the previous figure
    fig = None if iloc==0 else fig
    fig, ax,_ = fg.taylordiagram(out.isel(location=iloc), subplot_num=131+iloc, fig=fig, fig_kw={"figsize":(14,4)})
    ax.set_title(f"{loc}")

image


Collaborator

@sarahclaude sarahclaude left a comment


I would move the sections Taylor Diagram and Normalized taylor diagram to the figanos_doc.ipynb (you can remove the old section taylor diagrams).

You could also add your multi plot examples in figanos_multiplots
image

@coxipi
Contributor Author

coxipi commented Jun 11, 2024

(you can remove the old section taylor diagrams)

Oops, I missed that part on my first read. Well, I've modified the existing example: it's simpler and doesn't use random values, and I think it's clean as it is. I like the examples as they are now (in figanos_doc).

You could also add your multi plot examples in figanos_multiplots

This was just showcasing what I can do with my own custom implementation I was using for a project, I should have been clearer. I thought the way I implemented this was a bit too hacky for a serious release, just wanted to showcase what is possible with simple modifications.

fig, ax, leg = fg1.taylordiagram(out1, fig = None, subplot_num=131)
fig, ax, leg = fg1.taylordiagram(out2, fig = fig, subplot_num=132)
fig, ax, leg = fg1.taylordiagram(out3, fig = fig, subplot_num=133)

If fig, subplot_num are not specified, then we have the normal behaviour.
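The fig / subplot_num pattern above can be sketched in plain matplotlib. The helper plot_panel is hypothetical and draws an ordinary line plot rather than a Taylor diagram; it only shows how reusing a figure and passing a three-digit subplot code places successive panels side by side.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_panel(values, fig=None, subplot_num=111, fig_kw=None):
    """Hypothetical helper: draw on an existing figure, or create one when fig is None."""
    if fig is None:
        fig = plt.figure(**(fig_kw or {}))
    # 131, 132, 133 mean: 1 row, 3 columns, panel 1, 2, 3.
    ax = fig.add_subplot(subplot_num)
    ax.plot(values)
    return fig, ax


fig = None
for i, values in enumerate([[1, 2, 3], [3, 2, 1], [2, 2, 2]]):
    # fig is None on the first pass, then the same figure is reused.
    fig, ax = plot_panel(values, fig=fig, subplot_num=131 + i, fig_kw={"figsize": (14, 4)})
    ax.set_title(f"panel {i + 1}")

n_axes = len(fig.axes)
plt.close(fig)
```

When fig is left at None and subplot_num at 111, the helper falls back to a single full-figure panel, which mirrors the "normal behaviour" mentioned above.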

But come to think of it, I saw the warning:

warnings.warn("Only figsize and figure.add_subplot() arguments can be passed to fig_kw when using facetgrid.")

in some other functions. Is the figure.add_subplot() supposed to achieve something similar to what I'm doing above? Should

@sarahclaude
Collaborator

It works by passing the args to xarray facetgrid.add_subplot() and without facetgrid you can also send it to matplotlib.pyplot.figure.add_subplot(). So if I understood correctly, the end result would be the same, but your subplot_num is more evident and requires less nested dictionaries.

@coxipi
Contributor Author

coxipi commented Jun 14, 2024

I will try to add a proper FacetGrid support in another PR, this one is mostly ready to go.

I added the option of drawing a line at std == ref_std through a ref_std_line (True/False) argument. It takes the same color as the reference marker:

image

  • No legend entry for this new line. The rmse contours are a bit less evident, but in this case I think it's clear that the line represents ref_std, yes?
  • Color: same as the reference marker

If that's all right I'll merge the branch

@coxipi coxipi merged commit 290023d into main Jun 19, 2024
12 checks passed
@coxipi coxipi deleted the normalized_taylor branch June 19, 2024 13:38