Labels fix #304

williamjameshandley · 2023-06-28T09:57:50Z

Description

Following offline conversations with @lukashergt and @AdamOrmondroyd, this PR aims to resolve #253 and #236.

It upgrades labelled data frames so that the name attribute of a sliced LabelledSeries now holds the tex label, rather than the column key (or something else), which will prove useful in improving the plotting tools.

It achieves this by implementing the strategy "If the resulting accessed slice has a 'name' attribute, set this to the value of the label that was dropped".

For example, a frame with a MultiIndex column element ('vowel', 'A', '$A$') would return frame.vowel.A as a series with the name '$A$', rather than 'A' or ('vowel', 'A').
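
The renaming strategy can be illustrated with plain pandas. This is a minimal sketch rather than anesthetic's implementation: the two-level column layout (key, tex label), the level name "labels", the helper name `get_with_tex_name`, and the toy data are all assumptions made for illustration.

```python
import pandas as pd

# Toy frame whose columns carry (key, tex label) pairs; the level name
# "labels" is an assumption made for this sketch.
columns = pd.MultiIndex.from_tuples(
    [("A", "$A$"), ("B", "$B$")], names=[None, "labels"]
)
frame = pd.DataFrame([[1, 2], [3, 4]], columns=columns)


def get_with_tex_name(frame, key):
    """Hypothetical helper: slice by key, then name the result by its label."""
    # Remember which label belongs to which key before dropping the level.
    labels_map = {k: label for k, label in frame.columns}
    # Drop the label level so that ordinary key-based access works.
    plain = frame.droplevel("labels", axis=1)
    result = plain[key]
    # If the slice has a name, replace it with the label that was dropped.
    if isinstance(result, pd.Series):
        result = result.rename(labels_map.get(result.name, result.name))
    return result


series = get_with_tex_name(frame, "A")
print(series.name)
```

A plotting routine could then read `series.name` directly for a legend entry or axis label, without needing access to the parent frame.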

The tests have been updated to reflect this new behaviour, and new tests added to check that this occurs.

Checklist:

I have performed a self-review of my own code
My code is PEP8 compliant (flake8 anesthetic tests)
My code contains compliant docstrings (pydocstyle --convention=numpy anesthetic)
New and existing unit tests pass locally with my changes (python -m pytest)
I have added tests that prove my fix is effective or that my feature works
I have appropriately incremented the semantic version number in both README.rst and anesthetic/_version.py

williamjameshandley · 2023-06-28T10:01:16Z

anesthetic/labelled_pandas.py

+        return ac([(_LocIndexer_("loc",
+                                 super(_LabelledObject,
+                                       self.obj.drop_labels(i))
+                                 ).__getitem__,
+                    self.obj.get_labels_map(i))
+                   for i in self.obj._all_axes()], key)


There are several changes to all of the indexers/accessors:

_all_axes now returns None as well, to capture the extra edge case (which was previously tagged on as +[...]).

Crucially, we return the super object in the indexers (this is one of those 'how did that ever work?' observations).

We return a list of (function, labels_map) pairs, rather than just a list of functions.
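
The idea of trying a list of (function, labels_map) pairs in turn can be sketched as follows. This is a hedged illustration in plain pandas, not the code in anesthetic: the dispatcher name `access`, the choice of caught exceptions, and the toy label maps are assumptions.

```python
import pandas as pd


def access(pairs, key):
    """Hypothetical dispatcher: try each (getter, labels_map) pair in order."""
    for getter, labels_map in pairs:
        try:
            result = getter(key)
        except (KeyError, TypeError, ValueError):
            continue  # this axis cannot resolve the key, so try the next one
        # Only objects with a name (here: Series) are renamed.
        if isinstance(result, pd.Series):
            result = result.rename(labels_map.get(result.name, result.name))
        return result
    raise KeyError(key)


frame = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["r0", "r1"])
pairs = [
    # Row access (via the transpose) paired with a map of row labels.
    (frame.T.__getitem__, {"r0": "$r_0$"}),
    # Column access paired with a map of column labels.
    (frame.__getitem__, {"A": "$A$", "B": "$B$"}),
]

print(access(pairs, "A").name)
print(access(pairs, "r0").name)
```

Pairing each getter with its own map keeps the renaming step independent of which axis eventually resolved the key.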

codecov · 2023-06-28T10:04:14Z

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (8456b21) 100.00% compared to head (c714214) 100.00%.

Additional details and impacted files

@@            Coverage Diff            @@
##            master      #304   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           30        30           
  Lines         2667      2691   +24     
=========================================
+ Hits          2667      2691   +24

Impacted Files	Coverage Δ
anesthetic/_version.py	`100.00% <100.00%> (ø)`
anesthetic/labelled_pandas.py	`100.00% <100.00%> (ø)`
anesthetic/plotting/_matplotlib/core.py	`100.00% <100.00%> (ø)`
anesthetic/samples.py	`100.00% <100.00%> (ø)`

lukashergt

A few comments inline and two general points below.

> Following offline conversations with @lukashergt and @AdamOrmondroyd, this PR aims to resolve #253 and #256.

I believe the reference to "replace .boxplot() with .plot.box()" (#256) was actually supposed to be "New pandas.DataFrame.attrs feature possibly useful for metadata such as root or label" (#236), right?
This PR does not address the incorrect (non-tex) axes labels from "Retain latex label when slicing DataFrame to Series" (#253) in:
```
samples.plot.kde_2d('x0', 'x1')
```
Do we want this PR to address that issue as well?
Note that the other part of "Retain latex label when slicing DataFrame to Series" (#253) is resolved:
```
samples.x0.plot.kde()
samples.x1.plot.kde()
plt.legend()
```

williamjameshandley · 2023-06-28T21:42:56Z

> I believe the reference to "replace .boxplot() with .plot.box()" (#256) was actually supposed to be "New pandas.DataFrame.attrs feature possibly useful for metadata such as root or label" (#236), right?

yes : (

williamjameshandley · 2023-06-28T22:21:12Z

> This PR does not address the incorrect (non-tex) axes labels from "Retain latex label when slicing DataFrame to Series" (#253) in:
> samples.plot.kde_2d('x0', 'x1')
> Do we want this PR to address that issue as well?

Done in 3f1e2e6

williamjameshandley · 2023-06-28T22:21:31Z

I've also snuck in a resolution + tests for #297 in b223d1a

lukashergt · 2023-06-29T00:54:58Z

anesthetic/plotting/_matplotlib/core.py

-        x = data[self.x].values
-        y = data[self.y].values
+        x = data[self.x]
+        y = data[self.y]
+        self.x = x.name  # transfer the tex label
+        self.y = y.name  # transfer the tex label


This nicely works for labelled dataframes, but now unlabelled dataframes don't have any x- and y-labels anymore:

samples[['x0', 'x1']].drop_labels().plot_2d()

See also some suggested tests below.

I think this requires a similar check à la:

if x.name: self.x = x.name

i.e. only use x.name if it is not empty.

Never mind, that won't help...

I think we might need to ensure that samples.drop_labels().get_label('x0') does not return an empty string, but 'x0' instead...

~~done in aa715ab I think~~ no

> I think this requires a similar check à la:
>
> if x.name: self.x = x.name
>
> i.e. only use x.name if it is not empty.

I put this in and took it back out again now that name is always used when slicing, but maybe worth retaining anyway

tackled in 10a5b42

lukashergt · 2023-06-29T01:05:10Z

tests/test_samples.py

+def test_plot_1d_no_axes():
+    np.random.seed(3)
+    ns = read_chains('./tests/example_data/pc')
+    ns[['x0', 'x1', 'x2']].plot_1d()


Following up on my above comment, we should add something along the lines of:

axes = ns[['x0', 'x1', 'x2']].plot_1d()
assert axes[0].get_xlabel() == '$x_0$'
assert axes[1].get_xlabel() == '$x_1$'
assert axes[2].get_xlabel() == '$x_2$'
axes = ns[['x0', 'x1', 'x2']].drop_labels().plot_1d()
assert axes[0].get_xlabel() == 'x0'
assert axes[1].get_xlabel() == 'x1'
assert axes[2].get_xlabel() == 'x2'

I've added these tests, but the drop_labels() ones are failing as expected. What would you expect for a mixture of labelled and unlabelled columns?

If a parameter/column has a (tex) label, use that; if it does not have a (tex) label, use the column handle.

I've fixed this for the separate plots by only changing name to the label if the label exists, rather than setting it to a blank string as currently happens.

This solves the issues for individual plots (e.g. plot.kde_1d()), but not plot_1d/2d(), which use the label mapping to create the axes, which leads to blanks.
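
The fallback rule discussed in this thread (use the tex label where one exists, otherwise the column handle) can be sketched in a few lines. This is only an illustration under stated assumptions: the helper name `fill_blank_labels` and the plain-dict representation of the label mapping are hypothetical, not anesthetic's API.

```python
def fill_blank_labels(labels_map):
    """Hypothetical helper: replace blank labels with the column key."""
    return {key: (label if label else key) for key, label in labels_map.items()}


# Mixed case: 'x0' has a tex label, 'a' and 'b' do not.
labels_map = {"x0": "$x_0$", "a": "", "b": ""}
filled = fill_blank_labels(labels_map)
print(filled)
```

Applying the fallback to the mapping itself, rather than to each plot call, would mean that routines which build their axes from the mapping see the same defaults.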

lukashergt · 2023-06-29T01:08:26Z

anesthetic/samples.py

+            If not provided, then all parameters are plotted. This is intended
+            for plotting a sliced array (e.g. `samples[['x0','x1]].plot_1d()`.
+            It is not advisible to plot an entire frame, as it is
+            computationally expensive, and liable to run into linear algebra
+            errors for degenerate derived parameters.


Is this true for the 1d case?

lukashergt · 2023-06-29T03:02:05Z

Failing tests appear to be an issue in fastkde, not in anesthetic :/

AdamOrmondroyd · 2023-06-29T14:11:47Z

There's a similar problem with 1D legends, where the labels are beautifully displayed when they're present, but omitted when they are dropped:

fig, ax = plt.subplots()
ns['x0'].plot.kde_1d(ax=ax)
ns['x1'].plot.kde_1d(ax=ax)
ns['x2'].plot.kde_1d(ax=ax)
ax.legend()

ns = ns.drop_labels()
...

AdamOrmondroyd · 2023-06-29T15:41:16Z

While this deals with an unlabelled LabelledDataFrame, what about the mixed case where only some columns have labels? e.g.

ns['a'] = ns.x0 + ns.x1
ns['b'] = ns.x0 + ns.x2

ns[['x0', 'a', 'b']].plot_2d()

lukashergt · 2023-06-29T15:45:41Z

> While this deals with an unlabelled LabelledDataFrame, what about the mixed case where only some columns have labels? e.g.
> ns['a'] = ns.x0 + ns.x1
> ns['b'] = ns.x0 + ns.x2
>
> ns[['x0', 'a', 'b']].plot_2d()

For this example one could claim that in addition to creating a new parameter 'a' you should also create its label ns.set_label('a', '$a$')...

lukashergt

Happy with the change in 10a5b42?

Opinions on #304 (comment)?

Otherwise happy to approve this now.

williamjameshandley · 2023-06-29T17:18:51Z

> Happy with the change in 10a5b42?

Yes -- to my mind this is quite a neat API.

> Opinions on #304 (comment)?

Adjusted -- will need reapproval

williamjameshandley · 2023-06-29T17:33:17Z

> While this deals with an unlabelled LabelledDataFrame, what about the mixed case where only some columns have labels? e.g.
> ns['a'] = ns.x0 + ns.x1
> ns['b'] = ns.x0 + ns.x2
>
> ns[['x0', 'a', 'b']].plot_2d()

Although in an ideal world this would have '$x_0$', 'a' and 'b' as labels...

AdamOrmondroyd · 2023-06-29T18:04:03Z

> While this deals with an unlabelled LabelledDataFrame, what about the mixed case where only some columns have labels? e.g.
> ns['a'] = ns.x0 + ns.x1
> ns['b'] = ns.x0 + ns.x2
>
> ns[['x0', 'a', 'b']].plot_2d()
>
> Although in an ideal world this would have '$x_0$', 'a' and 'b' as labels...

I think I side with Lukas - not only is this PR then done, but blank axes serve as a reminder to the user to set_label()

williamjameshandley added 8 commits June 28, 2023 09:08

Simplified setup with None axis

a7827d0

Instated method for swapping in names

f3857ba

Formatting correct

174bc68

Updated tests

5ac21f3

Updated tests to check that names are now returned correctly

5136458

Subtantially neater setup with None axis

c5a4d39

Merge branch 'master' into labels

aa33ea3

Bumped version

bc55ee0

williamjameshandley added this to the 2.0.0 milestone Jun 28, 2023

williamjameshandley commented Jun 28, 2023

williamjameshandley commented Jun 28, 2023

williamjameshandley requested a review from lukashergt June 28, 2023 10:01

lukashergt reviewed Jun 28, 2023

williamjameshandley added 2 commits June 28, 2023 22:35

Exception->ValueError,TypeError,KeyError

280e881

Adjusted exceptions

5267cb7

williamjameshandley added 2 commits June 28, 2023 23:03

Added code which sorts out the 2d labels

3f1e2e6

Now resolving #297

b223d1a

williamjameshandley requested a review from lukashergt June 28, 2023 22:22

Reverted back to positional arguments

c4cd589

lukashergt requested changes Jun 29, 2023

williamjameshandley and others added 5 commits June 29, 2023 09:21

Merge branch 'master' into labels

b5c11c5

Bumped version

db45346

Corrected for fastkde case

3c19504

Merge remote-tracking branch 'handley-lab/master' into HEAD

b0999f4

version bump

7a551d8

AdamOrmondroyd added 2 commits June 29, 2023 13:17

only change axis labels to name if name exists

4ea5195

add tests for drop_labels (plot_1d and plot_2d currently failing)

4358118

AdamOrmondroyd and others added 4 commits June 29, 2023 15:34

retain name if no label

aa715ab

Revert "only change axis labels to name if name exists"

05c99b3

This reverts commit 4ea5195.

Revert "Revert "only change axis labels to name if name exists""

7f8c27f

This reverts commit 05c99b3.

return column names instead of tex labels in get_labels_map for unlabelled LabelledDataFrames

10a5b42

add test for legend

ff1a2ba

lukashergt previously approved these changes Jun 29, 2023

Adjusted plot_1d comment

3ec3a3e

williamjameshandley dismissed lukashergt’s stale review via 3ec3a3e June 29, 2023 17:17

lukashergt previously approved these changes Jun 29, 2023

Updated so blank labels are replaced by keys

09bde40

williamjameshandley dismissed lukashergt’s stale review via 09bde40 June 29, 2023 18:40

williamjameshandley added 2 commits June 29, 2023 20:26

Fixed set_labels in light of the new get_labels functionality by making filling of defaults optional

11a368d

new unfilling fix means that get_labels now returns non-defaults

c714214

williamjameshandley requested review from lukashergt and AdamOrmondroyd June 29, 2023 19:41

lukashergt approved these changes Jun 29, 2023

williamjameshandley merged commit 61bed43 into master Jun 29, 2023
23 checks passed

williamjameshandley deleted the labels branch June 29, 2023 20:23

williamjameshandley mentioned this pull request Jun 29, 2023

plot_2d() functionality #297

Closed

williamjameshandley mentioned this pull request Jul 26, 2023

make_axes_2d KeyError #321

Closed
