Make _assign_metrics public #199

dcherian · 2020-04-16T19:45:22Z

#108 (comment) suggests that _assign_metrics should be public.

That way you can use grid.interp and grid.diff to construct metrics when necessary and then just assign them to an existing object.


jbusecke · 2020-04-26T20:12:42Z

I think this is reasonable. @rabernat, do you have thoughts?

jbusecke · 2020-06-23T01:15:10Z

I think if we do this, we could also implement a .get_metrics_dict. I have been messing around with subsets of data and it would be very helpful to get a complete dict like {'X':['dx1', 'dx2', 'dx3',...], 'Y':[...]} to pass it to a new grid object (we currently only store the actual dataarrays, not the coordinate names). This would probably only require small effort to refactor.
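A minimal sketch of what such a helper might return. The function name follows the suggestion above, but the internal storage layout (`stored_metrics`, keyed by tuples of axis names) and the `FakeArray` stand-in for a DataArray are assumptions made for illustration, not xgcm's actual internals.

```python
from collections import namedtuple

# Stand-in for an xarray.DataArray: only the .name attribute matters here.
FakeArray = namedtuple("FakeArray", ["name", "values"])


def get_metrics_dict(stored_metrics):
    """Map each axis (or axis combination) to the names of its metric arrays.

    `stored_metrics` is assumed to look like {("X",): [array, ...], ...}.
    Single-axis keys are flattened to plain strings, e.g. ("X",) -> "X".
    """
    names = {}
    for axes, arrays in stored_metrics.items():
        key = axes[0] if len(axes) == 1 else tuple(axes)
        names[key] = [arr.name for arr in arrays]
    return names


stored = {
    ("X",): [FakeArray("dx1", [1.0]), FakeArray("dx2", [2.0])],
    ("Y",): [FakeArray("dy1", [3.0])],
    ("X", "Y"): [FakeArray("area_t", [6.0])],
}
metrics_dict = get_metrics_dict(stored)
print(metrics_dict)
# {'X': ['dx1', 'dx2'], 'Y': ['dy1'], ('X', 'Y'): ['area_t']}
```

The returned dict holds only names, so it could be passed along with a subset dataset when building a new grid object.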

dcherian · 2020-06-23T02:45:40Z

How about .metrics with property setters and getters?
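A rough sketch of the property idea, assuming a simplified `Grid` stand-in that stores metric names keyed by tuples of axis names. The class, the `_metrics` attribute, and the merge-on-set behaviour are hypothetical choices for illustration only.

```python
class Grid:
    """Hypothetical stand-in for the real grid class; not xgcm's implementation."""

    def __init__(self):
        self._metrics = {}

    @property
    def metrics(self):
        # Return a shallow copy so callers cannot bypass the setter's checks.
        return dict(self._metrics)

    @metrics.setter
    def metrics(self, new_metrics):
        for axes, names in new_metrics.items():
            if not isinstance(axes, tuple):
                raise TypeError(f"axis key must be a tuple, got {axes!r}")
            # Keys are stored sorted so ("Y", "X") and ("X", "Y") are the same entry.
            self._metrics[tuple(sorted(axes))] = list(names)


grid = Grid()
grid.metrics = {("X",): ["dx_t"], ("Y", "X"): ["area_t"]}
print(grid.metrics)
# {('X',): ['dx_t'], ('X', 'Y'): ['area_t']}
```

In this sketch the setter merges into the existing entries instead of replacing them; whether assignment should reset everything is a separate design decision.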

jbusecke · 2020-06-23T12:05:40Z

I would be in favor of that!

rabernat · 2020-06-23T12:33:32Z

Sure, sounds good to me.

jbusecke · 2020-09-14T17:55:01Z

Something we can integrate into this feature is a way to return the original metrics_dict so it could be passed to another grid object. This might be helpful for subsets of data (#193).

jbusecke · 2020-10-14T19:43:20Z

Just started to tinker with a PR for #103 and I think it would be helpful to address this here first. A few additional questions:

Should I create a metrics subclass on the grid object? It would store a dictionary like this:

metric_dict = {axis_combo:[(metric_name1, metric_da1), (metric_name2, metric_da2), ...]}

This would enable us to store the name in combination with the dataarray. This is helpful for naming derived metrics (e.g. for #222 and #103) and prepares the internals to generate, for instance, a subset with all the kwargs needed to create a new grid (useful for #193 and #200).

Should the set method accept only str (names of variables in the grid._ds dataset)? Or should we expand this functionality to accept an actual dataarray?

Should we be able to update a single metric once the grid object is created? Or require the user to completely 'reset' the metric_dict?

What checks should we perform in the set method?

Make sure that each grid position is unique (e.g. if two x distances are passed and are located on the same position, we should error out)
... anything else?
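The uniqueness check could look roughly like the sketch below. It assumes a metric's grid position can be identified by its dimension names. `Metric` is a stand-in for a named DataArray, and `check_unique_positions` and the dimension names are hypothetical.

```python
from collections import namedtuple

# Stand-in for a named xarray.DataArray; `dims` encodes the grid position.
Metric = namedtuple("Metric", ["name", "dims"])


def check_unique_positions(axes, metrics):
    """Raise ValueError if two metrics for `axes` share the same dimensions."""
    seen = {}
    for metric in metrics:
        position = tuple(metric.dims)
        if position in seen:
            raise ValueError(
                f"metrics {seen[position]!r} and {metric.name!r} for axes {axes} "
                f"are both located on {position}"
            )
        seen[position] = metric.name


# Two x distances on different positions pass the check.
check_unique_positions(
    ("X",), [Metric("dx_t", ("y_c", "x_c")), Metric("dx_u", ("y_c", "x_g"))]
)

# Two x distances on the same position are rejected.
try:
    check_unique_positions(
        ("X",), [Metric("dx_t", ("y_c", "x_c")), Metric("dx_t2", ("y_c", "x_c"))]
    )
except ValueError as err:
    print(err)
```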

dcherian · 2020-10-14T20:32:57Z

> I would create a subclass metrics on the grid object? which would store a dictionary like this:

How about something like

class Metrics(dict):
    """Dict-like container for grid metrics (sketch)."""

    def __setitem__(self, key, value):
        # do validation / alignment checks here
        super().__setitem__(key, value)

    def validate(self):
        # could be useful
        pass

    def infer_remaining(self):
        # maybe?
        pass


class Grid:
    def __init__(self):
        # one Metrics container per instance; a class-level default
        # would be shared by every Grid
        self.metrics = Metrics()


inst = Grid()
print(inst.metrics)  # {}
inst.metrics["a"] = 10
inst.metrics["b"] = 20
print(inst.metrics)  # {'a': 10, 'b': 20}

> store the name in combination with the dataarray

What are these names? Are they the same as metric_da1.name?

> should the set method accept only str (names of variables in the grid._ds dataset)? Or should we expand this functionality to accept an actual dataarray?

To help with #108 (comment) (grid operations used to construct metrics), I think we should allow setting DataArrays.

> Should we be able to update a single metric once the grid object is created? Or require the user to completely 'reset' the metric_dict?

Hmm.. maybe not until this is actually needed? #108 (comment) should only require setting metrics that have not been assigned

jbusecke · 2020-10-14T21:08:09Z

I like your suggestion a lot.

Couple of thoughts (forgive me if these are stupid, I am not that versed with classes yet):

We need to keep track of which axes each metric represents (currently a tuple of one or more axis names; let's call that axis_combo). Think distance, axis_combo=('X',), vs. area, axis_combo=('X', 'Y'). How can we modify the above to keep track of that? Use a tuple like (value, axis_combo), so that the above becomes inst.metrics['distance_a'] = (10, ('X',))?
We could also use an attr of the dataarray? That is, if we require a dataarray as the value.
I really like the infer logic here, but I think we should be able to track internally which ones are 'native' versus inferred. Any proposal for how to do this most elegantly? Again, we could use the attrs?

> What are these names? Are they the same as metric_da1.name?

Yes, if we require a dataarray as value, we could just check for a name on the input, and allow to pass either a str (value gets picked from grid._ds), or a dataarray (name gets parsed from there or we raise a warning if da.name is None).
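As a sketch of that input handling: a plain dict stands in for grid._ds and a namedtuple for a DataArray. `resolve_metric` is a hypothetical helper name, and the warning text is only illustrative.

```python
import warnings
from collections import namedtuple

# Stand-in for xarray.DataArray; only `.name` is used.
FakeArray = namedtuple("FakeArray", ["name", "values"])


def resolve_metric(value, dataset):
    """Return (name, array) for a metric given as a variable name or an array.

    `dataset` is a plain dict standing in for grid._ds.
    """
    if isinstance(value, str):
        # A string is looked up in the dataset; a missing name raises KeyError.
        return value, dataset[value]
    if value.name is None:
        warnings.warn("metric array has no name; derived metrics cannot be labelled")
    return value.name, value


ds = {"dx_t": FakeArray("dx_t", [1.0, 1.0])}
print(resolve_metric("dx_t", ds)[0])                    # dx_t
print(resolve_metric(FakeArray("dy_t", [2.0]), ds)[0])  # dy_t
```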

> > Should we be able to update a single metric once the grid object is created? Or require the user to completely 'reset' the metric_dict?
>
> Hmm.. maybe not until this is actually needed? #108 (comment) should only require setting metrics that have not been assigned

I think I wasn't exactly clear here. I think your example clarified that you can set each metric by itself and don't need to pass them as a list of values or anything like that.

dcherian · 2020-10-14T23:18:36Z

I thought Grid.metrics would have axis_combo as keys ('X',), ('X', 'Y'), ('X', 'Y', 'Z'); What is distance_a and why do we need it :D?

Why should we track "native" vs "inferred" metrics? We could add a private attribute to the Metrics class _inferred = list(str).

jbusecke · 2020-10-15T14:02:45Z

> I thought Grid.metrics would have axis_combo as keys ('X',), ('X', 'Y'), ('X', 'Y', 'Z'); What is distance_a and why do we need it :D?

For future features (#103 and extension of the grid.transform() functionality to metrics and datasets), I would like to be able to attach 'altered' metrics with the original name onto the output arrays/datasets. For example, let's say you have some dataarray data, which has the coordinate area_t. I would like to be able to coarsen this and get another dataarray which has a coarsened coordinate area_t or, optionally, area_t_coarsened. Either way, for things to stay consistent, I need the original name. Does that make sense?

> Why should we track "native" vs "inferred" metrics? We could add a private attribute to the Metrics class _inferred = list(str).

Let me clarify what I envision under inferred. We are currently able to multiply lower-order metrics into higher-order metrics (e.g. multiply distances to get area). I think it would also be good to have an option in the future to interpolate metrics to positions without metrics (like in #196). I think both of these should be clearly marked, and the user needs to be aware that they might be less accurate than native metrics.

I am also working on a draft of a new vanity feature that I'll describe in an issue soon, which would need both of these pieces of information, hehe.

dcherian · 2020-10-15T16:12:37Z

> Either way for things to stay consistent, I need the original name. Does that make sense?

Yes, but the name should be metric_da.name, no? I think we can raise errors if metric_da.name is None when assigning to Metrics.

I think coarsen syntax could be

coarsegrid = grid.coarsen(X=5)
newds = coarsegrid.mean(ds)

This would preserve the names of metrics variables area_t and you have a new consistent Grid object for the new grid.
I prefer this over

coarsegrid, newds = grid.coarsen(X=5).mean(ds)

> I think it would also be good to have an option in the future to interpolate metrics to positions without metrics

Yes, but this could be opt-in and we can add a verbose flag to print out what was inferred. Alternatively, we could store inferred names under a private attribute and mark those appropriately in a repr for Metrics
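One way the private-attribute idea might look, as a sketch: a dict subclass that records which keys were inferred and marks them in its repr. The `_inferred` set, the `set_inferred` helper, and the scalar values used in place of DataArrays are all assumptions for illustration.

```python
class Metrics(dict):
    """Dict of metrics that remembers which entries were inferred (a sketch)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._inferred = set()  # keys of metrics that were derived, not supplied

    def set_inferred(self, key, value):
        # Hypothetical helper: store a derived metric and mark it as inferred.
        self[key] = value
        self._inferred.add(key)

    def __repr__(self):
        entries = []
        for key, value in self.items():
            tag = " (inferred)" if key in self._inferred else ""
            entries.append(f"{key!r}: {value!r}{tag}")
        return "Metrics({" + ", ".join(entries) + "})"


metrics = Metrics()
metrics["dx_t"] = 2.0
metrics["dy_t"] = 3.0
# Area inferred by multiplying the two distances.
metrics.set_inferred("area_t", metrics["dx_t"] * metrics["dy_t"])
print(metrics)
# Metrics({'dx_t': 2.0, 'dy_t': 3.0, 'area_t': 6.0 (inferred)})
```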

jbusecke · 2020-10-15T17:16:28Z

> Yes but the name should be metric_da.name no? I think we can raise errors if metric_da.name is None when assigning to Metrics

Yes, I think we are thinking the same thing here.

> I think coarsen syntax could be
>
> coarsegrid = grid.coarsen(X=5)
> newds = coarsegrid.mean(ds)
>
> This would preserve the names of metrics variables area_t and you have a new consistent Grid object for the new grid.
> I prefer this over
>
> coarsegrid, newds = grid.coarsen(X=5).mean(ds)

Interesting. I am not quite sure which version I favor at the moment. But could you copy+paste this suggestion to #103, so folks who are not watching this issue could discuss?

I think this is sufficiently concrete that I can take a swing at implementing this maybe tomorrow or next week, and then we can discuss details over in the PR? Thanks for all the input.

github-actions · 2021-06-11T12:03:22Z

This issue has been marked 'stale' due to lack of recent activity. If there is no further activity, the issue will be closed in another 30 days. Thank you for your contribution!

github-actions · 2021-06-18T12:03:29Z

This issue has been closed due to inactivity. If you feel this is in error, please reopen the issue or file a new issue with the relevant details.

jbusecke · 2021-06-24T15:28:44Z

Just for posterity: This was closed via #336

jbusecke mentioned this issue Oct 14, 2020

grid.coarsen function #103

Open

jbusecke mentioned this issue Oct 14, 2020

examples for different models #142

Closed

10 tasks

jbusecke mentioned this issue Oct 15, 2020

Shiny html representation for xgcm [FEATURE] #276

Open

This was referenced Jun 8, 2021

adding metrics method #336

Merged

[FEATURE] metrics object #342

Closed

github-actions bot added the Stale label Jun 11, 2021

github-actions bot closed this as completed Jun 18, 2021
