Add assign_(coord/mask/attr) methods of data_array #3110

Merged
merged 21 commits into from Apr 13, 2023

Conversation

@YooSunYoung commented Apr 5, 2023
Member

Fixes #2960

TODO: python binding is not implemented yet

I used the name assign because I think it should also be able to update an existing entry, like xarray's assign_coords:
https://docs.xarray.dev/en/stable/generated/xarray.DataArray.assign_coords.html

In xarray, the syntax looks like this:

values = [...]
coord_name = "coord0"
dimension_name = "dim0"
da.assign_coords({coord_name: (dimension_name, values)})

But since we use Variable, which already carries its dimension name, should it look like this instead?

coord_name = "coord0"
coord0 = sc.Variable(dims=["dim0"], values=[...])
da.assign_coords({coord_name: coord0})

@YooSunYoung
Member Author

I have a question!

Should we check if the given coords/masks/attrs have overlapping keys?

@SimonHeybrock
Member

> TODO: python binding is not implemented yet

Any reason why we don't implement the entire thing in Python?

@SimonHeybrock
Member

> I have a question!
>
> Should we check if the given coords/masks/attrs have overlapping keys?

You mean if we are replacing existing coords? I think I would not check. After all, __setitem__ does not check either?

@YooSunYoung
Member Author

> > TODO: python binding is not implemented yet
>
> Any reason why we don't implement the entire thing in Python?

No... the drop_* functions were in C++, so I just kept these together with them. We can write all of them in Python, sure.

@SimonHeybrock
Member

> > > TODO: python binding is not implemented yet
> >
> > Any reason why we don't implement the entire thing in Python?
>
> No... the drop_* functions were in C++, so I just kept these together with them. We can write all of them in Python, sure.

👍

@YooSunYoung
Member Author

> > I have a question!
> > Should we check if the given coords/masks/attrs have overlapping keys?
>
> You mean if we are replacing existing coords? I think I would not check. After all, __setitem__ does not check either?

No, I meant when you want to assign multiple of them, for example

assign_coords({'a': coord0, 'b':coord1, 'a': coord2})

@SimonHeybrock
Member

> > > I have a question!
> > > Should we check if the given coords/masks/attrs have overlapping keys?
> >
> > You mean if we are replacing existing coords? I think I would not check. After all, __setitem__ does not check either?
>
> No, I meant when you want to assign multiple of them, for example
>
> assign_coords({'a': coord0, 'b': coord1, 'a': coord2})

This will never reach assign_coords anyway, because Python first creates a dict?
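
For reference, a minimal sketch of that point in plain Python. Integers stand in for the coordinate variables, and `show_received` is a hypothetical name used only for this illustration: a dict literal with a repeated key is collapsed before the callee sees it, and the last value wins.

```python
def show_received(mapping):
    # The callee only ever sees the already-built dict.
    return dict(mapping)


# 'a' appears twice in the literal; Python keeps the last value
# and the duplicate never reaches the function.
received = show_received({'a': 0, 'b': 1, 'a': 2})
print(received)
```

So a duplicate-key check inside assign_coords could never fire for this case.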

@YooSunYoung
Member Author

> > > > I have a question!
> > > > Should we check if the given coords/masks/attrs have overlapping keys?
> > >
> > > You mean if we are replacing existing coords? I think I would not check. After all, __setitem__ does not check either?
> >
> > No, I meant when you want to assign multiple of them, for example
> >
> > assign_coords({'a': coord0, 'b': coord1, 'a': coord2})
>
> This will never reach assign_coords anyway, because Python first creates a dict?

In the case below, both values do reach assign_coords, and the keyword argument is the one that gets used. I guess that's fine...? Or should we allow only one of them...?

assign_coords({'a': coord0}, a=coord0)
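
A minimal sketch of that behaviour, assuming the dict and the keyword arguments are merged with `{**coords, **kwargs}`. `merge_args` is a hypothetical stand-in, not the actual implementation, and strings stand in for the variables.

```python
from typing import Dict, Optional


def merge_args(coords: Optional[Dict[str, str]] = None, **kwargs: str) -> Dict[str, str]:
    # Later unpacking wins, so a keyword argument silently replaces
    # a dict entry with the same name.
    return {**(coords or {}), **kwargs}


merged = merge_args({'a': 'from_dict'}, a='from_kwarg')
print(merged)
```

Without an explicit check, the dict entry is dropped without any warning.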

@YooSunYoung YooSunYoung marked this pull request as ready for review April 11, 2023 15:38
Member

@SimonHeybrock SimonHeybrock left a comment


Comments below. Everything I said for the coords case applies to the others as well.

src/scipp/core/assignments.py (two outdated review threads, resolved)


def assign_coords(
    self, coords: Optional[Dict] = None, **coords_kwargs
Member

coords should be position-only, such that it does not prevent the use of a keyword arg of that name.

    collected_coords = coords_kwargs

    for coord_key, coord in collected_coords.items():
        self.coords[coord_key] = coord
Member

We should not change the input. Must make a shallow copy first (self = self.copy(deep=False))
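
A minimal sketch of the copy-then-assign pattern, using a hypothetical `ToyArray` class in place of the real data array (only the `copy(deep=False)` call is taken from the comment above, everything else is an assumption for illustration):

```python
import copy
from typing import Dict


class ToyArray:
    """Hypothetical stand-in for a data array: it only holds a coords dict."""

    def __init__(self, coords: Dict[str, list]):
        # A new dict, so two arrays never share the same mapping.
        self.coords = dict(coords)

    def copy(self, deep: bool = True) -> "ToyArray":
        if deep:
            return ToyArray(copy.deepcopy(self.coords))
        # Shallow copy: new mapping, same underlying value objects.
        return ToyArray(self.coords)

    def assign_coords(self, **new_coords: list) -> "ToyArray":
        out = self.copy(deep=False)  # leave `self` untouched
        for name, value in new_coords.items():
            out.coords[name] = value
        return out


original = ToyArray({'x': [1, 2, 3]})
updated = original.assign_coords(y=[4, 5, 6])
```

The shallow copy keeps the call cheap: existing coordinate values are shared, and only the mapping that holds them is duplicated.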

da = sc.DataArray(data)
coord0 = sc.linspace('x', start=0.2, stop=1.61, num=4)
coord1 = sc.linspace('y', start=1, stop=4, num=3)
da.assign_coords({'coord0': coord0, 'coord1': coord1})
Member

da should be kept unchanged. Make sure to also add a test for that.

YooSunYoung and others added 3 commits April 12, 2023 09:03
Co-authored-by: Simon Heybrock <12912489+SimonHeybrock@users.noreply.github.com>
Co-authored-by: Simon Heybrock <12912489+SimonHeybrock@users.noreply.github.com>
from ..typing import DataArray, Dataset


def assign_coords(self, coords: Dict) -> Union[DataArray, Dataset]:
Member

Why did you remove the keyword-arg syntax?

Member Author

> coords should be position-only, such that it does not prevent the use of a keyword arg of that name.

Sorry, I misunderstood this comment. Then... what do you mean by position-only here?

Member

I meant the dict should be position-only, but keep the keyword args.

Member

def assign_coords(self, coords: Dict[str, Variable], /, **kwargs):

or something like that
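
To illustrate what the `/` buys (Python 3.8+), here is a minimal sketch with a hypothetical free function `assign` and integers standing in for variables: because `coords` is positional-only, the name `coords` remains usable as an ordinary keyword argument.

```python
from typing import Dict, Optional


def assign(coords: Optional[Dict[str, int]] = None, /, **kwargs: int) -> Dict[str, int]:
    # Parameters before the `/` can only be passed positionally, so a
    # keyword named `coords` lands in **kwargs instead of clashing.
    return {**(coords or {}), **kwargs}


# The dict is passed positionally; `coords=2` is just another entry.
result = assign({'x': 1}, coords=2)
print(result)
```

Without the `/`, the same call would fail with a TypeError about multiple values for the argument `coords`.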

Member Author

Ah, I see. My question was more like: should we allow only one of them at a time?

More specifically, will this be allowed?

da.assign_coords({'coord0': coord0_a}, coord0=coord0_b)

Currently, the keyword argument will overwrite the positional argument in this case.

@YooSunYoung
Member Author

rename_dims allows using both positional and keyword arguments,
but checks for overlapping names, as you said,
so I did a similar thing for assign_*.

Comment on lines 30 to 39
    coords_posarg = {} if coords is None else coords
    collected_coords = {**coords_posarg, **coords_kwargs}

    if len(collected_coords) != len(coords_posarg) + len(coords_kwargs):
        overlapped = set(coords).intersection(coords_kwargs)
        raise ValueError(
            'The names of coords passed in the dict '
            'and as keyword arguments must be distinct.'
            f'Following names were used in both places: {overlapped}.'
        )
Member

Can all this code duplication be avoided?

Member Author

I can write a general function that combines a positional dictionary argument with keyword arguments, I guess...?

Member

Looks like it exists already:

def _combine_dims(
    dims_dict: Optional[Dict[str, str]], names: Dict[str, str]
) -> Dict[str, str]:
    dims_dict = {} if dims_dict is None else dims_dict
    if set(dims_dict).intersection(names):
        raise ValueError(
            'The names passed in the dict and as keyword arguments must be distinct.'
            f'Got {dims_dict} and {names}'
        )
    return {**dims_dict, **names}

Member Author

Yes, that's what I used as a reference.
It prints both dictionaries in full, but since coords/attrs/masks are Variables,
I didn't want to print them all, so I just changed the message a little.

@@ -8,6 +8,21 @@
from ..typing import DataArray, Dataset, Variable


def _combine_args(arg0: Optional[Dict[str, Variable]] = None, /, **kwargs):
Member

Still duplicate with

def _combine_dims(
    dims_dict: Optional[Dict[str, str]], names: Dict[str, str]
) -> Dict[str, str]:
    dims_dict = {} if dims_dict is None else dims_dict
    if set(dims_dict).intersection(names):
        raise ValueError(
            'The names passed in the dict and as keyword arguments must be distinct.'
            f'Got {dims_dict} and {names}'
        )
    return {**dims_dict, **names}

You can keep the error and not worry about it (since it should be extremely rare), or adjust it there to only print the keys?

from typing import Dict, Optional


def combine_dict_args(arg: Optional[Dict], /, **kwargs):
Member

type hints missing now
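
One possible way to restore the hints, as a sketch only: the function name and the positional-only dict follow the hunk above, while the `TypeVar`, the return annotation, and the exact error wording are assumptions. The error lists only the overlapping keys, so no values are printed.

```python
from typing import Dict, Optional, TypeVar

V = TypeVar('V')


def combine_dict_args(arg: Optional[Dict[str, V]], /, **kwargs: V) -> Dict[str, V]:
    pos_dict = {} if arg is None else arg
    # Report only the clashing names, never the (possibly large) values.
    overlapped = set(pos_dict).intersection(kwargs)
    if overlapped:
        raise ValueError(
            'The names passed in the dict and as keyword arguments must be distinct. '
            f'Following names are used in both arguments: {sorted(overlapped)}'
        )
    return {**pos_dict, **kwargs}


combined = combine_dict_args({'a': 1}, b=2)
print(combined)
```

A generic value type lets the same helper serve coords, masks, and attrs as well as the dimension-renaming case.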

@YooSunYoung YooSunYoung merged commit 362e921 into scipp:main Apr 13, 2023
4 checks passed
@YooSunYoung YooSunYoung deleted the feature-assign branch April 13, 2023 07:41
Development

Successfully merging this pull request may close these issues.

Add methods for adding coords/masks/attrs returning a modified object
2 participants