Support static type analysis #3967

Closed
eric-czech opened this issue Apr 13, 2020 · 4 comments

@eric-czech

As a related discussion to #3959, I wanted to see what possibilities exist for a user or API developer building on Xarray to enforce Dataset/DataArray structure through static analysis.

In my specific scenario, I would like to model several different types of data in my domain as Dataset objects, but I'd like to be able to enforce that the names and dtypes associated with both data variables and coordinates meet certain constraints.

@keewis mentioned an example of this in #3959 (comment) where it might be possible to use something like a TypedDict to constrain variable/coord names and array dtypes, but this won't work with TypedDict as it's currently implemented. Another possibility could be generics, and I took a stab at that in #3959 (comment) (though this would certainly be more intrusive).
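As a rough illustration of the TypedDict idea, here is a minimal sketch on a plain dictionary rather than on an Xarray object. The `Variables` class and the `is_zero` function are hypothetical names chosen for this example, and nothing here implies that Dataset accepts such a schema today.

```python
from typing import TypedDict


class Variables(TypedDict):
    # Hypothetical schema: one variable named "data" holding an integer.
    data: int


def is_zero(variables: Variables) -> bool:
    # A type checker knows "data" is the only valid key and that it is an int.
    return variables["data"] == 0


good: Variables = {"data": 0}
print(is_zero(good))

# A static checker such as mypy is expected to reject the line below,
# because the key is misspelled and the value is a float:
# bad: Variables = {"DATA": 0.0}
```

The check happens entirely in the type checker; at runtime a TypedDict is an ordinary dict, so a dtype constraint on an array would need something beyond this.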

An example of where this would be useful is in adding extensions through accessors:

import xarray as xr

@xr.register_dataset_accessor('ext')
class ExtAccessor:
    def __init__(self, ds):
        # Keep a reference to the Dataset this accessor is attached to
        self.ds = ds

    def is_zero(self):
        return self.ds['data'] == 0

ds = xr.Dataset(dict(DATA=xr.DataArray([0.0])))
# I'd like to catch, prior to runtime, that "data" was misspelled as "DATA"
# and that this particular method shouldn't be run against floats
ds.ext.is_zero()

I probably care more about this as someone looking to build an API on top of Xarray, but I imagine typical users would find a solution to this problem beneficial too.

There is a related conversation on doing something like this for Pandas DataFrames at python/typing#28 (comment), so that might be helpful context for possibilities with TypedDict.

@crusaderky
Contributor

What you're asking for has two huge blocker dependencies:

@stale

stale bot commented May 1, 2022

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically

@stale stale bot added the stale label May 1, 2022
@max-sixty max-sixty removed the stale label May 1, 2022
@dcherian
Contributor

dcherian commented May 2, 2022

See #6462 also

@headtr1ck
Collaborator

Let's continue the discussion over at #8199, which is a more generic feature request but should cover this as well.

@headtr1ck headtr1ck closed this as not planned Sep 17, 2023