Define abstract read_table helper function #632

Open
tcompa opened this issue Dec 11, 2023 · 2 comments
Labels
ngio OME-Zarr reader/writer Tables AnnData and ROI/feature tables

tcompa commented Dec 11, 2023

We now have a first version of write_table, which will be part of the upcoming v0.14.0.
We should check whether a read_table function would also be useful. It could replace lines like

    import anndata as ad
    import zarr

    # Load the ROI table and its metadata attributes
    ROI_table = ad.read_zarr(ROI_table_path)
    attrs = zarr.group(ROI_table_path).attrs
    # Validate the attributes against the MaskingROITableAttrs model
    MaskingROITableAttrs(**attrs.asdict())
    column_name = attrs["instance_key"]
    # Check that ROI_table.obs has the right column and extract label_value
    if column_name not in ROI_table.obs.columns:
        raise ValueError(
            f'In _preprocess_input, "{column_name}" '
            f"missing in {ROI_table.obs.columns=}"
        )

with lines which could look like

    from fractal_tasks_core.tables import read_table

    # Load the ROI table and its metadata attributes
    table, attrs, column_names = read_table(path, options={"validate_attrs": True})
    column_name = attrs["instance_key"]
    # Check that ROI_table.obs has the right column and extract label_value
    if column_name not in column_names:
        raise ValueError(
            f'In _preprocess_input, "{column_name}" '
            f"missing in {column_names=}"
        )

This is also partly relevant for #629, since it would force us to think more about which attributes a table must have; e.g. do all V1 tables have an obs attribute with some specific contents? TBD

tcompa added the "Tables AnnData and ROI/feature tables" label on Dec 11, 2023

jluethi commented Dec 12, 2023

Big fan of the idea!

For read_table(path, options={"validate_attrs": True}), I'd rather go with something like:

read_table(path, validate_attrs=True)

(with a potential default for validate_attrs)

Also, couldn't this part be part of the validation block?

    column_name = attrs["instance_key"]
    # Check that ROI_table.obs has the right column and extract label_value
    if column_name not in column_names:
        raise ValueError(
            f'In _preprocess_input, "{column_name}" '
            f"missing in {column_names=}"
        )
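
To make the suggestion concrete, here is a rough sketch of how the column check could live in a validation step rather than in each task. The helper name `validate_instance_key` and its signature are hypothetical, not an existing fractal-tasks-core function, and it works on a plain mapping and a list of column names so that it does not depend on how the table is loaded.

```python
from typing import Any, Iterable, Mapping


def validate_instance_key(
    attrs: Mapping[str, Any], column_names: Iterable[str]
) -> str:
    """Return attrs["instance_key"] after checking that it names a column.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if "instance_key" not in attrs:
        raise ValueError(f'"instance_key" missing in table attributes {dict(attrs)}')
    column_name = attrs["instance_key"]
    column_names = list(column_names)
    if column_name not in column_names:
        raise ValueError(f'"{column_name}" missing in {column_names=}')
    return column_name


# Plain Python objects stand in for a real table's attributes and obs columns
attrs = {"instance_key": "label"}
print(validate_instance_key(attrs, ["label", "area"]))
```

With something like this, a task would only need the returned column name and could drop its own `raise ValueError` block.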

jluethi commented Dec 13, 2023

For the future, something like:

table, attrs = read_table(path, validate_attrs=True)
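
A minimal sketch of a function with that call shape, assuming `validate_attrs` is keyword-only and defaults to `True`. The body is an assumption, not the fractal-tasks-core implementation: it reuses the `ad.read_zarr` call from the first snippet, opens the group read-only with `zarr.open_group`, and only checks `instance_key` when that attribute is present.

```python
from typing import Any, Dict, Tuple


def read_table(path: str, *, validate_attrs: bool = True) -> Tuple[Any, Dict[str, Any]]:
    """Hypothetical sketch: load an AnnData table and its Zarr attributes."""
    # Imported lazily, so defining the function does not require the
    # optional dependencies to be installed.
    import anndata as ad
    import zarr

    table = ad.read_zarr(path)
    attrs = zarr.open_group(path, mode="r").attrs.asdict()

    if validate_attrs:
        # Assumption: only tables that declare an instance_key need this check.
        column_name = attrs.get("instance_key")
        if column_name is not None and column_name not in table.obs.columns:
            raise ValueError(
                f'"{column_name}" missing in {list(table.obs.columns)}'
            )
    return table, attrs
```

A caller would then write `table, attrs = read_table(path)` and rely on the default, or pass `validate_attrs=False` to skip the check.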
