Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue loading MERSCOPE results #89

Open
mkunst23 opened this issue Oct 25, 2023 · 3 comments
Open

issue loading MERSCOPE results #89

mkunst23 opened this issue Oct 25, 2023 · 3 comments

Comments

@mkunst23
Copy link

Hi,

when trying to read in a MERSCOPE results I'm getting the following error

Cell In[10], line 1
----> 1 sdata = sdio.merscope("/allen/programs/celltypes/production/mfish/mfishmerscopemouseatlas/1288760920/region_0")

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/github_projects/spatialdata-io/src/spatialdata_io/readers/merscope.py:192, in merscope(path, vpt_outputs, z_layers, region_name, slide_name, imread_kwargs, image_models_kwargs)
190 geo_df = geo_df.rename_geometry("geometry")
191 geo_df = geo_df[geo_df[MerscopeKeys.Z_INDEX] == 0] # Avoid duplicate boundaries on all z-levels
--> 192 geo_df.geometry = geo_df.geometry.map(lambda x: x.geoms[0]) # The MultiPolygons contain only one polygon
193 geo_df.index = geo_df[MerscopeKeys.METADATA_CELL_KEY].astype(str)
195 polygons = ShapesModel.parse(geo_df, transformations={"microns": Identity()})

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/SpatialData/lib/python3.10/site-packages/pandas/core/series.py:4540, in Series.map(self, arg, na_action)
4460 def map(
4461 self,
4462 arg: Callable | Mapping | Series,
4463 na_action: Literal["ignore"] | None = None,
4464 ) -> Series:
4465 """
4466 Map values of Series according to an input mapping or function.
4467
(...)
4538 dtype: object
4539 """
-> 4540 new_values = self._map_values(arg, na_action=na_action)
4541 return self._constructor(new_values, index=self.index, copy=False).finalize(
4542 self, method="map"
4543 )

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/SpatialData/lib/python3.10/site-packages/pandas/core/base.py:919, in IndexOpsMixin._map_values(self, mapper, na_action, convert)
916 arr = self._values
918 if isinstance(arr, ExtensionArray):
--> 919 return arr.map(mapper, na_action=na_action)
921 return algorithms.map_array(arr, mapper, na_action=na_action, convert=convert)

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/SpatialData/lib/python3.10/site-packages/pandas/core/arrays/base.py:2189, in ExtensionArray.map(self, mapper, na_action)
2169 def map(self, mapper, na_action=None):
2170 """
2171 Map values using an input mapping or function.
2172
(...)
2187 a MultiIndex will be returned.
2188 """
-> 2189 return map_array(self, mapper, na_action=na_action)

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/SpatialData/lib/python3.10/site-packages/pandas/core/algorithms.py:1814, in map_array(arr, mapper, na_action, convert)
1812 values = arr.astype(object, copy=False)
1813 if na_action is None:
-> 1814 return lib.map_infer(values, mapper, convert=convert)
1815 else:
1816 return lib.map_infer_mask(
1817 values, mapper, mask=isna(values).view(np.uint8), convert=convert
1818 )

File lib.pyx:2917, in pandas._libs.lib.map_infer()

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/github_projects/spatialdata-io/src/spatialdata_io/readers/merscope.py:192, in merscope..(x)
190 geo_df = geo_df.rename_geometry("geometry")
191 geo_df = geo_df[geo_df[MerscopeKeys.Z_INDEX] == 0] # Avoid duplicate boundaries on all z-levels
--> 192 geo_df.geometry = geo_df.geometry.map(lambda x: x.geoms[0]) # The MultiPolygons contain only one polygon
193 geo_df.index = geo_df[MerscopeKeys.METADATA_CELL_KEY].astype(str)
195 polygons = ShapesModel.parse(geo_df, transformations={"microns": Identity()})

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/SpatialData/lib/python3.10/site-packages/shapely/geometry/base.py:997, in GeometrySequence.getitem(self, key)
995 if isinstance(key, (int, np.integer)):
996 if key + m < 0 or key >= m:
--> 997 raise IndexError("index out of range")
998 if key < 0:
999 i = m + key

IndexError: index out of range

@LucaMarconato
Copy link
Member

Hi @mkunst23, is the data that you are using public? If not, would you be able please to make a reproducible example with mock data or by subsetting and perturbing your original data? Happy to look into this, thank you!

@mkunst23
Copy link
Author

mkunst23 commented Oct 26, 2023

Hi @LucaMarconato, yes the data is available on BIL (https://doi.brainimagelibrary.org/doi/10.35077/g.610).

But in the meantime I did find a fix for that problem by adding a line to make sure the index of the cell by gene table and the metadata match
data = pd.read_csv(count_path, index_col=0, dtype={MerscopeKeys.COUNTS_CELL_KEY: str})
obs = pd.read_csv(obs_path, index_col=0, dtype={MerscopeKeys.METADATA_CELL_KEY: str})
obs = obs.reindex(data.index)

I can do a PR later

UPDATE, this is more related to scverse/spatialdata#385 and only shows up when I use an older version of MERSCOPE segmentation

@LucaMarconato
Copy link
Member

Hi, thanks for the update. Yes please a PR would be helpful, either in spatialdata or spatialdata-io, whichever you find it most appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants