bug loading vizgen data #673

Open
mkunst23 opened this issue Mar 24, 2023 · 12 comments

@mkunst23

mkunst23 commented Mar 24, 2023

Description

I'm having trouble loading Vizgen data with sq.read.vizgen(). The function expects 8 columns in the metadata file, but mine has 9.

Minimal reproducible example

adata = sq.read.vizgen(
    path=data_path,
    counts_file=os.path.join(data_path,section,file_path,cbg_file),
    meta_file=os.path.join(data_path,section,file_path,meta_file),
    transformation_file=os.path.join(data_path,section,'region_0/images/micron_to_mosaic_pixel_transform.csv'),
)

Traceback

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [54], in <cell line: 1>()
----> 1 adata = sq.read.vizgen(
      2     path=data_path,
      3     counts_file=os.path.join(data_path,section,file_path,cbg_file),
      4     meta_file=os.path.join(data_path,section,file_path,meta_file),
      5     transformation_file=os.path.join(data_path,section,'region_0/images/micron_to_mosaic_pixel_transform.csv'),
      6 )

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/scvi-tool/lib/python3.9/site-packages/squidpy/read/_read.py:146, in vizgen(path, counts_file, meta_file, transformation_file, library_id, **kwargs)
    144 # fmt: off
    145 coords = pd.read_csv(path / meta_file, header=0, index_col=0)
--> 146 coords.columns = ["fov", "volume", "center_x", "center_y", "min_x", "max_x", "min_y", "max_y"]
    147 # fmt: on
    149 adata.obs = pd.merge(adata.obs, coords, how="left", left_index=True, right_index=True)

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/scvi-tool/lib/python3.9/site-packages/pandas/core/generic.py:5588, in NDFrame.__setattr__(self, name, value)
   5586 try:
   5587     object.__getattribute__(self, name)
-> 5588     return object.__setattr__(self, name, value)
   5589 except AttributeError:
   5590     pass

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/scvi-tool/lib/python3.9/site-packages/pandas/_libs/properties.pyx:70, in pandas._libs.properties.AxisProperty.__set__()

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/scvi-tool/lib/python3.9/site-packages/pandas/core/generic.py:769, in NDFrame._set_axis(self, axis, labels)
    767 def _set_axis(self, axis: int, labels: Index) -> None:
    768     labels = ensure_index(labels)
--> 769     self._mgr.set_axis(axis, labels)
    770     self._clear_item_cache()

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/scvi-tool/lib/python3.9/site-packages/pandas/core/internals/managers.py:214, in BaseBlockManager.set_axis(self, axis, new_labels)
    212 def set_axis(self, axis: int, new_labels: Index) -> None:
    213     # Caller is responsible for ensuring we have an Index object.
--> 214     self._validate_set_axis(axis, new_labels)
    215     self.axes[axis] = new_labels

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/scvi-tool/lib/python3.9/site-packages/pandas/core/internals/base.py:69, in DataManager._validate_set_axis(self, axis, new_labels)
     66     pass
     68 elif new_len != old_len:
---> 69     raise ValueError(
     70         f"Length mismatch: Expected axis has {old_len} elements, new "
     71         f"values have {new_len} elements"
     72     )

ValueError: Length mismatch: Expected axis has 9 elements, new values have 8 elements

Version

'1.2.2'

...

@andrewjkwok

I'm experiencing something similar, except it's 16 elements rather than 9:

ValueError: Length mismatch: Expected axis has 16 elements, new values have 8 elements

@michalk8
Collaborator

Hi @mkunst23 and @andrewjkwok, this should have been fixed in #648; installing squidpy from main should resolve it.
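
For readers hitting the same traceback, here is a minimal sketch of why the length mismatch occurs and one tolerant alternative. It is not necessarily the change made in #648. The sample CSV, the `extra` column, and the assumption that the metadata header already carries these eight names are all hypothetical.

```python
import io

import pandas as pd

# The eight names hard-coded in the traceback above.
EXPECTED = ["fov", "volume", "center_x", "center_y", "min_x", "max_x", "min_y", "max_y"]

# Hypothetical metadata with one additional column ("extra").
csv_text = (
    ",fov,volume,center_x,center_y,min_x,max_x,min_y,max_y,extra\n"
    "101,0,532.1,10.5,20.5,8.0,13.0,18.0,23.0,1\n"
    "102,0,410.7,30.0,40.0,27.5,32.5,37.5,42.5,0\n"
)
coords = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)

# Positional renaming fails as soon as the column count differs.
try:
    coords.columns = EXPECTED
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Selecting by name tolerates additional columns (assumes the header
# already uses these names, which may not hold for every export).
missing = [c for c in EXPECTED if c not in coords.columns]
if missing:
    raise KeyError(f"metadata file lacks expected columns: {missing}")
selected = coords[EXPECTED]
```

Selecting by name rather than overwriting `coords.columns` means files with 9 or 16 columns load the same way, as long as the expected names are present.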

@andrewjkwok

@michalk8 Thanks for pointing us in the right direction - it works correctly now!

@andrewjkwok

Sorry, it seems it isn't entirely working yet. When I read the data in, my obs dataframe is all NaNs, but when I check my cell_metadata.csv file, it looks populated with the various cell coordinates etc. The result is that I have no spatial coordinates to plot.

A second quick thing: previous merscope outputs gave the cell coordinates in a set of hdf5 files, but since the merscope software update to v232 they come as a single parquet file instead. Does squidpy need this info at all? I can't find anywhere in the squidpy documentation that uses this file, and I'm wondering whether it would help with the missing spatial coordinates.

@giovp
Member

giovp commented Apr 3, 2023

hey @andrewjkwok ,

thanks for reporting this, I'm afraid it's a bit tricky to help out without the data available. Could you share the data download so we can test it out? thanks!

@andrewjkwok

@giovp Yes very happy to. Is there an email I could share a google drive link to? Many thanks in advance.

@andrewjkwok

Sorry, just a quick follow-up @giovp @michalk8: I was wondering if there is somewhere I could share my data with your team so you can take a look?

@giovp
Member

giovp commented Apr 11, 2023

@andrewjkwok any chance you could point us to some public data? for example, some data shared by vizgen?

@andrewjkwok

andrewjkwok commented Apr 11, 2023

@giovp hmm the problem is that the cell metadata file from my MERSCOPE output (running their latest v232 software) doesn't look the same as the ones that are on vizgen's website.

So if I go to the squidpy website and follow the tutorial (https://squidpy.readthedocs.io/en/stable/external_tutorials/tutorial_vizgen.html) for the data download (https://info.vizgen.com/mouse-brain-map?submissionGuid=a66ccb7f-87cf-4c55-83b9-5a2b6c0c12b9), the cell_metadata.csv file doesn't look the same as the one from my merscope.

I've attached a truncated version of my cell metadata file for reference.

Vizgen website data:
datasets_mouse_brain_map_BrainReceptorShowcase_Slice1_Replicate1_cell_metadata_S1R1.csv

Output from my merscope:
cell_metadata_truncated.csv

@giovp
Member

giovp commented Apr 12, 2023

hi @andrewjkwok I am unfortunately unable to look into this in the next two weeks, thanks for sharing the data, I'll get back to you soon

@andrewjkwok

Hi - just wanted to quickly check whether there was any progress with this?

@dfhannum
Contributor

dfhannum commented Jun 9, 2023

There was an issue with indexing but installing squidpy from main should fix the metadata not populating.

The spatial coordinates are populated from center_x and center_y in the metadata. The sq.read.vizgen function doesn't use the cell segmentation output, in either the older hdf5 or the newer parquet format.
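
The thread does not say what the indexing issue was. As a rough sketch of one way an index problem can produce an all-NaN obs table, the example below assumes the cell IDs are strings on one side and integers on the other. The tables, cell IDs, and the `n_counts` column are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical observation table indexed by cell IDs stored as strings.
obs = pd.DataFrame({"n_counts": [120, 95]}, index=pd.Index(["101", "102"], name="cell"))

# Hypothetical metadata; read_csv parses the ID column as integers.
csv_text = (
    ",center_x,center_y\n"
    "101,10.5,20.5\n"
    "102,30.0,40.0\n"
)
coords = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)

# String labels never match integer labels, so the left merge keeps every
# row of obs but fills the metadata columns with NaN.
mismatched = pd.merge(obs, coords, how="left", left_index=True, right_index=True)

# Casting the metadata index to str aligns the two tables.
coords.index = coords.index.astype(str)
aligned = pd.merge(obs, coords, how="left", left_index=True, right_index=True)

# Spatial coordinates as an (n_cells, 2) array built from center_x/center_y.
spatial = aligned[["center_x", "center_y"]].to_numpy()
```

Comparing `adata.obs.index.dtype` with the dtype of the metadata index is a quick check for this kind of mismatch.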
