Multiple-Sample Integration for filtering cell ID based off Seurat #9

cfayx1996 · 2021-03-27T20:01:45Z

Hello,

Thank you for the detailed instructions; they are very helpful. I am rather new to Python and am having a hard time filtering the loom files to match my Seurat object. My Seurat object consists of 3 individual samples that are integrated together, and I have three separate loom files that were made using Velocyto. I have followed all the instructions in your tutorial up to the filtering step for the loom files. After reading in the CSV files for the cell IDs, UMAP, and cluster IDs, I moved on to the Multiple-Sample Integration step, since my cellID_obs file combines 3 samples just like your example table. I use the code:

cellID_obs_sample_one = cellID_obs[cellID_obs_sample_one[0].str.contrains("sample1_")]
cellID_obs_sample_two = cellID_obs[cellID_obs_sample_two[0].str.contrains("sample2_")]
cellID_obs_sample_three = cellID_obs[cellID_obs_sample_three[0].str.contrains("sample3_")]

sample_one = sample_one[np.isin(sample_one.obs.index, cellID_obs_sample_one)]
sample_two = sample_one[np.isin(sample_two.obs.index, cellID_obs_sample_two)]
sample_two = sample_one[np.isin(sample_two.obs.index, cellID_obs_sample_two)]

When I run the first line it errors out with:

cellID_obs_sample_one = sample_obs[cellID_obs_sample_one[0].str.contrains("sample1_")]
Traceback (most recent call last):
File "", line 1, in
NameError: name 'cellID_obs_sample_one' is not defined

If I split the cellID_obs from Seurat into 3 separate lists, one per sample, and run it, I still get an error:

cellID_obs_sample1 = pd.read_csv("/home/cfay/Documents/cellID_obs_sample1.csv")

sample_one = sample_one[np.isin(sample_one.obs.index,cellID_obs_sample1["x"])]
cellID_obs_sample2 = pd.read_csv("/home/cfay/Documents/cellID_obs_sample2csv")
sample_two = sample_two[np.isin(sample_two.obs.index,cellID_obs_sample2["x"])]
cellID_obs_sample3 = pd.read_csv("/home/cfay/Documents/cellID_obs_sample3.csv")
sample_three = sample_three[np.isin(sample_three.obs.index,cellID_obs_sample3["x"])]
sample_one = sample_one.concatenate(sample_two, sample_three)
Traceback (most recent call last):
File "", line 1, in
File "/home/cfay/anaconda3/lib/python3.8/site-packages/anndata/_core/anndata.py", line 1710, in concatenate
out.obs = concat(
File "/home/cfay/anaconda3/lib/python3.8/site-packages/anndata/_core/anndata.py", line 834, in obs
self._set_dim_df(value, "obs")
File "/home/cfay/anaconda3/lib/python3.8/site-packages/anndata/_core/anndata.py", line 783, in _set_dim_df
value_idx = self._prep_dim_index(value.index, attr)
File "/home/cfay/anaconda3/lib/python3.8/site-packages/anndata/_core/anndata.py", line 810, in _prep_dim_index
value[0], (str, bytes)
File "/home/cfay/anaconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 4101, in getitem
return getitem(key)
IndexError: index 0 is out of bounds for axis 0 with size 0

I figure I am doing some part of this wrong and wanted to know if you could help me pinpoint the issue, as I want to calculate RNA velocity and use my Seurat UMAP.
Thank you for your help and consideration!
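
A note on the IndexError above: "index 0 is out of bounds for axis 0 with size 0" suggests that at least one filtered object ended up with zero cells, which would happen if none of the loom cell names matched the Seurat IDs. The sketch below only illustrates how to count matches before filtering. The ID formats are assumptions (Velocyto-style names such as "sample1:AAACCTGAGTx" versus Seurat-style "sample1_AAACCTGAGT"), and count_matches and normalize are hypothetical helpers, not part of the tutorial.

```python
import numpy as np

def count_matches(loom_ids, seurat_ids):
    """Return how many loom cell names also appear in the Seurat ID list."""
    return int(np.isin(np.asarray(loom_ids), np.asarray(seurat_ids)).sum())

def normalize(loom_id, prefix):
    """Hypothetical conversion: 'name:BARCODEx' -> 'prefix' + 'BARCODE'."""
    barcode = loom_id.split(":")[1].rstrip("x")
    return prefix + barcode

# Made-up IDs for illustration only; real data will differ.
loom_ids = ["sample1:AAACCTGAGTx", "sample1:AAACGGGTCAx"]
seurat_ids = ["sample1_AAACCTGAGT", "sample1_AAACGGGTCA"]

# Zero matches here means np.isin would filter out every cell.
print(count_matches(loom_ids, seurat_ids))

# After bringing both sides to the same format, the IDs line up.
normalized = [normalize(i, "sample1_") for i in loom_ids]
print(count_matches(normalized, seurat_ids))
```

If the first count is 0 on real data, comparing a few entries of sample_one.obs.index with a few rows of the CSV should show how the two formats differ.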

sweebinee · 2021-04-26T07:58:56Z

hi @cfayx1996,
I'm a user like you, but I think I can help you.

I think you need to check your cell IDs first, especially their pattern.

cellID_obs_sample_one = cellID_obs[cellID_obs_sample_one[0].str.contrains("sample1_")]

In this line, the pandas method str.contains() looks for the given string pattern ("sample1_") in the object in front of it (cellID_obs_sample_one[0]).
It's possible that your cell IDs do not follow the "sampleX_" pattern.

And I think you need to modify the code like this, so that the mask is built from cellID_obs, which already exists when the line runs:
cellID_obs_sample_one = cellID_obs[cellID_obs[0].str.contains("sample1_")]

I believe the method name should be contains, not contrains. @basilkhuder
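
As a minimal sketch of the per-sample split described above: the example assumes the IDs sit in a column named "x" (as in the read_csv attempt earlier in the thread) and use a "sampleN_" prefix. The barcodes are made up, and ids_for_sample is a hypothetical helper rather than something from the tutorial.

```python
import pandas as pd

# Made-up barcodes standing in for the cell ID CSV exported from Seurat.
cellID_obs = pd.DataFrame({"x": [
    "sample1_AAACCTGAGT", "sample1_AAACGGGTCA",
    "sample2_AAAGATGCAC", "sample3_AACTCAGGTT",
]})

def ids_for_sample(cell_ids, prefix, column="x"):
    """Return the IDs in `column` that contain `prefix` (literal match)."""
    # The mask is built from the existing DataFrame, not from the
    # variable that is being assigned.
    mask = cell_ids[column].str.contains(prefix, regex=False)
    return cell_ids.loc[mask, column]

cellID_obs_sample_one = ids_for_sample(cellID_obs, "sample1_")
cellID_obs_sample_two = ids_for_sample(cellID_obs, "sample2_")
cellID_obs_sample_three = ids_for_sample(cellID_obs, "sample3_")

print(len(cellID_obs_sample_one), len(cellID_obs_sample_two),
      len(cellID_obs_sample_three))
```

If the CSV was read with header=None, the column label would be the integer 0 instead of "x", which is presumably what the [0] in the original line refers to.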

AAA-3 · 2021-08-13T13:01:59Z

Hello! I tried this solution (see #13) but it did not work for me and produced a long traceback. @cfayx1996 did you have any luck?

cfayx1996 · 2021-08-13T15:36:33Z

Hi @AAA-3,

I presume you are trying to filter your data for RNA Velocity?

I tried to use this tutorial for sorting in Python, but found it was a lot easier to sort and create the object in R, since I was analyzing the data with Seurat v4.

If you are using R and Seurat I would be happy to share what I did if that would help!

AAA-3 · 2021-08-13T16:22:13Z

Hi @cfayx1996 Yes I am :) I’d be happy to try your method out as well!! You can email or message through the forum, whichever is convenient: Ali.a.ali@fau.de

Marc-Benoit · 2021-11-18T17:49:38Z

Hi @cfayx1996 I am having similar trouble - is there a solution using R you could post here? Thank you!

SimoniMD · 2022-03-29T19:48:04Z

Hi! Would you be able to share this with me, too? michael.simoni@pennmedicine.upenn.edu if you'd like to email. Thanks!

AAA-3 mentioned this issue Aug 10, 2021

Variable names are not unique and 'cellID_obs' is not defined #13

Closed
