Error: "array_row" not found during SME #219

bmlett · 2022-12-28T18:31:43Z

Good morning. I get the below error when attempting to run the st.spatial.SME.SME_normalize command.

I used create_stlearn to import the data: the count matrix comes from counts(spe) on a SpatialExperiment object, and the spatial data from as.data.frame(spatialCoords(spe)). I also tried adding the array_col and array_row columns from my SpatialExperiment object to the spatial coordinates file, assuming that was the issue, but I still get the above error.

Thanks

duypham2108 · 2022-12-30T01:56:39Z

Can you write the anndata object and send me?

adata.write_h5ad("adata_object.h5ad")

Anyways, it's holiday now but I will check it ASAP.

bmlett · 2023-01-03T16:21:30Z

Hi,

Commands used to get A1_count_matrix.csv and A1_spatialCoords.csv:

count_matrix = t(counts(data))
spCoords = as.data.frame(spatialCoords(data))
array_coords = colData(data)[2:3]
spatial = cbind(spCoords, array_coords)
names(spatial)[1:2] = c("imagecol", "imagerow")

These are the commands run to import the data:

count_matrix = pd.read_csv("A1_count_matrix.csv")
xy = pd.read_csv("A1_spatialCoords.csv")
adata = st.create_stlearn(count=count_matrix, spatial=xy, library_id="A48_A1", image_path="tif/WSA_LngSP10193345.tif", scale=1, background_color="white")

The result of adata.write_h5ad("adata_object.h5ad", compression="gzip") is: adata_object.h5ad <https://uwprod-my.sharepoint.com/:u:/g/personal/blett_wisc_edu/Eajcx9wAG4FHgjbDuy4sflwBfDAeUdV7p4R6Lgq1XsLALw>

Adding the following after building the stlearn object seems to fix the issue:

adata.obs['array_row']=xy.iloc[:,2]
adata.obs['array_col']=xy.iloc[:,3]

Though if there is a better way to import the data from an R SpatialExperiment object, that would be good to learn about.

Thanks and happy holidays!
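The workaround above can be sketched with plain pandas. This is a minimal sketch, not stLearn's own import path: `obs` stands in for `adata.obs`, the barcodes and coordinate values are made up, and the helper name `attach_array_coords` is hypothetical. It assumes the coordinates table has columns named `array_row` and `array_col`. Assigning NumPy arrays rather than Series avoids pandas aligning on mismatched index labels, which would otherwise fill the new columns with NaN.

```python
import pandas as pd


def attach_array_coords(obs, xy):
    """Copy array_row/array_col from xy into obs by position, not by index label."""
    if len(obs) != len(xy):
        raise ValueError("obs and xy must describe the same number of spots")
    out = obs.copy()
    for col in ("array_row", "array_col"):
        # .to_numpy() drops the index, so rows are matched purely by order
        out[col] = xy[col].to_numpy()
    return out


# Hypothetical stand-ins: obs plays the role of adata.obs, xy the coordinates CSV
obs = pd.DataFrame(index=["spot_a", "spot_b", "spot_c"])
xy = pd.DataFrame(
    {
        "imagecol": [100.0, 150.0, 200.0],
        "imagerow": [80.0, 80.0, 130.0],
        "array_row": [0, 0, 1],
        "array_col": [0, 2, 1],
    }
)
obs = attach_array_coords(obs, xy)
```

With a real object the last step would be `adata.obs = attach_array_coords(adata.obs, xy)`, provided the rows of `xy` are in the same order as the spots in `adata`.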

duypham2108 · 2023-01-04T01:00:56Z

Good to know this issue is solved. We will try to make a function to convert from R objects like SpatialExperiment or SeuratObject in the near future. Thanks for the suggestion.

bmlett · 2023-01-07T17:10:02Z

There seem to be additional issues with using create_stlearn for downstream functions. When I run st.spatial.trajectory.pseudotime on the above data, it hits an error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/ua/blett/anaconda3/envs/stlearn/lib/python3.8/site-packages/stlearn/spatials/trajectory/pseudotime.py", line 158, in pseudotime
    adata.uns["global_graph"]["graph"] = nx.to_scipy_sparse_array(G)
  File "/ua/blett/anaconda3/envs/stlearn/lib/python3.8/site-packages/networkx/convert_matrix.py", line 880, in to_scipy_sparse_array
    raise nx.NetworkXError("Graph has no nodes or edges")
Graph has no nodes or edges

However, when I use Read10x this error does not happen. I compared the two anndata objects and noticed that they store different values in the .uns['spatial'] section. I am not sure of a good way to fix this, as there are QC steps in R I wish to run before performing clustering. Thanks for developing this tool. I am hoping something comes from this thread soon: theislab/zellkonverter#61, to create a nice way to convert between the formats, similar to what exists for single-cell data.

duypham2108 · 2023-01-08T03:14:16Z

In this step, there is a parameter that defines the distance between adjacent nodes: eps (based on DBSCAN). Visium data is ~2000x2000 px, so we use eps = 50 as the default here. But it depends on your data: what is the average distance between adjacent nodes? You can use that distance to specify the eps parameter here:

st.spatial.trajectory.pseudotime(data,eps=50,use_rep="X_pca",use_label="louvain")
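One way to measure the distance between adjacent nodes is the median nearest-neighbour distance of the spot coordinates. The sketch below uses only NumPy; the helper name `median_neighbor_distance` and the toy grid are made up for illustration, and it is not part of stLearn.

```python
import numpy as np


def median_neighbor_distance(coords):
    """Median distance from each spot to its nearest other spot."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances; memory grows as O(n^2), so subsample
    # or use a KD-tree for large datasets
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # ignore each spot's distance to itself
    return float(np.median(dist.min(axis=1)))


# Toy 5 x 5 grid with 120 px spacing (made-up values)
xs, ys = np.meshgrid(np.arange(5) * 120.0, np.arange(5) * 120.0)
grid = np.column_stack([xs.ravel(), ys.ravel()])
spacing = median_neighbor_distance(grid)
```

With real data, `coords` would be `adata.obs[["imagecol", "imagerow"]].to_numpy()`. If the measured spacing is well above the eps passed to pseudotime, that may be why no neighbouring spots are connected.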

bmlett · 2023-01-09T15:16:36Z

Hi -

Thanks for the reply. I tried various values for eps (2, 18, 50, 100) and all of them returned the same error for the data loaded using create_stlearn. I even tried two different 10x Genomics datasets. When I switched to the Read10x method used in the trajectory example, it worked.

I looked at the AnnData object for one of the 10x Genomics datasets when loaded using create_stlearn and Read10x. These are the first few lines of adata.uns['spatial'] for the two loading methods.

create_stlearn
OverloadedDict, wrapping: OrderedDict([( 'spatial', {'Bcancer_A1': {'images': { 'hires': array([[[188, 192, 191], [188, 192, 190], [188, 191, 188],

Read10x
OverloadedDict, wrapping: OrderedDict([( 'spatial', {'Parent_Visium_Human_BreastCancer': {'images': {'hires': array([[[0.7294118 , 0.74509805, 0.7372549 ], [0.7294118 , 0.74509805, 0.7372549 ],

The other key difference is in the scalefactors:

create_stlearn
'use_quality': 'hires', 'scalefactors': {'tissue_hires_scalef': 1, 'spot_diameter_fullres': 50}}})])

Read10x
'scalefactors': {'spot_diameter_fullres': 177.4984743134119, 'tissue_hires_scalef': 0.08250825, 'fiducial_diameter_fullres': 286.7283046601269, 'tissue_lowres_scalef': 0.024752475},

It makes sense to me that the Read10x method would have more information, since create_stlearn is only given a basic amount of information. Does the fact that the create_stlearn version stores whole numbers imply that the eps value needs to be higher?
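The scalefactors comparison above can also be done programmatically. This is a small sketch using plain dictionaries with the values copied from the two printouts; the helper name `diff_scalefactors` is hypothetical.

```python
def diff_scalefactors(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys that differ or are missing."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Values copied from the two printouts above
from_create_stlearn = {"tissue_hires_scalef": 1, "spot_diameter_fullres": 50}
from_read10x = {
    "spot_diameter_fullres": 177.4984743134119,
    "tissue_hires_scalef": 0.08250825,
    "fiducial_diameter_fullres": 286.7283046601269,
    "tissue_lowres_scalef": 0.024752475,
}

differences = diff_scalefactors(from_create_stlearn, from_read10x)
for key, (left, right) in differences.items():
    print(f"{key}: create_stlearn={left}, Read10x={right}")
```

On real objects, the two inputs would be `adata.uns["spatial"][library_id]["scalefactors"]` from each loading method, following the nesting shown in the printouts.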

duypham2108 · 2023-01-10T03:14:04Z

Basically, the difference is the scale of the spatial information. Visium data provides all the information, such as scalefactors. In the Read10X function, we store the raw spatial info in adata.obsm["spatial"], and adata.obs[["imagecol","imagerow"]] holds .obsm["spatial"] * tissue_hires_scalef (for the hires image, for example). In create_stlearn, we only store the raw spatial info, and you will see that adata.obsm["spatial"] and adata.obs[["imagecol","imagerow"]] are the same when the scale factor is 1.

In downstream analysis, we use this spatial information to construct the neighborhood array for each spot/cell, and also as the input for the local clustering (using spatial data only) with DBSCAN. The eps parameter defines how separated the clusters should be, so it depends on the scale of your spatial information. I would suggest looking at that scale to set the eps value. Another way is to use the scale parameter of the create_stlearn function. For example, set it to max(your_spatial_coordinate) / 2000 and then use a similar eps to the tutorial.
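The effect of rescaling on eps can be sketched independently of stLearn. The sketch below assumes the aim is to bring the coordinate range to roughly 2000 px, matching the Visium scale that the default eps of 50 was chosen for; the grid, the spacing, and the helper names `rescale_to_extent` and `nearest_spacing` are made up. How the resulting factor maps onto the `scale` argument of create_stlearn is not shown here, and is best checked by inspecting adata.obs[["imagecol", "imagerow"]] after import.

```python
import numpy as np


def rescale_to_extent(coords, target_extent=2000.0):
    """Scale coordinates so their largest value equals target_extent."""
    coords = np.asarray(coords, dtype=float)
    factor = target_extent / coords.max()
    return coords * factor, factor


def nearest_spacing(coords):
    """Smallest distance between any two distinct spots."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # ignore self-distances
    return float(dist.min())


# Made-up full-resolution grid: 45 x 45 spots, 300 px apart, offset by 300 px
ticks = 300.0 + np.arange(45) * 300.0
xs, ys = np.meshgrid(ticks, ticks)
raw = np.column_stack([xs.ravel(), ys.ravel()])

scaled, factor = rescale_to_extent(raw)
eps = 50
raw_within_eps = nearest_spacing(raw) <= eps        # neighbours reachable before rescaling?
scaled_within_eps = nearest_spacing(scaled) <= eps  # neighbours reachable after rescaling?
print(factor, raw_within_eps, scaled_within_eps)
```

Multiplying the coordinates by a factor multiplies every neighbour distance by the same factor, so eps and the coordinate scale always have to be chosen together.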

bmlett · 2023-01-10T19:10:30Z

Thank you for that explanation!! By adding the scale information during creation, I was able to get the data to run through st.spatial.trajectory.pseudotime. Thank you for your patience and for explaining the difference between the two methods. I do have one final follow-up question: what is the purpose of the spot_diameter_fullres argument? This is partially out of curiosity and partially because, although I provided a value for spot_diameter_fullres in create_stlearn, viewing data.uns shows the value as 50.

Again, thank you so much for answering my questions!!

duypham2108 · 2023-01-11T11:41:09Z

It will be used in the CCI prediction part if you want to calculate the distance for neighborhood spots automatically:

stLearn/stlearn/tools/microenv/cci/base.py

Line 64 in ebc9b52

def calc_distance(adata: AnnData, distance: float):

Otherwise, it's not so important. Also, it is only useful when you use Visium or any platform that has a constant distance between spots. Hope it helps.

duypham2108 closed this as completed Jan 28, 2023
