Failed to reproduce the DLPFC analysis results #13

Junjun1919 · 2023-08-10T15:34:19Z

Hi,
Thank you for this great work!
While attempting to reproduce your DLPFC results, I encountered an issue. The code provided at https://feiyoung.github.io/PRECAST/articles/PRECAST.DLPFC.html includes code only for single-sample analysis. The code available at https://github.com/feiyoung/PRECAST_Analysis/blob/main/Real_data_analysis/dorsolateral_prefrontal_cortex.R is not encapsulated and lacks input files of the result of SPARK method.

I conducted a joint analysis of 12 DLPFC data slices, adhering to the example code and parameters from the Human Breast Cancer Data Analysis. I adjusted parameters including selectGenesMethod, gene.number, postmin.spots (set to 1 or 15), and postmin.features (set to 10 or 15). However, the resulting ARI is significantly low and inconsistent with Figure 2.c and Figure S21.d.

Here are the results with postmin.spots = 15, postmin.features = 15, and gene.number = 2000:

For selectGenesMethod = "SPARK-X":

For selectGenesMethod = "HVGs":

Could you kindly share the code used to generate the results in Figure 2?

The text was updated successfully, but these errors were encountered:

feiyoung · 2023-08-11T14:05:40Z

We have updated the file https://github.com/feiyoung/PRECAST_Analysis/blob/main/Real_data_analysis/dorsolateral_prefrontal_cortex.R. Now user can run the code locally and the required data will be downloaded automatically. We have run the code again and produce the results of ARIs as follows:

Junjun1919 · 2023-08-16T13:31:54Z

Thanks a lot! I've successfully reproduced the results!

Junjun1919 · 2023-08-17T05:14:28Z

Sorry to bother you again. Is there any fundamental difference in principle between code in the tutorial and in the GitHub PRECAST_Analysis package? While I can achieve relatively better results using the SPARK results and the code from the GitHub PRECAST_Analysis package, the results are not favorable (median ARI = 0.37) when I extract the top 2000 genes list based on the SPARK results with the code provided by PRECAST_Analysis and use CreatePRECASTObject(customGenelist=genelist).

The complete code is as follows:

gene = read.csv('./PRECAST/brain12_genelist2000.csv')
set.seed(2022)
PRECASTObj <- CreatePRECASTObject(seuList, project = "DLPFC", gene.number = 2000, 
                                  premin.spots = 20, premin.features = 20, postmin.spots = 15, postmin.features = 15,
                                 customGenelist = gene$x)
PRECASTObj <- AddAdjList(PRECASTObj, platform = "Visium")
PRECASTObj <- AddParSetting(PRECASTObj, Sigma_equal = FALSE, verbose = TRUE, int.model = NULL, maxIter = 30)
PRECASTObj <- PRECAST(PRECASTObj, K = 10)
resList <- PRECASTObj@resList
PRECASTObj <- SelectModel(PRECASTObj)
for(i in 1:12){
    print(mclust::adjustedRandIndex(PRECASTObj@resList$cluster[[i]], PRECASTObj@seulist[[i]]$layer_guess_reordered))
}

The results of ARIs:

Why is this happening?

feiyoung · 2023-08-18T08:41:21Z

All the high-level functions CreatePRECASTObject() and PRECAST() were developed subsequent to the completion of the paper, to make convenient use for users. Thus, we don't directly use the high-level functions to analyze the datasets in paper. At present, the reason behind this remains unclear. Our conjecture is that it could possibly be attributed to the impact of the initialization process. You can try making the parameters settings in AddParSetting() functioin smae as PRECAST_Analysis repository, including setting int.model as defualt.

Junjun1919 · 2023-08-18T10:48:04Z

Thank you for your response. I will try as you suggested!

Junjun1919 · 2023-08-22T08:04:26Z

Hello, I tried setting int.model = 'EEE' as default but still obtained the same results as before. Are there any other possible
reasons and solutions for this?

feiyoung · 2023-08-23T07:01:50Z

We will check why this happened. Thank you for your feedback!

Junjun1919 · 2023-08-23T12:35:38Z

OK, thanks!

feiyoung · 2023-08-24T09:21:03Z

Hi, you can try this setting:
PRECASTObj <- CreatePRECASTObject(seuList, project = "DLPFC", gene.number = 2000,
premin.spots = 0, premin.features = 0, postmin.spots = 0, postmin.features = 0,
customGenelist = gene$x)
PRECASTObj <- AddParSetting(PRECASTObj, maxIter=30, Sigma_equal = FALSE,
verbose = TRUE, int.model = "EEE", seed = 1)

Junjun1919 · 2023-08-25T04:37:49Z

Thank you! I've got it!

tkcaccia · 2024-06-01T16:12:00Z

Hi,
I was following this chat because I had issues with repeating some of the code in GitHub.
As you suggested, I wrote this code:

set.seed(2022)
PRECASTObj <- CreatePRECASTObject(seuList, project = "DLPFC", gene.number = 2000,
                                  premin.spots = 0, premin.features = 0, postmin.spots = 0, postmin.features = 0)
PRECASTObj <- AddParSetting(PRECASTObj, maxIter=30, Sigma_equal = FALSE,
                            verbose = TRUE, int.model = "EEE", seed = 1)

PRECASTObj <- AddAdjList(PRECASTObj, platform = "Visium")
PRECASTObj <- PRECAST(PRECASTObj, K = 7)
resList <- PRECASTObj@resList
PRECASTObj <- SelectModel(PRECASTObj)

for(i in 1:12){
  print(mclust::adjustedRandIndex(PRECASTObj@resList$cluster[[i]], PRECASTObj@seulist[[i]]$layer_guess_reordered))
}

The ARI reported have high values. If I add the genelist identified with your code, the values drop considerevoly

[1] 0.5247836 [1] 0.5408016 [1] 0.5039616 [1] 0.5660206 [1] 0.4431008 [1] 0.3924656 [1] 0.6090096 [1] 0.5974283 [1] 0.5317708 [1] 0.4941678 [1] 0.5215401 [1] 0.5304256

In addition, I noted a problem of aggressive correction. In the second subject (slides 151669, 151670, 151671, 151672), the grey matter is clusterized together the Layer 6.

Is there a parameter setting to avoid aggressive clustering?

feiyoung · 2024-06-03T07:07:08Z

Hi, I was following this chat because I had issues with repeating some of the code in GitHub. As you suggested, I wrote this code:
set.seed(2022)
PRECASTObj <- CreatePRECASTObject(seuList, project = "DLPFC", gene.number = 2000,
                                  premin.spots = 0, premin.features = 0, postmin.spots = 0, postmin.features = 0)
PRECASTObj <- AddParSetting(PRECASTObj, maxIter=30, Sigma_equal = FALSE,
                            verbose = TRUE, int.model = "EEE", seed = 1)

PRECASTObj <- AddAdjList(PRECASTObj, platform = "Visium")
PRECASTObj <- PRECAST(PRECASTObj, K = 7)
resList <- PRECASTObj@resList
PRECASTObj <- SelectModel(PRECASTObj)

for(i in 1:12){
  print(mclust::adjustedRandIndex(PRECASTObj@resList$cluster[[i]], PRECASTObj@seulist[[i]]$layer_guess_reordered))
}
The ARI reported have high values. If I add the genelist identified with your code, the values drop considerevoly

[1] 0.5247836 [1] 0.5408016 [1] 0.5039616 [1] 0.5660206 [1] 0.4431008 [1] 0.3924656 [1] 0.6090096 [1] 0.5974283 [1] 0.5317708 [1] 0.4941678 [1] 0.5215401 [1] 0.5304256

In addition, I noted a problem of aggressive correction. In the second subject (slides 151669, 151670, 151671, 151672), the grey matter is clusterized together the Layer 6.

Is there a parameter setting to avoid aggressive clustering?

Prior to the development of this package, we relied on a set of low-level functions to execute operations on the genelist. The reproducible code that demonstrates this process is available at:

https://github.com/feiyoung/PRECAST_Analysis/blob/main/Real_data_analysis/dorsolateral_prefrontal_cortex.R

In addressing the issue where grey matter clusters together with Layer 6, you may consider increasing the number of clusters to achieve a more accurate segregation.

Junjun1919 closed this as completed Aug 10, 2023

Junjun1919 reopened this Aug 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Failed to reproduce the DLPFC analysis results #13

Failed to reproduce the DLPFC analysis results #13

Junjun1919 commented Aug 10, 2023 •

edited

Loading

feiyoung commented Aug 11, 2023 •

edited

Loading

Junjun1919 commented Aug 16, 2023

Junjun1919 commented Aug 17, 2023 •

edited

Loading

feiyoung commented Aug 18, 2023

Junjun1919 commented Aug 18, 2023

Junjun1919 commented Aug 22, 2023

feiyoung commented Aug 23, 2023

Junjun1919 commented Aug 23, 2023

feiyoung commented Aug 24, 2023

Junjun1919 commented Aug 25, 2023

tkcaccia commented Jun 1, 2024

feiyoung commented Jun 3, 2024

Failed to reproduce the DLPFC analysis results #13

Failed to reproduce the DLPFC analysis results #13

Comments

Junjun1919 commented Aug 10, 2023 • edited Loading

feiyoung commented Aug 11, 2023 • edited Loading

Junjun1919 commented Aug 16, 2023

Junjun1919 commented Aug 17, 2023 • edited Loading

feiyoung commented Aug 18, 2023

Junjun1919 commented Aug 18, 2023

Junjun1919 commented Aug 22, 2023

feiyoung commented Aug 23, 2023

Junjun1919 commented Aug 23, 2023

feiyoung commented Aug 24, 2023

Junjun1919 commented Aug 25, 2023

tkcaccia commented Jun 1, 2024

feiyoung commented Jun 3, 2024

Junjun1919 commented Aug 10, 2023 •

edited

Loading

feiyoung commented Aug 11, 2023 •

edited

Loading

Junjun1919 commented Aug 17, 2023 •

edited

Loading