Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to reproduce the DLPFC analysis results #13

Open
Junjun1919 opened this issue Aug 10, 2023 · 12 comments
Open

Failed to reproduce the DLPFC analysis results #13

Junjun1919 opened this issue Aug 10, 2023 · 12 comments

Comments

@Junjun1919
Copy link

Junjun1919 commented Aug 10, 2023

Hi,
Thank you for this great work!
While attempting to reproduce your DLPFC results, I encountered an issue. The code provided at https://feiyoung.github.io/PRECAST/articles/PRECAST.DLPFC.html includes code only for single-sample analysis. The code available at https://github.com/feiyoung/PRECAST_Analysis/blob/main/Real_data_analysis/dorsolateral_prefrontal_cortex.R is not encapsulated and lacks input files of the result of SPARK method.

I conducted a joint analysis of 12 DLPFC data slices, adhering to the example code and parameters from the Human Breast Cancer Data Analysis. I adjusted parameters including selectGenesMethod, gene.number, postmin.spots (set to 1 or 15), and postmin.features (set to 10 or 15). However, the resulting ARI is significantly low and inconsistent with Figure 2.c and Figure S21.d.

Here are the results with postmin.spots = 15, postmin.features = 15, and gene.number = 2000:

For selectGenesMethod = "SPARK-X":
SPARK-X

For selectGenesMethod = "HVGs":
HVGs

Could you kindly share the code used to generate the results in Figure 2?

@feiyoung
Copy link
Owner

feiyoung commented Aug 11, 2023

We have updated the file https://github.com/feiyoung/PRECAST_Analysis/blob/main/Real_data_analysis/dorsolateral_prefrontal_cortex.R. Now user can run the code locally and the required data will be downloaded automatically. We have run the code again and produce the results of ARIs as follows:
image

@Junjun1919
Copy link
Author

Thanks a lot! I've successfully reproduced the results!

@Junjun1919
Copy link
Author

Junjun1919 commented Aug 17, 2023

Sorry to bother you again. Is there any fundamental difference in principle between code in the tutorial and in the GitHub PRECAST_Analysis package? While I can achieve relatively better results using the SPARK results and the code from the GitHub PRECAST_Analysis package, the results are not favorable (median ARI = 0.37) when I extract the top 2000 genes list based on the SPARK results with the code provided by PRECAST_Analysis and use CreatePRECASTObject(customGenelist=genelist).

The complete code is as follows:

gene = read.csv('./PRECAST/brain12_genelist2000.csv')
set.seed(2022)
PRECASTObj <- CreatePRECASTObject(seuList, project = "DLPFC", gene.number = 2000, 
                                  premin.spots = 20, premin.features = 20, postmin.spots = 15, postmin.features = 15,
                                 customGenelist = gene$x)
PRECASTObj <- AddAdjList(PRECASTObj, platform = "Visium")
PRECASTObj <- AddParSetting(PRECASTObj, Sigma_equal = FALSE, verbose = TRUE, int.model = NULL, maxIter = 30)
PRECASTObj <- PRECAST(PRECASTObj, K = 10)
resList <- PRECASTObj@resList
PRECASTObj <- SelectModel(PRECASTObj)
for(i in 1:12){
    print(mclust::adjustedRandIndex(PRECASTObj@resList$cluster[[i]], PRECASTObj@seulist[[i]]$layer_guess_reordered))
}

The results of ARIs:
ARI

Why is this happening?

@feiyoung
Copy link
Owner

All the high-level functions CreatePRECASTObject() and PRECAST() were developed subsequent to the completion of the paper, to make convenient use for users. Thus, we don't directly use the high-level functions to analyze the datasets in paper. At present, the reason behind this remains unclear. Our conjecture is that it could possibly be attributed to the impact of the initialization process. You can try making the parameters settings in AddParSetting() functioin smae as PRECAST_Analysis repository, including setting int.model as defualt.

@Junjun1919
Copy link
Author

Thank you for your response. I will try as you suggested!

@Junjun1919
Copy link
Author

Hello, I tried setting int.model = 'EEE' as default but still obtained the same results as before. Are there any other possible
reasons and solutions for this?

@feiyoung
Copy link
Owner

We will check why this happened. Thank you for your feedback!

@Junjun1919
Copy link
Author

OK, thanks!

@feiyoung
Copy link
Owner

Hi, you can try this setting:
PRECASTObj <- CreatePRECASTObject(seuList, project = "DLPFC", gene.number = 2000,
premin.spots = 0, premin.features = 0, postmin.spots = 0, postmin.features = 0,
customGenelist = gene$x)
PRECASTObj <- AddParSetting(PRECASTObj, maxIter=30, Sigma_equal = FALSE,
verbose = TRUE, int.model = "EEE", seed = 1)

@Junjun1919
Copy link
Author

Thank you! I've got it!

@tkcaccia
Copy link

tkcaccia commented Jun 1, 2024

Hi,
I was following this chat because I had issues with repeating some of the code in GitHub.
As you suggested, I wrote this code:

set.seed(2022)
PRECASTObj <- CreatePRECASTObject(seuList, project = "DLPFC", gene.number = 2000,
                                  premin.spots = 0, premin.features = 0, postmin.spots = 0, postmin.features = 0)
PRECASTObj <- AddParSetting(PRECASTObj, maxIter=30, Sigma_equal = FALSE,
                            verbose = TRUE, int.model = "EEE", seed = 1)

PRECASTObj <- AddAdjList(PRECASTObj, platform = "Visium")
PRECASTObj <- PRECAST(PRECASTObj, K = 7)
resList <- PRECASTObj@resList
PRECASTObj <- SelectModel(PRECASTObj)

for(i in 1:12){
  print(mclust::adjustedRandIndex(PRECASTObj@resList$cluster[[i]], PRECASTObj@seulist[[i]]$layer_guess_reordered))
}

The ARI reported have high values. If I add the genelist identified with your code, the values drop considerevoly

[1] 0.5247836 [1] 0.5408016 [1] 0.5039616 [1] 0.5660206 [1] 0.4431008 [1] 0.3924656 [1] 0.6090096 [1] 0.5974283 [1] 0.5317708 [1] 0.4941678 [1] 0.5215401 [1] 0.5304256

In addition, I noted a problem of aggressive correction. In the second subject (slides 151669, 151670, 151671, 151672), the grey matter is clusterized together the Layer 6.

image

Is there a parameter setting to avoid aggressive clustering?

@feiyoung
Copy link
Owner

feiyoung commented Jun 3, 2024

Hi, I was following this chat because I had issues with repeating some of the code in GitHub. As you suggested, I wrote this code:

set.seed(2022)
PRECASTObj <- CreatePRECASTObject(seuList, project = "DLPFC", gene.number = 2000,
                                  premin.spots = 0, premin.features = 0, postmin.spots = 0, postmin.features = 0)
PRECASTObj <- AddParSetting(PRECASTObj, maxIter=30, Sigma_equal = FALSE,
                            verbose = TRUE, int.model = "EEE", seed = 1)

PRECASTObj <- AddAdjList(PRECASTObj, platform = "Visium")
PRECASTObj <- PRECAST(PRECASTObj, K = 7)
resList <- PRECASTObj@resList
PRECASTObj <- SelectModel(PRECASTObj)

for(i in 1:12){
  print(mclust::adjustedRandIndex(PRECASTObj@resList$cluster[[i]], PRECASTObj@seulist[[i]]$layer_guess_reordered))
}

The ARI reported have high values. If I add the genelist identified with your code, the values drop considerevoly

[1] 0.5247836 [1] 0.5408016 [1] 0.5039616 [1] 0.5660206 [1] 0.4431008 [1] 0.3924656 [1] 0.6090096 [1] 0.5974283 [1] 0.5317708 [1] 0.4941678 [1] 0.5215401 [1] 0.5304256

In addition, I noted a problem of aggressive correction. In the second subject (slides 151669, 151670, 151671, 151672), the grey matter is clusterized together the Layer 6.

image

Is there a parameter setting to avoid aggressive clustering?

Prior to the development of this package, we relied on a set of low-level functions to execute operations on the genelist. The reproducible code that demonstrates this process is available at:

https://github.com/feiyoung/PRECAST_Analysis/blob/main/Real_data_analysis/dorsolateral_prefrontal_cortex.R

In addressing the issue where grey matter clusters together with Layer 6, you may consider increasing the number of clusters to achieve a more accurate segregation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants