SMultiXcan error - No objects to concatenate #129

Closed
ZhenyaoYe opened this issue Jun 29, 2021 · 8 comments

@ZhenyaoYe

Dear MetaXcan team!

I am having an issue with SMultiXcan using the elastic net models.
The error I got is "INFO - Unexpected error: No objects to concatenate".

./SMulTiXcan.py --models_folder PrediXcan/elastic_net_models/ --models_name_filter "en_Brain_(.).db" --models_name_pattern "en_Brain_(.).db" --snp_covariance SMulTiXcan/gtex_v8_expression_elastic_net_snp_smultixcan_covariance.txt.gz --metaxcan_folder SMulTiXcan/metaxcan_spredixcan_folder/ --metaxcan_filter "en_Brain_(.)allchr_cpd.csv" --metaxcan_file_name_parse_pattern "en(.).csv" --gwas_folder GWASonCPD/en_gwas/ --gwas_file_pattern *.txt --snp_column SNP --effect_allele_column A1 --non_effect_allele_column A2 --beta_column BETA --pvalue_column P --output SMulTiXcan/en_output_chr/SMultixcan_allchr_cpd.csv
INFO - Creating context
INFO - Creating MetaXcan results manager
INFO - Loading genes
INFO - Context for snp covariance
INFO - Assessing GWAS-Models SNP intersection
INFO - Processing GWAS command line parameters
INFO - Unexpected error: No objects to concatenate

I used SPrediXcan to generate the metaxcan files.
For example:
./SPrediXcan.py --model_db_path PrediXcan/elastic_net_models/en_Brain_Substantia_nigra.db --covariance PrediXcan/elastic_net_models/en_Brain_Substantia_nigra.txt.gz --gwas_folder GWASonCPD/en_gwas/ --gwas_file_pattern ".*txt" --snp_column SNP --effect_allele_column A1 --non_effect_allele_column A2 --beta_column BETA --pvalue_column P --output_file SMulTiXcan/metaxcan_spredixcan_folder/en_Brain_Substantia_nigra_allchr_cpd.csv.

Is this correct? If not, what are the metaxcan files, and how do I generate them? I ask because I did not see a Metaxcan.py script in the "software" directory of the MetaXcan repository that I downloaded from GitHub.

Any suggestion is appreciated. Thank you very much for your help!

Best regards,
Zhenyao

@Fnyasimi
Collaborator

Fnyasimi commented Jul 2, 2021

@ZhenyaoYe add the arguments --throw and --verbosity 7 to your SMultiXcan command and share the log file. I would like to check what is happening.

@ZhenyaoYe
Author

ZhenyaoYe commented Jul 2, 2021 via email

@Fnyasimi
Collaborator

Fnyasimi commented Jul 2, 2021

Kindly post the log file as a comment on the GitHub issue rather than sending an email. I can't see the attachment.

@ZhenyaoYe
Author

Sorry for the inconvenience!

Please see below:
/data/mprc_data2/BrightData/zhenyao.ye/QTLsmr/PrediXcan/MetaXcan/software/SMulTiXcan.py
--models_folder /data/mprc_data2/BrightData/zhenyao.ye/QTLsmr/PrediXcan/elastic_net_models/
--models_name_pattern "en_Brain_(.).db"
--snp_covariance /data/mprc_data3/zhenyao.ye/QTLsmr/SMulTiXcan/gtex_v8_expression_elastic_net_snp_smultixcan_covariance.txt.gz
--metaxcan_folder /data/mprc_data3/zhenyao.ye/QTLsmr/SMulTiXcan/metaxcan_spredixcan_folder/
--metaxcan_filter "Allchr_cpd_en_Brain_(.).csv"
--metaxcan_file_name_parse_pattern "(.)en_Brain(.).csv"
--gwas_folder /data/mprc_data3/zhenyao.ye/QTLsmr/GWASonCPD/en_gwas/
--gwas_file_pattern *.txt
--snp_column SNP
--effect_allele_column A1
--non_effect_allele_column A2
--beta_column BETA
--pvalue_column P
--verbosity 7
--throw
--output /data/mprc_data3/zhenyao.ye/QTLsmr/SMulTiXcan/en_output_chr/Allchr_cpd_SMultixcan.txt

INFO - Creating context
INFO - Creating MetaXcan results manager
Level 9 - Building data
INFO - Loading genes
INFO - Context for snp covariance
INFO - Assessing GWAS-Models SNP intersection
INFO - Processing GWAS command line parameters
Traceback (most recent call last):
  File "/data/mprc_data2/BrightData/zhenyao.ye/QTLsmr/PrediXcan/MetaXcan/software/SMulTiXcan.py", line 93, in <module>
    run(args)
  File "/data/mprc_data2/BrightData/zhenyao.ye/QTLsmr/PrediXcan/MetaXcan/software/SMulTiXcan.py", line 22, in run
    context = CrossModelUtilities.context_from_args(args)
  File "/data/mprc_data2/BrightData/zhenyao.ye/QTLsmr/PrediXcan/MetaXcan/software/metax/cross_model/Utilities.py", line 186, in context_from_args
    intersection = GWASAndModels.gwas_model_intersection(args)
  File "/data/mprc_data2/BrightData/zhenyao.ye/QTLsmr/PrediXcan/MetaXcan/software/metax/misc/GWASAndModels.py", line 36, in gwas_model_intersection
    gwas= GWASUtilities.load_plain_gwas_from_args(args)
  File "/data/mprc_data2/BrightData/zhenyao.ye/QTLsmr/PrediXcan/MetaXcan/software/metax/gwas/Utilities.py", line 117, in load_plain_gwas_from_args
    gwas = pandas.concat(files)
  File "/data/mprc_data1/software/andaconda/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 255, in concat
    sort=sort,
  File "/data/mprc_data1/software/andaconda/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 304, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate

If you need any additional information, please feel free to let me know.

Thank you very much for your help!
Best regards,
Zhenyao
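
A note for readers who hit the same traceback: pandas.concat raises "No objects to concatenate" when it is handed an empty list, which is consistent with no GWAS files having been selected. The sketch below uses only the standard library to show how a file pattern can end up selecting nothing, or failing outright, when a shell glob such as *.txt is supplied where a regular expression is expected. The helper select_by_regex and the file names are hypothetical, and treating the pattern as a Python regular expression is an assumption based on the patterns used in this thread, not a copy of MetaXcan's code.

```python
import fnmatch
import re


def select_by_regex(names, pattern):
    """Return the names matched by a Python regular expression (hypothetical helper)."""
    regexp = re.compile(pattern)
    return sorted(n for n in names if regexp.search(n))


# Made-up file names standing in for the contents of a GWAS folder.
names = ["cpd_chr1.txt", "cpd_chr2.txt", "README.md"]

# A shell glob is not a valid regular expression: the leading "*" has nothing to repeat.
try:
    select_by_regex(names, "*.txt")
    glob_compiles_as_regex = True
except re.error:
    glob_compiles_as_regex = False

# The regex form suggested in this thread selects the two text files.
selected = select_by_regex(names, "(.*).txt")

# A pattern that matches no file yields an empty list, the situation in which
# pandas.concat would report "No objects to concatenate".
empty = select_by_regex(names, "Brain_(.*).gz")

# For comparison, glob semantics are what fnmatch implements.
globbed = fnmatch.filter(names, "*.txt")

print(glob_compiles_as_regex, selected, empty, globbed)
```

An unquoted *.txt on the command line may also be expanded by the shell before the script ever sees it, so quoting the pattern is the safer habit. Note too that the unescaped "." in "(.*).txt" matches any character; "(.*)\.txt" is the stricter form.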

@Fnyasimi
Collaborator

Fnyasimi commented Jul 5, 2021

I have noted a few issues with your command:

  1. The command is not loading the metaxcan (S-PrediXcan) results; you will need to modify the --metaxcan_filter parameter. For example, if you have the following results in your metaxcan folder:
PGZ-SCZ_Artery_Aorta.csv
PGZ-SCZ_Artery_Coronary.csv
PGZ-SCZ_Artery_Tibial.csv
PGZ-SCZ_Brain_Amygdala.csv
PGZ-SCZ_Brain_Anterior_cingulate_cortex_BA24.csv
PGZ-SCZ_Brain_Caudate_basal_ganglia.csv

and you want to select the S-PrediXcan results for brain only, your input parameters should look like this:

  • --metaxcan_filter "PGZ-SCZ_Brain_(.*).csv"
  • --metaxcan_file_name_parse_pattern "(.*)_Brain_(.*).csv"
    NB: Ensure your --metaxcan_folder path is correct, and modify your pattern so that it matches your files.
  2. The model databases are also not being loaded. If you want to load only the brain databases, update your command with:
  • --models_name_filter "en_Brain_(.*).db"
  • --models_name_pattern "en_Brain_(.*).db"
  3. The GWAS file wildcard also needs an update. If you wish to use all the GWAS files in that directory, the flag should look like this:
  • --gwas_file_pattern "(.*).txt"
    If your GWAS files are prefixed with Brain, your argument should look like this:
  • --gwas_file_pattern "Brain_(.*).gz"
    NB: All files in the folder are assumed to belong to a single study. Your GWAS files should all have the same format.
  4. Also provide the --model_db_snp_key argument. This is the key to use as the SNP ID when matching between the models and the GWAS.
    Your output log should look something like this:
INFO - Creating context
INFO - Creating MetaXcan results manager
Level 9 - Loading metaxcan /gpfs/festus/PGZ-SCZ_Brain_Amygdala.csv
Level 9 - Loading metaxcan /gpfs/festus/PGZ-SCZ_Brain_Anterior_cingulate_cortex_BA24.csv
Level 9 - Loading metaxcan /gpfs/festus/PGZ-SCZ_Brain_Caudate_basal_ganglia.csv
Level 9 - Loading metaxcan /gpfs/festus/PGZ-SCZ_Brain_Cerebellar_Hemisphere.csv
Level 9 - Processing Amygdala
Level 9 - Processing Anterior_cingulate_cortex_BA24
Level 9 - Processing Caudate_basal_ganglia
Level 9 - Processing Cerebellar_Hemisphere
Level 9 - Processing Cerebellum
INFO - Loading genes
/gpfs/elastic_net_models/en_Brain_Anterior_cingulate_cortex_BA24.db
/gpfs/elastic_net_models/en_Brain_Nucleus_accumbens_basal_ganglia.db
/gpfs/elastic_net_models/en_Brain_Caudate_basal_ganglia.db
/gpfs/elastic_net_models/en_Brain_Cerebellum.db
INFO - Context for snp covariance
INFO - Assessing GWAS-Models SNP intersection
INFO - Processing GWAS command line parameters
INFO - Reading input gwas: /gpfs/gwas/Brain_scz.txt
INFO - Processing input gwas
Level 9 - Using declared zscore
INFO - Reading input gwas: /gpfs/gwas/Brain_scz-cohort2.txt
INFO - Processing input gwas
Level 9 - Using declared zscore
Level 9 - loading /gpfs/elastic_net_models/en_Brain_Amygdala.db
Level 9 - loading /gpfs/elastic_net_models/en_Brain_Anterior_cingulate_cortex_BA24.db
Level 9 - loading /gpfs/elastic_net_models/en_Brain_Caudate_basal_ganglia.db
Level 9 - loading /gpfs/elastic_net_models/en_Brain_Cerebellar_Hemisphere.db
Level 9 - loading /gpfs/elastic_net_models/en_Brain_Cerebellum.db
INFO - Loading Model Manager
Level 9 - preloading models
Level 9 - processing Amygdala
Level 9 - processing Anterior_cingulate_cortex_BA24
Level 9 - processing Caudate_basal_ganglia
Level 9 - processing Cerebellar_Hemisphere
Level 9 - processing Cerebellum
Level 9 - preparing models (dictionary layout)
INFO - Preparing SNP covariance
INFO - Processing
Level 7 - Gene 1/14219: ENSG00000277007.1
Level 7 - Gene 2/14219: ENSG00000203710.10
.....

I hope this example helps you set up your command and wildcards correctly. If you are still stuck, share the names of the files in your directories for further assistance.
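
Before launching a full run, the two metaxcan patterns can be checked against the file names with a few lines of Python. This is only a sketch: the file names are the ones from the example listing above, the preview function is a hypothetical helper, and the assumption that the second captured group is used as the tissue name is inferred from the "Processing Amygdala" lines in the example log rather than taken from MetaXcan's source.

```python
import re

# File names copied from the example folder listing in this thread.
files = [
    "PGZ-SCZ_Artery_Aorta.csv",
    "PGZ-SCZ_Artery_Coronary.csv",
    "PGZ-SCZ_Artery_Tibial.csv",
    "PGZ-SCZ_Brain_Amygdala.csv",
    "PGZ-SCZ_Brain_Anterior_cingulate_cortex_BA24.csv",
    "PGZ-SCZ_Brain_Caudate_basal_ganglia.csv",
]

# The same strings that would be passed to --metaxcan_filter and
# --metaxcan_file_name_parse_pattern.
metaxcan_filter = re.compile("PGZ-SCZ_Brain_(.*).csv")
parse_pattern = re.compile("(.*)_Brain_(.*).csv")


def preview(names):
    """Map each file kept by the filter to the groups captured by the parse pattern."""
    kept = {}
    for name in names:
        if metaxcan_filter.search(name):
            match = parse_pattern.search(name)
            kept[name] = match.groups() if match else None
    return kept


result = preview(files)
for name, groups in result.items():
    print(name, "->", groups)
```

If the printed mapping is empty, or the captured groups are not the study and tissue names you expect, the patterns need adjusting before SMulTiXcan is run.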

@ZhenyaoYe
Author

Thank you very much for your answer! This was really helpful and the problem is solved. Are there recommended values for the arguments --cutoff_condition_number, --cutoff_threshold, and --cutoff_ratio? I found an example that uses 30 for --cutoff_condition_number in the tutorial at https://github.com/hakyimlab/MetaXcan/wiki/Tutorial:-GTEx-v8-MASH-models-integration-with-a-Coronary-Artery-Disease-GWAS.

Thank you very much for your help!
Best regards,
Zhenyao

@Fnyasimi
Collaborator

Fnyasimi commented Jul 9, 2021

You can use the --cutoff_condition_number argument with the default cutoff of 30.
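
For intuition about what a condition number cutoff does, here is a rough sketch of the general idea: components of a correlation matrix are kept only while the ratio of the largest eigenvalue to their own eigenvalue stays below the cutoff. The function name and the eigenvalues are made up for illustration, and this is an interpretation of the idea, not MetaXcan's actual implementation.

```python
def components_within_condition_number(eigenvalues, cutoff=30.0):
    """Keep eigenvalues whose ratio to the largest eigenvalue stays below the cutoff.

    Hypothetical helper: a simplified illustration of a condition number cutoff.
    """
    largest = max(eigenvalues)
    return [ev for ev in eigenvalues if ev > 0 and largest / ev < cutoff]


# Made-up eigenvalues of a tissue-by-tissue correlation matrix for one gene.
eigenvalues = [3.0, 1.5, 0.4, 0.09, 0.01]

kept = components_within_condition_number(eigenvalues, cutoff=30.0)
print(kept)  # the components with ratios 1, 2 and 7.5 are kept
```

With these numbers, the last two components have ratios of about 33 and 300 and are dropped. A larger cutoff keeps more of the small, noisy components; a smaller one regularizes more aggressively.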

@ZhenyaoYe
Author

ZhenyaoYe commented Jul 13, 2021 via email
