The question about .config file #4

houruiyan · 2021-10-05T07:22:46Z

Hi, thanks for the great tool. I am trying to use it in my project. I have 10x data, which I aligned to the human reference with cellranger to produce a BAM file. Now I want to set up the .config file, but it does not seem to be friendly to input files other than SICILIAN output. I do not know how to write the input_file and the meta file. Could you please give me some examples? I also do not understand the definitions of "grouping_level_1" and "grouping_level_2"; could you explain them? Thank you in advance!

kaitlinchaung · 2021-10-05T17:03:13Z

Hello! Thank you for your question.

It sounds like you have some cellranger-aligned BAMs and have not run SICILIAN on them, is that correct?
In that case, I think you would want the following options:
SICILIAN = false
samplesheet = YOUR_SAMPLESHEET_HERE.csv

For 10X data, I would follow the instructions in the first block to create the samplesheet: https://github.com/salzmanlab/SpliZ#samplesheets
You should have 2 comma-separated columns:

the name of the bam file (translates to the bam_ID)
the path to that bam file
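
As a rough illustration of the two-column layout described above, a samplesheet could be generated like this. This is only a sketch: the function name `write_samplesheet` and the sample names and paths are hypothetical, and the absence of a header row is an assumption, so check the README section linked above before relying on it.

```python
import csv


def write_samplesheet(bams, out_path):
    """Write one `bam_ID,path` row per BAM file.

    `bams` maps a bam_ID (the name used for the BAM) to the path of that BAM.
    Assumption: no header row is written; adjust if the README says otherwise.
    """
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for bam_id, bam_path in bams.items():
            writer.writerow([bam_id, str(bam_path)])
    return out_path
```

For example, `write_samplesheet({"sample1": "/path/to/sample1.bam"}, "samplesheet.csv")` would produce a one-line file whose first column later serves as the bam_ID prefix of each cell_id.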

For the metadata, that file should have at least 3 columns:

cell_id formatted as ${bam_ID}_${cellranger_barcode}
grouping_level_1 the metadata unit within which you would like to perform differential analysis
grouping_level_2 the metadata unit whose values you would like to compare in the differential analysis
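
To make the three-column layout concrete, here is a minimal sketch that builds cell_id values and writes a metadata table. The helper names are hypothetical, the tab delimiter is an assumption, and the thread does not say whether the cellranger barcode should keep any suffix it carries, so the barcode is used exactly as passed in.

```python
import csv


def make_cell_id(bam_id, barcode):
    """Join the samplesheet bam_ID and the cellranger barcode with an underscore."""
    return f"{bam_id}_{barcode}"


def write_metadata(cells, out_path,
                   level_1="grouping_level_1", level_2="grouping_level_2"):
    """Write a metadata table with cell_id and two grouping columns.

    `cells` is an iterable of (bam_id, barcode, level_1_value, level_2_value).
    Assumption: the file is tab-separated with a header row.
    """
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["cell_id", level_1, level_2])
        for bam_id, barcode, value_1, value_2 in cells:
            writer.writerow([make_cell_id(bam_id, barcode), value_1, value_2])
    return out_path
```

The column names passed as `level_1` and `level_2` are the names the config file then has to refer to.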

It is possible that you only have one group over which you'd like to perform differential analysis (#2), in which case you can leave grouping_level_1 blank, and your metadata would look like:

cell_id formatted as ${bam_ID}_${cellranger_barcode}
grouping_level_2 the metadata unit whose values you would like to compare in the differential analysis

An example I can provide is data from multiple tissue values (e.g. lung, kidney, and heart) with multiple cell_type values (e.g. endothelial, blood, capillary) within each tissue.

If grouping_level_1 = tissue and grouping_level_2 = cell_type, then you would be looking for differential SpliZ in endothelial vs blood vs capillary FOR EACH tissue.
If grouping_level_2 = cell_type and there is no grouping_level_1, then you would be looking for differential SpliZ in endothelial vs blood vs capillary, irrespective of tissue.
If grouping_level_2 = tissue and there is no grouping_level_1, then you would be looking for differential SpliZ in lung vs kidney vs heart, irrespective of cell_type.
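
The sketch below illustrates the grouping idea on toy data, under the reading used in the first case above: grouping_level_2 names the values being compared, and grouping_level_1 (when given) splits the cells into separate comparisons. It is not SpliZ's implementation; `comparison_groups` and the toy records are hypothetical.

```python
from collections import defaultdict


def comparison_groups(cells, level_2, level_1=None):
    """Return {level_1 value: sorted level_2 values compared within it}.

    With no level_1, all cells fall into a single comparison keyed by None.
    """
    groups = defaultdict(set)
    for cell in cells:
        key = cell[level_1] if level_1 is not None else None
        groups[key].add(cell[level_2])
    return {key: sorted(values) for key, values in groups.items()}


# Toy metadata records, for illustration only.
cells = [
    {"tissue": "lung", "cell_type": "endothelial"},
    {"tissue": "lung", "cell_type": "blood"},
    {"tissue": "kidney", "cell_type": "blood"},
    {"tissue": "kidney", "cell_type": "capillary"},
]
```

Calling `comparison_groups(cells, "cell_type", "tissue")` gives one set of cell types per tissue, while `comparison_groups(cells, "cell_type")` pools every cell into a single comparison.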

I hope that helps, and feel free to paste in your config file/metadata/samplesheets to check. And thanks again for your question, I'll update the readme to clarify the parameters a bit.

houruiyan · 2021-10-06T09:37:44Z

Thank you very much! Your explanation is very clear! I wrote the .config file and built the metadata and samplesheet according to your instructions. There is also one point worth noting: when we use BAM files, we do not need to set a value for "input_file". I think it works.
This is my metadata.

This is my config.

But a new problem has appeared.

I don't know what is causing this problem. I hope to get your help. Thank you!

kaitlinchaung · 2021-10-06T17:53:50Z

Can you please navigate to the 'Work dir' of that failed job and paste the contents of *.log?

The 'Work dir' path is shown at the bottom of your second image, i.e. /storage/yhhuang/../work/..
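
If it helps, the log files in a work directory can be gathered with a few lines of Python. This is only a convenience sketch: `read_logs` is a hypothetical helper, and the work directory path has to be copied from the Nextflow error message.

```python
from pathlib import Path


def read_logs(work_dir):
    """Return {file name: text} for every file ending in .log in work_dir."""
    logs = {}
    for path in sorted(Path(work_dir).glob("*.log")):
        # errors="replace" keeps going if a log contains undecodable bytes.
        logs[path.name] = path.read_text(errors="replace")
    return logs
```

Printing each entry of the returned dictionary gives text that can be pasted into the issue.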

kaitlinchaung · 2021-10-06T17:58:26Z

It may also be helpful to paste in a couple lines of your MS_ann_splices.tsv file.

houruiyan · 2021-10-07T02:07:11Z

Dear Dr Chaung,

This is my calc_splizvd.log in the "work dir":

This is the MS_ann_splices.tsv file in my "work dir"

Thank you!

kaitlinchaung · 2021-10-07T03:04:16Z

Hi, if the column names of your metadata file are grouping_level_1 and grouping_level_2, then your config file should have:
grouping_level_1 = grouping_level_1
grouping_level_2 = grouping_level_2
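
A quick way to catch this kind of mismatch before launching the pipeline is to compare the column names given in the config against the metadata header. The sketch below assumes a tab-separated metadata file with a header row; `missing_metadata_columns` is a hypothetical helper, not part of SpliZ.

```python
import csv


def missing_metadata_columns(meta_path, required, delimiter="\t"):
    """Return the names in `required` that are absent from the metadata header."""
    with open(meta_path, newline="") as handle:
        header = next(csv.reader(handle, delimiter=delimiter), [])
    return [name for name in required if name not in header]
```

For instance, passing `["cell_id", "tissue", "cell_type"]` for a file whose header is `cell_id`, `grouping_level_1`, `grouping_level_2` would return `["tissue", "cell_type"]`, which signals that the config values and the metadata column names disagree.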

houruiyan · 2021-10-07T03:06:37Z

ok, thank you very much! I will try it! Thank you again!

houruiyan · 2021-10-07T03:35:38Z

It works. Thank you!

kaitlinchaung · 2021-10-07T03:42:53Z

No problem!

wlei-amu · 2022-03-31T11:47:49Z

Hello,
I want to run this tool on non-SICILIAN inputs, but I don't know what code to run. Can you show me yours? Thanks!

wlei-amu · 2022-03-31T11:50:42Z

Hello, I want to run this tool on non-SICILIAN inputs, but I don't know what code to run. Can you show me yours? Thanks!

If I need to configure the .config file, where should I modify it, and what code should I run? Thanks!

juliaolivieri · 2022-03-31T18:31:45Z

Hello @wlei-amu, what kind of data do you want to run on? 10X cellranger BAMs?

tjhwangxiong · 2022-04-01T13:20:58Z

> Hello @wlei-amu, what kind of data do you want to run on? 10X cellranger BAMs?

Dear juliaolivieri, I built SpliZ as follows:

git clone https://github.com/salzmanlab/SpliZ.git
cd SpliZ
conda env create --name spliz_env --file=environment.yml
conda activate spliz_env
conda install nextflow

I have run the test data successfully by modifying small.config to set input_file = "small_data/small.pq".

Here is my question: if we run SpliZ on 10X cellranger BAMs, which config file should we edit or generate? Can I just modify the nextflow.config file as follows:

// Global default params, used in configs
params {
  // Workflow flags for SpliZ
  // TODO nf-core: Specify your pipeline's command line flags
  dataname = "wx"
  input_file = "wx_1.bam"
  SICILIAN = false
  pin_S = 0.01
  pin_z = 0.0
  bounds = 5
  light = false
  svd_type = "normdonor"
  n_perms = 100
  grouping_level_1 = "grouping_level_1"
  grouping_level_2 = "grouping_level_2"
  libraryType = null
  run_analysis = false
  samplesheet = "samplesheet.csv"
  annotator_pickle = "hg38_refseq.pkl"
  exon_pickle = "hg38_refseq_exon_bounds.pkl"
  splice_pickle = "hg38_refseq_splices.pkl"
  meta = "metadata.tsv"
  gtf = "GRCh38_genomic.gtf"
  rank_quant = 0
  outdir = './results/${params.dataname}'
  publish_dir_mode = 'copy'
}

Or should I generate a new config file? If so, how should I load the new config file?

Thanks a lot.

houruiyan added the bug Something isn't working label Oct 5, 2021

kaitlinchaung closed this as completed Oct 7, 2021

tjhwangxiong mentioned this issue Apr 1, 2022

Issues fo grouping levels #10

Closed
