Help understanding json config for basic demultiplexing #32
Hi @jcmcnch,
First, welcome and thank you for using Pheniqs. Now, about your config files...
> },
> "undetermined": {
> "output": [
> "JMSNRSCF_undetermined_s01.fastq.gz",
> "JMSNRSCF_undetermined_s02.fastq.gz",
> "JMSNRSCF_undetermined_s03.fastq.gz",
> "JMSNRSCF_undetermined_s04.fastq.gz"
> ]
Linting the files sorts the JSON dictionaries, which makes comparing two files much easier.
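As a rough illustration of why sorted dictionaries help when comparing configs, here is a minimal Python sketch using only the standard library. It is not part of Pheniqs; the function names and the two small JSON fragments are made up for the example.

```python
import difflib
import json


def canonical_lines(config_text):
    """Parse JSON text and re-serialize it with sorted keys and fixed indentation."""
    return json.dumps(json.loads(config_text), indent=4, sort_keys=True).splitlines()


def diff_configs(text_a, text_b):
    """Return unified-diff lines between two JSON documents after canonicalizing both."""
    return list(difflib.unified_diff(
        canonical_lines(text_a), canonical_lines(text_b),
        fromfile="a", tofile="b", lineterm=""))


# Two hypothetical fragments: same keys in a different order, one output list shorter.
a = '{"flowcell id": "JMSNRSCF", "output": ["s01.fastq.gz", "s02.fastq.gz"]}'
b = '{"output": ["s01.fastq.gz"], "flowcell id": "JMSNRSCF"}'
for line in diff_configs(a, b):
    print(line)
```

Because both sides are sorted before diffing, the difference in key order disappears and only the real change (the missing second output file) shows up.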
if you search for
Your config file specifies an output template with 2 tokens:
"template" : {
    "transform" : {
        "token" : [ "0::", "3::" ]
    }
},
but in all your output directives you specify 4 output files. The number of output files in an output directive must equal the number of segments you specify in your template.
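The rule that each output list must have as many files as the template has tokens can be checked mechanically before running anything. The sketch below is plain Python, not a Pheniqs tool. It assumes the layout seen in this thread, with the barcode entries under "sample" / "codec" and the "undetermined" entry alongside "codec" inside "sample"; adjust the lookups if your file is organized differently. The function name and the small example config are hypothetical.

```python
def check_output_counts(config):
    """Return (entry name, outputs found, outputs expected) for each mismatched entry."""
    # One output file is expected per token in the template transform.
    expected = len(config["template"]["transform"]["token"])
    sample = config.get("sample", {})
    entries = dict(sample.get("codec", {}))
    # Assumption: "undetermined" sits next to "codec" inside "sample".
    if "undetermined" in sample:
        entries["undetermined"] = sample["undetermined"]
    problems = []
    for name, entry in entries.items():
        found = len(entry.get("output", []))
        if found != expected:
            problems.append((name, found, expected))
    return problems


# Hypothetical cut-down config: 2 template tokens, but 4 undetermined outputs.
example = {
    "template": {"transform": {"token": ["0::", "3::"]}},
    "sample": {
        "codec": {
            "@11_15": {"barcode": ["AGTCAA"],
                       "output": ["x_s01.fastq.gz", "x_s02.fastq.gz"]},
        },
        "undetermined": {"output": ["u_s01.fastq.gz", "u_s02.fastq.gz",
                                    "u_s03.fastq.gz", "u_s04.fastq.gz"]},
    },
}
print(check_output_counts(example))  # [('undetermined', 4, 2)]
```

An empty list means every output directive agrees with the template.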
the
If you still have questions, please don't hesitate to ask. We appreciate your feedback and are here to help.
Regards
{
"PL": "ILLUMINA",
"PM": "NovaSeq_2x250_UCDavis_run064",
"base input url": ".",
"filter incoming qc fail": true,
"flowcell id": "JMSNRSCF",
"input": [
"JMSNRSCF_S1_L001_R1_001.fastq.gz",
"JMSNRSCF_S1_L001_I1_001.fastq.gz",
"JMSNRSCF_S1_L001_I2_001.fastq.gz",
"JMSNRSCF_S1_L001_R2_001.fastq.gz"
],
"report url": "211123_JMSNRSCF_demux_sample_report.json",
"sample": {
"algorithm": "pamld",
"codec": {
"@11_15": {
"LB": "11_15",
"barcode": [
"AGTCAA"
],
"output": [
"JMSNRSCF_11_15_s01.fastq.gz",
"JMSNRSCF_11_15_s02.fastq.gz"
]
},
"@11_200": {
"LB": "11_200",
"barcode": [
"ATGTCA"
],
"output": [
"JMSNRSCF_11_200_s01.fastq.gz",
"JMSNRSCF_11_200_s02.fastq.gz"
]
},
"@11_25": {
"LB": "11_25",
"barcode": [
"CGTACG"
],
"output": [
"JMSNRSCF_11_25_s01.fastq.gz",
"JMSNRSCF_11_25_s02.fastq.gz"
]
},
"@2_400": {
"LB": "2_400",
"barcode": [
"ATCTCA"
],
"output": [
"JMSNRSCF_2_400_s01.fastq.gz",
"JMSNRSCF_2_400_s02.fastq.gz"
]
},
"@2_45": {
"LB": "2_45",
"barcode": [
"AGGAAT"
],
"output": [
"JMSNRSCF_2_45_s01.fastq.gz",
"JMSNRSCF_2_45_s02.fastq.gz"
]
},
"@7_120": {
"LB": "7_120",
"barcode": [
"ATTCCT"
],
"output": [
"JMSNRSCF_7_120_s01.fastq.gz",
"JMSNRSCF_7_120_s02.fastq.gz"
]
},
"@7_200": {
"LB": "7_200",
"barcode": [
"GTGGCC"
],
"output": [
"JMSNRSCF_7_200_s01.fastq.gz",
"JMSNRSCF_7_200_s02.fastq.gz"
]
},
"@7_25": {
"LB": "7_25",
"barcode": [
"GTGAAA"
],
"output": [
"JMSNRSCF_7_25_s01.fastq.gz",
"JMSNRSCF_7_25_s02.fastq.gz"
]
},
"@8_200": {
"LB": "8_200",
"barcode": [
"ATGAGC"
],
"output": [
"JMSNRSCF_8_200_s01.fastq.gz",
"JMSNRSCF_8_200_s02.fastq.gz"
]
},
"@8_800": {
"LB": "8_800",
"barcode": [
"CCGTCC"
],
"output": [
"JMSNRSCF_8_800_s01.fastq.gz",
"JMSNRSCF_8_800_s02.fastq.gz"
]
},
"@9_200": {
"LB": "9_200",
"barcode": [
"ACTTCC"
],
"output": [
"JMSNRSCF_9_200_s01.fastq.gz",
"JMSNRSCF_9_200_s02.fastq.gz"
]
},
"@BGT_PL01_rev11": {
"LB": "BGT_PL01_rev11",
"barcode": [
"GCGGAC"
],
"output": [
"JMSNRSCF_BGT_PL01_rev11_s01.fastq.gz",
"JMSNRSCF_BGT_PL01_rev11_s02.fastq.gz"
]
},
"@BGT_PL01_rev12": {
"LB": "BGT_PL01_rev12",
"barcode": [
"TTTCAC"
],
"output": [
"JMSNRSCF_BGT_PL01_rev12_s01.fastq.gz",
"JMSNRSCF_BGT_PL01_rev12_s02.fastq.gz"
]
},
"@BGT_PL01_rev13": {
"LB": "BGT_PL01_rev13",
"barcode": [
"CCGGTG"
],
"output": [
"JMSNRSCF_BGT_PL01_rev13_s01.fastq.gz",
"JMSNRSCF_BGT_PL01_rev13_s02.fastq.gz"
]
},
"@BGT_PL01_rev14": {
"LB": "BGT_PL01_rev14",
"barcode": [
"ATCGTG"
],
"output": [
"JMSNRSCF_BGT_PL01_rev14_s01.fastq.gz",
"JMSNRSCF_BGT_PL01_rev14_s02.fastq.gz"
]
},
"@BGT_PL02_rev15": {
"LB": "BGT_PL02_rev15",
"barcode": [
"TGAGTG"
],
"output": [
"JMSNRSCF_BGT_PL02_rev15_s01.fastq.gz",
"JMSNRSCF_BGT_PL02_rev15_s02.fastq.gz"
]
},
"@BGT_PL02_rev16": {
"LB": "BGT_PL02_rev16",
"barcode": [
"CGCCTG"
],
"output": [
"JMSNRSCF_BGT_PL02_rev16_s01.fastq.gz",
"JMSNRSCF_BGT_PL02_rev16_s02.fastq.gz"
]
},
"@CAP_PL01_rev01": {
"LB": "CAP_PL01_rev01",
"barcode": [
"CGTGAT"
],
"output": [
"JMSNRSCF_CAP_PL01_rev01_s01.fastq.gz",
"JMSNRSCF_CAP_PL01_rev01_s02.fastq.gz"
]
},
"@CAP_PL01_rev02": {
"LB": "CAP_PL01_rev02",
"barcode": [
"ACATCG"
],
"output": [
"JMSNRSCF_CAP_PL01_rev02_s01.fastq.gz",
"JMSNRSCF_CAP_PL01_rev02_s02.fastq.gz"
]
},
"@CAP_PL01_rev03": {
"LB": "CAP_PL01_rev03",
"barcode": [
"GCCTAA"
],
"output": [
"JMSNRSCF_CAP_PL01_rev03_s01.fastq.gz",
"JMSNRSCF_CAP_PL01_rev03_s02.fastq.gz"
]
},
"@CAP_PL01_rev04": {
"LB": "CAP_PL01_rev04",
"barcode": [
"TGGTCA"
],
"output": [
"JMSNRSCF_CAP_PL01_rev04_s01.fastq.gz",
"JMSNRSCF_CAP_PL01_rev04_s02.fastq.gz"
]
},
"@Colette_rev21": {
"LB": "Colette_rev21",
"barcode": [
"TTCGTC"
],
"output": [
"JMSNRSCF_Colette_rev21_s01.fastq.gz",
"JMSNRSCF_Colette_rev21_s02.fastq.gz"
]
},
"@Colette_rev22": {
"LB": "Colette_rev22",
"barcode": [
"CCAACT"
],
"output": [
"JMSNRSCF_Colette_rev22_s01.fastq.gz",
"JMSNRSCF_Colette_rev22_s02.fastq.gz"
]
},
"@Colette_rev23": {
"LB": "Colette_rev23",
"barcode": [
"TCAGTT"
],
"output": [
"JMSNRSCF_Colette_rev23_s01.fastq.gz",
"JMSNRSCF_Colette_rev23_s02.fastq.gz"
]
},
"@Colette_rev24": {
"LB": "Colette_rev24",
"barcode": [
"CTGACC"
],
"output": [
"JMSNRSCF_Colette_rev24_s01.fastq.gz",
"JMSNRSCF_Colette_rev24_s02.fastq.gz"
]
},
"@Delaney_rev19": {
"LB": "Delaney_rev19",
"barcode": [
"GGAACT"
],
"output": [
"JMSNRSCF_Delaney_rev19_s01.fastq.gz",
"JMSNRSCF_Delaney_rev19_s02.fastq.gz"
]
},
"@Delaney_rev20": {
"LB": "Delaney_rev20",
"barcode": [
"CGAAAC"
],
"output": [
"JMSNRSCF_Delaney_rev20_s01.fastq.gz",
"JMSNRSCF_Delaney_rev20_s02.fastq.gz"
]
},
"@Rae_rev05": {
"LB": "Rae_rev05",
"barcode": [
"CACTGT"
],
"output": [
"JMSNRSCF_Rae_rev05_s01.fastq.gz",
"JMSNRSCF_Rae_rev05_s02.fastq.gz"
]
},
"@mystery12264": {
"LB": "mystery12264",
"barcode": [
"ACTGAT"
],
"output": [
"JMSNRSCF_mystery12264_s01.fastq.gz",
"JMSNRSCF_mystery12264_s02.fastq.gz"
]
},
"@mystery4895": {
"LB": "mystery4895",
"barcode": [
"GGGGGG"
],
"output": [
"JMSNRSCF_mystery4895_s01.fastq.gz",
"JMSNRSCF_mystery4895_s02.fastq.gz"
]
},
"@mystery8619": {
"LB": "mystery8619",
"barcode": [
"AGTTCC"
],
"output": [
"JMSNRSCF_mystery8619_s01.fastq.gz",
"JMSNRSCF_mystery8619_s02.fastq.gz"
]
}
},
"confidence threshold": 0.95,
"noise": 0.05,
"transform": {
"token": [
"1::6"
]
}
},
"template": {
"transform": {
"token": [
"0::",
"3::"
]
}
}
}
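Since every entry in the codec section above follows the same pattern (an "@name" key, "LB", a one-element "barcode" list, and two output files named after the flowcell and the sample), it can be generated from a two-column sample sheet rather than typed by hand. The following Python sketch assumes a headerless tab-separated sheet of name and barcode; the function name, its parameters, and the sheet layout are illustrative assumptions, not something Pheniqs provides.

```python
import csv
import io
import json


def codec_from_tsv(tsv_text, flowcell, segments=2):
    """Build a codec dictionary from tab-separated 'name<TAB>barcode' rows."""
    codec = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        name, barcode = row
        codec["@" + name] = {
            "LB": name,
            "barcode": [barcode],
            # One output file per template segment, numbered s01, s02, ...
            "output": [f"{flowcell}_{name}_s{i:02d}.fastq.gz"
                       for i in range(1, segments + 1)],
        }
    return codec


# Hypothetical two-row sheet using names and barcodes from the config above.
sheet = "11_15\tAGTCAA\n11_200\tATGTCA\n"
print(json.dumps(codec_from_tsv(sheet, "JMSNRSCF"), indent=4, sort_keys=True))
```

The resulting dictionary can be pasted in as the value of "codec"; keeping `segments` equal to the number of template tokens keeps the output lists the right length.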
Hi Lior,
Thank you so much for your detailed reply, and I'm sorry I never replied. This will be a great reference going forward, and I hope that with your explanations I can now figure out how to construct the JSON config. If I'm still running into issues, I'll be sure to reach out.
Jesse
Hi,
I'm wondering if you'd be able to help me understand how to construct a correct JSON config file. My goal is to demultiplex based on 6bp I1 indices from 4 files provided by our sequencing center (we have I2, but it's meaningless in this case).
What I've tried:
pheniqs/tool/pheniqs-io-api.py -c 211123_run064_pheniqs_test.json -LS -F fastq --compression gz > 211129_run064_pheniqs_post-io-api.json
Where I'm confused
When I try to run pheniqs as follows,
pheniqs mux -F fastq --compression gz -c 211129_run064_pheniqs_post-io-api.json
it just prints the gzip-compressed output to standard out. Yet the config JSON produced by your Python script lists filenames as output. When I run the validate step, it also says the output will go to stdout, which is a bit confusing to me. I've provided both configs in plaintext below.
Please accept my apologies for the vague query, but I am a bit stumped. I'm unfamiliar with JSON-formatted configs, being used to providing this info in a TSV/CSV, and am not quite sure where to start troubleshooting. Even if you can only tell me that this must be due to some stray punctuation, a missing section in my original config, or some such issue, that would really help me understand how pheniqs works.
Thanks a lot for your help,
Jesse
Original JSON file I constructed with your template:
The output of your python processing script: