ValueError: too many values to unpack (expected 2) #2

HLHsieh · 2024-06-14T22:19:18Z

Hi there,

I was trying to execute NASTRA on my own data as follows:

python nastra.py call -b C9ORF72.sorted.bam  -o out

I got the error message:

[Processing]: 54it [00:01, 42.05it/s]
Traceback (most recent call last):
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 154, in <module>
    main()
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 15, in main
    args.func(args)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 138, in calling_func
    sample_name, locus = key.split('_')
ValueError: too many values to unpack (expected 2)

Any suggestions would be appreciated.

Best,
Hsin

renzilin · 2024-06-15T04:42:09Z

Hi Hsin,
NASTRA provides config files for the STRs used in cell line authentication and forensic applications. They are stored at https://github.com/renzilin/NASTRA/blob/main/NASTRA/cfgs/panel_forenseq.csv and https://github.com/renzilin/NASTRA/blob/main/NASTRA/cfgs/repeat_structure.pat.

You need to generate the locus information file before using NASTRA.

If you have any further questions, please contact us.

Best,
Zilin

HLHsieh · 2024-06-15T19:33:26Z

Hi Zilin,

Thank you for your explanation. I looked into the arguments, and I am wondering whether PANEL, FACTSHEET, and CONFIG are required for my job.

Although I defined my own config, the same error message returns.

python $sc/nastra.py call -b C9ORF72_1_9R_NanoSim_2x.sorted.bam -o out -f repeat_structure.pat -p panel_forenseq.csv -c threshold.cfg

[Processing]: 1it [00:00, 38.61it/s]
Traceback (most recent call last):
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 154, in <module>
    main()
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 15, in main
    args.func(args)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 138, in calling_func
    sample_name, locus = key.split('_')
ValueError: too many values to unpack (expected 2)

repeat_structure.pat

Loci Chrom.  Seq. Pattern  Publication  STRSeq BioProject
C9ORF72  9  [GGCCCC]n  NA. NA

panel_forenseq.csv

STR,CHROM,START,END,LEN,PREFIX,SUFFIX
C9ORF72,chr9,27573529,27573546,6,GGGCCCGCCCCGACCACGCCCCG,TAGCGCGCGACTCCTGA

threshold.cfg

locus,cov_0,cov_10,cov_15,cov_20,cov_25,cov_30,cov_50
C9ORF72,0.31,0.3,0.35,0.36,0.375,0.345,0.39

Best,
Hsin

renzilin · 2024-06-16T12:15:29Z

Hi Hsin,

Would it be convenient for you to provide a sample file? If there is a file, I can test it quickly.

Best,
Zilin

HLHsieh · 2024-06-16T16:09:15Z

Hi Zilin,

Sure. I put all files under: https://www.dropbox.com/scl/fo/yjxy2wvtt7fxf4trkat2v/AAbemqYvKvHnO6Wd0svQA-w?rlkey=mz8947kt97b2lonbjos3lq3o5&dl=0
Please let me know if there is any problem.

Best,
Hsin

renzilin · 2024-06-20T03:10:34Z

@ElroyLR Please check the code. The files were already downloaded.

renzilin · 2024-06-28T03:37:33Z

Hi Hsin,
Sorry for the late reply.

We found that renaming the input BAM file to 'barcode01.bam' fixes the error, because we originally wanted to build a pipeline that works directly on off-loaded data.
We have also modified the configuration files. In detail, we used the human reference hg19 for the whole-genome alignment and for the locus position information.

As for the threshold used to call a locus homozygous or heterozygous, you may define it according to your data.

The files are attached.
pattern.txt
panel.csv

Best,
Zilin
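
For readers who hit the same traceback, here is a minimal sketch of one reading that fits the second failing command. It assumes, and the thread does not confirm this, that each `key` is built as `<sample name>_<locus>` from the BAM file name, so extra underscores in the file name break the two-way unpack. `make_key` and `split_key` are hypothetical helpers for illustration, not NASTRA functions.

```python
from typing import Tuple


def make_key(sample_name: str, locus: str) -> str:
    # Assumed key layout "<sample>_<locus>" (hypothetical, not taken from NASTRA).
    return f"{sample_name}_{locus}"


def split_key(key: str) -> Tuple[str, str]:
    # Split on the LAST underscore only, so underscores in the sample name survive.
    # This still assumes the locus name itself contains no underscore.
    sample_name, locus = key.rsplit("_", 1)
    return sample_name, locus


key = make_key("C9ORF72_1_9R_NanoSim_2x", "C9ORF72")

try:
    # Same statement as the one reported at line 138 of nastra.py.
    sample_name, locus = key.split("_")
except ValueError as err:
    print(err)  # too many values to unpack (expected 2)

print(split_key(key))
print(split_key(make_key("barcode01", "C9ORF72")))
```

Under that assumption, a name such as `barcode01` works with the plain `split('_')` because it adds no underscore of its own.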

HLHsieh · 2024-07-04T19:39:58Z

Hi Zilin,

Thank you for your suggestions. The issue has been fixed, and I can execute NASTRA successfully. I have tried it on my 20 samples using the following command:

python $script call -b barcode01.bam -o ${myseq} -f repeat_structure.pat -p panel_forenseq.csv -c threshold.cfg --sncutoff 0

The threshold settings are:

locus,cov_0,cov_10,cov_15,cov_20,cov_25,cov_30,cov_50
C9ORF72,0.35,0.35,0.35,0.35,0.35,0.35,0.35

Only one sample analysis encountered this error:

Traceback (most recent call last):
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 154, in <module>
    main()
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 15, in main
    args.func(args)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 119, in calling_func
    merged_dat = pd.concat(results, axis=0)
  File "/nfs/turbo/umms-kinfai/hsinlun/miniconda3/envs/nastra_env/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 372, in concat
    op = _Concatenator(
  File "/nfs/turbo/umms-kinfai/hsinlun/miniconda3/envs/nastra_env/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 429, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate

I would appreciate any solution you could provide for this issue.

Best,
Hsin-Lun

renzilin · 2024-07-05T02:22:57Z

The message says 'No objects to concatenate', so the results list is probably empty.
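
A minimal sketch of how an empty list could be handled before the concatenation, assuming `results` is a plain list of DataFrames. `merge_results` is a hypothetical wrapper, not part of NASTRA.

```python
from typing import List, Optional

import pandas as pd


def merge_results(results: List[pd.DataFrame]) -> Optional[pd.DataFrame]:
    # pd.concat raises ValueError("No objects to concatenate") on an empty list,
    # so return None and let the caller report that nothing was called.
    if not results:
        return None
    return pd.concat(results, axis=0)


merged = merge_results([])
if merged is None:
    print("No allele calls were produced for this sample.")
```

This only turns the crash into a readable message. It does not explain why no calls were produced for the sample.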

HLHsieh · 2024-07-06T20:22:17Z

Hi Zilin,

In this case, can I conclude that NASTRA was not able to detect any reads related to this STR region? I have analyzed more samples and found that several had the same issue, but these samples should contain the STR.

Besides, I encountered another issue, as follows:

Traceback (most recent call last):
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 154, in <module>
    main()
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 15, in main
    args.func(args)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 99, in calling_func
    cluster_alleles         = cluster_func.cluster(counter_dct)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/libs/pairwise_alignment.py", line 65, in cluster
    allele_dct = self.allele_init(part_group)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/libs/pairwise_alignment.py", line 76, in allele_init
    allele, supnum     = part_group[0]
IndexError: list index out of range

Do you have any idea what caused this issue and how to fix it?

Best,
Hsin-Lun

renzilin · 2024-07-09T02:58:00Z

What does the repeat structure look like in your reads that contain the STR? The part_group could be empty, which would mean there are no cluster_alleles.
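
A minimal sketch of the empty-list guard this implies, assuming `part_group` is a list of `(allele, supporting read count)` pairs as the unpacking in the traceback suggests. This `allele_init` is a hypothetical stand-in, since the real function body and return value are not shown in the thread.

```python
from typing import Dict, List, Tuple


def allele_init(part_group: List[Tuple[str, int]]) -> Dict[str, int]:
    # Hypothetical stand-in for allele_init in libs/pairwise_alignment.py.
    if not part_group:
        # Nothing to seed a cluster with: return an empty mapping instead of
        # letting part_group[0] raise IndexError.
        return {}
    allele, supnum = part_group[0]
    return {allele: supnum}


print(allele_init([]))
print(allele_init([("GGCCCCGGCCCC", 5)]))
```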

HLHsieh · 2024-07-12T05:45:02Z

Hi Zilin,

I tried several times, and the same issue occurred. The repeat structure is CC [GGCCCC]264 TAG. I checked, and there are five reads supporting this region. For some reason, NASTRA did not consider these reads. Therefore, I guess the error stems from the assumption that no reads support this region.

Best,
Hsin-Lun
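
A minimal sketch of one way to count motif copies in a read sequence by hand, independent of NASTRA. `longest_repeat_run` is a hypothetical helper, and the example read is synthetic, built from the repeat structure described above. It uses exact matching on the forward strand only, so a single sequencing error in a real nanopore read would split the run and lower the count.

```python
import re


def longest_repeat_run(read_seq: str, motif: str = "GGCCCC") -> int:
    # Number of motif copies in the longest uninterrupted tandem run.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(read_seq)]
    return max(runs, default=0)


# Synthetic read following the structure CC [GGCCCC]264 TAG.
read = "CC" + "GGCCCC" * 264 + "TAG"
print(longest_repeat_run(read))  # 264
```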

renzilin · 2024-07-12T06:35:21Z

I think the clustering step may be causing the true reads to be aligned to some wrong reads that have the largest supporting number?

renzilin closed this as completed Jul 5, 2024

renzilin reopened this Jul 5, 2024
