ValueError: too many values to unpack (expected 2) #2

Open
HLHsieh opened this issue Jun 14, 2024 · 12 comments

Comments

@HLHsieh

HLHsieh commented Jun 14, 2024

Hi there,

I was trying to execute NASTRA on my own data as follows:

python nastra.py call -b C9ORF72.sorted.bam  -o out

I got the error message:

[Processing]: 54it [00:01, 42.05it/s]
Traceback (most recent call last):
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 154, in <module>
    main()
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 15, in main
    args.func(args)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 138, in calling_func
    sample_name, locus = key.split('_')
ValueError: too many values to unpack (expected 2)

Any suggestions would be appreciated.

Best,
Hsin

@renzilin
Owner

Hi Hsin,
NASTRA ships with config files for the STRs used in cell line authentication and forensic applications. They are stored in https://github.com/renzilin/NASTRA/blob/main/NASTRA/cfgs/panel_forenseq.csv and https://github.com/renzilin/NASTRA/blob/main/NASTRA/cfgs/repeat_structure.pat.

You need to generate the locus information files for your own locus before using NASTRA.

If you have any further questions, please contact us.

Best,
Zilin

@HLHsieh
Author

HLHsieh commented Jun 15, 2024

Hi Zilin,

Thank you for your explanation. I looked into the arguments, and I am wondering whether PANEL, FACTSHEET, and CONFIG are required for my job.

Although I defined my own config, the same error message returns.

python $sc/nastra.py call -b C9ORF72_1_9R_NanoSim_2x.sorted.bam -o out -f repeat_structure.pat -p panel_forenseq.csv -c threshold.cfg
[Processing]: 1it [00:00, 38.61it/s]
Traceback (most recent call last):
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 154, in <module>
    main()
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 15, in main
    args.func(args)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 138, in calling_func
    sample_name, locus = key.split('_')
ValueError: too many values to unpack (expected 2)

repeat_structure.pat

Loci Chrom.  Seq. Pattern  Publication  STRSeq BioProject
C9ORF72  9  [GGCCCC]n  NA. NA

panel_forenseq.csv

STR,CHROM,START,END,LEN,PREFIX,SUFFIX
C9ORF72,chr9,27573529,27573546,6,GGGCCCGCCCCGACCACGCCCCG,TAGCGCGCGACTCCTGA

threshold.cfg

locus,cov_0,cov_10,cov_15,cov_20,cov_25,cov_30,cov_50
C9ORF72,0.31,0.3,0.35,0.36,0.375,0.345,0.39
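For readers building a similar panel file, here is a minimal Python sketch that parses the panel row above and checks whether the START..END span is a whole number of repeat units. It assumes that coordinates are 1-based and inclusive and that LEN is the repeat unit length; neither is confirmed in this thread, and `summarize_panel` is a hypothetical helper, not part of NASTRA.

```python
import csv
import io

# The panel row posted above, embedded so the sketch is self-contained.
PANEL_CSV = """STR,CHROM,START,END,LEN,PREFIX,SUFFIX
C9ORF72,chr9,27573529,27573546,6,GGGCCCGCCCCGACCACGCCCCG,TAGCGCGCGACTCCTGA
"""


def summarize_panel(text):
    """Return one summary dict per panel row (hypothetical helper)."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        start, end, unit = int(row["START"]), int(row["END"]), int(row["LEN"])
        # ASSUMPTION: coordinates are 1-based and inclusive.
        span = end - start + 1
        rows.append({
            "locus": row["STR"],
            "chrom": row["CHROM"],
            "span_bp": span,
            # ASSUMPTION: LEN is the repeat unit length in bp.
            "ref_units": span // unit,
            "whole_units": span % unit == 0,
        })
    return rows


for summary in summarize_panel(PANEL_CSV):
    print(summary)
```

Under these assumptions the row describes an 18 bp reference span, that is, three GGCCCC units. The sketch does not check which reference build the coordinates belong to.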

Best,
Hsin

@renzilin
Owner

Hi Hsin,

Could you provide a sample file? With a file, I can test this quickly.

Best,
Zilin

@HLHsieh
Author

HLHsieh commented Jun 16, 2024

Hi Zilin,

Sure. I put all files under: https://www.dropbox.com/scl/fo/yjxy2wvtt7fxf4trkat2v/AAbemqYvKvHnO6Wd0svQA-w?rlkey=mz8947kt97b2lonbjos3lq3o5&dl=0
Please let me know if there is any problem.

Best,
Hsin

@renzilin
Owner

@ElroyLR Please check the code. The files were already downloaded.

@renzilin
Owner

Hi Hsin,
Sorry for the late reply.

We found that the error goes away if you rename the input BAM file to 'barcode01.bam'. This is because we originally wanted to build a pipeline that works directly on off-loaded data.
We have also modified the configuration files. In detail, we used the human reference hg19 for the whole-genome alignment, and the locus position information is based on hg19 as well.
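To illustrate why the file name can matter, here is a minimal Python sketch of the failing statement from the traceback, `sample_name, locus = key.split('_')`. It assumes the key has the form `<sample>_<locus>`; how NASTRA actually builds the key is not shown in this thread, and `split_key` is a hypothetical alternative, not NASTRA code.

```python
def split_key(key):
    """Split '<sample>_<locus>' on the LAST underscore only.

    ASSUMPTION: the locus name itself contains no underscore.
    """
    sample_name, locus = key.rsplit("_", 1)
    return sample_name, locus


# With a name such as 'barcode01', the key has exactly one underscore,
# so plain split('_') yields two parts and the unpacking succeeds.
sample_name, locus = "barcode01_C9ORF72".split("_")
print(sample_name, locus)

# HYPOTHETICAL key for a sample name that itself contains underscores.
key = "C9ORF72_1_9R_NanoSim_2x.sorted_C9ORF72"
try:
    sample_name, locus = key.split("_")
except ValueError as err:
    # Same message as in the traceback above.
    print("ValueError:", err)

# Splitting on the last underscore tolerates underscores in the sample name.
print(split_key(key))
```

Renaming the input is the fix confirmed in this thread; the `rsplit` variant only shows one possible way to make such a split more tolerant.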

As for the threshold used to decide between homozygous (hom) and heterozygous (het) calls, you may define it according to your data.

The files are attached.
pattern.txt
panel.csv

Best,
Zilin

@HLHsieh
Author

HLHsieh commented Jul 4, 2024

Hi Zilin,

Thank you for your suggestions. The issue has been fixed, and I can execute NASTRA successfully. I have tried it on my 20 samples using the following command:

python $script call -b barcode01.bam -o ${myseq} -f repeat_structure.pat -p panel_forenseq.csv -c threshold.cfg --sncutoff 0

The threshold settings are:

locus,cov_0,cov_10,cov_15,cov_20,cov_25,cov_30,cov_50
C9ORF72,0.35,0.35,0.35,0.35,0.35,0.35,0.35
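As a rough illustration of how a per-coverage threshold table like this could drive a hom/het decision, here is a minimal Python sketch. Both rules in it are assumptions and are not taken from NASTRA: that column `cov_N` applies once coverage reaches N, and that a call is heterozygous when the minor allele's read support divided by the major allele's reaches the threshold. All function names are hypothetical.

```python
import csv
import io

# The threshold settings posted above, embedded so the sketch is self-contained.
THRESHOLD_CFG = """locus,cov_0,cov_10,cov_15,cov_20,cov_25,cov_30,cov_50
C9ORF72,0.35,0.35,0.35,0.35,0.35,0.35,0.35
"""


def load_thresholds(text):
    """Return {locus: [(min_coverage, threshold), ...]} sorted by coverage."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        locus = row.pop("locus")
        table[locus] = sorted(
            (int(col.split("_")[1]), float(val)) for col, val in row.items()
        )
    return table


def pick_threshold(bins, coverage):
    """ASSUMPTION: use the bin with the largest min_coverage <= coverage."""
    chosen = bins[0][1]
    for min_cov, value in bins:
        if coverage >= min_cov:
            chosen = value
    return chosen


def is_heterozygous(major_reads, minor_reads, threshold):
    """ASSUMPTION: het when the minor/major support ratio reaches the threshold."""
    return major_reads > 0 and minor_reads / major_reads >= threshold


thresholds = load_thresholds(THRESHOLD_CFG)
cutoff = pick_threshold(thresholds["C9ORF72"], coverage=22)
print(cutoff, is_heterozygous(10, 4, cutoff), is_heterozygous(10, 2, cutoff))
```

With a flat 0.35 in every column, as in the settings above, the coverage bin makes no difference; the lookup only matters once the columns differ.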

Only one sample analysis encountered this error:

Traceback (most recent call last):
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 154, in <module>
    main()
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 15, in main
    args.func(args)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 119, in calling_func
    merged_dat = pd.concat(results, axis=0)
  File "/nfs/turbo/umms-kinfai/hsinlun/miniconda3/envs/nastra_env/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 372, in concat
    op = _Concatenator(
  File "/nfs/turbo/umms-kinfai/hsinlun/miniconda3/envs/nastra_env/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 429, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate

I would appreciate any solution you could provide for this issue.

Best,
Hsin-Lun

@renzilin renzilin closed this as completed Jul 5, 2024
@renzilin renzilin reopened this Jul 5, 2024
@renzilin
Owner

renzilin commented Jul 5, 2024

The error message is 'No objects to concatenate', so the results list is probably empty.
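For reference, `pd.concat` raises exactly this error when it is given an empty list. A minimal Python sketch of a guard, assuming `results` is a list of per-locus DataFrames (the column names below are made up for illustration, and `merge_results` is a hypothetical helper, not NASTRA code):

```python
import pandas as pd


def merge_results(results):
    """Concatenate per-locus tables, or return an empty table if there are none."""
    if not results:
        # pd.concat([]) would raise "ValueError: No objects to concatenate".
        return pd.DataFrame()
    return pd.concat(results, axis=0)


# No locus produced a result: the guard returns an empty DataFrame.
print(merge_results([]).empty)

# HYPOTHETICAL per-locus tables; the columns are illustrative only.
merged = merge_results([
    pd.DataFrame({"locus": ["C9ORF72"], "allele": [2]}),
    pd.DataFrame({"locus": ["C9ORF72"], "allele": [8]}),
])
print(len(merged))
```

A guard like this only avoids the crash; it does not explain why no reads were kept for the sample.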

@HLHsieh
Author

HLHsieh commented Jul 6, 2024

Hi Zilin,

In this case, can I conclude that NASTRA was not able to detect any reads related to this STR region? I have analyzed more samples and found that several had issues, even though these samples should contain the STR.

In addition, I encountered another issue:

Traceback (most recent call last):
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 154, in <module>
    main()
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 15, in main
    args.func(args)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/nastra.py", line 99, in calling_func
    cluster_alleles         = cluster_func.cluster(counter_dct)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/libs/pairwise_alignment.py", line 65, in cluster
    allele_dct = self.allele_init(part_group)
  File "/nfs/turbo/umms-kinfai/hsinlun/bin/NASTRA/NASTRA/libs/pairwise_alignment.py", line 76, in allele_init
    allele, supnum     = part_group[0]
IndexError: list index out of range

Do you have any ideas what caused this issue and how to fix it?

Best,
Hsin-Lun

@renzilin
Owner

renzilin commented Jul 9, 2024

What does the repeat structure look like in your reads that contain the STR? The part_group list could be empty, which would mean that no alleles were clustered.
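A minimal Python sketch of that failure mode: indexing `part_group[0]` on an empty list raises the `IndexError` from the traceback. It assumes `part_group` is a list of `(allele, supnum)` pairs, as the unpacking in the traceback suggests; `first_allele` is a hypothetical helper, not NASTRA code.

```python
def first_allele(part_group):
    """Return (allele, supnum) for the first entry, or None if the group is empty."""
    if not part_group:
        return None
    allele, supnum = part_group[0]
    return allele, supnum


# An empty group reproduces the error seen in the traceback.
try:
    allele, supnum = [][0]
except IndexError as err:
    print("IndexError:", err)

# The guarded version reports "no allele" instead of crashing.
print(first_allele([]))
print(first_allele([("GGCCCC" * 3, 5)]))
```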

@HLHsieh
Author

HLHsieh commented Jul 12, 2024

Hi Zilin,

I tried several times, and the same issue occurred. The repeat structure is CC [GGCCCC]264 TAG. I checked, and there are five reads supporting this region. For some reason, NASTRA did not consider these reads, so I guess the error arises because NASTRA concludes that no reads support this region.
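To make the notation concrete, here is a minimal Python sketch that builds a sequence with the structure CC [GGCCCC]264 TAG and counts the longest uninterrupted run of the repeat unit. It is only a sanity check on a synthetic, error-free sequence and is not how NASTRA counts repeats; `longest_repeat_run` is a hypothetical helper.

```python
import re


def longest_repeat_run(sequence, unit):
    """Return the number of units in the longest perfect tandem run of `unit`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    longest = max(
        (m.end() - m.start() for m in pattern.finditer(sequence)), default=0
    )
    return longest // len(unit)


# Synthetic sequence following the structure described above.
read = "CC" + "GGCCCC" * 264 + "TAG"
print(len(read), longest_repeat_run(read, "GGCCCC"))
```

Real long reads contain sequencing errors that break perfect runs, so a check like this would undercount on actual data.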

Best,
Hsin-Lun

@renzilin
Owner

renzilin commented Jul 12, 2024 via email
