mysterious hyphens when processing INDELs from ICGC data #159

Open
mattiyeh opened this issue Oct 9, 2023 · 2 comments

mattiyeh commented Oct 9, 2023

https://github.com/AlexandrovLab/SigProfilerMatrixGenerator/blob/f945199230a4fc0671d90a7873b079930a84d227/SigProfilerMatrixGenerator/scripts/convert_input_to_simple_files.py#L332C10-L332C10

Hello,

Why are the hyphens added to ref and mut here when the other functions don't do anything similar? This breaks downstream processing: the hyphens are added again in MutationMatrixGenerator.py (lines 1176-1179), and you can then get a KeyError at line 1617 in revcompl(type_sequence), because the '-' character is not in the revcompl map.

I fixed this by commenting out the lines in convert_input_to_simple_files.py, but can someone explain whether this will have unintended consequences?
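
To illustrate the failure mode, here is a minimal sketch. It is not the actual SigProfilerMatrixGenerator code: the `COMPLEMENT` map, this `revcompl` implementation, and the `strip_placeholder` helper are assumptions made up for illustration. It only shows how a strict dictionary lookup raises a KeyError as soon as a '-' placeholder reaches it.

```python
# Hypothetical stand-in for a reverse-complement lookup; the real
# revcompl in MutationMatrixGenerator.py may be implemented differently.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcompl(seq):
    """Reverse-complement seq; raises KeyError on characters missing from the map."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def strip_placeholder(allele):
    """Hypothetical helper: drop '-' placeholders so only nucleotides remain."""
    return allele.replace("-", "")


print(revcompl("ACG"))  # CGT

try:
    revcompl("-AC")  # the '-' is not a key in COMPLEMENT
except KeyError as err:
    print("KeyError for character:", err)

print(revcompl(strip_placeholder("-AC")))  # GT
```

Stripping the placeholder avoids the exception in this sketch, but whether that is safe for the real pipeline is exactly the question above.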

Thanks,
Marc

@mdbarnesUCSD
Collaborator

Hi @mattiyeh,

Thanks for reaching out again about the issue you encountered with ICGC input files. It would be a great help if you could provide an input file that reproduces the issue. Thanks!

@mdbarnesUCSD mdbarnesUCSD self-assigned this Nov 10, 2023
@mattiyeh
Author

Hi Mark,

Sure, here is a sample input file:

stomach_indel_mutations.txt

Thanks,
Marc
