What is the meaning of "interval_dict" in long_read_typing.py? #30

Closed
lidd77 opened this issue May 17, 2024 · 7 comments

Comments

@lidd77

lidd77 commented May 17, 2024

Hello, professor,
In the entry "A":"HLA_A:1000-4503" from the code line below, what do the two numbers "1000" and "4503" mean?
And how are these coordinate values obtained, for example "C:1000-5304" or "DPA1:1000-10775"?
'''
interval_dict = {"A":"HLA_A:1000-4503", "B":"HLA_B:1000-5081","C": "HLA_C:1000-5304","DPA1":"HLA_DPA1:1000-10775",
"DPB1":"HLA_DPB1:1000-12468","DQA1":"HLA_DQA1:1000-7492","DQB1":"HLA_DQB1:1000-8480","DRB1":"HLA_DRB1:1000-12229" }
'''

Looking forward to your reply!

Thank you.

@wshuai294
Collaborator

Hello. We added 1000 bp of upstream and 1000 bp of downstream sequence to the reference allele, so an interval like "C:1000-5304" marks where the actual gene sequence lies within the extended reference.
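
To make the reply above concrete, here is a minimal sketch that parses the region strings from `interval_dict` and relates them to the 1000 bp flanks. The helper names `parse_interval` and `describe` are hypothetical and are not part of SpecHLA. The sketch also assumes that `end - start` gives the gene length; the thread does not state the exact coordinate convention.

```python
# Sketch only: parse_interval and describe are hypothetical helpers,
# not functions from long_read_typing.py.
FLANK = 1000  # flank length stated in the thread

interval_dict = {
    "A": "HLA_A:1000-4503", "B": "HLA_B:1000-5081", "C": "HLA_C:1000-5304",
    "DPA1": "HLA_DPA1:1000-10775", "DPB1": "HLA_DPB1:1000-12468",
    "DQA1": "HLA_DQA1:1000-7492", "DQB1": "HLA_DQB1:1000-8480",
    "DRB1": "HLA_DRB1:1000-12229",
}


def parse_interval(region):
    """Split 'HLA_A:1000-4503' into ('HLA_A', 1000, 4503)."""
    contig, span = region.rsplit(":", 1)
    start, end = (int(x) for x in span.split("-"))
    return contig, start, end


def describe(region, flank=FLANK):
    """Return the gene length and the length of the flanked reference.

    Assumption: gene length is end - start (convention not confirmed).
    """
    contig, start, end = parse_interval(region)
    gene_length = end - start
    return {
        "contig": contig,
        "gene_length": gene_length,
        "extended_length": gene_length + 2 * flank,
    }


for gene, region in interval_dict.items():
    info = describe(region)
    print(gene, info["gene_length"], info["extended_length"])
```

Under these assumptions every interval starts at 1000 because that is where the upstream flank ends, and the end coordinate is 1000 plus the length of the reference allele.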

@lidd77
Author

lidd77 commented May 17, 2024

> Hello. We added 1000 bp of upstream and 1000 bp of downstream sequence to the reference allele, so an interval like "C:1000-5304" marks where the actual gene sequence lies within the extended reference.

Sorry, could you share a script that shows how to obtain the interval, for example for HLA-A?
I know the GRCh38 position of HLA-A is chr6:29,941,260-29,949,572, so how is the interval derived from it?
I want to know exactly how the interval is set so that I can understand SpecHLA more deeply, then modify the interval and test its effect.

Looking forward to your reply!

@wshuai294
Collaborator

Hello. For example, we extract the upstream sequence chr6:29,940,260-29,941,260 (left seq) and the downstream sequence chr6:29,949,572-29,949,572 (right seq). We then concatenate these two sequences with our reference allele (left seq + allele + right seq). In this case, the edited reference has a length of 5503, and the actual HLA-A sequence lies in "HLA_A:1000-4503".
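
The construction described above (left seq + allele + right seq) can be sketched as follows. This is an illustration with toy sequences, not the authors' script: `build_extended_reference` is a hypothetical helper, the placeholder sequences stand in for the real flanks and reference allele, and the interval is assumed to be the 0-based slice of the allele within the concatenated sequence.

```python
# Sketch only: build_extended_reference is a hypothetical helper and the
# sequences below are placeholders, not real HLA-A data.
def build_extended_reference(name, left, allele, right):
    """Concatenate flanks and allele; report where the allele sits.

    Assumption: the interval is the Python-style slice [start, end) of the
    allele inside the extended sequence.
    """
    extended = left + allele + right
    start = len(left)
    end = start + len(allele)
    return extended, f"{name}:{start}-{end}"


# Toy inputs sized to match the numbers quoted in the thread for HLA-A:
# 1000 bp flanks and an allele of 3503 bp (4503 - 1000).
left_seq = "A" * 1000
allele_seq = "C" * 3503
right_seq = "G" * 1000

extended_ref, interval = build_extended_reference(
    "HLA_A", left_seq, allele_seq, right_seq
)
print(len(extended_ref), interval)
```

With these placeholder lengths the sketch gives a total length of 5503 and the interval string "HLA_A:1000-4503", matching the figures in the reply. The real flanks and allele would have to come from the genome and from the authors' reference allele.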

@lidd77
Author

lidd77 commented May 17, 2024

> Hello. For example, we extract the upstream sequence chr6:29,940,260-29,941,260 (left seq) and the downstream sequence chr6:29,949,572-29,949,572 (right seq). We then concatenate these two sequences with our reference allele (left seq + allele + right seq). In this case, the edited reference has a length of 5503, and the actual HLA-A sequence lies in "HLA_A:1000-4503".

Is that a typo? Should the downstream sequence be chr6:29,949,572-(29,949,572+4503)?

@wshuai294
Collaborator

Our reference allele differs from the sequence in GRCh38.

@lidd77
Author

lidd77 commented May 28, 2024

> Is that a typo? Should the downstream sequence be chr6:29,949,572-(29,949,572+4503)?

Or should the downstream sequence be chr6:29,949,572-(29,949,572+1000)?

@wshuai294
Collaborator

Please find the reference allele in the supplementary table of the manuscript.
