Fan et al. 2022 #97

Merged
merged 9 commits into from
Nov 14, 2022

Member

@standage standage commented Nov 10, 2022

This branch adds a new marker collection from Fan et al. 2022. All 22 of these markers include SNPs with no RSIDs, so population frequency estimates based on 1000 Genomes haplotypes cannot be determined.

Closes #93. Closes #95.

Comment on lines +13 to +24
rule get_hg37_coords:
    input:
        bed38="marker38.bed",
        chain="hg38ToHg19.over.chain.gz",
    output:
        bed37="marker37.bed",
        bedunmapped="marker-unmapped.bed",
    shell:
        """
        liftOver {input} {output}
        [ ! -s {output.bedunmapped} ]
        """
Member Author

Many of the SNPs/InDels don't have RSIDs, so positions for GRCh37 and GRCh38 must be provided explicitly. The authors provide only GRCh38 positions, so I whipped out UCSC's 38 --> 37 liftOver chain file for that conversion. Yay.
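
For readers unfamiliar with liftOver input: it operates on BED records, which use 0-based, half-open intervals, while variant positions are usually reported as 1-based. The sketch below shows one way the conversion in each direction might look, plus a Python equivalent of the rule's `[ ! -s ... ]` check. The function names and the example coordinate are hypothetical, and the PR does not show how `marker38.bed` is actually built, so treat this as an illustration under those assumptions rather than the project's code.

```python
import os


def position_to_bed(chrom, pos, name):
    """Format a 1-based position as a single-base BED record.

    BED intervals are 0-based and half-open, so position N becomes [N-1, N).
    """
    return f"{chrom}\t{pos - 1}\t{pos}\t{name}"


def bed_to_position(line):
    """Recover (chrom, 1-based position, name) from a single-base BED record."""
    chrom, start, _end, name = line.rstrip("\n").split("\t")[:4]
    return chrom, int(start) + 1, name


def has_unmapped_records(path):
    """Return True if the liftOver 'unmapped' file exists and is non-empty.

    This mirrors the shell test `[ ! -s FILE ]` in the rule, which fails the
    job whenever the file has any content.
    """
    return os.path.exists(path) and os.path.getsize(path) > 0


# Hypothetical coordinate and marker name, for illustration only.
record = position_to_bed("chr6", 1000, "hypothetical-marker")
roundtrip = bed_to_position(record)
```

Failing the rule on a non-empty unmapped file means a marker that cannot be lifted to GRCh37 stops the workflow instead of being dropped silently.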

Contributor

oh fun!!

- data.sort_values('Marker').to_csv(output[0], sep='\t', index=False)
+ data.sort_values(['Marker', 'VariantIndex']).to_csv(output[0], sep='\t', index=False)
Member Author

Made a slight change here...

Comment on lines -9 to +19
mh06PK-24844 2 G GA
mh06FHL-001 2 C CATT
mh06FHL-001 7 A AC
mh06FHL-002 0 AT A,GT
mh06FHL-002 14 TG T
mh06FHL-002 25 C CA
mh06FHL-002 26 A AG
mh06PK-24844 1 C CT
mh06PK-24844 2 G GA
Member Author

...to enforce a consistent order here.
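
A small illustration of why the second sort key matters. This is a plain-Python sketch, not the pandas call from the diff, and it assumes the integer column in the table corresponds to `VariantIndex`. With a single key, rows that share a marker keep whatever relative order they arrived in, so the output depends on input order. With both keys, the output is the same no matter how the rows arrive.

```python
# Rows mimic the table in this PR: (Marker, VariantIndex, Ref, Alt).
# Treating the second column as VariantIndex is an assumption for illustration.
rows = [
    ("mh06PK-24844", 2, "G", "GA"),
    ("mh06FHL-002", 14, "TG", "T"),
    ("mh06PK-24844", 1, "C", "CT"),
    ("mh06FHL-002", 0, "AT", "A,GT"),
]


def sort_by_marker(records):
    """Sort on the marker name only; ties keep their input order."""
    return sorted(records, key=lambda r: r[0])


def sort_by_marker_and_index(records):
    """Sort on marker name, then variant index, so ties are fully resolved."""
    return sorted(records, key=lambda r: (r[0], r[1]))


# The same rows arriving in a different order.
shuffled = list(reversed(rows))

# One key: the result changes with the input order.
one_key_differs = sort_by_marker(rows) != sort_by_marker(shuffled)

# Two keys: the result is identical for both input orders.
two_key_a = sort_by_marker_and_index(rows)
two_key_b = sort_by_marker_and_index(shuffled)
```

A fully determined order also keeps future diffs of the generated table limited to rows that actually changed.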

@standage standage marked this pull request as ready for review November 10, 2022 20:39
@rnmitchell
Contributor

This looks like a Monday morning job... :)

@standage
Member Author

No rush!

@standage
Member Author

On second thought, please hold off on this. There's another massive PR I'd like to resolve first...

@standage standage marked this pull request as draft November 14, 2022 13:58
@standage standage marked this pull request as ready for review November 14, 2022 17:22
@standage
Member Author

Ok this is ready for review @rnmitchell!

@rnmitchell
Contributor

Looks good!

@rnmitchell rnmitchell merged commit 0d6d83a into master Nov 14, 2022
@standage standage deleted the fan2022 branch November 15, 2022 16:49

Successfully merging this pull request may close these issues.

Fan et al 2022
Suggestion: Add -y when installing pytest in README
2 participants