Insert 1bp #13

Open
wants to merge 3 commits into
base: insert-1bp
22 changes: 22 additions & 0 deletions README.md
@@ -163,8 +163,30 @@ python setup.py nosetests
This library requires at least Python 2.6, but otherwise has no
external dependencies.


The library does assume that genome sequence is available through a `pygr`
compatible `SequenceFileDB` object. For an example of writing a wrapper for
a different genome sequence back-end, see
[hgvs.tests.genome.MockGenome](hgvs/tests/genome.py).

## Flask Web Service
Contributor

Thanks for this contribution!

Do you think this could be done as a separate package that imports the hgvs library? The majority of use cases will require just the library, not the HTTP API.

Author

Hi @pkaleta, I totally agree; I can make this a separate package. I originally wrote it this way because our pipeline was primarily Java and we wanted a way to interact with the library. Alternatively, this could be a multi-package project. Up to you.

Let me know.

HGVS API for the Counsyl hgvs tool.

To run:

    $ git clone https://github.com/alexfrieden/hgvs.git -b insert-1bp
    $ cd hgvs-api
    $ python setup.py install
    $ cd hgvs/hgvs-api/
    $ sudo easy_install flask
    ...
    $ sudo easy_install pygr
    ...
    $ python app.py
     * Running on http://127.0.0.1:5000/
     * Restarting with reloader


Now try it out: go to http://127.0.0.1:5000/ to see examples of converting VCF to cDNA and cDNA to VCF.
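
The two conversion endpoints can also be called from a script. The following is a minimal sketch, assuming the service is running locally on port 5000 as described above. `build_url` and `BASE_URL` are hypothetical names introduced for illustration, not part of the hgvs library, and the code targets Python 3 (`urllib.parse`). It only builds the request URLs; fetching them (for example with `urllib.request.urlopen`) requires the server to be up.

```python
from urllib.parse import urlencode

# Assumption: the Flask development server is listening on its default address.
BASE_URL = "http://127.0.0.1:5000"


def build_url(endpoint, **params):
    """Return the request URL for a /convert/<endpoint> call (hypothetical helper)."""
    # urlencode percent-encodes reserved characters such as '>' in 'c.123A>G'.
    return "{0}/convert/{1}?{2}".format(BASE_URL, endpoint, urlencode(params))


# The same two example queries that the index route advertises.
cdna_url = build_url("cdnaToVcf", gene="CFTR", cdna="c.1521_1523delCTT")
vcf_url = build_url("vcfToCdna", chrom="chr7", pos=116986880, ref="ATCT", alt="A")
```

Building the query string with `urlencode` rather than by hand matters mostly for substitutions, where the `>` in the cDNA name has to be escaped.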


114 changes: 114 additions & 0 deletions app.py
@@ -0,0 +1,114 @@
import csv
import sys
import hgvs
import hgvs.utils
#from hgvs import utils
#import pygr
from pygr.seqdb import SequenceFileDB
from flask import Flask,jsonify, request


application = app = Flask(__name__)


genome = SequenceFileDB('/Users/afrieden/test/hg/hg18.fa')



geneTranscripts = {'CFTR':'NM_000492.3','PCDH15':'NM_033056.3','ABCC8':'NM_000352.3','ASPA':'NM_000049.2','BCKDHA':'NM_000709.3','BCKDHB':'NM_000056.3',
'BLM':'NM_000057.2','CLRN1':'NM_174878.2','DLD':'NM_000108.3','FANCC':'NM_000136.2','G6PC':'NM_000151.3','HEXA':'NM_000520.4',
'IKBKAP':'NM_003640.3',
'MCOLN1':'NM_020533.2',
'SMPD1':'NM_000543.4',
'TMEM216':'NM_001173990.2',
'FKTN':'NM_006731.2',
'GBA':'NM_001005741.2',
'NEB':'NM_001164507.1'}

def getGeneFromPosition(_position):
    #check which gene it is in
    geneExportPath = '/Users/afrieden/Projects/hgvs/hgvs/data/geneExport.csv'
    reader = csv.reader(open(geneExportPath), delimiter=',')
    next(reader, None)
    contents = [line for line in reader]
    position = int(_position)
    for row in contents:
        if(position > int(row[1])):
            if(position < int(row[2])):
                return row[0]
    _max = sys.maxint
    closestGene = ''
    for row in contents:
        diff_start = abs(int(row[1])-position)
        diff_end = abs(int(row[2]) - position)
        diff = min(diff_start,diff_end)
        if(diff < _max):
            _max = diff
            closestGene = row[0]
    return closestGene


with open('./hgvs/data/gsg-transcript-03-06-2014.txt') as infile:
    transcripts = hgvs.utils.read_transcripts(infile)
    #transcripts = read_transcripts(infile)

def get_transcript(name):
    return transcripts.get(name)




@app.route('/')
def index():
    return jsonify( {
        'first':'http://localhost:5000/convert/cdnaToVcf?gene=CFTR&cdna=c.1521_1523delCTT',
        'second':'http://localhost:5000/convert/vcfToCdna?chrom=chr7&pos=116986880&ref=ATCT&alt=A'
    })



@app.route('/convert/vcfToCdna')
def vcfToCdna():
    _chrom = str(request.args.get('chrom'))
    _pos = int(request.args.get('pos'))
    _ref = str(request.args.get('ref'))
    _alt = str(request.args.get('alt'))

    chrom, offset, ref, alt = (_chrom, _pos, _ref, _alt)
    gene = str(getGeneFromPosition(_pos))
    transcript = get_transcript(geneTranscripts[gene])
    hgvs_name = hgvs.format_hgvs_name(
        chrom, offset, ref, alt, genome, transcript)
    cdnaName = hgvs_name.split(':')[1]
    fullName = gene + ':' + cdnaName
    return jsonify( {'name':fullName})



@app.route('/convert/cdnaToVcf')
def cdnaToVcf():
    gene = request.args.get('gene')
    cdna_name = request.args.get('cdna')
    transcript = geneTranscripts[gene]
    transcriptName = str(transcript + ':' + cdna_name)
    chrom, offset, ref, alt = hgvs.parse_hgvs_name(
        transcriptName, genome, get_transcript=get_transcript)

    return jsonify({
        'foundTranscript': transcript,
        'geneInput':gene,
        'cdna_name':cdna_name,
        'transcriptName':transcriptName,
        'chrom':chrom,
        'position':offset,
        'ref':ref,
        'alt':alt

    })





if __name__ == '__main__':
    app.run(debug = True)
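
The lookup performed by `getGeneFromPosition` in this file (return the gene whose interval contains the position, otherwise the gene with the nearest boundary) can be sketched without the hard-coded file path. This is a sketch under stated assumptions, not the PR's code: `load_genes` and `gene_for_position` are hypothetical names, the rows are assumed to follow the `geneName,geneLower,geneUpper` layout of `geneExport.csv`, and `float('inf')` stands in for `sys.maxint`, which does not exist in Python 3. Like the original, it ignores the chromosome, so a position is compared against genes on every chromosome.

```python
import csv
import io


def load_genes(handle):
    """Parse geneName,geneLower,geneUpper rows into (name, lower, upper) tuples."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    return [(row[0], int(row[1]), int(row[2])) for row in reader if row]


def gene_for_position(genes, position):
    """Return the gene containing position, else the one with the nearest boundary."""
    position = int(position)
    for name, lower, upper in genes:
        if lower < position < upper:  # strict bounds, as in the original
            return name
    closest, best = '', float('inf')
    for name, lower, upper in genes:
        distance = min(abs(lower - position), abs(upper - position))
        if distance < best:
            closest, best = name, distance
    return closest


# Two rows copied from geneExport.csv, used here as in-memory sample data.
sample = io.StringIO(
    "geneName,geneLower,geneUpper\n"
    "CFTR,116907385,117094398\n"
    "DLD,107318932,107346940\n"
)
genes = load_genes(sample)
inside = gene_for_position(genes, 116986880)   # falls inside the CFTR interval
nearest = gene_for_position(genes, 107400000)  # outside both intervals
```

Loading the rows once and passing them in avoids re-reading the CSV on every request and makes the function easy to test with in-memory data.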
18 changes: 18 additions & 0 deletions hgvs/data/geneExport.csv
@@ -0,0 +1,18 @@
geneName,geneLower,geneUpper
BCKDHA,46595573,46622353
IKBKAP,110671216,110733247
CLRN1,152128413,152173185
BLM,89091627,89159513
HEXA,70423472,70455367
BCKDHB,80873130,81110240
CFTR,116907385,117094398
PCDH15,55251624,56094028
ASPA,3326204,3349132
G6PC,38306420,38316969
DLD,107318932,107346940
SMPD1,6368405,6372413
MCOLN1,7493637,7504681
HBB,5203404,5204827
ABCC8,17371114,17454899
FANCC,96903810,97051394
GBA,153471410,153477527
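
`vcfToCdna` in `app.py` indexes `geneTranscripts` with whatever gene name the position lookup returns, so every gene listed in this file needs an entry in that dictionary. A small check along the following lines could flag mismatches before the service starts. `missing_transcripts` is a hypothetical helper, and the sample data is a subset copied from this diff rather than the full tables.

```python
def missing_transcripts(gene_names, gene_transcripts):
    """Return gene names that have no entry in the gene-to-transcript mapping."""
    return sorted(name for name in gene_names if name not in gene_transcripts)


# Subset of the mapping in app.py and of the gene names in geneExport.csv.
gene_transcripts = {'CFTR': 'NM_000492.3', 'GBA': 'NM_001005741.2'}
csv_gene_names = ['CFTR', 'HBB', 'GBA']

missing = missing_transcripts(csv_gene_names, gene_transcripts)
```

In the data shown in this diff, HBB appears in the CSV but not in `geneTranscripts`, so a position that resolves to HBB would raise a `KeyError` in `vcfToCdna`.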
19 changes: 19 additions & 0 deletions hgvs/data/gsg-transcript-03-06-2014.txt
@@ -0,0 +1,19 @@
184 NM_000492.3 chr7 + 116907384 117094398 116907384 117094398 27 116907384,116931542,116936323,116958188,116961565,116962537,116963837,116967389,116969305,116975930,116986753,117015028,117017642,117019223,117022219,117030115,117030821,117033963,117037808,117038870,117041902,117054811,117069727,117080131,117091977,117092748,117094197, 116907437,116931653,116936432,116958404,116961655,116962701,116963963,116967636,116969398,116976113,116986945,117015123,117017729,117019947,117022348,117030153,117031072,117034043,117037959,117039098,117042003,117055060,117069883,117080221,117092150,117092854,117094398, 0 cmpl cmpl 0,2,2,0,0,0,2,2,0,0,0,0,2,2,0,0,2,1,0,1,1,0,0,0,0,2,0,
126 NM_033056.3 chr10 - 55251623 56094028 55251623 56094028 32 55251623,55257158,55258330,55261080,55270085,55286940,55296407,55333008,55368580,55370631,55389497,55391517,55425414,55449957,55452657,55496522,55509096,55519749,55562640,55582865,55613209,55614899,55625448,55643701,55666588,55747036,55759361,55776130,55798885,55808547,55957577,56093937, 55253124,55257314,55258339,55261299,55270262,55287029,55296623,55333136,55368721,55370741,55389610,55391658,55425531,55450182,55452963,55496651,55509190,55519829,55562773,55583059,55613359,55615034,55625655,55643814,55666697,55747207,55759472,55776250,55799041,55808708,55957643,56094028, 0 cmpl cmpl 2,2,2,2,2,0,0,1,1,2,0,0,0,0,0,0,2,0,2,0,0,0,0,1,0,0,0,0,0,1,1,0,
89 NM_000352.3 chr11 - 17371113 17454899 17371113 17454899 39 17371113,17371819,17372388,17373294,17373732,17373974,17375038,17375315,17375806,17376461,17380783,17382634,17383616,17384744,17385010,17385476,17386514,17388638,17390788,17391516,17392634,17393426,17395052,17405171,17405989,17406411,17406687,17408936,17410326,17420842,17421300,17426638,17431241,17438610,17439705,17441560,17448223,17453008,17454751, 17371251,17371882,17372522,17373398,17373841,17374053,17375169,17375436,17375920,17376564,17380876,17382792,17383686,17384911,17385252,17385576,17386640,17388776,17390869,17391601,17392733,17393462,17395085,17405277,17406065,17406528,17406793,17409082,17410367,17421005,17421435,17426794,17431406,17438799,17439948,17441727,17448345,17453150,17454899, 0 cmpl cmpl 0,0,1,2,1,0,1,0,0,2,2,0,2,0,1,0,0,0,0,2,2,2,2,1,0,0,2,0,1,0,0,0,0,0,0,1,2,1,0,
610 NM_000049.2 chr17 + 3326203 3349132 3326203 3349132 6 3326203,3331646,3333542,3339278,3344393,3348934, 3326439,3331842,3333636,3339386,3344503,3349132, 0 cmpl cmpl 0,2,0,1,1,0,
904 NM_000709.3 chr19 + 46595572 46622353 46595572 46622353 9 46595572,46608381,46608667,46611793,46616879,46619908,46620373,46620742,46622182, 46595680,46608561,46608754,46611902,46617041,46620115,46620515,46620914,46622353, 0 cmpl cmpl 0,0,0,0,1,1,1,2,0,
150 NM_000056.3 chr6 + 80873129 81110240 80873129 81110240 10 80873129,80893982,80895596,80934113,80935310,80937717,80967369,80969537,81039570,81110099, 80873325,80894060,80895665,80934247,80935466,80937826,80967467,80969648,81039657,81110240, 0 cmpl cmpl 0,1,1,1,0,0,1,0,0,0,
160 NM_000057.2 chr15 + 89091626 89159513 89091626 89159513 21 89091626,89093600,89096020,89099044,89104380,89104827,89107199,89109529,89111143,89113366,89113671,89127055,89129154,89134882,89138400,89142423,89147754,89148400,89153370,89155438,89159335, 89091724,89094301,89096180,89099172,89104513,89105489,89107391,89109648,89111257,89113465,89113820,89127162,89129315,89135078,89138591,89142571,89147954,89148593,89153493,89155640,89159513, 0 cmpl cmpl 0,2,1,2,1,2,1,1,0,0,0,2,1,0,1,0,1,0,1,1,2,
1734 NM_174878.2 chr3 - 152128412 152173185 152128412 152173185 3 152128412,152142058,152172932, 152128678,152142238,152173185, 0 cmpl cmpl 1,1,0,
1500 NM_000198.3 chr7 + 107318931 107346940 107318931 107346940 14 107318931,107320880,107329418,107330005,107331158,107332638,107333041,107333947,107343186,107344474,107344953,107345604,107346690,107346874, 107318970,107320959,107329498,107330074,107331228,107332739,107333185,107334049,107343377,107344645,107345143,107345742,107346780,107346940, 0 cmpl cmpl 0,0,1,0,0,1,0,0,0,2,2,0,0,0,
166 NM_000136.2 chr9 - 96903809 97051394 96903809 97051394 14 96903809,96909168,96913565,96916731,96919417,96927188,96928631,96937448,96952025,96973181,96974139,97042751,97049534,97051229, 96903953,96909372,96913740,96916813,96919493,96927288,96928684,96937605,96952190,96973246,96974250,97042846,97049619,97051394, 0 cmpl cmpl 0,0,2,1,0,2,0,2,2,0,0,1,0,0,
898 NM_000151.3 chr17 + 38306419 38316969 38306419 38316969 5 38306419,38309473,38313065,38314845,38316457, 38306649,38309583,38313171,38314961,38316969, 0 cmpl cmpl 0,2,1,2,1,
1139 NM_000520.4 chr15 - 70423471 70455367 70423471 70455367 14 70423471,70424840,70425629,70425921,70427080,70427442,70428473,70429912,70430527,70432462,70433085,70434953,70435919,70455114, 70423535,70424945,70425720,70426105,70427153,70427529,70428654,70430045,70430629,70432573,70433132,70435019,70436012,70455367, 0 cmpl cmpl 2,2,1,0,2,2,1,0,0,0,1,1,1,0,
179 NM_003640.3 chr9 - 110671215 110733247 110671215 110733247 36 110671215,110676995,110680095,110680723,110681546,110682152,110683805,110684225,110691432,110693303,110695086,110696043,110698596,110699053,110699248,110700602,110700771,110701922,110702360,110703525,110703728,110704939,110705663,110708403,110710405,110713110,110714364,110718304,110719647,110720911,110721353,110724942,110728623,110729472,110731869,110733097, 110671283,110677071,110680250,110680851,110681658,110682266,110683866,110684288,110691494,110693505,110695184,110696167,110698745,110699139,110699386,110700682,110700850,110701996,110702476,110703631,110703782,110705043,110705770,110708586,110710505,110713281,110714595,110718398,110719771,110721002,110721450,110725028,110728704,110729554,110732022,110733247, 0 cmpl cmpl 1,0,1,2,1,1,0,0,1,0,1,0,1,2,2,0,2,0,1,0,0,1,2,2,1,1,1,0,2,1,0,1,1,0,0,0,
642 NM_020533.2 chr19 + 7493636 7504681 7493636 7504681 14 7493636,7495846,7497324,7497646,7498405,7498749,7499043,7499482,7499706,7499986,7500475,7501171,7504408,7504644, 7493667,7496052,7497492,7497812,7498514,7498846,7499143,7499589,7499856,7500088,7500598,7501387,7504539,7504681, 0 cmpl cmpl 0,1,0,0,1,2,0,1,0,0,0,0,0,2,
633 NM_000543.4 chr11 + 6368404 6372413 6368404 6372413 6 6368404,6369189,6371021,6371422,6371701,6372003, 6368722,6369962,6371193,6371499,6371847,6372413, 0 cmpl cmpl 0,0,2,0,2,1,
1051 NM_001173990.2 chr11 + 60916679 60922324 60916679 60922324 5 60916679,60917278,60917931,60921821,60922317, 60916713,60917380,60918024,60922023,60922324 0 cmpl cmpl 0,1,1,1,2,
176 NM_001079802.1 chr9 + 107377134 107437366 107377134 107437366 9 107377134,107398699,107403246,107406316,107409920,107417379,107420060,107422035,107437152, 107377239,107398759,107403450,107406594,107410053,107417509,107420194,107422163,107437366, 0 cmpl cmpl 0,0,0,0,2,0,1,0,2,
1769 NM_001005741.2 chr1 - 153471409 153477527 153471409 153477527 11 153471409,153471609,153472095,153472659,153473755,153474548,153474931,153476030,153476300,153477044,153477500, 153471515,153471726,153472259,153472884,153473993,153474721,153475065,153476177,153476492,153477132,153477527, 0 cmpl cmpl 2,2,0,0,2,0,1,1,1,0,0,
218 NM_001164507.1 chr2 - 152050519 152297916 152050519 152297916 180 152050519,152054730,152055131,152056442,152056857,152057146,152058112,152058534,152058920,152061035,152061700,152062385,152063019,152064057,152066150,152067552,152068108,152070237,152070925,152071668,152072764,152077492,152078339,152079077,152079577,152081218,152083077,152083724,152084417,152089070,152089274,152089922,152090717,152090917,152091677,152092240,152093969,152095753,152096551,152098974,152100450,152101891,152102631,152102899,152105103,152105456,152106206,152110647,152111104,152112188,152112393,152113067,152114395,152116497,152117431,152118157,152118587,152119690,152125351,152125763,152125967,152126869,152127370,152128365,152128582,152129803,152130260,152130475,152131927,152132830,152133075,152133377,152134027,152134840,152135256,152140454,152140910,152144097,152145556,152146242,152147223,152148214,152150266,152151164,152152133,152154648,152156107,152156793,152157775,152158766,152160819,152161717,152162686,152165201,152166660,152167346,152168328,152169319,152171365,152172263,152173232,152174568,152175275,152175521,152176945,152179035,152180761,152182123,152183034,152184210,152185678,152190293,152191761,152192282,152194298,152195474,152195902,152198413,152200986,152204039,152204616,152205114,152207333,152207524,152207909,152208577,152209227,152210889,152214935,152215333,152218751,152220029,152220576,152220912,152222742,152223824,152226894,152228307,152229260,152229518,152230092,152230849,152232563,152233786,152235782,152237128,152239236,152240046,152242324,152242635,152244480,152244677,152245489,152247421,152249537,152252178,152252385,152253051,152255486,152256623,152256818,152257020,152259082,152259281,152260337,152261396,152261907,152262109,152262303,152271640,152274415,152275193,152282175,152288141,152289019,152289616,152290212,152292450,152294374,152297880, 
152050693,152054837,152055278,152056535,152057041,152057254,152058205,152058627,152059013,152061128,152061793,152062478,152063112,152064150,152066243,152067645,152068201,152070330,152071018,152071773,152072869,152077597,152078453,152079188,152079688,152081323,152083182,152083829,152084528,152089175,152089376,152090033,152090828,152091022,152091782,152092345,152094074,152095861,152096656,152099079,152100555,152101996,152102736,152103004,152105208,152105561,152106311,152110758,152111203,152112293,152112498,152113175,152114500,152116605,152117536,152118262,152118785,152119804,152125459,152125868,152126072,152126974,152127568,152128479,152128690,152129908,152130365,152130580,152132239,152132938,152133177,152133476,152134132,152135152,152135361,152140559,152141114,152144409,152145664,152146347,152147427,152148526,152150374,152151269,152152337,152154960,152156215,152156898,152157979,152159078,152160927,152161822,152162890,152165513,152166768,152167451,152168532,152169631,152171473,152172368,152173436,152174880,152175383,152175626,152177149,152179347,152180869,152182228,152183238,152184522,152185786,152190398,152191965,152192594,152194406,152195579,152196106,152198725,152201094,152204144,152204820,152205426,152207441,152207629,152208113,152208889,152209335,152210994,152215139,152215645,152218859,152220134,152220783,152221224,152222850,152223929,152227101,152228619,152229368,152229623,152230299,152231161,152232671,152233891,152235989,152237440,152239344,152240151,152242531,152242947,152244588,152244782,152245588,152247529,152249735,152252292,152252493,152253156,152255585,152256728,152256926,152257122,152259196,152259389,152260442,152261495,152262012,152262217,152262408,152271757,152274523,152275298,152282280,152288246,152289124,152289721,152290320,152292666,152294416,152297916, 0 cmpl cmpl 
0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
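
For readers unfamiliar with the layout of the rows above, the following sketch pulls the exon intervals out of one row. It rests on assumptions: the column meanings (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, ...) are inferred from the UCSC refGene-style layout, columns are split on whitespace as they appear in this view, and `parse_transcript_row` is a hypothetical helper rather than the parser that `hgvs.utils.read_transcripts` uses.

```python
def parse_transcript_row(line):
    """Parse one whitespace-separated transcript row into a dict (illustrative only)."""
    fields = line.split()
    # Exon lists are comma-separated and may carry a trailing comma.
    starts = [int(x) for x in fields[9].split(',') if x]
    ends = [int(x) for x in fields[10].split(',') if x]
    return {
        'name': fields[1],
        'chrom': fields[2],
        'strand': fields[3],
        'tx_start': int(fields[4]),
        'tx_end': int(fields[5]),
        'exon_count': int(fields[8]),
        'exons': list(zip(starts, ends)),
    }


# The NM_000049.2 row from this file, copied as sample input.
row = ("610 NM_000049.2 chr17 + 3326203 3349132 3326203 3349132 6 "
       "3326203,3331646,3333542,3339278,3344393,3348934, "
       "3326439,3331842,3333636,3339386,3344503,3349132, "
       "0 cmpl cmpl 0,2,0,1,1,0,")
transcript = parse_transcript_row(row)
```

Comparing `len(transcript['exons'])` with `transcript['exon_count']` gives a cheap sanity check that a row was not truncated or split across lines.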