kevinblighe/PythonScripts

PythonScripts

IncrementPos.py

Increments the POS field of every record in an uncompressed VCF by 1.
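
The following is a minimal sketch of the idea, not the code in IncrementPos.py. It assumes a tab-delimited VCF whose header lines start with `#` and whose second column is POS; the function name `increment_pos` and the example record are hypothetical.

```python
import io

def increment_pos(lines, offset=1):
    """Yield VCF lines with the POS column (2nd field) shifted by `offset`."""
    for line in lines:
        # Header/meta lines and blank lines pass through unchanged.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        fields[1] = str(int(fields[1]) + offset)  # POS is the 2nd column
        yield "\t".join(fields) + "\n"

# Hypothetical in-memory VCF used only for illustration.
vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n"
)
shifted = list(increment_pos(vcf))
print("".join(shifted), end="")
```

Processing the file line by line keeps memory use flat, which matters for large uncompressed VCFs.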

SplitMultialleles.py

Splits multi-allelic calls into multiple entries, one per alternate allele, in an uncompressed VCF.
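
As a rough illustration of what splitting means, the sketch below emits one record per comma-separated ALT allele. It is not the logic of SplitMultialleles.py: the function name `split_multiallelic` and the example record are hypothetical, and all other columns are copied unchanged, so per-allele INFO values and genotype fields are deliberately not adjusted here.

```python
def split_multiallelic(line):
    """Return one VCF record per ALT allele; other columns are copied as-is."""
    fields = line.rstrip("\n").split("\t")
    alts = fields[4].split(",")  # ALT is the 5th column
    records = []
    for alt in alts:
        new_fields = list(fields)
        new_fields[4] = alt
        records.append("\t".join(new_fields))
    return records

# Hypothetical record with two alternate alleles.
record = "chr1\t100\t.\tA\tG,T\t50\tPASS\tDP=10"
split_records = split_multiallelic(record)
for r in split_records:
    print(r)
```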

LocusTagSearch.py / LocusTagSearch.list

Searches NCBI via eUtils for the locus tags listed in an input file (https://www.biostars.org/p/278614/)
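
For orientation, an eUtils search is an HTTP request to the `esearch.fcgi` endpoint with `db` and `term` parameters. The sketch below only builds such a URL and makes no network call. Which database the script queries is not stated here, so `db="gene"` is an assumption, and the helper name `esearch_url`, the locus tag `b0001`, and the email address are placeholders.

```python
from urllib.parse import urlencode, urlparse, parse_qs

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(term, db="gene", email="you@example.org"):
    """Build an ESearch URL for one search term (db and email are assumptions)."""
    params = {"db": db, "term": term, "email": email, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"

# Placeholder locus tag; in practice each line of the input file would be a term.
url = esearch_url("b0001")
print(url)
```

Going through `urlencode` means locus tags containing spaces or special characters are escaped correctly.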

ProteinFASTASearch.py / ProteinFASTASearch.list

Searches for protein FASTA sequences by protein ID (e.g. UniProt), with the IDs taken from an input file (https://www.biostars.org/p/278747/)

cat ProteinFASTASearch.list

Q66LE6
Q9UKV3


python ProteinFASTASearch.py -f 1 -e kevin@clinicalbioinformatics.co.uk ProteinFASTASearch.list

>NP_060931.2 serine/threonine-protein phosphatase 2A 55 kDa regulatory subunit B delta isoform isoform a [Homo sapiens]
MAGAGGGGCPAGGNDFQWCFSQVKGAIDEDVAEADIISTVEFNYSGDLLATGDKGGRVVIFQREQENKSR
PHSRGEYNVYSTFQSHEPEFDYLKSLEIEEKINKIRWLPQQNAAHFLLSTNDKTIKLWKISERDKRAEGY
NLKDEDGRLRDPFRITALRVPILKPMDLMVEASPRRIFANAHTYHINSISVNSDHETYLSADDLRINLWH
LEITDRSFNIVDIKPANMEELTEVITAAEFHPHQCNVFVYSSSKGTIRLCDMRSSALCDRHSKFFEEPED
PSSRSFFSEIISSISDVKFSHSGRYMMTRDYLSVKVWDLNMESRPVETHQVHEYLRSKLCSLYENDCIFD
KFECCWNGSDSAIMTGSYNNFFRMFDRDTRRDVTLEASRESSKPRASLKPRKVCTGGKRRKDEISVDSLD
FNKKILHTAWHPVDNVIAVAATNNLYIFQDKIN

>NP_001158286.1 apoptotic chromatin condensation inducer in the nucleus isoform 2 [Homo sapiens]
MWRRKHPRTSGGTRGVLSGNRGVEYGSGRGHLGTFEGRWRKLPKMPEAVGTDPSTSRKMAELEEVTLDGK
PLQALRVTDLKAALEQRGLAKSGQKSALVKRLKGALMLENLQKHSTPHAAFQPNSQIGEEMSQNSFIKQY
LEKQQELLRQRLEREAREAAELEEASAESEDEMIHPEGVASLLPPDFQSSLERPELELSRHSPRKSSSIS
EEKGDSDDEKPRKGERRSSRVRQARAAKLSEGSQPAEEEEDQETPSRNLRVRADRNLKTEEEEEEEEEEE
EDDEEEEGDDEGQKSREAPILKEFKEEGEEIPRVKPEEMMDERPKTRSQEQEVLERGGRFTRSQEEARKS
HLARQQQEKEMKTTSPLEEEEREIKSSQGLKEKSKSPSPPRLTEDRKKASLVALPEQTASEEETPPPLLT
KEASSPPPHPQLHSEEEIEPMEGPAPPVLIQLSPPNTDADTRELLVSQHTVQLVGGLSPLSSPSDTKAES
PAEKVPEESVLPLVQKSTLADYSAQKDLEPESDRSAQPLPLKIEELALAKGITEECLKQPSLEQKEGRRA
SHTLLPSHRLKQSADSSSSRSSSSSSSSSRSRSRSPDSSGSRSHSPLRSKQRDVAQARTHANPRGRPKMG
SRSTSESRSRSRSRSRSASSNSRKSLSPGVSRDSSTSYTETKDPSSGQEVATPPVPQLQVCEPKERTSTS
SSSVQARRLSQPESAEKHVTQRLQPERGSPKKCEAEEAEPPAATQPQTSETQTSHLPESERIHHTVEEKE
EVTMDTSENRPENDVPEPPMPIADQVSNDDRPEGSVEDEEKKESSLPKSFKRKISVVSTKGVPAGNSDTE
GGQPGRKRRWGASTATTQKKPSISITTESLKEAVVDLHADDSRISEDETERNGDDGTHDKGLKICRTVTQ
VVPAEGQENGQREEEEEEKEPEAEPPVPPQVSVEVALPPPAEHEVKKVTLGDTLTRRSISQQKSGVSITI
DDPVRTAQVPSPPRGKISNIVHISNLVRPFTLGQLKELLGRTGTLVEEAFWIDKIKSHCFVTYSTVEEAV
ATRTALHGVKWPQSNPKFLCADYAEQDELDYHRGLLVDRPSETKTEEQGIPRPLHPPPPPPVQPPQHPRA
EQREQERAVREQWAEREREMERRERTRSEREWDRDKVREGPRSRSRSRDRRRKERAKSKEKKSEKKEKAQ
EEPPAKLLDDLFRKTKAAPCIYWLPLTDSQIVQKEAERAERAKEREKRRKEQEEEEQKEREKEAERERNR
QLEREKRREHSRERDRERERERERDRGDRDRDRERDRERGRERDRRDTKRHSRSRSRSTPVRDRGGRR

ProteinFASTASearchByFASTATitle.py

Searches for FASTA sequences whose title matches any keyword (https://www.biostars.org/p/279584/)

python ProteinFASTASearchByFASTATitle.py -e kevin@clinicalbioinformatics.co.uk -t "Bacillus anthracis"

>WP_154574506.1 IS3 family transposase, partial [Bacillus anthracis]
KKDEYSIKEICILIGIPRSTYYRWKNKEKDVKEAKLEQAILTICMTNHFRYGHRKVTALLKRKYNYHPNR
KTVQKIMQKKNLQCRVKRKRRTWINGESRIVVENLLNRNFQANKPNEKWVTDITYLPFGTEMLYLLSIMD
LYNNEIIAYEISNRQDVTLVLRTVEKAIKLQQKTQIILHSDQGAVYTSYAFQTLSKKMALPQVCPVKEIV
MIMP

>WP_154556816.1 DUF4180 domain-containing protein, partial [Bacillus anthracis]
FAIVGDFSMYTSKSLKDFIYECNKGKDIFYLATEQQAIEKLSTLK

>WP_154556815.1 helix-turn-helix transcriptional regulator, partial [Bacillus anthracis]
MEFYDLGITIKELRIKKNISQSELCHGICSQSQISKIEKGVIYPSSILLYQLSERLGIDPNNIFALTKNK
KFKYIENVKCIMKDCIRQHQ

>WP_154556814.1 DUF4180 domain-containing protein, partial [Bacillus anthracis]
MEIKKVVIDGINIAVIRNNKVLISDVQSALDTMATVQYEVNAKHIIIHKSLISEDFFDLKTRLAGDIL
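
FASTA output like the records above can be post-processed with a few lines of Python. The sketch below is not part of this repository; the function name `parse_fasta` is hypothetical, and the input is assumed to be plain FASTA text with `>` header lines, as in the example output.

```python
import io

def parse_fasta(handle):
    """Yield (header, sequence) tuples from FASTA-formatted text."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # blank lines between records are ignored
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined per record
    if header is not None:
        yield header, "".join(chunks)

# Two records copied from the example output above.
example = io.StringIO(
    ">WP_154556816.1 DUF4180 domain-containing protein, partial [Bacillus anthracis]\n"
    "FAIVGDFSMYTSKSLKDFIYECNKGKDIFYLATEQQAIEKLSTLK\n"
    "\n"
    ">WP_154556814.1 DUF4180 domain-containing protein, partial [Bacillus anthracis]\n"
    "MEIKKVVIDGINIAVIRNNKVLISDVQSALDTMATVQYEVNAKHIIIHKSLISEDFFDLKTRLAGDIL\n"
)
records = list(parse_fasta(example))
for header, seq in records:
    print(header.split()[0], len(seq))
```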
