jvirico/dna-sequence-data-handling

TBD

DATA

The data is in FASTA format and is processed with Biopython. The working example, pdb_seqres.txt, was downloaded from [2].
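As a minimal sketch of how FASTA records can be read with Biopython, the snippet below parses a small in-memory example with `Bio.SeqIO.parse`. The two records and the helper name `read_fasta` are made up for illustration and are not taken from pdb_seqres.txt; it assumes Biopython is installed.

```python
from io import StringIO

from Bio import SeqIO

# Made-up records for illustration only; not taken from pdb_seqres.txt.
FASTA_TEXT = """>seq1 toy record
ATGCATGC
>seq2 another toy record
GGGTTTAA
CC
"""


def read_fasta(handle):
    """Return a list of (record id, sequence string) pairs from a FASTA handle."""
    return [(record.id, str(record.seq)) for record in SeqIO.parse(handle, "fasta")]


# A sequence split over several lines is joined into a single string.
records = read_fasta(StringIO(FASTA_TEXT))
for name, sequence in records:
    print(name, len(sequence), sequence)
```

To read a file on disk instead, an open file handle (for example `open("pdb_seqres.txt")`) can be passed in place of the `StringIO` object.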

Encoding sequence data

Showcasing three approaches to encoding sequence data:

  • Ordinal encoding of a DNA sequence
  • One-hot encoding of a DNA sequence
  • Treating a DNA sequence as a "language", known as k-mer counting
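The three approaches listed above can be sketched in plain Python as follows. This is an illustrative sketch, not necessarily the repository's implementation: the function names, the integer codes (A=1, C=2, G=3, T=4, anything else 0), and the default k=3 are all assumptions chosen for the example.

```python
from collections import Counter

# Assumed alphabet and ordering; any other symbol is treated as "unknown".
ALPHABET = "ACGT"


def ordinal_encode(seq):
    """Map each base to an integer code (A=1, C=2, G=3, T=4, unknown=0)."""
    codes = {base: index for index, base in enumerate(ALPHABET, start=1)}
    return [codes.get(base, 0) for base in seq.upper()]


def one_hot_encode(seq):
    """Map each base to a 4-element 0/1 vector; unknown bases become all zeros."""
    return [[1 if base == ref else 0 for ref in ALPHABET] for base in seq.upper()]


def kmer_counts(seq, k=3):
    """Count overlapping substrings of length k (the "words" of the sequence)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


example = "ATGCATGC"  # made-up sequence for illustration
print(ordinal_encode(example))
print(one_hot_encode(example)[:2])
print(kmer_counts(example, k=3))
```

Ordinal and one-hot encodings keep one entry per base, so their length grows with the sequence, whereas k-mer counts summarize a sequence of any length as counts over a fixed vocabulary of at most 4^k words.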

REFERENCES

[1] - Demystify DNA sequencing with ML and Python

[2] - RCSB Protein Data Bank

About

Testing different Python libraries for bioinformatics.
