Genome compression using the Burrows Wheeler Transform and the Huffman coding

Python tkinter MIT License LinkedIn



Burrows Wheeler Transform and Huffman coding

with Python

Explore the BWT »
Explore the Huffman coding »

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Contributing
  5. License
  6. Contact

About The Project

NOTICE: The interface looks better on Windows than on Linux.


This project implements the Burrows Wheeler Transform and the Huffman coding algorithm in Python in order to compress genome sequences.

We can use this application:

  • To compress genome files (fasta, txt).
  • To decompress files and recover the genomic sequence.
  • To run the Burrows Wheeler Transform of a given sequence step by step.
  • To decrypt a Burrows Wheeler sequence.
  • To visualize the full compression and decompression process of a genome.

The genomic sequence can be entered manually or loaded from a file (fasta, txt).

Built With

Getting Started

To get a local copy up and running, follow these simple steps.

1. Clone the repo

  • Clone the repository locally

    git clone https://github.com/LouaiKB/-BWT-Huffman-coding
    
    cd ./-BWT-Huffman-coding/

2. Installation

  • Install all the dependencies from requirements.txt:
    pip install -r requirements.txt
  • If problems occur with the dependency installation and the packages are already available locally in /tmp/packages, try an offline install:
    pip install -r requirements.txt --no-index --find-links file:///tmp/packages

3. Start the app

cd scripts/

# run main.py
python main.py

Usage

1. Compression process

  • To compress a sequence, enter it manually in the text box and press the compression button. If you have a genome file, press the button directly to go through the compression step by step. Note: only fasta or txt files are accepted.
  • Once you click the button, a top-level window appears, et voilà!

NOTICE: The Huffman binary tree is presented in the Newick format.

  • Press the Next button to complete the process.

NOTICE: The compression process saves two files: the compressed file and an associated JSON file, which is required for the decompression process.
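
To make the Huffman step and the Newick output more concrete, here is a minimal sketch of Huffman coding in Python. It is not the project's own code: the function names (build_huffman_tree, code_table, to_newick), the tie-breaking rule, the choice of 0 for the left branch, and the layout of the JSON side information are all assumptions made for illustration.

```python
import heapq
import json
from collections import Counter


def build_huffman_tree(sequence):
    """Return a Huffman tree as nested tuples; leaves are single characters."""
    counts = Counter(sequence)
    # Heap entries are (weight, tie_breaker, tree). The unique tie_breaker
    # means two tree objects are never compared with each other.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (left, right)))
        next_id += 1
    return heap[0][2]


def code_table(tree, prefix=""):
    """Map each symbol to its bit string (assumed: left = 0, right = 1)."""
    if isinstance(tree, str):
        return {tree: prefix or "0"}  # a one-symbol alphabet still needs a bit
    left, right = tree
    table = code_table(left, prefix + "0")
    table.update(code_table(right, prefix + "1"))
    return table


def to_newick(tree):
    """Serialize the tree in Newick notation, without the trailing ';'."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


sequence = "AAAACCGT"
tree = build_huffman_tree(sequence)
codes = code_table(tree)
bits = "".join(codes[symbol] for symbol in sequence)

print(to_newick(tree) + ";")  # (A,(C,(G,T)));
print(bits)                   # 00001010110111

# Hypothetical side information that a decoder would need later.
side_info = json.dumps({"codes": codes, "length": len(bits)})
```

The code table (or the tree) has to be stored next to the bit string, because the bits alone cannot be decoded without it.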

2. Decompression process

  • For the decompression, the sequence CANNOT be entered manually, because the associated JSON file is needed.
  • Press decompress and choose the compressed sequence file and the JSON file associated with it. NOTICE: The compressed file and the JSON file have the same name.
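
The need for the JSON file can be illustrated with a short decoding sketch. It assumes the side information holds a symbol-to-code table under a "codes" key. This layout and the function name huffman_decode are hypothetical and may differ from what the application actually stores.

```python
import json


def huffman_decode(bits, codes):
    """Decode a string of '0'/'1' characters using a symbol -> code table."""
    inverse = {code: symbol for symbol, code in codes.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        # Huffman codes are prefix-free, so the first match is the symbol.
        if current in inverse:
            decoded.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(decoded)


# Hypothetical side information, as it might be read from the JSON file.
side_info = json.loads('{"codes": {"A": "0", "C": "10", "G": "110", "T": "111"}}')
print(huffman_decode("00001010110111", side_info["codes"]))  # AAAACCGT
```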

3. Burrows Wheeler encryption

  • The BWT button performs the Burrows Wheeler Transformation.
  • Enter the sequence manually or with a file.
  • You can choose whether to run the Transform step by step or all at once.



  • The Burrows Wheeler transform is presented in the last column of the Burrows Wheeler construction matrix.
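
The construction matrix can be sketched in a few lines of Python: append a terminator, list all rotations of the sequence, sort them, and read the last column. This is the textbook construction written for illustration, not the project's implementation. The names bwt_matrix and bwt are assumptions, and '$' is taken as a terminator that sorts before the nucleotide letters.

```python
def bwt_matrix(sequence, terminator="$"):
    """Return the sorted rotations of sequence + terminator."""
    if terminator in sequence:
        raise ValueError("the terminator must not occur in the sequence")
    text = sequence + terminator
    rotations = [text[i:] + text[:i] for i in range(len(text))]
    return sorted(rotations)


def bwt(sequence, terminator="$"):
    """Read the transform off the last column of the sorted matrix."""
    return "".join(row[-1] for row in bwt_matrix(sequence, terminator))


for row in bwt_matrix("ACAACG"):
    print(row)
print(bwt("ACAACG"))  # GC$AAAC
```

Sorting every rotation is simple to follow but quadratic in memory, so it only suits short sequences.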



4. Burrows Wheeler decryption

  • Choose BWT decryption to decrypt a sequence, that is, to recover the original sequence from its BWT.
  • The original sequence is presented in the row that ends with '$' in the Burrows Wheeler reconstruction matrix.
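
The reconstruction matrix can be sketched as follows: start from empty rows, repeatedly prepend the BWT string as a new first column and sort, and after as many rounds as there are characters pick the row that ends with '$'. This is the naive textbook inversion, shown for illustration only. The function name inverse_bwt is an assumption and the application may do it differently.

```python
def inverse_bwt(transformed, terminator="$"):
    """Rebuild the sorted rotation matrix and return the original sequence."""
    rows = [""] * len(transformed)
    for _ in range(len(transformed)):
        # Prepend the BWT column to the current rows, then sort again.
        rows = sorted(ch + row for ch, row in zip(transformed, rows))
    # The row ending with the terminator is the original text.
    original = next(row for row in rows if row.endswith(terminator))
    return original[:-1]


print(inverse_bwt("GC$AAAC"))  # ACAACG
```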



5. Full compression and decompression

  • This runs the whole compression and decompression process, from the Burrows Wheeler Transform to the Huffman coding.

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Project Link: https://github.com/LouaiKB/-BWT-Huffman-coding
