Program designed for DNA storage using Yin-Yang transcoding algorithm

ntpz870817/DNA-storage-YYC

YYC


YYC is a DNA storage codec algorithm developed by BGI-research. Briefly, it transcodes two binary sequences into one DNA sequence. The algorithm helps to achieve high-density, high-feasibility DNA storage based on DNA synthesis.

Before using this repository, please make sure you have a basic understanding of DNA-based data storage. An inappropriate communication case can be seen here for reference. If you have any questions, please contact Dr. Zhi Ping, Mr. Haoling Zhang, or Dr. Yue Shen.

Environment Configuration

The kit is developed in Python 3.5.

It currently uses the following packages, all of which are part of the Python standard library:

  • sys
  • os
  • random
  • math
  • struct
  • datetime
  • pickle

Kit Tree Diagram

├── examples                          // Test module
│    ├── files                        // Test files
│    │    ├── Mona Lisa.jpg           // Sample image
│    │    ├── United Nations Flag.bmp // Sample image
│    ├── output                       // Files generated during processing
│    ├── test_mona_lisa.py            // Run YYC using Mona Lisa.jpg
│    ├── test_united_nations_flag.py  // Run YYC using United Nations Flag.bmp
├── yyc
│    ├── utils                        // Utility module
│    │    ├── data_handle.py          // Conversion between DNA motifs and binary documents
│    │    ├── index_operator.py       // Processing the relationship between index and data
│    │    ├── log.py                  // Print logs to the console
│    │    ├── model_saver.py          // Save model to file and load model from file
│    │    ├── monitor.py              // Report progress and estimated time left
│    │    ├── validity.py             // Check whether a DNA sequence is easy to sequence and synthesize
│    ├── pipeline.py                  // Main calling function
│    ├── scheme.py                    // YYC (Yin-Yang DNA Storage Code)
├── README.md                         // Description document of kit

Introduction of Yin-Yang Code

Yin-Yang Code is an algorithm that describes the collection of derivative rules reported by Ping et al.
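To make the idea of merging two binary sequences into one DNA sequence concrete, here is a minimal sketch of a single rule. It is not the package's implementation. It assumes the base index order A, C, G, T, and it assumes that `base_reference` acts as the "Yang" rule (the upper bit selects two candidate bases), that `current_code_matrix` acts as the "Yin" rule (the previous base and the lower bit select one of those candidates), and that `support_bases` is a virtual base placed before the first position. The rule values are copied from the encoding example in this README. The function names `yyc_encode` and `yyc_decode` are hypothetical.

```python
# Index order A, C, G, T is an assumption of this sketch.
BASES = "ACGT"

# Rule values copied from the encoding example in this README.
BASE_REFERENCE = [0, 1, 0, 1]
CODE_MATRIX = [[1, 1, 0, 0], [1, 0, 0, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
SUPPORT_BASE = "A"  # assumed to stand in for the base before position 0


def yyc_encode(upper_bits, lower_bits):
    """Merge two equal-length bit lists into one DNA string (hypothetical helper)."""
    previous = SUPPORT_BASE
    dna = []
    for upper, lower in zip(upper_bits, lower_bits):
        row = CODE_MATRIX[BASES.index(previous)]
        # Yang rule: the upper bit narrows the four bases down to two candidates.
        candidates = [i for i in range(4) if BASE_REFERENCE[i] == upper]
        # Yin rule: the previous base and the lower bit pick one candidate.
        chosen = next(i for i in candidates if row[i] == lower)
        previous = BASES[chosen]
        dna.append(previous)
    return "".join(dna)


def yyc_decode(dna):
    """Recover the two bit lists from a DNA string (hypothetical helper)."""
    previous = SUPPORT_BASE
    upper_bits, lower_bits = [], []
    for base in dna:
        index = BASES.index(base)
        upper_bits.append(BASE_REFERENCE[index])
        lower_bits.append(CODE_MATRIX[BASES.index(previous)][index])
        previous = base
    return upper_bits, lower_bits


dna = yyc_encode([0, 1, 1, 0], [1, 0, 1, 1])
print(dna)              # ATCA
print(yyc_decode(dna))  # ([0, 1, 1, 0], [1, 0, 1, 1])
```

Each base carries one bit from each input sequence, which is why two binary sequences of length n fit into n nucleotides under this reading of the rule.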

Install this package with 'pip install yyc'. After installation, run the sample programs in the examples folder to make sure the package works correctly.

We strongly suggest using a Python IDE (such as PyCharm) to complete your transcoding task; please see the examples here. The command line is relatively long and hard to customize, for example:

In the encoding process

python
>>> from yyc import pipeline
>>> from yyc import scheme
>>> pipeline.encode(
...     method=scheme.YYC(
...         support_bases="A",
...         base_reference=[0, 1, 0, 1],
...         current_code_matrix=[[1, 1, 0, 0], [1, 0, 0, 1], [1, 1, 0, 0], [1, 1, 0, 0]],
...         search_count=100, max_homopolymer=4, max_content=0.6
...     ),
...     input_path="./files/Mona Lisa.jpg",
...     output_path="./output/mona_lisa.dna",
...     model_path="./output/yyc.pkl",
...     need_index=True, need_log=True
... )
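The `max_homopolymer=4` and `max_content=0.6` arguments constrain which DNA sequences are accepted. The sketch below shows one plausible reading of such a screen and is not the code in `validity.py`. It assumes that `max_homopolymer` is the longest allowed run of a single base and that `max_content` bounds the GC fraction symmetrically, between `1 - max_content` and `max_content`. The helper names are hypothetical.

```python
def longest_run(sequence):
    """Length of the longest run of one repeated base."""
    longest = current = 0
    previous = None
    for base in sequence:
        current = current + 1 if base == previous else 1
        longest = max(longest, current)
        previous = base
    return longest


def gc_fraction(sequence):
    """Fraction of bases that are G or C."""
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def looks_synthesizable(sequence, max_homopolymer=4, max_content=0.6):
    """Screen a sequence; both thresholds are interpreted as assumed in the lead-in."""
    if not sequence:
        return False
    if longest_run(sequence) > max_homopolymer:
        return False
    # Assumption: the GC fraction must fall within [1 - max_content, max_content].
    return (1 - max_content) <= gc_fraction(sequence) <= max_content


print(looks_synthesizable("AAAAGCGC"))    # True: longest run is 4, GC fraction is 0.5
print(looks_synthesizable("AAAAAGCGCC"))  # False: a run of 5 exceeds max_homopolymer
print(looks_synthesizable("GCGCGCGC"))    # False: GC fraction is 1.0
```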

In the decoding process

python
>>> from yyc import pipeline
>>> pipeline.decode(model_path="./output/yyc.pkl", input_path="./output/mona_lisa.dna", output_path="./output/output_mona_lisa.jpg", has_index=True, need_log=True)
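After decoding, one simple way to confirm that the round trip was lossless is to compare the decoded file with the original. The sketch below uses only the standard library. The helper names `sha256_of` and `same_content` are hypothetical and not part of the yyc package. The paths in the final comment are the ones from the examples above.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files contain exactly the same bytes."""
    return sha256_of(path_a) == sha256_of(path_b)


# Example call with the paths used above (requires those files to exist):
# same_content("./files/Mona Lisa.jpg", "./output/output_mona_lisa.jpg")
```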

Citing

If this repository helps or is used in your research, please consider citing the following paper.

Towards practical and robust DNA-based data archiving using the yin–yang codec system

@article{ping2022towards,
  title={Towards practical and robust DNA-based data archiving using the yin--yang codec system},
  author={Ping, Zhi and Chen, Shihong and Zhou, Guangyu and Huang, Xiaoluo and Zhu, Sha Joe and Zhang, Haoling and Lee, Henry H and Lan, Zhaojun and Cui, Jie and Chen, Tai and others},
  journal={Nature Computational Science},
  pages={234--242},
  year={2022},
  volume={2},
  number={4},
  publisher={Nature Publishing Group}
}

Thank you!
