scale-cpp

Introduction

SCALE (Simple Compression Algorithm that's Lossless and sometimes Efficient) is a file format based on Huffman coding. For details on the layout of a compressed file, see the documentation.

This repository also contains a SCALE codec, written in C++, that follows the specification.

Because SCALE uses a Huffman tree for compression, it carries the strengths and weaknesses of Huffman trees. In practice, SCALE compresses text files well, because they mostly use the 95 printable ASCII characters plus a few whitespace characters, so common characters get codes shorter than 8 bits. Binaries, on the other hand, tend to use all 256 possible byte values. When those values occur with similar frequencies, most codes end up about 8 bits long and some are 9 or even 10 bits long, so the encoded data saves little or nothing, and once the file header is added the compressed file is often larger than the original file.

Hence the "sometimes Efficient" in the name. The more unique characters your file uses, the less efficiently it will compress. (This is only a rule of thumb and doesn't always apply, since Huffman trees are also affected by the frequency of characters in a file.)
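To make the rule of thumb concrete, here is a minimal sketch that computes Huffman code lengths for the bytes of a string and the resulting payload size in bits. The functions `huffmanCodeLengths` and `encodedBits` are hypothetical helpers written for this README, not part of the SCALE codec, and the sketch ignores the header that a real SCALE file also stores.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from scale-cpp): Huffman code length in bits for
// each of the 256 byte values; 0 means the byte does not occur in `data`.
std::array<int, 256> huffmanCodeLengths(const std::string& data) {
    std::array<std::uint64_t, 256> freq{};
    for (unsigned char c : data) ++freq[c];

    std::vector<int> parent;      // parent[i] = parent node index, -1 for the root
    std::array<int, 256> leaf;    // leaf[b] = node index of byte b, -1 if unused
    leaf.fill(-1);
    using Item = std::pair<std::uint64_t, int>;  // (weight, node index)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    for (int b = 0; b < 256; ++b) {
        if (freq[b] == 0) continue;
        leaf[b] = static_cast<int>(parent.size());
        parent.push_back(-1);
        pq.push({freq[b], leaf[b]});
    }
    // Repeatedly merge the two lightest nodes until one root remains.
    while (pq.size() > 1) {
        const Item a = pq.top(); pq.pop();
        const Item b = pq.top(); pq.pop();
        const int merged = static_cast<int>(parent.size());
        parent.push_back(-1);
        parent[a.second] = merged;
        parent[b.second] = merged;
        pq.push({a.first + b.first, merged});
    }
    // A leaf's code length is its depth in the tree.
    std::array<int, 256> lengths{};
    for (int b = 0; b < 256; ++b) {
        if (leaf[b] < 0) continue;
        int depth = 0;
        for (int n = leaf[b]; parent[n] != -1; n = parent[n]) ++depth;
        lengths[b] = depth > 0 ? depth : 1;  // a lone symbol still needs 1 bit
    }
    return lengths;
}

// Payload size in bits when `data` is encoded with the given code lengths.
std::uint64_t encodedBits(const std::string& data,
                          const std::array<int, 256>& lengths) {
    std::uint64_t bits = 0;
    for (unsigned char c : data) {
        assert(lengths[c] > 0);  // every byte in data must have a code
        bits += static_cast<std::uint64_t>(lengths[c]);
    }
    return bits;
}
```

For the string "aaaabbc" this gives code lengths of 1, 2 and 2 bits for a, b and c, so the payload is 10 bits instead of 56. For a string containing each of the 256 byte values exactly once, every code is 8 bits long and the payload is exactly as large as the input, before any header is added.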

Compiling SCALE

If your compiler supports C++17 or later, you should be fine. With an older standard, you will need to edit IStreamWrapper::size() and OStreamWrapper::size() in util.cpp to use std::ifstream::tellg() or std::ofstream::tellp() rather than std::filesystem::file_size().
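As a rough sketch of the pre-C++17 fallback, the function below measures a file by opening it positioned at the end and reading the stream position. `fileSizeBySeeking` is a hypothetical name chosen for illustration; the actual signatures of the wrapper classes in util.cpp are not shown here, so adapt it to whatever those methods expect.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper (not from scale-cpp): size of a file in bytes without
// std::filesystem. Returns -1 if the file cannot be opened.
std::int64_t fileSizeBySeeking(const std::string& path) {
    // std::ios::ate positions the stream at the end of the file on open,
    // so tellg() reports the file's length in bytes.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return -1;
    const std::streamoff size = in.tellg();
    return static_cast<std::int64_t>(size);
}
```

For an output stream the equivalent call is tellp(), which reports the current write position. That equals the file size only while the stream is being written sequentially at its end.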

Dependencies

If your compiler ships with the C++ standard library, you should be fine. SCALE uses the standard headers <filesystem>, <algorithm>, <iostream>, <string>, <fstream>, <queue>, <vector>, and <unordered_map>.

You will also need <cstdlib> to generate a 1 GB binary file for testing purposes via gigatest.cpp or the make gigatest command.

To compile SCALE, enter make in the repository's root directory.

Running SCALE

The syntax for using SCALE is as follows:

./scale <-c|-d> <input_file> <output_file>

The -c flag compresses input_file and writes the result to output_file; the -d flag decompresses input_file and writes the result to output_file.

Theoretical and practical limits

Because SCALE stores data such as the file size as an unsigned 64-bit integer, it is theoretically capable of operating on files of up to 16 EiB in size. In practice, given that the program runs in a single thread and given the hardware limitations of today, it should probably not be used for files larger than 50 MiB. (I tried compressing a 1 GiB file on my laptop with this program, and it took nearly an hour.) This program was made merely for practice with C++ and implementing Huffman coding. If you actually want efficient, fast compression, use 7-Zip!
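The 16 EiB figure follows directly from the width of the size field. The short sketch below checks the arithmetic at compile time; the constant names are made up for this example and do not appear in the codec.

```cpp
#include <cstdint>
#include <limits>

// Illustrative constants (not from scale-cpp).
constexpr std::uint64_t kBytesPerEiB = std::uint64_t{1} << 60;  // 2^60 bytes
constexpr std::uint64_t kMaxFileSize =
    std::numeric_limits<std::uint64_t>::max();                  // 2^64 - 1 bytes

// The largest storable size is one byte short of 16 EiB:
// 15 whole EiB plus (2^60 - 1) further bytes.
constexpr std::uint64_t kWholeEiB = kMaxFileSize / kBytesPerEiB;
constexpr std::uint64_t kRemainderBytes = kMaxFileSize % kBytesPerEiB;
static_assert(kWholeEiB == 15, "2^64 - 1 bytes spans 15 whole EiB");
static_assert(kRemainderBytes == kBytesPerEiB - 1,
              "and falls exactly one byte short of the 16th");
```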

Akhil841/scale-cpp

Folders and files

Latest commit

History

Repository files navigation

scale-cpp

Introduction

Compiling SCALE

Dependencies

Running SCALE

Theoretical and practical limits

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages