This repository has been archived by the owner on Jun 7, 2022. It is now read-only.

Commit

Merge pull request #167 from darrenldl/dev
README, SBX_FORMAT, BLKAR_SPECS polish
darrenldl committed Apr 12, 2019
2 parents df829e0 + 0348abb commit 2b439fd
Showing 3 changed files with 45 additions and 43 deletions.
6 changes: 3 additions & 3 deletions BLKAR_SPECS.md
@@ -19,7 +19,7 @@ blkar returns

## Output

The cli argument parsing library (clap) outputs errors to stderr
The cli argument parsing library (clap) outputs errors to stderr.

If no errors are discovered by the cli argument parsing library, then

@@ -104,7 +104,7 @@ Metadata block is valid if

## Calc workflow

Calc mode only operates at UI/UX level and does not handle any file data, thus it is not documented here
Calc mode only operates at UI/UX level and does not handle any file data, thus it is not documented here.

## Check workflow

@@ -130,7 +130,7 @@ Data block is valid if and only if

### If output to file

1. A reference block is retrieved first and is used for guidance on alignment, version, and uid (see **Finding reference block** procedure specified above)#
1. A reference block is retrieved first and is used for guidance on alignment, version, and uid (see **Finding reference block** procedure specified above)
2. Scan for valid blocks from start of SBX container to decode and output using reference block's block size as alignment
- if a block is invalid, nothing is done
- if a block is valid, and is a metadata block, nothing is done
36 changes: 19 additions & 17 deletions README.md
@@ -10,30 +10,32 @@

[Documentation](https://github.com/darrenldl/blockyarchive/wiki)

Blockyarchive/blkar (pronounced "bloc-kar") is a comprehensive utility for creating, rescuing, and general handling of SeqBox archives, with optional forward error correction
Blockyarchive/blkar (pronounced "bloc-kar") is a comprehensive utility for creating, rescuing, and general handling of SeqBox archives, with optional forward error correction via Error-correcting SeqBox.

SeqBox is a single-file archive format designed by [Marco Pontello](https://github.com/MarcoPon) that facilitates sector level data recovery for when file system metadata is corrupted/missing, while the archive itself still exists as a normal file on file system
SeqBox is a single-file archive format designed by [Marco Pontello](https://github.com/MarcoPon) that facilitates sector level data recovery for when file system metadata is corrupted/missing, while the archive itself still exists as a normal file on file system. Please visit the official [SeqBox](https://github.com/MarcoPon/SeqBox) repo for the original implementation and technical details on this.

Please visit the official [SeqBox](https://github.com/MarcoPon/SeqBox) repo for the original implementation and technical details on this
Error-correcting SeqBox (or EC-SeqBox for short) is an extended version of SeqBox developed for this project, introducing forward error correction via Reed-Solomon erasure code.

Blockyarchive/blkar was formerly known as rust-SeqBox/rsbx prior to renaming
Blockyarchive/blkar was formerly known as rust-SeqBox/rsbx prior to renaming.

## Comparison to the original SeqBox implementation/design

The original SeqBox implementation and format do not support repairing of data, only sector level recoverability
The original SeqBox implementation and format do not support repairing of data, only sector level recoverability.

Blockyarchive allows repairs to be made by adding forward error correction (Reed-Solomon erasure code) to extended versions of SeqBox format (named Error-correcting SeqBox or EC-SeqBox for short), and also allows arranging the blocks in a burst error resistant pattern
Blockyarchive supports both SeqBox and EC-SeqBox, while the original implementation only supports the SeqBox specification.

Blockyarchive is also more robust compared to the original SeqBox implementation, as it does not assume the SBX container to be well formed, and makes as few assumptions about the SBX container as possible
Blockyarchive is also more robust compared to the original SeqBox implementation, as it does not assume the SBX container to be well formed, and makes as few assumptions about the SBX container as possible.

blkar is overall based around [osbx](https://github.com/darrenldl/ocaml-SeqBox), but much more optimized
blkar is overall based around [osbx](https://github.com/darrenldl/ocaml-SeqBox), but much more optimized.

## Features overall

- Data recovery that does not depend on file system metadata (sector level recovery)
- This allows data recovery even when data is fragmented and out of order
- Supports error correction (via Reed-Solomon erasure code) for EC-SeqBox
- Supports burst (sector) error resistance for EC-SeqBox
- This is done via an interleaving block arrangement scheme. It is mainly to address the data repair limitation of the simple archive design
- More complex archive designs such as PAR2 can repair burst errors without any extra arrangement scheme, but they are also vastly more complex than EC-SeqBox
- JSON mode
- Outputs information in JSON format instead of human readable text, allowing easy integration with scripts

@@ -45,13 +47,13 @@ blkar is overall based around [osbx](https://github.com/darrenldl/ocaml-SeqBox),

## Goals

As blkar is to be used largely as a backup utility, security/robustness of the code will be prioritised over apparent performance
As blkar is to be used largely as a backup utility, security/robustness of the code will be prioritised over apparent performance.

## Status

This project has reached its intended feature completeness, so no active development for new features will occur. However, this project is still actively looked after, i.e. I will respond to PRs, issues, and emails, will consider feature requests, respond to bug reports quickly, and so on.

In other words, this is a completed project with respect to its original scope, but it is not abandoned
In other words, this is a completed project with respect to its original scope, but it is not abandoned.

## Getting started

@@ -65,11 +67,11 @@ cargo install blkar

#### Usage guides & screencasts & other resources

The [wiki](https://github.com/darrenldl/blockyarchive/wiki) contains comprehensive guides and resources
The [wiki](https://github.com/darrenldl/blockyarchive/wiki) contains comprehensive guides and resources.

## Note on Rust to Bash ratio

Just to avoid confusion, blkar is written purely in Rust, Bash is only used to write tests
Just to avoid confusion, blkar is written purely in Rust, Bash is only used to write tests.

## Got a question?

@@ -93,28 +95,28 @@ Contributions are welcome. Note that by submitting contributions, you agree to l

I would like to thank [Marco](https://github.com/MarcoPon) (the official SeqBox author) for discussing and clarifying aspects of his project, and also providing of test data during development of osbx. I would also like to thank him for his feedback on the numbering of the error correction enabled ECSBX versions (versions 17, 18, 19).

I would like to thank [Ming](https://github.com/mdchia/) for his feedback on the documentation, UX design, and several other general aspects of the osbx project, of which most of the designs are carried over to blkar, and also his further feedback on this project as well
I would like to thank [Ming](https://github.com/mdchia/) for his feedback on the documentation, UX design, and several other general aspects of the osbx project, of which most of the designs are carried over to blkar, and also his further feedback on this project as well.

The design of the readable rate in progress report text is copied from [Arch Linux pacman](https://wiki.archlinux.org/index.php/Pacman)'s progress bar design
The design of the readable rate in progress report text is copied from [Arch Linux pacman](https://wiki.archlinux.org/index.php/Pacman)'s progress bar design.

The design of block set interleaving arrangement in RS enabled versions is heavily inspired by [Thanassis Tsiodras's design of RockFAT](https://www.thanassis.space/RockFAT.html). The interleaving provides resistance against burst sector errors.

## Donation

**Note** : Donation will **NOT** fuel development of new features. As mentioned above, this project is meant to be stable, well tested and well maintained, but normally I am not actively adding new features to it.

If blockyarchive has been useful to you, and you would like to donate to me for the development effort, you can donate through [here](http://ko-fi.com/darrenldl)
If blockyarchive has been useful to you, and you would like to donate to me for the development effort, you can donate through [here](http://ko-fi.com/darrenldl).

## License

#### Libcrc code

The crcccitt code is translated from the C implementation in [libcrc](https://github.com/lammertb/libcrc) and are under the same MIT License as used by libcrc and as stated in libcrc source code, the license text of the crcccitt.c is copied over to `crc-ccitt/build.rs`, `crc-ccitt/src/lib.rs`, `build.rs` and `src/crc_ccitt.rs` as well
The crcccitt code is translated from the C implementation in [libcrc](https://github.com/lammertb/libcrc) and are under the same MIT License as used by libcrc and as stated in libcrc source code, the license text of the crcccitt.c is copied over to `crc-ccitt/build.rs`, `crc-ccitt/src/lib.rs`, `build.rs` and `src/crc_ccitt.rs` as well.

#### Official SeqBox code

The following files in tests folder copied from official SeqBox are under its license, which is MIT as of time of writing

- tests/SeqBox/*

All remaining files are distributed under the MIT license as stated in the LICENSE file
All remaining files are distributed under the MIT license as stated in the LICENSE file.
46 changes: 23 additions & 23 deletions SBX_FORMAT.md
@@ -1,8 +1,8 @@
## Technical Specification

The following specification for SBX is copied directly from the official specification with minor to no modifications
The following specification for SBX is copied directly from the official specification with minor to no modifications.

ECSBX is the extended version of SBX with error-correcting capability
ECSBX is the extended version of SBX with error-correcting capability.

Byte order: Big Endian
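
As a small illustration of the byte order stated above, the sketch below writes and reads a multi-byte integer in big-endian form using Rust's standard library. The function names and the 4-byte field width are assumptions chosen for illustration; they are not taken from the specification tables or from blkar's source.

```rust
/// Encode a 32-bit value as 4 big-endian bytes (most significant byte first).
/// The 4-byte width is an illustrative assumption, not a field from the spec.
fn encode_be_u32(value: u32) -> [u8; 4] {
    value.to_be_bytes()
}

/// Decode 4 big-endian bytes back into a 32-bit value.
fn decode_be_u32(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}
```

For instance, the value `1` is stored as the bytes `00 00 00 01`, with the least significant byte last.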

@@ -73,23 +73,23 @@ Supported crypto hashes since 1.0.0 are
- SHA512
- BLAKE2B\_512

Metadata block (block 0) can be disabled
Metadata block (block 0) can be disabled.

## For ECSBX versions : 17 (0x11), 18 (0x12), 19 (0x13)

ECSBX specification is overall similar to the SBX specification above
ECSBX specification is overall similar to the SBX specification above.

Block categories : `Meta`, `Data`, `Parity`

`Meta` and `Data` are mutually exclusive, and `Meta` and `Parity` are mutually exclusive. A block can be both `Data` and `Parity`
`Meta` and `Data` are mutually exclusive, and `Meta` and `Parity` are mutually exclusive. A block can be both `Data` and `Parity`.

Assumes configuration is **M** data shards and **N** parity shards
Assumes configuration is **M** data shards and **N** parity shards.

### Note

The following only describes the sequence number arrangement, not the actual block arrangement
The following only describes the sequence number arrangement, not the actual block arrangement.

See section "Block set interleaving scheme" below for details on actual block arrangement
See section "Block set interleaving scheme" below for details on actual block arrangement.

### Common blocks header:

@@ -126,9 +126,9 @@ For **N** continuous blocks

RS arrangement : M blocks (M data shards) N blocks (N parity shards)

The M blocks are `Data` only
The M blocks are `Data` only.

The N blocks are both `Data` and `Parity`
The N blocks are both `Data` and `Parity`.

### Last set of blocks

@@ -159,11 +159,11 @@ For **N** continuous blocks
| --- | -------- | ---- | ------ |
| 16 | blockend | var | parity |

RS arrangement : M blocks (X data shards + (M - X) padding blocks) N blocks
RS arrangement : M blocks (X data shards + (M - X) padding blocks) N blocks.

The M blocks are `Data` only
The M blocks are `Data` only.

The N blocks are both `Data` and `Parity`
The N blocks are both `Data` and `Parity`.
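
The padding count for the last set, (M - X), can be sketched as below. This is a minimal illustration that assumes the total number of data blocks is known up front; `last_set_padding` is a hypothetical helper, not a function from blkar.

```rust
/// Number of padding blocks in the last block set, given the total number of
/// data blocks and M data shards per set.
/// Hypothetical helper for illustration only; not part of blkar.
fn last_set_padding(total_data_blocks: u64, data_shards: u64) -> u64 {
    assert!(data_shards > 0, "M must be at least 1");
    // X = number of real data blocks that fall into the last set.
    let x = total_data_blocks % data_shards;
    if x == 0 {
        // The last set is completely filled, so no padding is needed.
        0
    } else {
        // (M - X) padding blocks complete the set before parity is computed.
        data_shards - x
    }
}
```

With M = 10 and 25 data blocks in total, the last set holds X = 5 data shards and needs 5 padding blocks.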

### Versions:

@@ -199,25 +199,25 @@ Supported forward error correction algorithms since 1.0.0 are

- Reed-Solomon erasure code - probably the only one for versions 17, 18, 19

Metadata and the parity blocks are mandatory in versions 17, 18, 19
Metadata and the parity blocks are mandatory in versions 17, 18, 19.

### Block set interleaving scheme

This block set interleaving is heavily inspired by [Thanassis Tsiodras's design of RockFAT](https://www.thanassis.space/RockFAT.html)
This block set interleaving is heavily inspired by [Thanassis Tsiodras's design of RockFAT](https://www.thanassis.space/RockFAT.html).

The major difference between the two schemes is that RockFAT's one is byte based interleaving, blkar's one is SBX block based interleaving
The major difference between the two schemes is that RockFAT's one is byte based interleaving, blkar's one is SBX block based interleaving.

The other difference is that blkar allows customizing level of resistance against burst sector errors
The other difference is that blkar allows customizing level of resistance against burst sector errors.

A burst error is defined as consecutive SBX block erasures
A burst error is defined as consecutive SBX block erasures.

Burst error resistance is defined as the maximum number of consecutive SBX block erasures tolerable for any instance of burst error
Burst error resistance is defined as the maximum number of consecutive SBX block erasures tolerable for any instance of burst error.

The maximum number of such errors tolerable is same as the parity shard count
The maximum number of such errors tolerable is same as the parity shard count.

Assuming arrangement of **M** data shards, **N** parity shards, **B** burst error resistance
Assuming arrangement of **M** data shards, **N** parity shards, **B** burst error resistance.

Then the SBX container can tolerate up to **N** burst errors in every set of **(M + N) * B** consecutive blocks, and each individual error may be up to **B** SBX blocks
Then the SBX container can tolerate up to **N** burst errors in every set of **(M + N) * B** consecutive blocks, and each individual error may be up to **B** SBX blocks.
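
The bound above can be worked through numerically. The sketch below only restates the arithmetic from the paragraph; the function names are hypothetical and not taken from blkar.

```rust
/// Size, in SBX blocks, of the window over which the tolerance is stated:
/// (M + N) * B.
fn window_blocks(data_shards: u64, parity_shards: u64, burst: u64) -> u64 {
    (data_shards + parity_shards) * burst
}

/// Upper bound on erased blocks within one window under the stated guarantee:
/// at most N burst errors, each at most B blocks long.
fn max_burst_erasures(parity_shards: u64, burst: u64) -> u64 {
    parity_shards * burst
}
```

For example, with M = 10, N = 2 and B = 4, each window spans 48 consecutive blocks and can tolerate up to 2 burst errors of up to 4 blocks each, i.e. at most 8 erased blocks when the erasures fall as bursts within the stated limits.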

#### Diagrams

@@ -256,4 +256,4 @@ Let **K > 1 + N** :

#### Limitations

While an arbitrary number can be used for burst error resistance level during encoding, blkar will only guess up to 1000 when automatically guessing the burst error resistance level
While an arbitrary number can be used for burst error resistance level during encoding, blkar will only guess up to 1000 when automatically guessing the burst error resistance level.
