This repository has been archived by the owner on Jun 7, 2022. It is now read-only.

Commit

Merge pull request #162 from darrenldl/dev
Transitioning to calling the extended versions of SeqBox as Error-correcting SeqBox (EC-SeqBox for short)
darrenldl committed Apr 10, 2019
2 parents f07787c + 36db83b commit 4d2cc7e
Showing 8 changed files with 72 additions and 48 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,11 @@
# Changelog

+## 5.0.0
+
+- Error-correcting versions of SeqBox are now called Error-correcting SeqBox or EC-SeqBox for short, and use the file extension `.ecsbx`
+- This is done for easier differentiation between the extended versions and the original versions
+- Fundamentally this does not change how blkar functions, as blkar does not take file extensions into account for all modes interacting with SBX containers
+
## 4.0.0

- Changed "Uid" to "UID" in encode help messages for consistency
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default.

3 changes: 2 additions & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "blkar"
-version = "4.0.0"
+version = "5.0.0"
authors = ["Darren Ldl <darrenldldev@gmail.com>"]
edition = "2018"
build = "build.rs"
@@ -24,6 +24,7 @@ readme = "README.md"

keywords = [
"SeqBox",
+"EC-SeqBox",
"backup",
"data-recovery",
"reed-solomon",
12 changes: 6 additions & 6 deletions README.md
@@ -22,7 +22,7 @@ Blockyarchive/blkar was formerly known as rust-SeqBox/rsbx prior to renaming

The original SeqBox implementation and format do not support repairing of data, only sector level recoverability

-Blockyarchive allows repairs to be made by adding forward error correction (Reed-Solomon erasure code) to extended versions of SeqBox format, and also allows arranging the blocks in a burst error resistant pattern
+Blockyarchive allows repairs to be made by adding forward error correction (Reed-Solomon erasure code) to extended versions of SeqBox format (named Error-correcting SeqBox or EC-SeqBox for short), and also allows arranging the blocks in a burst error resistant pattern

Blockyarchive is also more robust compared to the original SeqBox implementation, as it does not assume the SBX container to be well formed, and makes as few assumptions about the SBX container as possible

@@ -32,14 +32,14 @@ blkar is overall based around [osbx](https://github.com/darrenldl/ocaml-SeqBox),

- Data recovery that does not depend on file system metadata (sector level recovery)
- This allows data recovery even when data is fragmented and out of order
-- Supports error correction (via Reed-Solomon erasure code)
-- Supports burst (sector) error resistance
+- Supports error correction (via Reed-Solomon erasure code) for EC-SeqBox
+- Supports burst (sector) error resistance for EC-SeqBox
- JSON mode
- Outputs information in JSON format instead of human readable text, allowing easy integration with scripts

## Limitations

-- Only a single file is supported for encoding as SeqBox is a single-file archive format
+- Only a single file is supported for encoding as SeqBox and EC-SeqBox are both single-file archive formats
- However, blkar may still be usable when you have multiple files, as blkar supports taking input from stdin during encoding, and also supports outputting to stdout during decoding
- This means if you have an archiver that supports bundling and unbundling on the fly with pipes, like tar, you can combine the use of the archiver and blkar into one encoding and decoding step

@@ -81,7 +81,7 @@ Feel free to join the [Gitter chat](https://gitter.im/blockyarchive/community) i

## Specifications

-[SBX format](SBX_FORMAT.md)
+[SBX format](SBX_FORMAT.md) (EC-SeqBox is also specified in this document)

[blkar specs](BLKAR_SPECS.md)

@@ -91,7 +91,7 @@ Contributions are welcome. Note that by submitting contributions, you agree to l

## Acknowledgement

-I would like to thank [Marco](https://github.com/MarcoPon) (the official SeqBox author) for discussing and clarifying aspects of his project, and also providing of test data during development of osbx. I would also like to thank him for his feedback on the numbering of the error correction enabled SBX versions (versions 17, 18, 19).
+I would like to thank [Marco](https://github.com/MarcoPon) (the official SeqBox author) for discussing and clarifying aspects of his project, and also providing of test data during development of osbx. I would also like to thank him for his feedback on the numbering of the error correction enabled ECSBX versions (versions 17, 18, 19).

I would like to thank [Ming](https://github.com/mdchia/) for his feedback on the documentation, UX design, and several other general aspects of the osbx project, of which most of the designs are carried over to blkar, and also his further feedback on this project as well

52 changes: 27 additions & 25 deletions SBX_FORMAT.md
@@ -1,10 +1,12 @@
## Technical Specification

-The following specification is copied directly from the official specification with extensions.
+The following specification for SBX is copied directly from the official specification with minor to no modifications
+
+ECSBX is the extended version of SBX with error-correcting capability

Byte order: Big Endian

-## For versions : 1, 2, 3
+## For SBX versions : 1, 2, 3

### Common blocks header:

@@ -73,21 +75,21 @@ Supported crypto hashes since 1.0.0 are

Metadata block (block 0) can be disabled

-## For versions : 17 (0x11), 18 (0x12), 19 (0x13)
+## For ECSBX versions : 17 (0x11), 18 (0x12), 19 (0x13)

-Overall similar to above specs.
+ECSBX specification is overall similar to the SBX specification above

Block categories : `Meta`, `Data`, `Parity`

-`Meta` and `Data` are mutually exclusive, and `Meta` and `Parity` are mutually exclusive. A block can be both `Data` and `Parity`.
+`Meta` and `Data` are mutually exclusive, and `Meta` and `Parity` are mutually exclusive. A block can be both `Data` and `Parity`

-Assumes configuration is **M** data shards and **N** parity shards.
+Assumes configuration is **M** data shards and **N** parity shards

### Note

-The following only describes the sequence number arrangement, not the actual block arrangement.
+The following only describes the sequence number arrangement, not the actual block arrangement

-See section "Block set interleaving scheme" below for details on actual block arrangement.
+See section "Block set interleaving scheme" below for details on actual block arrangement

### Common blocks header:

@@ -122,11 +124,11 @@ For **N** continuous blocks
| --- | -------- | ---- | ------ |
| 16 | blockend | var | parity |

-RS arrangement : M blocks (M data shards) N blocks (N parity shards).
+RS arrangement : M blocks (M data shards) N blocks (N parity shards)

-The M blocks are `Data` only.
+The M blocks are `Data` only

-The N blocks are both `Data` and `Parity`.
+The N blocks are both `Data` and `Parity`

### Last set of blocks

@@ -145,7 +147,7 @@ For **X** continuous blocks, where **X** is the remaining number of data blocks
| 16 | n | var | data |
| n+1 | blockend | var | padding (0x1a) |

-For **M - X** continuous blocks, where **M** is the specified data shards count.
+For **M - X** continuous blocks, where **M** is the specified data shards count

| pos | to pos | size | desc |
| --- | -------- | ---- | -------------- |
@@ -157,11 +159,11 @@ For **N** continuous blocks
| --- | -------- | ---- | ------ |
| 16 | blockend | var | parity |

-RS arrangement : M blocks (X data shards + (M - X) padding blocks) N blocks.
+RS arrangement : M blocks (X data shards + (M - X) padding blocks) N blocks

-The M blocks are `Data` only.
+The M blocks are `Data` only

-The N blocks are both `Data` and `Parity`.
+The N blocks are both `Data` and `Parity`
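
To make the last-set arrangement concrete, the following is a minimal Rust sketch, not blkar's actual code, that computes how many data, padding, and parity blocks the final block set holds. The function name `last_set_layout` and its signature are hypothetical; only the relationship between **M**, **N**, and **X** is taken from the specification text.

```rust
/// Block counts in the last block set, as (data, padding, parity).
///
/// `m` = configured data shards (M), `n` = parity shards (N),
/// `x` = remaining data blocks (X). Hypothetical helper, not blkar's API.
pub fn last_set_layout(m: usize, n: usize, x: usize) -> Option<(usize, usize, usize)> {
    // The spec describes X as the remaining data blocks within a set of M,
    // so this sketch rejects anything outside 1..=M.
    if x == 0 || x > m {
        return None;
    }
    // X real data blocks, then M - X padding blocks, then N parity blocks.
    Some((x, m - x, n))
}
```

For example, with 10 data shards, 2 parity shards, and 3 data blocks left over, the sketch yields 3 data blocks, 7 padding blocks, and 2 parity blocks, so the Reed-Solomon encoder still sees a full set of 10 + 2 blocks.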

### Versions:

@@ -197,25 +199,25 @@ Supported forward error correction algorithms since 1.0.0 are

- Reed-Solomon erasure code - probably the only one for versions 17, 18, 19

-Metadata and the parity blocks are mandatory in versions 17, 18, 19.
+Metadata and the parity blocks are mandatory in versions 17, 18, 19

### Block set interleaving scheme

-This block set interleaving is heavily inspired by [Thanassis Tsiodras's design of RockFAT](https://www.thanassis.space/RockFAT.html).
+This block set interleaving is heavily inspired by [Thanassis Tsiodras's design of RockFAT](https://www.thanassis.space/RockFAT.html)

-The major difference between the two schemes is that RockFAT's one is byte based interleaving, blkar's one is SBX block based interleaving.
+The major difference between the two schemes is that RockFAT's one is byte based interleaving, blkar's one is SBX block based interleaving

-The other difference is that blkar allows customizing level of resistance against burst sector errors.
+The other difference is that blkar allows customizing level of resistance against burst sector errors

-A burst error is defined as consecutive SBX block erasures.
+A burst error is defined as consecutive SBX block erasures

-Burst error resistance is defined as the maximum number of consecutive SBX block erasures tolerable for any instance of burst error.
+Burst error resistance is defined as the maximum number of consecutive SBX block erasures tolerable for any instance of burst error

-The maximum number of such errors tolerable is same as the parity shard count.
+The maximum number of such errors tolerable is same as the parity shard count

-Assuming arrangement of **M** data shards, **N** parity shards, **B** burst error resistance.
+Assuming arrangement of **M** data shards, **N** parity shards, **B** burst error resistance

-Then the SBX container can tolerate up to **N** burst errors in every set of **(M + N) * B** consecutive blocks, and each individual error may be up to **B** SBX blocks.
+Then the SBX container can tolerate up to **N** burst errors in every set of **(M + N) * B** consecutive blocks, and each individual error may be up to **B** SBX blocks
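
The tolerance statement above can be turned into a small check. The sketch below is an illustration only: `BurstParams`, `window_len`, and `tolerates` are hypothetical names that do not come from blkar, and the check assumes all listed burst errors fall inside a single window of **(M + N) * B** consecutive blocks.

```rust
/// Hypothetical container for the three parameters named in the spec.
#[derive(Debug, Clone, Copy)]
pub struct BurstParams {
    pub data_shards: usize,   // M
    pub parity_shards: usize, // N
    pub burst: usize,         // B
}

impl BurstParams {
    /// Size of the window the guarantee applies to: (M + N) * B blocks.
    pub fn window_len(&self) -> usize {
        (self.data_shards + self.parity_shards) * self.burst
    }

    /// `burst_lens` holds the length, in SBX blocks, of each burst error
    /// inside one window. Per the stated guarantee, at most N bursts are
    /// tolerable, and each burst may span at most B blocks.
    pub fn tolerates(&self, burst_lens: &[usize]) -> bool {
        burst_lens.len() <= self.parity_shards
            && burst_lens.iter().all(|&len| len <= self.burst)
    }
}
```

With the defaults listed in the `--sbx-version` help text (rs-data=10, rs-parity=2, burst=10), the window is 120 blocks, and two bursts of up to 10 blocks each fall within the stated guarantee while a third burst or an 11-block burst does not.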

#### Diagrams

@@ -254,4 +256,4 @@ Let **K > 1 + N** :

#### Limitations

-While an arbitrary number can be used for burst error resistance level during encoding, blkar will only guess up to 1000 when automatically guessing the burst error resistance level.
+While an arbitrary number can be used for burst error resistance level during encoding, blkar will only guess up to 1000 when automatically guessing the burst error resistance level
12 changes: 8 additions & 4 deletions src/cli_encode.rs
@@ -26,8 +26,9 @@ pub fn sub_command<'a, 'b>() -> App<'a, 'b> {
.help("File to encode. Supply - to use stdin as input. Use ./- for files named -."),
)
.arg(out_arg().help(
-"SBX container name (defaults to INFILE.sbx). If OUT is a directory, then the
-container is stored as OUT/INFILE.sbx (only the file part of INFILE is used).",
+"SBX container name (defaults to INFILE.sbx or INFILE.ecsbx). If OUT is a
+directory, then the container is stored as OUT/INFILE.sbx or
+OUT/INFILE.ecsbx (only the file part of INFILE is used).",
))
.arg(force_arg().help("Force overwrite even if OUT exists"))
.arg(
@@ -95,13 +96,16 @@ pub fn encode<'a>(matches: &ArgMatches<'a>) -> i32 {

let (version, data_par_burst) = get_ver_and_data_par_burst_w_defaults!(matches, json_printer);

+let out_extension = if ver_uses_rs(version) { "ecsbx" } else { "sbx" };
+
let in_file = get_in_file!(accept_stdin matches, json_printer);

let out = match matches.value_of("out") {
None => {
if file_utils::check_if_file_is_stdin(in_file) {
exit_with_msg!(usr json_printer => "Explicit output file name is required when input is stdin");
} else {
-format!("{}.sbx", in_file)
+format!("{}.{}", in_file, out_extension)
}
}
Some(x) => {
@@ -111,7 +115,7 @@ pub fn encode<'a>(matches: &ArgMatches<'a>) -> i32 {
}

let in_file = file_utils::get_file_name_part_of_path(in_file);
-misc_utils::make_path(&[x, &format!("{}.sbx", in_file)])
+misc_utils::make_path(&[x, &format!("{}.{}", in_file, out_extension)])
} else {
String::from(x)
}
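
The effect of the `out_extension` change in this file can be seen in isolation with the standalone sketch below. It is an approximation rather than the code from the commit: `uses_rs` is a stand-in for blkar's `ver_uses_rs` and assumes that versions 17, 18, and 19 are the Reed-Solomon-enabled ones, as the help text lists them, and `default_out_name` is a hypothetical helper covering only the case where no output name is given.

```rust
/// Stand-in for blkar's `ver_uses_rs` (assumption: 17, 18, 19 use RS).
fn uses_rs(version: u8) -> bool {
    matches!(version, 17 | 18 | 19)
}

/// Hypothetical helper: default container name when OUT is not supplied.
pub fn default_out_name(in_file: &str, version: u8) -> String {
    // EC-SeqBox versions default to `.ecsbx`, plain SeqBox to `.sbx`.
    let out_extension = if uses_rs(version) { "ecsbx" } else { "sbx" };
    format!("{}.{}", in_file, out_extension)
}
```

Under these assumptions, encoding `photo.png` with version 17 defaults to `photo.png.ecsbx`, while version 1 still defaults to `photo.png.sbx`.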
31 changes: 21 additions & 10 deletions src/cli_utils.rs
@@ -38,7 +38,7 @@ pub fn pr_verbosity_level_arg<'a, 'b>() -> Arg<'a, 'b> {
0 (show nothing)
1 (only show progress stats when done)
(default) 2 (show both progress bar and progress stats)
-This only affects progress text printing.",
+This only affects progress text printing",
)
}

@@ -227,15 +227,26 @@ pub fn sbx_version_arg<'a, 'b>() -> Arg<'a, 'b> {
.takes_value(true)
.help(
"SBX container version, one of :
-| SBX block size | Reed-Solomon | Burst error resistance |
-1 | 512 bytes | not enabled | not supported |
-2 | 128 bytes | not enabled | not supported |
-3 | 4096 bytes | not enabled | not supported |
-(default) 17 (0x11) | 512 bytes | enabled | supported |
-18 (0x12) | 128 bytes | enabled | supported |
-19 (0x13) | 4096 bytes | enabled | supported |
-Details of default option : sbx-version=17, rs-data=10, rs-parity=2, burst=10",
+| SBX block size | FEC enabled | Burst error resistance |
+1 | 512 bytes | no | not supported |
+2 | 128 bytes | no | not supported |
+3 | 4096 bytes | no | not supported |
+(default) 17 (0x11) | 512 bytes | yes | supported |
+18 (0x12) | 128 bytes | yes | supported |
+19 (0x13) | 4096 bytes | yes | supported |
+| File extension |
+1 | .sbx |
+2 | .sbx |
+3 | .sbx |
+(default) 17 (0x11) | .ecsbx |
+18 (0x12) | .ecsbx |
+19 (0x13) | .ecsbx |
+Details of default option : sbx-version=17, rs-data=10, rs-parity=2, burst=10
+Note : blkar will function correctly regardless of the file extension you pick,
+the ones listed above are just the defaults",
)
}
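
The version table in the help text above maps naturally onto a lookup function. The sketch below is hypothetical: the `VersionInfo` struct and `version_info` function are not part of blkar, and the values are transcribed from the help text in this diff rather than from blkar's internals.

```rust
/// Properties of one container version, as listed in the help text.
#[derive(Debug, PartialEq, Eq)]
pub struct VersionInfo {
    pub block_size: usize,       // SBX block size in bytes
    pub fec_enabled: bool,       // forward error correction (Reed-Solomon)
    pub extension: &'static str, // default file extension only
}

/// Hypothetical lookup; returns `None` for versions not in the table.
pub fn version_info(version: u8) -> Option<VersionInfo> {
    let (block_size, fec_enabled) = match version {
        1 => (512, false),
        2 => (128, false),
        3 => (4096, false),
        17 => (512, true),
        18 => (128, true),
        19 => (4096, true),
        _ => return None,
    };
    // In the table, burst error resistance and the `.ecsbx` default both
    // coincide with FEC being enabled.
    let extension = if fec_enabled { "ecsbx" } else { "sbx" };
    Some(VersionInfo { block_size, fec_enabled, extension })
}
```

As the help text notes, the extension is only a default; blkar is stated to work regardless of the extension chosen.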

2 changes: 1 addition & 1 deletion src/show_core.rs
@@ -173,7 +173,7 @@ pub fn show_file(param: &Param) -> Result<Stats, Error> {
return Err(Error::with_message(&format!(
"Error encountered when guessing : {}",
e
-)))
+)));
}
Ok(None) => {
print_if!(not_json => json_printer => "Failed to guess level";);
