feat: Iroh console (REPL) and restructured CLI #1356

Merged
merged 193 commits into from
Aug 24, 2023

Conversation

Frando
Member

@Frando Frando commented Aug 15, 2023

Description

This adds a REPL to the main iroh binary. It builds upon #1216.

  • The REPL embeds all commands that operate via the RPC client - which, currently, is everything but provide (which should be called start), get and doctor.

  • The REPL has two new, REPL-only commands: set-doc and set-author. set-doc changes the state of the REPL: author and document will be displayed above the input line, and the document commands (set, get, list, share, etc.) will be available top-level.

  • Most of the changes in src/commands.rs only move code and the rpc_port option around, the individual commands are not changed in this PR.
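
As a rough illustration of the REPL state described above, here is a minimal sketch using only the standard library. The `ReplState` type, its fields, and its method names are assumptions made for this example and are not taken from the PR's code; the real console builds on rustyline and clap.

```rust
/// Hypothetical REPL state; names are illustrative, not the PR's actual types.
#[derive(Debug, Default, PartialEq)]
struct ReplState {
    author: Option<String>,
    doc: Option<String>,
}

impl ReplState {
    /// Text that would be displayed above the input line.
    fn status_line(&self) -> String {
        let author = self.author.as_deref().unwrap_or("none");
        let doc = self.doc.as_deref().unwrap_or("none");
        format!("author: {author} doc: {doc}")
    }

    /// Handles the two REPL-only commands. Returns false for any other
    /// input so the caller can forward the line to the regular commands.
    fn handle_repl_only(&mut self, line: &str) -> bool {
        let mut parts = line.split_whitespace();
        match (parts.next(), parts.next()) {
            (Some("set-doc"), Some(id)) => {
                self.doc = Some(id.to_string());
                true
            }
            (Some("set-author"), Some(id)) => {
                self.author = Some(id.to_string());
                true
            }
            _ => false,
        }
    }

    /// Document commands become available top-level once a document is set.
    fn doc_commands_available(&self) -> bool {
        self.doc.is_some()
    }
}
```

In this sketch, feeding `set-author alice` and then `set-doc mydoc` to `handle_repl_only` updates the state, and `status_line()` reflects both values; any other line is left for the embedded RPC commands.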

Notes & open questions

  • The document and author IDs in the list commands are currently printed in hex. Should change to base32.

  • I'm not yet super sure about the set-doc and set-author commands. Another path might be to dig more into a pwd-like structure and have a cd command or so. This could then also move further into documents. I'm not sure how the author fits in here.

  • When in the level of a document, there's a conflict between the doc list command (which is just list then) and the top-level list command (to list blobs and collections). Not sure yet what the solution is. For now I embedded only the sync commands on the doc level, but I think I'd prefer to have the global set of commands not change between levels. Maybe we just rename the top-level list command to blobs and group the other blob-related commands (add, a tbd get via RPC, possibly export) there.

  • The REPL embeds all the existing RPC CLI commands. For this I changed the structure of src/cli/commands.rs to split between RpcCommands and FullCommands; the latter are the ones that start an actual iroh node (plus doctor, wasn't sure about that for now). The former all work atop the RPC client. So that they do not create a new RPC client for each REPL command, I moved the --rpc-port option to the top level. This is not quite correct, because it does not apply to provide, get, doctor. Clap does not allow scoping an argument to a set of subcommands by default, see this discussion. Still thinking about what the cleanest solution is.
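
The split between RPC-backed commands and node-starting commands from the last note can be sketched roughly as follows. This is a hand-rolled, standard-library-only illustration: the variant names, the `parse` helper, and the argument shapes are assumptions for the example, and the PR itself uses clap derive types rather than manual matching.

```rust
/// Commands that talk to an already running node through the RPC client.
/// Variant names are illustrative, not the PR's actual definitions.
#[derive(Debug, PartialEq)]
enum RpcCommand {
    List,
    Add { path: String },
}

/// Commands that start a node themselves (plus `doctor`).
#[derive(Debug, PartialEq)]
enum FullCommand {
    Provide,
    Get { hash: String },
    Doctor,
}

#[derive(Debug, PartialEq)]
enum Command {
    Rpc(RpcCommand),
    Full(FullCommand),
}

/// Parses CLI arguments into either group.
fn parse(args: &[&str]) -> Option<Command> {
    match args {
        ["list"] => Some(Command::Rpc(RpcCommand::List)),
        ["add", path] => Some(Command::Rpc(RpcCommand::Add { path: path.to_string() })),
        ["provide"] => Some(Command::Full(FullCommand::Provide)),
        ["get", hash] => Some(Command::Full(FullCommand::Get { hash: hash.to_string() })),
        ["doctor"] => Some(Command::Full(FullCommand::Doctor)),
        _ => None,
    }
}

/// The REPL accepts only the RPC subset, so one RPC client can be created
/// up front and reused for every line.
fn parse_repl_line(line: &str) -> Option<RpcCommand> {
    let args: Vec<&str> = line.split_whitespace().collect();
    match parse(&args)? {
        Command::Rpc(cmd) => Some(cmd),
        Command::Full(_) => None,
    }
}
```

Keeping the two groups as separate types means the REPL cannot accidentally dispatch a command that would start a second node; it does not by itself solve the question of where `--rpc-port` should live.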

Change checklist

  • Self-review.
  • Documentation updates if relevant.
  • Tests if relevant.

Frando and others added 30 commits August 14, 2023 12:33
* removes content support from iroh-sync
* adds a quick-and-dirty writable database to iroh-bytes (will be
  replaced with a better generic writable database soon)
* adds a `Downloader` to queue get requests for individual hashes from
  individual peers
* adds a `BlobStore` that combines the writable db with the downloader
* adds a `Doc` abstraction that combines an iroh-sync `Replica` with a
  `BlobStore` to download content from peers on-demand
* updates the sync repl example to plug it all together
* also adds very basic persistence to `Replica` (encode to byte string)
  and uses this in the repl example
* make the REPL in the sync example work properly with rustyline for
  editing and reading input, shell-style argument parsing and clap for
  parsing commands
* add a docs store for opening and closing docs
* add author to doc struct
uses flume channels to allow for combined sync and async usage
@Arqu
Collaborator

Arqu commented Aug 24, 2023

/netsim branch changes-for-1356

github-merge-queue bot pushed a commit that referenced this pull request Aug 24, 2023
## Description

This PR adds `iroh-sync`, a document synchronization protocol, to iroh,
and integrates with `iroh-net`, `iroh-gossip` and `iroh-bytes`.

* At the core is the `iroh-sync` crate, with a set reconciliation
algorithm implemented by @dignifiedquire. See [the old iroh-sync
repo](https://github.com/n0-computer/iroh-sync/) for the prehistory and
#1216 for the initial PR (fully included in this PR, and by now
outdated)
* Iroh sync is integrated in the iroh node, with iroh-gossip, in the RPC
interface, and the CLI.
* `LiveSync` is the handle to an actor that integrates sync with
[gossip](#1149) to broadcast and receive document updates from peers.
For each open document a gossip swarm is joined with a `TopicId` derived
from the doc namespace.
* mod `download` contains the new downloader. It will be improved in
#1344.
* mod `client` is the new high-level RPC client. It currently only has
methods for dealing with docs and sync; other things should be added
once this is merged. CLI commands for sync are in `commands/sync.rs`.
Will be much better with #1356.
* `examples/sync.rs` has a REPL to modify and sync docs. It does a full
setup without using the iroh console. Also includes code to sync
directories, and a hammer command for load testing.
* The PR also introduces `iroh::client::Iroh`, a wrapper around the RPC
client, and `iroh::client::Doc`, a wrapper around RPC client for a
single document
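
To make the "`TopicId` derived from the doc namespace" point concrete, here is a minimal sketch of a direct 1:1 mapping. The two structs below are local stand-ins, not the real `iroh-sync` or `iroh-gossip` types, and the assumption that both identifiers are 32 bytes wide is made only for this example.

```rust
/// Stand-in for a document namespace identifier (assumed to be 32 bytes).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct NamespaceId([u8; 32]);

/// Stand-in for a gossip topic identifier (assumed to be 32 bytes).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct TopicId([u8; 32]);

impl From<NamespaceId> for TopicId {
    /// Direct 1:1 mapping: the topic bytes are the namespace bytes.
    fn from(namespace: NamespaceId) -> Self {
        TopicId(namespace.0)
    }
}
```

With a mapping like this, every peer that knows the namespace can compute the same topic without extra coordination. Hashing the namespace instead would need a cryptographic hash function, which is left out of this sketch.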

## Notes & open questions

#### Should likely happen before merge:

* [x] Make `iroh_sync::Store::list_authors` and `list_replicas` return
iterators `iroh-sync` *fixed in #1366*
* [ ] Add `iroh_sync::Store::close_replica`
* [x] `ContentStatus` in `on_insert` callback is reported as `Ready` if
the content is still `baomap::PartialEntry` (in-process download) *fixed
in a8e8093*


#### Can happen after merge, but before `0.6` release

* [ ] Implement `AuthorImport` and `AuthorShare` RPC & CLI commands
* [ ] sync store `list_namespaces` and `list_authors` internally
collect, return iterator instead
* [ ] Fix cross-compiles to arm/android. See
cross-rs/cross#1311
* [ ] Ensure that fingerprint calculation is efficient and/or cached for
large documents. Currently calculating the initial fingerprint iterates
once over all entries in a document.
* [ ] Make content downloads be more reliable
* [ ] Add some way to download content from peers independent of the
first insertion event for a remote entry. The downloader with retries is
tracked in #1334 and #1344, but independent of that, we still would
currently only ever try to queue a download when the `on_insert`
callback triggers, which is only once. There should be a way, even if
manual for now, to try to download missing content in a replica from
peers.
* [ ] during `iroh-sync` sync include info if content is available for
each entry
* [ ] Add basic peer management and persistence. Currently live sync
will stop doing anything after a node restart.
* [ ] Persist the addressbook of peers for a document, to reconnect when
restarting the node
* [ ] Implement `PeerAdd` and `PeerList` RPC & CLI commands. The latter
needs changes in `iroh-net` to expose information of currently-connected
peers and their peer info.
* [ ] Make read-only replicas possible
* [ ] Improve reliability of replica sync.
  * sync is triggered on each `NeighborUp` event from gossip. Check that
    we don't sync too much.
  * maybe include peer info in gossip messages, to queue syncs with those
    (but not all at once)
  * track and exchange the timestamp of the last full sync for peers, to
    know if you missed a gossiped message and react accordingly
  * add more tests with peers coming and leaving
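
One possible way to avoid syncing too often on repeated `NeighborUp` events is a per-peer minimum interval. The sketch below is only an illustration of that idea: `SyncThrottle`, `PeerId`, and `should_sync` are hypothetical names, and nothing here reflects how the sync engine actually schedules syncs.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Stand-in peer identifier for this example.
type PeerId = u64;

/// Hypothetical throttle that allows at most one sync per peer per interval.
struct SyncThrottle {
    min_interval: Duration,
    last_sync: HashMap<PeerId, Instant>,
}

impl SyncThrottle {
    fn new(min_interval: Duration) -> Self {
        Self {
            min_interval,
            last_sync: HashMap::new(),
        }
    }

    /// Called for each `NeighborUp` event. Returns true if a sync with
    /// `peer` should start now, and records `now` as its last sync time.
    fn should_sync(&mut self, peer: PeerId, now: Instant) -> bool {
        let synced_recently = self
            .last_sync
            .get(&peer)
            .map_or(false, |last| now.saturating_duration_since(*last) < self.min_interval);
        if synced_recently {
            return false;
        }
        self.last_sync.insert(peer, now);
        true
    }
}
```

Passing `now` in explicitly keeps the logic testable without sleeping. A real implementation would probably also need to forget peers that have gone away.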

#### Open questions

* [ ] `iroh_sync::EntrySignature` should the signatures include a
namespace prefix?
* [ ] do we want the 1:1 mapping of `NamespaceId` and gossip `TopicId`,
or would the topic id as a hash be better?

#### Other TODOs collected from the code

* [ ] Port `hammer` and `fs` commands from REPL example to iroh cli
* [ ] docs/naming: settle terminology about keypairs,
private/secret/signing keys, public keys/identifiers and make docs and
symbols consistent
* [ ] Make `bytes_get` streaming in the RPC interface
* [ ] Allow to unset the subscription on a replica
* [ ] `iroh-sync` shouldn't depend on `iroh-bytes` only for `Hash` type
-> #1354
* [ ] Move `sync::live::PeerSource` to iroh-net or even better ->
#1354
* [ ] `StoreInstance::put` propagate error and verify timestamp is
reasonable.
* [ ] `StoreInstance::get_range` implement inverted range
* [ ] `iroh_sync`: Remove some items only used in tests (marked with
`#[cfg(test)]`)
* [ ] `iroh_sync` fs store: verify get method fetches all keys with this
namespace
* [ ] `ranger::SimpleStore::get_range`: optimize
* [ ] `ranger::Peer` avoid allocs?
* [ ] `fs::StoreInstance::get_fingerprint` optimize
* [ ] `SyncEngine::doc_subscribe` remove unwrap, handle error


## Change checklist

- [x] Self-review.
- [x] Documentation updates if relevant.
- [ ] Tests if relevant.

---------

Co-authored-by: dignifiedquire <me@dignifiedquire.com>
Co-authored-by: Asmir Avdicevic <asmir.avdicevic64@gmail.com>
Co-authored-by: Kasey <klhuizinga@gmail.com>
@github-actions

sync-repl.28a32b9f6c2819b39d1654a6263522b7c6eb233b
Perf report:

| test | case | throughput_gbps | throughput_transfer |
| --- | --- | --- | --- |
| iroh_latency_20ms | 1_to_1 | 2.71 | 3.61 |
| iroh_latency_20ms | 1_to_3 | 7.36 | 8.90 |
| iroh_latency_20ms | 1_to_5 | 8.50 | 9.44 |
| iroh_latency_20ms | 1_to_10 | 8.35 | 8.29 |
| iroh_latency_20ms | 2_to_2 | 5.27 | 7.27 |
| iroh_latency_20ms | 2_to_4 | 9.47 | 12.15 |
| iroh_latency_20ms | 2_to_6 | 11.49 | 13.14 |
| iroh_latency_20ms | 2_to_10 | 14.09 | 14.94 |
| iroh | 1_to_1 | 2.71 | 3.62 |
| iroh | 1_to_3 | 6.63 | 7.81 |
| iroh | 1_to_5 | 8.21 | 9.01 |
| iroh | 1_to_10 | 9.11 | 9.07 |
| iroh | 2_to_2 | 5.71 | 6.49 |
| iroh | 2_to_4 | 10.09 | 12.51 |
| iroh | 2_to_6 | 12.16 | 13.82 |
| iroh | 2_to_10 | 14.31 | 15.35 |
| iroh_latency_200ms | 1_to_1 | 2.62 | 3.43 |
| iroh_latency_200ms | 1_to_3 | 7.69 | 9.42 |
| iroh_latency_200ms | 1_to_5 | 7.82 | 8.36 |
| iroh_latency_200ms | 1_to_10 | 9.42 | 9.49 |
| iroh_latency_200ms | 2_to_2 | 5.15 | 6.70 |
| iroh_latency_200ms | 2_to_4 | 10.40 | 12.64 |
| iroh_latency_200ms | 2_to_6 | 11.29 | 13.08 |
| iroh_latency_200ms | 2_to_10 | 14.41 | 15.44 |

@Frando Frando changed the base branch from sync-integration to main August 24, 2023 17:59
@Frando Frando enabled auto-merge August 24, 2023 18:27
@Frando Frando added this pull request to the merge queue Aug 24, 2023
Merged via the queue into main with commit b73d950 Aug 24, 2023
15 checks passed
@dignifiedquire dignifiedquire deleted the sync-repl branch November 1, 2023 14:26
matheus23 pushed a commit that referenced this pull request Nov 14, 2024
## Description

This adds a REPL to the main `iroh` binary. It builds upon #1216.


* The REPL embeds all commands that operate via the RPC client - which,
currently, is everything but `provide` (which should be called `start`),
`get` and `doctor`.

* The REPL has two new, REPL-only commands: `set-doc` and `set-author`.
`set-doc` changes the state of the REPL: author and document will be
displayed above the input line, and the document commands (`set`, `get`,
`list`, `share`, etc.) will be available top-level.

* Most of the changes in `src/commands.rs` only move code and the
`rpc_port` option around, the individual commands are not changed in
this PR.

## Notes & open questions

* The document and author IDs in the `list` commands are currently
printed in hex. Should change to base32.

* I'm not yet super sure about the `set-doc` and `set-author` commands.
Another path might be to dig more into a pwd-like structure and have a
`cd` command or so. This could then also move further into documents.
I'm not sure how the author fits in here.

* When in the level of a document, there's a conflict between the `doc
list` command (which is just `list` then) and the top-level `list`
command (to list blobs and collections). Not sure yet what the solution
is. For now I embedded only the `sync` commands on the doc level, but I
think I'd prefer to have the global set of commands not change between
levels. Maybe we just rename the top-level `list` command to `blobs` and
group the other blob-related commands (`add`, a tbd `get` via RPC,
possibly `export`) there.

* The REPL embeds all the existing RPC CLI commands. For this I changed
the structure of `src/cli/commands.rs` to split between
`RpcCommands` and `FullCommands`; the latter are the ones that start an
actual iroh node (plus doctor, wasn't sure about that for now). The
former all work atop the RPC client. For them to not create a new RPC
client for each REPL command, I moved the `--rpc-port` option to the top
level. This is not super correct, because it does not apply to
`provide`, `get`, `doctor`. Clap does not allow to scope an argument to
a set of subcommands by default, see [this
discussion](clap-rs/clap#5070 (comment)).
Still thinking about what the cleanest solution is.
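
For the hex-versus-base32 note in the list above, the following sketch shows what switching encodings for a 32-byte ID could look like. It hand-rolls an unpadded, lowercase RFC 4648 base32 encoder so the example stays dependency-free. The function names are made up for this example, and the exact base32 variant the project would choose is an assumption.

```rust
/// RFC 4648 base32 alphabet, lowercased.
const ALPHABET: &[u8; 32] = b"abcdefghijklmnopqrstuvwxyz234567";

/// Encodes bytes as unpadded, lowercase base32.
fn to_base32(bytes: &[u8]) -> String {
    let mut out = String::new();
    let mut buffer: u32 = 0; // holds bits not yet emitted
    let mut bits: u32 = 0; // number of valid bits in `buffer`
    for &byte in bytes {
        buffer = (buffer << 8) | u32::from(byte);
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            let index = (buffer >> bits) & 0x1f;
            out.push(ALPHABET[index as usize] as char);
        }
        // Drop the bits that were already emitted.
        buffer &= (1 << bits) - 1;
    }
    if bits > 0 {
        // Left-align the remaining bits in a final 5-bit group.
        let index = (buffer << (5 - bits)) & 0x1f;
        out.push(ALPHABET[index as usize] as char);
    }
    out
}

/// Encodes bytes as lowercase hex, the format the list commands print today.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}
```

A 32-byte ID takes 64 characters in hex and 52 characters in unpadded base32, so the listings would get somewhat shorter.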

## Change checklist

- [ ] Self-review.
- [ ] Documentation updates if relevant.
- [ ] Tests if relevant.

---------

Co-authored-by: dignifiedquire <me@dignifiedquire.com>
Co-authored-by: Asmir Avdicevic <asmir.avdicevic64@gmail.com>
Co-authored-by: Kasey <klhuizinga@gmail.com>
Co-authored-by: Brendan O'Brien <sparkle_pony_2000@qri.io>