This repository has been archived by the owner on Sep 21, 2023. It is now read-only.

Nearup rewrite #81

Closed
chefsale opened this issue Jul 2, 2020 · 10 comments

Comments

@chefsale
Contributor

chefsale commented Jul 2, 2020

Nearup is currently written in Python without any external packages, so much of its logic is cumbersome and implemented sub-optimally. Nearup is on the critical path for our network, and many validators depend on it.

In its current state this is a liability, and we should put effort into making it production-ready and reliable.

The main proposal was to rewrite the whole nearup application in Rust, which is strongly typed and compiled, so we can catch issues earlier, at compile time. Rust also provides all the crates needed to make this happen:

The other option was Go, but that would introduce another language and ecosystem, which is unnecessary since everything can be done in Rust.

The general idea is to rewrite nearup and replace our current deployments of devnet, betanet and testnet with nearup everywhere. This would deprecate much of our complicated deployment setup, and we could simply deploy prebaked nearup images (Docker/Packer).

Nearup would be configurable to:

  • automatically update itself or not
  • automatically run the latest version of a specified release phase (stable, rc, beta) or a specific version
  • join betanet, testnet, localnet or another custom network if needed
  • support a canary mode that we could use to test nodes for a specified period of time (this would replace devnet)
    • on every commit, to validate that master is still backwards compatible and working
    • to run a regular node on testnet: the node syncs from scratch and keeps up with the head
    • to run a validator node on testnet: the node starts from existing state, and we make sure there are no block production failures and that the node continues to be a validator
    • to run an RPC node on testnet: make sure there are no failures in RPC calls
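To make the options above concrete, here is a minimal sketch of how such a configuration could be modelled. Everything in it (`NearupConfig`, `Network`, `ReleasePhase`, the field names, and the validation rules) is hypothetical and only illustrates the list; it is not an existing nearup API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Union


class Network(Enum):
    """Networks named in the proposal; CUSTOM stands for any other network."""
    BETANET = "betanet"
    TESTNET = "testnet"
    LOCALNET = "localnet"
    CUSTOM = "custom"


class ReleasePhase(Enum):
    """Release phases named in the proposal."""
    STABLE = "stable"
    RC = "rc"
    BETA = "beta"


@dataclass(frozen=True)
class NearupConfig:
    """Hypothetical configuration object (all names are assumptions)."""
    network: Network
    # Either follow the latest release of a phase, or pin a specific version string.
    version: Union[ReleasePhase, str] = ReleasePhase.STABLE
    auto_update: bool = False
    # Canary mode: run the node for a fixed period; None means a regular node.
    canary_seconds: Optional[int] = None

    def __post_init__(self) -> None:
        if isinstance(self.version, str) and not self.version:
            raise ValueError("pinned version must not be empty")
        if self.canary_seconds is not None and self.canary_seconds <= 0:
            raise ValueError("canary_seconds must be positive")

    @property
    def is_canary(self) -> bool:
        return self.canary_seconds is not None
```

For example, `NearupConfig(network=Network.TESTNET, canary_seconds=3600)` would describe a one-hour canary run on testnet that follows the latest stable release.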

As we plan to support a flag in neard to force the next protocol version, the nearup configuration should support it as well:
`neard --protocol_version=6` or `neard --next_protocol`
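As a rough sketch of how nearup could pass this setting through, the helper below builds the extra neard arguments. The two flag spellings are taken from the proposal above and were only planned at the time; the function name and the rule that the two options are mutually exclusive are assumptions.

```python
from typing import List, Optional


def neard_protocol_args(protocol_version: Optional[int] = None,
                        next_protocol: bool = False) -> List[str]:
    """Build the protocol-version arguments for a neard command line.

    Hypothetical helper: the flags mirror the ones proposed in this issue.
    """
    if protocol_version is not None and next_protocol:
        # Assumption: forcing a version and forcing "next" cannot be combined.
        raise ValueError("choose either protocol_version or next_protocol")
    if protocol_version is not None:
        return [f"--protocol_version={protocol_version}"]
    if next_protocol:
        return ["--next_protocol"]
    return []
```

A caller would append the result to the rest of the command, for example `["neard"] + neard_protocol_args(protocol_version=6)`.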

cc: @bowenwang1996 @ailisp @damons @frol

@chefsale chefsale self-assigned this Jul 2, 2020
@mfornet
Member

mfornet commented Jul 2, 2020

If we write nearup in Rust, how are we going to distribute it? The current one-liner for nearup is really cool, but of course it relies on users having Python working out of the box.
I guess we could still have a similar one-liner and:

  1. ask whether they want to download a pre-compiled binary, or
  2. install nearup from crates.io with cargo install (potentially installing cargo first).

@chefsale
Contributor Author

chefsale commented Jul 2, 2020

Yes, I think we could do both: a crate plus a cool script that installs it. We could even add it to the distro-specific package managers and support:

apt-get install nearup
dnf install nearup
etc...

That would be nice as well.

@ailisp
Member

ailisp commented Jul 2, 2020

I don't suggest we maintain apt and dnf repos: nearup needs to self-update, and with apt/dnf that requires sudo. A binary plus a nearup-init.sh sounds good; it's how rustup works (rustup-init.sh/rust-init.bat + rustup binary).

@ailisp
Member

ailisp commented Jul 2, 2020

I agree with distributing a binary + shell script, but I don't think a rewrite in Rust is necessary:

  • Python has decent packaging solutions: AppImage, or https://www.pantsbuild.org/index.html, https://buck.build/, https://bazel.build/, as suggested by @chefsale
  • we don't need Rust to avoid bugs. The major bugs in a devops tool like nearup are not type errors but, imo, failures in a series of integration activities: downloading, GitHub integration, subprocess control, etc. Even in Rust these need to be covered by the same suite of integration tests, but rewriting in Rust and adding new features takes more time than using Python, so I don't think it's a good idea.
  • Python has also proven its ability and rich ecosystem in industrial-strength devops tools (gcloud CLI, AWS CLI, Azure CLI, Ansible, SaltStack and OpenStack are all written in Python), but Rust has not (rustup's logic is simpler compared to these tools, even compared to nearup)

@chefsale
Contributor Author

chefsale commented Jul 2, 2020

I agree with @ailisp on maybe sticking with Python as well. I'm happy to go with either solution and to hear other people's opinions; obviously there are pros and cons :)

@frol
Collaborator

frol commented Jul 2, 2020

Even though I dream of having nearup, near-shell, and rainbow cli implemented in Rust, I want to make sure we weigh all the pros and cons of a rewrite.

> we don't need Rust to avoid bugs. The major bugs in a devops tool like nearup are not type errors but, imo, failures in a series of integration activities

Python is great for happy-path scripting, but handling corner cases (the real world is scary) requires catch-all try-except blocks (a single call to requests.get may throw a myriad of exception types, ranging from low-level native Python exceptions to high-level errors), while Rust makes all of that explicit (you may choose to ignore the errors with .unwrap() / .expect(), but you can easily identify them in later iterations when you are ready to handle them).
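One way to approximate that explicitness in Python is to return a result value instead of letting exceptions escape. The sketch below is only an illustration: it uses the standard library's `urllib` rather than `requests`, and `Ok`, `Err`, and `fetch` are hypothetical names. Unlike Rust, nothing forces the caller to inspect the result.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Ok:
    body: bytes


@dataclass(frozen=True)
class Err:
    reason: str


def fetch(url: str,
          opener: Callable = urllib.request.urlopen,
          timeout: float = 10.0) -> Union[Ok, Err]:
    """Download `url`, reporting expected failures as an Err value.

    `opener` is injectable so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return Ok(response.read())
    except urllib.error.HTTPError as exc:
        # HTTPError subclasses URLError, so it has to be handled first.
        return Err(f"HTTP {exc.code} for {url}")
    except urllib.error.URLError as exc:
        return Err(f"network error for {url}: {exc.reason}")
    except (OSError, ValueError) as exc:
        # Socket-level failures and malformed URLs.
        return Err(f"request failed for {url}: {exc}")
```

The caller then branches on the type, for example `isinstance(result, Err)`, which plays the role of matching on a `Result` in Rust.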

I believe that the maintainability [reliability over time after refactorings] of Rust code is much greater than that of Python code. Also, once you get from a PoC to a reliable CLI in Python, the amount of code is the same as or even more than in Rust, I believe.

Still, we already have the implementation in Python, so we should be careful about re-implementation.

/cc @ilblackdragon @nearmax @khorolets

@chefsale
Contributor Author

chefsale commented Jul 2, 2020

I agree on that as well. Still, we have to take into account that we cannot really reuse much of the current code, so personally I believe it would be easier to start from scratch, in either Python or Rust.

@ailisp
Member

ailisp commented Jul 2, 2020

> Python is great for happy-path scripting, but handling corner cases (the real world is scary) requires catch-all try-except blocks (a single call to requests.get may throw a myriad of exception types, ranging from low-level native Python exceptions to high-level errors), while Rust makes all of that explicit (you may choose to ignore the errors with .unwrap() / .expect(), but you can easily identify them in later iterations when you are ready to handle them).

Unfortunately Python is not Java, and it's hard to find all the exceptions a library function could raise :( So a robust Python package tends to use only standard-library functions or libraries whose exception types are well wrapped and documented. requests unfortunately is not one of them, so we have to wrap it and enforce Rust-like error handling in Python:

  • a module may only raise exceptions defined in that same module; raising anything else is a bug in the module
  • a higher-level module that calls lower-level module functions must catch the lower-level module's exceptions (and only those exceptions) and
    • if it doesn't handle them, wrap the lower-level error in an error class defined in the current module
    • or handle them.

Implicitly ignoring some errors is the equivalent of unwrap in Rust :(
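The convention described above can be sketched as follows. The two "modules" are shown as two layers in one file for brevity, and every name (`DownloadError`, `InstallError`, `download`, `install`) is hypothetical; the download itself is a stand-in rather than a real network call.

```python
class DownloadError(Exception):
    """The only exception the (hypothetical) low-level download layer may raise."""


class InstallError(Exception):
    """The only exception the (hypothetical) higher-level install layer may raise."""


def download(url: str) -> bytes:
    """Low level: translate every expected failure into DownloadError."""
    try:
        if not url.startswith("https://"):
            raise ValueError(f"unsupported URL: {url}")
        return b"binary-bytes"  # stand-in for a real network call
    except (OSError, ValueError) as exc:
        # Anything else escaping from here would be a bug in this layer.
        raise DownloadError(str(exc)) from exc


def install(url: str) -> int:
    """Higher level: catch only DownloadError, then wrap it in its own error."""
    try:
        payload = download(url)
    except DownloadError as exc:
        raise InstallError(f"cannot install from {url}") from exc
    return len(payload)
```

Using `raise ... from exc` keeps the lower-level error attached as `__cause__`, so the original failure is still visible in tracebacks after wrapping.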

> I believe that the maintainability [reliability over time after refactorings] of Rust code is much greater than that of Python code. Also, once you get from a PoC to a reliable CLI in Python, the amount of code is the same as or even more than in Rust, I believe.

So this is possibly true, unless in practice some parts never fail. And writing the same amount of (well error-handled) code in Python is faster than writing it in Rust. Especially in this use case, Rust's static checks can't help detect errors in integrating things, and you need to edit, recompile and test quite a few times.

@frol
Collaborator

frol commented Jul 5, 2020

After a brief discussion with @ilblackdragon, we identified that before shooting for any major refactoring on the nearup side, we should make sure that the neard (nearcore) CLI is good enough to run without nearup in the first place. After that, we can draw up the requirements for nearup and decide on the language and packaging strategy.

@chefsale
Contributor Author

chefsale commented Jul 6, 2020

So, can you provide more context on what would need to be done on the nearcore CLI side? cc: @ilblackdragon @frol
