Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Epic] Restructure nearcore project #1881

Open
frol opened this issue Dec 19, 2019 · 22 comments
Open

[Epic] Restructure nearcore project #1881

frol opened this issue Dec 19, 2019 · 22 comments
Labels
A-build Area: Anything related to the build and configuration process for nearcore. C-discussion Category: Discussion, leading to research or enhancement C-epic Category: an epic C-housekeeping Category: Refactoring, cleanups, code quality

Comments

@frol
Copy link
Collaborator

frol commented Dec 19, 2019

My developer experience with nearcore is not great, so I contemplated the improvements that can be made.

The very first thing that is missing for me is a clear project structure. I believe that the best code is no code and the same goes for documentation (this is why I don't want to introduce documentation of the project structure, but rather prefer having an obvious project structure that does not require a lot of time to understand and makes it easy to navigate without a need to memorize all the components).

I don't have enough experience with the existing infrastructure, but here is my first take on it.

The root folder:

 .dockerignore
 .gitattributes
 .github/
 .gitignore
 .gitlab-ci.yml
 .pre-commit-config.yaml

neard/
 libraries/
 tools/

ATTRIBUTIONS.md
 Cargo.lock
 Cargo.toml
 CODEOWNERS
 Dockerfile
 LICENSE
 Makefile
 METRICS.md
 README.md
 rustfmt.toml
 rust-toolchain

It is still crowded with various config files, but at least the folders have a clear structure. Just for the reference, here is the current project structure:

 .dockerignore
 .gitattributes
 .github/
 .gitignore
 .gitlab-ci.yml
 .pre-commit-config.yaml

async-utils/
 chain/
 conf/ # Would you guess that this is Grafana config? Do we actually need this config in this repository?
 core/
 docs/ # Nice! We have some documentation! Well, there is a single logo.svg file (let's move it to .github/).
 genesis-tools/
 near/
 nightly/ # Nightly what?
 pytest/ # This is a very popular testing framework in Python, right? Nope
 runtime/
 scripts/
 tests/ # Can we move them into `near` <span class="error">[`neard`]</span>?
 test-utils/

ATTRIBUTIONS.md
 Cargo.lock
 Cargo.toml
 CODEOWNERS
 Dockerfile
 LICENSE
 Makefile
 METRICS.md
 package-lock.json # ???!
 README.md
 rustfmt.toml
 rust-toolchain

So, I suggest collapsing those 13 folders into 3 top-level ones:

 neard/
 src/
 tests/
 tests_py/ # rename from /pytest (I have no idea where to put the nightly folder with those two .txt files, yet)
 libraries/
 chain/
 core/
 runtime/
 test-utils/
 utils/
 tools/
 near-loadtester
 near-state-viewer
 near-genesis
 near-deploy # rename from /scripts

The exact naming is yet to be discussed.

The crates structure and naming also needs some love.

@frol frol added C-housekeeping Category: Refactoring, cleanups, code quality C-discussion Category: Discussion, leading to research or enhancement C-epic Category: an epic labels Dec 19, 2019
@MaksymZavershynskyi
Copy link
Contributor

This is really great that someone is looking into this and raising these concerns, I really liked the comments.

pytest are not just stress tests they are very similar to nearcore/near/tests and nearcore/tests, just written in Python.

So I suggest we do the following renaming:

  • nearcore/pytest -> nearcore/neard/tests_py;
  • nearcore/nightly -> nearcore/neard/tests_nightly_py;
  • nearcore/tests -> nearcore/neard/tests_rs;
  • nearcore/near/tests -> nearcore/neard/tests_rs.

WDYT?

@frol
Copy link
Collaborator Author

frol commented Dec 19, 2019

So I suggest we do the following renaming

The suggested renaming sounds reasonable to me. I will apply the suggestion (update my post) tomorrow.

@ailisp
Copy link
Member

ailisp commented Dec 19, 2019

I agree with @frol and @nearmax comment above.

nearcore/nightly -> nearcore/neard/tests_nightly_py;

This is a webapp that host and run tests_py and above neard_rs tests, i suggest to move it another repo or put it to tools/nightly_test_runner

@SkidanovAlex
Copy link
Collaborator

also note that nightly heavily relies on the relative paths, so if we change the dirs, nightly needs to be changed accordingly.

@MaksymZavershynskyi
Copy link
Contributor

This is a webapp that host and run tests_py and above neard_rs tests, i suggest to move it another repo or put it to tools/nightly_test_runner

Let's jut put in into tools directory.

@MaksymZavershynskyi
Copy link
Contributor

Assigning it to @frol so that we have only one assignee as per guidelines.

@damons damons added the A-build Area: Anything related to the build and configuration process for nearcore. label Feb 21, 2020
@ilblackdragon ilblackdragon removed this from the MainNet milestone Mar 20, 2020
@frol
Copy link
Collaborator Author

frol commented Apr 24, 2020

The first steps in this direction are #2511 and #2520.

@SkidanovAlex @nearmax @ilblackdragon Do you have any objections against project refactoring? The code is not going to change, it should be just more clear folder naming towards better navigation around the project. I want to make the following steps (separate PR iterations):

  1. Rename the crate folders to match the crate names (chain/chain -> chain/near-chain, runtime/runtime -> runtime/near-runtime, core/store -> core/near-store), so you can easily resolve the use statement into the folder name.
  2. Move some of the folders to clean up the root of the project and also make it clear what our internal libraries/modules/components are (see the issue description).

My motivation mentioned in the issue description only grows with the time I spend with nearcore. I don't want to have lengthy conversations about this, but it hurts my productivity. It is somewhat related to #2433.

@MaksymZavershynskyi
Copy link
Contributor

I agree, this is the kinda of stuff that slows down everyone and makes our code less appealing to external contributors. We discussed with @fckt that having a folder named "libraries" is kinda weird, given most of our crates are libraries, but on the second looks it looks ok.

@lexfrl
Copy link
Contributor

lexfrl commented Apr 29, 2020

Rename the crate folders to match the crate names (chain/chain -> chain/near-chain, runtime/runtime -> runtime/near-runtime, core/store -> core/near-store), so you can easily resolve the use statement into the folder name.

I'd suggest the following adjustment. We can have a near folder to hold all the near-* components.

neard
near
  /chain
  /primitives
  /store
  /runtime
  /json-rpc
utils
test-utils
tools
docs
scripts
...

@frol
Copy link
Collaborator Author

frol commented Apr 29, 2020

@fckt Given the project is nearcore, it implies that everything in the repository is related to near. Thus, "near" as a folder name does not introduce any clarity (the same way I don't really like chain/chain and runtime/runtime). It has its benefit of pulling some of the root folders inside, but utils and test-utils left behind.

@lexfrl
Copy link
Contributor

lexfrl commented Apr 29, 2020

The idea is to associate crates which are the core components (direct dependencies) of neard process and their dependencies.

Given the project is nearcore, it implies that everything in the repository is related to near.

Right, but not all of crates are the direct dependencies of the neard (or dependencies of these crates), there is a lot of supplementary (or optional) crates. There is a category of crates which are essential for the node operation.

The following crates make up the system and these can be naturally form a list of the core components (approximated):

near-config
near-crypto
near-primitives
near-store
near-runtime
near-chain
near-chunks
near-client
near-pool
near-network
near-jsonrpc
near-telemetry
near-epoch-manager

However, don't really think that this list as an accurate and complete. For example, its questionable if near-jsonrpc or near-telemetry are the core components (since these could be optional for neard - I can imagine that node operators would want to disable it). But the most probably the near-chain, near-store, near-network, near-runtime and near-primitives are the essential ones (not optional in any mode). Note: I don't count artificial execution modes like testing environments.

@frol
Copy link
Collaborator Author

frol commented Apr 29, 2020

Let me list all the crates we have:

genesis-csv-to-json
genesis-populate
keypair-generator
loadtester
neard
near-actix-utils
near-chain
near-chain-configs
near-chunks
near-client
near-crypto
near-epoch-manager
near-jsonrpc
near-jsonrpc-client
near-logger-utils
near-metrics
near-network
near-pool
near-primitives
near-rpc-error-core
near-rpc-error-macro
near-runtime-configs
near-runtime-fees
near-store
near-telemetry
near-vm-errors
near-vm-logic
near-vm-runner
near-vm-runner-standalone
node-runtime
restaked
runtime-params-estimator
state-viewer
testlib

Let's extract tools:

genesis-csv-to-json
genesis-populate
keypair-generator
loadtester
restaked
runtime-params-estimator
state-viewer

Let's move neard out of scope and list the rest:

near-actix-utils
near-chain
near-chain-configs
near-chunks
near-client
near-crypto
near-epoch-manager
near-jsonrpc
near-jsonrpc-client
near-logger-utils
near-metrics
near-network
near-pool
near-primitives
near-rpc-error-core
near-rpc-error-macro
near-runtime-configs
near-runtime-fees
near-store
near-telemetry
near-vm-errors
near-vm-logic
near-vm-runner
near-vm-runner-standalone
node-runtime
testlib

Everyone is welcome to suggest a clear separation of the packages. Currently, we have:

  • core (I think "common" would be a better name describing that these are widely reusable components)
  • chain
  • runtime
  • test-utils (I would move testlib and near-logger-utils to "common" and move the rest to "tools")
  • utils (I hope to merge it somewhere)

I find this separation quite reasonable, but it is hard to find a place for things like near-jsonrpc, near-jsonrpc-client, and [hopefully extracted one day] neard-lib.

@bowenwang1996
Copy link
Collaborator

Where does the neard go under this separation @frol ? Also I think we can merge utils and test-utils (those that are not test utils in utils should probably go to tools).

@ailisp
Copy link
Member

ailisp commented Apr 29, 2020 via email

@frol
Copy link
Collaborator Author

frol commented Apr 30, 2020

I propose keeping neard in the root folder since it is the main entry point to the project.

"utils" was just recently introduced (#2511) and it has the only crate (near-actix-utils), and I can see we either extract it to a standalone helper (separate package and repo) or merge into neard-lib (extract neard/src/lib.rs and its dependencies into a separate crate).

I think we can get rid of both "utils" and "test-utils" (just move the relevant crates to "common" and "tools").

what is difference between util and tool?

I view them as follows: a tool is something that you use as a standalone piece (i.e. executable), while util is anything that is useful (e.g. a library, a file, a script, and even an executable).

@lexfrl
Copy link
Contributor

lexfrl commented Apr 30, 2020

As a general methodology, I think if we want to define a category it should be valuable and not tautological. In other words the definition should add a value, be unique and be not ambiguous.

"utils" was just recently introduced (#2511) and it has the only crate (near-actix-utils), and I can see we either extract it to a standalone helper (separate package and repo) or merge into neard-lib (extract neard/src/lib.rs and its dependencies into a separate crate).

neard-lib - makes sense to me! Why it makes sense: besides of the neard itself, we have a lot of other entry points (tests) which would want to reuse some process initialization logic.

@lexfrl
Copy link
Contributor

lexfrl commented Apr 30, 2020

The "utils" to me is a common code which exists because you don't want to repeat it (and this is a dominating characteristic of it). But in the same time, it's a small enough and don't deserve to evolve into a separate library (yet). In the same time, that code should be not exclusively specific to the project (e.g. it must not use custom types, defined in the project). For the exclusive common code and types of the components, primitives serves good.

Essential logic which covers dedicated pieces of the project spec (and at the same time are not useful separately (excluding artificial purposes like testing)), better to be colocated together. \

@stale
Copy link

stale bot commented Jul 1, 2021

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months.
It will be closed in 7 days if no further activity occurs.
Thank you for your contributions.

@stale stale bot added the S-stale label Jul 1, 2021
@bowenwang1996
Copy link
Collaborator

@frol @matklad do we still want to proceed with this?

@matklad
Copy link
Contributor

matklad commented Jul 6, 2021

Yes, I believe there's still room for improvement for project structure, and that this is non-urgent, but important. @miraclx recently make great progress here as well! Let's keep this open.

@matklad
Copy link
Contributor

matklad commented Aug 24, 2021

One suggestion I have is to flatten out the directory layout and keep all rust crates in the one-level deep crates/ directory. In other words, the first list from #1881 (comment) looks pretty reasonable to me, and it nicely solves the "where do I put a thing which doesn't belong anywhere else" problem.

See https://matklad.github.io/2021/08/22/large-rust-workspaces.html for an extended motivation.

(0.8 confidence the end state will be much better, 0.95 confidence that its a big and a somewhat painful change, 0.7 confidence that we should actually do that)

@akhi3030 akhi3030 removed the S-pinned label Nov 28, 2022
@frol frol removed their assignment Nov 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-build Area: Anything related to the build and configuration process for nearcore. C-discussion Category: Discussion, leading to research or enhancement C-epic Category: an epic C-housekeeping Category: Refactoring, cleanups, code quality
Projects
None yet
Development

No branches or pull requests

10 participants