Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[forge] refactory Forge CLI interface #9061

Merged
merged 1 commit into from Sep 1, 2021
Merged

Conversation

zihaoccc
Copy link
Contributor

@zihaoccc zihaoccc commented Aug 31, 2021

Forge Cli is increasingly complicate since it supports more and more operations. Refactory CLI to different operation categories for better user experience.

For cluster Opertions:

  • cargo run -p forge-cli -- operator clean-up
  • cargo run -p forge-cli -- operator resize
  • cargo run -p forge-cli -- operator set-validator

for Testing:

  • cargo run -p forge-cli -- test local-swarm
  • ./scripts/fgi/run --local-swarm
  • ./script/fgi/run --pr xxx
  • ./scripts/fgi/run -T dev_czh_pull_9061 --suite land_blocking_compat

Motivation

(Write your motivation for proposed changes here.)

Have you read the Contributing Guidelines on pull requests?

(Write your answer here.)

Test Plan

(Share your test plan here. If you changed code, please provide us with clear instructions for verifying that your changes work.)

Related PRs

(If this PR adds or changes functionality, please take some time to update or suggest changes to the docs at https://developers.diem.com, and link to your PR here.)

If targeting a release branch, please fill the below out as well

  • Justification and breaking nature (who does it affect? validators, full nodes, tooling, operators, AOS, etc.)
  • Comprehensive test results that demonstrate the fix working and not breaking existing workflows.
  • Why we must have it for V1 launch.
  • What workarounds and alternative we have if we do not push the PR.

@bors-libra bors-libra added this to In Review in bors Aug 31, 2021
rustielin
rustielin previously approved these changes Sep 1, 2021
Copy link
Contributor

@rustielin rustielin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

much nicer! LGTM

Have you tried passing in these new args in fgi? One side effect of having this many subcommands is that it becomes a bit more complicated how we pass args from fgi to forge. I don't anticipate this causing too much issue though, since we'll likely add more functionality to the forge command itself, and keep the k8s-swarm subcommand simple (not needing any further args).

@zihaoccc
Copy link
Contributor Author

zihaoccc commented Sep 1, 2021

much nicer! LGTM

Have you tried passing in these new args in fgi? One side effect of having this many subcommands is that it becomes a bit more complicated how we pass args from fgi to forge. I don't anticipate this causing too much issue though, since we'll likely add more functionality to the forge command itself, and keep the k8s-swarm subcommand simple (not needing any further args).

yea, I did mention above in test plan for those routines thourgh fgi instead of cargo, such as:
./scripts/fgi/run --local-swarm
./script/fgi/run --pr xxx
./scripts/fgi/run -T dev_czh_pull_9061 --suite land_blocking_compat

I did make corresponding change in fgi:
ret = subprocess.call(["cargo", "run", "-p", "forge-cli", "--", "test", "local-swarm"]) and
f"timeout {timeout_secs} forge {' '.join(forge_args)} test k8s-swarm --cluster-name {cluster_name} --image-tag {tag}".strip()

@zihaoccc
Copy link
Contributor Author

zihaoccc commented Sep 1, 2021

/land

@bors-libra bors-libra moved this from In Review to Queued in bors Sep 1, 2021
@bors-libra bors-libra moved this from Queued to Testing in bors Sep 1, 2021
@rustielin
Copy link
Contributor

Do you anticipate it needing to have something like this though? I'm not sure how much more we're intending to add to the swarm functionality

f"timeout {timeout_secs} forge {' '.join(forge_args)} test k8s-swarm  {' '.join(k8s_swarm_args)} --cluster-name {cluster_name} --image-tag {tag}".strip()

@github-actions
Copy link

github-actions bot commented Sep 1, 2021

Cluster Test Result

Test runner setup time spent 261 secs
Compatibility test results for land_c90be61f ==> land_3e984edd (PR)
1. All instances running land_c90be61f, generating some traffic on network
2. First full node land_c90be61f ==> land_3e984edd, to validate new full node to old validator node traffic
3. First Validator node land_c90be61f ==> land_3e984edd, to validate storage compatibility
4. First batch validators (14) land_c90be61f ==> land_3e984edd, to test consensus and traffic between old full nodes and new validator node
5. First batch full nodes (14) land_c90be61f ==> land_3e984edd
6. Second batch validators (15) land_c90be61f ==> land_3e984edd, to upgrade rest of the validators
7. Second batch of full nodes (15) land_c90be61f ==> land_3e984edd, to finish the network upgrade, time spent 713 secs
all up : 1245 TPS, 3637 ms latency, 4100 ms p99 latency, no expired txns, time spent 249 secs
Logs: http://kibana.ct-0-k8s-testnet.aws.hlw3truzy4ls.com/app/kibana#/discover?_g=(time:(from:'2021-09-01T19:23:32Z',to:'2021-09-01T19:46:40Z'))
Dashboard: http://grafana.ct-0-k8s-testnet.aws.hlw3truzy4ls.com/d/performance/performance?from=1630524212000&to=1630525600000
Validator 1 logs: http://kibana.ct-0-k8s-testnet.aws.hlw3truzy4ls.com/app/kibana#/discover?_g=(time:(from:'2021-09-01T19:23:32Z',to:'2021-09-01T19:46:40Z'))&_a=(columns:!(log),query:(language:kuery,query:'kubernetes.pod_name:"val-1"'),sort:!(!('@timestamp',desc)))

Repro cmd:

./scripts/cti --tag land_c90be61f --cluster-test-tag land_3e984edd -E BATCH_SIZE=15 -E UPDATE_TO_TAG=land_3e984edd --report report.json --suite land_blocking_compat

🎉 Land-blocking cluster test passed! 👌

@bors-libra bors-libra removed this from Testing in bors Sep 1, 2021
@bors-libra bors-libra closed this in 3e984ed Sep 1, 2021
@bors-libra bors-libra merged commit 3e984ed into diem:main Sep 1, 2021
@bors-libra bors-libra temporarily deployed to Sccache September 1, 2021 19:47 Inactive
@bors-libra bors-libra temporarily deployed to Sccache September 1, 2021 19:47 Inactive
@bors-libra bors-libra temporarily deployed to Docker September 1, 2021 19:47 Inactive
ankitkacn pushed a commit to ankitkacn/work that referenced this pull request Dec 29, 2022
bors-diem pushed a commit that referenced this pull request Dec 29, 2022
bors-diem pushed a commit that referenced this pull request Dec 29, 2022
ankitkacn pushed a commit to ankitkacn/work that referenced this pull request Jan 2, 2023
bors-diem pushed a commit that referenced this pull request Jan 2, 2023
bors-diem pushed a commit that referenced this pull request Jan 2, 2023
bors-diem pushed a commit that referenced this pull request Jan 2, 2023
ankitkacn pushed a commit to ankitkacn/work that referenced this pull request Jan 3, 2023
bors-diem pushed a commit that referenced this pull request Jan 3, 2023
bors-diem pushed a commit that referenced this pull request Jan 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants