farmerbot #1371

Closed
5 of 7 tasks
Tracked by #1404
xmonader opened this issue Dec 15, 2022 · 6 comments

@xmonader
Contributor

xmonader commented Dec 15, 2022

Farmerbot

Tasklist

Description

The farmerbot is an opt-in feature that can be run separately from the chain and ZOS.

On tfchain, the power state and power target of each node will be added so that the farmerbot can control the power state of its nodes. The farmerbot can power nodes down and on as it sees fit. The farmerbot should expose a method on its actor that allows a user or tool to query the most optimal node in its farm. It should try to match the workload with the current online and offline nodes, and power on a node if that is needed to accommodate a workload.
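
To make the node query a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Node` and `Workload` types, their fields, and the `find_node` name are hypothetical, and "most optimal" is interpreted here as "the online node with the least spare capacity that still fits", which the spec does not state. Flipping `online` stands in for changing the power target on tfchain.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical shape of a node as seen by the farmerbot.
    node_id: int
    online: bool
    free_cru: int     # free CPU cores
    free_mru_gb: int  # free memory in GB


@dataclass
class Workload:
    # Hypothetical resource request coming from a client.
    cru: int
    mru_gb: int


def fits(node: Node, workload: Workload) -> bool:
    """True if the node has enough free CPU and memory for the workload."""
    return node.free_cru >= workload.cru and node.free_mru_gb >= workload.mru_gb


def find_node(nodes: List[Node], workload: Workload) -> Optional[Node]:
    """Prefer an online node that fits; otherwise wake an offline one."""
    candidates = [n for n in nodes if fits(n, workload)]
    online = [n for n in candidates if n.online]
    if online:
        # Assumption: pack tightly by picking the node with the least spare capacity.
        return min(online, key=lambda n: (n.free_mru_gb, n.free_cru))
    offline = [n for n in candidates if not n.online]
    if offline:
        chosen = offline[0]
        chosen.online = True  # stands in for setting the power target on tfchain
        return chosen
    return None  # no node in the farm can host this workload
```

Packing workloads onto already-busy nodes is just one possible policy; it was chosen here because it leaves more nodes idle and therefore eligible for shutdown.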

Proposed strategy for the farmerbot

There is a flow chart below.

Shutting down nodes

  • The farmer will have to configure his nodes in a markdown file (telling the farmerbot how to connect to them, etc)
  • The farmerbot should contact ZOS nodes through RMB (ask for used resources etc)
  • Based on this info it will shut down nodes if needed. It will change the powertarget of the node on tfchain, which will emit an event that the node will catch and act upon.
  • Maybe keep in memory that the node is shutting down

Waking up nodes periodically

  • The farmerbot will periodically have to wake nodes that are down so they can send uptime reports. The time window for this should be configurable (e.g. not between 12 am and 8 am); some farmers requested this on the forum.
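
A configurable window like this could be checked roughly as follows. This is only a sketch: the function name `wakeup_allowed` and its parameters are hypothetical, and it assumes the farmer configures a single blocked window per day, which may wrap past midnight.

```python
from datetime import time


def wakeup_allowed(now: time, blocked_start: time, blocked_end: time) -> bool:
    """Return True if a periodic wake-up may happen at `now`.

    The blocked window is [blocked_start, blocked_end) and may wrap past midnight.
    """
    if blocked_start == blocked_end:
        return True  # empty window: nothing is blocked
    if blocked_start < blocked_end:
        # Window within one day, e.g. 00:00 to 08:00.
        blocked = blocked_start <= now < blocked_end
    else:
        # Window wraps around midnight, e.g. 22:00 to 06:00.
        blocked = now >= blocked_start or now < blocked_end
    return not blocked
```

With the example from the bullet above, `wakeup_allowed(time(3, 0), time(0, 0), time(8, 0))` is false and `wakeup_allowed(time(9, 0), time(0, 0), time(8, 0))` is true.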

TSClient jobs

  • When the TSClient wants to deploy on a node it will create a job configuration and send it to the farmerbot
  • The TSClient will send the job to the farmerbot via RMB
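
The job configuration format is not defined in this issue. Purely as an illustration, a job could be a small record serialized to JSON before being sent over RMB. All field names below (`twin_id`, `farm_id`, `cru`, `mru_gb`, `sru_gb`) and the `DeployJob` type are hypothetical, and the RMB transport itself is not shown.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DeployJob:
    # All field names are hypothetical; the issue does not define the job format.
    twin_id: int  # twin of the client sending the job
    farm_id: int  # farm whose farmerbot should handle the job
    cru: int      # requested CPU cores
    mru_gb: int   # requested memory in GB
    sru_gb: int   # requested SSD storage in GB


def encode_job(job: DeployJob) -> str:
    """Serialize the job to a JSON string suitable for a message payload."""
    return json.dumps(asdict(job), sort_keys=True)


def decode_job(payload: str) -> DeployJob:
    """Rebuild the job on the farmerbot side from the JSON payload."""
    return DeployJob(**json.loads(payload))
```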

Handling jobs (aka bringing nodes up)

  • The jobs will tell the farmerbot that a client wants to deploy a job on a node
  • It will bring up the node if it is offline: it will change the powertarget of the node on tfchain which will emit an event that the other nodes in the farm will catch. They will then send the WOL packet to that node.
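
For reference, the WOL packet mentioned above is usually the standard Wake-on-LAN "magic packet": 6 bytes of `0xFF` followed by the target MAC address repeated 16 times. The sketch below builds and broadcasts one in Python. It assumes the standard magic packet format and the conventional UDP broadcast on port 9; how ZOS nodes actually send it is not specified in this issue, and the function names are hypothetical.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet for a MAC like 'aa:bb:cc:dd:ee:ff'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"invalid MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # 6 bytes of 0xFF, then the MAC repeated 16 times (102 bytes in total).
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet on the local network (assumed transport: UDP)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Because the packet is a link-local broadcast, it has to come from a machine on the same network segment, which is consistent with other nodes in the farm sending it.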

See more details

@xmonader xmonader changed the title farmerbot specs farmerbot Dec 15, 2022
@xmonader xmonader self-assigned this Dec 19, 2022
@xmonader xmonader added this to the 3.8 milestone Dec 19, 2022
@xmonader
Contributor Author

@DylanVerstraete please fill in the specs to review before working on the issue

@xmonader
Contributor Author

@DylanVerstraete there's a module, https://github.com/freeflowuniverse/crystallib/tree/development/twinclient/examples, that allows invoking TypeScript functionality to execute what V doesn't support natively over HTTP, RMB, WS

@brandonpille
Contributor

brandonpille commented Dec 20, 2022

I have a couple of questions:

  • Can we define the capacity management rules in this issue?
  • What does "almost full" mean? Is it a percentage of used resources? Does that depend on the type of resource?
  • How do we know when a deployment is canceled (aka Resources are no longer used)?

@brandonpille
Contributor

@xmonader

@xmonader
Contributor Author

basically the goal is moving the logic that we did on the chain to be available on the farmerbot in terms of capacity planning and power management

a very important note is that the farmerbot is opt-in, so I believe we will need an extra field on the Farm object for the farmerbot URL/IP (that needs to be on a public IP, or we can use a twinID to support using @muhamadazmy's relay instead)

capacity planning rules

from the previous story #1303 (comment)
I should be able to send the contracts to the farmerbot to investigate and execute the power mgmt options on the farm

  • "almost full" should be user-defined, e.g. as a farmer I'd want to turn on more nodes once the nodes currently online reach 80% of used resources
  • I should also be able to define a minimum of X nodes to be online, e.g. a Kubernetes cluster should be deployed across different nodes, so the farmerbot should be able to plan for that and turn on nodes; some of the nodes can already be up to avoid cold starts
  • whatever business logic we also added to tfchain during the other capacity planning story can be part of this
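
The first two rules could be combined along these lines. This is a sketch under assumptions: `FarmState`, its fields, and `nodes_to_power_on` are hypothetical names, resources are reduced to a single number, and the policy of waking exactly one extra node when the threshold is hit is an illustrative choice rather than something decided in this thread.

```python
from dataclasses import dataclass


@dataclass
class FarmState:
    # Hypothetical summary of a farm as seen by the farmerbot.
    online_nodes: int
    offline_nodes: int
    used: float   # resources in use across online nodes (single unit)
    total: float  # total resources across online nodes (same unit)


def nodes_to_power_on(state: FarmState, threshold: float = 0.8, min_online: int = 1) -> int:
    """Return how many offline nodes should be woken up right now."""
    wanted = 0
    # Rule: keep at least `min_online` nodes up to avoid cold starts.
    if state.online_nodes < min_online:
        wanted = min_online - state.online_nodes
    # Rule: "almost full" is a farmer-defined fraction of used resources.
    if state.total > 0 and state.used / state.total >= threshold:
        wanted = max(wanted, 1)  # assumption: add one node at a time
    # Never ask for more nodes than are actually offline.
    return min(wanted, state.offline_nodes)
```

For example, a farm with 3 online nodes at 85% usage and 2 nodes offline would wake one node with the default 80% threshold.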

How do we know when a deployment is canceled (aka Resources are no longer used)?

The farmerbot can check the status of the contracts relevant to a specific farm, I believe. It can even ping the node's statistics module to get the number of active deployments on a node, and if the node is surplus to the farmer-defined initial state, it can be turned off.
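
As a rough sketch of that last check, assuming the farmerbot already has the number of active deployments per online node (how it gets them from the statistics module is not shown), and that the "initial state" is expressed as a minimum number of online nodes. The function name and parameters are hypothetical.

```python
from typing import Dict, List


def nodes_to_power_off(active_deployments: Dict[int, int], min_online: int) -> List[int]:
    """Return IDs of idle online nodes that can be powered off.

    `active_deployments` maps node ID to its number of active deployments
    for every node that is currently online.
    """
    surplus = len(active_deployments) - min_online
    if surplus <= 0:
        return []  # already at or below the farmer-defined minimum
    # Only nodes without any active deployment are candidates.
    idle = sorted(n for n, count in active_deployments.items() if count == 0)
    return idle[:surplus]
```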

@DylanVerstraete
Contributor

a very important note is that the farmerbot is opt-in, so I believe we will need an extra field on the Farm object for the farmerbot URL/IP (that needs to be on a public IP, or we can use a twinID to support using @muhamadazmy's relay instead)

This has changed: the farmerbot will publish its endpoint on a NATS server so the grid tools can look up the farmerbot location there and connect to it.

We could also support an RMB relay address once threefoldtech/tfchain#569 is done.

@xmonader xmonader assigned xmonader and unassigned despiegk Feb 26, 2023
@xmonader xmonader modified the milestones: 3.8, 3.9 Feb 26, 2023
@ramezsaeed ramezsaeed mentioned this issue Mar 2, 2023
8 tasks
@A-Harby A-Harby closed this as completed Mar 16, 2023