Better handling of the fact that nodes have individual disks #1830

Closed
scottyeager opened this issue Nov 11, 2022 · 11 comments

@scottyeager

Currently, 3Nodes don't make any information available about the size of the individual disks they contain. Storage instead appears as a single available quantity of SSD and, when applicable, HDD, out of a single total. This can create the appearance that a node can support a disk reservation up to the full available quantity, when in fact it can only support a reservation up to the largest free block on a single disk.

To improve the experience for deployers, we could:

  1. Make individual disk info available for query over RMB. Then interfaces can at least help the user reserve within the limits of a single disk.
  2. Provide virtual disks that span multiple physical disks in the node, so reservations up to the total available capacity of the node work seamlessly. A consideration with this approach is that some users may wish to keep their reservations on separate physical disks, for example to use software RAID schemes, so information about the size of individual disks would still be relevant.

Likewise, farmers can benefit from greater visibility into the disks in their nodes. This can help when disks are not recognized, incorrectly recognized (SSD as HDD), or have failed. For farmers:

  1. Give information about how Zos sees the disks, including their ordering, such as /dev/sda == 1TB SSD, /dev/sdb == 2TB HDD (existing filesystem detected, disk not used).
  2. Show this information on the node console so it's immediately visible when the node boots, and also make it available over RMB (a sketch of what such a per-disk report could contain follows this list).
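
For illustration, here is a minimal sketch of the per-disk report both lists imply. It is purely hypothetical, written in Go only because zos itself is written in Go; the type and field names are not an existing zos API, and the values are taken from the examples above:

```go
package main

import "fmt"

// DiskInfo is a hypothetical per-disk report entry; the fields mirror the
// examples in this issue, not an existing zos type.
type DiskInfo struct {
	Device string // e.g. "/dev/sda"
	Type   string // "SSD" or "HDD"
	Size   uint64 // total size in bytes
	Free   uint64 // largest block still reservable, in bytes
	Note   string // e.g. "existing filesystem detected, disk not used"
}

func main() {
	report := []DiskInfo{
		{Device: "/dev/sda", Type: "SSD", Size: 1 << 40, Free: 1 << 40},
		{Device: "/dev/sdb", Type: "HDD", Size: 2 << 40, Note: "existing filesystem detected, disk not used"},
	}
	for _, d := range report {
		fmt.Printf("%s == %d GiB %s %s\n", d.Device, d.Size>>30, d.Type, d.Note)
	}
}
```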
@scottyeager scottyeager added the type_feature New feature or request label Nov 11, 2022
@xmonader xmonader added this to the 3.5.x milestone Nov 14, 2022
@muhamadazmy
Member

Yes, I have been raising this issue (internally) for a while. It is indeed very bad that we show the total capacity across all disks instead of the individual disks, for exactly the reasons you mention.

But I would like to comment on the approach.

  • Individual disk capacities should be reported and shown on the chain, instead of the aggregated capacity.
  • No RMB should be involved in this, since all the information will be available on the chain.
  • There is no way (at the moment) to create a virtual disk that spans multiple physical disks without a huge impact. This would only be possible if we used something like LVM or ran btrfs over multiple disks, but we decided not to do that, to make it easier to replace failed disks.

Doing this query over RMB would of course be easier, since it doesn't require model changes on the chain, but then selecting a node that matches a workload requires a lot of queries to multiple nodes, which will slow things down a lot.

@muhamadazmy muhamadazmy self-assigned this Nov 14, 2022
@muhamadazmy muhamadazmy added this to To do in backlog via automation Nov 14, 2022
@muhamadazmy muhamadazmy modified the milestones: 3.5.x, 3.6.x Nov 14, 2022
@scottyeager
Author

scottyeager commented Nov 14, 2022

This sounds good, and I agree that adding the data to TF Chain is the best overall solution for the deployment side.

For farmers, I still think it would be nice to have more info, such as about disks that were passed over due to existing data, available over RMB or at least printed to the console. It's pretty common for new farmers to have some disks not detected when they boot up their first nodes, and being able to know whether Zos sees the disks at all would be really helpful.

Not providing a built-in solution to span disks is okay with me for now, too. This can be accomplished after deployment with a single command for btrfs.

I think it would be nice to provide the option to specify which physical disk a virtual disk is reserved on, so users can achieve RAID 1 in software across multiple physical disks, for example, without needing to reserve all available space on the first disk Zos allocates from.

@Parkers145

Users have been bringing up this problem in chat again recently. Have we made any progress here?

@DylanVerstraete
Contributor

I'm not sure why this cannot be queried client side; if a user is interested in a specific node, they can query that node for the disk setup. Adding this data on chain does not seem like the right decision here, as we actually want to minimize the data stored on chain.

@Parkers145

Parkers145 commented Jan 31, 2023

Does that mean every person that uses the grid would have to learn to use GraphQL?

@DylanVerstraete
Contributor

If the data is not stored on chain, it will also not be stored in GraphQL. I meant that our deployment tools can easily fetch this information over RMB from the node itself. There is no reason we should store this data on chain.

@Parkers145

Oh I see, I'm tracking now. I had misunderstood your response.

@scottyeager
Author

@DylanVerstraete, the biggest issue with relying on RMB for this functionality is that users are usually searching for a node that matches their desired deployment specifications. This means that front ends like the playground would potentially need to make hundreds or thousands of RMB calls to filter through nodes that might match based on the initial info available from TF Chain.

For example, a user wants a VM with a 1TB disk. There are ~1500 nodes that have 1 TB free SRU. How many of these can actually support a 1TB disk reservation? We can only know by checking them one by one.
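
To make that cost concrete, here is a rough sketch of the filter-then-probe pattern a front end would be forced into. `findNodesWithFreeSRU` and `largestFreeDisk` are stand-ins for a Grid Proxy query and a per-node RMB call; neither is a real API, and both are stubbed so the example runs:

```go
package main

import "fmt"

const oneTB uint64 = 1 << 40

// Stand-in for a Grid Proxy / TF Chain query: nodes whose total free SRU is large enough.
func findNodesWithFreeSRU(min uint64) []uint32 { return []uint32{1, 2, 3} }

// Stand-in for a per-node RMB call returning the largest free block on any single disk.
func largestFreeDisk(nodeID uint32) uint64 { return 512 << 30 }

func main() {
	candidates := findNodesWithFreeSRU(oneTB) // in practice ~1500 candidate nodes
	var matches []uint32
	for _, id := range candidates {
		// One RMB round trip per candidate, just to learn its disk layout.
		if largestFreeDisk(id) >= oneTB {
			matches = append(matches, id)
		}
	}
	fmt.Printf("%d of %d candidates can actually hold a 1 TB disk\n", len(matches), len(candidates))
}
```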

Grid Proxy could potentially be extended to query and cache this data from the nodes, so it can be used in an efficient way by the front end. I think though that this would be a fairly large extension of its functionality. Maybe @xmonader can weigh in from the perspective of the front ends and the proxy.

Of course we should reduce bloat on TF Chain wherever possible. If we can find an agreeable way to provide a nice user experience without putting this data on chain, that's certainly fine.

@LeeSmet
Contributor

LeeSmet commented Feb 1, 2023

To my knowledge, the gridproxy already provides caching of data, so this should not be too hard. But even if this is not the case, we can't just dump it in tfchain. The chain is not meant to be used as a database. IMO it is already doing way too much, and we need to see how to reduce traffic.

All in all, this should be exposed over RMB, and clients will have to either query individual nodes about their relevant disk layout, or query some intermediate tool that aggregates this information instead.

@muhamadazmy
Member

@LeeSmet yeah, I had a similar chat with @brandonpille and we thought providing an RMB function to show capacity per disk is good enough for the farmerbot to do proper planning, and it also avoids storing this information on the chain.

@muhamadazmy
Member

This functionality is now available on devnet (returning capacity per disk over RMB) and should be used in the farmerbot for capacity planning, hence I will close the issue.
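
For reference, a self-contained sketch of the kind of capacity-planning check the farmerbot (or any client) can do once it has per-disk data from a node. The `DiskInfo` shape is the same hypothetical one sketched earlier in the thread, not a real zos type:

```go
package main

import "fmt"

// DiskInfo mirrors the hypothetical per-disk report sketched earlier in the thread.
type DiskInfo struct {
	Device string
	Type   string // "SSD" or "HDD"
	Free   uint64 // bytes free on this disk
}

// canFit reports whether any single disk of the requested type can hold the reservation.
func canFit(disks []DiskInfo, diskType string, want uint64) bool {
	for _, d := range disks {
		if d.Type == diskType && d.Free >= want {
			return true
		}
	}
	return false
}

func main() {
	disks := []DiskInfo{
		{Device: "/dev/sda", Type: "SSD", Free: 600 << 30},
		{Device: "/dev/sdb", Type: "SSD", Free: 900 << 30},
	}
	// ~1.5 TB free in total, but no single disk can hold a 1 TB reservation.
	fmt.Println(canFit(disks, "SSD", 1<<40)) // false
}
```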

backlog automation moved this from To do to Done Feb 23, 2023