Better handling of the fact that nodes have individual disks #1830
Yes, I have been raising this issue (internally) for a while. It's indeed very bad that we show the capacity of the total amount of disks instead of individual disks, for exactly the reasons you mentioned. But I would like to comment on the approach.
Doing this query over RMB would of course be easier, since it doesn't require model changes on the chain. But the problem then is that selecting a node that matches a workload requires a lot of queries to multiple nodes, which will slow things down a lot.
This sounds good, and I agree that adding the data to TF Chain is the best overall solution for the deployment side. For farmers, I still think it would be nice to have more info available over RMB, or at least printed to the console, such as which disks were passed over due to existing data. It's pretty common for new farmers to have some disks go undetected when they boot up their first nodes, and knowing whether Zos sees the disks at all would be really helpful. Not providing a built-in solution to span disks is okay with me for now too; this can be accomplished after deployment with a single command for btrfs. I also think it could be nice to provide the option to specify which physical disk a virtual disk is reserved on. That way users can achieve RAID 1, for example, in software across multiple physical disks without needing to reserve all available space on the first disk Zos allocates from.
Users have been bringing this problem up in chat again recently. Have we made any progress here?
I'm not sure why this cannot be queried client side. If a user is interested in a specific node, they can query that node for its disk setup. Adding this data on chain does not seem like the right decision here; we actually want to minimize data stored on chain.
Does that mean every person who uses the grid would have to learn to use GraphQL?
If the data is not stored on chain, it also will not be stored in GraphQL. I meant that our deployment tools can easily fetch this information over RMB from the node itself. There is no reason we should store this data on chain.
Oh I see, I'm tracking now. I had misunderstood your response.
@DylanVerstraete, the biggest issue with relying on RMB for this functionality is that users are usually searching for a node that matches their desired deployment specifications. This means that front ends like the playground would potentially need to make hundreds or thousands of RMB calls to filter through nodes that might match based on the initial info available from TF Chain. For example, say a user wants a VM with a 1 TB disk, and there are ~1500 nodes that have 1 TB of free SRU. How many of these can actually support a 1 TB disk reservation? We can only know by checking them one by one. Grid Proxy could potentially be extended to query and cache this data from the nodes, so it can be used efficiently by the front end, though I think that would be a fairly large extension of its functionality. Maybe @xmonader can weigh in from the perspective of the front ends and the proxy. Of course we should reduce bloat on TF Chain wherever possible. If we can find an agreeable way to provide a nice user experience without putting this data on chain, that's certainly fine.
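To make the cost concrete, here is a minimal sketch of the per-node filtering described above. Everything here is hypothetical: `query_disks` stands in for an RMB round trip, and the node data is made up.

```python
# Hypothetical sketch: finding nodes that can honor a single disk
# reservation when per-disk sizes are only available by querying
# each candidate node individually (one RMB round trip per node).

TB = 10**12

# Made-up data: node 1 has one big disk; node 2 has more total free
# space than 1 TB, but spread across two smaller disks.
FAKE_NODE_DISKS = {1: [2 * TB], 2: [600 * 10**9, 500 * 10**9]}

def query_disks(node_id):
    """Placeholder for an RMB call; returns free bytes per disk."""
    return FAKE_NODE_DISKS[node_id]

def nodes_supporting_disk(candidate_ids, size_bytes):
    """Each candidate costs one network round trip -- with ~1500
    candidates this loop is the slow part described above."""
    matches = []
    for node_id in candidate_ids:
        disks = query_disks(node_id)
        if any(free >= size_bytes for free in disks):
            matches.append(node_id)
    return matches

print(nodes_supporting_disk([1, 2], 1 * TB))  # -> [1]
```

Node 2 is filtered out even though its total free SRU exceeds 1 TB, which is exactly the distinction the chain-level totals can't express today.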
To my knowledge, the gridproxy already provides caching of data, so this should not be too hard. But even if that's not the case, we can't just dump it in tfchain. The chain is not meant to be used as a database. In my opinion, it is already doing way too much, and we need to see how to reduce traffic. All in all, this should be exposed over RMB, and clients will have to either query individual nodes about their relevant disk layout, or some intermediate tool can aggregate this and clients can query that tool instead.
@LeeSmet yeah, I had a similar chat with @brandonpille, and we thought providing an RMB function to show capacity per disk is good enough for the farmerbot to do proper planning, while also avoiding storing this information on the chain.
This functionality (returning disk capacity per disk over RMB) is now available on devnet and should be used in the farmerbot for capacity planning. Hence I will close the issue.
Currently, 3Nodes don't make any information available about the size of the individual disks they contain. Storage appears instead as a single total quantity of SSD and, when applicable, HDD available for reservations. This can create the appearance that a node can support a disk reservation up to the available quantity, when in fact it can only support a reservation up to the largest available block on a single disk.
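The gap can be shown with a small sketch. The disk sizes are made-up example values; the point is that the sum of free space and the largest single-disk reservation are different numbers, and only the first is visible today.

```python
# Illustration of the reporting gap: totals hide per-disk limits.

GB = 10**9

def total_free(disks):
    """What the node currently reports: one aggregate SRU figure."""
    return sum(disks)

def max_single_reservation(disks):
    """The largest single disk reservation the node can actually honor,
    bounded by the largest free block on any one disk."""
    return max(disks)

disks = [1000 * GB, 500 * GB, 250 * GB]  # three SSDs in one node

print(total_free(disks) // GB)              # -> 1750
print(max_single_reservation(disks) // GB)  # -> 1000
```

A deployer seeing 1750 GB of free SRU might reasonably attempt a 1.5 TB disk reservation, which this node cannot satisfy.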
To improve the experience for deployers, we could:
Likewise, farmers can benefit from greater visibility into the disks in their nodes. This can help when disks are not recognized, incorrectly recognized (SSD as HDD), or have failed. For farmers:
```
/dev/sda == 1TB SSD
/dev/sdb == 2TB HDD (existing filesystem detected, disk not used)
```