Revisit reserved cache and memory size #1781

Closed
muhamadazmy opened this issue Aug 14, 2022 · 4 comments · Fixed by #1925
Labels: type_feature (New feature or request)

@muhamadazmy
Member

Currently zos reserves 100 GB of the SSD storage; we need to revise this value because it's too much.

Also revise the amount of reserved memory for the system.

muhamadazmy self-assigned this Aug 14, 2022
@muhamadazmy
Member Author

I had a quick conversation with Kristof about the reserved system resources. Currently we always have 100 GB of SSD "reserved" for the zos cache. In addition, 10% of the node's memory, with a minimum of 2 GB, is also reserved for the system.

He thinks that's too much (especially the storage) and we need to revise those values. For storage, the 100 GB is just a subvolume quota, but it is taken into consideration while calculating how much free storage is available for workloads.

The problem is that the "reserved" storage amount is not reported by the node; instead, it is currently known only by the gridproxy, which always assumes this amount is used by the system.
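
For concreteness, here is a minimal Go sketch of the reservation rule described above (a fixed 100 GB cache quota, plus 10% of memory with a 2 GB floor). The constant and function names are made up for this example and are not the actual zos code:

```go
package main

import "fmt"

const (
	gib = uint64(1024 * 1024 * 1024)

	// Hypothetical constants mirroring the values described above.
	reservedCacheSize  = 100 * gib // fixed zos cache subvolume quota
	minReservedMemory  = 2 * gib   // lower bound on reserved memory
	reservedMemoryFrac = 0.10      // 10% of total memory
)

// reservedMemory returns the amount of memory held back for the system:
// 10% of the node's total memory, but never less than 2 GiB.
func reservedMemory(totalMemory uint64) uint64 {
	reserved := uint64(float64(totalMemory) * reservedMemoryFrac)
	if reserved < minReservedMemory {
		reserved = minReservedMemory
	}
	return reserved
}

func main() {
	totalMemory := 16 * gib
	totalSSD := 500 * gib

	fmt.Printf("reserved memory: %d GiB\n", reservedMemory(totalMemory)/gib)
	// Free SSD available for workloads after the cache quota is subtracted.
	fmt.Printf("usable SSD: %d GiB\n", (totalSSD-reservedCacheSize)/gib)
}
```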

@muhamadazmy
Member Author

@xmonader I suggest the following to keep changes to the grid to a minimum:

  • Right now, the node reports the FULL SSD capacity as SRU, while it internally reserves 100 GB. Only the proxy knows about this value right now (terraform?) and makes sure to take it into account during capacity planning here. This makes it hard to give each node a different value without changing the node object on the grid to also report a "reserved for system" capacity.
  • Instead, what if the node reports only usable capacity (so total - reserved) for both storage and memory? This way each node can have a different reserved value (and change it dynamically if needed), and the grid proxy doesn't need to know about any node-internal reservation.
  • This will also change minting, of course, which is going to be a problem.
  • Another solution is that the node object should have both "total capacity" (as right now) and a new field "system reserved". Then minting can use the total, and capacity planning can use both (plus active contracts) to filter out nodes (see the sketch after this list).
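
As a rough sketch of that last option, the node object would carry both the total and the system-reserved part, and capacity planning derives the usable part from them. The type and field names below are hypothetical and do not reflect the actual grid schema:

```go
package main

import "fmt"

// Capacity is a hypothetical shape for the capacity figures on the grid.
type Capacity struct {
	SRU uint64 // SSD storage in bytes
	MRU uint64 // memory in bytes
}

// NodeCapacity carries both the full hardware capacity (for minting)
// and the system reservation (for capacity planning).
type NodeCapacity struct {
	Total          Capacity // full hardware capacity, used by minting
	SystemReserved Capacity // reserved for the zos cache and system memory
}

// Usable returns what capacity planning should work with: total minus the
// system reservation (active contracts would be subtracted on top of this).
func (n NodeCapacity) Usable() Capacity {
	return Capacity{
		SRU: n.Total.SRU - n.SystemReserved.SRU,
		MRU: n.Total.MRU - n.SystemReserved.MRU,
	}
}

func main() {
	const gib = uint64(1024 * 1024 * 1024)
	node := NodeCapacity{
		Total:          Capacity{SRU: 1000 * gib, MRU: 64 * gib},
		SystemReserved: Capacity{SRU: 100 * gib, MRU: 6 * gib},
	}
	u := node.Usable()
	fmt.Printf("usable SRU: %d GiB, usable MRU: %d GiB\n", u.SRU/gib, u.MRU/gib)
}
```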

xmonader added this to the 3.4.x milestone Nov 14, 2022
@muhamadazmy
Member Author

I think we need to implement this together with #1830, because this will change what is being reported by the node as full capacity.
The idea is that we start with a small reservation (say 10 GB), then monitor usage of the storage disk and increase the size if needed.
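
A minimal sketch of that "start small and grow" idea, assuming a hypothetical cache type with a quota and a measured usage; in zos the quota would be applied to the actual cache subvolume rather than a struct field:

```go
package main

import "fmt"

const gib = uint64(1024 * 1024 * 1024)

// cache is a hypothetical stand-in for the zos cache subvolume.
type cache struct {
	quota uint64 // current reserved size
	used  uint64 // measured disk usage of the cache
}

// grow bumps the quota by step whenever usage crosses the given fraction
// of the current quota, i.e. "start small and increase if needed".
func (c *cache) grow(threshold float64, step uint64) {
	if float64(c.used) >= threshold*float64(c.quota) {
		c.quota += step
		fmt.Printf("cache quota increased to %d GiB\n", c.quota/gib)
	}
}

func main() {
	c := &cache{quota: 10 * gib} // start with a small reservation

	// Simulated monitoring loop; in zos this would run periodically and
	// read the real usage of the cache subvolume instead.
	for i := 0; i < 12; i++ {
		c.used += 2 * gib // stand-in for measured growth in disk usage
		c.grow(0.8, 10*gib)
	}
}
```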

@muhamadazmy
Copy link
Member Author

Provisiond needs to be aware that the reserved cache size is dynamic and can change at runtime.
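
One way provisiond could handle that is to query the reservation on every calculation instead of reading it once at startup. The interface and names below are hypothetical and only illustrate the idea:

```go
package main

import "fmt"

// ReservedCache is a hypothetical accessor provisiond could use instead of
// a hard-coded constant, since the reservation can change at runtime.
type ReservedCache interface {
	// Current returns the cache reservation in bytes at call time.
	Current() uint64
}

// freeForWorkloads recomputes the headroom on every call instead of
// caching the reserved size once at startup.
func freeForWorkloads(totalSSD uint64, r ReservedCache) uint64 {
	return totalSSD - r.Current()
}

// fixed is a trivial implementation used only for this example.
type fixed uint64

func (f fixed) Current() uint64 { return uint64(f) }

func main() {
	const gib = uint64(1024 * 1024 * 1024)
	fmt.Printf("free for workloads: %d GiB\n",
		freeForWorkloads(500*gib, fixed(10*gib))/gib)
}
```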
