Doc: flatgeobuf: document amount of RAM needed for packed Hilbert RTree building #8490

Merged
merged 2 commits into from
Oct 2, 2023

rouault
Member

@rouault rouault commented Sep 28, 2023

@rouault
Member Author

rouault commented Sep 28, 2023

CC @bjornharrtell Can you confirm or refute my theory about the RAM requirement? sizeof(NodeItem) == 40, clearly, and we need 2 instances per feature: one in the list provided to the PackedRTree constructor, and another one in PackedRTree._nodeItems[]?

@bjornharrtell
Contributor

Without diving deep, I would say that seems correct, yes. But I don't immediately understand or recall why I had to have this double representation.

@bjornharrtell
Contributor

Looked into it a bit and think it should be possible to avoid the duplication, but I'm not sure when I will find the time and brain power to do it so I think this docs change is correct for now.

@bjornharrtell
Contributor

@rouault actually "at least the number of features times 80 bytes" can be misleading, it's a bit more. The private _nodeItems also include nodes for all levels above the lowest.

@rouault
Member Author

rouault commented Oct 1, 2023

> The private _nodeItems also include nodes for all levels above the lowest.

Any estimate of that amount? Perhaps something involving log2(number of features)?

@bjornharrtell
Contributor

It depends on node size but in GDAL it's not configurable so it's set to 16. The full tree size can be calculated like this:

    uint64_t n = numItems;
    uint64_t numNodes = n;
    do
    {
        n = (n + nodeSizeMin - 1) / nodeSizeMin;
        numNodes += n;
    } while (n != 1);
    return numNodes * sizeof(NodeItem);

Where, in the case of GDAL, nodeSizeMin is 16. Subtract numItems * 40 from the result and that should be the size of the part of the tree that is above the feature level. There is probably some nice mathematical way to formulate this but I'm not clever enough right now. :)
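
For readers who want to try the calculation above, here is a minimal standalone sketch of it. The helper names `treeNodeCount` and `upperLevelBytes` are hypothetical and not part of the FlatGeobuf or GDAL API; the 40-byte item size and the node size of 16 are the values quoted in this thread, hard-coded as assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Assumption from the thread: sizeof(NodeItem) == 40 bytes.
constexpr uint64_t kNodeItemSize = 40;

// Hypothetical helper: number of nodes in the whole packed tree
// (leaf level plus every level above it, including the root).
// nodeSize defaults to 16, the value the thread says GDAL uses.
inline uint64_t treeNodeCount(uint64_t numItems, uint64_t nodeSize = 16)
{
    assert(numItems > 0 && nodeSize >= 2);
    uint64_t n = numItems;
    uint64_t numNodes = n;
    do
    {
        // Each level holds ceil(n / nodeSize) parent nodes.
        n = (n + nodeSize - 1) / nodeSize;
        numNodes += n;
    } while (n != 1);
    return numNodes;
}

// Hypothetical helper: bytes used by the levels above the feature level,
// i.e. the full tree size minus numItems * 40.
inline uint64_t upperLevelBytes(uint64_t numItems)
{
    return (treeNodeCount(numItems) - numItems) * kNodeItemSize;
}
```

With these assumptions, `treeNodeCount(1000000)` gives 1,066,669 nodes, so the levels above the features add 66,669 nodes, or about 2.7 MB.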

@rouault
Member Author

rouault commented Oct 1, 2023

> There is probably some nice mathematical way to formulate this

So, assuming numItems is a power of 16 to simplify things (and if it is not, the result should be really close), the number of nodes above the feature level is numItems / 16 + numItems / 16**2 + numItems / 16**3 + ... until infinity (which comes much sooner than infinity). That is a geometric series with common ratio 1/16 and first term numItems / 16, which converges to (numItems / 16) / (1 - 1/16) = numItems / 15 ~= 0.06667 * numItems.
So the total size required would be (2 + 0.06667) * 40 * numItems ~= 83 * numItems bytes.
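
As a rough check of this estimate, the sketch below compares it with the exact count from the loop given earlier in the thread. The function names `exactBuildBytes` and `approxBuildBytes` are hypothetical. It assumes, as discussed above, a 40-byte NodeItem, a node size of 16, and two copies of the leaf items (the input list plus the leaf level of `_nodeItems`).

```cpp
#include <cassert>
#include <cstdint>

// Assumptions from the thread: 40-byte NodeItem, node size 16.
constexpr uint64_t kItemBytes = 40;
constexpr uint64_t kNodeSize = 16;

// Hypothetical helper: exact bytes for the input item list plus the
// full _nodeItems array (leaf level and all levels above it).
inline uint64_t exactBuildBytes(uint64_t numItems)
{
    assert(numItems > 0);
    uint64_t n = numItems;
    uint64_t numNodes = n;
    do
    {
        n = (n + kNodeSize - 1) / kNodeSize;
        numNodes += n;
    } while (n != 1);
    // One copy in the input list, one tree holding numNodes items.
    return (numItems + numNodes) * kItemBytes;
}

// Hypothetical helper: the rounded closed-form estimate, 83 bytes per feature.
inline uint64_t approxBuildBytes(uint64_t numItems)
{
    return 83 * numItems;
}
```

For one million features the exact figure is 82,666,760 bytes against an estimate of 83,000,000, a difference of about 0.4%. For very small feature counts the estimate can fall short, because the per-level rounding and the root node weigh more: 17 features need 1,480 bytes, while 83 * 17 is 1,411.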

@rouault rouault merged commit df2db95 into OSGeo:master Oct 2, 2023
1 of 2 checks passed