You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
Following #65 . The size of the hash table of visited nodes for NGQueue is set to 16394 ("two" hash table of size 8192). However, when searching in a large data-set and setting the "MaxCheck" to a relatively large number (i.e. 16394), the hash table would be full during the searching process. After that, every node that is not in the hash table (either visit for first time or not first time) would result in a -1 return value from _CheckAndSet function in WorkSpace.h. Since the CheckAndSet function only check if the return value from _CheckAndSet is equal to 0, all those nodes with -1 return value would be added to the NGQueue, causing duplicated search and inaccurate result (i.e. duplicate) not only during actual searching process, but also during the refine stage of the building process (which use searching result to update RNG).
To Reproduce
Steps to reproduce the behavior:
Build with a large data-set. (theoretically, a data-set larger than 16394 would be above to reproduce using extreme parameter, but a large data-set, i.e. 1M, would be easier to reproduce)
The behavior is already reproduced during refine stage of the building process.
Search with "MaxCheck" setting to a number larger than 16394.
The result of the search may contains duplicate result.
Expected behavior
Either:
Hash table is large enough to store all the visited node.
Or:
Proper handle of -1 return value from _CheckAndSet function or proper handle of the situation that hash table becomes full.
Desktop (please complete the following information):
OS: Windows 10
Version : Current master version
Additional context
Have tried to set the m_poolSize to a large number (i.e. 65535), does solve the problem ("MaxCheck" set to 16394), but the problem would still occur if using a much larger "MaxCheck". Since there is at least the memory space issue, and search efficiency, should not simply set the m_poolSize to a very large number.
Maybe set the m_poolSize according to "MaxCheckForRefineGraph" during build stage, and according to "MaxCheck" during search stage.
The text was updated successfully, but these errors were encountered:
Describe the bug
Following #65 . The size of the hash table of visited nodes for NGQueue is set to 16394 ("two" hash table of size 8192). However, when searching in a large data-set and setting the "MaxCheck" to a relatively large number (i.e. 16394), the hash table would be full during the searching process. After that, every node that is not in the hash table (either visit for first time or not first time) would result in a -1 return value from _CheckAndSet function in WorkSpace.h. Since the CheckAndSet function only check if the return value from _CheckAndSet is equal to 0, all those nodes with -1 return value would be added to the NGQueue, causing duplicated search and inaccurate result (i.e. duplicate) not only during actual searching process, but also during the refine stage of the building process (which use searching result to update RNG).
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Either:
Hash table is large enough to store all the visited node.
Or:
Proper handle of -1 return value from _CheckAndSet function or proper handle of the situation that hash table becomes full.
Desktop (please complete the following information):
Additional context
Have tried to set the m_poolSize to a large number (i.e. 65535), does solve the problem ("MaxCheck" set to 16394), but the problem would still occur if using a much larger "MaxCheck". Since there is at least the memory space issue, and search efficiency, should not simply set the m_poolSize to a very large number.
Maybe set the m_poolSize according to "MaxCheckForRefineGraph" during build stage, and according to "MaxCheck" during search stage.
The text was updated successfully, but these errors were encountered: