Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash table of visited nodes for NGQueue #79

Open
RioMichael opened this issue Jul 12, 2019 · 0 comments
Open

Hash table of visited nodes for NGQueue #79

RioMichael opened this issue Jul 12, 2019 · 0 comments

Comments

@RioMichael
Copy link

RioMichael commented Jul 12, 2019

Describe the bug
Following #65 . The size of the hash table of visited nodes for NGQueue is set to 16394 ("two" hash table of size 8192). However, when searching in a large data-set and setting the "MaxCheck" to a relatively large number (i.e. 16394), the hash table would be full during the searching process. After that, every node that is not in the hash table (either visit for first time or not first time) would result in a -1 return value from _CheckAndSet function in WorkSpace.h. Since the CheckAndSet function only check if the return value from _CheckAndSet is equal to 0, all those nodes with -1 return value would be added to the NGQueue, causing duplicated search and inaccurate result (i.e. duplicate) not only during actual searching process, but also during the refine stage of the building process (which use searching result to update RNG).

To Reproduce
Steps to reproduce the behavior:

  1. Build with a large data-set. (theoretically, a data-set larger than 16394 would be above to reproduce using extreme parameter, but a large data-set, i.e. 1M, would be easier to reproduce)
  2. The behavior is already reproduced during refine stage of the building process.
  3. Search with "MaxCheck" setting to a number larger than 16394.
  4. The result of the search may contains duplicate result.

Expected behavior
Either:
Hash table is large enough to store all the visited node.
Or:
Proper handle of -1 return value from _CheckAndSet function or proper handle of the situation that hash table becomes full.

Desktop (please complete the following information):

  • OS: Windows 10
  • Version : Current master version

Additional context
Have tried to set the m_poolSize to a large number (i.e. 65535), does solve the problem ("MaxCheck" set to 16394), but the problem would still occur if using a much larger "MaxCheck". Since there is at least the memory space issue, and search efficiency, should not simply set the m_poolSize to a very large number.
Maybe set the m_poolSize according to "MaxCheckForRefineGraph" during build stage, and according to "MaxCheck" during search stage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant