New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong value being potentially read? #26

Closed
ayende opened this Issue Aug 31, 2018 · 3 comments

Comments

Projects
None yet
2 participants
@ayende

ayende commented Aug 31, 2018

I might be missing something, but consider the case of hash collision during read.

https://github.com/Microsoft/FASTER/blob/master/cc/src/core/faster.h#L1132

vs.

foundPhysicalAddress = Constants.kInvalidAddress;

I'm not showing the handling of missing out on hash collision there? It looks to me like the C# version does the correct check, but the C++ version assumes that there will always be a value.

@jahunter-m

This comment has been minimized.

Show comment
Hide comment
@jahunter-m

jahunter-m Sep 4, 2018

Member

Let me think about this--in the C++ version, we exit the loop (and the function) iff either we (1) find a matching key (line 1137) or we end up with from_address < min_address. In the former case, we return the address of the matching record; in the latter case, we return that from_address (which is < min_address).

Member

jahunter-m commented Sep 4, 2018

Let me think about this--in the C++ version, we exit the loop (and the function) iff either we (1) find a matching key (line 1137) or we end up with from_address < min_address. In the former case, we return the address of the matching record; in the latter case, we return that from_address (which is < min_address).

@jahunter-m

This comment has been minimized.

Show comment
Hide comment
@jahunter-m

jahunter-m Sep 4, 2018

Member

In C#, if we don't find a match with foundLogicalAddress >= minOffset, then we return kInvalidAddress.

So, in C++: if we find a hash collision, then the test in line 1136 returns false, so we continue the loop, looking at the record's previous address.

If we do not find a match in any record with address >= min_offset (i.e., all records with address >= min_offset are hash collisions!), then we return the last address we saw, which will always be < min_offset.

Member

jahunter-m commented Sep 4, 2018

In C#, if we don't find a match with foundLogicalAddress >= minOffset, then we return kInvalidAddress.

So, in C++: if we find a hash collision, then the test in line 1136 returns false, so we continue the loop, looking at the record's previous address.

If we do not find a match in any record with address >= min_offset (i.e., all records with address >= min_offset are hash collisions!), then we return the last address we saw, which will always be < min_offset.

@jahunter-m

This comment has been minimized.

Show comment
Hide comment
@jahunter-m

jahunter-m Sep 4, 2018

Member

In the C++ version, all of function TraceBackForKeyMatch()'s callers pass "head_address" as the min_offset. This means that TraceBackForKeyMatch() will look only at the in-memory portion of the log (i.e., the log's tail). If we don't find a match in memory, then TraceBackForKeyMatch() returns the first address of a potentially-matching record on disk. Note that this address may be kInvalidAddress, if the last record we read in memory has no previous record stored on disk.

So I think the C++ code correctly handles hash collisions, although it may do this in a different way that the C# version does.

Thanks for the comment; feel free to reopen it if my reasoning, above, is flawed.

Member

jahunter-m commented Sep 4, 2018

In the C++ version, all of function TraceBackForKeyMatch()'s callers pass "head_address" as the min_offset. This means that TraceBackForKeyMatch() will look only at the in-memory portion of the log (i.e., the log's tail). If we don't find a match in memory, then TraceBackForKeyMatch() returns the first address of a potentially-matching record on disk. Note that this address may be kInvalidAddress, if the last record we read in memory has no previous record stored on disk.

So I think the C++ code correctly handles hash collisions, although it may do this in a different way that the C# version does.

Thanks for the comment; feel free to reopen it if my reasoning, above, is flawed.

@jahunter-m jahunter-m closed this Sep 4, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment