Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Fetch API to typed fields #67

Merged
merged 1 commit into from
Jan 11, 2019
Merged

Add Fetch API to typed fields #67

merged 1 commit into from
Jan 11, 2019

Conversation

xichen2020
Copy link
Owner

cc @notbdu

This PR adds Fetch to typed field interfaces, which allows us to fetch values from typed fields for a given doc ID set determined from a doc ID set iterator passed in.

@xichen2020 xichen2020 merged commit 138c65f into master Jan 11, 2019
@xichen2020 xichen2020 deleted the xichen-field-fetch-api branch January 11, 2019 18:08
}
it.currDocID = it.seekableDocIt.DocID()
} else {
for i := 0; i < distance; i++ {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you see value wrapping the docIt inside a wrapper type that implements SeekableDocIDSetIterator? ie the wrapper implements SeekForward and uses this code here. Inside NewAtPositionDocIDSetIterator, you always wrap docIt inside a SeekableDocIDSetIterator if it doesn't already implement it. Should make this function more readable.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah possibly, although a bit wary of introducing yet another wrapper type that can be implemented in a few lines. I'd say if there's another place that needs this logic then yes, otherwise maybe not.

curr int32
}

func newbitmapBasedDocIDIterator(rit *roaring.Iterator) *bitmapBasedDocIDIterator {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/newbi/newBi

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

@@ -15,6 +15,10 @@ type DocIDSet interface {
// Iter returns the document ID set iterator.
Iter() DocIDSetIterator

// Fetch returns the set of positions in the current doc ID set that
// are also in the doc ID set given by the iterator passed in.
Fetch(it DocIDSetIterator) DocIDPositionIterator
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this more Intersect than Fetch? also, isn't this more appropriate given this is supposed to be a "set"

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah my thinking was if at some point we want to treat the doc IDs that are in the set passed in but not in the current set differently then the name Fetch could potentially make more sense, but I can change it to Intersect as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

treat the doc IDs that are in the set passed in but not in the current set differently

Maybe we can call that case difference? I do like the idea of using traditional set function names.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is semantically Intersect though, change included in the next PR

}
it.currPosition++
it.currDocID = it.maskingIt.DocID()
if it.currDocID < it.numTotalDocs {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be totally misunderstanding something; I was under the assumption that doc ids are monotonically increasing, ie the DocIDSetIterator might start at 1000 and return 1000, 1001, 1002, etc. However here we're comparing against a full doc ID set representing [0, numTotalDocs) which can only occur naturally once given the doc ids are monotonically increasing.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm not sure I get what you meant by "can only occur naturally once"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given it.currDocID < it.numTotalDocs is an invariant, can we simplify this to:

         if it.done {
		return false
	}
	if !it.maskingIt.Next() {
		it.done = true
		return false
	}
	it.currPosition++
	it.currDocID = it.maskingIt.DocID()
        return true

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would prefer to be a bit defensive here in case the masking iterator goes out of range.

if it.backingDone {
return false
}
if it.maskingDocID == invalidDocID {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this not be done upfront in the constructor so we don't have to check it every time?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sounds good, can do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants