New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: extent based namespaces #3208
feat: extent based namespaces #3208
Conversation
…/jackgerrits/vowpal_wabbit into jagerrit/extent_based_namespaces
…/jackgerrits/vowpal_wabbit into jagerrit/extent_based_namespaces
vowpalwabbit/feature_group.cc
Outdated
@@ -140,13 +248,50 @@ bool features::sort(uint64_t parse_mask) | |||
apply_permutation_in_place(values, dest_index_vec); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worried about allocations that can happen in steady state:
- sort operation (seems avoidable)
- flatten_namespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort is very rarely used, it is only used with limit
and sort_features
. It is also done only once per example, so in practice this will not matter. Trying to extend what the existing sort implementation was doing to include extents would be quite impractical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed the apply function was rolled into one to avoid reallocating the done vec
@@ -10,6 +10,7 @@ table Namespace { | |||
name:string; | |||
hash:uint8; | |||
features:[Feature]; | |||
full_hash:uint64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a /// comment for this element would be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. I'd rather call this hash but that is already taken by the other element which is really the index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I agree with your preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I switch them? I am worried that will break things. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to leave as is with a comment
@@ -10,6 +10,7 @@ table Namespace { | |||
name:string; | |||
hash:uint8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a /// comment for this element would be useful.
Yep. Looks good. Thanks. |
…/jackgerrits/vowpal_wabbit into jagerrit/extent_based_namespaces
No description provided.