Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PUBDEV-6938 GroupBy - support for grouping by String columns #4594
PUBDEV-6938 GroupBy - support for grouping by String columns #4594
Changes from 1 commit
3a99030
1ece7b8
e19dfb6
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compareTo already does equals, I think doing both here is not efficient
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that it should not return
0
if we hit values that are are equal - based on the previous ordering implementation, as explained in the comment below:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, thats true, than maybe you could do
int res = g1._gsStr[i].compareTo(g2._gsStr[i]); if (res != 0) return res;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, this looks cleaner and more efficient. I changed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of code in this PR handles switching between the 2 arrays (_gs nad _gsStr). This is due to the fact that we can have an arbitrary order of string/numerical columns on the input. However, in the implementation itself, we can re-order the columns to make sure we will have the exact order we want, eg. numerical first then string ones. Would that be something that would help with making the code less complex?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is nothing that comes to my mind that could solve this problem. Switching between
_gs
and_gsStr
is done in 2 places - filling the output frame and sorting it. In both cases it is expected to preserve order of the columns. There is a possibility (at least for the "filling" part) of replacinggbColsTypes
array with 2 arrays defining order for each of the columns in_gs
and_gsStr
. But I don't know whether it significantly reduces the complexity, if at all.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my point of view, two arrays are ok in this case. One array with both types of values has the same complexity; there should also be
if
where we will have to decide what to do with string and numeric separately. Maybe some class can solve this, however, I think it is overengineering in this case.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is better to keep everything in BufferedStrings as much as possible (atStr method), this won't duplicate the memory needed to hold the string, BufferedString just references an existing byte array that holds the actual data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I will adjust it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to use
BufferedString
internally.