Implement more efficient uniq aggregators #51
Comments
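I am not sure we can gain much by re-implementing a custom aggregator for uniq. The partial inputs to the aggregator are already unique, so a custom aggregator would only need to compare lines at the n input boundaries, which I don't see yielding serious improvements over simply re-running uniq.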
There is a first version of a custom aggregator for |
I was thinking of adding auxiliary map outputs (the end line of each chunk) so that we don't even have to traverse the whole file: knowing the end lines would let us just copy the file without ever touching its contents. However, I think you might be right, this doesn't sound like a big performance win. Let's make it low priority.
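To make the boundary argument concrete, here is a minimal Python sketch of such an aggregator. It assumes each chunk is the output of uniq on one partition, given as a list of lines; the function name merge_uniq is hypothetical and not part of PaSh. Only the first line of each chunk is compared against the last line of the preceding chunk, and everything else is passed through unchanged.

```python
from typing import Iterable, Iterator, List, Optional


def merge_uniq(chunks: Iterable[List[str]]) -> Iterator[str]:
    """Concatenate per-chunk `uniq` outputs, deduplicating only at boundaries.

    Hypothetical helper: each chunk is assumed to already be `uniq` output,
    so adjacent duplicates can only occur where two chunks meet.
    """
    last: Optional[str] = None  # last line of the most recent non-empty chunk
    for chunk in chunks:
        for i, line in enumerate(chunk):
            # Only the first line of a chunk can repeat the previous tail.
            if i == 0 and line == last:
                continue
            yield line
        if chunk:
            last = chunk[-1]


if __name__ == "__main__":
    parts = [["a", "b"], ["b", "c"], ["c"]]
    print(list(merge_uniq(parts)))
```

With the end line of each chunk available as an auxiliary output, a real implementation would only need to read the first line of each chunk and could copy the remainder as a block; the sketch iterates over every line purely for simplicity.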
That's actually a great idea, I didn't think of that.
On Thu, Dec 10, 2020 at 08:26, Konstantinos Kallas (
notifications@github.com) wrote:
… I am not sure we can gain much by re-implementing a custom aggregator for
uniq. The partial inputs to the aggregator are already unique, so a custom
aggregator would only need to compare lines at the *n* input boundaries,
which I don't see yielding serious improvements over simply re-running
uniq.
I was thinking of adding auxiliary map outputs (the end line) so that we
don't even have to traverse the whole file, because this would allow us to
just copy the file without ever touching it. However, I think you might be
right, this doesn't sound like a big performance win. Let's make it low
priority.
This can also be applied to tr -s ' ' '\n' and other similar commands that remove adjacent duplicates.
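As a rough illustration of the same boundary idea for squeezing, the Python sketch below assumes each chunk is the output of tr -s ' ' '\n' on one partition, so runs of newlines are already squeezed inside every chunk. The name merge_squeezed and its squeezed parameter are hypothetical, and the sketch handles only a single squeezed byte.

```python
from typing import Iterable


def merge_squeezed(chunks: Iterable[bytes], squeezed: bytes = b"\n") -> bytes:
    """Concatenate already-squeezed chunks, squeezing only across boundaries.

    Hypothetical helper: `squeezed` must be a single byte. Because each chunk
    is already squeezed, at most one leading byte per chunk can be redundant.
    """
    out = bytearray()
    for chunk in chunks:
        # Drop the leading byte if it would repeat the byte ending `out`.
        if out and chunk and out[-1:] == squeezed and chunk[:1] == squeezed:
            chunk = chunk[1:]
        out += chunk
    return bytes(out)


if __name__ == "__main__":
    print(merge_squeezed([b"a\nb\n", b"\nc\n"]))
```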
Closing, since this issue now concerns the annotation library rather than PaSh itself.
At the moment the aggregator for uniq is uniq, and there is no implemented aggregator for uniq -c. We could implement very efficient aggregators for both of them.
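For uniq -c, one possible shape for such an aggregator is sketched below in Python. It is an assumption-laden illustration, not the PaSh implementation: the names parse_count_line and merge_uniq_c are hypothetical, each chunk is assumed to be a list of uniq -c output lines without trailing newlines, and the output is re-formatted with a right-aligned count of width 7, which is assumed to match GNU uniq. Only the lines on either side of a chunk boundary are parsed; when they carry the same text, their counts are summed.

```python
import re
from typing import Iterable, List, Tuple

# Assumed line shape: optional padding, a count, one space, then the text.
_COUNT_LINE = re.compile(r"\s*(\d+) (.*)")


def parse_count_line(line: str) -> Tuple[int, str]:
    """Split one `uniq -c` output line into (count, text)."""
    match = _COUNT_LINE.fullmatch(line)
    if match is None:
        raise ValueError(f"not a uniq -c line: {line!r}")
    return int(match.group(1)), match.group(2)


def merge_uniq_c(chunks: Iterable[List[str]]) -> List[str]:
    """Merge per-chunk `uniq -c` outputs, summing counts at chunk boundaries."""
    out: List[str] = []
    for chunk in chunks:
        if not chunk:
            continue
        rest = chunk
        if out:
            prev_count, prev_text = parse_count_line(out[-1])
            first_count, first_text = parse_count_line(chunk[0])
            if prev_text == first_text:
                # Same run continues across the boundary: combine the counts.
                out[-1] = f"{prev_count + first_count:7d} {prev_text}"
                rest = chunk[1:]
        out.extend(rest)
    return out


if __name__ == "__main__":
    left = [f"{2:7d} a", f"{1:7d} b"]
    right = [f"{3:7d} b", f"{1:7d} c"]
    print("\n".join(merge_uniq_c([left, right])))
```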