New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Analytic functions #606
Analytic functions #606
Conversation
…to use faster algo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't checked this out and really looked at it yet. I expect when I do the structure will become very clear. Also assuming that the build will address the few formatting issues (like the extra blank line at the end of the WindowSpecificationTest)
I love the number of tests you're providing. I will leave this open for @benmccann to review since it's very important. I am most interested in what it "feels like" to work with it, which will take a little while to determine. This is going to be great when it's ready.
@@ -0,0 +1,15 @@ | |||
package tech.tablesaw.analytic; | |||
|
|||
public interface AggregateFunction<T, R> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know your comment said the javadoc is incomplete. I hope you'll to add some here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This PR needs a ton of javadoc. I can also make this interface package private.
|
||
@Override | ||
public void removeLeftMost() { | ||
throw new IllegalArgumentException("Should never call remove on fixed function"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is IllegalArgument the right exception here? I would have thought "UnsupportedOperation", as is used in java collections for things that shouldn't be called on particular implementations of interfaces.
otoh,
The use of UnsupportedOperation below for things that are not yet implemented is also a little troubling, because the JDK usage of UnsupportedOperation refers to things that will never be implemented, rather than things that aren't yet implemented.
I think it's ok as is, as long as we hold the release until it's fully baked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IllegalState might make sense as well.
} | ||
|
||
public String toSqlString() { | ||
StringBuilder sb = new StringBuilder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is SqlString just for documentation? If so, maybe DocString? Unless we're using it for querying I'm not sure we should use a name that suggests it's really SQL that you could run against a real database.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I wrote it as documentation. I do not see why users would need to run it against an actual database. Maybe I can rename this toSqlLikeString and add javadoc.
Thanks for taking a look. I will work on cleaning it up this week (javaoc formatting naming etc). Looking forward to getting your thoughts on how it feels to work with and how we can improve it! |
Added analytic function implementation as discussed in #582. This is experimental, incomplete and intended to spark discussion.
It is missing the following.
Testing
Instead of hand rolling tests for the main logic I used the bush dataset, added 50 missing values and wrote SQL queries for all the relevant windows. The tests check to make sure this implementation exactly matched what BigQuery produced.