Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Analytic functions #606

Merged
merged 12 commits into from Aug 18, 2019
Merged

Analytic functions #606

merged 12 commits into from Aug 18, 2019

Conversation

ryancerf
Copy link
Collaborator

@ryancerf ryancerf commented Aug 17, 2019

Added analytic function implementation as discussed in #582. This is experimental, incomplete and intended to spark discussion.

It is missing the following.

  • Javadoc!!!
  • Implementations for many of the analytic aggregate functions.
  • Implementation of executeInPlace.
  • API and implementation for allowing users to add consumers.

Testing

Instead of hand rolling tests for the main logic I used the bush dataset, added 50 missing values and wrote SQL queries for all the relevant windows. The tests check to make sure this implementation exactly matched what BigQuery produced.

Copy link
Collaborator

@lwhite1 lwhite1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't checked this out and really looked at it yet. I expect when I do the structure will become very clear. Also assuming that the build will address the few formatting issues (like the extra blank line at the end of the WindowSpecificationTest)

I love the number of tests you're providing. I will leave this open for @benmccann to review since it's very important. I am most interested in what it "feels like" to work with it, which will take a little while to determine. This is going to be great when it's ready.

@@ -0,0 +1,15 @@
package tech.tablesaw.analytic;

public interface AggregateFunction<T, R> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know your comment said the javadoc is incomplete. I hope you'll to add some here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This PR needs a ton of javadoc. I can also make this interface package private.


@Override
public void removeLeftMost() {
throw new IllegalArgumentException("Should never call remove on fixed function");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is IllegalArgument the right exception here? I would have thought "UnsupportedOperation", as is used in java collections for things that shouldn't be called on particular implementations of interfaces.
otoh,
The use of UnsupportedOperation below for things that are not yet implemented is also a little troubling, because the JDK usage of UnsupportedOperation refers to things that will never be implemented, rather than things that aren't yet implemented.

I think it's ok as is, as long as we hold the release until it's fully baked.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IllegalState might make sense as well.

}

public String toSqlString() {
StringBuilder sb = new StringBuilder();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is SqlString just for documentation? If so, maybe DocString? Unless we're using it for querying I'm not sure we should use a name that suggests it's really SQL that you could run against a real database.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I wrote it as documentation. I do not see why users would need to run it against an actual database. Maybe I can rename this toSqlLikeString and add javadoc.

@ryancerf
Copy link
Collaborator Author

@lwhite1

Thanks for taking a look. I will work on cleaning it up this week (javaoc formatting naming etc). Looking forward to getting your thoughts on how it feels to work with and how we can improve it!

@lwhite1 lwhite1 merged commit aba18f1 into jtablesaw:master Aug 18, 2019
@ryancerf ryancerf deleted the analyticFunctions branch September 3, 2019 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants