Skip to content
This repository was archived by the owner on May 12, 2021. It is now read-only.

TAJO-919: Implement LAG and LEAD window functions#351

Closed
sirpkt wants to merge 5 commits intoapache:masterfrom
sirpkt:TAJO-919
Closed

TAJO-919: Implement LAG and LEAD window functions#351
sirpkt wants to merge 5 commits intoapache:masterfrom
sirpkt:TAJO-919

Conversation

@sirpkt
Copy link
Copy Markdown
Contributor

@sirpkt sirpkt commented Jan 14, 2015

This patch contains following changes

  • lead, lag are implemented. lag is implemented as window function as it only sees previous rows while lead is implemented as aggregation function since it should see next rows.
  • visitWindowFunction() in ExprAnnotator.java and TypeDeterminant.java is modified to support multiple arguments because current implementation only supports one argument for window function.

There are some arguable points

  • It assumes that offset value is non-negative.
  • Way to handle aggregation function in WindowAggExec.java is modified to call terminate for every Tuple, and it provides way to return different value for each terminate call, which is essential to implement lead function within current window supporting structure. However, it adds additional burden for other aggregation functions calculation because their terminate calls always induce the same calculation and return the same results.
  • lead implementation temporarily holds all the rows in the same partition in memory.

Any idea about above points and about other parts of the patch, please.

I checked 'mvn clean install'

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be 'nth next row' instead of 'nth previous ...'.

@hyunsik
Copy link
Copy Markdown
Member

hyunsik commented Jan 15, 2015

I'll review it soon. This feature is very desired in many applications. It would be great if we add this feature to 0.10.

@hyunsik
Copy link
Copy Markdown
Member

hyunsik commented Jan 15, 2015

According to https://wiki.postgresql.org/wiki/SQL2008_windowing_queries, the offset is a positive integer (offset > 0). So, your assumption would be right. If offset or default value is omitted, default offset will be 1 and default value will be null.

@hyunsik
Copy link
Copy Markdown
Member

hyunsik commented Jan 15, 2015

lead implementation temporarily holds all the rows in the same partition in memory.

Do you mean a LinkedList in LeadContext? It does not seem to keep all rows in the same partition.

@sirpkt
Copy link
Copy Markdown
Contributor Author

sirpkt commented Jan 15, 2015

Thank you for the review, @hyunsik.
I'll fixed the wrong description.

Yes, I mean that LinkedList.
It does not keep the copy of all rows but it does keep the reference of all rows and it prevents from freeing those resources during lead operation I think (is it right?).

@sirpkt
Copy link
Copy Markdown
Contributor Author

sirpkt commented Jan 16, 2015

I just fixed the wrong description of lead functions.

@hyunsik
Copy link
Copy Markdown
Member

hyunsik commented Jan 21, 2015

Thank you @sirpkt for your description and your fixes. I'm also sorry for late response. I took flight, and I just get back into harness.

I just wondered that you mentioned 'all rows' because in my view LinkedList seems to keep all column values.

@sirpkt
Copy link
Copy Markdown
Contributor Author

sirpkt commented Jan 21, 2015

Sorry for making you confused.
I mean all the values of the given column.

@hyunsik
Copy link
Copy Markdown
Member

hyunsik commented Jan 22, 2015

Thank you for correcting that. I'll finish the review as soon as possible.

@hyunsik
Copy link
Copy Markdown
Member

hyunsik commented Feb 5, 2015

+1 ship it. The patch looks good to me.

@hyunsik
Copy link
Copy Markdown
Member

hyunsik commented Feb 5, 2015

@sirpkt Could you add this patch to both master and 0.10.0 branch?

@asfgit asfgit closed this in 901700b Feb 5, 2015
asfgit pushed a commit that referenced this pull request Feb 5, 2015
@sirpkt
Copy link
Copy Markdown
Contributor Author

sirpkt commented Feb 5, 2015

Thank you for the review, @hyunsik
I just committed the patch to both master and 0.10.0 branch.

@dongjoon-hyun
Copy link
Copy Markdown
Member

Hi, @hyunsik, @sirpkt.
This issue was closed in GitHub, but not in https://issues.apache.org/jira/browse/TAJO-919.
Could you close that too, please?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants