Missing documentation for window functions #5520

Closed
candlerb opened this issue Oct 31, 2019 · 6 comments · Fixed by #7741
@candlerb
Contributor

Describe the bug
There is no documentation for window functions, and no python examples for window functions.

To Reproduce
Search for "window" in the docs (site2).

It's mentioned in passing:

A Java function listens for the sanitized-sentences topic, counts the number of times each word appears within a specified time window, and publishes the results to a results topic

(my emphasis)

In some CLI flags:

./docs/functions-cli.md:sliding-interval-count | The number of messages after which the window slides. |  |
./docs/functions-cli.md:sliding-interval-duration-ms | The time duration after which the window slides. |  |
./docs/functions-cli.md:window-length-count | The number of messages per window. |  |
./docs/functions-cli.md:window-length-duration-ms | The time duration of the window in milliseconds. | |

And in a reference to release notes:

  • Add Windowfunction interface to functions api #3324

Expected behavior
N/A

Screenshots
N/A

Desktop (please complete the following information):
N/A

Additional context
This functionality is really interesting to me, and I'd like to know (a) what it can do, and (b) if I can use it from python.

I'm particularly interested if it's possible to do e.g. a ten-minute rolling window, where I can add 1 to an internal counter for an event which enters the window, and remove 1 from a count when that event leaves the window, thus maintaining a running count.

It could be done without external support, but that would require in-RAM buffering of all events over that 10-minute window - but buffering is what pulsar is for :-)

Externally I could maintain two readers: one at the current location and one at T-10 mins. But that wouldn't work as a pulsar function AFAICS.
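
To make the in-RAM buffering idea above concrete, here is a minimal sketch of the running count I have in mind. It is plain Java with no Pulsar API involved; the class name `RollingCounter` and its methods are made up for illustration, and it assumes events arrive in timestamp order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: counts events seen in the last windowMillis milliseconds. */
class RollingCounter {
    private final long windowMillis;
    // Timestamps of the events currently inside the window, oldest first.
    private final Deque<Long> timestamps = new ArrayDeque<>();

    RollingCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Records an event entering the window and returns the updated count. */
    int add(long eventTimeMillis) {
        timestamps.addLast(eventTimeMillis);
        return countAt(eventTimeMillis);
    }

    /** Drops events that have left the window as of nowMillis, then returns the count. */
    int countAt(long nowMillis) {
        // Assumption: timestamps are non-decreasing, so expired events are at the head.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= nowMillis - windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size();
    }
}
```

With `new RollingCounter(600_000L)` this gives a ten-minute window. The cost is exactly the buffering mentioned above: one stored timestamp per event still inside the window.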

@candlerb candlerb added the type/bug The PR fixed a bug or issue reported a bug label Oct 31, 2019
@candlerb
Contributor Author

I got part of the answer from pulsar:

$ apache-pulsar-2.4.1/bin/pulsar-admin functions update --name womble --window-length-count 5 --sliding-interval-count 3
There is currently no support windowing in python

Reason: There is currently no support windowing in python

But I'd still like to see documentation for what windowing is supposed to do :-)
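
Going only by the CLI flag descriptions, my guess at what `--window-length-count 5 --sliding-interval-count 3` would mean is sketched below. This is an assumption, not documented Pulsar behaviour: it supposes a window fires after every 3 messages and holds at most the last 5, and that the first, partially filled window is also emitted. The class `CountWindowSketch` is purely illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CountWindowSketch {
    /**
     * Returns the windows that would be emitted for the given messages,
     * ASSUMING a window fires after every `slide` messages and contains
     * at most the last `length` messages seen so far.
     */
    static <T> List<List<T>> windows(List<T> messages, int length, int slide) {
        List<List<T>> out = new ArrayList<>();
        for (int end = slide; end <= messages.size(); end += slide) {
            int start = Math.max(0, end - length);
            out.add(new ArrayList<>(messages.subList(start, end)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> messages = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9);
        // prints [[1, 2, 3], [2, 3, 4, 5, 6], [5, 6, 7, 8, 9]]
        System.out.println(windows(messages, 5, 3));
    }
}
```

If that reading is right, consecutive windows overlap by two messages. Documentation confirming or correcting this is what I'm after.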

@Jennifer88huang-zz
Contributor

@candlerb Thank you very much for raising the issue. I'll look into the issue, and provide related docs soon.

@wolfstudy
Member

wolfstudy commented Nov 7, 2019

Sorry, I missed this message.

But I'd still like to see documentation for what windowing is supposed to do :-)

@candlerb Currently, window functions are not supported in Python Functions. The check in the function config validation is:

private static void doPythonChecks(FunctionConfig functionConfig) {
    if (functionConfig.getProcessingGuarantees() == FunctionConfig.ProcessingGuarantees.EFFECTIVELY_ONCE) {
        throw new RuntimeException("Effectively-once processing guarantees not yet supported in Python");
    }

    if (functionConfig.getWindowConfig() != null) {
        throw new IllegalArgumentException("There is currently no support windowing in python");
    }

    if (functionConfig.getMaxMessageRetries() != null && functionConfig.getMaxMessageRetries() >= 0) {
        throw new IllegalArgumentException("Message retries not yet supported in python");
    }
}

@Jennifer88huang-zz
Contributor

@candlerb Currently, window functions are supported in Java, but not in Python.

@candlerb
Contributor Author

candlerb commented Nov 7, 2019

Yep, I got that. But I don't see any documentation for how it would work for a Java Pulsar Function either.
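
For what it's worth, my best guess at the shape of a Java window function is below: the function is handed the whole window as a collection and returns one result per window. This is an assumption on my part, not something I found in the docs. It uses only `java.util.function.Function`, and the class name `SumWindow` is made up.

```java
import java.util.Collection;
import java.util.function.Function;

/** Hypothetical window function: receives one window of inputs and returns their sum. */
class SumWindow implements Function<Collection<Integer>, Integer> {
    @Override
    public Integer apply(Collection<Integer> window) {
        int sum = 0;
        // Each call sees every message in the current window at once.
        for (int value : window) {
            sum += value;
        }
        return sum;
    }
}
```

Whether Pulsar accepts a plain `Function<Collection<I>, O>` like this, or requires the `Windowfunction` interface mentioned in the release notes, is exactly the kind of thing the documentation should spell out.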

@Jennifer88huang-zz
Contributor

@srkukarni could you help add the related documentation?
