Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Functions] reorganize the context hierarchy for functions #10631

Merged
merged 6 commits into from
Jun 23, 2021

Conversation

nlu90
Copy link
Member

@nlu90 nlu90 commented May 18, 2021

Motivation

Currently the context relationship for function, source and sink is not well defined. This prevents some common features to be added once for all and creates some confusion, code duplication in the current repo. As demonstrated in the following graph, this PR changes the hierarchy from left to right. By introducing a common base context, it help solving some issues we are seeing. The base context provides common access to pulsar cluster, state, metrics, and meta-data to make sure all components can reuse it.

context hierarchy

Modifications

  • Remove ConnectorContext interface.
  • Introduce a BaseContext interface.
  • Update existing Context, SourceContext, SinkContext interface to extend the new common interface.

Verifying this change

  • Make sure that the change passes the CI checks.

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API:

@nlu90 nlu90 marked this pull request as ready for review June 1, 2021 23:59
@nlu90 nlu90 changed the title reorganize the context hierarchy for functions [Functions] reorganize the context hierarchy for functions Jun 2, 2021
@merlimat merlimat requested a review from jerrypeng June 4, 2021 20:54
@@ -84,7 +84,7 @@
void flush() throws PulsarClientException;

/**
* Flush all the messages buffered in the client and wait until all messages have been successfully persisted.
* Flush all the messages buffered in the client Asynchronously.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit. Asynchronously -> asynchronously

*/
Record<?> getCurrentRecord();

@InterfaceStability.Unstable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are changing the InterfaceStability level? Is this change no backwards compatible?

Copy link
Contributor

@jerrypeng jerrypeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I am ok with doing some refactoring in the interfaces we have for Source, Sink, and Function context, this PR also exposes interfaces that are originally only for FunctionContext to SourceContext and SinkContext. We should consider carefully before introducing these new interfaces in Source and Sink Context. I don't want to expose functionality that is not needed.

If this is just a refactor, please keep the current interfaces exposed for FunctionContext, SourceContext, and SinkContext.

@nlu90
Copy link
Member Author

nlu90 commented Jun 7, 2021

While I am ok with doing some refactoring in the interfaces we have for Source, Sink, and Function context, this PR also exposes interfaces that are originally only for FunctionContext to SourceContext and SinkContext. We should consider carefully before introducing these new interfaces in Source and Sink Context. I don't want to expose functionality that is not needed.

If this is just a refactor, please keep the current interfaces exposed for FunctionContext, SourceContext, and SinkContext.

@jerrypeng Thanks for the review. Despite the previous shared interfaces, the followings new interfaces are exposed to connectors:

default <S extends StateStore> S getStateStore(String name) {
        throw new UnsupportedOperationException("Component cannot get state store");
    }

default PulsarAdmin getPulsarAdmin() {
        throw new UnsupportedOperationException("Component cannot access pulsar admin");
    }
    
default PulsarAdmin getPulsarAdmin(String clusterName) {
        throw new UnsupportedOperationException("Component cannot access pulsar admin");
    }

default <O> CompletableFuture<Void> publish(String topicName, O object, String schemaOrSerdeClassName) {
        throw new UnsupportedOperationException("Component cannot publish message");
    }

default <O> CompletableFuture<Void> publish(String topicName, O object) {
        throw new UnsupportedOperationException("Component cannot publish message");
    }

default <O> TypedMessageBuilder<O> newOutputMessage(String clusterName, String topicName, Schema<O> schema)
            throws PulsarClientException {
        throw new UnsupportedOperationException("Component can not output message to pulsar cluster");
    }
    
default <O> TypedMessageBuilder<O> newOutputMessage(String topicName, Schema<O> schema)
            throws PulsarClientException {
        throw new UnsupportedOperationException("Component can not output message to pulsar cluster");
    }

default <O> ConsumerBuilder<O> newConsumerBuilder(Schema<O> schema) throws PulsarClientException {
        throw new UnsupportedOperationException("Component can not create consumer builder");
    }

Based on my understanding, these are pulsar cluster interaction methods that connectors may need. And actually, I'm planning to add one more interface getPulsarClient as a following PR to fix #8668 more concretely.

But I might be missing some of the context here, so please let me know which of the newly exposed functionality is not needed and we can discuss and update accordingly.

@jerrypeng
Copy link
Contributor

@nlu90

Based on my understanding, these are pulsar cluster interaction methods that connectors may need. And actually, I'm planning to add one more interface getPulsarClient as a following PR to fix #8668 more concretely.

  1. I think the goal and scope of this PR is refactor some of the code for various context interfaces so there is less duplicated code. Let's make the changes that are within that scope. Let's not try to do too many things in one PR.

  2. If we want to expose more of the interfaces of sources and sinks. Let's have a clear use case in mind before exposing additional interfaces. I am not in favor of just bulk adding additional interfaces for sources and sinks. I don't want to maintain functionality in sources and sinks that is not useful. For example, for a sink, is there a concrete use case in which a sink needs to publish a message to another pulsar topic?

@nlu90
Copy link
Member Author

nlu90 commented Jun 10, 2021

@nlu90

Based on my understanding, these are pulsar cluster interaction methods that connectors may need. And actually, I'm planning to add one more interface getPulsarClient as a following PR to fix #8668 more concretely.

  1. I think the goal and scope of this PR is refactor some of the code for various context interfaces so there is less duplicated code. Let's make the changes that are within that scope. Let's not try to do too many things in one PR.
  2. If we want to expose more of the interfaces of sources and sinks. Let's have a clear use case in mind before exposing additional interfaces. I am not in favor of just bulk adding additional interfaces for sources and sinks. I don't want to maintain functionality in sources and sinks that is not useful. For example, for a sink, is there a concrete use case in which a sink needs to publish a message to another pulsar topic?

Sounds good to me.

I'll limit the scope of this PR for only refactoring apis. Keep those listed apis only in the FunctionContext. And send additional PRs if apis are needed.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I share @jerrypeng concerns.
Apart from that I appreciate this work.

The fact that we are not changing the tests is a good demonstration that this is a compatible change from the API point of view

@sijie
Copy link
Member

sijie commented Jun 18, 2021

@nlu90 Can you rebase the PR to the latest master?

@nlu90
Copy link
Member Author

nlu90 commented Jun 21, 2021

@sijie I rebased the master.

@nlu90
Copy link
Member Author

nlu90 commented Jun 22, 2021

@jerrypeng @eolivelli

Could you take another look at the PR? I've updated it based on the comments and passed all the CI check

@sijie sijie added this to the 2.9.0 milestone Jun 22, 2021
@sijie sijie merged commit 999329a into apache:master Jun 23, 2021
kaushik-develop pushed a commit to kaushik-develop/pulsar that referenced this pull request Jun 24, 2021
)

### Motivation

Currently the context relationship for function, source and sink is not well defined. This prevents some common features to be added once for all and creates some confusion, code duplication in the current repo. As demonstrated in the following graph, this PR changes the hierarchy from left to right. By introducing a common base context, it help solving some issues we are seeing. The base context provides common access to pulsar cluster, state, metrics, and meta-data to make sure all components can reuse it.

![context hierarchy](https://user-images.githubusercontent.com/16407807/118730483-8ebf5200-b7ec-11eb-9220-d41261f148bb.png)



### Modifications

- Remove `ConnectorContext` interface.
- Introduce a `BaseContext` interface. 
- Update existing `Context`, `SourceContext`, `SinkContext` interface to extend the new common interface.
@codelipenghui codelipenghui deleted the neng/func-context branch September 21, 2021 12:54
Technoboy- pushed a commit to Technoboy-/pulsar that referenced this pull request Sep 21, 2021
)

### Motivation

Currently the context relationship for function, source and sink is not well defined. This prevents some common features to be added once for all and creates some confusion, code duplication in the current repo. As demonstrated in the following graph, this PR changes the hierarchy from left to right. By introducing a common base context, it help solving some issues we are seeing. The base context provides common access to pulsar cluster, state, metrics, and meta-data to make sure all components can reuse it.

![context hierarchy](https://user-images.githubusercontent.com/16407807/118730483-8ebf5200-b7ec-11eb-9220-d41261f148bb.png)



### Modifications

- Remove `ConnectorContext` interface.
- Introduce a `BaseContext` interface. 
- Update existing `Context`, `SourceContext`, `SinkContext` interface to extend the new common interface.
codelipenghui pushed a commit that referenced this pull request Sep 24, 2021
…12117)

### Motivation

Currently the context relationship for function, source and sink is not well defined. This prevents some common features to be added once for all and creates some confusion, code duplication in the current repo. As demonstrated in the following graph, this PR changes the hierarchy from left to right. By introducing a common base context, it help solving some issues we are seeing. The base context provides common access to pulsar cluster, state, metrics, and meta-data to make sure all components can reuse it.

![context hierarchy](https://user-images.githubusercontent.com/16407807/118730483-8ebf5200-b7ec-11eb-9220-d41261f148bb.png)



### Modifications

- Remove `ConnectorContext` interface.
- Introduce a `BaseContext` interface. 
- Update existing `Context`, `SourceContext`, `SinkContext` interface to extend the new common interface.

Co-authored-by: Neng Lu <nlu@streamnative.io>
@codelipenghui codelipenghui added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Sep 24, 2021
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
)

### Motivation

Currently the context relationship for function, source and sink is not well defined. This prevents some common features to be added once for all and creates some confusion, code duplication in the current repo. As demonstrated in the following graph, this PR changes the hierarchy from left to right. By introducing a common base context, it help solving some issues we are seeing. The base context provides common access to pulsar cluster, state, metrics, and meta-data to make sure all components can reuse it.

![context hierarchy](https://user-images.githubusercontent.com/16407807/118730483-8ebf5200-b7ec-11eb-9220-d41261f148bb.png)



### Modifications

- Remove `ConnectorContext` interface.
- Introduce a `BaseContext` interface. 
- Update existing `Context`, `SourceContext`, `SinkContext` interface to extend the new common interface.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants