Proposal: Separate Kafka operators from Messaging toolkit #111

Closed
cancilla opened this issue Apr 3, 2017 · 12 comments

@cancilla

cancilla commented Apr 3, 2017

Proposal

I would like to propose that we split the Kafka operators out of the Messaging toolkit and move them into their own toolkit in a new repository. As the Bluemix Streaming Analytics service continues to grow in popularity, the need to exchange data between the Streaming Analytics service and other services/systems is becoming more important. At the moment, the best way to do this is via the MessageHub service, which is backed by Kafka. The current approach of maintaining the Kafka operators in the Messaging toolkit is problematic for the following reasons:

  • In order to use the Kafka operators, the entire Messaging toolkit, including dependencies for message brokers other than Kafka, ends up being included in the SAB. This results in unnecessary libraries being loaded and a larger-than-required SAB file.
  • Kafka is being developed at a rapid pace. Updating the Kafka operators to stay current with the latest Kafka version means having to reversion and retest the entire Messaging toolkit, including operators intended for other message brokers. This is time-consuming and can lead to problems.
  • As mentioned previously, Kafka is becoming more important as use of the Streaming Analytics service grows. With Kafka being included in the Messaging toolkit, it may be difficult for new users to find the Kafka operators. Having them in a separate repository/toolkit will make our Kafka support more obvious.

Naming

I would like to propose the following names:

  • Repository: streamsx.kafka
  • Toolkit: com.ibm.streamsx.kafka

Initial Contribution

The initial toolkit will be created by copying the Kafka operators from the Messaging toolkit with minimal code changes. This will allow existing applications to quickly migrate to the Kafka toolkit without having to perform any major rewrites. This will also serve as a baseline for future development work on the operators.

Migration Process

I propose the following process to migrate the Kafka operators to a new toolkit:

  1. Copy current Kafka operators into a new toolkit
  2. Move Kafka-related issues from the Messaging repo to the Kafka repo
  3. Create a release of the Kafka toolkit (baseline release)
  4. Upon releasing the first version of the Kafka toolkit:
    a. Mark the operators in the Messaging toolkit as deprecated and point users to the Kafka toolkit.
    b. Update Messaging documentation to reflect that the Kafka operators in the Messaging toolkit are deprecated and that the Kafka toolkit should be used instead.

@mikespicer
Collaborator

+1

@chanskw
Collaborator

chanskw commented Apr 3, 2017

+1
I suggest that the toolkit and its namespace be named com.ibm.streamsx.messaging.kafka to ease the migration effort for existing customers.

@cancilla
Author

cancilla commented Apr 3, 2017

If I set the namespace for the operators to com.ibm.streamsx.messaging.kafka, will there be a conflict if an application is also using the Messaging toolkit, since that toolkit will have Kafka operators in the same namespace?

@chanskw
Collaborator

chanskw commented Apr 3, 2017

I am wondering if removing Kafka operators from the messaging toolkit is a better option, rather than leaving them in the toolkit and marking them as deprecated.

By removing them, customers will get a compile error saying that the operators no longer exist. They will then be forced to migrate to the new toolkit. Because the namespaces don't change, it's just a matter of updating the application dependency and build script.

Wouldn't leaving them behind be more confusing, since we would then have two sets of operators and customers would not get compile errors for using the old ones?

@cancilla
Author

cancilla commented Apr 3, 2017

I am not against removing them from the toolkit. However, I am worried about users who are using the Messaging toolkit that is packaged with the product. Regardless of what we do with the toolkit on GitHub, the product will still have Kafka operators in the same namespace, which may introduce conflicts.

@chanskw
Collaborator

chanskw commented Apr 3, 2017

Another way is to update the existing Kafka operators so that they produce a compile warning when used and point to the new operators. Would that make it easier to understand?

@cancilla
Author

cancilla commented Apr 3, 2017

I should have been more explicit when I said "mark as deprecated". What I meant by this was that we would throw a warning at compile-time indicating that the operators have been deprecated. This does 2 things:

  1. Clearly indicates that the operators are deprecated and should not be used. We can even reference the new toolkit in this warning.
  2. Allows applications that are not ready/able to introduce a new toolkit to continue to function. I think this is important because adding a new toolkit to an existing application stack may require updates to build scripts, approvals, additional testing, etc. By allowing the Messaging toolkit to continue to function, it gives developers time to plan their migration to the new toolkit.

I agree that having 2 versions of the same operators can be confusing, but I think we are stuck between a rock and a hard place. My suggestion is to start with the least destructive approach. If we see a lot of cases where users are getting confused, we can reevaluate then.
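
To make the "warn but keep working" behaviour concrete, here is a minimal sketch in Python. It is not how the Streams compiler or the toolkit actually implements deprecation; the `check_operator` function, the `DEPRECATED_OPERATORS` table, and the operator names in it are all hypothetical and used only for illustration. The only name taken from this proposal is the toolkit name `com.ibm.streamsx.kafka`.

```python
import warnings

# Hypothetical table: deprecated operator -> toolkit that replaces it.
# The operator names are illustrative; the toolkit name is the one proposed above.
DEPRECATED_OPERATORS = {
    "com.ibm.streamsx.messaging.kafka::KafkaConsumer": "com.ibm.streamsx.kafka",
    "com.ibm.streamsx.messaging.kafka::KafkaProducer": "com.ibm.streamsx.kafka",
}


def check_operator(name: str) -> bool:
    """Warn if `name` is deprecated; always return True so the build continues."""
    replacement = DEPRECATED_OPERATORS.get(name)
    if replacement is not None:
        # A warning rather than an error: existing applications keep compiling,
        # and the message tells the developer where to migrate.
        warnings.warn(
            f"{name} is deprecated; use the operators in the "
            f"{replacement} toolkit instead.",
            DeprecationWarning,
            stacklevel=2,
        )
    return True
```

The point of the design is that the check never fails the build: a deprecated operator only adds a message that names the new toolkit, which gives developers time to plan the migration.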

@chanskw
Collaborator

chanskw commented Apr 4, 2017

Thanks for the explanation. I agree!

@schubon
Member

schubon commented Apr 4, 2017

There was already an issue for this under streamsx.messaging:
IBMStreams/streamsx.messaging#246.

To drive the discussion for the product, I had to create an RTC work item for this and added the participants above as subscribers. It is not as well articulated as the discussion and proposal in this issue, though.
Could you please check your approval requests and comment in the RTC work item, as we'd like to have this split supported in the next product release, too?

Thanks.

@ddebrunner
Member

  1. Copy current Kafka operators into a new toolkit
  2. Create a release of the Kafka toolkit (baseline release)

I'm not sure we would want to rush into a release. Since these would be "new" operators, we have a chance to "fix" any issues with the operator parameters, etc. Are there any outstanding changes folks know of?

@chanskw
Collaborator

chanskw commented Apr 4, 2017

@ddebrunner I think you have a great point here.

@chanskw
Collaborator

chanskw commented Apr 4, 2017

streamsx.kafka created. Please continue discussion there or in the messaging toolkit.
