Skip to content

Avoid creating output topic on tenant namespace if output-topic not provided#2261

Merged
rdhabalia merged 2 commits into
apache:masterfrom
rdhabalia:sink_topic_empty
Aug 7, 2018
Merged

Avoid creating output topic on tenant namespace if output-topic not provided#2261
rdhabalia merged 2 commits into
apache:masterfrom
rdhabalia:sink_topic_empty

Conversation

@rdhabalia
Copy link
Copy Markdown
Contributor

Motivation

Sometime user wants to use pulsar-function only for processing source-messages and do not want to redirect output to any topic. Right now, if user doesn't provided output topic on function-cli then cli derives output-topic and forces output messages to be redirected to that topic. It causes new topic creation under a tenant's namespace without tenant's permission and PulsarSink unnecessary publishes messages to that newly created topic. Also if the namespace is global then messages of the output topic will be replicated to other clusters as well.
So, function should not write messages to output topic if output topic is not provided.

Modifications

  • function-cli doesn't compute output topic name if it's not provided
  • function worker doesn't create output topic if it's not present

@rdhabalia rdhabalia added this to the 2.1.1-incubating milestone Jul 28, 2018
@rdhabalia rdhabalia self-assigned this Jul 28, 2018
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather have this functionality gated by some kind of flag with default being output topic created that the users can turn off for optimization purposes

Copy link
Copy Markdown
Contributor Author

@rdhabalia rdhabalia Jul 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but if user doesn't provide output topic name then it means user is not intended to publish output in the external topic. I think here, cli creates output topic without user's knowledge and adding additional flag may create more confusion for user . So, as a user, I feel it's better if cli doesn't compute output topic if it's not provided.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we made this flag optional, the intention was that the user need to specify as few arguments as necessary and the system takes care of most of the arguments. Essentially design for user simplicity. That's why I felt if one needs an optimization, that might be a flag.
I'm curious what @sijie @jerrypeng feel about this.

Copy link
Copy Markdown
Contributor

@jerrypeng jerrypeng Aug 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should implicitly create a topic for the output. I think it makes more sense for the the user to explicitly specify it. The user might even be more confused if he or she can submit a function without a output topic. He or she might not even know what the default output topic name will be.

If we really want to create a default output topic automatically for the function, lets at least check that the function is set to return a non-void type. There is no point to create automatically creating an output topic if the function is set to return void

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should implicitly create a topic for the output.

can we merge this PR then. as I have mentioned into motivation: replication-namespace and security are the main concerns so, function should not create output topic implicitly.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdhabalia I think the agreement here is not to create an output topic implicitly. so why not fail submission if output is not specified and the return type is not void? to me, fail fast is better than than using a DISABLED sink. if people want to disable sink, there should be another flag controlling that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if people want to disable sink, there should be another flag controlling that.

I think when user submits function, user doesn't care about source or sink and as a user one
doesn't want to redirect output to any unknown topic. seems like adding extra flag makes output topic mandatory. not sure if that's a good idea.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if people want to disable sink, there should be another flag controlling that.

@sijie , made the change.

@rdhabalia
Copy link
Copy Markdown
Contributor Author

retest this please

Copy link
Copy Markdown
Member

@sijie sijie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me @rdhabalia . one comment around the command naming.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you rename it to --skip-output? we are attempting to standardize the naming convention in the command line - "lower cases with '-' as separators".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes.. function cli was not consistent with other pulsar-cli. will rename it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed it.

…rovided

fix test

add flag to skip output topic

rename skip-output cmd
@rdhabalia rdhabalia merged commit 14765b2 into apache:master Aug 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants