SAMZA-2465: Task inputs information lost when enabled `RegExTopicGenerator` or specified 'task.inputs' explicitly in non-legacy application #1286

alnzng · 2020-02-21T00:28:04Z

Symptom

If a user’s non-legacy application enables RegExTopicGenerator or specifies task.inputs explicitly, and also specifies input streams in its application descriptor, the user would expect the application to consume messages from the specified input streams as well as the Kafka topics that match the specified regex patterns.

However, the current logic appears to override the input information from the application descriptor with the information from RegExTopicGenerator or task.inputs in the config file, so the application can consume only from the matched Kafka topics or from the inputs specified in task.inputs.

Cause

The task inputs generated from the application descriptor are overridden in the JobNodeConfigurationGenerator.mergeConfig function.
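To illustrate the cause, here is a minimal sketch of override-style merging. It is not Samza's actual mergeConfig implementation: the class OverrideMergeSketch, its mergeConfig method, and the stream names are hypothetical, and the sketch only assumes that original config values win over generated values for the same key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the override behavior described above; not Samza code.
final class OverrideMergeSketch {

    // Start from the generated config, then let every original entry win on key conflicts.
    static Map<String, String> mergeConfig(Map<String, String> original, Map<String, String> generated) {
        Map<String, String> merged = new HashMap<>(generated);
        merged.putAll(original);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> generated = new HashMap<>();
        generated.put("task.inputs", "kafka.descriptor-stream"); // assumed to come from the application descriptor

        Map<String, String> original = new HashMap<>();
        original.put("task.inputs", "kafka.regex-matched-topic"); // assumed to come from the config file or a rewriter

        // Prints "kafka.regex-matched-topic": the descriptor's input is no longer present.
        System.out.println(mergeConfig(original, generated).get("task.inputs"));
    }
}
```

Because task.inputs is a single key holding a comma-separated list, a key-level override replaces the whole list rather than individual entries.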

Changes

Merge the generated inputs with the original inputs before calling JobNodeConfigurationGenerator.mergeConfig.
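The idea can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: TaskInputsMergeSketch and mergeInputs are hypothetical names, and the sketch assumes task.inputs is a comma-separated list of system.stream entries.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper illustrating the proposed change; names are not from the Samza codebase.
final class TaskInputsMergeSketch {

    // Union two comma-separated input lists, dropping blanks and duplicates and keeping first-seen order.
    static String mergeInputs(String originalInputs, String generatedInputs) {
        Set<String> merged = new LinkedHashSet<>();
        for (String list : Arrays.asList(generatedInputs, originalInputs)) {
            if (list == null) {
                continue; // one side may not define task.inputs at all
            }
            for (String input : list.split(",")) {
                String trimmed = input.trim();
                if (!trimmed.isEmpty()) {
                    merged.add(trimmed);
                }
            }
        }
        return String.join(",", merged);
    }

    public static void main(String[] args) {
        // Prints "kafka.descriptor-stream,kafka.regex-matched-topic"
        System.out.println(mergeInputs("kafka.regex-matched-topic", "kafka.descriptor-stream"));
    }
}
```

If the merged value is written back under task.inputs in the original config before the override-style merge runs, both sets of inputs survive the merge.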

Tests

All unit tests and integration tests pass.

API Changes

None

Upgrade Instructions

None

Usage Instructions

None

When the user's non-legacy application enables 'RegExTopicGenerator' or specifies 'task.inputs' in the config file, the input information specified in 'ApplicationDescriptor' is lost.

Signed-off-by: Alan Zhang <shuai.xyz@gmail.com>

mynameborat · 2020-06-11T04:13:52Z

@bkonold can you take a look at this PR?

bkonold · 2020-06-11T07:30:51Z

It is my understanding that we override generated configs so that a job deployment can be reconfigured without building a new binary of the job. This is useful, for example, when managing issues in production and a job needs to be quickly reconfigured; the job's configuration can be modified and the job redeployed with the same binary, rather than touching the job's app descriptor and building a new version of the binary.

For that reason it seems contradictory to merge values for the same key between original and generated config...

More generally, this raises the question of how we treat precedence among the original config, the rewritten config, and the generated config (from the app descriptor). @kw2542 Since you are working on the deployment flow, can you comment on what the touch points are in the system currently for rewriting configs? Do we have a clear picture of what this precedence is now?

rmatharu-zz · 2020-06-11T17:21:38Z

"... input information from the application descriptor will be overridden by the information from RegExTopicGenerator or task.inputs ..."
Isn't this the desired behavior?
If a user provides conflicting values for inputs in the two places, we resolve the conflict in favor of the one in the config.
I'm not sure why this is problematic.

mynameborat · 2020-06-11T17:30:52Z

"... input information from the application descriptor will be overridden by the information from RegExTopicGenerator or task.inputs ..."
Isn't this the desired behavior?
If a user provides conflicting values for inputs in the two places, we resolve the conflict in favor of the one in the config.
I'm not sure why this is problematic.

+1; this PR negates #1065.

alnzng · 2020-06-11T17:55:18Z

To clarify the issue:

If the user’s non-legacy application enables RegExTopicGenerator and specifies input streams in its application descriptor, the user would expect the application to consume messages from both the specified input streams and the Kafka topics that match the specified regex pattern. However, the current logic appears to override the input information from the application descriptor with the information from RegExTopicGenerator, so the application can consume only from the matched Kafka topics.
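A small, self-contained sketch of the scenario may make the gap between expected and actual inputs concrete. Everything here is hypothetical: the class RegexInputsScenario, the matchTopics helper, and the topic names are invented for illustration, and the sketch assumes a topic must fully match the regex to be selected, which may differ from what RegExTopicGenerator actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical scenario; topic names and helper are invented and this is not Samza code.
final class RegexInputsScenario {

    // Return "system.topic" for every topic whose full name matches the regex (assumed semantics).
    static List<String> matchTopics(String system, String regex, List<String> topics) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matched = new ArrayList<>();
        for (String topic : topics) {
            if (pattern.matcher(topic).matches()) {
                matched.add(system + "." + topic);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> descriptorInputs = Arrays.asList("kafka.page-views");
        List<String> clusterTopics = Arrays.asList("orders-us", "orders-eu", "page-views");

        // What the job ends up consuming today, per the description above: only the regex matches.
        List<String> actual = matchTopics("kafka", "orders-.*", clusterTopics);

        // What the user expects: the descriptor inputs plus the regex matches.
        List<String> expected = new ArrayList<>(descriptorInputs);
        expected.addAll(actual);

        System.out.println("actual:   " + actual);
        System.out.println("expected: " + expected);
    }
}
```

In this sketch kafka.page-views appears in the expected list but not in the actual one, which is the loss this PR is trying to address.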

Fix task inputs information lost issue

42df909


alnzng requested a review from rmatharu-zz February 21, 2020 04:27
