Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make supervisor API similar to submit task API #8810

Merged
merged 22 commits into from Nov 20, 2019

Conversation

@surekhasaharan
Copy link
Contributor

surekhasaharan commented Nov 2, 2019

Fixes #8662

Description

Make submit task and supervisor API's similar in structure. supervisor API accepts both forms:

{
  "type": "kafka",
  "spec": {
    "dataSchema": { ... },
    "tuningConfig": { ... },
    "ioConfig": { ... }
  }
}

and

{
  "type": "kafka",
   "dataSchema": { ... },
   "tuningConfig": { ... },
   "ioConfig": { ... }
}

The supervisor spec would allow both formats, but the serialization is done to the old format to allow for backwards compatibility.

This PR has:

  • been self-reviewed.
  • added unit tests or modified existing tests to cover new code paths.
  • been tested in a test Druid cluster.
@surekhasaharan surekhasaharan changed the title accept spec or dataSchema, tuningConfig, ioConfig while submitting Make submit task API similar to supervisor API Nov 2, 2019
…to submit-task-api
@lgtm-com

This comment has been minimized.

Copy link

lgtm-com bot commented Nov 6, 2019

This pull request introduces 3 alerts when merging c34c5cf into 3b602da - view on LGTM.com

new alerts:

  • 3 for Dereferenced variable may be null
@lgtm-com

This comment has been minimized.

Copy link

lgtm-com bot commented Nov 6, 2019

This pull request introduces 1 alert when merging 8d0c0a6 into 6f7fbeb - view on LGTM.com

new alerts:

  • 1 for Dereferenced variable may be null
@jihoonson

This comment has been minimized.

Copy link
Contributor

jihoonson commented Nov 6, 2019

Please fix LGTM alert.

@surekhasaharan

This comment has been minimized.

Copy link
Contributor Author

surekhasaharan commented Nov 8, 2019

Please fix LGTM alert.

Yeah, i looked at that, it thinks dataSchema can be null here because of test cases, where null is passed for dataSchema, though those tests pass a valid ingestionSchema, and it's already checked in here. Not sure how to make lgtm ignore that or fix otherwise.

@vogievetsky

This comment has been minimized.

Copy link
Contributor

vogievetsky commented Nov 8, 2019

I am a strong 👍 on the design (I am biased since I filed the original issue).
This will allow us to take a nasty hack out of the web console.

Copy link
Contributor

jihoonson left a comment

Yeah, i looked at that, it thinks dataSchema can be null here because of test cases, where null is passed for dataSchema, though those tests pass a valid ingestionSchema, and it's already checked in here. Not sure how to make lgtm ignore that or fix otherwise.

You can fix it by making a non-null IndexIngestionSpec first and then using it in the constructor.

Also, this PR changes only IndexTask. Are you going to change ParallelIndexSupervisorTask in a follow-up pr?

@@ -134,8 +134,7 @@ which has been configured to read the `quickstart/tutorial/wikiticker-2015-09-12

```json
{
"type" : "index",
"spec" : {
"type" : "index",

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 8, 2019

Contributor

Wrong indentation. Should be 2 spaces.

{
return makeGroupId(ingestionSchema.ioConfig.appendToExisting, ingestionSchema.dataSchema.getDataSource());
final boolean isValid = (ingestionSchema != null) ^ (dataSchema != null

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 8, 2019

Contributor

This check should be in the constructor rather than here since it's a method to make a groupId.

This comment has been minimized.

Copy link
@surekhasaharan

surekhasaharan Nov 13, 2019

Author Contributor

makeGroupId is called from the constructor inside this, so can't add another check before calling this

&& ioConfig != null
&& tuningConfig != null);
if (!isValid) {
throw new ISE("invalid spec input");

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 8, 2019

Contributor

Please make the error message more user friendly. It should say what's missing or what's duplicate.

}
if (ingestionSchema == null) {
assert (ioConfig != null);
assert (dataSchema != null);

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 8, 2019

Contributor

Why these asserts? Seems duplicate.

This comment has been minimized.

Copy link
@surekhasaharan

surekhasaharan Nov 13, 2019

Author Contributor

these were added because of lgtm warnings

@@ -174,6 +191,9 @@ public IndexTask(
@JsonProperty("id") final String id,
@JsonProperty("resource") final TaskResource taskResource,
@JsonProperty("spec") final IndexIngestionSpec ingestionSchema,

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 8, 2019

Contributor

Should be @Deprecated.

@@ -197,6 +197,9 @@ private void runIndexTask() throws Exception
false
),
null,
null,

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 8, 2019

Contributor

Please add another constructor to IndexTask which is compatible to the old constructor so that you can minimize the change of this PR. The constructor should be @VisibleForTesting and @Deprecated.

This comment has been minimized.

Copy link
@surekhasaharan

surekhasaharan Nov 13, 2019

Author Contributor

added the original constructor back


final String json = jsonMapper.writeValueAsString(task);

Thread.sleep(100); // Just want to run the clock a bit to make sure the task id doesn't change

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 8, 2019

Contributor

I don't understand this comment. What does it do?

This comment has been minimized.

Copy link
@surekhasaharan

surekhasaharan Nov 13, 2019

Author Contributor

copy-paste of another test case :). Tried removing the sleep, test case still passes.

@Test(expected = ISE.class)
public void testIndexTaskInvalidSpecSerde()
{
new IndexTask(

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 8, 2019

Contributor

This is not a serde test. The test should deserialize from a json string.

This comment has been minimized.

Copy link
@surekhasaharan

surekhasaharan Nov 13, 2019

Author Contributor

moved this to IndexTaskTest

@surekhasaharan

This comment has been minimized.

Copy link
Contributor Author

surekhasaharan commented Nov 13, 2019

Yeah, i looked at that, it thinks dataSchema can be null here because of test cases, where null is passed for dataSchema, though those tests pass a valid ingestionSchema, and it's already checked in here. Not sure how to make lgtm ignore that or fix otherwise.

You can fix it by making a non-null IndexIngestionSpec first and then using it in the constructor.

Also, this PR changes only IndexTask. Are you going to change ParallelIndexSupervisorTask in a follow-up pr?

Yes, want to change ParallelIndexSupervisorTask as well in this PR, working on it.

@surekhasaharan surekhasaharan added the WIP label Nov 18, 2019
@surekhasaharan surekhasaharan removed the WIP label Nov 18, 2019
@surekhasaharan surekhasaharan changed the title Make submit task API similar to supervisor API Make supervisor API similar to submit task API Nov 18, 2019
@surekhasaharan

This comment has been minimized.

Copy link
Contributor Author

surekhasaharan commented Nov 18, 2019

Thanks @jihoonson for review and suggestion, I have updated this PR to allow supervisor schema to take in spec (the old format is still valid) similar to index, index_parallel and hadoop submit task APIs

@jihoonson

This comment has been minimized.

Copy link
Contributor

jihoonson commented Nov 19, 2019

@surekhasaharan thanks for updating the PR. Would you please merge master so that LGTM can be passed?

…to submit-task-api
…to submit-task-api
@@ -42,8 +42,23 @@
{
private static final String TASK_TYPE = "kafka";

private static KafkaSupervisorTuningConfig getTuningConfig(
KafkaSupervisorIngestionSpec ingestionSchema,
KafkaSupervisorTuningConfig tuningConfig

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 19, 2019

Contributor

Please annotate both @Nullable.

@JsonCreator
public KafkaSupervisorSpec(
@JsonProperty("spec") KafkaSupervisorIngestionSpec ingestionSchema,

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 19, 2019

Contributor

Please annotate ingestionSchema, dataSchema, tuningConfig, and ioConfig with @Nullable.

@JsonProperty
public DataSchema getDataSchema()
{
return super.getDataSchema();

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 19, 2019

Contributor

Seems unnecessary.

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 19, 2019

Contributor

And getDataSchema() should be deprecated.

This comment has been minimized.

Copy link
@surekhasaharan

surekhasaharan Nov 19, 2019

Author Contributor

removed from subclass and deprecated in base class

{
return super.getDataSchema();
}

@Override
@JsonProperty
public KafkaSupervisorTuningConfig getTuningConfig()

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 19, 2019

Contributor

This and getIoConfig() should be deprecated now.

@JsonProperty
public KafkaSupervisorIngestionSpec getSpec()
{
return (KafkaSupervisorIngestionSpec) super.getSpec();

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 19, 2019

Contributor

@Nullable

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 20, 2019

Contributor

Hmm sorry now it's not nullable.

This comment has been minimized.

Copy link
@surekhasaharan

surekhasaharan Nov 20, 2019

Author Contributor

oh, yeah removed

@@ -50,6 +50,7 @@
protected final SeekableStreamIndexTaskClientFactory indexTaskClientFactory;
protected final ObjectMapper mapper;
protected final RowIngestionMetersFactory rowIngestionMetersFactory;
private final SeekableStreamSupervisorIngestionSpec ingestionSchema;
private final DataSchema dataSchema;

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 19, 2019

Contributor

This class now has duplicate information between ingestionSchema and dataSchema, ioConfig, and tuningConfig. I suggest to remove dataSchema, ioConfig, and tuningConfig and keep only ingestionSchema.

This comment has been minimized.

Copy link
@surekhasaharan

surekhasaharan Nov 19, 2019

Author Contributor

These will still be required in subclasses KafkaSupervisorSpec and KinesisSupervisorSpec for backwards compatibility, so there will be some duplication.

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 19, 2019

Contributor

It doesn't have to be duplicate. You can always create ingestionSchema and get things from it.

This comment has been minimized.

Copy link
@surekhasaharan

surekhasaharan Nov 20, 2019

Author Contributor

ok right, removed the extra configs from this class.

@JsonProperty
public KafkaSupervisorIngestionSpec getSpec()
{
return (KafkaSupervisorIngestionSpec) super.getSpec();

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 20, 2019

Contributor

Hmm sorry now it's not nullable.

@JsonProperty
public KinesisSupervisorIOConfig getIoConfig()
{
return (KinesisSupervisorIOConfig) super.getIoConfig();
}

@Override
@Nullable

This comment has been minimized.

Copy link
@jihoonson

jihoonson Nov 20, 2019

Contributor

Same here. It's not nullable now.

@jihoonson

This comment has been minimized.

Copy link
Contributor

jihoonson commented Nov 20, 2019

Thanks. I left a couple of trivial comments. LGTM after removing wrong nullable annotations.

Copy link
Contributor

jihoonson left a comment

+1 after CI. Thanks!

@jihoonson jihoonson merged commit d628beb into apache:master Nov 20, 2019
5 checks passed
5 checks passed
LGTM analysis: Python No code changes detected
Details
Inspections: pull requests (Druid) TeamCity build finished
Details
LGTM analysis: Java No new or fixed alerts
Details
LGTM analysis: JavaScript No new or fixed alerts
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@surekhasaharan surekhasaharan deleted the surekhasaharan:submit-task-api branch Nov 20, 2019
zhenxiao added a commit to zhenxiao/incubator-druid that referenced this pull request Nov 25, 2019
* accept spec or dataSchema, tuningConfig, ioConfig while submitting task json

* fix test

* update docs

* lgtm warning

* Add original constructor back to IndexTask to minimize changes

* fix indentation in docs

* Allow spec to be specified in supervisor schema

* undo IndexTask spec changes

* update docs

* Add Nullable and deprecated annotations

* remove deprecated configs from SeekableStreamSupervisorSpec

* remove nullable annotation
VladimirIordanov added a commit to VladimirIordanov/incubator-druid that referenced this pull request Nov 25, 2019
* accept spec or dataSchema, tuningConfig, ioConfig while submitting task json

* fix test

* update docs

* lgtm warning

* Add original constructor back to IndexTask to minimize changes

* fix indentation in docs

* Allow spec to be specified in supervisor schema

* undo IndexTask spec changes

* update docs

* Add Nullable and deprecated annotations

* remove deprecated configs from SeekableStreamSupervisorSpec

* remove nullable annotation
jon-wei added a commit to jon-wei/druid that referenced this pull request Nov 26, 2019
* accept spec or dataSchema, tuningConfig, ioConfig while submitting task json

* fix test

* update docs

* lgtm warning

* Add original constructor back to IndexTask to minimize changes

* fix indentation in docs

* Allow spec to be specified in supervisor schema

* undo IndexTask spec changes

* update docs

* Add Nullable and deprecated annotations

* remove deprecated configs from SeekableStreamSupervisorSpec

* remove nullable annotation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.