Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sys.supervisors table to system tables #8547

Merged
merged 15 commits into from Oct 18, 2019

Conversation

surekhasaharan
Copy link

Fixes #7007
Proposal issue #8463

Description

This PR adds a sys.supervisors table to the pool of system tables. This would allow to query the supervisors via DruidSQL.

SupervisorResource changes

  • Added new queryParam fullStatus to GET /druid/indexer/v1/supervisor
  • Return a List<SupervisorStatus> instead of Map<String,Object> from specGetAll
  • Added class SupervisorStatus

SystemSchema changes

  • Added SupervisorTable class to SystemSchema to contain supervisor specific code

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths.
  • been tested in a test Druid cluster.

|`state`|String|basic state of the supervisor. Available states:`UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`|
|`detailedState`|String|supervisor specific state. (See documentation of specific supervisor for details)|
|`healthy`|Boolean|true or false indicator of overall supervisor health|
|`specString`|String|a json string of supervisor spec|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, some of these docs changes will conflict with #8548 (e.g., "json" should be "JSON").

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, thanks for heads-up, changed to uppercase here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Travis shows this spell check report:
image

To fix:

return suspended;
}

@JsonPOJOBuilder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Didn't know you could could have Jackson use a builder.

private final boolean healthy;
private final SupervisorSpec spec;
/**
* This is a stringified version of spec object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention the format? Is it JSON?

try {
request = indexingServiceClient.makeRequest(
HttpMethod.GET,
StringUtils.format("/druid/indexer/v1/supervisor?fullStatus"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a benefit of using StringUtils.format() if there are no args to be formatted?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure, may be not, but i saw some places using this, and because of habit by now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove the StringUtils.format if there aren't any format args.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok will remove from all the places in this class

.stream()
.map(x -> {
Optional<SupervisorStateManager.State> theState =
manager.getSupervisorState(x);
ImmutableMap.Builder<String, Object> theBuilder = ImmutableMap.builder();
theBuilder.put("id", x);
SupervisorStatus.Builder theBuilder = new SupervisorStatus.Builder();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice change to have it use a builder!

Optional<SupervisorSpec> theSpec = manager.getSupervisorSpec(x);
if (theSpec.isPresent()) {
try {
theBuilder.withSpecString(objectMapper.writeValueAsString(manager.getSupervisorSpec(x).get()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be nice to have to builder accept SupervisorSpec and serialize/deserialize it to a string internally so that the formatting logic is all in one place.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i tried that, but the objectMapper is injected here, which is used to serialize the SupervisorSpec

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it work if you add another constructor for SupervisorStatus.Builder that injects the ObjectMapper?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, it seems odd to add ObjectMapper to Builder, and the builder already accepts a spec, in addition to specString. I could generate specString from spec in SupervisorStatus#getSpecString if I have the right ObjectMapper, but then I think the specString would appear in response to druid/indexer/v1/supervisor?full as well as druid/indexer/v1/supervisor?fullStatus, since it's not null anymore for the former call and it's not required/desired in former call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, since the SupervisorStatus DTO already has a SupervisorSpec field that gets serialized when the response is serialized, why do you need to have another field that's an explicitly serialized version (which is populated before the response is serialized)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, i added already serialized specString to get the json payload in SystemSchema#SupervisorsTable via the JsonParserIterator, otherwise the deserialization errors out since KafkaSupervisorSpec's attributes are not present in broker, they are only available in overlord.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ccaominh added comment and javadoc about the need for explicitly serialized spec, let me know if it helps clarify things

Copy link
Contributor

@ccaominh ccaominh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

|Field|Type|Description|
|---|---|---|
|`id`|String|supervisor unique identifier|
|`state`|String|basic state of the supervisor. Available states:`UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing a space between "Available states:" and "`UNHEALTHY"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should link to a page that defines what these states mean. I don't think one exists (I checked briefly) so for now I'd say link to the Kafka docs and say that users can look there for details.

|---|---|---|
|`id`|String|supervisor unique identifier|
|`state`|String|basic state of the supervisor. Available states:`UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`|
|`detailedState`|String|supervisor specific state. (See documentation of specific supervisor for details)|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's only two right now, let's help the users out by linking to them directly. i.e. "See documentation of the specific supervisor for details, e.g. [Kafka] or [Kinesis]."


|Column|Type|Notes|
|------|-----|-----|
|supervisor_id|STRING|supervisor task identifier|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The notes should be in sentence case: first letter capitalized. (Like the other tables on this page.)

@@ -771,6 +771,27 @@ For example, to retrieve tasks information filtered by status, use the query
SELECT * FROM sys.tasks WHERE status='FAILED';
```

#### SUPERVISORS table
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the comments on the api-reference page apply here as well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, fixed here as well

|`detailedState`|String|supervisor specific state. (See documentation of specific supervisor for details)|
|`healthy`|Boolean|true or false indicator of overall supervisor health|
|`specString`|String|a JSON string of supervisor spec|
|`type`|String|type of supervisor task, e.g., `kafka` or `kinesis`|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just "type of supervisor" (supervisors aren't tasks)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed task

@@ -42,6 +42,7 @@

public class KinesisSupervisorSpec extends SeekableStreamSupervisorSpec
{
private static final String TASK_TYPE = "kinesis";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SUPERVISOR_TYPE

* This API is only used for informational purposes in
* org.apache.druid.sql.calcite.schema.SystemSchema.SupervisorsTable
*
* @return supervisor task type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not a task.)

import java.util.Objects;

/**
* This class contains the attributes of a supervisor task which are returned by the API's in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not a task)

try {
request = indexingServiceClient.makeRequest(
HttpMethod.GET,
StringUtils.format("/druid/indexer/v1/supervisor?fullStatus"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove the StringUtils.format if there aren't any format args.

private final SupervisorSpec spec;
/**
* This is a JSON representation of {@code spec}
* The explicit serialization is present here so that users of {@code SupervisorStatus} which cannot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work?

Won't the spec still be serialized into spec, and won't callers still try to deserialize the spec and get some kind of error?

(If it does work -- how does it work? I am perplexed.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it seems to work in my local cluster, the exception with using SupervisorSpec instead of string was

java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Can not construct instance of org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorSpec, problem: Guice configuration errors:
1) No implementation for org.apache.druid.indexing.overlord.TaskStorage was bound.
  while locating org.apache.druid.indexing.overlord.TaskStorage

the way it works is, i create SupervisorStatus with json string here, so SupervisorStatus#getSpecString returns the json payload, and SupervisorStatus#getSpec would return null in this case.

|`healthy`|Boolean|true or false indicator of overall supervisor health|
|`suspended`|Boolean|true or false indicator of whether the supervisor is in suspended state|

* `/druid/indexer/v1/supervisor?fullStatus`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the fullStatus API be unified with /druid/indexer/v1/supervisor?full?

It seems a bit confusing to have two "full" parameters, and the information returned by fullStatus appears to contain everything that full would return.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/druid/indexer/v1/supervisor?fullStatus returns everything "full" returns, in addition adds:

  • type
  • source
  • specString
    these are added to fill the columns in sys.supervisor table without deserializing the spec object(which cannot be done in SystemSchema because of dependency issues). Also to avoid changing the behavior of existing API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an explanation to the docs describing why the fullStatus param exists and when you would use one vs. the other

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since fullStatus is used internally for sys tables, i'm now thinking should we document it, as I think for users, the full query param might suffice. Not sure what is generally our stance on these kind of internal options. I think, the only reason, one would use fullStatus , is if they care to get type or source without digging into the spec object. I can either remove the fullStatus or add that explanation. Thoughts ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, it's ok to skip documentation for options meant for internal use. Maybe call it ?system rather than ?full to make it clearer in the code what the purpose is.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?system sounds good to me and remove the option from docs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's also better to skip the documentation. If it's intended for internal use and folks start using it because it's documented, it may be harder to change the behavior/API later if needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, if it's only meant for internal use, then renaming fullStatus to system and leaving out the docs for that sounds good to me.

Copy link
Contributor

@jon-wei jon-wei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a minor comment and comment about docs, LGTM otherwise

Function<SupervisorStatus, Iterable<ResourceAction>> raGenerator = supervisor -> Collections.singletonList(
AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(supervisor.getSource()));

final Iterable<SupervisorStatus> authorizedTasks = AuthorizationUtils.filterAuthorizedResources(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

authorizedTasks -> authorizedSupervisors

Copy link
Contributor

@jon-wei jon-wei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after CI

@jon-wei jon-wei merged commit 98f59dd into apache:master Oct 18, 2019
@surekhasaharan surekhasaharan deleted the sys-supervisor-table branch October 18, 2019 22:32
@jon-wei jon-wei added this to the 0.17.0 milestone Dec 17, 2019
@jon-wei jon-wei mentioned this pull request Dec 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wish list: A DruidSQL table for supervisors
4 participants