Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

separate ingestion and query thread pool #1929

Merged
merged 1 commit into from
Nov 17, 2015
Merged

separate ingestion and query thread pool #1929

merged 1 commit into from
Nov 17, 2015

Conversation

pjain1
Copy link
Member

@pjain1 pjain1 commented Nov 6, 2015

When the ingestion and query rate for a realtime index task using the event receiver firehose is quite high, then having different thread pools for handling ingestion and querying will be quite useful. The separation will be useful in case ingestion and querying takes a while to respond for example deserializing sketch objects during ingestion or responding to a slow query. Thus, the ingestion cannot not block queries and vice-versa as they can be tuned independently.

@@ -43,7 +43,7 @@

@Inject
public ServiceAnnouncingChatHandlerProvider(
@Self DruidNode node,
@RemoteChatHandler DruidNode node,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's awesome, the power of annotations

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thanks to @cheddar for showing how to do this

@pjain1
Copy link
Member Author

pjain1 commented Nov 10, 2015

@xvrl @drcrallen took care of comments. Using consecutive ports for jetty servers.

@@ -233,6 +245,12 @@ public TaskStatus call()
command.add(String.format("-Ddruid.host=%s", childHost));
command.add(String.format("-Ddruid.port=%d", childPort));

if(config.isSeparateIngestionEndPoint()) {
command.add(String.format("-Ddruid.indexer.task.chathandler.service=%s", "druid/chathandler"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this consumed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These three properties are used to create DruidNode object annotated with RemoteChatHandler at ServiceAnnouncingChatHandlerProvider and getServer method of ChatHandlerServerModule. However, while announcement of the event receiver firehose the service name provided during registration is used and the 'serviceName' property which is specified here is ignored. Thus, it is not consumed anywhere, it is just there so that object creation does not fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's so can you make it something that won't register as a valid ZK path? also please add a comment as to why it is set, and where the property actually gets taken from.

I'd hate for some change to happen later to the chat handler and have this passively create new ZK nodes without intending to, all because we forgot it was here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure...makes sense

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I don't think I can change it somehow so that it does not register as valid ZKPath as if the peon were to somehow use this serviceName, it will just take the String as it is and try to announce it. What can be done here as you suggested to change the name to something like placeholderServiceName and add a comment form where the actual serviceName will be taken ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks. Should have thought about it, anyways done

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait...it may not help as serviceName is not a path because all the "/" are replaced to ":" and the resultant String is used to announce the service..let me double check

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked putting invalid path as serviceName will not help, makeCanonicalServiceName method in CuratorServiceUtils class replaces all "" with ":"
However I have put another character \u0001 (invalid for zk) in the String and it will fail if this name is used. Also I have added some UTs for ServiceAnnouncer to test the behavior. Please review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome! thanks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran integration tests with this change and actually even the deserialization of DruidNode failed when the peon process starts with exception - com.fasterxml.jackson.core.JsonParseException: Illegal unquoted character ((CTRL-CHAR, code 1)): has to be escaped using backslash to be included in string value
So I think it is just better to have only comment there explaining this name would be ignored and not do anything smart.

@drcrallen
Copy link
Contributor

Cool, 👍 after travis

@drcrallen drcrallen closed this Nov 13, 2015
@drcrallen drcrallen reopened this Nov 13, 2015
@pjain1
Copy link
Member Author

pjain1 commented Nov 13, 2015

Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.062 sec <<< FAILURE! - in io.druid.curator.announcement.AnnouncerTest
testSessionKilled(io.druid.curator.announcement.AnnouncerTest)  Time elapsed: 2.793 sec  <<< FAILURE!
java.lang.AssertionError: expected null, but was:<10,10,1447436227166,1447436227166,0,0,0,94859180405555201,25,0,10>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotNull(Assert.java:664)
    at org.junit.Assert.assertNull(Assert.java:646)
    at org.junit.Assert.assertNull(Assert.java:656)
    at io.druid.curator.announcement.AnnouncerTest.testSessionKilled(AnnouncerTest.java:173)

closing/reopening again for travis

@pjain1 pjain1 closed this Nov 13, 2015
@pjain1 pjain1 reopened this Nov 13, 2015
@pjain1
Copy link
Member Author

pjain1 commented Nov 17, 2015

@drcrallen travis is happy

@pjain1
Copy link
Member Author

pjain1 commented Nov 17, 2015

@xvrl any more comments ?

@@ -251,6 +251,7 @@ Middle managers pass their configurations down to their child peons. The middle
|`druid.indexer.runner.javaOpts`|-X Java options to run the peon in its own JVM.|""|
|`druid.indexer.runner.maxZnodeBytes`|The maximum size Znode in bytes that can be created in Zookeeper.|524288|
|`druid.indexer.runner.startPort`|The port that peons begin running on.|8100|
|`druid.indexer.runner.separateIngestionEndPoint`|Use separate server and consequently separate jetty thread pool for ingesting events|false|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

endpoint is one word

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@pjain1
Copy link
Member Author

pjain1 commented Nov 17, 2015

Transient failure -

io.druid.indexing.overlord.TaskLifecycleTest
testRealtimeIndexTask[taskStorageType=HeapMemoryTaskStorage](io.druid.indexing.overlord.TaskLifecycleTest)  Time elapsed: 2.379 sec  <<< FAILURE!
java.lang.AssertionError: null
    at org.junit.Assert.fail(Assert.java:86)
    at org.junit.Assert.assertTrue(Assert.java:41)
    at org.junit.Assert.assertTrue(Assert.java:52)
    at io.druid.indexing.overlord.TaskLifecycleTest.testRealtimeIndexTask(TaskLifecycleTest.java:833)

I have refactored this test in #1679 and I think it should solve this transient error

@pjain1 pjain1 closed this Nov 17, 2015
@pjain1 pjain1 reopened this Nov 17, 2015
@@ -65,6 +65,14 @@
"hadoop"
);

@JsonProperty
@NotNull
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NotNull is probably not helping as boolean can never be null anyway

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@himanshug
Copy link
Contributor

👍

drcrallen added a commit that referenced this pull request Nov 17, 2015
separate ingestion and query thread pool
@drcrallen drcrallen merged commit dbe201a into apache:master Nov 17, 2015
@pjain1 pjain1 mentioned this pull request Dec 1, 2015
@gianm gianm added this to the 0.8.3 milestone Dec 1, 2015
@gianm gianm mentioned this pull request Dec 4, 2015
@pjain1 pjain1 deleted the jetty_threads branch May 6, 2017 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants