Completely rework the Druid getting started process #2216

fjy · 2016-01-06T23:13:49Z

rejigger the distribution packaging to make more sense
new quickstart tutorial
new load batch data tutorial
new load streaming data tutorial
new load kafka tutorial
new clustering tutorial
new streaming ingestion overview page
new stream push page
new stream pull page
lots of new information about using Druid and Tranquility
added a new query optimization page
added a new caching info page

This PR depends on some CSS changes to Druid docs which are coming in a separate PR. The updated pages will not render correctly without those changes.

This will rework the Druid getting started process to be very similar to Imply's recommended getting started process, which was mostly written by @gianm. The packaging of Druid will also be similar to what Imply is doing.

navis · 2016-01-07T01:42:36Z

I love this one. 👍

himanshug · 2016-01-07T04:52:54Z

docs/content/tutorials/quickstart.md

+
+You will need:
+
+  * Java 7 or better


s/better/higher

gianm · 2016-01-07T15:54:18Z

@fjy could you gzip examples/quickstart/wikiticker-2015-09-12-sampled.json? There's not much reason to have it there as a text file.

fjy · 2016-01-07T19:38:48Z

@himanshug @navis @gianm @pjain1 added clustering docs. More changes to come.

rasahner · 2016-01-08T01:42:07Z

docs/content/tutorials/cluster.md

+
+## Tune Druid Brokers
+
+Druid Brokers also benefit greatly from being tuned to the hardware it


"they run on"?

fjy · 2016-02-03T00:28:18Z

@himanshug addressed comments

rasahner · 2016-02-03T18:52:30Z

docs/content/tutorials/quickstart.md

+
+```bash
+curl http://www.gtlib.gatech.edu/pub/apache/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz -o $zookeeper-3.4.6.tar.gz
+tar xzf $zookeeper-3.4.6.tar.gz


I don't see why there's a $ in front of zookeeper-3.4.6.tar.gz on this line and the one before.

That should be removed.
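
A minimal sketch of why the stray `$` matters, assuming no variable named `zookeeper` is set in the reader's shell: the shell treats `$zookeeper` as a variable reference and expands it to an empty string, so the download would be saved under a name that starts with a hyphen.

```bash
# Assumption: "zookeeper" is empty here, standing in for an unset variable,
# which expands the same way.
zookeeper=""

# With the stray "$", the shell expands $zookeeper and only the suffix remains.
with_dollar="$zookeeper-3.4.6.tar.gz"

# Without it, the file name is taken literally.
without_dollar="zookeeper-3.4.6.tar.gz"

echo "with \$:    $with_dollar"
echo "without \$: $without_dollar"
```

Dropping the `$` on both lines keeps the literal file name for `curl -o` and `tar xzf`.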

rasahner · 2016-02-03T19:36:39Z

docs/content/tutorials/quickstart.md

+java `cat conf-quickstart/druid/coordinator/jvm.config | xargs` -cp conf-quickstart/druid/_common:conf-quickstart/druid/coordinator:lib/* io.druid.cli.Main server coordinator
+java `cat conf-quickstart/druid/overlord/jvm.config | xargs` -cp conf-quickstart/druid/_common:conf-quickstart/druid/overlord:lib/* io.druid.cli.Main server overlord
+java `cat conf-quickstart/druid/middleManager/jvm.config | xargs` -cp conf-quickstart/druid/_common:conf-quickstart/druid/middleManager:lib/* io.druid.cli.Main server middleManager
+```


I guess most people trying this will know to put these each in background or run each in a different window or whatever, but it's tempting to cut/paste this whole thing to execute...
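
One way to make the block safe to paste is to start each node in the background and send its output to a log file. The sketch below assumes it is run from the Druid distribution directory (the one containing `conf-quickstart/` and `lib/`); the `start_node` helper and the `log/` directory are hypothetical names introduced for illustration and are not part of the tutorial.

```bash
# Hypothetical helper: start one Druid node type in the background.
# Assumes the current directory contains conf-quickstart/ and lib/.
start_node() {
  local node="$1"
  mkdir -p log   # hypothetical location for per-node logs
  # Same command as in the tutorial, with a trailing "&" so the shell
  # does not block, and output redirected to a per-node log file.
  java $(cat "conf-quickstart/druid/$node/jvm.config" | xargs) \
    -cp "conf-quickstart/druid/_common:conf-quickstart/druid/$node:lib/*" \
    io.druid.cli.Main server "$node" > "log/$node.log" 2>&1 &
  echo "started $node (pid $!)"
}

# Usage, for the three nodes quoted above:
#   for node in coordinator overlord middleManager; do start_node "$node"; done
```

Running each command in its own terminal window, as the comment suggests, works just as well; the background form only avoids the first command blocking the rest.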

fjy · 2016-02-03T22:39:37Z

@rasahner addressed comments

pjain1 · 2016-02-03T22:58:01Z

👍

rasahner · 2016-02-03T23:01:16Z

docs/content/tutorials/ingestion.md

+We recommend this kind of architecture if you need real-time analytics but *also* need 100% fidelity
+for historical data. All streaming ingestion methods currently supported by Druid do introduce the
+possibility of dropped or duplicated messages in certain failure scenarios, and batch re-ingestion
+eliminates this potential source of error for historical data. This also gives you the option to


The first part of the "also" isn't really an "also": re-ingestion made necessary by possible errors is exactly what has just been discussed. I'd replace both sentences with something like
"Hybrid streaming also gives you the option to re-ingest your data if you need to revise it for any reason."

rasahner · 2016-02-03T23:17:20Z

+1 when author thinks it is ready.

rasahner · 2016-02-03T23:40:23Z

docs/content/tutorials/ingestion.md

+- [Streams-based tutorial](tutorial-streams.html) showing you how to push data over HTTP.
+- [Kafka-based tutorial](tutorial-kafka.html) showing you how to load data from Kafka.
+
+## Hybrid batch/streaming


Sorry if my comments were confusing. Here's my recommended text for this whole section. I think it's not necessary to say anything right here about queries not caring how the data was ingested - it potentially adds more confusion than it takes away.

You can combine batch and streaming methods in a hybrid batch/streaming architecture. In a hybrid architecture, you use a streaming method to do initial ingestion, and then periodically re-ingest older data in batch mode (typically every few hours, or nightly). When Druid re-ingests data for a time range, the new data automatically replaces the data from the earlier ingestion.

All streaming ingestion methods currently supported by Druid do introduce the possibility of dropped or duplicated messages in certain failure scenarios, and batch re-ingestion eliminates this potential source of error for historical data.

Batch re-ingestion also gives you the option to re-ingest your data if you need to revise it for any reason.

rasahner · 2016-02-03T23:41:37Z

I have no other comments.

fjy added the Discuss label Jan 19, 2016

fjy force-pushed the new-tutorials branch from 275c4d7 to c81ee94 Compare January 19, 2016 21:15

fjy force-pushed the new-tutorials branch from abac278 to 9963166 Compare February 3, 2016 00:28

fjy force-pushed the new-tutorials branch from 023f19f to 379f86e Compare February 3, 2016 19:30

fjy force-pushed the new-tutorials branch from cfc2420 to 11e0fee Compare February 3, 2016 22:45

fjy force-pushed the new-tutorials branch from 4b77a8b to bcda499 Compare February 3, 2016 23:04

fjy force-pushed the new-tutorials branch from c8a5ab6 to 5ec13f2 Compare February 3, 2016 23:27

fjy force-pushed the new-tutorials branch 2 times, most recently from f82e1c7 to 067bfda Compare February 4, 2016 01:49

new quickstart

1aa363c

fjy force-pushed the new-tutorials branch from 067bfda to 1aa363c Compare February 4, 2016 17:37

fjy added a commit that referenced this pull request Feb 4, 2016

Merge pull request #2216 from druid-io/new-tutorials

7abad74

Completely rework the Druid getting started process

fjy merged commit 7abad74 into master Feb 4, 2016

fjy deleted the new-tutorials branch February 4, 2016 18:43

fjy mentioned this pull request Feb 5, 2016

druid-0.9.0 release notes #2404

Closed
