log sizes of created smoosh files #3817

Merged
merged 2 commits into from Jan 5, 2017

Projects

None yet

3 participants

@himanshug
Member

just adding logs to primarily understand impact of changing maxRowsInMemory setting during batch ingestion on size of intermediate segments created.

@himanshug himanshug added this to the 0.10.0 milestone Jan 4, 2017
@@ -695,9 +695,12 @@ public void doRun()
for (File file : toMerge) {
indexes.add(HadoopDruidIndexerConfig.INDEX_IO.loadIndex(file));
}
+
+ log.info("starting merge of intermediate persisted segments...");
@drcrallen
drcrallen Jan 4, 2017 Member

can this be debug level?

@himanshug
himanshug Jan 4, 2017 Member

this is only going to get printed once during batch ingestion and most of the time users are not running indexing with debug level. its not going to pollute the logs really.

@drcrallen
drcrallen Jan 4, 2017 Member

Ok, can the formatting be made more consistent with other logging messages. like simply

Starting merge of intermediate persisted segments

@drcrallen
drcrallen Jan 4, 2017 Member

Same request with other logging statements

mergedBase = mergeQueryableIndex(
indexes, aggregators, new File(baseFlushFile, "merged"), progressIndicator
);
+ log.info("finished merge of intermediate persisted segments...");
@drcrallen
drcrallen Jan 4, 2017 Member

same question, can this be debug level?

@himanshug
himanshug Jan 4, 2017 Member

same reason as above

+
+ FileOutputStream outStream = new FileOutputStream(outFile);
+ this.channel = outStream.getChannel();
+ closer.register(outStream);
@drcrallen
drcrallen Jan 4, 2017 Member

can you move the closer to the same place the file output stream is created?

final FileOutputStream outStream = closer.register(new FileOutputStream(outFile));

should work

@@ -494,6 +498,7 @@ public boolean isOpen()
public void close() throws IOException
{
closer.close();
+ FileSmoosher.LOG.info("Created smoosh file [%s] of size [%s] bytes.", outFile.getAbsolutePath(), outFile.length());
@drcrallen
drcrallen Jan 4, 2017 Member

Same question, can this be debug level?

@himanshug
himanshug Jan 4, 2017 Member

same reason.... mostly we look at logs afterwards where most jobs would've run without debug level logs.

@drcrallen drcrallen was assigned by himanshug Jan 4, 2017
@@ -695,9 +695,12 @@ public void doRun()
for (File file : toMerge) {
indexes.add(HadoopDruidIndexerConfig.INDEX_IO.loadIndex(file));
}
+
+ log.info("starting merge of intermediate persisted segments...");
@drcrallen
drcrallen Jan 4, 2017 Member

Ok, can the formatting be made more consistent with other logging messages. like simply

Starting merge of intermediate persisted segments

@drcrallen
drcrallen Jan 4, 2017 Member

Same request with other logging statements

@drcrallen
Member

Minor logging format request, 👍 after that

@drcrallen drcrallen added the Operations label Jan 4, 2017
@himanshug
Member

@drcrallen updated formatting.

mergedBase = mergeQueryableIndex(
indexes, aggregators, new File(baseFlushFile, "merged"), progressIndicator
);
+ log.info("finished merge of intermediate persisted segments.");
@fjy
fjy Jan 4, 2017 Member

can u log the time taken?

@fjy fjy merged commit 7ced0e8 into druid-io:master Jan 5, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment