
SAMZA-2561: Add job features to MetricsHeader #1402

Open: 1 commit to be merged into base branch master

PawasChhokra
Contributor

Issues: Currently, the MetricsHeader object emitted by the SamzaContainer does not include basic job-level information.
Changes: Added several job features to the fields emitted by the MetricsHeader.
API Changes: With this change, the MetricsHeader class will emit additional properties of the job, such as the number of containers used, the number of cores used, etc.
Upgrade instructions: None
Usage instructions: None
Tests: Modified existing tests to work with this change.

@Sanil15 left a comment

Thanks for the changes. Minor comments below.


package org.apache.samza.metrics;

public enum ApiType {

nit: Would SamzaAPI not be a better name?

public static String getApiType(Config config) {
  ApplicationConfig appConfig = new ApplicationConfig(config);
  String appClass = appConfig.getAppClass();
  if (appClass == null || appClass.isEmpty()) {

Is this ever true? For legacy task class applications?

import org.apache.samza.config.TaskConfig;
import org.apache.samza.metrics.ApiType;
import org.apache.samza.metrics.DeploymentType;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;


public class Util {

Not related to your change, but JobInfoUtil would be a better name for this class than just Util.

@@ -59,6 +59,14 @@
private int numPersistentStores = 2;
private int containerNumCores = 2;
private boolean autosizingEnabled = false;
private String deploymentType = "test deployment type";

Consider using valid values for deploymentType and apiType here, so that the util functions you just added are unit tested.

@@ -252,7 +264,17 @@ private void validateMetricsHeader(OutgoingMessageEnvelope outgoingMessageEnvelo
Assert.assertEquals(metricsSnapshot.getHeader().getSamzaVersion(), samzaVersion);
Assert.assertEquals(metricsSnapshot.getHeader().getHost(), hostname);
Assert.assertEquals(metricsSnapshot.getHeader().getSource(), DiagnosticsManager.class.getName());

nit: revert whitespace change

Comment on lines +1 to +24
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.apache.samza.metrics;

public enum DeploymentType {
YARN, STANDALONE
}

@rmatharu-zz, Jul 21, 2020

Single-line enums like these can likely be added to the MetricsHeader class, since that seems to be the only place they're used, no?
The same goes for ApiType above.
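
A minimal sketch of what nesting the enum could look like, assuming the header really is its only user. `HeaderSketch` is a hypothetical stand-in, not the real MetricsHeader; only the `YARN` and `STANDALONE` constants come from the diff above, and ApiType would follow the same pattern.

```java
// Hypothetical stand-in for the metrics header; not the real Samza class.
final class HeaderSketch {
  // Nested here on the assumption that the header is the enum's only user.
  enum DeploymentType {
    YARN, STANDALONE
  }

  private final DeploymentType deploymentType;

  HeaderSketch(DeploymentType deploymentType) {
    this.deploymentType = deploymentType;
  }

  DeploymentType getDeploymentType() {
    return deploymentType;
  }
}
```

Callers would then refer to `HeaderSketch.DeploymentType.YARN`, which keeps the type next to the one class that emits it.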

Comment on lines +166 to +170
public static int getContainerCount(Config config) {
  JobConfig jobConfig = new JobConfig(config);
  return jobConfig.getContainerCount();
}

Why even have it as a util?
Could the caller not wrap the "config" object in JobConfig, TaskConfig and ClusterManagerConfig once and invoke the respective functions, rather than wrapping it again in each separate util?

The same goes for getContainerMemoryMb, getNumCores, getThreadPoolSize, getSspGrouperFactory, getHostAffinityEnabled, getContainerRetryCount, getContainerRetryWindowMs, getMaxConcurrency.
Then none of these "util" methods need to be there.

+1
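
The suggestion in this thread can be sketched as follows. This is an illustration under stated assumptions, not the Samza API: `JobConfigSketch` and `TaskConfigSketch` are hypothetical stand-ins for JobConfig and TaskConfig over a plain map, and the key names and defaults are assumptions.

```java
import java.util.Map;

// Hypothetical typed view over raw config; stands in for JobConfig.
final class JobConfigSketch {
  private final Map<String, String> raw;

  JobConfigSketch(Map<String, String> raw) {
    this.raw = raw;
  }

  int getContainerCount() {
    // Key name and default value are assumptions for this sketch.
    return Integer.parseInt(raw.getOrDefault("job.container.count", "1"));
  }
}

// Hypothetical typed view over raw config; stands in for TaskConfig.
final class TaskConfigSketch {
  private final Map<String, String> raw;

  TaskConfigSketch(Map<String, String> raw) {
    this.raw = raw;
  }

  int getMaxConcurrency() {
    // Key name and default value are assumptions for this sketch.
    return Integer.parseInt(raw.getOrDefault("task.max.concurrency", "1"));
  }
}

// Holds the values a header would carry; hypothetical.
final class HeaderFieldsSketch {
  final int containerCount;
  final int maxConcurrency;

  private HeaderFieldsSketch(int containerCount, int maxConcurrency) {
    this.containerCount = containerCount;
    this.maxConcurrency = maxConcurrency;
  }

  static HeaderFieldsSketch fromConfig(Map<String, String> config) {
    // Wrap the raw config once per typed view, then call the getters directly.
    JobConfigSketch jobConfig = new JobConfigSketch(config);
    TaskConfigSketch taskConfig = new TaskConfigSketch(config);
    return new HeaderFieldsSketch(jobConfig.getContainerCount(), taskConfig.getMaxConcurrency());
  }
}
```

With this shape, adding another header field means one more getter call at the call site instead of one more static util method.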

Comment on lines +211 to +214
public static int getMaxJvmHeapMb() {
  Long maxJvmHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
  return maxJvmHeapMb.intValue();
}

Can DiagnosticsUtil.buildDiagnosticsManager also invoke this util method, instead of calling Runtime.getRuntime().maxMemory() directly?
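
For reference, a shared helper of this kind might look like the sketch below. `JvmInfoSketch` is a hypothetical name, and the clamping is an addition that the diff does not have: `Runtime.maxMemory()` returns `Long.MAX_VALUE` when the JVM has no inherent limit, so narrowing to `int` without a bound could overflow.

```java
// Hypothetical helper class; the name is not from the PR.
final class JvmInfoSketch {
  private static final long BYTES_PER_MB = 1024L * 1024L;

  private JvmInfoSketch() {
  }

  // Maximum heap the JVM will attempt to use, in megabytes.
  static int getMaxJvmHeapMb() {
    long maxHeapMb = Runtime.getRuntime().maxMemory() / BYTES_PER_MB;
    // Clamp before narrowing so an unbounded heap does not overflow the int.
    return (int) Math.min(maxHeapMb, Integer.MAX_VALUE);
  }
}
```

Both the header construction and the diagnostics manager could then call `JvmInfoSketch.getMaxJvmHeapMb()` so the conversion lives in one place.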

Duration terminationDuration, boolean autosizingEnabled) {
Duration terminationDuration,
boolean autosizingEnabled,
String deploymentType,

Should deploymentType and apiType be enums, since we defined enums for them above?
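
One way to keep the value typed from end to end is to resolve the enum at the point where the config string is read. The sketch below is an assumption about how that mapping might work, not the logic in this PR: the factory class constant is copied from the diff, while `DeploymentTypeSketch`, `fromJobFactoryClass`, and the rule that anything other than the YARN factory counts as standalone are hypothetical.

```java
// Hypothetical enum with a factory method; not the PR's DeploymentType.
enum DeploymentTypeSketch {
  YARN, STANDALONE;

  // Constant copied from the diff under review.
  private static final String YARN_JOB_FACTORY_CLASS = "org.apache.samza.job.yarn.YarnJobFactory";

  // Assumed rule: the YARN job factory means YARN; anything else, including
  // a missing value, is treated as standalone.
  static DeploymentTypeSketch fromJobFactoryClass(String jobFactoryClass) {
    return YARN_JOB_FACTORY_CLASS.equals(jobFactoryClass) ? YARN : STANDALONE;
  }
}
```

Constructors downstream can then declare a `DeploymentTypeSketch` parameter instead of a `String`, so an invalid value such as "test deployment type" no longer compiles.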

metricsSnapshot.getHeader().getContainerThreadPoolSize(), metricsSnapshot.getHeader().getHostAffinity(),
metricsSnapshot.getHeader().getSspGrouper(), metricsSnapshot.getHeader().getMaxContainerRetryCount(),
metricsSnapshot.getHeader().getContainerRetryWindowMs(), metricsSnapshot.getHeader().getTaskMaxConcurrency(),
metricsSnapshot.getHeader().getMaxJvmHeapMb());

Can we add a new separate constructor for DiagnosticsStreamMessage that takes a metricsHeader as input?
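
A rough sketch of the requested overload, using hypothetical stand-ins rather than the real MetricsHeader and DiagnosticsStreamMessage. Only three header fields are shown; their names mirror getters that appear in the diff (`getSamzaVersion`, `getHost`, `getMaxJvmHeapMb`).

```java
import java.util.Objects;

// Hypothetical stand-in for the metrics header.
final class MessageHeaderSketch {
  final String samzaVersion;
  final String host;
  final int maxJvmHeapMb;

  MessageHeaderSketch(String samzaVersion, String host, int maxJvmHeapMb) {
    this.samzaVersion = samzaVersion;
    this.host = host;
    this.maxJvmHeapMb = maxJvmHeapMb;
  }
}

// Hypothetical stand-in for the diagnostics stream message.
final class DiagnosticsMessageSketch {
  private final MessageHeaderSketch header;

  // Flat constructor: grows by one argument for every new header field.
  DiagnosticsMessageSketch(String samzaVersion, String host, int maxJvmHeapMb) {
    this(new MessageHeaderSketch(samzaVersion, host, maxJvmHeapMb));
  }

  // Header-based constructor: new header fields do not change this signature.
  DiagnosticsMessageSketch(MessageHeaderSketch header) {
    this.header = Objects.requireNonNull(header, "header");
  }

  MessageHeaderSketch getHeader() {
    return header;
  }
}
```

A caller that already holds a deserialized header can pass it straight through, instead of unpacking every getter into a long argument list.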

Comment on lines +127 to +140
serde,
blacklist,
Util.getDeploymentType(config),
Util.getApiType(config),
Util.getContainerCount(config),
Util.getContainerMemoryMb(config),
Util.getNumCores(config),
Util.getThreadPoolSize(config),
Util.getHostAffinityEnabled(config),
Util.getSspGrouperFactory(config),
Util.getContainerRetryCount(config),
Util.getContainerRetryWindowMs(config),
Util.getMaxConcurrency(config),
Util.getMaxJvmHeapMb)

Please maintain the existing parameter ordering: the other parameters first, then the serde, and then the blacklist.
Similarly for MetricsHeader above, where the time fields are passed in last.

@@ -59,6 +59,14 @@
private int numPersistentStores = 2;
private int containerNumCores = 2;
private boolean autosizingEnabled = false;
private String deploymentType = "test deployment type";
private String apiType = "test api type";

Wasn't this an enum?

@rmatharu-zz left a comment

Took an initial pass; this requires some simplification and cleanup.

@rmatharu-zz

Since a majority of the parameters are derived from config, in the interest of easy future extensibility, it'd be better to simply emit the entire config object once, from DiagnosticsManager at container-startup.
Similar for dependencies.
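
As an illustration of that alternative, the sketch below emits a full config snapshot exactly once. Everything here is hypothetical: `ConfigEmitterSketch` stands in for whatever DiagnosticsManager would do at container startup, and the in-memory list stands in for the diagnostics stream.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical emitter; not part of the Samza API.
final class ConfigEmitterSketch {
  private final Map<String, String> config;
  // Stands in for the diagnostics stream in this sketch.
  private final List<Map<String, String>> emitted = new ArrayList<>();
  private final AtomicBoolean configEmitted = new AtomicBoolean(false);

  ConfigEmitterSketch(Map<String, String> config) {
    // Defensive, sorted copy so the snapshot is stable and easy to diff.
    this.config = new TreeMap<>(config);
  }

  // Intended to be called at container startup; repeated calls are no-ops.
  boolean emitConfigOnce() {
    if (!configEmitted.compareAndSet(false, true)) {
      return false;
    }
    emitted.add(Collections.unmodifiableMap(config));
    return true;
  }

  List<Map<String, String>> getEmitted() {
    return Collections.unmodifiableList(emitted);
  }
}
```

Consumers of the stream could then derive container count, memory, retry settings and similar values from the snapshot, so new fields would not require header changes.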

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;


public class Util {
private static final Logger LOG = LoggerFactory.getLogger(Util.class);
private static final String YARN_JOB_FACTORY_CLASS = "org.apache.samza.job.yarn.YarnJobFactory";
private static final String BEAM_RUNNER_CLASS = "org.apache.beam.runners.samza.SamzaRunner";

Is this actually needed given #1185?

@rmatharu-zz left a comment

Comment about #1185

3 participants