Skip to content
This repository has been archived by the owner on Jun 7, 2021. It is now read-only.

APEXCORE-754 Add plugin dependency jar-files to application package #555

Merged
merged 1 commit into from Jul 13, 2017

Conversation

sgolovko
Copy link
Contributor

Included plugin jar-files into the application package and added names of the plugin jar-files to the application classpath.

@vrozov @PramodSSImmaneni Could you please review the PR?

List<String> list = new LinkedList<>();

try {
URLConnection conn = clazz.getResource(clazz.getSimpleName() + ".class").openConnection();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use JarHelper.getJar().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I cannot use the implementation of the method JarHelper.getJar(). The method getJar() creates jar file if the method parameter Java class does not belong to any jar-file. And it does not satisfy to the PR requirements (please talk to @PramodSSImmaneni about the PR requirements).

Logically the implementation of the method getJar() is equal to the following 2 lines of Java code:

URLConnection conn = clazz.getResource(clazz.getSimpleName() + ".class").openConnection();
return conn instanceof JarURLConnection ? ((JarURLConnection)conn).getJarFileURL().getFile() : null;

The method getJar() returns a name of jar-file as a string. But the implementation of the method getApexDependentJarsByClass() should continue to work with URLConnection object to retrieve the manifest properties.

I think the code will look very ugly, if after getting of the jar-file name we need to recreate the URLConnection object from string again. It does not make sense.

Also in order to use the method JarHelper.getJar() the code should create an instance of the class JarHelper. But logically the method getJar() should be static , because it does not depend on any environment. (The implementation of the method contains caching of jar-name that is not used by the implementation of the method getApexDependentJarsByClass()).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no requirement not to create a jar file if getJar() is called for a class that is located on the local file system and does not belong to an existing jar file. I don't see why such requirement will exist and why for certain classes jar should be created on the fly and some classes require pre-existing jars. The behavior must be consistent across all classes that Apex loads.

Loading of the manifest file should be added to the getJar(). Ability to specify extra dependencies using "apex-dependencies" (or Class-Path) should not be limited to a specific jars or classes. It needs to be a generic feature available to all classes that Apex loads.

There is no need to make getJar() static. There is only one instance of JarHelper created.

String jarPath = ((JarURLConnection)conn).getJarFileURL().getFile();
list.add(jarPath);
logger.debug("Jar-file {} that contains the class {} was added", jarPath, clazz.getCanonicalName());
String value = ((JarURLConnection)conn).getMainAttributes().getValue("apex-dependencies");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be conn.getManifest().getMainAttributes()?
Also, instead of apex-dependencies can we keep the attribute name as Class-Path which is usually the default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of the method JarURLConnection.getMainAttributes() returns the manifest attributes.

Unfortunately we cannot use the manifest attribute Class-Path, because the purpose of the new property to deploy the corresponded jar-files. But the attribute Class-Path points to the used jars. For instance, you can look at the content of the attribute Class-Path in apex jar-files to see how many jars are included by Maven.


if (!StringUtils.isEmpty(value)) {
String folderPath = new File(jarPath).getParent();
for (String jarFile : value.split(",")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the classpath entries would be space separated if we go with the default..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not going to reuse the attribute Class-Path.

if (!StringUtils.isEmpty(value)) {
String folderPath = new File(jarPath).getParent();
for (String jarFile : value.split(",")) {
String file = folderPath + File.separator + jarFile;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This logic seems a little hard-bound. So basically its expected that user should put the dependent jars in the same location as the plugin jar. This may not be true.

2 This adds a question related to what should be the path in "apex-dependencies"?? Relative? absolute? just jar name?

  1. Can you please check if there is enough control in maven plugin to have "apex-dependencies" comma seperated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the agreement, the dependent jar-file should be located in the folder or subfolders that contains the parent jar-file. And the path of the dependent jar-files must be relative.

Maven allows a user to add any attributes that should be included into the jar manifest. And the attribute values can have commas inside the values.

@@ -338,7 +344,7 @@ public void startApplication() throws YarnException, IOException
throw new IllegalStateException(applicationType + " is not a valid application type.");
}

LinkedHashSet<String> localJarFiles = findJars(dag, defaultClasses);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? If this is not related to this PR, please undo it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the comments above.

@@ -179,7 +179,7 @@ public void stop()
yarnClient.stop();
}

public static LinkedHashSet<String> findJars(LogicalPlan dag, Class<?>[] defaultClasses)
public LinkedHashSet<String> findJars(Class<?>[] defaultClasses)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? If this is not related to this PR, please undo it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logically the method findJars() is an internal method of the class StramClient. The PR implementation has to use an extra private property of the class StramClient: Configuration conf. So I think it is much better to change the method to dynamic and exclude dag and configuration from the method signature.

@@ -604,6 +604,8 @@ protected void serviceInit(Configuration conf) throws Exception
}

public static final String PLUGINS_CONF_KEY = "apex.plugin.stram.plugins";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tushargosavi @PramodSSImmaneni @chinmaykolhatkar Please rename to "apex.engine.plugins". My concern is using plugin twice and referring to stram.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, is it necessary to have a separate entry for setup and engine plugins? Can't platform distinguish them based on interfaces the class implements.

@sgolovko
Copy link
Contributor Author

Based on our agreement I implemented the new method getJars() that takes two extra boolean parameters to provide the explicit behavior: makeJarFromFolder and addJarDependencies. The signature and behavior of the old method getJar() was not changed.

@vrozov @PramodSSImmaneni Please review the PR.


if (jarPath == null) {
try {
URLConnection conn = clazz.getResource(clazz.getSimpleName() + ".class").openConnection();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? Why does the class need to reload itself?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

*/
public List<String> getJars(Class<?> clazz, boolean makeJarFromFolder, boolean addJarDependencies)
{
CodeSource codeSource = clazz.getProtectionDomain().getCodeSource();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the original code unless there is a reason to change it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


LOG.info("Local jar file dependencies: " + localJarFiles);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use sl4j logging

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't update this line. But anyway it is fixed.

* @param addJarDependencies True if the method should include dependent jar files
* @return List of names of the jar-files
*/
public List<String> getJars(String classPath, boolean makeJarFromFolder, boolean addJarDependencies)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding a class by its name is not the functionality provided by getJars(). It should be outside of getJars().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

String pluginClassesPaths = conf.get(StreamingAppMasterService.PLUGINS_CONF_KEY);
if (!StringUtils.isEmpty(pluginClassesPaths)) {
for (String pluginClassPath : StringUtils.splitByWholeSeparator(pluginClassesPaths, StreamingAppMasterService.PLUGINS_CONF_SEP)) {
localJarFiles.addAll(jarHelper.getJars(pluginClassPath, true, true));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There must not be any difference in behavior between plugin classes and any other classes. In all cases it should use getJars(class, true, true) and please make it the default behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

*/
private void getDependentJarsFromManifest(JarURLConnection conn, String jarPath, List<String> list) throws IOException
{
String value = ((JarURLConnection)conn).getMainAttributes().getValue("apex-dependencies");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

define "apex-dependencies" as public static final String.
cast to (JarURLConnection) is not necessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

* @param list List of target jar-files
* @throws IOException
*/
private void getDependentJarsFromManifest(JarURLConnection conn, String jarPath, List<String> list) throws IOException
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider using JarFile instead of JarURLConnection
I would expect the getDependentJarsFromManifest to return a list of the dependent jars, instead of accepting a list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@sgolovko
Copy link
Contributor Author

@vrozov The updates are in the PR. Please review.

* @param addJarDependencies True if the method should include dependent jar files
* @return List of names of the jar-files
*/
public List<String> getJars(Class<?> jarClass, boolean makeJarFromFolder, boolean addJarDependencies)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be a Set? Duplicates should not be allowed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this code the list can contain unique values only: the name of the parent jar files and names of dependent jar files. But the Java implementations of List is simpler than the implementations of Set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is uniqueness guaranteed? You can't rely on apex-dependencies to be unique. There is no need to implement Set or List, the complexity of implementation is hidden.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the developers/maven will put duplicates in the apex dependent property. Even if it happens it is not big deal. For instance the implementation of the method find() will remove duplicates. The major number of duplicates will come from the apex dependent properties for several parent jars.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a public API and returning a List is wrong. It should be a Set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

*/
public List<String> getJars(Class<?> jarClass, boolean makeJarFromFolder, boolean addJarDependencies)
{
List<String> list = new LinkedList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big deal, but better to allocate List/Set after calling getJar() to avoid extra object creation in case of an exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

list.add(jar);
}

if (addJarDependencies) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be inside if (jar != null) check, otherwise subject to NullPointerException.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check (jar == null) is inside the implementation of the method getJars().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in case getJar() returns null (for a system library), it will not be added to the list due to check that jar != null, but if addJarDependencies is true, the code will call new JarFile(null).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

{
String value = jarFile.getManifest().getMainAttributes().getValue(APEX_DEPENDENCIES);
if (!StringUtils.isEmpty(value)) {
String folderPath = new File(jarFile.getName()).getParent();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a Path not a String?

Path folderPath = Paths.get(jarFile.getName()).getParent()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should this Path. The value of the parent folder is used as a string for concatenation of strings only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Path abstracts concatenation, getting absolute path, creating children. All operations done are Path operations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@vrozov
Copy link
Member

vrozov commented Jul 13, 2017

@sgolovko Are the latest changes in?

set.add(jar);
}

if (addJarDependencies && (jar != null)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this block inside if on line 136.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

set.add(file.getPath());
logger.debug("Jar-file {} was added as a dependent jar", file.getPath());
} else {
logger.warn("Jar-file {} does not exist", file.getPath());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be good to provide details regarding original Jar that has a missing dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Included plugin jar-files into the application package and added names of the plugin jar-files to the application classpath.
@vrozov vrozov merged commit 5994a0b into apache:master Jul 13, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
4 participants