APEXCORE-754 Add plugin dependency jar-files to application package #555
Conversation
List<String> list = new LinkedList<>(); | ||
|
||
try { | ||
URLConnection conn = clazz.getResource(clazz.getSimpleName() + ".class").openConnection(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use JarHelper.getJar().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I cannot use the implementation of the method JarHelper.getJar(). The method getJar() creates jar file if the method parameter Java class does not belong to any jar-file. And it does not satisfy to the PR requirements (please talk to @PramodSSImmaneni about the PR requirements).
Logically the implementation of the method getJar() is equal to the following 2 lines of Java code:
URLConnection conn = clazz.getResource(clazz.getSimpleName() + ".class").openConnection();
return conn instanceof JarURLConnection ? ((JarURLConnection)conn).getJarFileURL().getFile() : null;
The method getJar() returns a name of jar-file as a string. But the implementation of the method getApexDependentJarsByClass() should continue to work with URLConnection object to retrieve the manifest properties.
I think the code will look very ugly, if after getting of the jar-file name we need to recreate the URLConnection object from string again. It does not make sense.
Also in order to use the method JarHelper.getJar() the code should create an instance of the class JarHelper. But logically the method getJar() should be static , because it does not depend on any environment. (The implementation of the method contains caching of jar-name that is not used by the implementation of the method getApexDependentJarsByClass()).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no requirement not to create a jar file if getJar() is called for a class that is located on the local file system and does not belong to an existing jar file. I don't see why such requirement will exist and why for certain classes jar should be created on the fly and some classes require pre-existing jars. The behavior must be consistent across all classes that Apex loads.
Loading of the manifest file should be added to the getJar(). Ability to specify extra dependencies using "apex-dependencies" (or Class-Path) should not be limited to a specific jars or classes. It needs to be a generic feature available to all classes that Apex loads.
There is no need to make getJar() static. There is only one instance of JarHelper created.
String jarPath = ((JarURLConnection)conn).getJarFileURL().getFile(); | ||
list.add(jarPath); | ||
logger.debug("Jar-file {} that contains the class {} was added", jarPath, clazz.getCanonicalName()); | ||
String value = ((JarURLConnection)conn).getMainAttributes().getValue("apex-dependencies"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be conn.getManifest().getMainAttributes()
?
Also, instead of apex-dependencies
can we keep the attribute name as Class-Path
which is usually the default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of the method JarURLConnection.getMainAttributes() returns the manifest attributes.
Unfortunately we cannot use the manifest attribute Class-Path, because the purpose of the new property to deploy the corresponded jar-files. But the attribute Class-Path points to the used jars. For instance, you can look at the content of the attribute Class-Path in apex jar-files to see how many jars are included by Maven.
|
||
if (!StringUtils.isEmpty(value)) { | ||
String folderPath = new File(jarPath).getParent(); | ||
for (String jarFile : value.split(",")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the classpath entries would be space separated if we go with the default..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not going to reuse the attribute Class-Path.
if (!StringUtils.isEmpty(value)) { | ||
String folderPath = new File(jarPath).getParent(); | ||
for (String jarFile : value.split(",")) { | ||
String file = folderPath + File.separator + jarFile; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This logic seems a little hard-bound. So basically its expected that user should put the dependent jars in the same location as the plugin jar. This may not be true.
2 This adds a question related to what should be the path in "apex-dependencies"?? Relative? absolute? just jar name?
- Can you please check if there is enough control in maven plugin to have "apex-dependencies" comma seperated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the agreement, the dependent jar-file should be located in the folder or subfolders that contains the parent jar-file. And the path of the dependent jar-files must be relative.
Maven allows a user to add any attributes that should be included into the jar manifest. And the attribute values can have commas inside the values.
@@ -338,7 +344,7 @@ public void startApplication() throws YarnException, IOException | |||
throw new IllegalStateException(applicationType + " is not a valid application type."); | |||
} | |||
|
|||
LinkedHashSet<String> localJarFiles = findJars(dag, defaultClasses); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? If this is not related to this PR, please undo it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the comments above.
@@ -179,7 +179,7 @@ public void stop() | |||
yarnClient.stop(); | |||
} | |||
|
|||
public static LinkedHashSet<String> findJars(LogicalPlan dag, Class<?>[] defaultClasses) | |||
public LinkedHashSet<String> findJars(Class<?>[] defaultClasses) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? If this is not related to this PR, please undo it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logically the method findJars() is an internal method of the class StramClient. The PR implementation has to use an extra private property of the class StramClient: Configuration conf. So I think it is much better to change the method to dynamic and exclude dag and configuration from the method signature.
@@ -604,6 +604,8 @@ protected void serviceInit(Configuration conf) throws Exception | |||
} | |||
|
|||
public static final String PLUGINS_CONF_KEY = "apex.plugin.stram.plugins"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tushargosavi @PramodSSImmaneni @chinmaykolhatkar Please rename to "apex.engine.plugins". My concern is using plugin twice and referring to stram.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, is it necessary to have a separate entry for setup and engine plugins? Can't platform distinguish them based on interfaces the class implements.
Based on our agreement I implemented the new method getJars() that takes two extra boolean parameters to provide the explicit behavior: makeJarFromFolder and addJarDependencies. The signature and behavior of the old method getJar() was not changed. @vrozov @PramodSSImmaneni Please review the PR. |
|
||
if (jarPath == null) { | ||
try { | ||
URLConnection conn = clazz.getResource(clazz.getSimpleName() + ".class").openConnection(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this necessary? Why does the class need to reload itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
*/ | ||
public List<String> getJars(Class<?> clazz, boolean makeJarFromFolder, boolean addJarDependencies) | ||
{ | ||
CodeSource codeSource = clazz.getProtectionDomain().getCodeSource(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep the original code unless there is a reason to change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
LOG.info("Local jar file dependencies: " + localJarFiles); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use sl4j logging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't update this line. But anyway it is fixed.
* @param addJarDependencies True if the method should include dependent jar files | ||
* @return List of names of the jar-files | ||
*/ | ||
public List<String> getJars(String classPath, boolean makeJarFromFolder, boolean addJarDependencies) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finding a class by its name is not the functionality provided by getJars(). It should be outside of getJars().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
String pluginClassesPaths = conf.get(StreamingAppMasterService.PLUGINS_CONF_KEY); | ||
if (!StringUtils.isEmpty(pluginClassesPaths)) { | ||
for (String pluginClassPath : StringUtils.splitByWholeSeparator(pluginClassesPaths, StreamingAppMasterService.PLUGINS_CONF_SEP)) { | ||
localJarFiles.addAll(jarHelper.getJars(pluginClassPath, true, true)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There must not be any difference in behavior between plugin classes and any other classes. In all cases it should use getJars(class, true, true) and please make it the default behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
*/ | ||
private void getDependentJarsFromManifest(JarURLConnection conn, String jarPath, List<String> list) throws IOException | ||
{ | ||
String value = ((JarURLConnection)conn).getMainAttributes().getValue("apex-dependencies"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
define "apex-dependencies" as public static final String.
cast to (JarURLConnection) is not necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
* @param list List of target jar-files | ||
* @throws IOException | ||
*/ | ||
private void getDependentJarsFromManifest(JarURLConnection conn, String jarPath, List<String> list) throws IOException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider using JarFile instead of JarURLConnection
I would expect the getDependentJarsFromManifest to return a list of the dependent jars, instead of accepting a list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@vrozov The updates are in the PR. Please review. |
* @param addJarDependencies True if the method should include dependent jar files | ||
* @return List of names of the jar-files | ||
*/ | ||
public List<String> getJars(Class<?> jarClass, boolean makeJarFromFolder, boolean addJarDependencies) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be a Set? Duplicates should not be allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this code the list can contain unique values only: the name of the parent jar files and names of dependent jar files. But the Java implementations of List is simpler than the implementations of Set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is uniqueness guaranteed? You can't rely on apex-dependencies to be unique. There is no need to implement Set or List, the complexity of implementation is hidden.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the developers/maven will put duplicates in the apex dependent property. Even if it happens it is not big deal. For instance the implementation of the method find() will remove duplicates. The major number of duplicates will come from the apex dependent properties for several parent jars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a public API and returning a List is wrong. It should be a Set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
*/ | ||
public List<String> getJars(Class<?> jarClass, boolean makeJarFromFolder, boolean addJarDependencies) | ||
{ | ||
List<String> list = new LinkedList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal, but better to allocate List/Set after calling getJar() to avoid extra object creation in case of an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
list.add(jar); | ||
} | ||
|
||
if (addJarDependencies) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be inside if (jar != null) check, otherwise subject to NullPointerException.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check (jar == null) is inside the implementation of the method getJars().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case getJar() returns null (for a system library), it will not be added to the list due to check that jar != null, but if addJarDependencies is true, the code will call new JarFile(null)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
{ | ||
String value = jarFile.getManifest().getMainAttributes().getValue(APEX_DEPENDENCIES); | ||
if (!StringUtils.isEmpty(value)) { | ||
String folderPath = new File(jarFile.getName()).getParent(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a Path not a String?
Path folderPath = Paths.get(jarFile.getName()).getParent()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should this Path. The value of the parent folder is used as a string for concatenation of strings only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Path abstracts concatenation, getting absolute path, creating children. All operations done are Path operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@sgolovko Are the latest changes in? |
set.add(jar); | ||
} | ||
|
||
if (addJarDependencies && (jar != null)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this block inside if on line 136.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
set.add(file.getPath()); | ||
logger.debug("Jar-file {} was added as a dependent jar", file.getPath()); | ||
} else { | ||
logger.warn("Jar-file {} does not exist", file.getPath()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be good to provide details regarding original Jar that has a missing dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Included plugin jar-files into the application package and added names of the plugin jar-files to the application classpath.
Included plugin jar-files into the application package and added names of the plugin jar-files to the application classpath.
@vrozov @PramodSSImmaneni Could you please review the PR?