[FEA] Detect multiple versions of the RAPIDS jar on the classpath at the same time #7309
Comments
This just happened again to another customer. I filed NVIDIA/spark-rapids-tools#266 in tools to try to warn people about it, but if we lose a day working with a customer because of a simple mistake like this, it really would be nice to just have the plugin crash ahead of time to let you know something is wrong (possibly with a config to bypass the check).
Scope: count the number of Spark RAPIDS jars on the classpath; if there is more than one, fail with an error message.
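The scoped check above could be sketched like this. This is a minimal illustration, not the plugin's actual code: the class name `RapidsJarCheck` is hypothetical, and it assumes the `rapids4spark-version-info.properties` resource (shipped in each plugin jar, as shown in the Scala shell exploration below) can serve as the marker to count.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class RapidsJarCheck {
    // Marker resource assumed to exist exactly once per plugin jar.
    static final String MARKER = "rapids4spark-version-info.properties";

    /** Returns the URL of every copy of the marker resource visible to the class loader. */
    static List<URL> findRapidsJars(ClassLoader cl) throws IOException {
        return Collections.list(cl.getResources(MARKER));
    }

    public static void main(String[] args) throws IOException {
        List<URL> urls = findRapidsJars(RapidsJarCheck.class.getClassLoader());
        // More than one hit means more than one plugin jar is on the classpath.
        if (urls.size() > 1) {
            throw new IllegalStateException(
                "Multiple RAPIDS jars found on the classpath: " + urls);
        }
    }
}
```

A real implementation would presumably gate the failure behind a config so users can opt out, as suggested above.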
Here is an approach explored in the Scala shell that could be implemented somewhere around ShimLoader. For example, with several rapids jars in the ivy cache:

```shell
~/dist/spark-3.3.0-bin-hadoop3/bin/spark-shell --jars $(find ~/.ivy2/jars/ -name '*rapids-4-spark_2.12*.jar' | xargs printf "%s,")
```

```scala
import scala.collection.JavaConverters._

val cl = classOf[com.nvidia.spark.rapids.SparkShimServiceProvider].getClassLoader
val rapidsJarURLs = cl.getResources("rapids4spark-version-info.properties").asScala.toList

// Read the contents of each version-info file found on the classpath
rapidsJarURLs.map { url => scala.io.Source.fromInputStream(url.openStream()).mkString("") }
```

This is not going to work when the older jar is the first thing on the classpath, but at some point we can hopefully get to the point where the jars that don't have a check are not used, so then we can reliably detect this.

Is your feature request related to a problem? Please describe.
It would really be nice if we had a way to detect that there were multiple versions of the plugin jar on the classpath. By default this should result in an error when we try to load, because the user might not have the version that they expect. We can have a config to allow it to keep running in this situation.
Describe the solution you'd like
The class loader can return an enumeration of multiple URLs for a resource when there are duplicates on the classpath. We could look for `cudf-java-version-info.properties`, `rapids4spark-version-info.properties`, and `spark-rapids-jni-version-info.properties`. If we find multiple copies of any of these on the classpath, at a minimum we need to warn, and by default we probably should throw an exception. We could even go as far as reading these files and including what differs in the error message.
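The read-and-report idea above could look roughly like the sketch below. The class and method names are hypothetical, and it assumes the three version-info resources named above are plain `java.util.Properties` files; the real plugin would decide separately whether to warn or throw based on the report.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

public class DuplicateResourceReport {
    // The three version-info resources mentioned above (assumed names).
    static final String[] MARKERS = {
        "cudf-java-version-info.properties",
        "rapids4spark-version-info.properties",
        "spark-rapids-jni-version-info.properties"
    };

    /**
     * For each marker resource that appears more than once on the classpath,
     * records one line per copy with its URL and parsed contents, so the
     * differences between the duplicate jars can be shown in an error message.
     */
    static List<String> duplicateReport(ClassLoader cl) throws IOException {
        List<String> report = new ArrayList<>();
        for (String marker : MARKERS) {
            List<URL> urls = Collections.list(cl.getResources(marker));
            if (urls.size() > 1) {
                for (URL url : urls) {
                    Properties props = new Properties();
                    try (InputStream in = url.openStream()) {
                        props.load(in);
                    }
                    report.add(marker + " @ " + url + " -> " + props);
                }
            }
        }
        return report;
    }
}
```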