Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYSTEMML-1118]Updated to use JCuda 0.8.0 #291

Closed
wants to merge 1 commit into from

Conversation

nakul02
Copy link
Member

@nakul02 nakul02 commented Nov 18, 2016

This PR updates systemml to use cuda 8 and cudnn 5.1.
Use has been significantly simplified compared to the steps listed here.

The user need only invoke maven to make the package and the cuda native shared libraries will be available wrapped in jars in the classpath without the need to set any environment variables or extra java options. (tested on a Centos 7 x86_64 system).

mavenized-jcuda and jcuda packages have been excluded in the shade plugin to stop them from being included in the package. This decision can be revisited in the future.

Note - This is a WIP. We still await the availability of a repo from where we can fetch mavenized-jcuda with cuda 8.

@nakul02
Copy link
Member Author

nakul02 commented Nov 18, 2016

Ping @niketanpansare

@dusenberrymw
Copy link
Contributor

+1 on the update to the latest cuda + cudnn.

@niketanpansare
Copy link
Contributor

@nakul02 Can we please close this PR and use #307 instead ?

@bertholdreinwald
Copy link
Contributor

but let's keep native BLAS and GPU backend separate.

@niketanpansare
Copy link
Contributor

niketanpansare commented Dec 23, 2016

Can we please move this discussion to #307 ? The PR addresses:

  1. the deployment related issue for BLAS as well as JCuda. It also allows users to selectively turn-off BLAS as well as JCuda and always defaults to Java library. Also, it handles heterogeneity of the cluster and is one of the required step to make progress for distributed GPU backend.
  2. avoids the need for -gpu flag and provides a common interface irrespective of the programmatic APIs such as MLContext, DSL, mllearn, etc

@niketanpansare
Copy link
Contributor

@nakul02 I take back my earlier comment. For now, let's use the approach in your PR.

LGTM after resolving conflicting files 👍

@nakul02
Copy link
Member Author

nakul02 commented Mar 2, 2017

This PR is now updated to use jcuda8 artifacts available on maven central.
@niketanpansare - could you please try this on your windows machine?
@asurve - for now, I am including the jcuda* artifacts in the uber/package jar. If this is not ok, I can change this.

@nakul02 nakul02 changed the title [SYSTEMML-1118][WIP] Updated to use JCuda 0.8.0 [SYSTEMML-1118]Updated to use JCuda 0.8.0 Mar 2, 2017
@niketanpansare
Copy link
Contributor

LGTM 👍

@deroneriksson
Copy link
Member

@nakul02 Thank you for helping getting jcuda deployed to maven central! This will be of great assistance to many people out there.

@asurve
Copy link
Member

asurve commented Mar 2, 2017

I would not recommend to have non-SystemML related code in Uber jar file. We do have AntLR and wink files in Uber jar, but I am not sure if thats right approach. Unless you convince other way, I would prefer to have jcuda files outside Uber jar.

<dependency>
<groupId>org.jcuda</groupId>
<artifactId>jcuda</artifactId>
<version>0.7.5b</version>
<scope>provided</scope>
<version>${jcuda.version}</version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nakul02 can you please mark all jcuda related artifacts as "provided" dependency ? The PR increases the size of SystemML.jar from 4.7 MB to 6.4 MB.

@nakul02
Copy link
Member Author

nakul02 commented Mar 2, 2017

@asurve- I suggested adding the jcuda jars to the Uber jar because those jars may not be available in the environment they are run in (Spark)
The obvious disadvantages are related to release process (Per platform, testing on those platforms, licenses).

@nakul02
Copy link
Member Author

nakul02 commented Mar 2, 2017

If the scope is set to provided, it becomes the user's responsibility to download the jcuda jars (there are 16 jars). This makes using a deployment a pain for the user.

On the other hand, (as @niketanpansare and I spoke offline about), the jar becomes bigger and for a feature that is not used every single time, maybe making the jar bigger is not a good idea.

@lresende, @deroneriksson, @bertholdreinwald - do you have any comments/thoughts?
If not, I'll do what @niketanpansare suggested.

@lresende
Copy link
Member

lresende commented Mar 2, 2017

Assuming jcuda is compatible with Apache License, I see two possible solutions :

  • in the binary distro, have something like lib-cuda and have the necessary libs for platforms there
  • the other approach, might be just help the users with a simple script that might download the necessary dependencies and put on the right place for the user.

I would prefer not to have this in the "fat-jar" particularly because of size and platform dependency. But then we might need to add things to classpath in order to run systemml with GPU support.

@nakul02
Copy link
Member Author

nakul02 commented Mar 4, 2017

@asurve, @lresende - excluded jcuda from jar and uber jar. The user will be responsible for providing them.

@asurve
Copy link
Member

asurve commented Mar 4, 2017

Thanks @nakul02 LGTM

@nakul02
Copy link
Member Author

nakul02 commented Mar 4, 2017

Thanks @asurve.

Docs will be available in https://github.com/apache/incubator-systemml/blob/master/docs/devdocs/gpu-backend.md.

Will merge.

@asfgit asfgit closed this in 3757995 Mar 4, 2017
asfgit pushed a commit that referenced this pull request Mar 6, 2017
j143-zz pushed a commit to j143-zz/systemml that referenced this pull request Nov 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants