QUDA with MILC #313

Drink7 · 2015-07-10T05:03:49Z

I have a problem using QUDA with MILC.
I'm building the MILC application ks_imp_rhmc, and my Makefile settings are below.
# Compiler
CC = mpicc
# Linker
LD = mpicxx
# QUDA options
WANTQUDA = true
WANT_GF_GPU = true

The other QUDA options are commented out.
An executable is generated, but it does not seem to run on the GPU.
I also don't know what some of the warning messages mean. I see these two warnings in the log:

WARNING: Failed to determine NUMA affinity for device 0 (possibly not applicable)
WARNING: Cache file not found. All kernels will be re-tuned (if tuning is enabled).

Can someone help me? Thanks.
(My QUDA version is 0.7.1.)

AlexVaq · 2015-07-10T05:20:26Z

Well, these two warning messages look like they were generated by QUDA. What makes you think you're not running on the GPU? Did you try logging in to the node where the job was running and calling nvidia-smi to see the GPU workload?

Also take into account that, if there is no cache file, the first run might take a while...
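
A single nvidia-smi snapshot may fall in a stretch where no kernel is running, so it can help to sample utilization repeatedly while the job runs. Below is a minimal sketch of that check, assuming nvidia-smi is on the PATH of the compute node; the function names are made up for illustration and are not part of QUDA or MILC.

```python
import subprocess

# Standard nvidia-smi query options: one CSV row per GPU, no header, no units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def _to_int(field):
    """Convert a CSV field to int; return None for values such as [N/A]."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_gpu_csv(text):
    """Parse 'index, utilization, memory' rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        index, util, mem = line.split(",")
        rows.append({
            "index": _to_int(index),
            "util_percent": _to_int(util),
            "mem_used_mib": _to_int(mem),
        })
    return rows


def sample_gpus():
    """Run nvidia-smi once and return the parsed rows (hypothetical helper)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(out.stdout)
```

Calling sample_gpus() once per second during the run and looking at the highest util_percent seen gives a better picture than one snapshot. Memory in use on the device while the job is active is another hint that the process has attached to the GPU.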

Drink7 · 2015-07-10T08:08:25Z

The GPU card I use is a Tesla K40. I called nvidia-smi to see the workload and got this picture.

Because GPU-Util shows 0%, I think the executable is not running correctly on the GPU.

AlexVaq · 2015-07-10T08:14:22Z

Which part of QUDA are you trying to use exactly? Did you enable VERBOSE output?

Drink7 · 2015-07-10T08:47:19Z

The application ks_imp_rhmc seems to use the HISQ fermion force and the gauge tools. I built QUDA with the configure file named configure.milc.titan from the QUDA directory.
That configure file doesn't enable VERBOSE, so I'll add it and build again.

AlexVaq · 2015-07-10T08:51:34Z

It might give you some info, but I don’t guarantee anything. To tell you the truth, I’m not familiar at all with the MILC code. However, I contributed to the gauge tools... What exactly are you using from the gauge tools?

Drink7 · 2015-07-10T09:07:00Z

In the MILC code, the README file for the ks_imp_rhmc application says that the measurements include the plaquette..., so I added the flag to the configure file. But I'm not sure whether it will have any effect on the executable that ks_imp_rhmc generates.

mathiaswagner · 2015-07-10T11:02:09Z

Which version of MILC and QUDA do you use?

Can you share your MILC Makefile and QUDA make.inc ?

mathiaswagner · 2015-07-10T11:04:27Z

The NUMA affinity message is from QUDA. It is a known issue that might affect performance on some systems.

Drink7 · 2015-07-10T11:35:53Z

I use the latest versions: 7.7.11 of MILC and 0.7.1 of QUDA.
Here is my MILC Makefile:
http://codepad.org/2EaY2rJp
QUDA make.inc:
http://codepad.org/9cG6fxmQ

mathiaswagner · 2015-07-10T13:17:34Z

Thanks. I will try to have a look later.

Did you try to run one of the test input files in the ks_imp_rhmc/test directory? Which binary exactly did you use in the ks_imp_rhmc directory? su3_rhmc_hisq?

mathiaswagner · 2015-07-10T13:47:04Z

From my first look at your Makefile:

You only offload the gauge force to the GPU. Everything else is kept on the CPU. That should explain why your GPU is idle most of the time. I assume the code is running on the GPU, but only for the gauge force.

Things you can check to verify it is running on the GPU:

At the end of the run, QUDA should print some timing information:

computeGaugeForceQuda Total time = 6.55167 secs
download     = 3.397669 secs (  51.9%), with       12 calls at 2.831391e+05 us per call
upload     = 1.160581 secs (  17.7%), with        6 calls at 1.934302e+05 us per call
init     = 0.042459 secs ( 0.648%), with       12 calls at 3.538250e+03 us per call
compute     = 1.926511 secs (  29.4%), with        6 calls at 3.210852e+05 us per call
free     = 0.020827 secs ( 0.318%), with        6 calls at 3.471167e+03 us per call
constant     = 0.003388 secs (0.0517%), with       12 calls at 2.823333e+02 us per call
total accounted       = 6.551435 secs (   100%)
total missing         = 0.000236 secs (0.0036%)

For the calls of the gauge force, you should see lines similar to:

QUDA_MILC_INTERFACE: qudaGaugeForce (called)
QUDA_MILC_INTERFACE: qudaGaugeForce (return)
GFTIME:   time = 1.507170e+00 (Symanzik1_QUDA) mflops = 1.064487e+05
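
The two checks above can be scripted against a saved output file. This is a minimal sketch, assuming the log lines look exactly like the samples quoted in this comment (other QUDA or MILC versions may format them differently) and assuming that a GFTIME label ending in _QUDA marks a gauge force computed by QUDA, as in the sample line. The function names are illustrative only.

```python
import re

# e.g. "download     = 3.397669 secs (  51.9%), with       12 calls at ..."
PROFILE_LINE = re.compile(
    r"^\s*(\w+)\s*=\s*([0-9.eE+-]+)\s+secs\s*\(\s*([0-9.eE+-]+)%\),"
    r"\s*with\s+(\d+)\s+calls"
)
# e.g. "GFTIME:   time = 1.507170e+00 (Symanzik1_QUDA) mflops = 1.064487e+05"
GFTIME_LINE = re.compile(
    r"GFTIME:\s*time\s*=\s*(\S+)\s*\((\w+)\)\s*mflops\s*=\s*(\S+)"
)


def parse_profile(log_text):
    """Collect the per-phase rows of a QUDA timing summary into a dict."""
    phases = {}
    for line in log_text.splitlines():
        match = PROFILE_LINE.match(line)
        if match:
            name, secs, percent, calls = match.groups()
            phases[name] = {
                "secs": float(secs),
                "percent": float(percent),
                "calls": int(calls),
            }
    return phases


def transfer_and_compute(phases):
    """Return (host/device transfer seconds, compute seconds)."""
    transfer = sum(phases.get(k, {}).get("secs", 0.0)
                   for k in ("download", "upload"))
    compute = phases.get("compute", {}).get("secs", 0.0)
    return transfer, compute


def gauge_force_used_quda(log_text):
    """True if any GFTIME line carries a label ending in _QUDA."""
    return any(m.group(2).endswith("_QUDA")
               for m in GFTIME_LINE.finditer(log_text))
```

For the sample summary above, transfer_and_compute reports more time in download and upload than in compute, which fits a run where only the gauge force is offloaded and the fields move between host and device on every call.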

You might also want to put the inversions on the GPU by setting
WANT_FN_CG_GPU = true in the MILC Makefile.

If you still have trouble, feel free to share your output file. To reduce its length you can change line 216 in the MILC Makefile to

CGPU += -DSET_QUDA_VERBOSE # -DSET_QUDA_SUMMARIZE

Drink7 · 2015-07-10T14:48:02Z

Yes, I ran the test input file in the ks_imp_rhmc/test directory before, using the su3_rhmd_hisq executable in double precision. I then called nvidia-smi, and it showed the GPU information above.

So should I set all the QUDA options to true, or just change this one?
WANT_FN_CG_GPU = true

mathiaswagner · 2015-07-10T14:50:52Z

Well, your nvidia-smi output shows that the GPU is used. But with only the gauge force on the GPU the utilization is probably pretty low. That is what you see.
How long does the execution take and what does QUDA print for computeGaugeForceQuda at the end of the run?

If you want to put the inversion on the GPU, WANT_FN_CG_GPU = true is sufficient, but you may also set everything to true. Just give it a try.
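
Before rebuilding, it can be useful to confirm which offload switches are actually active rather than commented out. The following is a minimal sketch, assuming the switches are plain Makefile assignments whose names start with WANT (as with WANTQUDA, WANT_GF_GPU and WANT_FN_CG_GPU in this thread) and that a switch counts as enabled only when its value is literally true. The function name is hypothetical.

```python
import re

# Matches assignments such as "WANT_GF_GPU = true" or "WANTQUDA=true".
ASSIGNMENT = re.compile(r"^\s*(WANT\w*)\s*[:?]?=\s*(\S*)")


def active_want_flags(makefile_text):
    """Map each uncommented WANT* variable to True if its value is 'true'."""
    flags = {}
    for line in makefile_text.splitlines():
        # Drop everything after '#', so commented-out lines never match.
        code = line.split("#", 1)[0]
        match = ASSIGNMENT.match(code)
        if match:
            name, value = match.groups()
            flags[name] = value.lower() == "true"
    return flags
```

Reading the MILC Makefile into a string and passing it to active_want_flags lists the switches the build will see; a switch missing from the result is either absent or commented out.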

detar · 2015-07-10T15:13:17Z

For the ks_imp_rhmc applications, you will need the full suite of HISQ
evolution modules.

Perhaps the following example Makefile for ks_imp_rhmc would help

http://www.physics.utah.edu/~detar/milc/Makefile-Drink7

This is for a somewhat later version of the MILC code than 7.7.11, but
the QUDA macros should still be OK.

Carleton DeTar
Department of Physics and Astronomy
University of Utah

Drink7 · 2015-07-11T07:35:10Z

OK. I'll set those options to true and see how they change the GPU performance.
I'll also try to relink QUDA with MILC later.
I forgot to save the execution output to a log file, so I'll run the test input again and check the information you mentioned.

Thank you very much for your help!

stevengottlieb · 2015-07-11T11:15:40Z

Are you part of the student cluster competition? If so, you should use the MILC tarball prepared for the competition.

Drink7 · 2015-07-11T14:35:42Z

Yes, I ran into some problems when building QUDA with MILC, so I'm asking for help.
Do you mean all the applications in MILC 7.7.11 (like ks_imp_dyn, pure_gauge and others), or just ks_imp_rhmc?

mathiaswagner · 2015-07-11T15:42:01Z

If this is part of the student cluster competition I would prefer to take the further support away from the QUDA bug tracker. I think QUDA performs as expected.

@stevengottlieb , @detar : Do you provide the support for the student cluster competition?

detar · 2015-07-12T20:57:23Z

Could you please introduce yourself?

Are you part of the student cluster competition?

stevengottlieb · 2015-07-12T21:27:32Z

Please use the google group set up for the student cluster competition,
not the github developers list. I agree with Mathias Wagner that this
discussion belongs elsewhere.

Read the instructions on the competition webpage for MILC that were
recently updated. There is a specific tarball for the competition that
has a restricted set of code. There are also more test cases.

I will no longer respond to github posts on this issue and will
encourage others to do the same.

Drink7 · 2015-07-13T06:58:31Z

I should have used the Google group to ask for help, not GitHub.
I'm sorry, and I'll close this issue later.

mathiaswagner · 2015-07-13T10:44:49Z

Thanks for moving that to the right place.

@stevengottlieb , @detar : If anything comes up during the cluster competition that is QUDA related, please feed it back here. Also, if you want some of the QUDA developers to occasionally look into the issues popping up in the student competition, let us know.

Drink7 closed this as completed Jul 13, 2015
