xrootd/cmd/xrd-cp: sub-par performances #399

Open

sbinet opened this issue Nov 19, 2018 · 4 comments
@sbinet (Member) commented Nov 19, 2018

trying to copy the following file:

$>  xrd-ls -l root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleElectron.root
-r--r--r--	 1841440760	 Oct 16 16:39	 /eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleElectron.root

results in:

$> time xrd-cp root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleElectron.root go.root

real	15m49.907s
user	0m23.006s
sys	0m32.620s

while, with the C++ version, I got:

$>  time xrdcp root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleElectron.root cxx.root

[1.715GB/1.715GB][100%][==================================================][43.9MB/s]   

real	0m40.105s
user	0m0.743s
sys	0m8.754s

presumably because of two factors:

  • we haven't implemented kXR_readv
  • we consequently don't write out the file in non-overlapping buckets (see the sketch below)
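
For context, writing in non-overlapping buckets means each chunk lands at a fixed offset of its own, so chunks can be fetched and written concurrently without any ordering. A minimal sketch of the idea, assuming a hypothetical readAt fetcher (not go-hep's actual API):

```go
package xrdcopy

import (
	"os"
	"sync"
)

// copyChunks is a hypothetical sketch: readAt is assumed to fetch the
// byte range [off, off+n) from the remote file. Each chunk is written
// at its own offset with WriteAt, so writes never overlap and the
// goroutines need no coordination beyond the WaitGroup.
func copyChunks(dst *os.File, size, chunk int64, readAt func(off, n int64) ([]byte, error)) error {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		first error
	)
	for off := int64(0); off < size; off += chunk {
		n := chunk
		if rest := size - off; rest < n {
			n = rest
		}
		wg.Add(1)
		go func(off, n int64) {
			defer wg.Done()
			buf, err := readAt(off, n)
			if err == nil {
				_, err = dst.WriteAt(buf, off)
			}
			if err != nil {
				mu.Lock()
				if first == nil {
					first = err // keep the first error only
				}
				mu.Unlock()
			}
		}(off, n)
	}
	wg.Wait()
	return first
}
```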
@sbinet (Member, Author) commented Nov 19, 2018

@EgorMatirov, want to give this a try?

@EgorMatirov (Contributor) commented:

> @EgorMatirov, want to give this a try?

The first thing I noticed: the C++ version writes the file in buckets of 16 MiB, while the Go version uses buckets of 16 KiB (due to the default copy buffer in https://golang.org/src/io/io.go#L391), which results in much bigger overhead.

Passing a buffer of 16 MiB results in:

  • Go version: 816s.
  • C++ version: 570s.
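
For reference, forcing the larger buffer from the copying side is a one-liner with io.CopyBuffer; a minimal sketch, with placeholder file names standing in for the xrootd reader:

```go
package main

import (
	"io"
	"log"
	"os"
)

func main() {
	src, err := os.Open("remote.root") // placeholder for the xrootd file reader
	if err != nil {
		log.Fatal(err)
	}
	defer src.Close()

	dst, err := os.Create("local.root")
	if err != nil {
		log.Fatal(err)
	}
	defer dst.Close()

	// io.CopyBuffer behaves like io.Copy but uses the provided buffer,
	// so each read/write round-trip moves up to 16 MiB at a time.
	// (Caveat: if src implements io.WriterTo or dst implements
	// io.ReaderFrom, the buffer is not used at all.)
	buf := make([]byte, 16<<20)
	if _, err := io.CopyBuffer(dst, src, buf); err != nil {
		log.Fatal(err)
	}
}
```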

The next optimization would be to read from the server and write to the disk simultaneously.

Something like:

  • one goroutine reads a bucket from the server and puts it into a buffered channel;
  • another goroutine takes buckets from the channel and writes them to disk.

I'll give it a try.
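
A minimal sketch of that pipeline over a generic io.Reader/io.Writer pair (not go-hep's actual types); a real implementation would likely recycle buffers instead of allocating one per bucket:

```go
package xrdcopy

import "io"

// pipeCopy overlaps reads and writes: a producer goroutine reads
// fixed-size buckets from r and sends them on a buffered channel,
// while the caller's goroutine drains the channel and writes to w.
func pipeCopy(w io.Writer, r io.Reader, bucket, depth int) (int64, error) {
	type chunk struct {
		buf []byte
		err error
	}
	ch := make(chan chunk, depth)

	go func() {
		defer close(ch)
		for {
			buf := make([]byte, bucket)
			n, err := r.Read(buf)
			if n > 0 {
				ch <- chunk{buf: buf[:n]}
			}
			if err != nil {
				if err != io.EOF {
					ch <- chunk{err: err}
				}
				return
			}
		}
	}()

	var written int64
	for c := range ch {
		if c.err != nil {
			return written, c.err
		}
		n, err := w.Write(c.buf)
		written += int64(n)
		if err != nil {
			return written, err
		}
	}
	return written, nil
}
```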

> we haven't implemented kXR_readv

To be honest, I don't see how that can speed up copying here. As far as I can tell, the only difference is that it supports reading from several files, but that's unrelated here since we are copying a single file. (It looks like a good idea to check copying several files later, though.)

@sbinet (Member, Author) commented Nov 19, 2018

> we haven't implemented kXR_readv

> To be honest, I don't see how that can speed up copying here. As far as I can tell, the only difference is that it supports reading from several files, but that's unrelated here since we are copying a single file. (It looks like a good idea to check copying several files later, though.)

this was just a baseless statement :)

> The first thing I noticed: the C++ version writes the file in buckets of 16 MiB, while the Go version uses buckets of 16 KiB (due to the default copy buffer in https://golang.org/src/io/io.go#L391), which results in much bigger overhead.

nice find.

another possible avenue is to use bufio.Writer (though just using io.CopyBuffer is probably logically equivalent); a sketch follows below
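
For completeness, a minimal sketch of the bufio.Writer variant (again with placeholder file names); note that since bufio.Writer implements io.ReaderFrom, io.Copy ends up reading directly into its 16 MiB buffer for a plain src, which supports the "logically equivalent" guess:

```go
package main

import (
	"bufio"
	"io"
	"log"
	"os"
)

func main() {
	src, err := os.Open("remote.root") // placeholder for the xrootd file reader
	if err != nil {
		log.Fatal(err)
	}
	defer src.Close()

	dst, err := os.Create("local.root")
	if err != nil {
		log.Fatal(err)
	}
	defer dst.Close()

	// Wrap the destination in a 16 MiB bufio.Writer: small writes are
	// coalesced, and because bufio.Writer implements io.ReaderFrom,
	// io.Copy reads directly into that buffer for a plain src.
	bw := bufio.NewWriterSize(dst, 16<<20)
	if _, err := io.Copy(bw, src); err != nil {
		log.Fatal(err)
	}
	if err := bw.Flush(); err != nil { // flush any buffered tail
		log.Fatal(err)
	}
}
```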

sbinet added a commit to sbinet-hep/hep that referenced this issue Nov 20, 2018
This CL uses a 16MiB buffer for copying (usually large) files.
We could also make the size available from the command line or try to
automatically decide what would be the best size using some heuristics.
This size turns out to be quite reasonable.

Updates go-hep#399.
sbinet added a commit to sbinet-hep/hep that referenced this issue Nov 21, 2018
This CL uses a 16MiB buffer for copying (usually large) files.
We could also make the size available from the command line or try to
automatically decide what would be the best size using some heuristics.
This size turns out to be quite reasonable.

  $> benchstat ./ref.txt new.txt
  name            old time/op    new time/op     delta
  XrdCp_Small-8     52.3ms ± 2%     54.4ms ± 5%      +4.09%  (p=0.000 n=30+29)
  XrdCp_Medium-8     17.2s ±21%      2.5s ±156%     -85.41%  (p=0.000 n=25+26)

  name            old alloc/op   new alloc/op    delta
  XrdCp_Small-8     59.0kB ± 0%  16803.5kB ± 0%  +28371.67%  (p=0.000 n=28+25)
  XrdCp_Medium-8    86.1MB ± 0%    225.8MB ± 0%    +162.37%  (p=0.000 n=29+25)

  name            old allocs/op  new allocs/op   delta
  XrdCp_Small-8        241 ± 0%        242 ± 1%      +0.27%  (p=0.001 n=29+29)
  XrdCp_Medium-8     14.9k ± 0%       0.5k ± 2%     -96.66%  (p=0.000 n=29+28)

Updates go-hep#399.
@sbinet (Member, Author) commented Dec 3, 2018

ok, with #401 in, we have now (with a ~170 MiB file):

C++:
real	0m1.663s
user	0m0.031s
sys	0m0.679s

Go:
real	0m1.879s
user	0m0.220s
sys	0m0.603s

testing on the root-eos public instance, I get better results for Go than for C++, but I suspect there's some throttling somewhere...

C++: (1.7 GiB)
real	9m40.706s
user	0m2.759s
sys	0m15.548s

Go: (1.7 GiB)
real	2m47.093s
user	0m8.137s
sys	0m17.981s

also some memory info, from top:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
28840 binet     20   0 1433684 141000   7192 S  21.0   0.2   0:20.66 xrd-cp
28399 binet     20   0  558244 143304  11016 S   5.3   0.2   0:08.79 xrdcp

let's leave this one open.
