
Preload chunked data sources #242

Merged · jpivarski merged 17 commits into master from issue-240 on Mar 6, 2019

Conversation

jpivarski
Member

This addresses issue #240, though it only implements step 2 in a generic way that applies to both XRootD and HTTP. It has only been tested on HTTP, and there might be a more specific function to call to do background-loading in XRootD.

The test_tree.Test.test_hist_in_tree function was particularly slow using HTTP: 25 seconds. Preloading reduces this time to 5 seconds.
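
To illustrate the idea (with made-up names, not uproot's actual internals): every chunk of the remote file is requested up front as a future on a thread pool, so later reads only block on chunks that have not yet arrived instead of issuing a fresh network round-trip for each one.

```python
import concurrent.futures

class PreloadedChunkedSource(object):
    """Illustrative sketch only; not uproot's real class."""

    def __init__(self, read_chunk, numchunks, chunksize, executor):
        # read_chunk(i) performs one remote read (HTTP range request or
        # XRootD read) and returns the bytes of chunk i.
        self._chunksize = chunksize
        # Submit every chunk to the thread pool up front, so data arrives in
        # the background while the caller is still working on earlier chunks.
        self._futures = [executor.submit(read_chunk, i) for i in range(numchunks)]

    def data(self, start, stop):
        # Assemble the requested byte range, blocking only on the futures
        # that overlap [start, stop); the rest keep loading in the background.
        first = start // self._chunksize
        last = (stop - 1) // self._chunksize
        raw = b"".join(self._futures[i].result() for i in range(first, last + 1))
        offset = start - first * self._chunksize
        return raw[offset:offset + (stop - start)]

# Exercise the sketch with an in-memory "remote" file:
fakefile = bytes(range(256)) * 4
executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)
source = PreloadedChunkedSource(lambda i: fakefile[i * 128:(i + 1) * 128],
                                numchunks=8, chunksize=128, executor=executor)
assert source.data(100, 300) == fakefile[100:300]
```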

@zhenbinwu

Testing this branch, but I ran into the issue below:

  File "test.py", line 73, in <module>
    tt = next(it)
  File "uproot240/uproot/tree.py", line 116, in iterate
    for start, stop, arrays in tree.iterate(branches=newbranches, entrysteps=entrysteps, outputtype=outputtype, namedecode=namedecode, reportentries=True, entrystart=0, entrystop=tree.numentries, flatten=flatten, flatname=flatname, awkwardlib=awkward, cache=cache, basketcache=basketcache, keycache=keycache, executor=executor, blocking=blocking):
  File "uproot240/uproot/tree.py", line 606, in iterate
    future = branch._step_array(interpretation, basket_itemoffset, basket_entryoffset, start, stop, awkward, basketcache, keycache, executor, explicit_basketcache)
  File "uproot240/uproot/tree.py", line 1417, in _step_array
    _delayedraise(fill(j))
  File "uproot240/uproot/tree.py", line 75, in _delayedraise
    exec("raise cls, err, trc")
  File "uproot240/uproot/tree.py", line 1385, in fill
    source = self._basket(i, interpretation, local_entrystart, local_entrystop, awkward, basketcache, keycache)
  File "uproot240/uproot/tree.py", line 1090, in _basket
    basketdata = key.basketdata()
  File "uproot240/uproot/tree.py", line 1501, in basketdata
    return self.cursor.copied().bytes(datasource, self._fObjlen)
  File "uproot240/uproot/source/cursor.py", line 78, in bytes
    return source.data(start, stop)
  File "uproot240/uproot/source/chunked.py", line 120, in data
    if self._futures is not None:
AttributeError: 'XRootDSource' object has no attribute '_futures'

@jpivarski
Member Author

Thanks! I just committed a fix (and I've been trying to get a real XRootD test in the suite, but as you can see from the history, it's hard).
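
For illustration, the kind of change that avoids the AttributeError above is to make sure the attribute exists on every source class, not just the HTTP one (a sketch with hypothetical names, not necessarily the committed fix):

```python
import concurrent.futures

# Hedged sketch, not uproot's actual code: if the base class always defines
# _futures, a subclass such as an XRootD-backed source can never hit
# AttributeError when data() checks for preloaded chunks.

class ChunkedSource(object):
    def __init__(self):
        self._futures = None          # defined even before any preloading

    def preload(self, executor, read_chunk, numchunks):
        # start background reads (see the preloading sketch above)
        self._futures = [executor.submit(read_chunk, i) for i in range(numchunks)]

    def data(self, start, stop):
        if self._futures is not None:
            return b""                # placeholder: serve from preloaded chunks
        return b""                    # placeholder: fall back to a direct read

class XRootDSource(ChunkedSource):    # inherits _futures from the base class
    pass
```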

@zhenbinwu

This PR doesn't seem to give much improvement. Below is the average time for running over two files via XRootD.

Version 3.4.9:  average loading time 27.485920828580856 s
Version 3.4.10: average loading time 26.598708748817444 s

@jpivarski
Member Author

Sorry for the delay: I was working with someone and couldn't come back to this until now.

Your tests and the continuous integration tests are failing the same way—just another small error that would be fixed in a minute if I could test locally. :)

I've submitted another fix. The tests are now working well enough that in 10 minutes or so I'll know if I've got another shallow bug to fix. I'll let you know when that happens.

Meanwhile, you're either not seeing the improvement because it's not working yet or because your suspicion is right and the time is really being spent in decompressing LZMA. Actually, there's another way to test that hypothesis: look at top or htop in another window as your script runs. I/O bottlenecks don't use much CPU; decompressing LZMA uses a lot of CPU.
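
Another quick check from inside Python (a generic sketch, not specific to uproot): compare wall-clock time to process CPU time around the reading loop; if the two are close, decompression dominates, and if CPU time is much smaller, the run is mostly waiting on the network.

```python
import time

def do_the_read():
    # Placeholder for the real work (e.g. the uproot iteration loop);
    # here it just burns some CPU so the sketch runs on its own.
    return sum(i * i for i in range(10**6))

wall0, cpu0 = time.perf_counter(), time.process_time()
do_the_read()
wall = time.perf_counter() - wall0
cpu = time.process_time() - cpu0

print("wall %.2f s, cpu %.2f s, cpu/wall %.0f%%" % (wall, cpu, 100.0 * cpu / max(wall, 1e-9)))
# cpu/wall near 100%  -> CPU-bound (e.g. LZMA decompression dominates)
# cpu/wall well below -> I/O-bound (mostly waiting on the network)
```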

@zhenbinwu

No problem, I am also just out of a meeting.

Here we go. This is from v3.4.9. Memory seems stable most of the time, so the view to watch is the CPU.

[Screenshot: CPU/memory monitor while running v3.4.9 (Mar 4, 2019, 3:27 PM)]

@zhenbinwu

zhenbinwu commented Mar 4, 2019

[Screenshot: CPU/memory monitor while running this branch (Mar 4, 2019, 3:37 PM)]

With this branch, it seems my node is getting busier, but I can't see much change.
For the average 27 seconds of loading, it seems the XRootD cache and decompression each take about half of the CPU time. I will try the ThreadPoolExecutor later on.

@jpivarski
Member Author

The high CPU values suggest that more time is spent decompressing than waiting for the network, but I'm going to see this update through because it will stop wasting network time, even if that's not your problem.

But also note that the fix has not yet run successfully even once; there's no fair performance comparison when one of the versions hasn't really been run.

This is a recipe for parallel processing, to spread the decompression load.
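
In practice that amounts to handing uproot a ThreadPoolExecutor so baskets can be read and decompressed on several threads at once. A hedged sketch follows, with a placeholder file URL, tree name, and branch names; the gain depends on how much of the time really is LZMA decompression:

```python
import concurrent.futures
import uproot

# Worker threads for basket reading/decompression; CPython's zlib and lzma
# modules can release the GIL while decompressing, so several baskets may
# be decompressed at the same time.
executor = concurrent.futures.ThreadPoolExecutor(max_workers=8)

# The URL, tree name, and branch names below are placeholders.
for arrays in uproot.iterate("root://xrootd.example.org//store/file.root",
                             "Events",
                             ["Muon_pt", "Muon_eta"],
                             executor=executor):
    print(len(arrays))  # stand-in for real analysis code
```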

@jpivarski
Member Author

@zhenbinwu It should be ready now; my XRootD test is finally passing. (But I think I'm going to remove that test from production: installing prerequisites multiplies testing time by 10×.)

Could you check your case to be sure that it works for you? It might not speed anything up because the real bottleneck is in LZMA decompression, but I want to be sure that it doesn't break your workflow.

.travis.yml (outdated review comment)
@@ -38,7 +38,7 @@ install:
   - conda config --add channels conda-forge;
   - conda config --set always_yes yes --set changeps1 no
   - if [ "${PYVER}" = "2.7" ] || [ "${PYVER}" = "3.6" ] || [ "${PYVER}" = "3.7" ]; then
-      conda create -q -n testenv python=${PYVER} root xrootd;
+      conda create -q -n testenv python=${PYVER} root; # xrootd (to enable XRootD test in issue240)
Member

This won't change anything as xrootd is a dependency of root so it will be installed regardless. I haven't yet made time to create a root-minimal package.

> But I think I'm going to remove that test from production: installing prerequisites multiplies testing time by 10×.

We could move to using containers which already have the prerequisites for each test in the matrix installed?

Member Author

Switching to containers could help a lot; it's something to consider. Debugging the XRootD update through Travis was a slow but necessary step, and it turns 2-minute tests into 20-minute tests, where all of the extra time is spent in the conda install.

If xrootd is a dependency of root, then removing it shouldn't affect whether the test runs, and I'll see test_issue240 in the logs. However, that doesn't explain why I'm seeing such a long install time.

Member

It seems Python 2 is 50% slower than Python 3.6 and 3.7 (the others don't install ROOT). On the conda side this is being worked on, but it's still annoying.

I have a few ideas. I'll play around with it on my fork over the next day or two and come up with something that is faster.

Member Author

Oh, dropping xrootd doesn't do anything to testing time, so it really is being pulled in by root. And the long times might be caused by the root installation. Apparently, I haven't been paying attention to how long these have been taking.

I know about the timing difference because ROOT isn't required for everything in the test matrix, not because I remember it being shorter before this PR.

Member Author

Also, test_issue240 is still running, so I guess I'm not giving up the test. The installation time to enable ROOT-writing tests gives me the XRootD test for free.

Preparing containers could improve the long times that bothered me in this test (because I couldn't test locally), but then we'd be responsible for maintaining those containers, right? I'm not sure that's a good trade-off. I've also been hearing that the Azure tests have been running much faster. For the time being, this is fine, but I guess we can think about it.

@zhenbinwu

Indeed, the decompression of NanoAOD is the bottleneck. I got a similar result with this branch:
Average loading time: 26.41544744372368 s

jpivarski merged commit 4166601 into master on Mar 6, 2019
jpivarski deleted the issue-240 branch on Mar 6, 2019 at 06:27