
Preload chunked data sources #242

Merged · jpivarski merged 17 commits into master from issue-240 on Mar 6, 2019

Conversation

jpivarski
Member

This addresses issue #240, though it only implements step 2 in a generic way that applies to both XRootD and HTTP. It has only been tested on HTTP, and there might be a more specific function to call to do background-loading in XRootD.

The test_tree.Test.test_hist_in_tree function was particularly slow using HTTP: 25 seconds. Preloading reduces this time to 5 seconds.
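
To illustrate the idea (with made-up names, not uproot's actual internals): every chunk of the remote file is requested up front as a future on a thread pool, so later reads only block on chunks that have not yet arrived instead of issuing a fresh network round-trip for each one.

```python
import concurrent.futures

class PreloadedChunkedSource(object):
    """Illustrative sketch only; not uproot's real class."""

    def __init__(self, read_chunk, numchunks, chunksize, executor):
        # read_chunk(i) performs one remote read (HTTP range request or
        # XRootD read) and returns the bytes of chunk i.
        self._chunksize = chunksize
        # Submit every chunk to the thread pool up front, so data arrives in
        # the background while the caller is still working on earlier chunks.
        self._futures = [executor.submit(read_chunk, i) for i in range(numchunks)]

    def data(self, start, stop):
        # Assemble the requested byte range, blocking only on the futures
        # that overlap [start, stop); the rest keep loading in the background.
        first = start // self._chunksize
        last = (stop - 1) // self._chunksize
        raw = b"".join(self._futures[i].result() for i in range(first, last + 1))
        offset = start - first * self._chunksize
        return raw[offset:offset + (stop - start)]

# Exercise the sketch with an in-memory "remote" file:
fakefile = bytes(range(256)) * 4
executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)
source = PreloadedChunkedSource(lambda i: fakefile[i * 128:(i + 1) * 128],
                                numchunks=8, chunksize=128, executor=executor)
assert source.data(100, 300) == fakefile[100:300]
```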

@zhenbinwu

Testing this branch, but I ran into the issue below:

  File "test.py", line 73, in <module>
    tt = next(it)
  File "uproot240/uproot/tree.py", line 116, in iterate
    for start, stop, arrays in tree.iterate(branches=newbranches, entrysteps=entrysteps, outputtype=outputtype, namedecode=namedecode, reportentries=True, entrystart=0, entrystop=tree.numentries, flatten=flatten, flatname=flatname, awkwardlib=awkward, cache=cache, basketcache=basketcache, keycache=keycache, executor=executor, blocking=blocking):
  File "uproot240/uproot/tree.py", line 606, in iterate
    future = branch._step_array(interpretation, basket_itemoffset, basket_entryoffset, start, stop, awkward, basketcache, keycache, executor, explicit_basketcache)
  File "uproot240/uproot/tree.py", line 1417, in _step_array
    _delayedraise(fill(j))
  File "uproot240/uproot/tree.py", line 75, in _delayedraise
    exec("raise cls, err, trc")
  File "uproot240/uproot/tree.py", line 1385, in fill
    source = self._basket(i, interpretation, local_entrystart, local_entrystop, awkward, basketcache, keycache)
  File "uproot240/uproot/tree.py", line 1090, in _basket
    basketdata = key.basketdata()
  File "uproot240/uproot/tree.py", line 1501, in basketdata
    return self.cursor.copied().bytes(datasource, self._fObjlen)
  File "uproot240/uproot/source/cursor.py", line 78, in bytes
    return source.data(start, stop)
  File "uproot240/uproot/source/chunked.py", line 120, in data
    if self._futures is not None:
AttributeError: 'XRootDSource' object has no attribute '_futures'

@jpivarski
Member Author

Thanks! I just committed a fix (and I've been trying to get a real XRootD test in the suite, but as you can see from the history, it's hard).
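
For illustration, the kind of change that avoids the AttributeError above is to make sure the attribute exists on every source class, not just the HTTP one (a sketch with hypothetical names, not necessarily the committed fix):

```python
import concurrent.futures

# Hedged sketch, not uproot's actual code: if the base class always defines
# _futures, a subclass such as an XRootD-backed source can never hit
# AttributeError when data() checks for preloaded chunks.

class ChunkedSource(object):
    def __init__(self):
        self._futures = None          # defined even before any preloading

    def preload(self, executor, read_chunk, numchunks):
        # start background reads (see the preloading sketch above)
        self._futures = [executor.submit(read_chunk, i) for i in range(numchunks)]

    def data(self, start, stop):
        if self._futures is not None:
            return b""                # placeholder: serve from preloaded chunks
        return b""                    # placeholder: fall back to a direct read

class XRootDSource(ChunkedSource):    # inherits _futures from the base class
    pass
```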

@zhenbinwu

This PR doesn't seem to give much improvement. Below is the average time for running over two files via XRootD.

Version 3.4.9:  average loading time 27.485920828580856 s
Version 3.4.10: average loading time 26.598708748817444 s

@jpivarski
Member Author

Sorry for the delay: I was working with someone and couldn't come back to this until now.

Your tests and the continuous integration tests are failing the same way—just another small error that would be fixed in a minute if I could test locally. :)

I've submitted another fix. The tests are now working well enough that in 10 minutes or so I'll know if I've got another shallow bug to fix. I'll let you know when that happens.

Meanwhile, you're either not seeing the improvement because it's not working yet or because your suspicion is right and the time is really being spent in decompressing LZMA. Actually, there's another way to test that hypothesis: look at top or htop in another window as your script runs. I/O bottlenecks don't use much CPU; decompressing LZMA uses a lot of CPU.
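
Another quick check from inside Python (a generic sketch, not specific to uproot): compare wall-clock time to process CPU time around the reading loop; if the two are close, decompression dominates, and if CPU time is much smaller, the run is mostly waiting on the network.

```python
import time

def do_the_read():
    # Placeholder for the real work (e.g. the uproot iteration loop);
    # here it just burns some CPU so the sketch runs on its own.
    return sum(i * i for i in range(10**6))

wall0, cpu0 = time.perf_counter(), time.process_time()
do_the_read()
wall = time.perf_counter() - wall0
cpu = time.process_time() - cpu0

print("wall %.2f s, cpu %.2f s, cpu/wall %.0f%%" % (wall, cpu, 100.0 * cpu / max(wall, 1e-9)))
# cpu/wall near 100%  -> CPU-bound (e.g. LZMA decompression dominates)
# cpu/wall well below -> I/O-bound (mostly waiting on the network)
```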

@zhenbinwu

No problem, I am also just out of a meeting.

Here we go. This is from v3.4.9. Memory seems stable most of the time, so the view to watch is the CPU.

[Screenshot: CPU/memory monitor while running v3.4.9 (Mar 4, 2019, 3:27 PM)]

@zhenbinwu

zhenbinwu commented Mar 4, 2019

[Screenshot: CPU/memory monitor while running this branch (Mar 4, 2019, 3:37 PM)]

With this branch, it seems my node is getting busier, but I can't see much change.
For the average 27 seconds of loading, it seems the XRootD cache and decompression each take about half of the CPU time. I will try the ThreadPoolExecutor later on.

@jpivarski
Member Author

The high CPU values suggest that more time is spent decompressing than waiting for the network, but I'm going to see this update through because it will stop wasting network time, even if that's not your problem.

But also note that the fix has not yet run successfully even once; there's no fair performance comparison when one of the versions hasn't really been run.

This is a recipe for parallel processing, to spread the decompression load.
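
In practice that amounts to handing uproot a ThreadPoolExecutor so baskets can be read and decompressed on several threads at once. A hedged sketch follows, with a placeholder file URL, tree name, and branch names; the gain depends on how much of the time really is LZMA decompression:

```python
import concurrent.futures
import uproot

# Worker threads for basket reading/decompression; CPython's zlib and lzma
# modules can release the GIL while decompressing, so several baskets may
# be decompressed at the same time.
executor = concurrent.futures.ThreadPoolExecutor(max_workers=8)

# The URL, tree name, and branch names below are placeholders.
for arrays in uproot.iterate("root://xrootd.example.org//store/file.root",
                             "Events",
                             ["Muon_pt", "Muon_eta"],
                             executor=executor):
    print(len(arrays))  # stand-in for real analysis code
```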

@jpivarski
Member Author

@zhenbinwu It should be ready now; my XRootD test is finally passing. (But I think I'm going to remove that test from production: installing prerequisites multiplies testing time by 10×.)

Could you check your case to be sure that it works for you? It might not speed anything up because the real bottleneck is in LZMA decompression, but I want to be sure that it doesn't break your workflow.

.travis.yml (outdated review comment)
@@ -38,7 +38,7 @@ install:
   - conda config --add channels conda-forge;
   - conda config --set always_yes yes --set changeps1 no
   - if [ "${PYVER}" = "2.7" ] || [ "${PYVER}" = "3.6" ] || [ "${PYVER}" = "3.7" ]; then
-      conda create -q -n testenv python=${PYVER} root xrootd;
+      conda create -q -n testenv python=${PYVER} root; # xrootd (to enable XRootD test in issue240)
Member

This won't change anything as xrootd is a dependency of root so it will be installed regardless. I haven't yet made time to create a root-minimal package.

> But I think I'm going to remove that test from production: installing prerequisites multiplies testing time by 10×.

We could move to using containers which already have the prerequisites for each test in the matrix installed?

Member Author

Switching to containers could help a lot; it's something to consider. Debugging the XRootD update through Travis was a slow but necessary step, and it turns 2-minute tests into 20-minute tests, where all of the extra time is spent in the conda install.

If xrootd is a dependency of root, then removing it shouldn't affect whether the test runs, and I'll see test_issue240 in the logs. However, that doesn't explain why I'm seeing such a long install time.

Member

It seems Python 2 is 50% slower than Python 3.6 and 3.7 (the others don't install ROOT). On the conda side this is being worked on, but it's still annoying.

I have a few ideas. I'll play around with it on my fork over the next day or two and come up with something that is faster.

Member Author

Oh, dropping xrootd doesn't do anything to testing time, so it really is being pulled in by root. And the long times might be caused by the root installation. Apparently, I haven't been paying attention to how long these have been taking.

I know about the timing difference because ROOT isn't required for everything in the test matrix, not because I remember it being shorter before this PR.

Member Author

Also, test_issue240 is still running, so I guess I'm not giving up the test. The installation time to enable ROOT-writing tests gives me the XRootD test for free.

Preparing containers could improve the long times that bothered me in this test (because I couldn't test locally), but then we'd be responsible for maintaining those containers, right? I'm not sure that's a good trade-off. I've also been hearing that the Azure tests have been running much faster. For the time being, this is fine, but I guess we can think about it.

@zhenbinwu

Indeed, the decompression of NanoAOD is the bottleneck. I got a similar result with this branch:
Average loading time: 26.41544744372368 s

jpivarski merged commit 4166601 into master on Mar 6, 2019
jpivarski deleted the issue-240 branch on Mar 6, 2019 at 06:27