Dear all,
I have been facing the following issue. Suppose you have a bunch of large ROOT files, so you want to use chunksize. But sometimes you want to apply a tight cut, such that for some files you end up with no entries. In this case,
for df in read_root(myfile, key=myTree, where=tight_selection, chunksize=100000):
    # Do something
raises an IndexError: index 0 is out of bounds for axis 0 with size 0 because the iterator returned by read_root has length zero.
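The error class itself is easy to reproduce in isolation: indexing element 0 of any empty NumPy array produces exactly the message quoted above (a minimal illustration, not the actual code path inside read_root):

```python
import numpy as np

# A tight selection can leave zero entries; indexing into the empty
# result then fails exactly as in the traceback above.
arr = np.empty(0)
try:
    arr[0]
except IndexError as e:
    print(e)  # index 0 is out of bounds for axis 0 with size 0
```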
I'm not sure what the best change is. I guess this is the part of read_root that has to be modified:
if chunksize:
    tchain = ROOT.TChain(key)
    for path in paths:
        tchain.Add(path)
    n_entries = tchain.GetEntries()
    # XXX could explicitly clean up the opened TFiles with TChain::Reset

    def genchunks():
        current_index = 0
        for chunk in range(int(ceil(float(n_entries) / chunksize))):
            arr = root2array(paths, key, all_vars, start=chunk * chunksize,
                             stop=(chunk + 1) * chunksize, selection=where,
                             *args, **kwargs)
            if flatten:
                arr = do_flatten(arr, flatten)
            yield convert_to_dataframe(arr, start_index=current_index)
            current_index += len(arr)

    return genchunks()
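For reference, the chunk count computed by the loop above is just a ceiling division of n_entries by chunksize, so with zero entries the generator yields no chunks at all (a standalone sketch of the same arithmetic):

```python
from math import ceil

def n_chunks(n_entries, chunksize):
    # Same arithmetic as the range(...) bound in genchunks above.
    return int(ceil(float(n_entries) / chunksize))

print(n_chunks(0, 100000))       # 0 -> genchunks yields nothing
print(n_chunks(1, 100000))       # 1
print(n_chunks(250000, 100000))  # 3
```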
I guess if n_entries == 0, one should do something special, but I'm not sure what the best option is. Maybe return None? In that case the user can do:
df_list = read_root(myfile, key=myTree, where=tight_selection, chunksize=100000)
if df_list is not None:
    for df in df_list:
        # Do something
?
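An alternative that avoids the None check entirely would be to always return an iterator, possibly an empty one, so the caller's for loop simply runs zero times. A minimal sketch, using a hypothetical read_chunks stand-in for read_root's chunked path:

```python
def read_chunks(n_entries, chunksize):
    """Hypothetical simplification of read_root's chunked path:
    always return an iterator, never None."""
    if n_entries == 0:
        return iter([])  # empty iterator: caller's loop is just skipped

    def genchunks():
        # -(-a // b) is ceiling division, as in the real genchunks.
        for chunk in range(-(-n_entries // chunksize)):
            yield chunk  # stand-in for the DataFrame chunk

    return genchunks()

# No special-casing needed at the call site:
for df in read_chunks(0, 100000):
    pass  # never entered when there are no entries
```

This keeps the user-facing API uniform: the same loop works whether or not the selection left any entries.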