# Implementing `DataFrame.iloc` in Dask
- [Pandas](#pandas)
- [Dask (before)](#before)
- [Dask (after)](#after)
- [Next steps](#next-steps) ([#6661](https://github.com/dask/dask/pull/6661) 👀)

## Pandas <a id="pandas"></a>
`iloc` (**i**nteger **loc**ation) slices Pandas DataFrames by integer position rather than by index label:

In [1]:
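import pandas as pd
# One row per lowercase letter, indexed by the letter, with its code point as the only column
ch = pd.Series(list('abcdefghijklmnopqrstuvwxyz'))
ords = ch.apply(ord)
df = pd.DataFrame({'ch': ch, 'ord': ords}).set_index('ch')
df.iloc[5:15]  # rows at positions 5 through 14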

| ch | ord |
|---|---|
| f | 102 |
| g | 103 |
| h | 104 |
| i | 105 |
| j | 106 |
| k | 107 |
| l | 108 |
| m | 109 |
| n | 110 |
| o | 111 |


## Dask (before) <a id="before"></a>
Install a released version of Dask (2021.5.0 here):

In [2]:
from sys import executable as python
!{python} -m pip uninstall -q -y dask && {python} -m pip install -q "dask[dataframe]==2021.5.0"

Make a simple Dask DataFrame (from the Pandas DataFrame above):

In [3]:
import dask.dataframe as dd
ddf = dd.from_pandas(df, chunksize=10)
ddf

| npartitions=3 | ord |
|---|---|
| a | int64 |
| k | ... |
| u | ... |
| z | ... |


### `iloc` ⟹ `NotImplementedError`
In this release, row-based `iloc` raises `NotImplementedError` on Dask DataFrames; only column selection is supported:

In [4]:
ddf.iloc[5:15]

NotImplementedError: 'DataFrame.iloc' only supports selecting columns. It must be used like 'df.iloc[:, column_indexer]'.

Implementing row-based `iloc` requires knowing the number of rows in each DataFrame partition, which Dask doesn't currently track.
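To illustrate why partition sizes matter, here is a minimal sketch (not the code from dask#6661) of how a global positional slice could be split into per-partition slices once each partition's row count is known. The function name `split_iloc_slice` is hypothetical, the sizes `[10, 10, 6]` are those implied by `chunksize=10` over 26 rows, and only non-negative bounds with step 1 are handled.

```python
from itertools import accumulate


def split_iloc_slice(partition_sizes, start, stop):
    """Map the global row range [start, stop) to (partition index, local slice) pairs.

    Hypothetical helper for illustration only; assumes 0 <= start, stop and step 1.
    """
    # offsets[i] is the global position of the first row of partition i;
    # offsets[-1] is the total number of rows.
    offsets = [0, *accumulate(partition_sizes)]
    total = offsets[-1]
    # Clamp out-of-range bounds the way positional slicing does.
    start = max(0, min(start, total))
    stop = max(0, min(stop, total))

    pieces = []
    for i in range(len(partition_sizes)):
        # Intersect the requested range with this partition, in local coordinates.
        lo = max(start, offsets[i]) - offsets[i]
        hi = min(stop, offsets[i + 1]) - offsets[i]
        if lo < hi:
            pieces.append((i, slice(lo, hi)))
    return pieces


# Partition sizes for the 26-row example DataFrame with chunksize=10.
pieces = split_iloc_slice([10, 10, 6], 5, 15)
print(pieces)  # [(0, slice(5, 10, None)), (1, slice(0, 5, None))]
```

Without the sizes, there is no way to tell which partition holds row 5 short of computing the length of every preceding partition.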

## Dask (after) <a id="after"></a>
Install [dask#6661](https://github.com/dask/dask/pull/6661) (and enable `%autoreload`):

In [5]:
%load_ext autoreload
%autoreload 2

url = 'git+https://github.com/dask/dask.git@refs/pull/6661/head#egg=dask'
from sys import executable as python
!{python} -m pip uninstall -q -y dask && {python} -m pip install -q {url}



Recreate the same Dask DataFrame:

In [6]:
ddf = dd.from_pandas(df, chunksize=10)
ddf

[autoreload of dask.delayed failed: Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/IPython/extensions/autoreload.py", line 245, in check
    superreload(m, reload, self.old_objects)
  File "/usr/local/lib/python3.8/dist-packages/IPython/extensions/autoreload.py", line 410, in superreload
    update_generic(old_obj, new_obj)
  File "/usr/local/lib/python3.8/dist-packages/IPython/extensions/autoreload.py", line 347, in update_generic
    update(a, b)
  File "/usr/local/lib/python3.8/dist-packages/IPython/extensions/autoreload.py", line 317, in update_class
    update_instances(old, new)
  File "/usr/local/lib/python3.8/dist-packages/IPython/extensions/autoreload.py", line 280, in update_instances
    ref.__class__ = new
  File "/usr/local/lib/python3.8/dist-packages/dask/delayed.py", line 548, in __setattr__
    object.__setattr__(self, attr, val)
TypeError: __class__ assignment: 'DelayedLeaf' object layout differs from 'DelayedLeaf'
]
[autoreload of da

| npartitions=3 | ord |
|---|---|
| a | int64 |
| k | ... |
| u | ... |
| z | ... |


(The `autoreload` errors above can be ignored in this case.)

### 🎉🎉 `iloc` 🎉🎉 <a id="iloc"></a>

In [7]:
ddf.iloc[5:15]

| npartitions=2 | ord |
|---|---|
| a | int64 |
| k | ... |
| u | ... |
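One plausible reading of this output is that only the partitions overlapping rows 5 through 14 are kept, together with the original divisions that bound them. The sketch below reproduces that result under this assumption; `prune_partitions` is a hypothetical helper, not the logic from dask#6661, and the partition sizes `[10, 10, 6]` are inferred from `chunksize=10` over 26 rows.

```python
from itertools import accumulate


def prune_partitions(divisions, partition_sizes, start, stop):
    """Return (kept partition indices, divisions bounding those partitions).

    Hypothetical helper for illustration only; assumes 0 <= start <= stop and step 1.
    """
    # offsets[i] is the global position of the first row of partition i.
    offsets = [0, *accumulate(partition_sizes)]
    kept = [
        i
        for i in range(len(partition_sizes))
        if max(start, offsets[i]) < min(stop, offsets[i + 1])
    ]
    if not kept:
        return [], ()
    # A contiguous row range touches contiguous partitions, so the surviving
    # boundaries are a contiguous slice of the original divisions.
    return kept, tuple(divisions[kept[0] : kept[-1] + 2])


kept, new_divisions = prune_partitions(("a", "k", "u", "z"), [10, 10, 6], 5, 15)
print(kept, new_divisions)  # [0, 1] ('a', 'k', 'u')
```

The third partition (`u` to `z`) contains none of the requested rows, which is consistent with `npartitions` dropping from 3 to 2.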


In [8]:
ddf.iloc[5:15].compute()

| ch | ord |
|---|---|
| f | 102 |
| g | 103 |
| h | 104 |
| i | 105 |
| j | 106 |
| k | 107 |
| l | 108 |
| m | 109 |
| n | 110 |
| o | 111 |


## Next steps <a id="next-steps"></a>
See [dask#6661](https://github.com/dask/dask/pull/6661) for too much information about the state of this work.