# Track a folder of files

```{warning}

`ln.DFolder` currently supports only folders located in the configured storage.
```

In [None]:
!lamin load mydata

In [None]:
import lamindb as ln

ln.nb.header()

## Track a folder and the files it contains

In [None]:
ln.dev.datasets.generate_cell_ranger_files(
    "sample_001", ln.setup.settings.instance.storage.root
)

In [None]:
!ls -l './mydata/sample_001/'

Let's pass a directory path to `ln.DFolder`, which creates a DFolder record:

In [None]:
dfolder = ln.DFolder(folder="./mydata/sample_001/")

dfolder

This also creates a `DObject` record for each file inside the folder:

In [None]:
dfolder.dobjects
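To make the one-record-per-file idea concrete, here is a minimal sketch in plain Python that lists every file under a folder by its relative path. It is not how `ln.DFolder` is implemented; `list_files` is a hypothetical helper introduced only for illustration.

```python
from pathlib import Path


def list_files(folder):
    """Return the relative POSIX paths of all files under `folder`, sorted."""
    root = Path(folder)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")  # walk the folder recursively
        if path.is_file()  # skip subdirectories, keep only files
    )
```

Calling `list_files("./mydata/sample_001/")` should yield one relative path per file, which is roughly the set of files the dobject records above describe.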

In [None]:
ln.add(dfolder)

## What happens under the hood?

### In the SQL database

1. A `DFolder` entry
2. 15 `DObject` entries corresponding to the 15 files inside the directory
3. A `Notebook` entry
4. A `Run` entry

All of these entries are linked so that you can find the files using any of the metadata fields.
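The linking can be pictured with a small, self-contained `sqlite3` sketch. The table and column names below are hypothetical simplifications, not the actual lamindb schema, and the `Notebook` and `Run` tables are left out for brevity. The link table assumes a many-to-many relation between folders and objects.

```python
import sqlite3

# Hypothetical, simplified tables; NOT the actual lamindb schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dfolder (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dobject (id INTEGER PRIMARY KEY, name TEXT, suffix TEXT);
    CREATE TABLE dfolder_dobject (
        dfolder_id INTEGER REFERENCES dfolder(id),
        dobject_id INTEGER REFERENCES dobject(id)
    );
    """
)

# One folder record and two of its file records.
conn.execute("INSERT INTO dfolder VALUES (1, 'sample_001')")
conn.executemany(
    "INSERT INTO dobject VALUES (?, ?, ?)",
    [(1, "metrics_summary", ".csv"), (2, "features", ".tsv.gz")],
)
# The link table ties each file record to its folder record.
conn.executemany("INSERT INTO dfolder_dobject VALUES (?, ?)", [(1, 1), (1, 2)])

# Find the files of a folder by joining through the link table.
rows = conn.execute(
    """
    SELECT dobject.name
    FROM dobject
    JOIN dfolder_dobject ON dobject.id = dfolder_dobject.dobject_id
    JOIN dfolder ON dfolder.id = dfolder_dobject.dfolder_id
    WHERE dfolder.name = ?
    ORDER BY dobject.name
    """,
    ("sample_001",),
).fetchall()
```

Because the records are joined through a link table, a query can start from either side: from the folder to its files, or from a file back to its folder.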

In [None]:
import lamindb.schema as lns

ln.select(ln.DFolder, name=dfolder.name).one()

In [None]:
ln.select(ln.DFolder).join(ln.DObject.dfolders).where(
    ln.DFolder.name == "sample_001"
).df()

In [None]:
ln.select(ln.schema.Notebook, id=ln.nb.notebook.id).one()

In [None]:
ln.select(ln.schema.Run, id=ln.nb.run.id).one()

## View the directory tree

In [None]:
dfolder.tree()
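For comparison, a directory tree can also be rendered for any local folder with the standard library alone. This is a minimal sketch, not the implementation behind `dfolder.tree()`; `format_tree` is a hypothetical helper and its output format is an assumption.

```python
from pathlib import Path


def format_tree(folder, prefix=""):
    """Return the lines of an indented listing of `folder`, directories first."""
    lines = []
    # Sort directories before files, then alphabetically by name.
    entries = sorted(Path(folder).iterdir(), key=lambda p: (p.is_file(), p.name))
    for index, entry in enumerate(entries):
        last = index == len(entries) - 1
        connector = "└── " if last else "├── "
        lines.append(prefix + connector + entry.name)
        if entry.is_dir():
            # Continue the vertical guide only if more siblings follow.
            extension = "    " if last else "│   "
            lines.extend(format_tree(entry, prefix + extension))
    return lines
```

Printing `"\n".join(format_tree("./mydata/sample_001/"))` gives a quick view of the folder layout on disk.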

## Find and retrieve files in a dfolder

### Retrieve dobjects from a dfolder

In [None]:
with ln.Session() as ss:
    dfolder = ss.select(ln.DFolder, name="sample_001").first()
    dobjects = dfolder.dobjects

In [None]:
dobjects[:2]

### Retrieve files via their paths relative to the directory

In [None]:
dfolder.get(relpath="raw_feature_bc_matrix/features.tsv.gz")

In [None]:
dfolder.get(relpath=["analysis/analysis.csv", "raw_feature_bc_matrix/features.tsv.gz"])

In [None]:
dfolder.get(relpath="raw_feature_bc_matrix")

In [None]:
dfolder.get(relpath="raw_feature_bc_matrix", suffix=".mtx.gz")
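The calls above select files by an exact relative path, by a subdirectory, or by a subdirectory plus a suffix. The matching idea can be sketched in plain Python as follows. This is an illustration under assumptions, not the logic of `dfolder.get`; `filter_relpaths` is a hypothetical helper that works on a plain list of relative paths.

```python
from pathlib import PurePosixPath


def filter_relpaths(relpaths, relpath, suffix=None):
    """Select paths equal to `relpath` or located under it, optionally by suffix."""
    target = PurePosixPath(relpath)
    selected = []
    for candidate in relpaths:
        path = PurePosixPath(candidate)
        # Keep exact matches and anything inside the target directory.
        if path != target and target not in path.parents:
            continue
        # Compare against the full file name so multi-part suffixes
        # such as ".mtx.gz" match as expected.
        if suffix is not None and not path.name.endswith(suffix):
            continue
        selected.append(candidate)
    return selected
```

With a list such as `["analysis/analysis.csv", "raw_feature_bc_matrix/features.tsv.gz"]`, passing `relpath="raw_feature_bc_matrix"` keeps only the second entry, and adding `suffix=".mtx.gz"` would filter it out as well.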

Query a specific file from a dfolder using `ln.select`:

In [None]:
ln.select(ln.DObject, name="metrics_summary").join(ln.DObject.dfolders).where(
    ln.DFolder.name == "sample_001"
).df()