# Track a folder of files

```{warning}

Folder tracking currently supports only data folders located in the configured storage.
```

In [None]:
!lamin load mydata

In [None]:
import lamindb as ln

ln.track()

## Track a folder and the files it contains

In [None]:
ln.dev.datasets.generate_cell_ranger_files(
    "sample_001", ln.setup.settings.instance.storage.root
)

In [None]:
!ls -l './mydata/sample_001/'

Let's pass a directory path to `ln.Folder`, which creates a Folder record:

In [None]:
folder = ln.Folder(folder="./mydata/sample_001/")

folder

This also creates a `File` record for each file inside the folder:

In [None]:
folder.files
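
To make the one-record-per-file idea concrete, here is a minimal standard-library sketch that walks a directory and collects the relative path of every file. It is not how `ln.Folder` is implemented; the helper `list_relative_files` and the two stand-in files are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path


def list_relative_files(root: Path) -> list[str]:
    """Return the POSIX-style path, relative to root, of every file under root."""
    # Hypothetical helper, not part of lamindb.
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# Build a tiny stand-in directory (not the real generated Cell Ranger output).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "sample_001"
    (root / "raw_feature_bc_matrix").mkdir(parents=True)
    (root / "raw_feature_bc_matrix" / "features.tsv.gz").touch()
    (root / "metrics_summary.csv").touch()
    relpaths = list_relative_files(root)

print(relpaths)
```

Each entry in `relpaths` corresponds to one file that would receive its own record; subdirectories themselves are skipped.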

In [None]:
ln.add(folder)

## What happens under the hood?

### In the SQL database

1. A `Folder` entry
2. 15 `File` entries, one for each of the 15 files inside the directory
3. A `Notebook` entry
4. A `Run` entry

All of these entries are linked, so you can find the files using any of the metadata fields.
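
The idea of linked entries can be sketched with an in-memory SQLite database. The table and column names below (`folder`, `file`, `folder_file`) are hypothetical and chosen for illustration; they are not the actual lamindb schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    -- Hypothetical, simplified tables; not the real lamindb schema.
    CREATE TABLE folder (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE file (id INTEGER PRIMARY KEY, name TEXT, suffix TEXT);
    -- Link table recording which file belongs to which folder.
    CREATE TABLE folder_file (
        folder_id INTEGER REFERENCES folder(id),
        file_id INTEGER REFERENCES file(id)
    );
    """
)
con.execute("INSERT INTO folder VALUES (1, 'sample_001')")
con.executemany(
    "INSERT INTO file VALUES (?, ?, ?)",
    [(1, "metrics_summary", ".csv"), (2, "features", ".tsv.gz")],
)
con.executemany("INSERT INTO folder_file VALUES (1, ?)", [(1,), (2,)])

# Find all files of a folder by joining through the link table.
rows = con.execute(
    """
    SELECT file.name, file.suffix
    FROM file
    JOIN folder_file ON folder_file.file_id = file.id
    JOIN folder ON folder.id = folder_file.folder_id
    WHERE folder.name = ?
    ORDER BY file.id
    """,
    ("sample_001",),
).fetchall()
print(rows)
```

Because the link lives in its own table, the same join can start from either side: from a folder to its files, or from a file to the folders that contain it.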

In [None]:
ln.select(ln.Folder, name=folder.name).one()

In [None]:
ln.select(ln.Folder).join(ln.File.folders).where(ln.Folder.name == "sample_001").df()

In [None]:
ln.select(ln.schema.Notebook, id=ln.context.transform.id).one()

In [None]:
ln.select(ln.schema.Run, id=ln.context.run.id).one()

## View the directory tree

In [None]:
folder.tree()
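
For readers curious how such a tree view can be produced, the following is a small sketch using only `pathlib`. The function `render_tree` is a hypothetical helper and makes no claim about how `folder.tree()` works internally or what its exact output format is.

```python
import tempfile
from pathlib import Path


def render_tree(root: Path, prefix: str = "") -> list[str]:
    """Return one line per entry under root, drawn with box-drawing connectors."""
    # Hypothetical helper, not part of lamindb.
    lines = []
    entries = sorted(root.iterdir(), key=lambda p: p.name)
    for i, entry in enumerate(entries):
        last = i == len(entries) - 1
        lines.append(f"{prefix}{'└── ' if last else '├── '}{entry.name}")
        if entry.is_dir():
            # Children are indented; keep the vertical bar unless this was the last entry.
            lines.extend(render_tree(entry, prefix + ("    " if last else "│   ")))
    return lines


# Stand-in directory with one subfolder and two files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "sample_001"
    (root / "analysis").mkdir(parents=True)
    (root / "analysis" / "analysis.csv").touch()
    (root / "metrics_summary.csv").touch()
    tree_lines = render_tree(root)

print("\n".join(tree_lines))
```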

## Find and retrieve files in a folder

### Retrieve files from a folder

In [None]:
with ln.Session() as ss:
    folder = ss.select(ln.Folder, name="sample_001").first()
    files = folder.files

In [None]:
files[:2]

### Retrieve files via their paths relative to the directory

In [None]:
folder.get(relpath="raw_feature_bc_matrix/features.tsv.gz")

In [None]:
folder.get(relpath=["analysis/analysis.csv", "raw_feature_bc_matrix/features.tsv.gz"])

In [None]:
folder.get(relpath="raw_feature_bc_matrix")

In [None]:
folder.get(relpath="raw_feature_bc_matrix", suffix=".mtx.gz")
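
The calls above suggest that `relpath` can name either a single file or a subdirectory, and that `suffix` narrows the matches. The sketch below models that matching logic on plain strings. It is an assumption about the behavior, not the library's implementation; `match_relpaths` is a hypothetical helper and the file list is a stand-in.

```python
from pathlib import PurePosixPath

# Stand-in list of relative paths; the real folder contains more files.
relpaths = [
    "analysis/analysis.csv",
    "raw_feature_bc_matrix/barcodes.tsv.gz",
    "raw_feature_bc_matrix/features.tsv.gz",
    "raw_feature_bc_matrix/matrix.mtx.gz",
]


def match_relpaths(relpaths, relpath, suffix=None):
    """Return paths equal to relpath or located beneath it, optionally filtered by suffix."""
    # Hypothetical helper, not part of lamindb.
    target = PurePosixPath(relpath)
    hits = []
    for rp in relpaths:
        p = PurePosixPath(rp)
        if p == target or target in p.parents:
            if suffix is None or p.name.endswith(suffix):
                hits.append(rp)
    return hits


single = match_relpaths(relpaths, "raw_feature_bc_matrix/features.tsv.gz")
subdir = match_relpaths(relpaths, "raw_feature_bc_matrix")
filtered = match_relpaths(relpaths, "raw_feature_bc_matrix", suffix=".mtx.gz")
print(single, subdir, filtered, sep="\n")
```

Comparing whole path components through `PurePosixPath.parents`, rather than using a string prefix test, avoids accidentally matching a sibling directory whose name merely starts with the same characters.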

Query a specific file from a folder using `ln.select`:

In [None]:
ln.select(ln.File, name="metrics_summary").join(ln.File.folders).where(
    ln.Folder.name == "sample_001"
).df()