# Track files

In [None]:
import lamindb as ln

In [None]:
ln.track()

```{note}

Within a Jupyter notebook, calling `ln.track()` tracks the notebook run as a data source.

Learn more: {doc}`/guide/run`.
```

## Usage

A local file:

In [None]:
filepath = ln.dev.datasets.file_jpg_paradisi05().resolve().as_posix()

In [None]:
filepath

To start tracking this file, we create a `file` record:

```{note}

We'll work with a single class for data objects in memory and on disk: {class}`~lamindb.File`. On disk, these are often (but not always, e.g., for `zarr`) files.
```

In [None]:
file = ln.File(filepath)

The `file` record captures metadata about the file and will be our way to query and load data.

In [None]:
file
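To make concrete what kind of metadata such a record might capture, here is a minimal standard-library sketch. `FileRecord` and `record_from_path` are hypothetical names for illustration only and are not part of LaminDB; the actual fields of `ln.File` may differ.

```python
import hashlib
import secrets
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class FileRecord:
    """Hypothetical stand-in for a file metadata record (not LaminDB's ln.File)."""

    name: str
    suffix: str
    size: int
    checksum: str
    # Random identifier, assumed here to serve as the primary key
    id: str = field(default_factory=lambda: secrets.token_hex(10))


def record_from_path(path):
    """Derive metadata from a local file without moving or modifying it."""
    path = Path(path)
    data = path.read_bytes()
    return FileRecord(
        name=path.stem,
        suffix=path.suffix,
        size=len(data),
        checksum=hashlib.md5(data).hexdigest(),
    )


# Demo on a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "paradisi05_laminopathic_nuclei.jpg"
    demo_path.write_bytes(b"not a real image")
    demo_record = record_from_path(demo_path)
    print(demo_record)
```

In this sketch, creating the record only reads the file. Nothing is written to a database or to storage until a separate step persists it.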

We can also access linked metadata records, for instance, the record that stores metadata about this run.

In [None]:
file.source

Because we're ingesting from a notebook, the source defaults to the notebook run created upon calling `ln.track()`:

In [None]:
assert ln.context.run == file.source

Next, we add the metadata to the database and the data to storage in a single transaction:

In [None]:
ln.add(file)
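The idea of writing metadata and data together can be sketched with the standard library alone. The following is an illustration under stated assumptions, not how `ln.add` is implemented: `add_file`, the `file` table layout, and the IDs are all hypothetical. The metadata insert is rolled back if copying the file fails.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def add_file(conn, storage_dir, src, file_id):
    """Insert a metadata row and copy the file; undo both if either step fails."""
    src = Path(src)
    dest = Path(storage_dir) / f"{file_id}{src.suffix}"
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO file (id, name, suffix) VALUES (?, ?, ?)",
                (file_id, src.stem, src.suffix),
            )
            shutil.copy2(src, dest)
    except Exception:
        dest.unlink(missing_ok=True)  # remove a partially written copy
        raise
    return dest


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    storage = tmp / "mydata"
    storage.mkdir()
    src = tmp / "image.jpg"
    src.write_bytes(b"demo")

    conn = sqlite3.connect(tmp / "meta.db")
    conn.execute("CREATE TABLE file (id TEXT PRIMARY KEY, name TEXT, suffix TEXT)")

    stored_name = add_file(conn, storage, src, "abc123").name
    n_rows = conn.execute("SELECT count(*) FROM file").fetchone()[0]

    # A missing source file fails the copy, so the metadata insert is rolled back
    try:
        add_file(conn, storage, tmp / "missing.jpg", "def456")
    except FileNotFoundError:
        pass
    n_rows_after_failure = conn.execute("SELECT count(*) FROM file").fetchone()[0]
    conn.close()
```

The point of the design is that the database never holds a record for data that did not reach storage.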

## What happens under the hood?

### In the SQL database

1. A `File` entry
2. A `Notebook` entry
3. A `Run` entry

All three entries are linked so that you can find the file using any of the metadata fields.

In [None]:
ln.select(ln.File, name=file.name).one()

In [None]:
ln.select(ln.schema.Notebook, id=ln.context.transform.id).one()

In [None]:
ln.select(ln.schema.Run, id=ln.context.run.id).one()
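How linked entries let you find a file from other metadata can be illustrated with plain SQL. This is a sketch using an in-memory SQLite database; the table and column names (`notebook`, `run`, `file`, `source_id`, `notebook_id`) and the IDs are assumptions for the example and do not reflect LaminDB's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE notebook (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE run (id TEXT PRIMARY KEY, notebook_id TEXT REFERENCES notebook(id));
    CREATE TABLE file (id TEXT PRIMARY KEY, name TEXT, source_id TEXT REFERENCES run(id));
    INSERT INTO notebook VALUES ('nb1', 'Track files');
    INSERT INTO run VALUES ('run1', 'nb1');
    INSERT INTO file VALUES ('f1', 'paradisi05_laminopathic_nuclei', 'run1');
    """
)

# Find files starting from a notebook's metadata by following the links
rows = conn.execute(
    """
    SELECT file.name
    FROM file
    JOIN run ON file.source_id = run.id
    JOIN notebook ON run.notebook_id = notebook.id
    WHERE notebook.name = ?
    """,
    ("Track files",),
).fetchall()
conn.close()
print(rows)
```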

### In storage

```{note}

This is your configured storage location (in this instance `./mydata`), which you pass to `ln.setup.init(storage=...)` when initializing the instance.

If a cloud storage location is configured, the file will be uploaded.
```

A jpg file with a cryptic name:

In [None]:
!ls ./mydata

```{tip}

If you prefer semantic names, you can keep them by tracking existing data rather than ingesting data into a storage location: {doc}`/guide/existing`.

Naming data objects in storage by the primary key ID of the `File` is typically preferred when names could clash at large scale or when working with in-memory views.
```
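The name-clash argument can be shown in a few lines. This is a minimal sketch: `storage_key` is a hypothetical helper and the random hex IDs only stand in for primary keys; LaminDB's actual ID scheme may differ.

```python
import secrets
from pathlib import PurePosixPath


def storage_key(file_id, original_path):
    """Build a storage key from a record ID, keeping only the original suffix."""
    return f"{file_id}{PurePosixPath(original_path).suffix}"


# Two source files share a semantic name but live in different folders
sources = ["experiment_a/nuclei.jpg", "experiment_b/nuclei.jpg"]

semantic_keys = [PurePosixPath(s).name for s in sources]
id_keys = [storage_key(secrets.token_hex(10), s) for s in sources]

print(semantic_keys)  # both entries are "nuclei.jpg": a clash in a flat storage location
print(id_keys)  # two distinct keys ending in ".jpg"
```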

## Retrieve a file

Retrieve the data with `.load()`. Here, it returns the filepath with the cryptic filename.

In [None]:
file.load()

## Find a file

You can also query the `File` record by its metadata. One of the simplest ways is by name:

In [None]:
file = ln.select(ln.File, name="paradisi05_laminopathic_nuclei").one()

file

Learn more: {doc}`/guide/select`.