# Ingest data objects

Let's log in a test user:

In [None]:
!lndb login testuser1

Then create a local test LaminDB instance:

In [None]:
!lndb init --storage mytest

We're now ready to track & query data!

```{tip}

In Jupyter Notebook and JupyterLab, you can view the documentation for a Python function by pressing SHIFT + TAB. Press it twice to expand the view.
```

In [None]:
import lamindb as ln

ln.nb.header()

## Track local files - data objects on disk

Let's add an image file ([Paradisi05](https://bmcmolcellbiol.biomedcentral.com/articles/10.1186/1471-2121-6-27)):

<img width="150" alt="Laminopathic nuclei" src="https://upload.wikimedia.org/wikipedia/commons/2/28/Laminopathic_nuclei.jpg">

In [None]:
filepath = ln.dev.datasets.file_jpg_paradisi05()
filepath

First create a record for the file.

As we'll abstract over data objects in memory and on disk, we'll use the term _data object_:

In [None]:
dobject = ln.DObject(filepath)

The record captures metadata about the file and will be our way to query and load data.

In [None]:
dobject
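To make "metadata about the file" more concrete, here is a minimal sketch of the kind of information such a record might hold. The `FileRecord` class, its field names, and the `describe_file` helper are hypothetical and use only the standard library; they are not LaminDB's schema or API.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FileRecord:
    """Hypothetical record; field names are illustrative, not LaminDB's schema."""

    name: str
    suffix: str
    size: int
    checksum: str


def describe_file(path):
    """Collect basic metadata about a file on disk (hypothetical helper)."""
    path = Path(path)
    content = path.read_bytes()
    return FileRecord(
        name=path.stem,  # file name without extension
        suffix=path.suffix,  # extension, including the leading dot
        size=len(content),  # size in bytes
        checksum=hashlib.sha256(content).hexdigest(),  # content hash
    )


# Demonstrate on a small temporary file standing in for the image.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "paradisi05_laminopathic_nuclei.jpg"
    demo_path.write_bytes(b"not a real image")
    record = describe_file(demo_path)

print(record)
```

A content hash is a common choice for such records because it lets two files be compared without relying on their names.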

We can also access linked records, for instance, the record that stores metadata about this run.

In [None]:
dobject.source

Because we're ingesting from a notebook, the source defaults to the notebook run:

In [None]:
ln.nb.run

In [None]:
assert ln.nb.run == dobject.source

If we want to add metadata & data to the database, we can do so in a single transaction:

In [None]:
ln.add(dobject)

## Track data objects in memory

In [None]:
import sklearn.datasets

Let's now ingest an in-memory `DataFrame` storing the iris dataset:

In [None]:
df = sklearn.datasets.load_iris(as_frame=True).frame

df.head()

In [None]:
df.shape

When ingesting in-memory objects, a `name` argument needs to be passed:

In [None]:
dobject = ln.DObject(df, name="iris")

In [None]:
ln.add(dobject)
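One plausible reason for the `name` requirement is that a file on disk already carries a name, while an in-memory object does not. The sketch below illustrates that idea with a hypothetical `resolve_name` helper; it is an assumption for illustration, not a description of how LaminDB resolves names internally.

```python
from pathlib import Path


def resolve_name(data, name=None):
    """Hypothetical helper: pick a name for a data object."""
    if isinstance(data, (str, Path)):
        # A file on disk already carries a name we can fall back on.
        return name or Path(data).stem
    if name is None:
        # An in-memory object has no filename to infer a name from.
        raise ValueError("pass `name` when ingesting an in-memory object")
    return name


print(resolve_name("images/nuclei.jpg"))
print(resolve_name({"sepal length": [5.1, 4.9]}, name="iris"))
```

Calling `resolve_name` on an in-memory object without `name` raises a `ValueError` in this sketch.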

## The story of `DObject`

```{seealso}

[What is DObject?](https://lamin.ai/docs/db/guide/advanced) 

```

We have come to love the pydata family of `DataFrame`, `AnnData`, `torch.utils.data.DataLoader`, `zarr.Array`, `pyarrow.Table`, `xarray.Dataset`, and others (!) for accessing lower-level data objects.

But we couldn’t find an object that captures how data objects are linked to context.
So we made `DObject` to help model and understand data objects in relation to their context.

Context can be other data objects, data transformations, the people and pipelines that performed those transformations, and all aspects of data lineage.
Context can also be theories, hypotheses, and any entity of the domain in which data is generated and modeled.

Depending on how `DObject`s are linked to context, they give rise to features of data lakes, warehouses and knowledge graphs.

We’ve worked in biology for many years, so we focused on linking `DObject` to biological concepts: entities, their types, records, transformations, and relations.
You'll learn about them further down the guide.

But first, let's see how to query!
