![imaging1/4](https://img.shields.io/badge/imaging1/4-lightgrey)
[![Jupyter Notebook](https://img.shields.io/badge/Source%20on%20GitHub-orange)](https://github.com/laminlabs/lamin-usecases/blob/main/docs/imaging.ipynb)

# sc-imaging

Here, you will learn how to structure and featurize a large imaging collection and make it queryable for large-scale machine learning:

1. Load and annotate a {class}`~lamindb.Collection` of microscopy images (![sc-imaging1/4](https://img.shields.io/badge/imaging1/4-lightgrey))
2. Generate single-cell images ([![sc-imaging2/4](https://img.shields.io/badge/imaging2/4-lightgrey)](/sc-imaging2))
3. Featurize single-cell images ([![sc-imaging3/4](https://img.shields.io/badge/imaging3/4-lightgrey)](/sc-imaging3))
4. Train a model to identify autophagy-positive cells ([![sc-imaging4/4](https://img.shields.io/badge/imaging4/4-lightgrey)](/sc-imaging4))


```{toctree}
:maxdepth: 1
:hidden:

sc-imaging2
sc-imaging3
sc-imaging4
```

First, we load and annotate a collection of microscopy images in TIFF format that [was previously uploaded](https://lamin.ai/scportrait/examples/transform/asoq6WyPequ8?).

The images used here were acquired as part of a [study](https://www.biorxiv.org/content/10.1101/2023.06.01.542416v1) on autophagy, a cellular process during which cells recycle their components in autophagosomes. The study tracked genetic determinants of autophagy through fluorescence microscopy of human U2OS cells.

In [None]:
# pip install 'lamindb[jupyter,bionty]'
!lamin init --storage ./test-sc-imaging --modules bionty

In [None]:
import lamindb as ln
import bionty as bt
from tifffile import imread
import matplotlib.pyplot as plt

ln.track()

All image metadata is stored in an already ingested `.csv` file on the `scportrait/examples` instance.

In [None]:
metadata_files = (
    ln.Artifact.using("scportrait/examples")
    .get(key="input_data_imaging_usecase/metadata_files.csv")
    .load()
)

metadata_files.head(2)

In [None]:
metadata_files.apply(lambda col: col.unique())
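
The call above lists the distinct values found in each metadata column. As a minimal sketch of the same idea on a toy table (the column values below are illustrative stand-ins, not the real metadata), a dictionary comprehension gives an output that reads the same way no matter how many distinct values each column has:

```python
import pandas as pd

# Toy stand-in for `metadata_files`; the values are illustrative only.
toy_metadata = pd.DataFrame(
    {
        "genotype": ["WT", "WT", "EI24KO", "EI24KO"],
        "stimulation": ["untreated", "Torin-1", "untreated", "Torin-1"],
        "channel": [1, 2, 3, 1],
    }
)

# Map each column name to its distinct values, in order of first appearance.
unique_values = {
    col: toy_metadata[col].unique().tolist() for col in toy_metadata.columns
}
print(unique_values)
```

Inspecting the distinct values first makes it easier to spot unexpected spellings before they are registered as labels.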

All images show the same cell line (U2OS) and were acquired on an Opera Phenix microscope at 20X magnification.
To induce autophagy, the cells were either treated for 14 hours with `Torin-1`, a small molecule that mimics starvation, or left untreated as a control.

To visualize the process of autophagosome formation, the U2OS cells have been genetically engineered to express fluorescently tagged proteins.
`LC3B` is a marker of autophagosomes, allowing us to visualize their formation in the `mCherry` channel.
`LckLip` is a membrane-targeted fluorescent protein, which helps outline the cellular boundaries of individual cells in the `Alexa488` channel.
Furthermore, the cells’ DNA was stained using `Hoechst`, which we can visualize in the `DAPI` channel to identify the nuclei of individual cells.

These three structures are visualized in three separate image channels:

| Channel | Imaged Structure   |
|---------|--------------------|
| 1       | DNA                |
| 2       | Autophagosomes     |
| 3       | Plasma Membrane    |

In addition to expressing fluorescently tagged proteins, some of the cells have had the `EI24` gene knocked out, leading to two different genotypes: `WT` (wild-type) cells and `EI24KO` (knockout) cells.
For each genotype, two different clonal cell lines were analyzed, with multiple fields of view (FOVs) captured per condition.
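
The experimental design described above can be enumerated to estimate how many images to expect. This is only a sketch: the clone labels, the stimulation labels, and the number of FOVs per condition are assumptions made for illustration, and the real values come from the metadata table.

```python
from itertools import product

# Known from the text: two genotypes, two stimulation conditions, three channels.
genotypes = ["WT", "EI24KO"]
stimulations = ["untreated", "Torin-1"]  # label spellings are assumed
channels = [1, 2, 3]

# Assumptions for illustration only.
clones = ["clone_1", "clone_2"]  # hypothetical names, two clones per genotype
n_fovs = 2  # hypothetical number of FOVs per condition

# Every genotype/clone/stimulation combination is one condition.
conditions = list(product(genotypes, clones, stimulations))
expected_images = len(conditions) * n_fovs * len(channels)

print(len(conditions))  # 8 conditions
print(expected_images)  # 48 images under the assumed n_fovs
```

Comparing such an expected count with the number of rows in the metadata table is a quick sanity check that no condition or channel is missing.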

To enable queries on our images, we annotate them with the corresponding metadata.

In [None]:
autophagy_imaging_schema = ln.Schema(
    name="Autophagy imaging schema",
    features=[
        ln.Feature(name="genotype", dtype=ln.ULabel.name).save(),
        ln.Feature(name="stimulation", dtype=ln.ULabel.name).save(),
        ln.Feature(name="cell_line", dtype=bt.CellLine.name).save(),
        ln.Feature(name="cell_line_clone", dtype=ln.ULabel.name).save(),
        ln.Feature(name="channel", dtype=ln.ULabel.name).save(),
        ln.Feature(name="FOV", dtype=ln.ULabel.name).save(),
        ln.Feature(name="magnification", dtype=ln.ULabel.name).save(),
        ln.Feature(name="microscope", dtype=ln.ULabel.name).save(),
        ln.Feature(name="imaged structure", dtype=ln.ULabel.name).save(),
        ln.Feature(
            name="resolution", dtype=float, description="conversion factor for px to µm"
        ).save(),
    ],
    coerce_dtype=True,
).save()
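
The `resolution` feature is described as a conversion factor from px to µm. Assuming it is expressed in µm per pixel, measurements taken on the images can be converted to physical units as sketched below. The helper names and the numeric value are hypothetical.

```python
def px_to_um(length_px: float, um_per_px: float) -> float:
    """Convert a length in pixels to micrometers."""
    return length_px * um_per_px


def px_area_to_um2(area_px: float, um_per_px: float) -> float:
    """Convert an area in pixels to square micrometers."""
    return area_px * um_per_px**2


# Hypothetical resolution value, for illustration only.
um_per_px = 0.5
print(px_to_um(100, um_per_px))  # 50.0
print(px_area_to_um2(400, um_per_px))  # 100.0
```

Note that areas scale with the square of the conversion factor, which is easy to overlook when comparing cell sizes across datasets.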

In [None]:
curator = ln.curators.DataFrameCurator(metadata_files, autophagy_imaging_schema)
try:
    curator.validate()
except ln.core.exceptions.ValidationError as e:
    print(e)

Add and standardize missing terms:

In [None]:
curator.cat.standardize("cell_line")
curator.cat.add_new_from("all")
curator.validate()
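
Conceptually, the validate, standardize, and add steps above check categorical values against registered terms, map synonyms to a canonical name, and register the remaining new terms. The following plain-Python sketch illustrates that idea only. It is not how lamindb implements curation, and the helper name, the canonical name `"U-2 OS"`, and the synonym map are hypothetical.

```python
def find_unvalidated(records, allowed):
    """Return, per column, the sorted values that are not in the allowed set."""
    problems = {}
    for column, allowed_values in allowed.items():
        seen = {record[column] for record in records}
        unknown = sorted(seen - allowed_values)
        if unknown:
            problems[column] = unknown
    return problems


records = [
    {"genotype": "WT", "cell_line": "U2OS"},
    {"genotype": "EI24KO", "cell_line": "U2OS"},
]
# Hypothetical registry state before curation.
allowed = {"genotype": {"WT"}, "cell_line": {"U-2 OS"}}

before = find_unvalidated(records, allowed)
print(before)  # {'genotype': ['EI24KO'], 'cell_line': ['U2OS']}

# "Standardize": map a synonym to its canonical name (hypothetical mapping).
synonyms = {"U2OS": "U-2 OS"}
for record in records:
    record["cell_line"] = synonyms.get(record["cell_line"], record["cell_line"])

# "Add new": register a term that is genuinely new.
allowed["genotype"].add("EI24KO")

after = find_unvalidated(records, allowed)
print(after)  # {}
```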

Next, we save all images to our lamindb instance and annotate them with the relevant metadata.

In [None]:
# Create study feature and associated label
ln.Feature(name="study", dtype=ln.ULabel).save()
ln.ULabel(name="autophagy imaging").save()

# loop through all Artifacts and add feature values
artifacts = []
for _, row in metadata_files.iterrows():
    artifact = (
        ln.Artifact.using("scportrait/examples")
        .filter(key__icontains=row["image_path"])
        .one()
    )
    # Saving an artifact queried from another instance transfers it to the current instance
    artifact.save()
    artifact.cell_lines.add(bt.CellLine.filter(name=row.cell_line).one())

    artifact.features.add_values(
        {
            "genotype": row.genotype,
            "stimulation": row.stimulation,
            "cell_line_clone": row.cell_line_clone,
            "channel": row.channel,
            "imaged structure": row["imaged structure"],
            "study": "autophagy imaging",
            "FOV": row.FOV,
            "magnification": row.magnification,
            "microscope": row.microscope,
            "resolution": row.resolution,
        }
    )

    artifacts.append(artifact)

In [None]:
artifacts[0].describe()

In addition, we create a {class}`~lamindb.Collection` to hold all {class}`~lamindb.Artifact` objects that belong to this specific imaging study.

In [None]:
collection = ln.Collection(
    artifacts,
    key="Annotated autophagy imaging raw images",
    description="annotated microscopy images of cells stained for autophagy markers",
).save()

Let's look at some example images. We sort the metadata by clone, stimulation condition, and FOV so that images from the same FOV are plotted together, which lets us check that the channels align.

In [None]:
def plot_fov_channels(rows):
    """Plot the images listed in `rows` side by side, one panel per row."""
    fig, axs = plt.subplots(1, len(rows), figsize=(15, 5))
    for idx, row in rows.reset_index(drop=True).iterrows():
        path = ln.Artifact.using("scportrait/examples").get(key=row["image_path"]).cache()
        image = imread(path)
        axs[idx].imshow(image)
        axs[idx].set_title(f"{row['imaged structure']}")
        axs[idx].axis("off")


sorted_metadata = metadata_files.sort_values(
    by=["cell_line_clone", "stimulation", "FOV"]
)

# Plot the first three and the last three images of the sorted table
plot_fov_channels(sorted_metadata.head(3))
plot_fov_channels(sorted_metadata.tail(3))
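
The plots above show each channel in its own panel. To judge channel alignment by eye, it can help to merge the three channels of one FOV into a single RGB composite. The following is a minimal sketch using synthetic arrays in place of the TIFF images. The function names, the percentile-based normalization, and the color assignment are illustrative choices, not part of the original workflow.

```python
import numpy as np


def normalize(channel: np.ndarray, low: float = 1.0, high: float = 99.0) -> np.ndarray:
    """Rescale one channel to [0, 1] using percentile clipping."""
    lo, hi = np.percentile(channel, [low, high])
    if hi <= lo:  # constant image: avoid division by zero
        return np.zeros(channel.shape, dtype=float)
    return np.clip((channel.astype(float) - lo) / (hi - lo), 0.0, 1.0)


def make_composite(red: np.ndarray, green: np.ndarray, blue: np.ndarray) -> np.ndarray:
    """Stack three equally shaped 2D channels into an RGB image of shape (H, W, 3)."""
    if not (red.shape == green.shape == blue.shape):
        raise ValueError("all channels must have the same shape")
    return np.stack([normalize(red), normalize(green), normalize(blue)], axis=-1)


# Synthetic 16-bit channels standing in for the three images of one FOV.
rng = np.random.default_rng(0)
autophagosomes, membrane, dna = (
    rng.integers(0, 65535, size=(64, 64), dtype=np.uint16) for _ in range(3)
)

composite = make_composite(red=autophagosomes, green=membrane, blue=dna)
print(composite.shape)
```

The resulting array can be passed directly to `plt.imshow`, which interprets float arrays of shape `(H, W, 3)` with values in `[0, 1]` as RGB.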

In [None]:
ln.finish()