# Using LogReg for feature selection

For this quickstart, install lamindb in addition to modlyn, then initialize a lamindb instance so you can query a test dataset.

```
pip install modlyn lamindb
lamin init --storage test-modlyn
```

In [None]:
import modlyn as mn
import lamindb as ln

ln.track()

Let's use a [very small subset](https://lamin.ai/laminlabs/arrayloader-benchmarks/artifact/D21D2K8697CY8tHE0001) of the Tahoe100M dataset with 100 observations.

In [None]:
artifact = ln.Artifact.using("laminlabs/arrayloader-benchmarks").get("D21D2K8697CY8tHE0001")
adata = artifact.load()
adata

We'll predict cell lines via logistic regression. Most classes are supported by only a single sample.

In [None]:
adata.obs.value_counts("cell_line")
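
To make the class imbalance concrete, it can help to count how many classes are backed by exactly one sample. The following is a minimal sketch using only the standard library; the `labels` list is hypothetical toy data standing in for `adata.obs["cell_line"]`, not values from the real dataset.

```python
from collections import Counter

# Hypothetical labels standing in for adata.obs["cell_line"] (not real data).
labels = ["A549", "A549", "A549", "HeLa", "K562", "MCF7", "MCF7", "HT-29"]

# Number of samples per class.
counts = Counter(labels)

# Classes supported by exactly one sample.
singletons = sorted(name for name, n in counts.items() if n == 1)

# Fraction of classes that are singletons.
share = len(singletons) / len(counts)

print(f"{len(singletons)} of {len(counts)} classes have a single sample")
```

With the real data, the same counting could presumably be applied to `adata.obs["cell_line"].tolist()`.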

Let's fit the model.

In [None]:
logreg = mn.models.SimpleLogReg(
    adata=adata,
    label_column="cell_line",
    learning_rate=1e-1,
)
logreg.fit(
    adata_train=adata,
    adata_val=adata[:20],  # demo only: in real-world use, validate on held-out data, _not_ a subset of the training data
    train_dataloader_kwargs={
        "batch_size": 8,
    },
    max_epochs=4,
)
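
In a real-world setting, the validation data should not overlap with the training data. The following is a minimal sketch of a random held-out split over observation indices, assuming 100 observations as in this test dataset; the names `train_idx` and `val_idx` are illustrative and not part of the modlyn API.

```python
import random

n_obs = 100  # number of observations in this test dataset
n_val = 20   # size of the held-out validation set (an arbitrary choice)

# Shuffle the observation indices reproducibly.
rng = random.Random(0)
shuffled = rng.sample(range(n_obs), n_obs)

# Disjoint index sets for validation and training.
val_idx = sorted(shuffled[:n_val])
train_idx = sorted(shuffled[n_val:])

print(len(train_idx), len(val_idx))
```

The index lists could then be used to subset the data, for example `adata[train_idx]` and `adata[val_idx]`. Note that with classes that have a single sample, a random split will place some classes in the validation set only, so this sketch is more meaningful on a larger dataset.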

In [None]:
logreg.plot_losses()

Because this test dataset contains almost no data, performance in this quickstart is poor, and precision is ill-defined for several classes.

In [None]:
logreg.plot_classification_report(adata)
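
Precision for a class is TP / (TP + FP), so it is undefined whenever the model never predicts that class. The following toy sketch illustrates this with hypothetical labels; it is a hand-written helper for illustration and makes no claim about how `plot_classification_report` computes or displays such cases.

```python
def precision_per_class(y_true, y_pred):
    """Return per-class precision, or None where it is undefined."""
    result = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        # True positives: predicted cls and truly cls.
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        # False positives: predicted cls but truly another class.
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        # Undefined (0 / 0) when cls is never predicted.
        result[cls] = tp / (tp + fp) if (tp + fp) > 0 else None
    return result


# Hypothetical labels: the model predicts "A" for every sample.
y_true = ["A", "A", "B", "C"]
y_pred = ["A", "A", "A", "A"]

precisions = precision_per_class(y_true, y_pred)
print(precisions)
```

Here classes "B" and "C" are never predicted, so their precision has a zero denominator. Reporting tools typically substitute a placeholder value in this situation, which is why such entries should not be over-interpreted.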

In [None]:
# ln.finish()  # finish tracking in lamindb