
# Single-Cell-Fuzzy-Labels

> This GitHub repo offers a method for label transfer in single-cell RNA-seq data using shared embeddings from foundation models. It uses language model APIs (GPT-3.5 or GPT-4) to harmonize label sets, which makes transferred labels easier to interpret and allows basic metrics to be computed.

The `Single-Cell-Fuzzy-Labels` library provides a label transfer method for single-cell RNA-seq data analysis. Using a K-nearest neighbors (KNN) strategy, it transfers labels from a well-annotated reference dataset to newly queried data. It combines pre-trained single-cell foundation models with language models such as GPT-3.5 or GPT-4 to harmonize different label sets, so that reference and query annotations can be compared under one shared label schema. The library also provides basic metrics for assessing the quality of the label transfer under this harmonized schema.
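To make the KNN idea concrete, here is a minimal sketch of fuzzy label transfer in a shared embedding space. It is not the library's implementation: the function name `knn_fuzzy_labels`, the Euclidean distance, and the toy 2-D embeddings are all assumptions chosen for illustration. Each query cell receives the fraction of its `k` nearest reference cells carrying each label, and a hard label can be read off as the label with the largest fraction.

```python
from collections import Counter
from math import dist


def knn_fuzzy_labels(query_emb, ref_emb, ref_labels, k=3):
    """Hypothetical helper: for each query cell, return a dict mapping
    label -> fraction of its k nearest reference cells with that label."""
    results = []
    for q in query_emb:
        # Rank reference cells by Euclidean distance to the query cell.
        nearest = sorted(range(len(ref_emb)), key=lambda i: dist(q, ref_emb[i]))[:k]
        votes = Counter(ref_labels[i] for i in nearest)
        results.append({label: count / k for label, count in votes.items()})
    return results


# Toy 2-D "embeddings" (illustrative values, not real data).
ref_emb = [(0, 0), (0, 1), (1, 0), (3, 3), (3, 4), (4, 3)]
ref_labels = ["T cell", "T cell", "T cell", "B cell", "B cell", "B cell"]
query_emb = [(0.2, 0.2), (3.5, 3.5), (1.5, 2.0)]

fuzzy = knn_fuzzy_labels(query_emb, ref_emb, ref_labels, k=3)
# Hard label = label with the highest neighbor fraction.
hard = [max(f, key=f.get) for f in fuzzy]

for cell, (f, h) in enumerate(zip(fuzzy, hard)):
    print(cell, h, f)
```

The third query cell lies between the two clusters, so its fuzzy label is split between both cell types rather than being forced into one. That split is the information a hard label would discard.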



## Install

```sh
pip install Single_Cell_Fuzzy_Labels
```

## How to use

To utilize the `Single-Cell-Fuzzy-Labels` library in your single-cell RNA-seq data analysis, follow these steps:

1. Install the library using pip:
   ```sh
   pip install Single_Cell_Fuzzy_Labels
   ```

2. Import the library in your Python environment:
   ```python
   from Single_Cell_Fuzzy_Labels.core import *
   ```

Optional steps before label transfer:

3. Download pre-trained embeddings from cellxgene:
   ```python
   embeddings = download_embeddings(cellxgene_url)
   ```

   Alternatively, embed your own dataset using a foundation model such as UCE or scGPT:
   ```python
   dataset_embeddings = embed_dataset(your_dataset, model='UCE')
   ```

4. Assess embedding quality using Single-cell Integration Benchmarking (scIB):
   ```python
   quality_metrics = assess_embedding_quality(dataset_embeddings)
   ```

5. Prepare your reference dataset with well-annotated labels.

6. Use the `transfer_labels` function to transfer labels from the reference dataset to your new query data:
   ```python
   transferred_labels = transfer_labels(query_data, reference_data)
   ```

7. Evaluate the label transfer quality using the provided metrics:
   ```python
   evaluate_transfer(transferred_labels)
   ```
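Why harmonizing label sets matters for evaluation can be shown with a small sketch. It assumes a label mapping is already available; in the library this mapping would come from the language model step, but here it is written by hand. The function `harmonized_accuracy` and all labels below are hypothetical and are not part of the library's API.

```python
def harmonized_accuracy(predicted, truth, mapping):
    """Hypothetical metric: fraction of cells whose predicted and true labels
    agree after both are mapped to a shared vocabulary. Labels missing from
    the mapping are compared as-is."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        raise ValueError("at least one cell is required")
    matches = sum(
        mapping.get(p, p) == mapping.get(t, t) for p, t in zip(predicted, truth)
    )
    return matches / len(truth)


# Hand-written mapping standing in for the language-model harmonization step.
mapping = {
    "CD4+ T cell": "T cell",
    "CD8 T": "T cell",
    "B lymphocyte": "B cell",
}

predicted = ["CD4+ T cell", "B cell", "CD8 T", "NK cell"]
truth = ["T cell", "B lymphocyte", "T cell", "B cell"]

raw_acc = harmonized_accuracy(predicted, truth, {})       # exact string match only
harm_acc = harmonized_accuracy(predicted, truth, mapping)  # after harmonization
print(raw_acc, harm_acc)
```

With exact string matching, none of the four cells count as correct, even though three of them differ only in naming convention. After harmonization, those three match, and only the genuine disagreement remains.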

For more detailed usage instructions and examples, refer to the documentation and the tutorials included in the GitHub repository.

