ReceptorGPT

Discover TCR matches, antigen specificity, and structure with AI

View Demo · Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Contributing
  5. License
  6. Contact

About The Project

[Screenshot: ReceptorGPT app]

ReceptorGPT is an AI-powered app that uses Transformer-based protein language models to help you identify T cell receptor (TCR) matches, discover antigen specificity, and predict TCR structure. ReceptorGPT leverages embeddings to represent TCR sequences in a compact and informative way, allowing it to efficiently query a database of TCRs with known antigen specificity.

  • 🔍 Efficiently identify TCR matches
  • 💡 Discover TCR antigen specificity
  • 🧩 Predict TCR structure
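The embedding-based lookup described above can be illustrated with a minimal, standard-library-only sketch. It is not the app's implementation: `toy_embed` is a hypothetical stand-in (normalized amino-acid composition) for the protein language model embeddings, a plain Python list stands in for the Chroma database, and the sequences and antigen labels are made-up placeholders.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_embed(sequence):
    """Hypothetical stand-in embedding: L2-normalized amino-acid composition.

    The real app uses a Transformer-based protein language model instead.
    """
    counts = Counter(sequence.upper())
    vec = [float(counts.get(aa, 0)) for aa in AMINO_ACIDS]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


def build_index(records):
    """Embed each (sequence, label) record once and keep the vectors in memory."""
    return [(toy_embed(seq), seq, label) for seq, label in records]


def query(index, sequence, k=3):
    """Return the k stored records most similar to the query sequence."""
    q = toy_embed(sequence)
    scored = [(cosine(q, vec), seq, label) for vec, seq, label in index]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Made-up CDR3-like strings with placeholder labels, for illustration only.
records = [
    ("CASSLGQAYEQYF", "antigen_A"),
    ("CASSIRSSYEQYF", "antigen_B"),
    ("CAWSVGGNTEAFF", "antigen_C"),
]
index = build_index(records)
hits = query(index, "CASSLGQAYEQYF", k=2)
for score, seq, label in hits:
    print(f"{seq}\t{label}\t{score:.3f}")
```

The antigen label of the closest stored TCR serves as the candidate specificity for the query. A vector database replaces the linear scan here with an index that scales to large collections.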

Built With

  • Python
  • Streamlit
  • Transformers
  • Chroma
  • ESM

Getting Started

To launch the ReceptorGPT web app, follow the steps below:

  1. Clone the repo:
git clone https://github.com/naity/ReceptorGPT.git
  2. Install the dependencies listed under Prerequisites.
  3. Run the Streamlit app:
streamlit run app.py

Prerequisites

The requirements.txt file lists the Python packages required to run the app. Install them with:

pip install -r requirements.txt

Usage

The ReceptorGPT app can be run as is. However, users can also update or customize the embedding database for the app using the following Jupyter Notebooks:

  1. antigen_specific_tcrs.ipynb: This notebook preprocesses TCR sequences from various public databases. Users can customize the TCRs they want to include in the database.

  2. build_index.ipynb: Run this notebook after executing antigen_specific_tcrs.ipynb to transform TCRs into vectors and build the embedding database for the ReceptorGPT app. You may swap in a different underlying language model for creating the embeddings.

  3. evaluation.ipynb: This notebook assesses the performance of the constructed TCR embedding database. Each TCR used to build the database is queried against it. The recall is 99.7%, meaning that the query TCR itself is returned 99.7% of the time.
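The self-retrieval check in the evaluation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the notebook's code: the function names, the `toy_index` dictionary, and its vectors are hypothetical, and the cutoff `k` is a parameter of this sketch (the README does not say how many results per query the evaluation inspects).

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_ids(index, query_vec, k):
    """Rank all stored ids by similarity to the query and keep the first k."""
    ranked = sorted(
        index,
        key=lambda item_id: cosine_similarity(index[item_id], query_vec),
        reverse=True,
    )
    return ranked[:k]


def self_retrieval_recall(index, k=1):
    """Fraction of stored items that appear in their own top-k results."""
    if not index:
        raise ValueError("index is empty")
    hits = sum(
        1 for item_id, vec in index.items() if item_id in top_k_ids(index, vec, k)
    )
    return hits / len(index)


# Hypothetical toy vectors; tcr_3 shares its vector with tcr_2 on purpose.
toy_index = {
    "tcr_1": [1.0, 0.0, 0.0],
    "tcr_2": [0.0, 1.0, 0.0],
    "tcr_3": [0.0, 1.0, 0.0],
    "tcr_4": [0.0, 0.0, 1.0],
}
recall_at_1 = self_retrieval_recall(toy_index, k=1)
recall_at_2 = self_retrieval_recall(toy_index, k=2)
print(recall_at_1, recall_at_2)
```

In this toy index, two entries with identical vectors compete for the top slot, so one of them misses at `k=1` and is recovered at `k=2`. That is one way a self-query can fail to return itself; whether it explains the gap from 100% in the actual database is not stated in this README.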

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE.txt for more information.

Contact

ytiancompbio · @yuan_tian
