lambda-science/NLMyo

NLMyo🔧: a toolbox built to leverage the power of Large Language Models (LLMs) to exploit histology text reports.

NLMyo🔧 is a toolbox built to leverage the power of Large Language Models (LLMs) to exploit histology text reports.
Available tools:

  • Anonymizer🕵️: a tool to automatically censor patient histology report PDFs.
  • Extract Metadata📝: a tool to extract metadata such as biopsy number, muscle, diagnosis... from histology reports.
  • Auto Classify 🪄: a tool to automatically predict a congenital myopathy subtype diagnosis from a histology report using AI (large language models). It can currently distinguish between: Nemaline Myopathy, Core Myopathy, Centro-nuclear Myopathy, Non Congenital Myopathy (NON-MC).
  • Report Search 🔎: a tool to search for a specific term in a set of histology reports. The tool returns the 5 reports from our database that are closest to your symptom query.
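To give an intuition for what "closest to your query" can mean in Report Search, here is a minimal sketch of a top-k lookup by cosine similarity over embedding vectors. It is an illustration only, not the code NLMyo runs: the function names, the report ids and the three-dimensional vectors are all made up for the example, and the real tool relies on a vector store rather than a Python dictionary.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_reports(query_vec, report_vecs, k=5):
    """Return the ids of the k reports whose vectors are most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), report_id)
        for report_id, vec in report_vecs.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [report_id for _, report_id in scored[:k]]


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
reports = {
    "report_a": [0.9, 0.1, 0.0],
    "report_b": [0.1, 0.9, 0.0],
    "report_c": [0.8, 0.2, 0.1],
}
print(top_k_reports([1.0, 0.0, 0.0], reports, k=2))
```

With k=5, as in the tool description, the function would return up to five report ids ordered from most to least similar.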

🚨 DISCLAIMER: For some tools you can select OpenAI API mode for better results. In OpenAI mode, all data entered into these tools is sent to OpenAI servers. Please do not upload private or non-anonymized data. As per their terms of service, OpenAI does not retain any data (for longer than legally required, click for source) and does not use it for training. However, we do not take any responsibility for any data leak.

This project is free and open-source under the AGPL license. Feel free to fork it and contribute to its development.

Warning: This tool is still at an early stage and under active development.

How to Use

You can use the demo version at https://lbgi.fr/NLMyo/ or follow the "How to install" section to run your own instance.
Once on the website, simply select the tool you need in the sidebar on the left.
Here is a sample PDF that you can use with the tools: PDF File

How to install

  • Create a .env file containing your OpenAI API key, such as OPENAI_API_KEY=sk-...
  • Install the virtual environment with poetry install, then activate it with source .venv/bin/activate
  • Get the Vicuna LLM model: cd models && wget https://huggingface.co/eachadea/ggml-vicuna-7b-1.1/resolve/main/ggml-vic7b-q4_1.bin
  • If you are from our lab and have SSH access, you can pull the DVC data (raw data + ChromaDB) with dvc pull
  • If you are not from our lab and want to create your own embeddings, create a folder data/processed/ containing all the *.txt files to embed, then run python ingest.py to create the ChromaDB (vector store)
  • Run the app using streamlit run Home.py
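
Before launching the app, it can help to verify that the files from the steps above are in place. The script below is a hypothetical helper, not part of the NLMyo repository: the function name check_setup is made up, and the data/processed/ check only applies if you build your own embeddings instead of using dvc pull. The paths are taken from the installation steps.

```python
from pathlib import Path


def check_setup(root="."):
    """Return a list of problems found; an empty list means every check passed."""
    root = Path(root)
    problems = []

    # Step 1: a .env file holding the OpenAI API key.
    env_file = root / ".env"
    if not env_file.is_file():
        problems.append("missing .env file")
    else:
        lines = env_file.read_text(encoding="utf-8").splitlines()
        if not any(line.strip().startswith("OPENAI_API_KEY=") for line in lines):
            problems.append(".env has no OPENAI_API_KEY entry")

    # Step 3: the Vicuna model downloaded into models/.
    model = root / "models" / "ggml-vic7b-q4_1.bin"
    if not model.is_file():
        problems.append(f"missing model file: {model}")

    # Step 5 (own embeddings only): text reports to feed to ingest.py.
    processed = root / "data" / "processed"
    if not processed.is_dir() or not any(processed.glob("*.txt")):
        problems.append("data/processed/ contains no *.txt reports to embed")

    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("WARNING:", problem)
```

Run it from the repository root. If it prints nothing, the checked files are present, and you can start the app with streamlit run Home.py.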

Contact

Creator and Maintainer: Corentin Meyer, 3rd year PhD Student in the CSTB Team, ICube — CNRS — Unistra corentin.meyer@etu.unistra.fr

Citing NLMyo🔧

[placeholder]

Partners

NLMyo was born from a collaboration between the CSTB Team @ ICube led by Julie D. Thompson, the Morphological Unit of the Institute of Myology of Paris led by Teresinha Evangelista, the imagery platform MyoImage of the Center of Research in Myology led by Bruno Cadot, the photonic microscopy platform of the IGBMC led by Bertrand Vernay, and the Pathophysiology of neuromuscular diseases team @ IGBMC led by Jocelyn Laporte.