biocompute-objects/biocompute-object-llm


BioCompute Object Assistant

Background

The BioCompute Object (BCO) project is a community-driven open standards framework for standardizing and sharing computations and analyses. As biological data, and the workflows used to analyze and transform it, grow rapidly in both quantity and complexity, standardized documentation becomes imperative for experimental preservation, transparency, accuracy, and reproducibility.

As with any documentation standard, there are two main hurdles to continued adoption: the overhead of keeping a BCO accurate and up to date as the research evolves, and the effort of retroactively documenting pre-existing research. With recent improvements in large language models (LLMs), an automated BCO creation assistant has become an intriguing and potentially feasible use case.

Approaches

For this proof of concept, two main approaches were explored: the OpenAI Assistants API and a Retrieval-Augmented Generation (RAG) pipeline built with the Python LlamaIndex library. Each approach is described further in its own directory.
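To make the RAG idea concrete, here is a minimal, library-free sketch of the retrieve-then-prompt pattern. It is not the pipeline implemented in this repository and does not use LlamaIndex: the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), the bag-of-words scoring, and the example text chunks are all assumptions made up for illustration. A real pipeline would typically use learned embeddings and send the final prompt to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, chunks, top_k=2):
    """Assemble a prompt that grounds the question in retrieved chunks."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, top_k))
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Invented toy chunks standing in for passages split out of a paper.
    paper_chunks = [
        "Reads were aligned to the reference genome with an aligner using default parameters.",
        "Samples were collected from three sites over two years.",
        "Variant calling was performed on the aligned reads.",
    ]
    print(build_prompt("Which tool aligned the reads to the reference genome?", paper_chunks))
```

The point of the retrieval step is that only the passages most relevant to the question reach the model, which keeps the prompt short and ties the generated BCO fields to text that actually appears in the paper.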

Testing

For baseline testing, a hand-selected set of papers was used (found in the papers/ directory).
