# GAMER: Generative Analysis of Metadata Retrieval

GAMER uses a multi-agent framework built on LangGraph to retrieve and summarize metadata in response to a user's natural language query.

The workflow consists of 6 agents, or nodes. At each node a decision is made and new context is passed to either the model or the user. The framework incorporates decisions such as the following:
1. To best answer the query, should the entire database be queried, or the vector index?
   - Input: `x (query)`
   - Decides which data source to query against
   - Output: `entire_database, vector_embeddings`
2. If querying the vector embeddings, should the index be filtered further with metadata tags to improve retrieval?
   - Input: `x (query)`
   - Decides whether the database can be narrowed further by applying a MongoDB query
   - Output: `MongoDB query, None`
3. Are the retrieved documents relevant to the question?
   - Input: `x (query)`
   - Decides whether each document should be kept or discarded during summarization
   - Output: `yes, no`
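
The three decisions above can be sketched in plain Python to show the shape of each input and output. This is only an illustration: in GAMER these decisions are made by LLM-backed agents, whereas the keyword heuristics below are toy stand-ins. The function names, the `Document` class, and the `name` field in the filter are all hypothetical and are not part of the `metadata_chatbot` API.

```python
from dataclasses import dataclass
from typing import List, Literal, Optional


@dataclass
class Document:
    """Hypothetical stand-in for a retrieved metadata record."""
    name: str
    text: str


def route_query(query: str) -> Literal["entire_database", "vector_embeddings"]:
    """Decision 1: choose the data source (toy heuristic, not an LLM call)."""
    # Aggregate questions need every record; specific ones suit similarity search.
    aggregate_phrases = ("how many", "count", "all assets")
    if any(phrase in query.lower() for phrase in aggregate_phrases):
        return "entire_database"
    return "vector_embeddings"


def build_filter(query: str) -> Optional[dict]:
    """Decision 2: return a MongoDB-style filter, or None if nothing applies."""
    # Assumed tag: narrow the index when the query mentions SmartSPIM.
    if "smartspim" in query.lower():
        return {"name": {"$regex": "SmartSPIM"}}
    return None


def _terms(text: str) -> set:
    """Lowercase the words in text and strip common punctuation."""
    return {word.strip("?.,:;").lower() for word in text.split()}


def grade_document(query: str, doc: Document) -> Literal["yes", "no"]:
    """Decision 3: keep a document only if it shares a term with the query."""
    return "yes" if _terms(query) & _terms(doc.text) else "no"


def keep_relevant(query: str, docs: List[Document]) -> List[Document]:
    """Drop every document graded 'no' before summarization."""
    return [doc for doc in docs if grade_document(query, doc) == "yes"]


if __name__ == "__main__":
    question = "What injections were used in asset SmartSPIM_692908?"
    records = [
        Document("procedures", "Injections: AAVrg-Syn-Flpo delivered to the thalamus"),
        Document("rig", "Rig maintenance log"),
    ]
    print(route_query(question))
    print(build_filter(question))
    print([doc.name for doc in keep_relevant(question, records)])
```

Each function maps onto one decision, so any of them could be swapped for a model call without changing the surrounding control flow.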


![Graph workflow](graph_workflow.png)

## Calling the model

### Synchronous calling

Install the package from a notebook cell:

    !pip install metadata-chatbot



Confirm the installed version and its dependencies:

    !pip show metadata_chatbot

    Name: metadata-chatbot
    Version: 0.0.44
    Summary: Generated from aind-library-template
    Home-page:
    Author: Allen Institute for Neural Dynamics
    Author-email:
    License: MIT
    Location: C:\Users\sreya.kumar\Documents\GitHub\metadata-chatbot\venv\Lib\site-packages
    Requires: aind-data-access-api, boto3, fastapi, langchain, langchain-aws, langchain-community, langchain-core, langgraph, motor, nest-asyncio, pymongo, pytest, sshtunnel, uvicorn
    Required-by:


Create a `GAMER` instance and pass a natural language query to `invoke`:

    from metadata_chatbot.agents.GAMER import GAMER

    query = "What are the injections used in asset SmartSPIM_692908_2023-11-08_16-48-13_stitched_2023-11-09_11-12-06"

    model = GAMER()
    result = model.invoke(query)

    print(result)

In the recorded run, this call did not return an answer. The output ended with a Pydantic validation error for `RetrievalGrader`, which reported that the required `relevant_context` field was missing:

      documents = retriever.get_relevant_documents(query = query, query_filter = filter)


    ValidationError: 1 validation error for RetrievalGrader
    relevant_context
      Field required [type=missing, input_value={'binary_score': 'yes', '...s used for this asset.'}, input_type=dict]
        For further information visit https://errors.pydantic.dev/2.9/v/missing

### Asynchronous calling

In a notebook, where an event loop is already running, `ainvoke` can be awaited directly:

    from metadata_chatbot.agents.GAMER import GAMER

    llm = GAMER()
    query = "Which channels were imaged in asset SmartSPIM_692908_2023-11-08_16-48-13_stitched_2023-11-09_11-12-06? What is labelled in each channel?"

    result = await llm.ainvoke(query)
    print(result)

The call printed the following response:

Based on the provided context, the following channels were imaged in asset SmartSPIM_692908_2023-11-08_16-48-13_stitched_2023-11-09_11-12-06:

1) 488 nm channel: Likely labels a green fluorescent protein like GFP or EGFP. This channel may show the nuclear-localized EGFP expression from the Ai224 genotype.

2) 561 nm channel: Likely labels a red fluorescent protein like tdTomato or RFP. This channel may show the neuronal labeling from the AAVrg-Syn-Flpo virus expressing a fluorescent protein under the Synapsin promoter.

3) 639 nm channel: Likely labels a far-red fluorescent protein like iRFP.

The specific labels are not explicitly stated in the provided context, but the wavelengths correspond to commonly used fluorescent proteins for labeling different structures or molecules in biological samples.
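
Outside a notebook, top-level `await` is not available, so a script has to start the event loop itself. The sketch below shows one way to do that with `asyncio.run`, and how several queries might be issued concurrently with `asyncio.gather`. `StubGamer` is a hypothetical stand-in that only mimics the `ainvoke(query)` call shape, so the example runs without the package installed. Whether a real `GAMER` instance is safe to call concurrently is an assumption that has not been verified here.

```python
import asyncio
from typing import List


class StubGamer:
    """Hypothetical stand-in exposing an ainvoke(query) coroutine."""

    async def ainvoke(self, query: str) -> str:
        # Yield control once, as a real network-bound call would.
        await asyncio.sleep(0)
        return f"answer to: {query}"


async def ask_all(model, queries: List[str]) -> List[str]:
    """Run the queries concurrently and return answers in input order."""
    return await asyncio.gather(*(model.ainvoke(q) for q in queries))


def main() -> List[str]:
    queries = [
        "Which channels were imaged?",
        "What is labelled in each channel?",
    ]
    # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
    return asyncio.run(ask_all(StubGamer(), queries))


if __name__ == "__main__":
    for answer in main():
        print(answer)
```

To try this against the real model, replace `StubGamer()` with `GAMER()` imported from `metadata_chatbot.agents.GAMER`, as in the notebook cell above.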
