
A program to find the best RAG model for your Gen AI app. It uses LlamaIndex for retrieval and LangChain for prompts and other tasks. TruLens Eval is used for evaluation via the RAG triad.


akshatsingh1718/FindBest_RAG


To run eval

  • Open trulens_utils/main.py and change the paths below for your use case:

    EVAL_QUESTIONS = "example/questions.txt"
    DOCUMENTS = "./example/FoodnDrinksCatalogue.txt" 
  • Run main.py as a module: python3 -m trulens_utils.main

  • The dashboard should start at http://192.168.1.8:8501 (the host will differ on your machine; 8501 is Streamlit's default port). See the sketch below for the rough eval flow.
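
A minimal sketch of what an eval flow like this could look like, assuming the trulens_eval 0.x and pre-0.10 llama_index APIs; the app_id, feedback selection, and index construction here are illustrative, not the repo's actual code:

    # Hedged sketch of a RAG-triad eval loop with TruLens. app_id="rag_v1"
    # and the single feedback function are illustrative; the repo's real
    # setup lives in trulens_utils/main.py.
    from llama_index import SimpleDirectoryReader, VectorStoreIndex
    from trulens_eval import Tru, Feedback, TruLlama
    from trulens_eval.feedback.provider import OpenAI

    EVAL_QUESTIONS = "example/questions.txt"
    DOCUMENTS = "./example/FoodnDrinksCatalogue.txt"

    # Build a simple query engine over the catalogue.
    docs = SimpleDirectoryReader(input_files=[DOCUMENTS]).load_data()
    query_engine = VectorStoreIndex.from_documents(docs).as_query_engine()

    # One leg of the RAG triad (answer relevance); groundedness and
    # context relevance are wired up the same way.
    provider = OpenAI()
    f_answer_relevance = Feedback(provider.relevance).on_input_output()

    tru = Tru()
    recorder = TruLlama(query_engine, app_id="rag_v1",
                        feedbacks=[f_answer_relevance])

    # Run every eval question through the instrumented app.
    with open(EVAL_QUESTIONS) as f:
        questions = [q.strip() for q in f if q.strip()]

    with recorder as recording:
        for q in questions:
            query_engine.query(q)

    tru.run_dashboard()  # Streamlit dashboard, port 8501 by default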

Things to do

  • Make the prompt variables in agent.py dynamic.
  • Make the conversation prompt utterance template generic across different prompts.
  • In chains.py, make the input_variables dynamic (read from data.json).
  • conversation_history will always be present in input_variables in chains.py.
  • Sometimes the AI responds with very long sentences, which can hurt the user experience. Tell the AI not to list all the options available to users, as the list can be very long.
  • [ ] Make the first question from the AI static (like a greeting); after that, the AI's answers will use the true history.
  • [ ] New tools, such as math tools, can be added for creating an invoice or a total.

Observation:

  • [ ] Maybe the LLM is forgetting to output the agent action format because of the long menu.

How to talk

Think about what conversation stage we are at.

- Class StageAnalyzerChain: decides which conversation stage we are at. It outputs only a single number denoting the stage.
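
A hedged sketch of such a chain using the legacy LLMChain API (the prompt wording and stage numbering are assumptions, not the repo's actual template in agent.py):

    # Hedged sketch of a stage-analyzer chain; prompt text is illustrative.
    from langchain.chains import LLMChain
    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import PromptTemplate

    stage_prompt = PromptTemplate(
        input_variables=["conversation_history"],
        template=(
            "Given the conversation history below, reply with ONLY the number "
            "of the current conversation stage (e.g. 1=greeting, 2=ordering).\n"
            "{conversation_history}\nStage:"
        ),
    )
    stage_analyzer_chain = LLMChain(llm=ChatOpenAI(temperature=0),
                                    prompt=stage_prompt)

    stage_id = stage_analyzer_chain.run(conversation_history="User: Hi!")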

Talk to the person

- Class ConversationChain: talks to the person, given the conversation stage id.
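
A matching sketch for the talking chain, consuming the stage id produced by the stage analyzer above (again with an assumed template):

    # Hedged sketch: the reply chain takes the stage id plus the history.
    from langchain.chains import LLMChain
    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import PromptTemplate

    conversation_prompt = PromptTemplate(
        input_variables=["conversation_stage", "conversation_history"],
        template=(
            "You are a food-ordering assistant. Current stage: "
            "{conversation_stage}.\n{conversation_history}\nAssistant:"
        ),
    )
    conversation_chain = LLMChain(llm=ChatOpenAI(temperature=0),
                                  prompt=conversation_prompt)

    reply = conversation_chain.run(
        conversation_stage="1",
        conversation_history="User: Hi!",
    )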

LangChain understanding

Add new kwargs with LCEL

- Use llm.bind(kwarg=value), e.g. llm.bind(stop=["\nobservation:"]). Link: https://python.langchain.com/docs/modules/agents/agent_types/chat_conversation_agent
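
A minimal example of .bind() (model choice and prompt are placeholders):

    # .bind() returns a new runnable with the kwarg pre-attached, so every
    # subsequent invoke/stream call passes stop=... to the model.
    from langchain.chat_models import ChatOpenAI

    llm = ChatOpenAI(temperature=0)
    llm_with_stop = llm.bind(stop=["\nObservation:"])
    result = llm_with_stop.invoke("Think step by step, then stop.")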

Output parser

- If the LLM is not using any tool, return AgentFinish. We take the text returned/generated by the LLM and return AgentFinish(return_values={"output": LLM_RESPONSE_TO_HUMAN}, log=text).

- If the LLM is using a tool, return AgentAction.


- AgentFinish() => directly returns the text the LLM produced (no need to call a tool).
- AgentAction() => the next or follow-up action to try.
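
A hedged sketch of a custom parser following this logic; the "Final Answer:" / "Action:" markers assume a ReAct-style prompt, as in the LangChain custom-agent docs:

    # Hedged sketch of a custom agent output parser; the markers parsed here
    # ("Final Answer:", "Action:", "Action Input:") assume a ReAct-style prompt.
    import re
    from typing import Union

    from langchain.agents import AgentOutputParser
    from langchain.schema import AgentAction, AgentFinish

    class CustomOutputParser(AgentOutputParser):
        def parse(self, text: str) -> Union[AgentAction, AgentFinish]:
            if "Final Answer:" in text:
                # No tool needed: hand the text straight back to the human.
                return AgentFinish(
                    return_values={"output": text.split("Final Answer:")[-1].strip()},
                    log=text,
                )
            match = re.search(
                r"Action:\s*(.*?)\s*Action Input:\s*(.*)", text, re.DOTALL
            )
            if not match:
                raise ValueError(f"Could not parse LLM output: {text!r}")
            # Tool call: tell the executor which tool to run next, and with what input.
            return AgentAction(
                tool=match.group(1).strip(),
                tool_input=match.group(2).strip(),
                log=text,
            )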

Streaming

- While using an output parser we may not get real token-by-token streaming, since the parser needs the whole completion before it can decide between AgentAction and AgentFinish.

Questions to ask the dev:

- Why is there a _streaming_generator instead of an _astreaming_generator, given that it only runs async?
    Ans: Maybe _streaming_generator only yields items after receiving the whole completion, while _astreaming_generator uses streaming=True and an async function. Refer: https://python.langchain.com/docs/modules/agents/how_to/streaming#stream-tokens
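
The pattern in the linked docs looks roughly like this (a hedged sketch using AsyncIteratorCallbackHandler; your langchain version may expose this differently):

    # Hedged sketch: stream tokens via an async callback handler, as in the
    # linked LangChain streaming docs.
    import asyncio

    from langchain.callbacks import AsyncIteratorCallbackHandler
    from langchain.chat_models import ChatOpenAI
    from langchain.schema import HumanMessage

    async def stream_tokens() -> None:
        handler = AsyncIteratorCallbackHandler()
        llm = ChatOpenAI(streaming=True, callbacks=[handler])
        # Kick off generation in the background, then consume tokens as they arrive.
        task = asyncio.create_task(llm.agenerate([[HumanMessage(content="Hello!")]]))
        async for token in handler.aiter():
            print(token, end="", flush=True)
        await task

    asyncio.run(stream_tokens())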


- In the streaming_generator, why are we not using any tools? Can we use tools with a generator?
