Search based QA research #174

Closed
andreaskoepf opened this issue Dec 30, 2022 · 12 comments · Fixed by #213

@andreaskoepf
Collaborator

Take a closer look at sandbox-grounded-qa and analyze how they do the contextualization with Google search. Describe the requirements to use such a technique for queries against our LM. Write a short report as md file.

Video: https://youtu.be/DpOQpClVgCw

Your research will help us plan the next iteration after the MVP.

@billray0259
Collaborator

I assume we want to develop our own system that can be run locally and does not rely on calling any of the cohere APIs, right?

@andreaskoepf
Collaborator Author

andreaskoepf commented Dec 30, 2022

@billray0259 yes, primarily we would like to focus on an open-source solution. (Extension modules allowing arbitrary API calls are also possible, but they are not the preferred solution; integration of search-engine results is definitely one of the core features we want to implement in Open-Assistant.)

Regarding the Cohere approach, a first step could be to analyze and describe which functionality the generate & embed functions offer and what is needed for this approach.

@billray0259
Collaborator

Alright, I'll learn what I can about their system and generate a report.

@yk
Collaborator

yk commented Dec 30, 2022

@billray0259 thank you very much for taking this on. once you have the report, could you make a PR with the report inside a markdown file? maybe somewhere in docs/research or so

@BitterKanegul
Contributor

Hi @andreaskoepf, @billray0259, @yk, I would love to help you out with this effort! I guess once Bill evaluates the framework, we can look into additional data sources and research good ways of fine-tuning embeddings!

@danielpatrickhug
Collaborator

Hey, happy to help too. https://github.com/hwchase17/langchain is an open-source variation with a lot of activity. I can help write up a document for this as well. I don't want to issue-snipe, but I went through Cohere's repo and generated a report. Here it is if you want to use it, or I can make a pull request if you like it. Let me know what you think :)

Cohere Grounded Question Answering

Grounded question answering is a system that aims to provide accurate and contextualized answers to factual questions by combining the strengths of language models and Google search.

The system uses a language model (accessed through the Cohere API) to understand the context and form natural-language questions. It then uses Google search (accessed through SerpAPI) to find relevant information on the web and provides an answer based on the retrieved information.

The motivation behind this approach is that language models are good at generating sensible answers to complex questions, but they do not have a mechanism for determining the truthfulness of their answers. On the other hand, Google search is effective at finding factual information on the web, but it is not as good at understanding contextual questions or providing answers in natural language.

By combining the ability of language models to understand context and generate natural language questions with the consensus-based truthfulness of Google search, grounded question answering aims to provide reliable and accurate answers to a wide range of factual questions.

There are some potential failure modes for the bot, such as when the user asks a question that implies something that is not true or when the bot is unable to find relevant information for the question.

Summary

This code repository contains various scripts that can be used to create a question answering chatbot. The chatbot is able to generate answers to questions based on a given context and information gathered from the web using the Google Search API.

qa/model.py contains helper functions that call the Cohere API, e.g. to turn the conversation history into a search query and to generate a sample answer to a question.

qa/answer.py contains a function for generating an answer to a question based on a given context and the training data of a specified model. It also contains a function for generating an answer based on information gathered from the web using the search functions in search.py and the Cohere API.

qa/search.py contains functions for searching the web using the SerpAPI and extracting relevant information from the search results.

discord_bot.py contains a Discord bot that uses the functions in the previous modules to generate answers to questions and send them as replies to queries made through Discord.

qa/bot.py contains a class that encapsulates the functionality of the chatbot, including answering questions and gathering information from the web. It can be used as a standalone chatbot or integrated into other systems, such as the Discord bot in discord_bot.py.

discord_bot.py

discord_bot.py is a script for a Discord bot that uses the GroundedQaBot class from the qa.bot module to answer questions.

The bot is capable of answering questions based on the context of the conversation and information from the web, and can be triggered by either direct messages to the bot or by users reacting with a specific emoji to a message.

The script takes in several command line arguments, including an API key for Cohere, an API key for SerpAPI, and an API key for the Discord bot. It sets up an instance of the MyClient class, which is a subclass of discord.Client. The on_ready method initializes the bot and prints a message to the console when the bot logs in.

The on_message method handles direct messages to the bot, and the on_reaction_add method handles reactions to messages with a specific emoji. Both methods trigger the answer method, which uses the GroundedQaBot to answer the question and sends the response back to Discord. The answer method also sets the chat history for the GroundedQaBot based on the previous messages in the conversation. Finally, the script runs the Discord client by calling the run method with the Discord API key.
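
A minimal sketch of that wiring (a hypothetical reconstruction using discord.py 2.x, not the repo's exact code; the API-key placeholders and the history limit are assumptions):

```python
import discord
from qa.bot import GroundedQaBot

bot = GroundedQaBot(COHERE_API_KEY, SERP_API_KEY)  # placeholder keys

class MyClient(discord.Client):
    async def on_ready(self):
        print(f"Logged in as {self.user}")

    async def on_message(self, message):
        if message.author == self.user:
            return  # ignore the bot's own messages
        # Set the chat history from recent channel messages (newest first)
        bot.chat_history = [m.content async for m in message.channel.history(limit=10)]
        answer_text, source_urls, _ = bot.answer(message.content)
        await message.reply(f"{answer_text}\nSources: {' '.join(source_urls)}")

intents = discord.Intents.default()
intents.message_content = True  # required in discord.py 2.x to read message text
MyClient(intents=intents).run(DISCORD_API_KEY)
```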

qa/bot.py

The GroundedQaBot class in bot.py implements a conversational agent that provides answers to factual questions by combining the use of language models and Google search.

The GroundedQaBot class takes in two arguments upon initialization: a cohere_api_key and a serp_api_key. The cohere_api_key is used to instantiate a Client object from the cohere library, which is used to perform natural language processing tasks. The serp_api_key is used to make requests to the Serp API, which is used to retrieve information from Google search results.

The GroundedQaBot class has a chat_history attribute, which stores the history of the conversation with the bot. The answer method is used to generate an answer to a given question, based on the conversational history. The answer method makes use of the get_contextual_search_query function to generate a search query based on the conversation history and the answer_with_search function to retrieve information from Google search results and generate an answer based on the retrieved information. The answer method returns a tuple containing the answer text, a list of source URLs, and a list of source texts.
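
Based on that description, a plausible class skeleton (a sketch; the method bodies are assumptions, reusing the qa.model and qa.answer functions described below):

```python
import cohere
from qa.model import get_contextual_search_query
from qa.answer import answer_with_search

class GroundedQaBot:
    def __init__(self, cohere_api_key, serp_api_key):
        self.co = cohere.Client(cohere_api_key)  # NLP tasks via the Cohere API
        self.serp_api_key = serp_api_key         # Google results via SerpAPI
        self.chat_history = []                   # conversation so far

    def answer(self, question):
        self.chat_history.append(question)
        history = "\n".join(self.chat_history)
        # Rewrite the conversation into a standalone search query...
        query = get_contextual_search_query(history, self.co)
        # ...then search, rank paragraphs, and generate a grounded answer;
        # returns (answer text, source URLs, source texts) as described above
        return answer_with_search(query, self.co, self.serp_api_key,
                                  chat_history=history)
```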

qa/model.py

The model.py file contains two functions that make use of the cohere library to perform natural language processing tasks.

The get_contextual_search_query function takes in a conversation history and a co object (an instance of the Client class from the cohere library) and returns a search query that takes the context of the conversation into account. It first creates a prompt by reading a file called get_contextual_search_query.prompt and appending the given history of messages to it, then uses the Cohere API client to generate several candidate texts from that prompt with the given model. Using np.argmax, it finds the index of the maximum value in the likelihood list and retrieves the corresponding text, which is returned as the search query.

The get_sample_answer function returns a sample answer to a given question using the Cohere API and the specified model. It first reads in a prompt from a file called get_sample_answer.prompt and appends the question and "Answer:" to it. It then calls the Cohere API's generate method with the specified model, the modified prompt, and various other parameters (max_tokens, temp, k, etc.). The text of the first generation (prediction) in the response is then returned.
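
A hedged reconstruction of the first function (the prompt file and the np.argmax selection follow the text above; the model name and generation parameters are assumptions):

```python
import numpy as np

def get_contextual_search_query(history, co, model="xlarge"):
    # Build the prompt from the template file plus the conversation so far
    with open("get_contextual_search_query.prompt") as f:
        prompt = f.read() + "\n" + history
    # Sample several candidates and keep the one the model scored most likely
    response = co.generate(model=model, prompt=prompt, max_tokens=50,
                           num_generations=4, return_likelihoods="GENERATION")
    likelihoods = [g.likelihood for g in response.generations]
    best = int(np.argmax(likelihoods))
    return response.generations[best].text.strip()
```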

qa/answer.py

The answer.py file contains two functions that generate answers to questions: answer and answer_with_search.

The answer function takes in a question, a context, a co object (an instance of the Client class from the cohere library), a model, and an optional chat_history. It uses the Cohere API to generate a response to the question based on the context. The response is generated using a prompt that includes the question, context, and chat history, if provided. The function then filters out any empty responses and returns the response with the highest likelihood score as the final answer.

The answer_with_search function takes in a question, a co object, a serp_api_token (which is used to make requests to the Serp API), and a number of optional arguments: chat_history, model, embedding_model, url, n_paragraphs, and verbosity. It retrieves a number of paragraphs relevant to the question using the get_results_paragraphs_multi_process function and the Serp API. It then generates a sample answer to the question using the get_sample_answer function and the co object. The sample answer and the retrieved paragraphs are used to find the most relevant paragraph using the embedding_search function. The most relevant paragraph is then used as the context and passed to the answer function along with the question, the co object, and the model, to generate the final answer. The final answer, a list of source URLs, and a list of source texts are returned as a tuple.
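
Putting these pieces together, answer_with_search plausibly has the following shape (a sketch following the description above; the default values and exact plumbing are assumptions):

```python
def answer_with_search(question, co, serp_api_token, chat_history="",
                       model="xlarge", embedding_model="multilingual-22-12",
                       n_paragraphs=1):
    # Gather candidate paragraphs from the web for the question
    paragraphs, sources = get_results_paragraphs_multi_process(question, serp_api_token)
    # Generate a (possibly hallucinated) sample answer to use as the retrieval query
    sample_answer = get_sample_answer(question, co)
    # Rank paragraphs by embedding similarity to the sample answer
    ranked = embedding_search(paragraphs, sources, sample_answer, co,
                              model=embedding_model)
    top = ranked[-n_paragraphs:]  # list is sorted ascending by similarity
    context = "\n".join(p for p, _, _ in top)
    # Generate the final grounded answer from the most relevant paragraphs
    answer_text = answer(question, context, co, model, chat_history=chat_history)
    return answer_text, [s for _, s, _ in top], [p for p, _, _ in top]
```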

qa/search.py

The search.py module contains functions for performing searches using the SERP API and Google Search, and for extracting relevant text from the search results.

The function serp_api_search takes in a search term, a serp_api_token, and a url as input. It uses the serp_api_google_search function to make a request to the Google Search API with the provided search_term and serp_api_token and returns the response as a dictionary. It then iterates over the organic_results and top_stories sections of the response and extracts the url and text of each result, adding them to a list response_urls as a tuple. Finally, it returns the response_urls list.
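
A sketch of that shape (hypothetical: the real code goes through a serp_api_google_search helper; the field names follow SerpAPI's documented JSON):

```python
import requests

def serp_api_search(search_term, serp_api_token, url=None):
    # Query Google through SerpAPI and parse the JSON response
    params = {"engine": "google", "q": search_term, "api_key": serp_api_token}
    response = requests.get("https://serpapi.com/search", params=params).json()

    response_urls = []
    for section in ("organic_results", "top_stories"):
        for result in response.get(section, []):
            # Collect (url, text) tuples from each result
            response_urls.append((result.get("link"), result.get("snippet", "")))
    return response_urls
```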

The open_link function follows a link and returns its contents, and the paragraphs_from_html function extracts a list of paragraphs from an HTML page using BeautifulSoup.

The get_paragraphs_text_from_url function combines these functions to extract a list of paragraphs from the contents pointed to by a URL.
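
A minimal sketch of that chain, assuming requests and BeautifulSoup for fetching and parsing:

```python
import requests
from bs4 import BeautifulSoup

def open_link(url):
    # Follow the link and return the raw page contents
    return requests.get(url, timeout=10).text

def paragraphs_from_html(html):
    # Extract the non-empty <p> texts from an HTML page
    soup = BeautifulSoup(html, "html.parser")
    return [p.get_text(" ", strip=True) for p in soup.find_all("p")
            if p.get_text(strip=True)]

def get_paragraphs_text_from_url(url):
    return paragraphs_from_html(open_link(url))
```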

The embedding_search function takes in a list of paragraphs of text, a list of sources corresponding to each paragraph, a search_term, and a model to use for embedding. It uses the Cohere API's embed function to generate embeddings for the paragraphs and the search_term. It then calculates the cosine similarity between each paragraph embedding and the search term embedding using the cosine_similarity function. It returns a sorted list of tuples containing the paragraph, its source, and the cosine similarity between its embedding and the search term embedding.
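
In sketch form (the co.embed call is the Cohere SDK's embed method; the "multilingual-22-12" default is mentioned later in this thread, and the cosine computation is standard numpy):

```python
import numpy as np

def embedding_search(paragraphs, sources, search_term, co,
                     model="multilingual-22-12"):
    # Embed all paragraphs and the query in a single batch
    embs = np.array(co.embed(texts=paragraphs + [search_term], model=model).embeddings)
    para_embs, query_emb = embs[:-1], embs[-1]
    # Cosine similarity between each paragraph and the search term
    sims = para_embs @ query_emb / (
        np.linalg.norm(para_embs, axis=1) * np.linalg.norm(query_emb))
    # Ascending sort: the last element is the most relevant paragraph
    return sorted(zip(paragraphs, sources, sims), key=lambda t: t[2])
```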

The get_results_paragraphs_multi_process function performs a search using the SERP API and extracts relevant text from the search results in parallel using multiple processes.
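
A plausible shape for that parallel step, reusing the sketches above (the pool size is an assumption):

```python
from multiprocessing import Pool

def get_results_paragraphs_multi_process(search_term, serp_api_token, n_procs=8):
    urls = [u for u, _ in serp_api_search(search_term, serp_api_token)]
    # Fetch and split each result page in parallel
    with Pool(n_procs) as pool:
        pages = pool.map(get_paragraphs_text_from_url, urls)
    paragraphs, sources = [], []
    for url, paras in zip(urls, pages):
        paragraphs.extend(paras)
        sources.extend([url] * len(paras))  # remember each paragraph's origin
    return paragraphs, sources
```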

@billray0259
Collaborator

@danielpatrickhug Thank you for putting this together! I'm working on a report as well. I'll try to add supplementary information to your report; hopefully, the combination will be greater than the sum of its parts.

I plan on doing some research to determine what models they are currently using for tasks such as:

  • Generating a single question from the chat history
  • Generating a sample answer to the question
  • Calculating embeddings for search results and the sample answer
  • Synthesizing the information from the top search results into a grounded answer

I also want to share some ideas related to using OpenAssistant to accomplish these tasks.

It looks like we'll have an excellent understanding of the Cohere Grounded QA system!

I will be able to provide a pull request with my report by the end of the day today. I'll include my discussion questions/thoughts in the body of the request instead of the report document.

Thanks again @danielpatrickhug

@danielpatrickhug
Collaborator

danielpatrickhug commented Dec 31, 2022

@billray0259 you're welcome! happy to help. all good ideas, looking forward to it! The default model for searching through paragraphs is "multilingual-22-12" from cohere. I'll see if I can get a similar document going for langchain!

@mrcabbage972
Contributor

mrcabbage972 commented Jan 1, 2023

Just want to mention that there's another tool of the LangChain variety called Dust. I haven't tried either of them yet, so I'm not sure what the pros and cons are. Dust does have a sample implementation of a web-search assistant here.

@danielpatrickhug
Collaborator

@mrcabbage972 Oh cool, thanks for sharing, I'll check it out. LangChain is more of a general framework for building an LLM 'agent' or 'chain', while Cohere's repo is specifically for asking and answering questions using their APIs. LangChain, on the other hand, lets you create general prompt templates, use embedding representations, and query document/vector stores, and it provides a framework for giving LLM chains tools like:

  • python_repl: A Python shell that allows you to execute Python commands. Maintains state. Does not require an LLM to be initialized.
  • serpapi: A search engine that calls the Serp API and parses the results. Does not require an LLM to be initialized.
  • requests: A tool that uses the Python requests module to access specific content from a website. Does not require an LLM to be initialized.
  • terminal: A tool that executes commands in a terminal using the Python subprocess module. Does not require an LLM to be initialized.
  • pal-math: A language model based on a specific paper that is good at solving complex word math problems. Requires an LLM to be initialized.
  • pal-colored-objects: A language model based on a specific paper that is good at reasoning about position and color attributes of objects. Requires an LLM to be initialized.
  • llm-math: An instance of the LLMMath chain that is useful for answering math questions. Requires an LLM to be initialized.
  • open-meteo-api: A tool that provides natural language access to the OpenMeteo API (https://api.open-meteo.com/), specifically the /v1/forecast endpoint. Requires an LLM to be initialized.
  • news-api: A tool that provides natural language access to the News API (https://newsapi.org/), specifically the /v2/top-headlines endpoint. Requires an LLM to be initialized. Extra parameter: news_api_key (your API key to access this endpoint).
  • tmdb-api: A tool that provides natural language access to the TMDB API (https://api.themoviedb.org/3), specifically the /search/movie endpoint. Requires an LLM to be initialized. Extra parameter: tmdb_bearer_token (your Bearer Token to access this endpoint - note that this is different from the API key).

It also works with OpenAI, Hugging Face, and Cohere models; a minimal usage sketch follows below.
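
For reference, a minimal agent wiring two of those tools together looked roughly like this with the LangChain API current at the time; the library is evolving quickly, so treat this as a sketch:

```python
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent

llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)  # search + math from the list above
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("Who won the 2022 FIFA World Cup, and what is the square root of 144?")
```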

@pruksmhc
Contributor

What are the action items from this? One other difference between Cohere QA and LangChain is that Cohere does more processing of the SerpAPI results than LangChain (by splitting them up into paragraphs). This paragraph splitting may not be sufficient for certain types of questions or queries, so that would be another workstream.
