In [21]:
import os

import cohere
from llama_index import SimpleDirectoryReader

# api_keys is a local module that defines the API keys
from api_keys import cohere_api_key, open_ai_api_key

## Load Documents

The documents come from the "Awesome projects built with Transformers" list: https://github.com/huggingface/transformers/blob/main/awesome-transformers.md

In [18]:
# Read every file under ./docs
documents = SimpleDirectoryReader('./docs').load_data()
# Keep only the raw text and drop empty entries
documents = [doc.text for doc in documents if doc.text != '']

documents[:5]

["\n\nAwesome projects built with Transformers\n\nThis page lists awesome projects built on top of Transformers. Transformers is more than a toolkit to use pretrained\nmodels: it's a community of projects built around it and the Hugging Face Hub. We want Transformers to enable\ndevelopers, researchers, students, professors, engineers, and anyone else to build their dream projects.\n\nIn this list, we showcase incredibly impactful and novel projects that have pushed the field forward. We celebrate\n100 of these projects as we reach the milestone of 100k stars as a community; but we're very open to pull requests\nadding other projects to the list. If you believe a project should be here and it's not, then please, open a PR \nto add it.\n\n",
 '\n\ngpt4all\n\ngpt4all is an ecosystem of open-source chatbots trained on massive collections of clean assistant data including code, stories and dialogue. It offers open-source, large language models such as LLaMA and GPT-J trained in an assistant

In [22]:
co = cohere.Client(cohere_api_key)

## Querying using co.rerank to find relevant documents!

In [23]:
query = "Where can I find open source chatbots?"

# Ask for the three documents most relevant to the query
results = co.rerank(query=query, documents=documents, top_n=3, model='rerank-english-v2.0')
for idx, r in enumerate(results):
    # r.index is the document's position in the original `documents` list
    print(f"Document Rank: {idx + 1}, Document Index: {r.index}")
    print(f"Document: {r.document['text']}")
    print(f"Relevance Score: {r.relevance_score:.2f}")
    print("\n")

Document Rank: 1, Document Index: 1
Document: 

gpt4all

gpt4all is an ecosystem of open-source chatbots trained on massive collections of clean assistant data including code, stories and dialogue. It offers open-source, large language models such as LLaMA and GPT-J trained in an assistant-style.

Keywords: Open-source, LLaMa, GPT-J, instruction, assistant


Relevance Score: 0.98


Document Rank: 2, Document Index: 14
Document: 

DeepPavlov

DeepPavlov is an open-source conversational AI library. It is designed for the development of production ready chat-bots and complex conversational systems, as well as research in the area of NLP and, particularly, of dialog systems.

Keywords: Conversational, Chatbot, Dialog


Relevance Score: 0.96


Document Rank: 3, Document Index: 25
Document: 

argilla

Argilla is an open-source platform providing advanced NLP labeling, monitoring, and workspaces. It is compatible with many open source ecosystems such as Hugging Face, Stanza, FLAIR, and others
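
The relevance scores printed above can also be used to discard weak matches rather than always keeping a fixed `top_n`. The following is a minimal sketch of that idea, not part of the Cohere SDK: `RerankHit` is a hypothetical stand-in that only mirrors the `index` and `relevance_score` attributes used in the loop above, `filter_by_score` is an illustrative helper, and the `0.5` threshold is an arbitrary assumption that would need tuning on real data.

```python
from dataclasses import dataclass


@dataclass
class RerankHit:
    """Hypothetical stand-in for one rerank result (not a Cohere class)."""
    index: int
    relevance_score: float


def filter_by_score(hits, min_score=0.5):
    """Keep hits scoring at least min_score, ordered from highest to lowest."""
    kept = [h for h in hits if h.relevance_score >= min_score]
    return sorted(kept, key=lambda h: h.relevance_score, reverse=True)


# 0.98 and 0.96 are the scores shown in the output above; 0.10 is made up.
hits = [RerankHit(14, 0.96), RerankHit(3, 0.10), RerankHit(1, 0.98)]
top = filter_by_score(hits, min_score=0.5)
print([h.index for h in top])
```

With real results, the same filter could be applied to the objects returned by `co.rerank`, since it relies only on the two attributes already used in the cell above.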

## Summarize the given markdown about transformer projects!

In [30]:
# Use only the first half of the documents to stay under the character limit
half_of_the_documents = documents[:len(documents)//2]

response = co.summarize(
    text='\n'.join(half_of_the_documents),
    model='summarize-xlarge',
    length='medium',
    extractiveness='medium'
)

summary = response.summary

In [31]:
summary

"This page lists awesome projects built on top of Transformers. Transformers is more than a toolkit to use pretrained models: it's a community of projects built around it and the Hugging Face Hub. We want Transformers to enable developers, researchers, students, professors, engineers, and anyone else to build their dream projects."
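
Taking the first half of the documents is a rough way to respect the character limit mentioned in the summarize cell. A possible alternative, sketched here under assumptions, is to add documents in order until a character budget is reached. `pack_within_budget` is a hypothetical helper, and the `max_chars` value is purely illustrative; the actual limit of the summarize endpoint is not stated in this notebook and should be checked in the Cohere documentation.

```python
def pack_within_budget(texts, max_chars, sep="\n"):
    """Join texts in order, stopping before the result would exceed max_chars."""
    picked = []
    used = 0
    for text in texts:
        # The separator only costs characters once something is already picked.
        extra = len(text) + (len(sep) if picked else 0)
        if used + extra > max_chars:
            break
        picked.append(text)
        used += extra
    return sep.join(picked)


# max_chars=12 is an arbitrary budget for the demo, not a documented limit.
sample = ["alpha", "beta", "gamma"]
packed = pack_within_budget(sample, max_chars=12)
print(repr(packed))
```

In the notebook, the result of `pack_within_budget(documents, max_chars=...)` could be passed as the `text` argument of `co.summarize` in place of the joined half list.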