## Configuration

In [None]:
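import weaviate
from weaviate.embedded import EmbeddedOptions

# Start an embedded Weaviate instance and pass the Cohere API key to the Cohere modules
client = weaviate.Client(
    embedded_options=EmbeddedOptions(),
    additional_headers={
        "X-Cohere-Api-Key": "sk-key"  # placeholder: replace with your Cohere API key
    })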
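
Hard-coding the key in the notebook makes it easy to leak. As a minimal sketch, the headers could instead be built from an environment variable. The helper name `cohere_headers` and the variable name `COHERE_API_KEY` are assumptions for illustration, not part of the Weaviate client.

```python
import os


def cohere_headers(env_var: str = "COHERE_API_KEY") -> dict:
    """Build the additional_headers dict from an environment variable.

    Both the function name and the default variable name are hypothetical.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return {"X-Cohere-Api-Key": api_key}
```

The result can then be passed as `additional_headers=cohere_headers()` when creating the client.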

### Schema

In [2]:
# Reset the schema. CAUTION: THIS WILL DELETE YOUR DATA
client.schema.delete_all()

schema = {
   "classes": [
       {
           "class": "BlogPost",
           "description": "Blog post from the Weaviate website.",
           "vectorizer": "text2vec-cohere",
           "vectorIndexConfig": {
            "distance": "dot"
           },
           "moduleConfig": {
               "reranker-cohere": { 
                    "model": "rerank-multilingual-v2.0"
                }
           },
           "properties": [
               {
                  "name": "Content",
                  "dataType": ["text"],
                  "description": "Content from the blog post",
               },
               {
                "name": "URL",
                "dataType": ["text"],
                "description": "URL of the blog post"
               }
            ]
        }
    ]
}

client.schema.create(schema)

print("Schema was created.")

Schema was created.




## Upload Data

In [3]:
blogs = ['../ranking/data/ranking-models.mdx', '../ranking/data/ref2vec-centroid.mdx']

data = {}

# Loop through each file path and read the file
for blog in blogs:
    with open(blog, 'r') as file:
        data[blog] = file.read()

I'm manually splitting each document into 500-character chunks. The chunks are a bit messy as a result, but an external tool such as LlamaIndex, Haystack, or LangChain can improve on this.

In [4]:
with client.batch as batch:
    for source in data.keys():
        for i in range(0,len(data[source]), 500):
            properties = {
                "source": source,
                "content": data[source][i:i+500]
            }
            batch.add_data_object(
                properties,
                class_name="BlogPost"
            )
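
Slicing every 500 characters can cut words in half. Short of adopting one of the external tools, a minimal sketch of word-boundary chunking could look like the following. The helper `chunk_text` is hypothetical and not part of any library. It collapses whitespace (including newlines), and a single word longer than `max_chars` becomes its own oversized chunk.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 500) -> List[str]:
    """Split text into chunks of at most max_chars, breaking only between words."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # Try to append the next word to the chunk being built
        candidate = f"{current} {word}" if current else word
        if len(candidate) > max_chars and current:
            # The word does not fit: close the current chunk and start a new one
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

In the upload loop, `chunk_text(data[source])` would replace the `range(0, len(data[source]), 500)` slicing.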

## Query

##### Query without reranking

In [50]:
query = """
{
  Get {
    BlogPost (
      nearText: {
        concepts: ["Low hanging fruit to improve relevance"]
      }
      limit: 10)
     {
      content
    }
  }
}
"""

client.query.raw(query)

{'data': {'Get': {'BlogPost': [{'content': 's to prompt it with:\n`please output a relevance score on a scale of 1 to 100.`\n\nI think the second strategy is a bit more interesting, in which we put as many documents as we can in the input and ask the LLM to rank them. The key to making this work is the emergence of LLMs to follow instructions, especially with formatting their output. By prompting this ranking with “please output the ranking as a dictionary of IDs with the key equal to the rank and the value equal to the document id”. Also in'},
    {'content': "from the User's references to other vectors. And as the set of references continues to evolve, the Ref2Vec vectors will continue to evolve also, ensuring that the User vector remains up-to-date with their latest interests.\n\nWhether your goal is to construct a Home Feed interface for users or to pair with search queries, Ref2Vec provides a strong foundation to enable Recommendation with fairly low overhead. For example, it can 

The first few results from the query above aren't exactly what we're looking for. Let's run the query again, this time reranking the top 10 documents against the text in the `content` property.
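
Conceptually, reranking is a second stage: the vector search returns candidates, and each candidate is then scored against the query and the list is re-sorted. The sketch below illustrates only that idea. The word-overlap scorer is a toy stand-in, not the Cohere reranker model, and both function names are hypothetical.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, document: str) -> float:
    """Toy relevance score: fraction of query words that appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
) -> List[Tuple[float, str]]:
    """Score each candidate against the query and sort by score, highest first."""
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

A real reranker swaps the toy scorer for a model that reads the query and the document together, which is why it can be more accurate than comparing precomputed vectors.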

##### Query with reranking

In [51]:
query = """
{
  Get {
    BlogPost (
      nearText: {
        concepts: ["Low hanging fruit to improve relevance"]
      },
      limit: 10)
     {
      content
      _additional {
        rerank(
            property: "content",
            query: "Low hanging fruit to improve relevance"
        ){
          score
        }
      }
    }
  }
}
"""

client.query.raw(query)

{'data': {'Get': {'BlogPost': [{'_additional': {'rerank': [{'score': 0.99982184}]},
     'content': '\ntitle: Ranking Models for Better Search\n\nWhether searching to present information to a human, or a large language model, quality matters. One of the low hanging fruit strategies to improve search quality are ranking models. Broadly speaking, ranking models describe taking the query and each candidate document, one-by-one, as input to predict relevance. This is different from vector and lexical search where representations are computed offline and indexed for speed. Back in August, we [published'},
    {'_additional': {'rerank': [{'score': 3.0959105e-05}]},
     'content': 's to prompt it with:\n`please output a relevance score on a scale of 1 to 100.`\n\nI think the second strategy is a bit more interesting, in which we put as many documents as we can in the input and ask the LLM to rank them. The key to making this work is the emergence of LLMs to follow instructions, especially wi
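
The raw response is deeply nested, which makes the scores hard to scan. As a minimal sketch, assuming the response has the shape shown in the output above, a small helper can flatten it into `(score, content)` pairs. The name `extract_reranked` is hypothetical, and the final sort is only a defensive step.

```python
from typing import List, Tuple


def extract_reranked(response: dict, class_name: str = "BlogPost") -> List[Tuple[float, str]]:
    """Flatten a rerank query response into (score, content) pairs, best first."""
    objects = response["data"]["Get"][class_name]
    results = []
    for obj in objects:
        # Each object carries its score under _additional.rerank[0].score
        score = obj["_additional"]["rerank"][0]["score"]
        results.append((score, obj["content"]))
    return sorted(results, key=lambda pair: pair[0], reverse=True)
```

For example, `for score, content in extract_reranked(client.query.raw(query)): print(round(score, 4), content[:80])` would print one line per chunk.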