# RAG exploration

The content the app generates is good but fairly generic. I want to ground it in relevant features from the VS Code release notes, with links where users can get more context.

## Load data

Since the VS Code team uses a GitHub repo to manage the release notes, I used the GitHub API to fetch them. Each release notes "document" can be long, so I knew I had to chunk the data in a way that keeps related text together. Because the release notes are written in Markdown, I used a Markdown-aware splitter ([LangChain's `markdown_header_metadata_splitter`](https://python.langchain.com/docs/how_to/markdown_header_metadata_splitter/)) to chunk each document by header, so that each feature description fits within the much smaller token limits of the embeddings models.
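To make the idea of header-based chunking concrete, here is a minimal standard-library sketch. It is not LangChain's implementation, and `split_markdown_by_headers` is a hypothetical helper name; it only illustrates how each chunk can carry the trail of headers above it so that context survives the split.

```python
import re

HEADER_RE = re.compile(r'^(#{1,6})\s+(.*)$')
FENCE = '`' * 3  # marker that opens or closes a fenced code block

def split_markdown_by_headers(markdown_text):
    """Split markdown into one chunk per header section, keeping the header trail."""
    chunks = []
    trail = {}        # header level -> header text currently in effect
    body_lines = []

    def flush():
        # Emit the text collected so far under the current header trail
        text = '\n'.join(body_lines).strip()
        if text:
            headers = [trail[level] for level in sorted(trail)]
            chunks.append({'headers': headers, 'content': text})
        body_lines.clear()

    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Lines starting with '#' inside a code fence are not headers
        match = None if in_fence else HEADER_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # A new header replaces any headers at the same or a deeper level
            for stale in [lvl for lvl in trail if lvl >= level]:
                del trail[stale]
            trail[level] = match.group(2).strip()
        else:
            body_lines.append(line)
    flush()
    return chunks

sample = """# Release 1.10
Intro text.
## Workbench
### Configurable Explorer key bindings
You can now configure key bindings.
## Editor
Minimap added.
"""

for chunk in split_markdown_by_headers(sample):
    print(chunk['headers'], '->', chunk['content'])
```

Each chunk stays small enough to embed while still recording which release and section it came from.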

Let's load the data that we've saved.

**`Copilot: Load release_notes.json as docs_contents`**

In [None]:
import json

with open('release_notes.json', 'r') as file:
    release_notes = json.load(file)

release_notes[101]

{'content': 'See what is new in the Visual Studio Code February 2017 Release (1.10)  \n## Workbench  \n### Configurable Explorer key bindings  \nBy popular demand, you can now configure the key bindings for most of the commands in the File Explorer and OPEN EDITORS view.  \nThe following commands could already be assigned prior to version 1.10 in the File Explorer:  \n* `explorer.newFile` - Create a new file\n* `explorer.newFolder`-  Create a new folder  \nNew commands that work in both the File Explorer and OPEN EDITORS view  \n* `explorer.openToSide` - Open to the side\n* `copyFilePath` - Copy path of file/folder\n* `revealFileInOS` - Reveal file in OS  \nNew commands that only work in the File Explorer:  \n* `filesExplorer.copy` - Copy a file from the File Explorer\n* `filesExplorer.paste` - Paste a file that was copied from the File Explorer\n* `renameFile` - Rename a file/folder in the File Explorer\n* `moveFileToTrash` - Move a file/folder to trash from the File Explorer\n* `dele

## Hybrid search

We will do a hybrid of **full-text** and **vector similarity** search to retrieve the most relevant documents. This combines the precision of keyword matching with the contextual understanding of semantic similarity, which should improve both the relevance and the coverage of retrieval and capture results that either method alone would miss.

But I'm not sure which tool to use for the full-text search. Let's ask Copilot _(chat)_.

**`Copilot: Recommend a few options for lightweight and fast full-text search engine`**

I like that MeiliSearch is open source and easy to deploy and use. This is perfect for the purposes of this demo. Let's use that.

You can manage your dev environment freely in Codespaces, so let's install MeiliSearch through the terminal and use the [self-hosted option](https://www.meilisearch.com/docs/learn/self_hosted/getting_started_with_self_hosted_meilisearch).

```bash
# Install Meilisearch
curl -L https://install.meilisearch.com | sh

# Launch Meilisearch
./meilisearch --master-key="aSampleMasterKey"
```

Now that we have MeiliSearch running, let's install the Python package (`pip install meilisearch`) and index the release notes.

In [6]:
import meilisearch

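# The server was launched with --master-key, so pass the same key to the client
ms_client = meilisearch.Client('http://127.0.0.1:7700', 'aSampleMasterKey')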

In [3]:
import re

# Pre-compile the regex pattern for better performance in repeated use
pattern = re.compile(r'v1_(9|8)\d+')

latest_release = []
for item in release_notes:
    # Skip items that have no 'url' and items that are bare section headers (url ends with '#_')
    if 'url' not in item or item['url'].endswith('#_'):
        continue

    # Keep items whose 'url' matches the latest-release pattern (v1_80 through v1_99), dropping 'content_embeddings'
    if pattern.search(item['url']):
        filtered_item = {k: v for k, v in item.items() if k != 'content_embeddings'}
        latest_release.append(filtered_item)

# Print the filtered items
print(latest_release[:2])

[{'content': 'Learn what is new in the Visual Studio Code June 2023 Release (1.80)  \n### Accessibility help improvements  \nA new command **Open Accessibility Help** (`kb(editor.action.accessibilityHelp)`) opens a help menu based on the current context. It currently applies to the editor, terminal, notebook, chat panel, and inline chat features.  \nDisable the accessibility help menu hint and open additional documentation, if any, from within the help menu.', 'url': 'https://code.visualstudio.com/updates/v1_80#_accessibility-help-improvements', 'id': 3375}, {'content': 'Learn what is new in the Visual Studio Code June 2023 Release (1.80)  \n### Accessibility help for notebooks  \nA new accessibility help menu was added for notebooks to provide information about the editor layout and navigating and interacting with the notebook.', 'url': 'https://code.visualstudio.com/updates/v1_80#_accessibility-help-for-notebooks', 'id': 3376}]


In [4]:
print(f"Original (`release_notes`): {len(release_notes)}")
print(f"Latest (`latest_release`): {len(latest_release)}")

Original (`release_notes`): 4138
Latest (`latest_release`): 576
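As a quick sanity check on the filter, this sketch runs the same pattern against a few URLs. The first two URLs appear in the outputs in this notebook; the anchors on the other three are made up for illustration.

```python
import re

pattern = re.compile(r'v1_(9|8)\d+')

sample_urls = [
    'https://code.visualstudio.com/updates/v1_80#_accessibility-help-improvements',
    'https://code.visualstudio.com/updates/v1_94#_attach-variables-in-notebook-chat',
    'https://code.visualstudio.com/updates/v1_10#_workbench',  # hypothetical anchor
    'https://code.visualstudio.com/updates/v1_8#_editor',      # hypothetical anchor
    'https://code.visualstudio.com/updates/v1_79#_terminal',   # hypothetical anchor
]

# True where the URL belongs to a 1.8x or 1.9x release
matches = [bool(pattern.search(url)) for url in sample_urls]
for url, matched in zip(sample_urls, matches):
    print(matched, url)
```

Because `\d+` requires at least one digit after the `8` or `9`, the single-digit releases `v1_8` and `v1_9` are excluded along with everything before `v1_80`.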


Now let's load the data into MeiliSearch and conduct a test search.

In [7]:
ms_client.index('latest_release').add_documents(latest_release)

TaskInfo(task_uid=8, index_uid='latest_release', status='enqueued', type='documentAdditionOrUpdate', enqueued_at=datetime.datetime(2024, 10, 10, 0, 51, 3, 768598))

In [8]:
ms_client.index('latest_release').search('copilot notebooks', {'limit': 20})['hits']

[{'content': 'Learn what is new in the Visual Studio Code June 2023 Release (1.80)  \n## Contributions to extensions  \n### GitHub Copilot  \nWe have introduced preview-only slash commands in the Chat view to help you create projects and notebooks and search for text in your workspace.  \n>**Note**: To get access to the Chat view, inline chat, and slash commands (for example `/search`, `/createWorkspace`), you need to install the [GitHub Copilot Chat](https://marketplace.visualstudio.com/items?itemName=GitHub.copilot-chat) extension.  \n#### Create workspaces  \nYou can ask Copilot to create workspaces for popular project types with the `/createWorkspace` slash command. Copilot will first generate a directory structure for your request.  \n<video src="images/1_80/create-workspace-outline.mp4" autoplay loop controls muted title="Create workspace outline"></video>  \nYou can then use the **Create Workspace** button to create and open the project directory as a new workspace.  \n![Create 

## Retrieve documents

First retrieve documents via full-text search using TF-IDF and Meilisearch. Then generate embeddings for the top 10 full-text results and run a semantic search over them to pick the 3 most relevant.
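The two stages can be sketched with toy data before wiring in the real services. Everything here is a stand-in: `keyword_hits` plays the role of Meilisearch, `math.dist` plays the role of FAISS, and the documents and 2-D "embeddings" are invented for illustration.

```python
import math

def keyword_hits(query, text):
    """Toy full-text score: number of distinct query words found in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def two_stage_retrieve(query, query_vec, docs, shortlist_size=10, top_k=3):
    # Stage 1 (stand-in for Meilisearch): shortlist docs that share words with the query
    scored = [(keyword_hits(query, d['content']), d) for d in docs]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    shortlist = [d for hits, d in ranked if hits > 0][:shortlist_size]

    # Stage 2 (stand-in for FAISS): re-rank the shortlist by distance to the query vector
    reranked = sorted(shortlist, key=lambda d: math.dist(query_vec, d['vector']))
    return reranked[:top_k]

# Hypothetical documents with made-up 2-D "embeddings"
docs = [
    {'content': 'copilot chat in notebooks', 'vector': [0.9, 0.1]},
    {'content': 'copilot inline suggestions', 'vector': [0.6, 0.4]},
    {'content': 'terminal sticky scroll', 'vector': [0.95, 0.05]},  # closest vector, no keyword match
    {'content': 'notebooks diff viewer', 'vector': [0.2, 0.8]},
]

results = two_stage_retrieve('copilot notebooks', [1.0, 0.0], docs, top_k=2)
print([d['content'] for d in results])
```

Note the trade-off of this cascade: a document that never passes the keyword stage cannot be recovered by the vector stage, even if its embedding is the closest one.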

Since we're using natural language to query our results, let's use TF-IDF (Term Frequency - Inverse Document Frequency) to score words based on their relevance to a given document or query. This will help to find the most important words for the search.

**`Copilot: Create a function that applies TF-IDF to extract top keywords given a sentence`**

In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer

def extract_top_keywords(query, documents, top_k=10):
    vectorizer = TfidfVectorizer(stop_words='english') # remove English words that don't carry significant meaning
    vectorizer.fit(documents) # fit the vectorizer on the documents

    tfidf_matrix = vectorizer.transform([query]) # transform the query to a TF-IDF matrix
    words = vectorizer.get_feature_names_out() # get the feature names (words)
    scores = tfidf_matrix.toarray().flatten() # get the scores for each word in the query
    
    # extract top keywords based on TF-IDF scores
    keyword_scores = dict(zip(words, scores))
    sorted_keywords = sorted(keyword_scores.items(), key=lambda x: x[1], reverse=True)
    
    # define words to exclude
    exclude_words = {'recent', 'new', 'feature', 'features', 'content', 'contents', 'release', 'releases', 'notes', 'note', 'updates', 'update'}
    
    # output top keywords
    top_keywords = [word for word, score in sorted_keywords if score > 0 and word not in exclude_words][:top_k]
    return ' '.join(top_keywords) # return keywords as a string

# Test: Print top keywords for a sample query
documents = [doc['content'] for doc in release_notes if 'content' in doc] # only search over content of the release notes
top_keywords = extract_top_keywords("What are recent features in Copilot for notebooks?", documents)
print(top_keywords)

copilot notebooks
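To see how TF-IDF weights the words of a query, here is a hand-rolled sketch on a made-up corpus. It uses the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1`, which is scikit-learn's default, but it tokenizes on whitespace and skips stop-word removal and L2 normalization, so the numbers will not match `TfidfVectorizer` exactly.

```python
import math
from collections import Counter

def tfidf_scores(query, documents):
    """Score each query word with smoothed TF-IDF against a small corpus."""
    tokenized = [set(doc.lower().split()) for doc in documents]
    n_docs = len(documents)
    scores = {}
    for word, tf in Counter(query.lower().split()).items():
        # Document frequency: how many documents contain the word
        df = sum(1 for tokens in tokenized if word in tokens)
        if df == 0:
            continue  # words never seen in the corpus are out of vocabulary
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[word] = tf * idf
    return scores

# Hypothetical corpus: "new" and "features" are common, "copilot" is rare
corpus = [
    'copilot chat in notebooks',
    'new terminal features',
    'new editor features',
    'new notebooks features',
]

scores = tfidf_scores('new copilot features notebooks', corpus)
for word, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f'{word}: {score:.3f}')
```

Words that appear in fewer documents get a larger weight, so "copilot" ranks above "notebooks", and both rank above the ubiquitous "new" and "features".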


Now let's write a function to search for the most relevant documents based on the query.

**`Copilot: Using meilisearch and the extract_top_keywords function, write a function to conduct a full text search over only the content of `#release_notes`.`**

In [None]:
def full_text_search(query, documents=release_notes, index_name='latest_release', top_k=50):
    documents = [doc['content'] for doc in documents if 'content' in doc] # only search over content of the release notes
    top_keywords = extract_top_keywords(query, documents)

    result = ms_client.index(index_name).search(top_keywords, {'limit': top_k})['hits']
    return result

Now let's conduct a vector similarity search using FAISS. We'll first copy over the getting-started code for embeddings models from [GitHub Marketplace](https://github.com/marketplace/models).

In [None]:
import os

from azure.ai.inference import EmbeddingsClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://models.inference.ai.azure.com"

embeddings_client = EmbeddingsClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(os.environ["AZURE_TOKEN"])
)

def generate_embeddings(text, model="text-embedding-3-small"):
    response = embeddings_client.embed(
        input=text,
        model=model
    )

    return response.data[0].embedding

<!-- TODO: modify -->
**`Copilot: Create a function to conduct vector similarity search using FAISS.`**

In [None]:
import faiss
import numpy as np

def faiss_search(query_embedding, doc_embeddings, top_k=5):
    # FAISS expects float32 arrays of shape (n_vectors, dim)
    embeddings_matrix = np.array(doc_embeddings, dtype='float32')
    query_matrix = np.array([query_embedding], dtype='float32')

    # Build FAISS index
    dim = embeddings_matrix.shape[1]
    index = faiss.IndexFlatL2(dim)  # Exact search using L2 (Euclidean) distance
    index.add(embeddings_matrix)

    # Never ask for more neighbors than there are documents; FAISS pads missing results with -1
    top_k = min(top_k, len(embeddings_matrix))
    _, indices = index.search(query_matrix, top_k)

    return indices.flatten()
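`IndexFlatL2` does an exact, brute-force search by Euclidean distance. If the embeddings are unit length (an assumption here, worth checking for whichever model you use), ranking by L2 distance gives the same order as ranking by cosine similarity, because for unit vectors `||a - b||² = 2 - 2·cos(a, b)`. A small pure-Python sketch with made-up vectors:

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a, b):
    # For unit vectors the dot product equals the cosine similarity
    return sum(x * y for x, y in zip(a, b))

# Made-up 3-D vectors standing in for real embeddings
query = normalize([1.0, 2.0, 0.5])
docs = [normalize(v) for v in ([1.0, 2.1, 0.4], [0.1, 0.2, 3.0], [2.0, 1.0, 0.0])]

# Smallest distance first vs. largest similarity first
by_l2 = sorted(range(len(docs)), key=lambda i: squared_l2(query, docs[i]))
by_cosine = sorted(range(len(docs)), key=lambda i: cosine(query, docs[i]), reverse=True)
print(by_l2, by_cosine)

# The identity holds for every document vector
for d in docs:
    assert math.isclose(squared_l2(query, d), 2 - 2 * cosine(query, d), abs_tol=1e-9)
```

If the embeddings are not normalized, the two rankings can differ, and an inner-product index over normalized vectors may be the better fit.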

In [66]:
def retrieve_and_embed_docs(query, documents=latest_release, embeddings_model="text-embedding-3-small", top_k=3):
    full_text_results = full_text_search(query, top_k=10) # full text search using TF-IDF & meilisearch

    # extract relevant document embeddings from meilisearch results
    relevant_texts = []
    doc_embeddings = []
    urls = []
    for hit in full_text_results:
        doc_id = hit['id']
        doc = next((item for item in documents if item.get('id') == doc_id), None)
        if doc and 'content' in doc:  # skip hits that are missing from `documents`
            relevant_texts.append(doc['content'])
            content_embeddings = generate_embeddings(doc['content'], model=embeddings_model)
            doc_embeddings.append(content_embeddings)
            urls.append(doc['url'])
    
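    # nothing to re-rank if the full-text search returned no usable hits
    if not doc_embeddings:
        return []

    # vector search using FAISS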
    query_embedding = generate_embeddings(query, model=embeddings_model)
    faiss_indices = faiss_search(query_embedding, doc_embeddings, top_k)
    
    # combine results
    combined_results = []
    for i in faiss_indices:
        combined_results.append({
            "content": relevant_texts[i],
            "url": urls[i]
        })

    return combined_results

In [68]:
# Test: Generate an answer for a sample question
q = "What are recent features for Copilot chat in notebooks?"

retrieved_docs = retrieve_and_embed_docs(q)
retrieved_docs

[{'content': 'Learn what is new in the Visual Studio Code September 2024 Release (1.94)  \n### Attach variables in notebook chat  \nWhen you use Copilot in a notebook, you can now attach variables from the Jupyter kernel in your requests. Adding variables gives you more precise control over the context for your chat request, so that you get more relevant responses from Copilot.  \nEither type `#`, followed by the variable name, or use the 📎 control (`kb(workbench.action.chat.attachContext)`) in Inline Chat to add a context variable.  \n<video src="images/1_94/notebook-kernel-variable.mp4" title="Attach a context variable by using `#` in a notebook chat request" autoplay loop controls muted></video>',
  'url': 'https://code.visualstudio.com/updates/v1_94#_attach-variables-in-notebook-chat'},
 {'content': 'Learn what is new in the Visual Studio Code March 2024 Release (1.88)  \n### GitHub Copilot  \n#### Inline Chat improvements  \nInline Chat now starts as a floating control, making it 

TODO: Let's test with different embeddings models to see if we get different results...

## Answer generation

In [106]:
# import tinyurl
def generate_llm_answer(question, context, completion_model="gpt-4o-mini"):
    # Combine the relevant documents into a single context
    context_text = " ".join([doc['content'] for doc in context if doc.get('content')])
    context_url = ", ".join([doc['url'] for doc in context if doc.get('url')])

    messages = [
        {"role": "system", "content": "You are a social assistant who writes creative content. You will politely decline any other requests from the user not related to creating content. Don't talk about a single VS Code release and don't talk about release dates at all. Instead, only talk about the relevant features. Don't include made up links. You format all your responses as Markdown unless otherwise specified. Avoid wrapping your entire response in a markdown code element."},
        {"role": "user", "content": f"Create a short tweet based on the following context: {context_text}. This won't actually be a tweet, so in your answer, always include the following URLs from the content sources: {context_url}. Question: {question}"}
    ]
    
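    # NOTE: `gpt_client` is not defined in this notebook; this assumes an
    # OpenAI-compatible chat client was created in an earlier cell.
    response = gpt_client.chat.completions.create(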
        model=completion_model,
        messages=messages,
        temperature=0.3,
        max_tokens=1500, # Dynamically set max_tokens based on the combined length of the docs?
        top_p=1.0
    )

    answer = response.choices[0].message.content
    return answer

In [101]:
final_answer = generate_llm_answer(q, retrieved_docs)
print(final_answer)

🚀 Exciting updates in VS Code! Now you can attach variables in notebook chats with Copilot for more precise context. Just type `#` followed by the variable name or use the 📎 control! 🎉 Check it out: [Attach Variables](https://code.visualstudio.com/updates/v1_94#_attach-variables-in-notebook-chat) #VSCode #GitHubCopilot

For more on Copilot features, explore:  
- [March 2024 Release](https://code.visualstudio.com/updates/v1_88#_github-copilot)  
- [May 2024 Release](https://code.visualstudio.com/updates/v1_90#_github-copilot)


## Comparing models

Use [GitHub Marketplace](https://github.com/marketplace/models) to find and experiment with AI models. Replace the `embeddings_model` and `completion_model` arguments with model names found in the marketplace:

```python
q = "What are recent features for Copilot chat in notebooks?"
retrieved_docs = retrieve_and_embed_docs(q, embeddings_model="text-embedding-3-small")
final_answer = generate_llm_answer(q, retrieved_docs, completion_model="gpt-4o-mini")
print(final_answer)
```
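To compare several completion models in one pass, a small loop can collect the answers side by side. This is only a sketch: `compare_models` and `fake_generate` are hypothetical names, and `fake_generate` stands in for `generate_llm_answer` so the example runs without network access.

```python
def compare_models(question, docs, models, generate):
    """Return {model_name: answer or error text} using the supplied generate function."""
    results = {}
    for model in models:
        try:
            results[model] = generate(question, docs, completion_model=model)
        except Exception as exc:  # keep going if one model is unavailable
            results[model] = f"<error: {exc}>"
    return results

# Stand-in for generate_llm_answer; it fails for one model to exercise the error path
def fake_generate(question, docs, completion_model):
    if completion_model == "unavailable-model":
        raise RuntimeError("model not found")
    return f"[{completion_model}] {len(docs)} docs used for: {question}"

answers = compare_models(
    "What are recent features for Copilot chat in notebooks?",
    [{"content": "...", "url": "..."}],
    ["gpt-4o-mini", "Mistral-small", "unavailable-model"],
    fake_generate,
)
for model, answer in answers.items():
    print(model, "->", answer)
```

In the notebook, passing `generate_llm_answer` instead of `fake_generate` would reuse the same retrieved documents for every model, which keeps the comparison focused on the completion step.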

In [102]:
final_answer = generate_llm_answer(q, retrieved_docs, completion_model="Mistral-small")
print(final_answer)

🚀 New in VS Code 1.94! Attach variables from the Jupyter kernel in your Copilot chat requests for more relevant responses. Use `#` or the 📎 control. Learn more: https://code.visualstudio.com/updates/v1_94#_attach-variables-in-notebook-chat

And in VS Code 1.88, the kernel state is now automatically included as context in Inline Chat for notebooks. This lets Copilot use the current state of the notebook to provide more relevant completions. Learn more: https://code.visualstudio.com/updates/v1_88#_notebook-kernel-state-as-context


In [103]:
final_answer = generate_llm_answer(q, retrieved_docs, completion_model="meta-llama-3-8b-instruct")
print(final_answer)

Here's a tweet-sized summary of the recent features for Copilot chat in notebooks:

"New in VS Code! 🚀 Attach variables from Jupyter kernel in notebook chat with Copilot. Add context with `#` or 📎 control. Get more precise control over chat requests and relevant responses. Learn more: https://code.visualstudio.com/updates/v1_94#_attach-variables-in-notebook-chat"
