# General instructions for all labs

1. To turn in:
 - this python notebook, filled out (2 pts)
 - a *standalone* PDF report that contains all the plots, and the answers to all the discussion questions (2 pts)

2. Use of ChatGPT / CoPilot / etc:
   - Allowed, but you own everything that is generated
   - This means that any part of the solution can be asked in the quiz. It can be as detailed as "What was the batch size you used in training" or specific as "what exactly does masking do in this case?" Any discussion question is also game for a quiz question.
   - If I find AI usage to be excessive, I can individually drag any of you in for a 1-1 meeting, in which I grill you on your code. If it looks like irresponsible copy/pasting without proper understanding, I reserve the right to drastically lower your grade, or even submit cases to GGAC for ethical review.
  
3. Use of peer collaboration:
   - In general not allowed. (Discussion / comparing answers is ok, but work on actual coding independently.)
   - Exceptions can be made if you all wrote your own training script, but 1. it takes forever to train or 2. you don't have great compute resources. Then you can share a trained model amongst yourselves *and declare it in your PDF*. However, the code for training *must still be written by yourself*.
     


# Lab 2: RAG over a Large Codebase

## SciPy

**SciPy** is a fundamental open-source library for scientific computing in Python. It builds on top of **NumPy** and provides efficient implementations of many core algorithms for mathematics, science, and engineering. SciPy is widely used in academia and industry for numerical analysis, optimization, signal processing, and more.

### Key Features

* **Linear Algebra & Optimization:** Robust solvers for systems of equations, eigenvalue problems, and constrained/unconstrained optimization.  
* **Integration & Differential Equations:** Tools for numerical integration, ODE solvers, and quadrature.  
* **Signal & Image Processing:** Filtering, Fourier transforms, and image manipulation utilities.  
* **Statistics & Probability:** Random variables, hypothesis testing, and statistical distributions.  
* **Sparse Matrices:** Efficient storage and computation with large, sparse systems.  

SciPy underlies much of the Python scientific ecosystem, serving as the backbone for applications in physics, biology, engineering, data science, and machine learning.

### Resources

* 📖 [Documentation](https://docs.scipy.org/doc/scipy/)  
* 💻 [GitHub Repository](https://github.com/scipy/scipy)

---

## Lab Goal

In this lab, you will build a **Retrieval-Augmented Generation (RAG) system** to answer natural language questions about a large codebase (\~10,000 scripts). By combining an LLM with retrieval over SciPy's documentation and code, the system should help a newbie coder explore and understand complex source code, algorithms, and APIs.



## Part 1 – Setup and Preprocessing

In this part, you will complete the following. The notebook will walk you through it step by step.

1. **Documentation Preparation**

   * Obtain the provided project documentation (manuals, reference guides, tutorials, etc.).
   * Each section will be treated as text input for retrieval.
   * Ignore large binary files, images, and other non-text resources.

2. **Chunking**

   * Split the documentation into overlapping chunks:

     * Suggested: 300–500 tokens per chunk.
     * Overlap: 50–100 tokens to preserve context across boundaries.
   * Store metadata for each chunk, such as document name and section or page reference.

3. **Embedding Generation**

   * Use a sentence-transformer (e.g., `msmarco-distilbert-base-v3`) to generate embeddings for each chunk of documentation.
   * Save embeddings and metadata to disk for reuse.

4. **Vector Database**

   * Load embeddings into FAISS or Chroma.
   * Confirm you can perform a similarity search for a sample query.
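
The chunking step above can be pictured as a sliding window. The sketch below is only an illustration: it approximates tokens with whitespace-separated words (a real tokenizer would count differently), and the name `sliding_window_chunks` is hypothetical rather than part of the lab's helper code.

```python
def sliding_window_chunks(words, chunk_size=400, overlap=50):
    """Split a list of words into overlapping windows.

    Assumption: 'tokens' are approximated by whitespace-separated words.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        end = min(start + chunk_size, len(words))
        chunks.append({
            "text": " ".join(words[start:end]),
            "start_word": start,
            "end_word": end,
        })
        if end == len(words):
            break  # this window already reached the end of the text
    return chunks


# Toy document of 1000 distinct "words".
words = [f"w{i}" for i in range(1000)]
chunks = sliding_window_chunks(words, chunk_size=400, overlap=50)
for c in chunks:
    print(c["start_word"], c["end_word"])
```

Each window starts `chunk_size - overlap` words after the previous one, so the last `overlap` words of one chunk reappear at the start of the next. That repetition is what preserves context across chunk boundaries.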


### Part 1 Step 1: Expanding Code Comments

Take a look at each of these helper scripts. To demonstrate you fully understand the code, you will expand the comments.   The comments should explain not only *what* each line of code does, but also *why* it is there and *how* it contributes to the larger goal of the pipeline.

You are welcome to use ChatGPT (or other tools) to help generate initial drafts of comments. However, you are ultimately responsible for understanding the details — you will be expected to **regurgitate this level of explanation on an exam without assistance**.



In [1]:
!pip install faiss-cpu  # GPU alternative: faiss-gpu-cu12 (CUDA 12.x) or faiss-gpu-cu11 (CUDA 11.x)

Collecting faiss-cpu
  Downloading faiss_cpu-1.12.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (5.1 kB)
Downloading faiss_cpu-1.12.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (31.4 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.12.0


In [2]:
import os
import json
import faiss
import numpy as np
from pathlib import Path
from typing import List, Dict

import torch
from transformers import AutoTokenizer, AutoModel

from sentence_transformers import SentenceTransformer
models = {}
#models['code'] = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-v3")
models['code'] = SentenceTransformer("BAAI/bge-m3")

models['docs'] = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# -------- 1. File Loader --------
def load_code_files(base_dir, exts={".cpp", ".h", ".c", ".f90", ".py"}):
    """
    What does it do? (1 sentence)
    This generator walks base_dir recursively and, for every file whose extension is in exts, yields a tuple (file_path, file_content).

    What are the inputs?
    base_dir: the root folder to search. exts: the set of file extensions to keep.

    What are the outputs?
    A generator of (file_path, file_content) tuples, one per matching file; files that cannot be read are skipped with a message.

    """
    for root, _, files in os.walk(base_dir):
        for f in files:
            if os.path.splitext(f)[-1] in exts:
                try:
                    path = Path(root) / f
                    with open(path, "r", errors="ignore") as fh:
                        yield str(path), fh.read()
                except Exception as e:
                    print(f"Skipping {f}: {e}")


# -------- 2. Chunker --------
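def chunk_text(text: str, chunk_size=1000, overlap=0):
    """
    What does it do? (1 sentence)
    This function splits the text on whitespace and slices the words into chunks of up to chunk_size words, with consecutive chunks sharing overlap words.

    What are the inputs?
    text: the string to split. chunk_size: maximum words per chunk. overlap: words shared between consecutive chunks (must be smaller than chunk_size).

    What are the outputs?
    A list of dicts, each holding the chunk text and its start/end word offsets.

    """
    # A step of zero or less would make range() fail or skip everything.
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    chunks = []
    for i in range(0, len(words), chunk_size - overlap):
        piece = " ".join(words[i:i+chunk_size])
        if piece.strip():
            chunks.append({
                "text": piece,
                "start_word": i,
                "end_word": min(i + chunk_size, len(words))  # clamp to the real end of the text
            })
    return chunks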


# -------- 3. Embedding with a SentenceTransformer model --------
def embed_texts(model, texts):
    """
    What does it do? (1 sentence)
    This function encodes each input text into an embedding vector using the given model.

    What are the inputs?
    model: a pretrained SentenceTransformer. texts: the list of strings to embed.

    What are the outputs?
    A NumPy array of shape (len(texts), embedding_dim) with L2-normalized rows.

    """
    embs = model.encode(
        texts,
        batch_size=16,
        show_progress_bar=True,
        convert_to_numpy=True,
        normalize_embeddings=True  # good for cosine similarity
    )
    return embs


In [None]:
texts = [
    "Artificial intelligence is transforming the world.",
    "Machine learning is a subset of AI.",
    "Natural language processing enables machines to understand human language."
]

# Print input dimensions
print("Input dimensions:")
print(" - Number of texts:", len(texts))

# Get embeddings
embeddings = embed_texts(models['docs'], texts)

# Print output dimensions
print("\nOutput dimensions:")
print(" - Embeddings shape:", embeddings.shape)  # (num_texts, embedding_dim)

Input dimensions:
 - Number of texts: 3




Output dimensions:
 - Embeddings shape: (3, 384)




## FAISS

The next two code blocks introduce **FAISS** (Facebook AI Similarity Search), a library built for efficient similarity search across large collections of vectors. After we embed documentation chunks into high-dimensional vectors, FAISS gives us the data structures and algorithms needed to store them and quickly retrieve the closest matches to a query.

A key feature of FAISS is its support for different types of **indexes** for vector search. In this lab, we will start with the simplest option: **IndexFlatL2**.

* This index stores all vectors directly in memory.
* It computes exact Euclidean ($L^2$) distances for each query.
* It is best suited for datasets ranging from a few thousand up to a few hundred thousand vectors. For larger collections, when would you switch to an approximate index like HNSW or IVF?
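
To make "exact search" concrete, here is a minimal pure-Python sketch of what a flat L2 index does conceptually: it compares the query against every stored vector. It is not FAISS itself, and the function names are hypothetical. It also checks a useful identity: because `embed_texts` normalizes embeddings to unit length, the squared L2 distance equals `2 - 2 * cosine`, so ranking by L2 distance gives the same order as ranking by cosine similarity. (`IndexFlatL2` reports squared distances.)

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity; for unit vectors this is just the dot product."""
    return sum(x * y for x, y in zip(a, b))

def brute_force_search(query, vectors, topk):
    """Exact search: score every stored vector, keep the topk closest."""
    scored = sorted((squared_l2(query, v), i) for i, v in enumerate(vectors))
    return scored[:topk]

# Toy 3-dimensional "embeddings" (made-up values, for illustration only).
vectors = [normalize(v) for v in (
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
)]
query = normalize([1.0, 0.2, 0.0])

hits = brute_force_search(query, vectors, topk=2)
for dist, idx in hits:
    # The two printed numbers agree: d^2 = 2 - 2 * cos for unit vectors.
    print(idx, round(dist, 4), round(2 - 2 * cosine(query, vectors[idx]), 4))
```

The cost of this approach grows linearly with the number of stored vectors, which is why approximate indexes become attractive for much larger collections.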


In [3]:
# -------- 4. Build Vector DB --------
def build_faiss_index(chunks_with_meta, dim):
    """
    What does it do? (1 sentence)
    This builds a FAISS IndexFlatL2 and adds every chunk's embedding to it, so the position of a vector in the index matches the position of its chunk in the list.

    What are the inputs?
    chunks_with_meta: a list of chunk dicts that each carry an "embedding" vector. dim: the embedding dimension.

    What are the outputs?
    The populated FAISS index object.

    """
    index = faiss.IndexFlatL2(dim)
    embeddings = np.vstack([c["embedding"] for c in chunks_with_meta]).astype("float32")
    index.add(embeddings)
    return index



# -------- Example Query --------
def query_index(query, topk=5, code_or_doc = 'code'):
    """
    What does it do? (1-2 sentences)
    For a given query, it embeds the query text and returns the positions of the topk nearest chunks in the saved index.

    What are the inputs?
    query: the query text. topk: how many neighbours to fetch. code_or_doc: which index and embedding model to use ('code' or 'docs').

    What are the outputs?
    An array of topk chunk indices, nearest first; each one is a row number in the matching chunks_metadata.jsonl.

    """

    index = faiss.read_index("drive/MyDrive/%s/code_chunks.index" % code_or_doc)

    q_emb = embed_texts(models[code_or_doc],[query]).astype("float32")

    distances, indices = index.search(q_emb, topk)

    return indices[0]

In [None]:

# -------- MAIN PIPELINE --------
def process_codebase(base_dir="scipy-main", code_or_doc = 'code'):
    all_chunks = []
    if code_or_doc == 'docs':
        exts = {'.rst'}
        chunk_size = 1000
    else:
        exts = {'.py'}
        chunk_size = 250
    for fpath, text in load_code_files(base_dir, exts):
        for chunk in chunk_text(text,chunk_size=chunk_size):
            chunk["file"] = fpath
            all_chunks.append(chunk)

    # Embed with the SentenceTransformer model selected for this corpus
    texts = [c["text"] for c in all_chunks]

    embs = embed_texts(models[code_or_doc], texts)
    print('done with embed')
    for c, e in zip(all_chunks, embs):
        c["embedding"] = e

    # Save metadata
    with open("drive/MyDrive/%s/chunks_metadata.jsonl" % code_or_doc, "w") as out:
        for c in all_chunks:
            meta = {k: v for k, v in c.items() if k != "embedding"}
            out.write(json.dumps(meta) + "\n")

    # Build index

    dim = len(all_chunks[0]["embedding"])
    index = build_faiss_index(all_chunks, dim)
    faiss.write_index(index, "drive/MyDrive/%s/code_chunks.index" % code_or_doc)
    print(f"Processed {len(all_chunks)} chunks from {base_dir}.")


# process_codebase("drive/MyDrive/scipy-main/doc/source",code_or_doc = 'docs')   # run once for docs
#process_codebase("drive/MyDrive/scipy-main/scipy",code_or_doc = 'code')   # run once for code




done with embed
Processed 9080 chunks from drive/MyDrive/scipy-main/scipy.


In [4]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive



## Part 2: Testing

Now that you have built a vector-based retrieval tool on the **SciPy documentation**, the next step is to **qualitatively assess how well it works**. You will not compute numerical accuracy yet. Instead, you will explore retrieval behavior and record your judgments about when the system feels useful, misleading, or irrelevant.

You will use two supports for this process:

1. The **official SciPy documentation** (provided in the lab).
2. A **free ChatGPT interface** to act as an external “relevance judge.”

---

### 1. Design a Variety of Queries

Create at least **5 queries** covering different types of information you might want from SciPy. Use the templates below and adapt them to optimization, statistics, linear algebra, and sparse matrices:

* **Conceptual (high-level):**
  Ask about an algorithm, method, or mathematical concept.
  *Example:* `"trust-region methods in optimization"` or `"sparse matrix factorization"`.

* **Function/API-specific:**
  Target a known function, class, or module in SciPy.
  *Example:* `"scipy.optimize.minimize"` or `"scipy.sparse.linalg.cg"`.

* **Keyword-only:**
  Use just a technical term.
  *Example:* `"Poisson distribution"` or `"LU decomposition"`.

* **Natural language (descriptive):**
  Ask a full question in plain English.
  *Example:* `"How do I compute confidence intervals for a normal distribution in SciPy?"`

* **Edge cases:**
  Include queries that are vague, misspelled, or off-topic.
  *Example:* `"optimizashun"`, `"solver"`, `"deep learning"`.

---

### 2. Compare with the Manual

For each query:

1. Look up the topic in the SciPy documentation (using search or Ctrl+F).
2. Compare the retrieved snippets with the official docs.

   * If they match or are consistent, mark as **aligned**.
   * If they do not appear in the docs, mark as **potentially irrelevant**.

This provides a rough ground truth based on trusted documentation.

---

### 3. Use ChatGPT as a Relevance Judge

For an additional check, you will ask ChatGPT whether each retrieved snippet is relevant to the query. Use a **standardized prompt template** to keep your experiments consistent.

**Template:**

```
Query: [your query]  
Snippet: [retrieved snippet]  
Reference (optional): [short excerpt from SciPy documentation]  

Question for ChatGPT:  
"Is this snippet relevant to the query? Answer Yes or No, and provide a one-sentence justification."  
```

**Example:**

```
Query: "sparse matrix factorization"  
Snippet: "The LU decomposition of sparse matrices is available in scipy.sparse.linalg."  

Question for ChatGPT:  
"Is this snippet relevant to the query? Answer Yes or No, and provide a one-sentence justification."  
```

**Example response:**

```
*Yes. The snippet discusses sparse LU factorization, which is one form of sparse matrix factorization.*
```

Make a table and include all the query/snippet/ChatGPT outputs.
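
If you want every prompt to follow the template exactly, a small helper can assemble it for you. This is only a sketch: `build_judge_prompt` and `JUDGE_QUESTION` are hypothetical names, and the snippet string below is a placeholder rather than real retrieval output.

```python
JUDGE_QUESTION = (
    '"Is this snippet relevant to the query? Answer Yes or No, '
    'and provide a one-sentence justification."'
)

def build_judge_prompt(query, snippet, reference=None):
    """Fill the standardized relevance-judge template for one query/snippet pair."""
    lines = [f'Query: "{query}"', f'Snippet: "{snippet}"']
    if reference:
        # The reference excerpt is optional, as in the template.
        lines.append(f"Reference: {reference}")
    lines += ["", "Question for ChatGPT:", JUDGE_QUESTION]
    return "\n".join(lines)


prompt = build_judge_prompt(
    "sparse matrix factorization",
    "<text of one retrieved chunk>",  # placeholder
)
print(prompt)
```

Pasting the printed text into the chat interface keeps the wording identical across all query/snippet pairs, which makes the judgments easier to compare in your table.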

---

### 4. Summarize Findings

Reflect on your observations. Consider the following guiding questions:

**Query length:** How long should a query be to get useful results? Did shorter keyword queries work as well as longer, descriptive ones?

**Documentation coverage:** Were some parts of the SciPy documentation more complete than others? How did that affect retrieval quality?

**Exact vs. fuzzy matching:** Did you need to match words exactly (e.g., `"LU decomposition"` vs. `"factorization"`) to get good results? Give one example where exact matching was not required, and one where it failed.

**Variation across query types:** Which query styles (conceptual, API-specific, natural language, edge cases) gave the most helpful results? Which tended to misfire?

**Role of ChatGPT:** How reliable was ChatGPT as a relevance judge? Did it agree with your manual check against the documentation?


In [5]:
def load_metadata(code_or_doc='code'):
    """Load the chunk metadata (text, start/end) from the .jsonl file."""
    metas = []
    with open(f"drive/MyDrive/{code_or_doc}/chunks_metadata.jsonl", "r") as f:
        for line in f:
            metas.append(json.loads(line))
    return metas

In [6]:
queries_docs = ["What is the purpose of the scipy.signal module?", "How to use scipy.stats.norm.interval?", "sparse matrix multiplication", "How can I find the roots of a polynomial equation in SciPy?", "linear algebrra solver"]
metas=load_metadata('docs')
results = {}
for query in queries_docs:
    top_indices = query_index(query, topk=50, code_or_doc='docs')
    results[query] = [metas[i]["file"] for i in top_indices]

with open("drive/MyDrive/query_results_docs.json", "w") as f:
    json.dump(results, f, indent=4)


In [7]:
queries_docs = ["How to call scipy.optimize.minimize with constraints?"
,"Example usage of scipy.sparse.linalg.spsolve"
,"What are the parameters for scipy.fft.fft2?"
,"How to handle LinAlgError in SciPy?"
,"Show me code for scipy.signal.convolve2d"]
metas=load_metadata('code')
results = {}
for query in queries_docs:
    top_indices = query_index(query, topk=50, code_or_doc='code')
    results[query] = [metas[i]["file"] for i in top_indices]

with open("drive/MyDrive/query_results_code.json", "w") as f:
    json.dump(results, f, indent=4)



### Questions for Discussion

* Why might we use **different models** for code and for documentation?
* Why might we choose **different chunk lengths** for code vs. docs?
* What are the potential **advantages or trade-offs** of alternative strategies?

*In your answers, consider the task requirements, the size of the corpus, and the amount of compute available (especially when using free models).*



# Part 3 – Quantitative Evaluation

We now want to **measure retrieval quality** (how well relevant code/doc chunks are retrieved) and **answer quality** (how grounded the model’s response is).

---

## Precision & Recall

* **Precision\@k**: fraction of the top *k* retrieved chunks that are relevant.

  $$
  \text{Precision@k} = \frac{\# \text{ relevant chunks in top k}}{k}
  $$

* **Recall\@k**: fraction of all relevant chunks that appear in the top *k*.

  $$
  \text{Recall@k} = \frac{\# \text{ relevant chunks in top k}}{\# \text{ total relevant chunks}}
  $$
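
The two definitions translate directly into code. The sketch below assumes you have a ranked list of retrieved chunk ids and a hand-labeled set of relevant ids; the ids themselves are made up for illustration.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved ids that are relevant (denominator is k)."""
    hits = sum(1 for item in retrieved[:k] if item in relevant)
    return hits / k

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant ids that appear in the top-k."""
    if not relevant:
        return 0.0  # nothing to recall; avoid dividing by zero
    hits = sum(1 for item in retrieved[:k] if item in relevant)
    return hits / len(relevant)


# Hypothetical labels: ranked retrieval output and the full relevant set.
retrieved = ["c7", "c2", "c9", "c4", "c1"]
relevant = {"c7", "c9", "c4", "c1", "c5"}

for k in (1, 3, 5):
    p = precision_at_k(retrieved, relevant, k)
    r = recall_at_k(retrieved, relevant, k)
    print(f"k={k}  precision={p:.2f}  recall={r:.2f}")
```

Note that recall needs the *total* number of relevant chunks, which you can only estimate by labeling beyond the top k (for example, by inspecting a larger candidate pool).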

---

## How to Evaluate

1. Pick **3–5 SciPy-focused queries** (function names, error cases, natural language).
2. For each query:

   * Record the **retrieved top-k chunks**.
   * Label which chunks are **relevant** (contain the needed function/definition/example).
   * Compute precision\@k and recall\@k for k = 1, 3, 5.
   * Collect the model’s answer.
   * Assign an **answer relevance score (0/1)**:

     * **1** = grounded in retrieved code/docs,
     * **0** = hallucinated/unrelated.

---

## Example Table with SciPy Queries

| Query                                             | P\@1 | P\@3 | P\@5 | R\@5 | Answer Relevant? (0/1) | Notes                                  |
| ------------------------------------------------- | ---- | ---- | ---- | ---- | ---------------------- | -------------------------------------- |
| `sparse.linalg.cg`                                | 1    | 1    | 0.8  | 0.8  | 1                      | Correctly retrieved conjugate grad fn  |
| `fft convolution example`                         | 1    | 0.67 | 0.8  | 0.8  | 1                      | Good examples in signal.fft docs       |
| `optimize.minimize with bounds`                   | 1    | 1    | 1    | 1    | 1                      | Direct hit in optimization tutorial    |
| `stats.ttest_ind`                                 | 0    | 0.33 | 0.4  | 0.4  | 0                      | Retrieved partial, model filled in     |
| *How do I solve a sparse linear system in SciPy?* | 1    | 0.67 | 0.8  | 0.8  | 1                      | Found references to `spsolve` and `cg` |


# Free or paid? It's up to you
Given that our assessment is not very long, a free way of assessing the output is to just use the ChatGPT free interface. But, if you are so inclined, you are of course welcome to pay for the OpenAI API (e.g. `gpt-4o-mini`). It isn't very expensive, and can help you practice setting up more automated testbeds.  Here is the process for doing that:


```bash
pip install openai
```

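```python
from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

def ask_llm(user_query, retrieved_chunks):
    """Answer user_query using only the retrieved chunks as context."""
    # Accept either one string or a list of chunk texts.
    if not isinstance(retrieved_chunks, str):
        retrieved_chunks = "\n\n---\n\n".join(retrieved_chunks)
    # Build the prompt without the stray indentation a triple-quoted string would add.
    prompt = (
        "You are an expert on the following codebase.\n"
        "Answer the question using only the provided code context.\n\n"
        f"Code context:\n{retrieved_chunks}\n\n"
        f"Question:\n{user_query}\n\n"
        "Answer:"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```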



# Part 4 – Adding Code Retrieval and Evaluation

Now that you have embeddings for documentation, we will extend the system to also include **code embeddings**. This will allow the retriever to pull both documentation chunks *and* code chunks when answering a query.

---

## Step 1. Generate & Add Code Embeddings

1. Use the same embedding workflow as before, but now applied to **code chunks** (functions, classes, or 50–80 line segments).
2. Store these in your vector index alongside your documentation embeddings.
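
One way to get function- and class-level chunks for Python files is the standard-library `ast` module. The sketch below is an illustration under assumptions: `chunk_python_source` is a hypothetical helper, it only looks at top-level definitions, and files that fail to parse would need a fallback such as fixed-size line windows.

```python
import ast

def chunk_python_source(source):
    """Return one chunk per top-level function or class in a Python source string."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                # Exact source text of the definition, docstring included.
                "text": ast.get_source_segment(source, node),
            })
    return chunks


# Small made-up module used only to demonstrate the chunker.
example = '''\
import math

def area(r):
    """Area of a circle."""
    return math.pi * r ** 2

class Shape:
    def name(self):
        return "shape"
'''

code_chunks = chunk_python_source(example)
for c in code_chunks:
    print(c["kind"], c["name"], c["start_line"], c["end_line"])
```

Chunking on definition boundaries keeps each signature together with its docstring and body, which tends to suit queries that name a specific function. The trade-off is uneven chunk sizes, so very long classes may still need to be split further.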

---

## Step 2. Qualitative Assessment (Code Only)

As you did earlier for documentation:

1. Pick **3–5 queries** that are code-oriented (e.g., function names, error messages, “how do I call X?”).
2. Retrieve the top chunks from the **code index** only.
3. Evaluate qualitatively:

   * Do the retrieved chunks contain the right functions/classes?
   * Are signatures, docstrings, and examples useful?
   * Where does it fail (e.g., missing, too long, irrelevant)?

Keep brief notes for each query.


## Step 3.  Quantitative Evaluation with Fusion

Now we want to combine results from **doc retrieval** and **code retrieval** into a single ranked list. The process is:

1. **Retrieve a large candidate pool**

   * If your original top-k was, say, `k=5`, now retrieve about `10×k` (e.g., 50) from *each* index (docs and code).
   * This ensures you don’t miss relevant results that would otherwise be cut off.

2. **Compute fusion scores**

   * **Reciprocal Rank Fusion (RRF):**

     $$
     \text{RRF}(d) = \sum_{i \in \{\text{doc, code}\}} \frac{1}{k_0 + \text{rank}_i(d)}, \quad k_0 \in [60,100]
     $$

     This combines the rankings from both sources.
   * **(Optional)** Instead of RRF, you can use a **weighted similarity score**:

     $$
     \text{score}(d) = 0.6 \cdot \text{score}_{\text{code}}(d) + 0.4 \cdot \text{score}_{\text{doc}}(d)
     $$

     Use this if your queries are often function- or symbol-heavy.

3. **Re-rank and select final top-k**

   * After computing scores for all candidates, sort them.
   * Keep the best `k` (e.g., 5) as your fused retrieval output.
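
The fusion steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names and ids are hypothetical, and the weighted variant assumes both score sets are similarities on a comparable scale where higher is better (FAISS L2 distances would first need converting, since lower is better there).

```python
def reciprocal_rank_fusion(rankings, k0=60):
    """Fuse ranked id lists with RRF.

    rankings: dict mapping a source name (e.g. 'doc', 'code') to a list of
    ids ordered best first. An id missing from a list adds nothing for it.
    """
    scores = {}
    for ranked_ids in rankings.values():
        for rank, item in enumerate(ranked_ids, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k0 + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def weighted_fusion(code_scores, doc_scores, w_code=0.6, w_doc=0.4):
    """Weighted sum of similarity scores; a missing score counts as 0."""
    ids = set(code_scores) | set(doc_scores)
    fused = {
        i: w_code * code_scores.get(i, 0.0) + w_doc * doc_scores.get(i, 0.0)
        for i in ids
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical candidate pools (ids are placeholders).
rankings = {
    "doc": ["item_a", "item_b", "item_c"],
    "code": ["item_b", "item_d", "item_a"],
}
rrf_ranked = reciprocal_rank_fusion(rankings, k0=60)
top_k = [item for item, _ in rrf_ranked[:2]]
print(top_k)

weighted_ranked = weighted_fusion(
    code_scores={"item_a": 0.9, "item_b": 0.5},
    doc_scores={"item_b": 0.8, "item_c": 0.7},
)
print([item for item, _ in weighted_ranked])
```

RRF uses only rank positions, so it avoids comparing raw scores from two different embedding models. The weighted sum is more sensitive to how each model scales its similarities.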



## Step 4. Build Your Evaluation Table

As before, compute **Precision\@k** and note whether the **answer is grounded**. This time evaluate both fused retrieval methods (RRF and/or weighted sum).

| Query                                      | P\@1 | P\@3 | P\@5 | Answer Relevant? (0/1) | Notes                                               |
| ------------------------------------------ | ---- | ---- | ---- | ---------------------- | --------------------------------------------------- |
| `sparse.linalg.cg`                         | 1    | 1    | 0.8  | 1                      | Found correct function signature                    |
| `fft convolution example`                  | 1    | 0.67 | 0.8  | 1                      | Docs + code both retrieved                          |
| *How do I solve a sparse system in SciPy?* | 1    | 0.67 | 0.8  | 1                      | Fused retrieval covered both `spsolve` and tutorial |




In [8]:
queries_docs = ["What are the different interpolation methods available in SciPy?"
,"How to use scipy.optimize.root to find the roots of a function?"
,"Explain how to perform a t-test in SciPy and interpret the results."
]
metas=load_metadata('code')
results = {}
for query in queries_docs:
    top_indices = query_index(query, topk=50, code_or_doc='code')
    results[query] = [metas[i]["file"] for i in top_indices]

with open("drive/MyDrive/query_results_code_4.json", "w") as f:
    json.dump(results, f, indent=4)

metas=load_metadata('docs')
results = {}
for query in queries_docs:
    top_indices = query_index(query, topk=50, code_or_doc='docs')
    results[query] = [metas[i]["file"] for i in top_indices]

with open("drive/MyDrive/query_results_docs_4.json", "w") as f:
    json.dump(results, f, indent=4)
