<a href="https://colab.research.google.com/github/lamkaiyi/RAG_with_Llama/blob/main/LlamaIndexRAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install -q torch\
torchvision\
transformers\
langchain\
arxiv\
pymupdf\
chromadb\
wandb\
tiktoken\
sentence-transformers\
bitsandbytes\
accelerate\
ragas\
llama_index\
datasets\
text-generation\
pypdf

# Mine Relevant Documents

Given a topic, we download relevant research papers from arXiv (https://info.arxiv.org/about/index.html).


In [None]:
from llama_index import download_loader

topic = "Support Vector Machines"
ArxivReader = download_loader("ArxivReader")

loader = ArxivReader()
documents = loader.load_data(search_query=topic)

Observe the data structure in which the documents are loaded.

In [None]:
documents[0]

Document(id_='bb67ec1b-f9c5-4cc4-9bf5-d0c3b21fecc4', embedding=None, metadata={'page_label': '1', 'Title of this paper': 'A novel improved fuzzy support vector machine based stock price trend forecast model', 'Authors': 'Shuheng Wang, Guohao Li, Yifan Bao', 'Date published': '01/02/2018', 'URL': 'http://arxiv.org/abs/1801.00681v1', 'file_name': '47c74d91799735819653df28d2e41ef1.pdf'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text=" \n A novel  improved  fuzzy  support  vector  machine  based  stock  price  \ntrend  forecast  model \n \nShuheng  Wang1, Guohao Li2,  and Yifan  Bao3  \n1 UCSD  Department  of Mathematics,  San Diego,  CA, USA; \n2 Marshall  School  of Business,  University  of Southern  California,  Los Angeles,  CA, USA; \n3 China  Economics  and Management  Academy,  Central  University  of Finance  and Economics,  \nChina.  \nKeywords:  NASDAQ  Stock  Market,  Standard  & Poor's  (S&P)  Stock  market,  support  vector  \nmachine,

----

# Instantiate relevant models

In order to utilise the HuggingFace Pipeline for RAG, we will require:

- Embedding model (**[all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)**)

   - Purpose: to map text to dense, relatively low-dimensional vectors (384 dimensions for this model), so that texts with similar meaning end up close together in the vector space.

- LLM (**[Llama 2 13B Chat](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf)**)

- Tokenizer

  - Purpose: to split raw text into tokens and map them to the integer IDs that the LLM takes as input (and to decode generated IDs back into text).





  Reference: https://medium.com/@prudhviraju.srivatsavaya/embedding-layer-vs-tokenizer-a1e4ade764e3
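To make the difference between the two steps concrete, here is a toy sketch. The vocabulary, the three-dimensional vectors and the helper names (`toy_tokenize`, `cosine_similarity`) are invented for illustration; a real tokenizer uses a learned subword vocabulary, and a real embedding model produces far longer vectors.

```python
import math

# Hypothetical toy vocabulary; real tokenizers learn thousands of subword entries.
TOY_VOCAB = {"<unk>": 0, "support": 1, "vector": 2, "machine": 3, "stock": 4}

def toy_tokenize(text: str) -> list[int]:
    """Map each whitespace-separated word to an integer ID (0 for unknown words)."""
    return [TOY_VOCAB.get(word, TOY_VOCAB["<unk>"]) for word in text.lower().split()]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Tokenizer output: a sequence of integer IDs, one per token.
token_ids = toy_tokenize("Support vector machine forecasts")

# Embedding output: one dense vector per text (values made up for illustration).
emb_svm = [0.9, 0.1, 0.2]
emb_kernel_methods = [0.8, 0.2, 0.3]
emb_cooking = [0.1, 0.9, 0.1]

sim_related = cosine_similarity(emb_svm, emb_kernel_methods)
sim_unrelated = cosine_similarity(emb_svm, emb_cooking)

print(token_ids)
print(round(sim_related, 3), round(sim_unrelated, 3))
```

The tokenizer only assigns IDs, with no notion of meaning; the embedding vectors are what let a retriever rank related texts above unrelated ones.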

In [None]:
from torch import cuda
from langchain.embeddings.huggingface import HuggingFaceEmbeddings

embed_model_id = 'sentence-transformers/all-MiniLM-L6-v2'

device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'
print(device) #check that cuda is detected

embed_model = HuggingFaceEmbeddings(
    model_name=embed_model_id,
    model_kwargs={'device': device},
    encode_kwargs={'device': device, 'batch_size': 16}
)

cuda:0



In [None]:
from google.colab import userdata
from torch import cuda, bfloat16
import transformers

model_id = 'meta-llama/Llama-2-13b-chat-hf'

device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'

# set quantization configuration to load large model with less GPU memory
# this requires the `bitsandbytes` library
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=bfloat16
)

# begin initializing HF items, need auth token for these
hf_auth = userdata.get('huggingfaceAPIkey')

model_config = transformers.AutoConfig.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)

model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    config=model_config,
    quantization_config=bnb_config,
    device_map='auto',
    use_auth_token=hf_auth,
    #cache_dir='' #if running local
 )


model.eval()
print(f"Model loaded on {device}")





Model loaded on cuda:0
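A rough back-of-the-envelope estimate shows why 4-bit quantization is used here. The sketch below counts weight storage only and assumes roughly 13 billion parameters, as the model name suggests; it ignores activations, the key/value cache and the small overhead of quantization constants, so real usage will be higher.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

# Assumption: roughly 13 billion parameters, as the model name suggests.
NUM_PARAMS = 13_000_000_000

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # half-precision weights
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)    # 4-bit quantized weights

print(f"16-bit weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights:  ~{nf4_gb:.1f} GB")
```

Under these assumptions the 16-bit weights alone would not fit on, say, a 16 GB GPU, while the 4-bit weights leave room for inference.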


In [None]:
tokenizer = transformers.AutoTokenizer.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)
generate_text = transformers.pipeline(
    model=model, tokenizer=tokenizer,
    return_full_text=True,
    task='text-generation',

    temperature=0.01,  # sampling temperature (must be > 0): values near 0 make outputs almost deterministic, larger values add randomness
    max_new_tokens=1000,  # max number of tokens to generate in the output
    repetition_penalty=1.1  # without this output begins repeating
)
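To illustrate what the `temperature` argument does when sampling is enabled, the sketch below applies a temperature-scaled softmax to a few made-up next-token scores. The logits and the function name `softmax_with_temperature` are hypothetical; this is the textbook formula, not code taken from the `transformers` library.

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits for three candidate tokens.
logits = [2.0, 1.0, 0.5]

probs_default = softmax_with_temperature(logits, 1.0)
probs_cold = softmax_with_temperature(logits, 0.01)

print([round(p, 3) for p in probs_default])
print([round(p, 3) for p in probs_cold])
```

At a temperature of 0.01 almost all probability mass falls on the highest-scoring token, so sampling behaves nearly like greedy decoding.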





In [None]:
from llama_index import VectorStoreIndex
from llama_index import ServiceContext
from langchain.llms import HuggingFacePipeline


llm = HuggingFacePipeline(pipeline=generate_text)

service_context = ServiceContext.from_defaults(
    llm=llm, embed_model=embed_model
)
index = VectorStoreIndex.from_documents(documents,
                                        service_context=service_context)

In [None]:
query_engine = index.as_query_engine()

In [None]:
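# Build the prompt from adjacent string literals so that no source-code
# indentation leaks into the text sent to the LLM.
response = query_engine.query(
    f"What are some common applications of {topic}? "
    "Answer based on the provided documents, and provide the relevant extract "
    "from the documents to support the answer."
)
print(str(response))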

 Some common applications of Support Vector Machines (SVMs) include text classification, image classification, bioinformatics, and medical diagnosis. SVMs have been used in various fields such as finance, marketing, and healthcare. For example, SVMs have been used to predict stock prices, classify customer responses, and identify genes related to diseases.

Here are some relevant extracts from the provided documents to support the answer:

From the first document: "Support Vector Machines (SVMs) have been successfully applied to a wide range of applications, including text classification, image classification, bioinformatics, and medical diagnosis."

From the second document: "In this paper, we study the support vector machine and introduced the notion of generalized support vector machine for classification of data. We show that the problem of generalized support vector machine is equivalent to the problem of generalized variational inequality and establish various results for the exi

Inspect the `Response` object returned by the query engine.

In [None]:
response

Identify which text chunks the RAG pipeline is retrieving from the context documents to answer the question.

Observe that they do not seem to be relevant to the question.

In [None]:
for c in response.source_nodes:
  print(c.text)
  print()

The following is a summary of the paper: Learning properties of Support Vector Machines

Summary: We study the typical learning properties of the recently proposed Support
Vectors Machines. The generalization error on linearly separable tasks, the
capacity, the typical number of Support Vectors, the margin, and the robustness
or noise tolerance of a class of Support Vector Machines are determined in the
framework of Statistical Mechanics. The robustness is shown to be closely
related to the generalization properties of these machines.

The following is a summary of the paper: Linear Classification of data with Support Vector Machines and Generalized Support Vector Machines

Summary: In this paper, we study the support vector machine and introduced the notion
of generalized support vector machine for classification of data. We show that
the problem of generalized support vector machine is equivalent to the problem
of generalized variational inequality and establish various results for t
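One likely reason weakly related chunks show up: a plain top-k retriever returns the k nearest chunks even when none of them is a good match. The sketch below illustrates this with made-up vectors. We assume the query engine performs top-k similarity search without a score cutoff; the `min_score` parameter is a hypothetical addition showing how a cutoff would change the result.

```python
def dot(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))

def top_k(query_vec, chunks, k=2, min_score=None):
    """Return up to k (score, label) pairs, best first, optionally dropping weak matches."""
    scored = sorted(
        ((dot(query_vec, vec), label) for label, vec in chunks),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    if min_score is not None:
        scored = [pair for pair in scored if pair[0] >= min_score]
    return scored

# Made-up unit vectors standing in for chunk embeddings.
chunks = [
    ("chunk A: learning theory", [0.6, 0.8, 0.0]),
    ("chunk B: variational inequalities", [0.28, 0.96, 0.0]),
    ("chunk C: video features", [0.0, 0.0, 1.0]),
]
query_vec = [1.0, 0.0, 0.0]  # made-up embedding of a question about applications

always_two = top_k(query_vec, chunks, k=2)
only_strong = top_k(query_vec, chunks, k=2, min_score=0.5)

print(always_two)
print(only_strong)
```

Without a cutoff, chunk B is returned despite its low score; with the hypothetical cutoff only chunk A survives.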

---

# Evaluating the quality of answers from the RAG pipeline

LlamaIndex has some built-in metrics for evaluating RAG pipelines.


Here, we test the `FaithfulnessEvaluator` from LlamaIndex. The results of the evaluation are displayed in a `pandas` DataFrame.

In [None]:
# attach to the same event-loop
import nest_asyncio

nest_asyncio.apply()

In [None]:
# configuring logger to INFO level
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

In [None]:
from llama_index import (
    TreeIndex,
    VectorStoreIndex,
    SimpleDirectoryReader,
    ServiceContext,
    Response,
)
from llama_index.evaluation import FaithfulnessEvaluator
import pandas as pd

pd.set_option("display.max_colwidth", 0)

In [None]:
# in practice, a different LLM should be used as evaluator
# However, due to Colab resource constraints, we use the same LLM for convenience

service_context_llama2 = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

evaluator = FaithfulnessEvaluator(service_context=service_context_llama2)
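Faithfulness asks whether a response is supported by the retrieved context. The LlamaIndex evaluator relies on an LLM to make that judgement; the sketch below deliberately swaps in a crude word-overlap heuristic just to make the idea tangible. The function names, the 0.5 threshold and the example sentences other than the context (which is taken from one of the retrieved summaries) are all hypothetical.

```python
import re

def content_words(text: str) -> set[str]:
    """Lower-cased words longer than three characters, as a crude content filter."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}

def supported_fraction(response: str, context: str, threshold: float = 0.5) -> float:
    """Share of response sentences whose content words mostly occur in the context."""
    context_vocab = content_words(context)
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]
    if not sentences:
        return 0.0
    supported = 0
    for sentence in sentences:
        words = content_words(sentence)
        if words and len(words & context_vocab) / len(words) >= threshold:
            supported += 1
    return supported / len(sentences)

context = (
    "The robustness is shown to be closely related to the "
    "generalization properties of these machines."
)
grounded = "Robustness is closely related to generalization properties."
ungrounded = "SVMs are widely used for fraud detection in banking."

print(supported_fraction(grounded, context))
print(supported_fraction(ungrounded, context))
```

A response can be perfectly true and still score zero here, which mirrors the point of the metric: it measures grounding in the retrieved context, not general correctness.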



In [None]:
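# define jupyter display function
# `eval_result` is the result object returned by the evaluator (not a string);
# only its boolean `passing` attribute is used here
def display_eval_df(response, eval_result) -> None:
    if not response.source_nodes: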
        print("no response!")
        return
    eval_df = pd.DataFrame(
        {
            "Response": str(response),
            "Source": response.source_nodes[0].node.text[:1000] + "...",
            "Evaluation Result": "Pass" if eval_result.passing else "Fail",
        },
        index=[0],
    )
    eval_df = eval_df.style.set_properties(
        **{
            "inline-size": "600px",
            "overflow-wrap": "break-word",
        },
        subset=["Response", "Source"]
    )
    display(eval_df)

In [None]:
topic = "Support Vector Machines"
test_query = f"What are some common applications of {topic}?"
response_vector = query_engine.query(test_query)
eval_result = evaluator.evaluate_response(response=response_vector)



In [None]:
display_eval_df(response_vector, eval_result)

Unnamed: 0,Response,Source,Evaluation Result
0,"Support Vector Machines (SVMs) have several common applications in areas such as text classification, image classification, bioinformatics, and finance. SVMs can be used for binary or multi-class classification problems, and they are particularly effective when dealing with high-dimensional data or noisy data. Some specific applications include: 1. Sentiment Analysis: SVMs can be used to classify text documents as positive, negative, or neutral based on their sentiment. 2. Image Recognition: SVMs can be used to recognize objects in images by analyzing features such as color, texture, and shape. 3. Bioinformatics: SVMs can be used to classify proteins into different families based on their structural features. 4. Fraud Detection: SVMs can be used to detect fraudulent transactions in financial data by analyzing patterns and anomalies. 5. Medical Diagnosis: SVMs can be used to diagnose diseases based on medical images or patient data. 6. Recommendation Systems: SVMs can be used to recommend products or services based on user preferences and past behavior. 7. Quality Control: SVMs can be used to classify products into different quality levels based on their features. 8. Customer Segmentation: SVMs can be used to segment customers based on their demographic and behavioral data. 9. Risk Prediction: SVMs can be used to predict the risk of certain events such as credit card fraud or disease outbreaks based on historical data. 10. Traffic Signal Control: SVMs can be used to optimize traffic signal control systems to reduce congestion and improve traffic flow.","The following is a summary of the paper: Learning properties of Support Vector Machines Summary: We study the typical learning properties of the recently proposed Support Vectors Machines. 
The generalization error on linearly separable tasks, the capacity, the typical number of Support Vectors, the margin, and the robustness or noise tolerance of a class of Support Vector Machines are determined in the framework of Statistical Mechanics. The robustness is shown to be closely related to the generalization properties of these machines....",Fail


In [None]:
# test on multiple questions about SVMs, manually generated

questions = [
    "Provide an example of a situation where applying the kernel trick is advantageous, and discuss the computational benefits.",

    "Discuss the challenges posed by imbalanced datasets in the context of support vector machines.",

    "Explain strategies to mitigate the impact of class imbalance when training support vector machines.",

    "What are the differences between support vector machines and other popular classification algorithms such as logistic regression or decision trees?",

    "In what scenarios are support vector machines likely to outperform or underperform compared to alternative methods?"
]

In [None]:
responses = []
results = []
sources = []

for q in questions:
  response_vector = query_engine.query(q)
  eval_result = evaluator.evaluate_response(response=response_vector)
  responses.append(str(response_vector))
  sources.append(response_vector.source_nodes[0].node.text[:1000] + "...")
  results.append("Pass" if eval_result.passing else "Fail")



In [None]:
import pandas as pd

df = pd.DataFrame(
    {
        "Response": responses, "Source": sources, "Result": results
    }
)

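# `set_properties` returns a new Styler rather than modifying `df`,
# so keep the result and display it instead of the raw DataFrame
styled_df = df.style.set_properties(
    **{
        "inline-size": "600px",
        "overflow-wrap": "break-word",
    },
    subset=["Response", "Source"],
)
display(styled_df)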

Unnamed: 0,Response,Source,Result
0,"\nIn situations where the feature space is high dimensional and the number of training examples is limited, applying the kernel trick can be advantageous. This is because the kernel trick allows us to map the data from the input space to a higher dimensional feature space, where the training examples can be more evenly distributed. This can lead to better generalization performance and improved robustness to noise and outliers. Additionally, the kernel trick can provide computational benefits by allowing us to avoid computing the dot product between the data points in the input space, which can be computationally expensive when working with large datasets. Instead, we can compute the dot product in the feature space, which can be more efficient. For example, in text classification tasks, the kernel trick can be used to map the text documents to a higher dimensional space where the similarity between documents can be captured more accurately.","More\ngeneral SVMs, that use a Kernel K(J,Φ(ξ)) instead of\nthe inner product in Eq.(2), have been proposed [1], but\nwe restrict to the inner product in the following.\nThe SV-margin is\nκmax(J∗) = max\nJinf\nµγµ, (3)\nwhereJ∗, the MSP weight vector in feature-space,\nis a linear combination of the SV [1,6], J∗=∑\nµ∈SVxµτµΦ(ξµ). Thexµare positive parameters to\nbe determined by the learning algorithm, which has to\ndetermine also the number of SV. Generally, this num-\nber is small compared with the feature-space dimension,\nafactthatallowstoincreasethelatterconsiderablywith-\nout increasing dramatically the number of parameters to\nbedetermined. TheSVMininput-space( k= 0)orlinear\nSVMisthe usualMSP,whosepropertieshaveextensively\nbeen studied (see [14] and references therein).\nWe obtain the generic properties of the SVM through\nthe by now standard replica approach [15]. Results are\nobtained in the thermodynamic limit, in which the input\nspace dimension and the number of training patterns go\nto inﬁnity...",Fail
1,"\n\nChallenges posed by imbalanced datasets in the context of support vector machines include:\n\n1. Biased models: SVMs trained on imbalanced datasets tend to bias towards the majority class, leading to poor performance on minority classes.\n2. Overfitting: SVMs may overfit to the majority class, resulting in poor generalization performance on new data.\n3. Underfitting: SVMs may fail to capture the underlying patterns in the minority class, leading to poor performance overall.\n4. Evaluation metrics: Traditional evaluation metrics such as accuracy may not accurately reflect the performance of SVMs on imbalanced datasets, leading to misleading conclusions.\n5. Hyperparameter tuning: Imbalanced datasets require careful tuning of hyperparameters such as regularization parameter and kernel parameter to avoid overfitting or underfitting.\n6. Class weighting: SVMs can benefit from class weighting techniques to address the imbalance issue, but selecting appropriate weights can be challenging.\n7. Ensemble methods: Combining multiple SVM models or using ensemble methods such as bagging and boosting can help improve performance on imbalanced datasets.\n8. Cost-sensitive learning: SVMs can be modified to incorporate cost-sensitive learning, where different classes have different costs associated with misclassification errors. This can help address the imbalance issue by penalizing more heavily incorrect predictions for the minority class.","Researcher focuses on designing modifications to support vector machines to fittingly \nhandle the issue of class lopsidedness. Diverse rebalance heuristics in support vector \nmachines displaying, including cost-sensitive learning, and over and under sampling has \nbeen proposed. 
These support vector machines based strategies are contrasted and \nvarious state -of-the-workmanship approaches on an assortment of information sets by \nusing various metrics, territory under the beneficiary working characteristic bend, and zone \nunder the precision/review bend. It is shown that it is possible to surpass or coordinate the \npreviously known best algorithms on every information set. \n \nResearcher demonstrates how such features can be used for perceiving complex \nmovement patterns. Video representations in terms of neighborhood space -time features...",Fail
2,"\nClass imbalance is a common problem in machine learning where one class has a significantly larger number of instances than the other classes. This can negatively impact the performance of support vector machines (SVMs) when training on imbalanced data. To mitigate the impact of class imbalance, several strategies can be employed:\n\n1. Cost-sensitive learning: In this approach, different misclassification costs are assigned to each class. The cost of misclassifying a minority class instance is set higher than the cost of misclassifying a majority class instance. This encourages the SVM to pay more attention to the minority class during training.\n\n2. Over-sampling the minority class: This involves creating additional instances of the minority class to balance the dataset. Synthetic samples can be generated using techniques such as oversampling, undersampling, or SMOTE (Synthetic Minority Over-sampling Technique).\n\n3. Under-sampling the majority class: This involves reducing the number of instances of the majority class to balance the dataset. Techniques such as Tomek links, Edited nearest neighbors, and random undersampling can be used to reduce the number of instances.\n\n4. Using class weights: Class weights can be applied to penalize the SVM for misclassifying the minority class. The penalty term is added to the loss function, which encourages the SVM to pay more attention to the minority class.\n\n5. Ensemble methods: Combining multiple models can help improve the overall performance of the SVM. Ensemble methods such as bagging and boosting can be used to combine the predictions of multiple models trained on different subsets of the data.\n\n6. Using fuzzy SVM: Fuzzy SVM is a variation of SVM that allows for fuzzy membership degrees instead of crisp labels. This can help handle class imbalance issues by assigning different membership degrees to each class.\n\n7. 
Using kernel methods: Kernel methods such as the radial basis function kernel can help map the data into a higher dimensional space where the classes are better separated. This can help improve the performance of the SVM on imbalanced data.\n\n8. Using regularization: Regularization techniques such as L1 and L2 regularization can help reduce overfitting and improve the generalization performance of the SVM.\n\nThese strategies can be combined and tailored to specific problem domains to mitigate the impact of class imbalance when training support vector machines.","Researcher focuses on designing modifications to support vector machines to fittingly \nhandle the issue of class lopsidedness. Diverse rebalance heuristics in support vector \nmachines displaying, including cost-sensitive learning, and over and under sampling has \nbeen proposed. These support vector machines based strategies are contrasted and \nvarious state -of-the-workmanship approaches on an assortment of information sets by \nusing various metrics, territory under the beneficiary working characteristic bend, and zone \nunder the precision/review bend. It is shown that it is possible to surpass or coordinate the \npreviously known best algorithms on every information set. \n \nResearcher demonstrates how such features can be used for perceiving complex \nmovement patterns. Video representations in terms of neighborhood space -time features...",Fail
3,"\nSupport Vector Machines (SVMs) differ from other popular classification algorithms like logistic regression or decision trees in several ways. Here are some key differences:\n\n1. Decision boundary: SVMs aim to find the optimal hyperplane that maximizes the margin (distance) between the classes, whereas logistic regression seeks to find the line of best fit for the data. Decision trees, on the other hand, use a tree-like structure to classify instances.\n2. Training data: SVMs require labeled training data, while logistic regression can handle both labeled and unlabeled data. Decision trees can work with either labeled or unlabeled data but tend to perform better with labeled data.\n3. Hyperparameter tuning: SVMs have more hyperparameters than logistic regression or decision trees, which can make them more challenging to optimize. In contrast, logistic regression has fewer hyperparameters, and decision trees have no hyperparameters to tune.\n4. Robustness: SVMs are known for their robustness against noise and outliers in the data, while logistic regression can be sensitive to these issues. Decision trees can also be robust but may struggle with high-dimensional data.\n5. Scalability: SVMs can be computationally expensive and less scalable than logistic regression or decision trees, especially when dealing with large datasets.\n6. Interpretability: SVMs can be less interpretable than logistic regression or decision trees, as the hyperplane or kernel used for classification may not provide clear insights into the relationships between the features and target variable.\n7. Handling missing values: SVMs cannot handle missing values directly, while logistic regression and decision trees can handle missing values or impute them using various methods.\n8. 
Model selection: SVMs have a unique set of models (e.g., linear, polynomial, radial basis function), while logistic regression and decision trees have multiple models within each algorithm (e.g., logistic, probit, and logit for logistic regression; decision trees with different splits and branching strategies).\n9. Non-linearity: SVMs can handle non-linear relationships between features and the target variable using kernel functions, while logistic regression and decision trees are typically limited to linear relationships.\n10. Multi-class classification: SVMs can be extended to multi-class classification problems, while logistic regression and decision trees may struggle with these types of problems or require additional techniques like one-vs-one or one-vs-all strategies.\n\nIn summary, SVMs offer robustness, flexibility, and the ability to handle non-linear relationships, but they can be computationally expensive and less interpretable than other popular classification algorithms. Logistic regression and decision trees are simpler, more interpretable, and more scalable, but they may struggle with noisy or high-dimensional data and lack the robustness of SVMs.","Support Vector Machines is based on the concept of decision planes that\ndeﬁne decision boundaries. A decision plane is one that separates be tween a\nset of objects having diﬀerent class memberships.\nSupportVectorMachinescanbethoughtofasamethodforconst ructinga\nspecial kind of rule, called a linear classiﬁer, in a way that produces cla ssiﬁers\nwith theoretical guarantees of good predictive performance (th e quality of\nclassiﬁcation on unseen data).\nInthispaper, westudytheproblemsofsupportvectormachine an ddeﬁne\ngeneralized support vector machine. We also show the suﬃcient con ditions\nfor the existence of solutions for problems of generalized support vector ma-\nchine. 
We also support our results with various examples.\nThought this paper, by N,R,RnandR+\nnwe denote the set of all natural\nnumbers, the set of all real numbers, the set of all n-tuples real numbers, the\nset of all n-tuples of nonnegative real numbers, respectively.\nAlso, we consider ∥·∥and<·,·>as Euclidean norm and usual inner...",Fail
4,"\n\nSupport Vector Machines (SVMs) are a popular machine learning algorithm used for classification and regression tasks. Given the context information provided, here are some scenarios where SVMs may outperform or underperform compared to alternative methods:\n\nOutperform:\n\n1. High-dimensional spaces: SVMs are well suited for high-dimensional spaces because they can find hyperplanes that separate the data into different classes. In such cases, SVMs may outperform other algorithms that struggle with high-dimensional data.\n2. Non-linearly separable data: SVMs can handle non-linearly separable data by using kernel functions, which transform the data into higher dimensional spaces where the data becomes linearly separable. In such cases, SVMs may outperform other algorithms that rely solely on linear separation.\n3. Noisy data: SVMs are robust to noisy data because they focus on the margins between classes rather than the individual data points. In such cases, SVMs may outperform other algorithms that are sensitive to noisy data.\n\nUnderperform:\n\n1. Low-dimensional spaces: SVMs are computationally expensive and require a large amount of data to perform well. In low-dimensional spaces, other algorithms like linear regression or logistic regression may outperform SVMs due to their computational efficiency.\n2. Linearly separable data: SVMs are not necessary when the data is linearly separable, and other algorithms like linear regression or logistic regression may outperform SVMs in such cases.\n3. Overfitting: SVMs can suffer from overfitting, especially when the dataset is too small or when the regularization parameter is not properly tuned. In such cases, other algorithms like Lasso regression or Ridge regression may outperform SVMs due to their built-in regularization mechanisms.\n\nIn conclusion, SVMs are a powerful algorithm for classification and regression tasks, but they may not always be the best choice depending on the specific problem and dataset. 
It's essential to consider the strengths and limitations of SVMs and compare them to other algorithms based on the context of the problem at hand.","is a crucial topic – the more so as support vector machines are frequently\napplied to large and complex high-dimensional data sets.\nIn this article, we showed that support vector machines are qualitatively\nrobust with a ﬁxed regularization parameter λ∈(0,∞), i.e., the perfor-\nmance of support vector machines is hardly aﬀected by the following two\nkinds of errors: large errors in a small fraction of the data set and small\nerrors in the whole data set. This not only means that these errors do not\nlead to large errors in the support vector machines but also that even the\nﬁnite sample distribution of support vector machines is hardly aﬀected.\nIn contrast to that, we also showed that support vector machines are\nnotqualitatively robust any more under extremely mild conditions, if the\nﬁxed regularization parameter λis replaced by a sequence of parameters\nλn∈(0,∞) which decreases to 0 with increasing sample size n. From our\npoint of view, this is an important result as all universal consistenc...",Fail
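A one-number summary of the table above can be computed from the `Result` column. This is a small sketch; `pass_rate` is a hypothetical helper, and the list simply repeats the five outcomes shown in the table.

```python
def pass_rate(results: list[str]) -> float:
    """Fraction of evaluation outcomes labelled "Pass" (0.0 for an empty list)."""
    return results.count("Pass") / len(results) if results else 0.0

# Outcomes copied from the "Result" column of the table above.
results = ["Fail", "Fail", "Fail", "Fail", "Fail"]

rate = pass_rate(results)
print(f"Faithfulness pass rate: {rate:.0%} ({results.count('Pass')}/{len(results)})")
```

Because the same LLM acts as both generator and evaluator in this notebook, the number is best read as a rough signal rather than a rigorous score.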



---------


All five responses above fail the faithfulness check, so more fine-tuning or more advanced RAG techniques are needed before the answers are reliably grounded in the retrieved context.

However, this demonstrates a very basic end-to-end pipeline involving:

1) Ingestion of documents


2) Storing documents in a suitable vector store to be retrieved as context

3) Retrieval Augmented Generation (RAG), where the LLM draws on the stored documents to answer the given question

-  Identifying the relevant chunks of text that were used to answer the question

4) Evaluation of the pipeline using built-in LlamaIndex metrics