In [4]:
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter


In [5]:

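# Load the PDF; PyPDFLoader yields one Document per page.
pdf_loader = PyPDFLoader("svm.pdf")
documents_raw = pdf_loader.load()

# Split pages into chunks of up to 1000 characters, with up to 200 characters of overlap.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(documents=documents_raw)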
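
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal pure-Python sketch of fixed-size windows with overlap. It is only an illustration: `sliding_window_chunks` is a hypothetical helper, not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as paragraph and line breaks, so its chunks are usually shorter than the limit).

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Toy input (an assumption for illustration, not taken from svm.pdf).
sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.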


In [6]:
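# Embed each chunk with a local Ollama embedding model.
embedding_model = OllamaEmbeddings(model="nomic-embed-text:latest")

# from_documents is a classmethod, so call it on Chroma directly
# instead of first creating an empty Chroma() instance.
embeddings_vectorstore = Chroma.from_documents(documents, embedding_model)
retriever = embeddings_vectorstore.as_retriever()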
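
Conceptually, the retriever embeds the query and returns the chunks whose vectors are closest to it. The sketch below shows that idea with hand-written toy vectors and cosine similarity. The vectors, texts, and helper names are assumptions for illustration, and the distance metric Chroma actually uses may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]


# Toy "vector store": (chunk text, made-up 3-dimensional embedding).
toy_index = [
    ("SVMs find a maximum-margin hyperplane", [0.9, 0.1, 0.0]),
    ("Kernels define a dot product in feature space", [0.6, 0.7, 0.1]),
    ("Unrelated text about cooking", [0.0, 0.1, 0.9]),
]
query_vector = [1.0, 0.2, 0.0]  # stands in for the embedded user query
results = top_k(query_vector, toy_index, k=2)
print(results)
```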




In [10]:
from langchain_community.llms import Ollama
from langchain.prompts import ChatPromptTemplate

In [16]:
llm = Ollama(model="mistral:latest")
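# Adjacent string literals are concatenated, which avoids the stray
# indentation that backslash continuations would embed in the prompt.
prompt = ChatPromptTemplate.from_template(
    "You are an assistant for explaining things in a clear and concise manner. "
    "Use the context to give a detailed answer based on the user query.\n"
    "<context> {context} </context>\n"
    "<query> {input} </query>"
)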

In [17]:
from langchain.chains.retrieval import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

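# "Stuff" all retrieved chunks into the {context} slot of the prompt and call the LLM.
combine_docs_chain = create_stuff_documents_chain(llm=llm, prompt=prompt)

# Retrieve chunks for the {input} query, then pass them to the chain above.
retrieval_chain = create_retrieval_chain(
    retriever=retriever,
    combine_docs_chain=combine_docs_chain,
)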
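
As a rough mental model of the "stuff" step, the retrieved chunks are joined into one string and substituted for `{context}`, with the user query substituted for `{input}`. The sketch below reproduces that with plain string formatting. `stuff_prompt`, the shortened template, and the sample chunks are assumptions for illustration, not LangChain internals.

```python
# Shortened stand-in for the notebook's prompt template.
PROMPT_TEMPLATE = (
    "Use the context to answer the user query.\n"
    "<context> {context} </context>\n"
    "<query> {input} </query>"
)


def stuff_prompt(chunks: list[str], query: str, separator: str = "\n\n") -> str:
    """Join all chunks into a single context string and fill the template."""
    context = separator.join(chunks)
    return PROMPT_TEMPLATE.format(context=context, input=query)


# Hypothetical retrieved chunks, paraphrasing ideas from the PDF.
retrieved_chunks = [
    "f(x) = w.x + b defines a hyperplane.",
    "A kernel K induces the feature space.",
]
final_prompt = stuff_prompt(retrieved_chunks, "What is SVM?")
print(final_prompt)
```

Because every retrieved chunk goes into a single prompt, this approach only works while the combined chunks fit in the model's context window.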



In [18]:
retrieval_chain.invoke({"input":"What is svm explain it briefly in about 200 words"})

{'input': 'What is svm explain it briefly in about 200 words',
 'context': [Document(metadata={'page': 2, 'source': 'svm.pdf'}, page_content='Support Vector Machines formulation \nSupport Vector m achines realize th e ideas o utlined above. To  see wh y, we n eed specify two  things: the \nhypothesis sp aces u sed by SVM, an d the loss fun ctions used. The folklore v iew of SVM is t hat th ey find  an \n"optimal" hyperpl ane as t he solution to the learni ng problem. The si mplest formulation of SVM  is the linear one, \nwhere the hyperpla ne lies on the space of the input dat a x. In this case the hypot hesis space is a subset of all \nhyperplanes of  the form:  \nf(x) = w⋅x +b. \nIn their m ost general form ulation, SVM fi nd a hyperplane i n a space diffe rent from  that of the input dat a x. It is a  \nhype rplane in a feature space  induced by a kernel K (the kernel define s a dot product in that space (Wahba , 1990)). \nThrough the kernel K the hy pothesis space  is defined as a 

Support Vector Machines (SVM) are a type of supervised machine learning algorithm used for classification and regression analysis. The primary goal of SVM is to find an optimal hyperplane that separates the data points of different classes in a way that maximizes the margin, which is the distance between the hyperplane and the nearest data points from each class (support vectors).

In simpler terms, SVM attempts to find the best line or plane that can accurately separate the given dataset into distinct groups. The hyperplane identified by SVM is not necessarily linear; it may be a curve in high-dimensional spaces. To handle nonlinear data, SVM uses a kernel trick to map the input data to a higher dimension where the data can be linearly separable.

The SVM algorithm involves solving a quadratic optimization problem, which finds the coefficients of the hyperplane (w and b) that maximize the margin while minimizing an error function. The error function measures the misclassification rate of the support vectors. By optimizing this function, SVM can find the hyperplane that provides the best balance between generalization performance (avoiding overfitting) and fitting the training data accurately.

SVM has been successfully applied to various applications, such as time series prediction, face recognition, medical diagnosis, and more. Its theoretical foundations and experimental success encourage further research into its characteristics and potential uses. The report mentions that there are variations of standard SVM proposed in several papers, which can potentially improve the performance of SVM for specific tasks.
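
The chain returns a dictionary with `input`, `context`, and `answer` keys, so in practice it is often handier to pull out the answer text and the source pages than to print the whole structure. A minimal sketch follows, using a stand-in result so it runs without Ollama. `StubDocument`, `summarize_result`, and the shortened strings are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StubDocument:
    """Minimal stand-in for a LangChain Document (illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize_result(result: dict) -> tuple[str, list[tuple[str, int]]]:
    """Return the answer text and the distinct (source, page) pairs behind it."""
    answer = result["answer"].strip()
    sources = sorted(
        {
            (doc.metadata.get("source", "unknown"), doc.metadata.get("page", -1))
            for doc in result["context"]
        }
    )
    return answer, sources


# Stand-in for the dictionary returned by retrieval_chain.invoke(...).
stub_result = {
    "input": "What is svm explain it briefly in about 200 words",
    "context": [
        StubDocument("Support Vector Machines formulation", {"page": 2, "source": "svm.pdf"}),
        StubDocument("f(x) = w.x + b", {"page": 2, "source": "svm.pdf"}),
    ],
    "answer": " Support Vector Machines (SVM) are a type of supervised machine learning algorithm. ",
}

answer, sources = summarize_result(stub_result)
print(answer)
print(sources)
```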