<img src = "https://github.com/VeryFatBoy/notebooks/blob/main/common/images/img_github_singlestore-jupyter_featured_2.png?raw=true">

<div id="singlestore-header" style="display: flex; background-color: rgba(235, 249, 245, 0.25); padding: 5px;">
    <div id="icon-image" style="width: 90px; height: 90px;">
        <img width="100%" height="100%" src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/header-icons/browser.png" />
    </div>
    <div id="text" style="padding: 5px; margin-left: 10px;">
        <div id="badge" style="display: inline-block; background-color: rgba(0, 0, 0, 0.15); border-radius: 4px; padding: 4px 8px; align-items: center; margin-top: 6px; margin-bottom: -2px; font-size: 80%">SingleStore Notebooks</div>
        <h1 style="font-weight: 500; margin: 8px 0 0 4px;">How to Build Local LLM RAG Apps with Ollama, DeepSeek-R1 and SingleStore</h1>
    </div>
</div>

<div class="alert alert-block alert-warning">
    <b class="fa fa-solid fa-exclamation-circle"></b>
    <div>
        <p><b>Action Required</b></p>
        <p>Create an environment variable to point to your SingleStore instance, as follows:</p>
        <p>export SINGLESTOREDB_URL="&lt;username&gt;:&lt;password&gt;@&lt;host&gt;:&lt;port&gt;/&lt;database&gt;"</p>
    </div>
</div>
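
Before running the cells below, it can help to confirm that the variable is set and has the expected shape. The following is a minimal sketch and not part of the original notebook: `parse_connection_url` is a hypothetical helper built only on the standard library, the credentials shown are placeholders, and it assumes any special characters in the password are percent-encoded.

```python
import os
from urllib.parse import urlsplit


def parse_connection_url(url):
    """Split '<username>:<password>@<host>:<port>/<database>' into its parts.

    Hypothetical helper for illustration only; special characters in the
    password are assumed to be percent-encoded.
    """
    # Prefixing "//" makes urlsplit treat the string as a network location.
    parts = urlsplit("//" + url)
    fields = {
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }
    missing = [name for name, value in fields.items() if not value]
    if missing:
        raise ValueError(f"Connection URL is missing: {', '.join(missing)}")
    return fields


# Placeholder values only; substitute your own credentials.
example = parse_connection_url("admin:secret@localhost:3306/fintech")

url = os.environ.get("SINGLESTOREDB_URL")
if url is None:
    print("SINGLESTOREDB_URL is not set")
else:
    print("Connecting to", parse_connection_url(url)["host"])
```

Checking the URL up front gives a clearer error than a failed connection later in the notebook.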

In [1]:
!pip cache purge --quiet


In [2]:
!pip install langchain --quiet
!pip install langchain-community --quiet
!pip install langchain-ollama --quiet
!pip install ollama --quiet
!pip install pdf2image --quiet
!pip install pdfminer.six --quiet
!pip install singlestoredb --quiet
!pip install unstructured==0.10.14 --quiet

In [3]:
import nltk
import ollama
import re

from langchain_community.document_loaders import OnlinePDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import SingleStoreDB
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_ollama import OllamaEmbeddings

In [4]:
embedding_model = "all-minilm"

ollama.pull(embedding_model)

ProgressResponse(status='success', completed=None, total=None, digest=None)

In [5]:
llm = "deepseek-r1:1.5b"

ollama.pull(llm)

ProgressResponse(status='success', completed=None, total=None, digest=None)

In [6]:
nltk.download("punkt_tab")
nltk.download("averaged_perceptron_tagger_eng")

[nltk_data] Downloading package punkt_tab to
[nltk_data]     /Users/akmalchaudhri/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.
[nltk_data] Downloading package averaged_perceptron_tagger_eng to
[nltk_data]     /Users/akmalchaudhri/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger_eng.zip.


True

In [7]:
loader = OnlinePDFLoader("https://www.investni.com/sites/default/files/2021-02/NI-fintech-document.pdf")

data = loader.load()

[nltk_data] Downloading package punkt to
[nltk_data]     /Users/akmalchaudhri/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /Users/akmalchaudhri/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.


In [8]:
print(f"You have {len(data)} document(s) in your data")
print(f"There are {len(data[0].page_content)} characters in your document")

You have 1 document(s) in your data
There are 39674 characters in your document


In [9]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 2000,
    chunk_overlap = 20
)
texts = text_splitter.split_documents(data)

print(f"You have {len(texts)} chunks")

You have 23 chunks
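
The effect of `chunk_size` and `chunk_overlap` can be sketched with a plain sliding window. This is a simplified illustration, not the algorithm used by `RecursiveCharacterTextSplitter`, which also prefers to split on separators such as paragraph and line breaks, so its chunk count will generally differ. The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows that share chunk_overlap characters.

    Simplified illustration only; the real splitter also respects
    separators, so its boundaries will not match these.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


demo = sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)
print(demo)
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so neighbouring chunks repeat a small amount of text at their boundary.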


In [10]:
embeddings = OllamaEmbeddings(
    model = embedding_model,
)

dimensions = len(embeddings.embed_query(texts[0].page_content))

In [11]:
docsearch = SingleStoreDB.from_documents(
    texts,
    embeddings,
    table_name = "fintech_docs",
    distance_strategy = DistanceStrategy.DOT_PRODUCT,
    use_vector_index = True,
    vector_size = dimensions
)
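
With `DistanceStrategy.DOT_PRODUCT`, a larger dot product between the query embedding and a stored embedding means a closer match. The sketch below shows that ranking idea in plain Python on toy two-dimensional vectors. It is an assumption-laden illustration, not how SingleStore executes the search: `dot` and `rank_by_dot_product` are hypothetical names, and the database performs the equivalent comparison itself, optionally through the vector index.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


def rank_by_dot_product(query, vectors, k):
    """Return the indices of the k vectors with the largest dot product."""
    order = sorted(
        range(len(vectors)),
        key=lambda i: dot(query, vectors[i]),
        reverse=True,  # Higher dot product means more similar.
    )
    return order[:k]


# Toy embeddings; real ones have as many entries as `dimensions`.
stored = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
query = [1.0, 0.0]

best = rank_by_dot_product(query, stored, k=2)
print(best)
```

The dot product matches cosine similarity only when the vectors are normalized to unit length, which is worth checking for whichever embedding model is used.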

In [12]:
prompt = "What are the best investment opportunities in Blockchain?"
docs = docsearch.similarity_search(prompt)
data = docs[0].page_content
print(data)

Within our well respected financial and related professional services cluster, global leaders including Deloitte and PwC are currently working on the application of blockchain solutions within insurance, digital banking and cross-border payments.

PwC

Vox Financial Partners

The PwC global blockchain impact centre in Belfast comprises a team of fintech professionals with deep expertise and a proven record of delivery of insurance, banking, e-commerce and bitcoin products and services. The Belfast team is exploring the application of this disruptive technology to digital currencies, digital assets, identity and smart contracts. The specialist team has already delivered a significant proof of concept project for the Bank of England, to investigate the capability of distributed ledger technology.

www.pwc.co.uk

Founded in 2016, the Belfast based Fintech consultancy Vox Financial Partners works with top-tier banks and broker- dealer clients in the US and Europe. Vox offers high quality r

In [13]:
output = ollama.generate(
    model = llm,
    prompt = f"Using this data: {data}. Respond to this prompt: {prompt}."
)

content = output["response"]
remove_think_tags = True

if remove_think_tags:
    content = re.sub(r"<think>.*?</think>", "", content, flags = re.DOTALL)

print(content)
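
The tag-stripping step above can be tried on its own without calling the model. This is a minimal sketch on a made-up response string: `strip_think` is a hypothetical helper, and the trailing `.strip()` is an addition that the cell above does not perform, which is why its output begins with blank lines.

```python
import re

# Non-greedy match so each <think>...</think> block is removed separately;
# DOTALL lets "." span the newlines inside the block.
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_think(text):
    """Remove <think> blocks and surrounding whitespace from a response."""
    return THINK_BLOCK.sub("", text).strip()


# Made-up response for illustration; not real model output.
sample = (
    "<think>\nThe user asks about blockchain.\n</think>\n\n"
    "Answer: see the PwC section."
)

cleaned = strip_think(sample)
print(cleaned)
```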



**Best Investment Opportunities in Blockchain Technology**

1. **PwC and Deloitte: Insurance and Banking with Blockchain**
   - **Focus:** Utilizes blockchain for secure transactions, cross-border payments, and insurance solutions.
   - **Opportunities:** Explores innovative applications beyond traditional methods, such as digital currencies and smart contracts.

2. ** Vox Financial Partners: Identity and Smart Contracts**
   - **Focus:** Delivers structured contract drafting tools (Opal) on permissioned blockchain, aiming to enhance identity verification and secure payments.
   - **Opportunities:** Offers potential for innovative projects in identity management, leveraging blockchain's scalability benefits.

3. **Rakuten Blockchain Lab: Opal Software Application**
   - **Focus:** Implements DLT solutions for efficient contract management, which could be expanded or acquired for further development.
   - **Opportunities:** Provides scalable and secure project opportunities due to DLT