Load the PDF and Split into Pages

In [58]:
from langchain.document_loaders import PyPDFLoader

pdf_path = "Chaos-Based_Image_Encryption_Review_Application_an.pdf"

llm_loader = PyPDFLoader(pdf_path)
pages = llm_loader.load_and_split()

print(f"pages len: {len(pages)}")

print(pages[0].page_content[:500])
print(pages[0].metadata) 

pages len: 61
Citation: Zhang, B.; Liu, L.
Chaos-Based Image Encryption:
Review, Application, and Challenges.
Mathematics 2023, 11, 2585. https://
doi.org/10.3390/math11112585
Academic Editor: Jonathan
Blackledge
Received: 8 May 2023
Revised: 1 June 2023
Accepted: 4 June 2023
Published: 5 June 2023
Copyright: © 2023 by the authors.
Licensee MDPI, Basel, Switzerland.
This article is an open access article
distributed under the terms and
conditions of the Creative Commons
Attribution (CC BY) license (https://
c
{'producer': 'pdfTeX-1.40.21', 'creator': 'LaTeX with hyperref', 'creationdate': '2023-06-06T11:27:08+08:00', 'author': 'Bowen Zhang and Lingfeng Liu', 'keywords': 'chaos; image encryption; chaotic system; chaos-based image encryption; chaotic map; cryptography', 'moddate': '2023-06-06T05:31:40+02:00', 'subject': 'Chaos has been one of the most effective cryptographic sources since it was first used in image-encryption algorithms. This paper closely examines the development proces

Split Pages into Text Chunks

In [17]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=50,
    separators=["\n\n", "\n", ". ", " ", ""]
)

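# Keep each chunk's real page number from the loader metadata;
# load_and_split() may return more documents than the PDF has pages.
text_chunks = text_splitter.create_documents(
    [page.page_content for page in pages],
    metadatas=[
        {"source": pdf_path, "page": page.metadata.get("page", i)}
        for i, page in enumerate(pages)
    ],
)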
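
The effect of chunk_size and chunk_overlap can be illustrated with a minimal sketch. This is a simplification, not how RecursiveCharacterTextSplitter works internally: the real splitter prefers to break on the listed separators, while the hypothetical chunk_text helper below just slides a fixed character window.

```python
def chunk_text(text: str, chunk_size: int = 300, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap (illustrative only)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghij" * 7  # 70 characters of toy text
demo_chunks = chunk_text(sample, chunk_size=30, chunk_overlap=10)
print([len(c) for c in demo_chunks])
print(demo_chunks[0][-10:] == demo_chunks[1][:10])  # tail of one chunk repeats at the head of the next
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of embedding some text twice.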

Create Embeddings Using OpenAI

In [None]:
from langchain.embeddings import OpenAIEmbeddings
from dotenv import dotenv_values
env_values = dotenv_values("./app.env")
embedding_llm = OpenAIEmbeddings(
    openai_api_key=env_values['OPENAI_API_KEY'],
    model="text-embedding-ada-002"  
)


In [19]:
text_chunks[5]

Document(metadata={'source': 'Chaos-Based_Image_Encryption_Review_Application_an.pdf', 'page': 0}, page_content='such as sensitivity to initial conditions, topological transitivity , and pseudo-randomness, are conducive\nto cross-referencing with other disciplines and improving image-encryption methods. Additionally , this')

In [20]:
docs_text = [chunk.page_content for chunk in text_chunks]
docs_embeddings = embedding_llm.embed_documents(docs_text)

In [21]:
# Embed the query with the same model used for the chunks
query_text = "What is chaos in chaotic systems?"
query_embedding = embedding_llm.embed_query(query_text)

FAISS Vector Database

In [None]:
from langchain.vectorstores import FAISS
vector_db = FAISS.from_documents(text_chunks, embedding_llm)


In [40]:
query = "What is chaos in chaotic systems?"
similar_docs = vector_db.similarity_search(query, k=3)
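
Conceptually, similarity_search embeds the query and returns the k stored vectors closest to it. The sketch below shows a brute-force version in pure Python, assuming Euclidean (L2) distance, which is my understanding of the default for this FAISS wrapper. The helpers l2_distance and top_k and the toy vectors are hypothetical, not part of LangChain or FAISS.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, doc_vecs, k=3):
    """Return (index, distance) pairs for the k nearest vectors, closest first."""
    scored = [(i, l2_distance(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy 2-D "embeddings"; real ones have many more dimensions.
toy_doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
toy_query_vec = [1.0, 0.0]
nearest = top_k(toy_query_vec, toy_doc_vecs, k=2)
print(nearest)
```

The same function could be called as top_k(query_embedding, docs_embeddings, k=3) with the vectors computed earlier. A vector index such as FAISS exists to avoid this linear scan on large collections.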

In [57]:
similar_docs

[Document(id='45bdd00c-0c23-45c5-b0b5-ad2409087abe', metadata={'source': 'Chaos-Based_Image_Encryption_Review_Application_an.pdf', 'page': 3}, page_content='Continuous chaotic systems: Continuous chaotic systems are dynamic systems that\nexhibit complex and unpredictable behavior over time. These systems are usually described\nby a set of ordinary or partial differential equations that govern the evolution of state'),
 Document(id='ba78f03e-95a3-492c-abe1-ffdab03be88e', metadata={'source': 'Chaos-Based_Image_Encryption_Review_Application_an.pdf', 'page': 4}, page_content='Continuous chaotic systems have the advantage of providing richer dynamical behavior\nand greater flexibility , but have the disadvantage of requiring higher computational power\nand more complex mathematical models, as well as the need for discretization to suit practi-\ncal applications.'),
 Document(id='edc85299-2863-4095-a867-9f8f9107939c', metadata={'source': 'Chaos-Based_Image_Encryption_Review_Application_an.pd

Load QA Chain and Prompt Templates

In [None]:
from langchain.chains.question_answering import load_qa_chain
from langchain.prompts import PromptTemplate
from langchain_community.chat_models import ChatOpenAI


Initialize LLM model

In [None]:
import os

# Read the API key from the local env file and expose it to the OpenAI client.
env_values = dotenv_values("./app.env")
os.environ["OPENAI_API_KEY"] = env_values["OPENAI_API_KEY"]

llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.5
)

Define QA Templates (Stuff Chain)

In [49]:
qna_template = "\n".join([
    "Answer the next question using the provided context.",
    "### Context:",
    "{context}",
    "",
    "### Question:",
    "{question}",
    "",
    "### Answer:",
])

# verbose is a chain option, not a prompt option, so it is omitted here.
qna_prompt = PromptTemplate(
    template=qna_template,
    input_variables=['context', 'question'],
)

stuff_chain = load_qa_chain(llm, chain_type="stuff", prompt=qna_prompt)
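
The "stuff" strategy places all retrieved chunks into a single prompt and makes one LLM call. The following is a minimal sketch of that prompt assembly, assuming chunks are joined with a blank line. build_stuff_prompt and the toy inputs are hypothetical names for illustration, not LangChain functions.

```python
def build_stuff_prompt(template: str, docs: list[str], question: str,
                       separator: str = "\n\n") -> str:
    """Join every chunk into one context string and fill the template once."""
    context = separator.join(docs)
    return template.format(context=context, question=question)


toy_template = "\n".join([
    "Answer the next question using the provided context.",
    "### Context:",
    "{context}",
    "",
    "### Question:",
    "{question}",
    "",
    "### Answer:",
])

toy_prompt = build_stuff_prompt(
    toy_template,
    ["chunk one", "chunk two"],
    "What is chaos?",
)
print(toy_prompt)
```

Because everything goes into one prompt, this approach is cheap but only works while the combined chunks fit in the model's context window.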

Run the Stuff-Based QA Chain

In [50]:
answer = stuff_chain({
    "input_documents": similar_docs,
    "question": query
}, return_only_outputs=True)

In [51]:
print(answer['output_text'])

Chaos in chaotic systems refers to complex and unpredictable behavior that arises from the dynamic evolution of these systems over time. This behavior is characterized by sensitivity to initial conditions, where small changes can lead to vastly different outcomes. In the context of continuous chaotic systems, this chaotic behavior is governed by a set of ordinary or partial differential equations, which describe how the system evolves. While chaotic systems can exhibit rich dynamics and flexibility, they also present challenges, such as requiring significant computational power and complex mathematical modeling.


Define Initial and Refine Prompt Templates

In [34]:
initial_qna_template = "\n".join([
    "Answer the following question using the provided text only.",
    "If the answer is not available, say 'No answer for this context'.",
    "### Context:",
    "{context_str}",
    "",
    "### Question:",
    "{question}",
    "### Answer:",
])

initial_qna_prompt = PromptTemplate(
    template=initial_qna_template,
    input_variables=['context_str', 'question']
)

In [35]:
refine_qna_template = "\n".join([
    "Refine the existing answer, if required, with the following context.",
    "If the answer is not available, say 'No answer for this context'.",
    "### Context:",
    "{context_str}",
    "",
    "### Existing Answer:",
    "{existing_answer}",
    "",
    "### Question:",
    "{question}",
    "",
    "### Refined Answer:",
])

refine_qna_prompt = PromptTemplate(
    template=refine_qna_template,
    input_variables=['context_str', 'existing_answer', 'question']
)

In [36]:
refine_chain = load_qa_chain(
    llm,
    chain_type="refine",
    question_prompt=initial_qna_prompt,
    refine_prompt=refine_qna_prompt,
    return_intermediate_steps=True,
)
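
The "refine" strategy answers from the first chunk, then revisits the answer once per remaining chunk. The sketch below shows that control flow with a stub in place of the model. refine_answer, fake_llm, and the shortened prompts are hypothetical and only mimic the shape of the chain's output.

```python
from typing import Callable


def refine_answer(docs: list[str], question: str, llm: Callable[[str], str]) -> dict:
    """Answer from the first chunk, then refine once per remaining chunk."""
    initial_prompt = "Context:\n{context_str}\n\nQuestion: {question}\nAnswer:"
    refine_prompt = (
        "Context:\n{context_str}\n\nExisting answer: {existing_answer}\n\n"
        "Question: {question}\nRefined answer:"
    )
    answer = llm(initial_prompt.format(context_str=docs[0], question=question))
    steps = [answer]
    for doc in docs[1:]:
        # Each later chunk sees the previous answer and may improve it.
        answer = llm(refine_prompt.format(
            context_str=doc, existing_answer=answer, question=question))
        steps.append(answer)
    return {"output_text": answer, "intermediate_steps": steps}


calls = []


def fake_llm(prompt: str) -> str:
    """Stub model: records the prompt and returns a numbered draft."""
    calls.append(prompt)
    return f"draft {len(calls)}"


toy_result = refine_answer(["chunk A", "chunk B", "chunk C"], "What is chaos?", fake_llm)
print(toy_result)
```

This makes one model call per chunk, so it costs more than the stuff strategy, and the final answer depends on chunk order because each step only sees the previous draft.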

In [37]:
question = "What is chaos in chaotic systems?"

similar_docs = vector_db.similarity_search(question, k=5)

print(len(similar_docs))

5


Run the Refine-Based QA Chain

In [38]:
final_refined_answer = refine_chain(
    {
        "input_documents": similar_docs,
        "question": question,
    },
    return_only_outputs=True,
)

final_refined_answer

{'intermediate_steps': ['No answer for this context.',
  'No answer for this context.',
  'No answer for this context.',
  'Chaos in chaotic systems refers to a complex and unpredictable behavior that arises in certain dynamical systems. These systems are highly sensitive to initial conditions, meaning that small changes in the starting state can lead to vastly different outcomes over time. This unpredictability is often characterized by a lack of long-term predictability, despite the underlying deterministic nature of the system. In various fields, such as finance and biology, chaos theory helps to model and understand complex phenomena, revealing patterns and behaviors that might not be immediately apparent.',
  'Chaos in chaotic systems refers to a complex and unpredictable behavior that arises in certain dynamical systems, characterized by high sensitivity to initial conditions. This means that small changes in the starting state can lead to vastly different outcomes over time. Des

In [39]:
print(final_refined_answer["output_text"])


Chaos in chaotic systems refers to a complex and unpredictable behavior that arises in certain dynamical systems, characterized by high sensitivity to initial conditions. This means that small changes in the starting state can lead to vastly different outcomes over time. Despite the underlying deterministic nature of these systems, they exhibit a lack of long-term predictability. In various fields, including neuroscience, machine learning, and cryptography, chaos theory is utilized to model and understand complex phenomena. For instance, chaotic systems can help simulate neuronal behavior and contribute to the development of new algorithms for artificial intelligence, as well as enhance secure communication systems in cryptography.


Create RAG Bot Functions

In [None]:
def rag_bot(question: str) -> dict:
    similar_docs = vector_db.similarity_search(question, k=5)

    result = refine_chain(
        {
            "input_documents": similar_docs,
            "question": question
        },
        return_only_outputs=True
    )

    return {
        "answer": result["output_text"],
        "documents": similar_docs
    }


def rag_answer(question: str) -> str:
    return rag_bot(question)["answer"]


Configure DSPy with LLM

In [None]:
import dspy
import os

os.environ["OPENAI_API_KEY"] = env_values["OPENAI_API_KEY"]

lm = dspy.LM("openai/gpt-4o-mini")  
dspy.configure(lm=lm)


Define Training Data (Questions + Answers)

In [None]:
raw_data = [
    {
        "question": "What is chaos in chaotic systems?",
        "response": "Chaos is a pseudo-random and unpredictable motion in a deterministic dynamical system caused by sensitivity to initial conditions and parameters."
    },
    {
        "question": "What are the main characteristics of chaotic systems?",
        "response": "Chaotic systems include sensitivity to initial conditions, nonlinearity, aperiodicity, fractal structure, and local instability."
    },
    {
        "question": "What is the role of chaotic systems in cryptography?",
        "response": "They are used for designing secure encryption due to pseudo-randomness and sensitivity to initial conditions."
    },
]

data = [dspy.Example(**ex).with_inputs("question") for ex in raw_data]
testset = data
len(testset)


3

Evaluate RAG Model Using SemanticF1
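
As I understand it, SemanticF1 asks the configured LM to judge how much of the predicted response is supported by the gold response (precision) and how much of the gold response the prediction covers (recall), then combines the two as an F1 score. The sketch below shows only that final combination step. f1_score is a hypothetical helper, and the input values are made up for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# A long answer that covers the gold response but adds unsupported detail:
# recall is high, precision is lower, and F1 is pulled toward the lower value.
print(round(f1_score(0.5, 1.0), 2))
```

This is one reason a verbose but correct answer can score below 1.0: extra claims not found in the gold response reduce precision.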

In [None]:
from dspy.evaluate import SemanticF1

metric = SemanticF1(decompositional=True)

scores = []

for example in testset:
    pred_text = rag_answer(example.question)

    pred = dspy.Prediction(response=pred_text)

    score = metric(example, pred)
    scores.append(score)

    print("Q:", example.question)
    print("GOLD:", example.response)
    print("PRED:", pred.response)
    print("Semantic F1:", round(score, 2))
    print("-" * 80)

avg_score = sum(scores) / len(scores)
print("Average Semantic F1:", round(avg_score, 2))


Q: What is chaos in chaotic systems?
GOLD: Chaos is a pseudo-random and unpredictable motion in a deterministic dynamical system caused by sensitivity to initial conditions and parameters.
PRED: Chaos in chaotic systems refers to a complex and unpredictable behavior that arises in certain dynamical systems characterized by high sensitivity to initial conditions. This means that even minor variations in the starting state can lead to vastly different outcomes over time, a phenomenon often illustrated by the "butterfly effect." Despite being deterministic, the intricate, non-linear interactions within chaotic systems render long-term predictions impossible. This characteristic makes chaotic systems particularly relevant in fields such as neuroscience, where they model the behavior of neurons, and in machine learning and artificial intelligence, where they contribute to the development of new algorithms. Additionally, chaotic systems are employed in cryptography to create secure communica