In [3]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("AnomalyGPT_Practicum.pdf")
data = loader.load()  # one Document per PDF page

In [4]:
data

[Document(metadata={'source': 'AnomalyGPT_Practicum.pdf', 'page': 0}, page_content='Sandia Laboratories Project-AnomalyGPT\nTeam SAM\nSeungHoon Yoo\nGeorgia Institute of Technology\nsyoo97@gatech.edu@gatech.edu\nAmit Trikha\nGeorgia Institute of Technology\namittrikha@gatech.edu\nMukta Bisht\nGeorgia Institute of Technology\nmbisht6@gatech.edu\nAbstract\nFor this study, we utilized the AnomalyGPT model [1]\nto specialize in detecting anomalies in our custom Art-\nwork (paintings) dataset. Large vision-language models\n(LVLM) excel in recognizing common objects due to their\nextensive training data, they often struggle with domain-\nspecific knowledge and fine-grained details within objects.\nThis limitation impedes their effectiveness in domain spe-\ncific tasks such as Artwork (or painting) Anomaly De-\ntection. So,we investigated adapting the AnomalyGPT\nmodel to our custom dataset to tackle the domain spe-\ncific (i.e artwork/painting) problem. Our approach uti-\nlizing the AnomalyG

In [5]:
len(data)

14

In [6]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split the page-level documents into chunks of at most 1000 characters
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000)
docs = text_splitter.split_documents(data)

print("Total number of documents: ", len(docs))

Total number of documents:  76
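
The 14 pages become 76 chunks because `chunk_size=1000` caps each chunk at 1000 characters. The sketch below illustrates the idea with a deliberately simplified fixed-window splitter. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that class prefers to break at separators such as paragraph and line breaks); the function name `split_fixed` and the toy text are assumptions made for illustration only.

```python
def split_fixed(text: str, chunk_size: int = 1000, chunk_overlap: int = 0) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Simplified stand-in for illustration; it ignores separators entirely.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "abcdefghij" * 25  # 250 characters of toy text
chunks = split_fixed(sample, chunk_size=100, chunk_overlap=20)
print(len(chunks), [len(c) for c in chunks])
```

A non-zero overlap repeats the tail of one chunk at the head of the next, which can help when an answer straddles a chunk boundary.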


In [7]:
docs[0]

Document(metadata={'source': 'AnomalyGPT_Practicum.pdf', 'page': 0}, page_content='Sandia Laboratories Project-AnomalyGPT\nTeam SAM\nSeungHoon Yoo\nGeorgia Institute of Technology\nsyoo97@gatech.edu@gatech.edu\nAmit Trikha\nGeorgia Institute of Technology\namittrikha@gatech.edu\nMukta Bisht\nGeorgia Institute of Technology\nmbisht6@gatech.edu\nAbstract\nFor this study, we utilized the AnomalyGPT model [1]\nto specialize in detecting anomalies in our custom Art-\nwork (paintings) dataset. Large vision-language models\n(LVLM) excel in recognizing common objects due to their\nextensive training data, they often struggle with domain-\nspecific knowledge and fine-grained details within objects.\nThis limitation impedes their effectiveness in domain spe-\ncific tasks such as Artwork (or painting) Anomaly De-\ntection. So,we investigated adapting the AnomalyGPT\nmodel to our custom dataset to tackle the domain spe-\ncific (i.e artwork/painting) problem. Our approach uti-\nlizing the AnomalyGP

# Get an API key

Generate a Google AI API key at https://ai.google.dev/gemini-api/docs/api-key and paste it into a .env file (for example as GOOGLE_API_KEY=<your key>) so that load_dotenv() can pick it up.

Available embedding models: https://python.langchain.com/v0.1/docs/integrations/text_embedding/
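
A missing key usually surfaces only later, as an authentication error from the first API call. The sketch below fails early instead. It assumes the key is stored under `GOOGLE_API_KEY`, the environment variable the `langchain_google_genai` classes read by default; the helper name `require_api_key` is hypothetical.

```python
import os


def require_api_key(name: str = "GOOGLE_API_KEY", env=None) -> str:
    """Return the key stored under `name`, or raise if it is missing or blank."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add a line like {name}=<your key> to the .env file"
        )
    return value


# Demo with an in-memory mapping so the example runs without a real key.
print(require_api_key(env={"GOOGLE_API_KEY": "dummy-key-for-illustration"}))
```

In the notebook, calling `require_api_key()` right after `load_dotenv()` would check the real environment.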

In [8]:
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings

from dotenv import load_dotenv

load_dotenv()  # reads the Google API key from the .env file

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector = embeddings.embed_query("hello, world!")
vector[:5]  # show only the first five dimensions

[0.05168594419956207,
 -0.030764883384108543,
 -0.03062233328819275,
 -0.02802734263241291,
 0.01813093200325966]
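
An embedding is only useful relative to other embeddings: texts with similar meaning should map to vectors that point in similar directions. Cosine similarity is one common way to measure that. The sketch below uses made-up three-dimensional vectors rather than real model output, and the metric a given vector store actually applies depends on its configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (assumptions for illustration, not real embeddings)
v_same = [1.0, 0.0, 1.0]
v_close = [0.9, 0.1, 1.1]
v_orthogonal = [0.0, 1.0, 0.0]

print(cosine_similarity(v_same, v_close))       # close to 1: similar direction
print(cosine_similarity(v_same, v_orthogonal))  # 0: unrelated direction
```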

In [9]:
# Embed every chunk and index it in Chroma, reusing the embeddings object created above
vectorstore = Chroma.from_documents(documents=docs, embedding=embeddings)

In [10]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 10})

retrieved_docs = retriever.invoke("What is new in Development of Multiple Combined Regression Methods for Rainfall Measurement paper?")


In [11]:
len(retrieved_docs)

10
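
A similarity retriever with `k=10` returns the ten nearest chunks whether or not any of them is relevant, so a question about a different paper still yields ten results. The sketch below shows the ranking step on toy two-dimensional vectors, using squared Euclidean distance as one possible metric. The helper names `squared_l2` and `top_k` are hypothetical, and a real vector store typically uses an approximate index rather than this exhaustive scan.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_vec, doc_vecs, k):
    """Return the indices of the k vectors closest to query_vec, nearest first."""
    return heapq.nsmallest(
        k,
        range(len(doc_vecs)),
        key=lambda i: squared_l2(query_vec, doc_vecs[i]),
    )


# Toy "chunk embeddings" (assumptions for illustration)
toy_vecs = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.1], [5.0, 5.0]]
print(top_k([1.0, 1.0], toy_vecs, k=2))
```

Because the ranking is purely relative, inspecting the distances themselves (or the retrieved text) is the only way to tell a good match from the least bad one.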

In [12]:
retrieved_docs

[Document(id='f5d777dd-deb0-4200-9d07-840db3155380', metadata={'page': 3, 'source': 'AnomalyGPT_Practicum.pdf'}, page_content='putational complexity and large datasets required for these\nmodels,while the standard CLIP model could be run lo-\ncally without such requirements. ¡FOR EACH MODEL\nADD FEW SENTENCES WHICH W AS DONE SPECIFI-\n4'),
 Document(id='73b7d3a3-4470-4785-a053-9f2e02a2d4b9', metadata={'page': 10, 'source': 'AnomalyGPT_Practicum.pdf'}, page_content='images. The team explored various options and eventually\naddressed the issue by combining multiple medical datasets\nfrom different sources.\n8. Comparison between models\nThe table 5 offers a detailed comparison of accu-\nracy scores for different models tested on a dataset,\n11'),
 Document(id='eabd1bb3-b6fa-4bb0-b970-0997d7f5e52b', metadata={'page': 11, 'source': 'AnomalyGPT_Practicum.pdf'}, page_content='plexity of the models and the challenges in training them\non datasets. Capturing intricate details proves to be part

In [13]:
print(retrieved_docs[5].page_content)

AnomalyGPT and AnomalyCLIP work on huge num-
ber of parameters and in our case it was 7B.Not only
it will be tricky to manually update the parameters but
also time consuming. The models used for these were
pretrained on these parameters and weights generated
were huge files.
• Model Overfitting: The tendency of the CLIP model
to overfit on the training dataset was a notable draw-
back. This overfitting occurs when the model learns
the training data too well,including its noise and out-
liers,at the expense of generalizing to new,unseen data.
Identifying and mitigating overfitting through tech-
niques such as cross-validation and regularization is
crucial for developing robust models. AnomalyGPT
and AnomalyCLIP are not prone to overfitting since
they work on huge number of parameters but can take
a lot of time to fine tune them before reaching the de-
sired behaviour.
• Diverse Test Data : The performance of our models
can vary significantly with more diverse and complex


In [14]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro", temperature=0.3, max_tokens=500)

In [15]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

In [16]:
question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)
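
Conceptually, the "stuff" strategy pastes the text of every retrieved chunk into the `{context}` slot of the prompt before the model is called. The sketch below mimics that step in plain Python. It is a simplified illustration rather than the library's implementation: `SYSTEM_TEMPLATE` is shortened, the helper names are hypothetical, and `SimpleNamespace` objects stand in for `Document` instances.

```python
from types import SimpleNamespace

# Shortened stand-in for the notebook's system prompt
SYSTEM_TEMPLATE = (
    "Use the following pieces of retrieved context to answer the question."
    "\n\n"
    "{context}"
)


def stuff_documents(docs, separator="\n\n"):
    """Concatenate the text of all retrieved chunks into one context string."""
    return separator.join(doc.page_content for doc in docs)


def build_messages(docs, question):
    """Fill the context slot and pair the system prompt with the user's question."""
    context = stuff_documents(docs)
    return [
        ("system", SYSTEM_TEMPLATE.format(context=context)),
        ("human", question),
    ]


toy_chunks = [
    SimpleNamespace(page_content="Chunk A."),
    SimpleNamespace(page_content="Chunk B."),
]
for role, text in build_messages(toy_chunks, "What is in this paper?"):
    print(f"[{role}] {text}")
```

Since every chunk is included verbatim, the prompt grows with `k` and `chunk_size`, so both settings need to fit within the model's context window.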

In [18]:
response = rag_chain.invoke({"input": "What is in this paper?"})
print(response["answer"])

This paper discusses training AnomalyGPT, a model for anomaly detection, on an "Artwork" dataset of images of the Mona Lisa and Girl with a Pearl Earring.  The training process involved various image transformations and hyperparameter adjustments (learning rate of 1e-3, batch sizes of 8 and 16). The paper also briefly mentions testing with CLIP and includes comparisons with other models like MiniGPT4 and PandaGPT.
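
Besides `"answer"`, the dictionary returned by a chain built with `create_retrieval_chain` also carries the retrieved chunks under `"context"`, which makes it possible to see where an answer came from. The sketch below counts chunks per page on a hand-made response. The helper `pages_used` and the toy values are assumptions for illustration; page numbers are 0-based, as in the loader output above.

```python
from collections import Counter
from types import SimpleNamespace


def pages_used(response):
    """Count how many retrieved chunks came from each (0-based) page."""
    return Counter(doc.metadata.get("page") for doc in response.get("context", []))


# Hand-made stand-in for a chain response (not real output)
toy_response = {
    "input": "What is in this paper?",
    "context": [
        SimpleNamespace(metadata={"page": 3}),
        SimpleNamespace(metadata={"page": 10}),
        SimpleNamespace(metadata={"page": 3}),
    ],
    "answer": "(model answer here)",
}

for page, count in sorted(pages_used(toy_response).items()):
    print(f"page {page}: {count} chunk(s)")
```

In the notebook, `pages_used(response)` could be called on the real `response` object in the same way.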
