REF: https://www.youtube.com/watch?v=0zgYu_9WF7A

In [54]:
from langchain_community.document_loaders import RSSFeedLoader

urls = [
    'https://decoded.avast.io/feed/',
    # "https://news.ycombinator.com/rss"
]

# Fetch every entry of each feed and parse the linked article into one Document per entry.
loader = RSSFeedLoader(urls=urls)
docs = loader.load()
print(len(docs))

10


In [55]:
docs

[Document(metadata={'title': 'Gen Q3/2024 Threat Report', 'link': 'https://decoded.avast.io/threatresearch/gen-q3-2024-threat-report/?utm_source=rss&utm_medium=rss&utm_campaign=gen-q3-2024-threat-report', 'authors': ['Threat Research Team'], 'language': 'en', 'description': 'Ransomware doubled in risk, 614% explosion in scam-yourself attacks, mobile threats surged', 'publish_date': datetime.datetime(2024, 11, 19, 13, 30, tzinfo=tzutc()), 'feed': 'https://decoded.avast.io/feed/'}, page_content='The third quarter threat report is here—and it’s packed with answers. Our Threat Labs team had uncovered some heavy stories behind the stats, exposing the relentless tactics shaping today’s threat landscape.\n\nHere’s what you need to know:\n\n614% explosion in Scam-Yourself Attacks: Over 2 million users were protected from FakeCaptcha scams, where fake tutorials, phony fixes, and malicious CAPTCHA prompts trick users into compromising their own systems.\n\nOver 2 million users were protected fro

In [56]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

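text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1200,       # maximum number of characters per chunk
    chunk_overlap=100,     # up to 100 characters shared between neighbouring chunks
    add_start_index=True,  # store each chunk's character offset as 'start_index' metadata
)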
all_splits = text_splitter.split_documents(docs)
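
To make the three splitter parameters concrete, here is a minimal sketch of fixed-width splitting with overlap. It is a simplification for illustration only: `split_with_overlap` is a hypothetical helper, and it is not how `RecursiveCharacterTextSplitter` works internally (the real splitter prefers to break on separators such as paragraph and line boundaries).

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Simplified fixed-width splitter (hypothetical helper).

    Returns a list of (start_index, chunk) pairs, where start_index plays
    the role of the 'start_index' metadata added by add_start_index=True.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append((start, text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
for start_index, chunk in pieces:
    print(start_index, chunk)
```

With `chunk_size=10` and `chunk_overlap=3`, each chunk begins with the last three characters of the previous one, and the recorded offset lets a chunk be traced back to its position in the source text.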

In [57]:
all_splits

[Document(metadata={'title': 'Gen Q3/2024 Threat Report', 'link': 'https://decoded.avast.io/threatresearch/gen-q3-2024-threat-report/?utm_source=rss&utm_medium=rss&utm_campaign=gen-q3-2024-threat-report', 'authors': ['Threat Research Team'], 'language': 'en', 'description': 'Ransomware doubled in risk, 614% explosion in scam-yourself attacks, mobile threats surged', 'publish_date': datetime.datetime(2024, 11, 19, 13, 30, tzinfo=tzutc()), 'feed': 'https://decoded.avast.io/feed/', 'start_index': 0}, page_content='The third quarter threat report is here—and it’s packed with answers. Our Threat Labs team had uncovered some heavy stories behind the stats, exposing the relentless tactics shaping today’s threat landscape.\n\nHere’s what you need to know:\n\n614% explosion in Scam-Yourself Attacks: Over 2 million users were protected from FakeCaptcha scams, where fake tutorials, phony fixes, and malicious CAPTCHA prompts trick users into compromising their own systems.\n\nOver 2 million users 

In [58]:
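from langchain_ollama import OllamaEmbeddings

# Assumes a local Ollama server is running and the model has been pulled (`ollama pull all-minilm`).
local_embeddings = OllamaEmbeddings(model="all-minilm")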

In [59]:
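from langchain_chroma import Chroma
from langchain_community.vectorstores.utils import filter_complex_metadata

# Chroma accepts only simple metadata values (str, int, float, bool), so drop
# list and datetime fields such as 'authors' and 'publish_date' before indexing.
vectorstore = Chroma.from_documents(
    documents=filter_complex_metadata(all_splits),
    embedding=local_embeddings,
)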
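
The effect of the metadata filtering step can be sketched in plain Python. This is an illustration under assumptions, not the library's implementation: `keep_simple_metadata` and `ALLOWED_TYPES` are hypothetical names, and the sample dictionary is abridged from the first document shown earlier.

```python
from datetime import datetime, timezone

# Value types assumed to be acceptable to the vector store (assumption for this sketch).
ALLOWED_TYPES = (str, bool, int, float)


def keep_simple_metadata(metadata):
    """Return a copy of metadata without list, datetime, or other complex values."""
    return {key: value for key, value in metadata.items()
            if isinstance(value, ALLOWED_TYPES)}


raw = {
    "title": "Gen Q3/2024 Threat Report",
    "authors": ["Threat Research Team"],  # list: dropped
    "language": "en",
    "publish_date": datetime(2024, 11, 19, 13, 30, tzinfo=timezone.utc),  # datetime: dropped
    "start_index": 0,
}
cleaned = keep_simple_metadata(raw)
print(sorted(cleaned))
```

This is consistent with the documents that come back from the store later in the notebook, which no longer carry `authors` or `publish_date`.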

In [80]:
question = "Summarize Avast Q1/2024 Threat Report"
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})
retrieved_docs = retriever.invoke(question)
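
Conceptually, a similarity retriever with `k=3` embeds the question, scores every stored chunk against it, and returns the three best matches. The toy sketch below illustrates that idea with made-up three-dimensional vectors and cosine similarity. Both are assumptions for illustration: real embeddings have hundreds of dimensions, and the distance metric actually used depends on how the vector store collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k):
    """Return the texts of the k entries most similar to query_vec."""
    scored = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]


# Hypothetical (text, embedding) pairs; the vectors are invented for this example.
toy_index = [
    ("chunk about scams", [0.9, 0.1, 0.0]),
    ("chunk about ransomware", [0.1, 0.9, 0.0]),
    ("chunk about mobile threats", [0.0, 0.2, 0.9]),
    ("chunk about phishing", [0.8, 0.2, 0.1]),
]
toy_query = [1.0, 0.0, 0.0]
top_matches = top_k(toy_query, toy_index, k=3)
print(top_matches)
```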

In [81]:
retrieved_docs

[Document(metadata={'description': 'Nearly 90% of Threats Blocked are Social Engineering, Revealing a Huge Surge of Scams, and Discovery of the Lazarus APT Campaign', 'feed': 'https://decoded.avast.io/feed/', 'language': 'en', 'link': 'https://decoded.avast.io/threatresearch/avast-q1-2024-threat-report/?utm_source=rss&utm_medium=rss&utm_campaign=avast-q1-2024-threat-report', 'start_index': 40378, 'title': 'Avast Q1/2024 Threat Report'}, page_content='Last quarter, we reported that scams, together with phishing and malvertising, accounted for more than 75% of all threats blocked by Avast throughout the year. This quarter we have blocked over 80% for the same type of threats. This indicates a rather interesting – and very scam-ridden – start to the year.\n\nScams Everywhere, Including Video\n\nA scam is a type of threat that aims to trick users into giving an attacker their personal information or money. We track diverse types of scams which are listed below.\n\nIn our Q4/2023 report, we

In [82]:
# Concatenate the retrieved chunks into a single context string.
context = ' '.join(doc.page_content for doc in retrieved_docs)
context

'Last quarter, we reported that scams, together with phishing and malvertising, accounted for more than 75% of all threats blocked by Avast throughout the year. This quarter we have blocked over 80% for the same type of threats. This indicates a rather interesting – and very scam-ridden – start to the year.\n\nScams Everywhere, Including Video\n\nA scam is a type of threat that aims to trick users into giving an attacker their personal information or money. We track diverse types of scams which are listed below.\n\nIn our Q4/2023 report, we pointed out that scam activity is increasing significantly. At that time, we saw that one of the main reasons was the high rate of malvertising campaigns. This trend has continued in Q1/2024, with the activity level from the previous peak.\n\nDaily risk ratio of scam in Q4/2023 and Q1/2024 Last quarter, we reported that scams, together with phishing and malvertising, accounted for more than 75% of all threats blocked by Avast throughout the year. Th
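
The context above contains the "Last quarter, we reported ..." passage twice, which suggests the retriever returned the same chunk more than once. One possible cause (an assumption, not confirmed by the notebook) is that the splits were added to the same in-memory collection several times, for example by re-running the indexing cell. A minimal sketch of removing duplicates before building the context follows; `dedupe_chunks` is a hypothetical helper and the sample strings are shortened stand-ins for real chunks.

```python
def dedupe_chunks(chunks):
    """Drop repeated chunks while preserving the original retrieval order.

    Chunks are compared after collapsing whitespace, so copies that differ
    only in spacing or line breaks count as duplicates.
    """
    seen = set()
    unique = []
    for chunk in chunks:
        key = " ".join(chunk.split())
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique


# Shortened stand-ins for the page_content of retrieved documents.
retrieved = [
    "Last quarter, we reported that scams accounted for more than 75%",
    "Last quarter, we reported that scams  accounted for more than 75%",
    "Scams Everywhere, Including Video",
]
unique_chunks = dedupe_chunks(retrieved)
context_sketch = " ".join(unique_chunks)
print(len(retrieved), "->", len(unique_chunks))
```

Removing duplicates frees room in the prompt for distinct passages, which matters with a small model and a small `k`.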

In [83]:
from langchain_ollama.llms import OllamaLLM

# Assumes the model is available locally (`ollama pull llama3.2:1b`).
llm = OllamaLLM(model="llama3.2:1b")
response = llm.invoke(f"""
    Answer the question according to the context:
        Question: {question}
        Context: {context}
""")

In [84]:
print(response)

According to the Avast Q1/2024 Threat Report, the company has seen a significant increase in scam activity, including video scams, compared to last quarter's 75% of all threats blocked. This includes over 80% blocks for the same type of threats.

Key points from the report include:

* A new campaign targeting specific individuals through fabricated job offers
* A full attack chain from infection vector to deploying a rootkit called "FudModule 2.0" with zero-day Admin -> Kernel exploit
* The presence of a previously undocumented Kaolin RAT that:
 * Changes the last write timestamp of selected files and loads received DLL binaries from a C&C server
 * Loads FudModule along with a 0-day exploit

The report suggests that the sophistication of the attack, demonstrated by its impact on security products and persistence in certain regions (such as Asia), indicates a "scam-ridden" start to the year.
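
The prompt in cell 83 is an indented f-string, so the leading spaces on each line are sent to the model verbatim, and nothing tells the model to stay within the retrieved context. The sketch below shows one possible way to build the prompt instead. `build_prompt` is a hypothetical helper, and the instruction wording is an assumption about the desired behaviour, not something taken from the original notebook.

```python
from textwrap import dedent


def build_prompt(question, context):
    """Build a retrieval-augmented prompt without stray indentation."""
    template = dedent("""\
        Answer the question using only the context below.
        If the context does not contain the answer, say so.

        Question: {question}

        Context:
        {context}
        """)
    # str.format substitutes the values once, so braces that happen to
    # appear inside the context are passed through unchanged.
    return template.format(question=question, context=context)


example_prompt = build_prompt(
    "Summarize Avast Q1/2024 Threat Report",
    "placeholder context with {braces}",
)
print(example_prompt)
```

The resulting string could be passed to `llm.invoke(...)` in place of the inline f-string. Whether the stricter instruction improves the summary would need to be checked against the model's actual output.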
