## Environment Setup

This notebook demonstrates LangChain concepts using **local models via Ollama**. Before continuing, make sure the required models and Python packages are installed.

---

### 1. Install and Configure Ollama

If you haven't already, install [Ollama](https://ollama.com) and ensure it's running locally.

Then pull the required models:

```bash
ollama pull gemma3
ollama pull nomic-embed-text
```

> These models will be used for generation (`gemma3`) and embeddings (`nomic-embed-text`).

---

### 2. Install Required Python Packages

In [1]:
'''
!pip install --upgrade pip --quiet
!pip install -U \
  langchain langchain-ollama langchain-community langchain-core \
  langchain-experimental chromadb pandas pymupdf ipython \
  duckduckgo-search --quiet
'''


### 3. Start Ollama (if not already running)

Make sure Ollama is running in the background. You can check by running:

```bash
ollama list
```

This should list both `gemma3:latest` and `nomic-embed-text:latest` models.
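
The same check can be scripted against Ollama's local HTTP API. This is a minimal sketch, assuming the server listens on the default address `http://localhost:11434` and that `/api/tags` returns the installed models. The helper names `missing_models` and `fetch_tags` are illustrative and not part of any library.

```python
import json
from urllib.request import urlopen

REQUIRED = ["gemma3", "nomic-embed-text"]


def missing_models(tags_payload: dict, required: list[str]) -> list[str]:
    """Return the required models that do not appear in an /api/tags payload."""
    # Installed names look like "gemma3:latest"; compare on the part before ":".
    installed = {m["name"].split(":")[0] for m in tags_payload.get("models", [])}
    return [name for name in required if name not in installed]


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Query a running Ollama server (assumed default port) for installed models."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


# Offline demonstration with a hand-written payload (not real server output).
sample = {"models": [{"name": "gemma3:latest"}]}
print(missing_models(sample, REQUIRED))  # → ['nomic-embed-text']
```

With Ollama running, `missing_models(fetch_tags(), REQUIRED)` should return an empty list once both models have been pulled.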

## Local vs. Cloud Models

### Ollama (Local Execution)

Ollama allows you to run open-source models (e.g., `gemma`, `llama`, `nomic-embed-text`) locally. This provides control, privacy, and fast iteration without API latency or cost.

### Amazon Bedrock (Cloud Execution)

[Amazon Bedrock](https://aws.amazon.com/bedrock/) provides access to multiple foundation models through a single API. It is fully managed, scalable, and integrates into AWS’s security and governance ecosystem.

## Basic Concepts of Generative AI

This section introduces core concepts for working with Large Language Models (LLMs), including how to prompt them effectively and build applications using frameworks like **LangChain**—whether deploying models locally via **Ollama** or through cloud services like **AWS Bedrock**.

---

### What Are LLMs?

**Large Language Models (LLMs)** are machine learning models trained on massive text corpora. They predict the next token in a sequence based on context, making them powerful pattern-recognition tools for generating human-like text. However, they do not understand meaning or possess reasoning capabilities—their output is probabilistic, not factual by default.

---

### Prompting and Prompt Engineering

**Prompts** are structured inputs that tell an LLM what to do. A well-crafted prompt can significantly influence the quality, accuracy, and usefulness of the model's response.

Prompts generally consist of three key components:

---

#### Instruction

The **instruction** specifies the task or behavior expected from the model. It can be a simple question, a directive, or even a persona setup.

**Examples:**
- “Generate a list of known malicious domains associated with the Qakbot malware family.”
- “Write a poem about the Qakbot malware family.”
- “Produce a sonnet about my love of Franklin’s BBQ.”

---

#### Context

**Context** includes relevant information provided to the model to help it complete the instruction. This is particularly important for specialized or domain-specific use cases.

**Examples of context:**
- Extracted text from a PDF or website
- Lists of known facts or structured datasets
- Internal business knowledge

Context is critical when the model lacks the necessary background knowledge. It also enables **in-context learning** and is foundational to **RAG (Retrieval-Augmented Generation)** systems.

**Reference:**  
- [AWS Blog: Context Window Overflow – Breaking the Barrier](https://aws.amazon.com/blogs/security/context-window-overflow-breaking-the-barrier/)

---

#### Output Format

Defining the **output format** ensures the model returns a response in a structure that's compatible with downstream processing or presentation.

**Examples:**
- JSON/JSONL for programmatic consumption
- Markdown for documentation or emails
- Lists, CSV, or tabular formats for parsing

Providing formatting examples in your prompt often leads to more consistent and usable output.
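
The three components can be assembled mechanically. The sketch below is one possible layout, not a LangChain API: the function name `build_prompt` and the section labels are assumptions chosen for illustration.

```python
def build_prompt(instruction: str, context: str, output_format: str) -> str:
    """Join the three prompt components into one labeled prompt string."""
    sections = [
        ("Instruction", instruction),
        ("Context", context),
        ("Output format", output_format),
    ]
    # Skip empty components so a prompt without context stays compact.
    return "\n\n".join(
        f"{label}:\n{text.strip()}" for label, text in sections if text.strip()
    )


prompt = build_prompt(
    instruction="Summarize the report below in two sentences.",
    context="(extracted report text goes here)",
    output_format='Return JSON with a single key "summary".',
)
print(prompt)
```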

---

#### Best Practices for Prompt Engineering

- Be explicit and unambiguous in your instructions.
- Focus on what you want, not just what to avoid.
- Supply sufficient context for the model to reason accurately.
- Specify format, tone, language, or response length as needed.
- Iterate and refine: prompt design is an iterative process.
- Be cost-conscious—longer prompts and responses mean higher token usage.
- Leverage in-context examples to teach the model new behavior.

**Prompting strategies:**
- **Zero-shot**: No examples provided.
- **One-shot**: One example included.
- **Few-shot**: Multiple examples included.
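
The three strategies differ only in how many worked examples precede the real input. A minimal sketch, assuming a simple `Input:`/`Output:` layout; the helper `few_shot_prompt` and the example labels are made up for illustration.

```python
def few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Build a zero-, one-, or few-shot prompt depending on how many examples are given."""
    lines = [task, ""]
    for text, label in examples:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    # The final block leaves "Output:" open for the model to complete.
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


task = "Classify the domain as benign or suspicious."
examples = [
    ("login-update.example.net", "suspicious"),  # illustrative label, not real intel
    ("docs.python.org", "benign"),
]
zero_shot = few_shot_prompt(task, [], "example.org")
few_shot = few_shot_prompt(task, examples, "example.org")
print(few_shot)
```

Passing a single example yields a one-shot prompt with the same structure.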

---

### System Prompts

Some platforms (e.g., OpenAI, Anthropic) support **system prompts**—instructions that shape the model’s persona, tone, or behavioral constraints.

**Example system message:**  
> “You are a cybersecurity analyst generating threat intelligence summaries.”

System prompts are not typically hidden; providers often document their usage:

- [Anthropic: System Prompt Guidelines](https://docs.anthropic.com/en/release-notes/system-prompts)

---

### Fine-Tuning vs. In-Context Learning

- **Fine-tuning** modifies the model’s internal weights using domain-specific data. It’s powerful but resource-intensive and harder to maintain.
- **In-context learning** gives examples inline in the prompt. It’s flexible, efficient, and sufficient for many real-world use cases.

---

**Learn more:** [Prompt Engineering Guide](https://www.promptingguide.ai/)

## Core LangChain Concepts

[LangChain](https://python.langchain.com/) is a modular framework for building generative AI applications powered by Large Language Models (LLMs). It simplifies the development of multi-step pipelines, retrieval-augmented workflows, and agent-based reasoning systems.

---

### LangChain Capabilities

LangChain provides key building blocks for LLM applications:

- **Prompt templates** for reusability and structure
- **Memory** for multi-turn interactions
- **Tools** to access external data sources
- **Chains** to compose tasks in sequence or branching logic
- **Agents** for dynamic tool use and reasoning
- **Document loaders, chunkers, and retrievers** for knowledge ingestion

---

### Prompt Templates

**Prompt templates** allow you to define reusable prompts with variables. This is essential for consistent prompt formatting in production use cases.

```python
from langchain_core.prompts import PromptTemplate

template = "Generate a list of domains associated with {malware_family}"
prompt = PromptTemplate.from_template(template)
prompt.format(malware_family="Qakbot")
```

---

### Memory

LLMs are inherently **stateless**, meaning they do not retain context between interactions. LangChain provides memory modules such as `ConversationBufferMemory` that persist prior messages and automatically re-inject them into prompts.

```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
```

When used in a chain:

```python
from langchain.chains import LLMChain  # llm and prompt_template are defined elsewhere
chain = LLMChain(llm=llm, prompt=prompt_template, memory=memory)
```

Memory is critical for applications like chatbots, copilots, or multi-turn reasoning agents.
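
Conceptually, a buffer memory stores each turn and prepends the transcript to the next prompt. The sketch below is a toy stand-in in plain Python, not LangChain's implementation; `BufferMemory` and `prompt_with_history` are hypothetical names.

```python
class BufferMemory:
    """Toy conversation buffer: stores turns and renders them as text."""

    def __init__(self) -> None:
        self.turns: list[tuple[str, str]] = []

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def history(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


def prompt_with_history(memory: BufferMemory, user_input: str) -> str:
    """Prepend the stored history so a stateless model sees earlier turns."""
    past = memory.history()
    prefix = f"{past}\n" if past else ""
    return f"{prefix}Human: {user_input}\nAI:"


memory = BufferMemory()
memory.save("What is Qakbot?", "(model answer)")
print(prompt_with_history(memory, "Which industries did it target?"))
```

Because the whole transcript is resent on every turn, prompt length grows with the conversation.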

---

### Tools

**Tools** allow LLMs to interact with external systems—APIs, web search engines, internal databases, or third-party services. LangChain supports tool integration via descriptors that let agents reason about what tool to use and when.

**Examples of tools:**
- Google/Bing search
- DomainTools, VirusTotal, Wikipedia
- Custom internal APIs (e.g., threat intelligence, ticketing systems)

More: [LangChain Tools Documentation](https://python.langchain.com/docs/modules/tools/)
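
At its core a tool is a name, a description the model can read, and a callable. The following sketch illustrates that idea in plain Python. It is not LangChain's `Tool` class, and `SimpleTool`, `describe_tools`, and `run_tool` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical tool record: a name, a description, and a callable."""
    name: str
    description: str
    func: Callable[[str], str]


def describe_tools(tools: list[SimpleTool]) -> str:
    """Render the tool list as text that could be placed in an agent's prompt."""
    return "\n".join(f"- {t.name}: {t.description}" for t in tools)


def run_tool(tools: list[SimpleTool], name: str, query: str) -> str:
    """Dispatch a query to the tool that was requested by name."""
    by_name = {t.name: t for t in tools}
    if name not in by_name:
        raise KeyError(f"unknown tool: {name}")
    return by_name[name].func(query)


tools = [
    SimpleTool("echo", "Returns the query unchanged.", lambda q: q),
    SimpleTool("length", "Returns the number of characters in the query.", lambda q: str(len(q))),
]
print(describe_tools(tools))
print(run_tool(tools, "length", "qakbot"))  # → 6
```

The descriptions matter as much as the functions: they are the only information a model has when deciding which tool fits a request.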

---

### Chains

**Chains** are workflows composed of modular steps. Each chain links one or more components (e.g., prompt → LLM → post-processor).

Common chain types include:

- `LLMChain`: single prompt-response (Note: LLMChain is deprecated in favor of LCEL - LangChain Expression Language)
- `SimpleSequentialChain`: sequential chaining of steps
- `SequentialChain`: sequential chaining that accepts multiple inputs and outputs
- `RouterChain`: dynamically selects downstream chains based on input

Example with memory:

```python
chain = LLMChain(llm=llm, prompt=prompt_template, memory=ConversationBufferMemory())
```

Chains provide structure and allow you to build reliable pipelines with conditional logic and state.
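
The prompt → LLM → post-processor idea can be shown without any framework. This is a conceptual sketch only: `make_chain` is a hypothetical helper, and `fake_llm` is a stub that echoes its input instead of calling a model.

```python
from functools import reduce
from typing import Callable

Step = Callable[[object], object]


def make_chain(*steps: Step) -> Step:
    """Compose steps left to right, feeding each output into the next step."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def format_prompt(inputs: dict) -> str:
    return f"Generate a list of domains associated with {inputs['malware_family']}"


def fake_llm(prompt: str) -> str:
    # Stub standing in for a real model call; it only echoes the prompt.
    return f"  [model output for: {prompt}]  "


def strip_whitespace(text: str) -> str:
    return text.strip()


chain = make_chain(format_prompt, fake_llm, strip_whitespace)
print(chain({"malware_family": "Qakbot"}))
```

LangChain's `|` operator expresses the same left-to-right composition over its own runnable components.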

---

### Agents

**Agents** enable dynamic, reasoning-driven workflows. Unlike static chains, agents interpret user input, determine which tools or sub-chains to invoke, gather additional data, and synthesize final responses.

They excel at handling ambiguous prompts, complex logic, or open-ended tasks.

More: [LangChain Agents Documentation](https://python.langchain.com/docs/modules/agents/)

## RAG and Embeddings

### Vector Stores

Vector databases (e.g., Chroma, FAISS, Pinecone) store **text embeddings**, which allow similarity searches against known documents or facts.

### RAG (Retrieval-Augmented Generation)

**RAG** improves response quality by retrieving relevant content from an external knowledge source before prompting the model. This enables high-accuracy answers even from models that weren’t explicitly trained on the content.

### Chunking

Long documents are split into smaller **chunks** to improve retrieval accuracy and avoid exceeding token limits. LangChain includes text splitters like `RecursiveCharacterTextSplitter` that let you control chunk size and overlap.
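
To make chunk size and overlap concrete, here is a simplified fixed-window splitter. It is a sketch of the general idea, not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries to break on separators such as paragraphs and sentences). The function name `chunk_text` is an assumption.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
# → ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, so text that straddles a boundary still appears intact in at least one chunk when the overlap is large enough.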

### Embeddings

Embeddings are numeric vectors that represent semantic meaning. They are used for similarity search in vector stores.

- **Local**: e.g., `nomic-embed-text` via Ollama
- **Cloud**: e.g., Titan via AWS Bedrock, HuggingFace SentenceTransformers
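
Similarity search over embeddings usually comes down to comparing vectors, commonly with cosine similarity. The sketch below uses made-up three-dimensional vectors and a plain dictionary in place of a vector store; `cosine_similarity` and `top_k` are illustrative helpers, not a vector database API.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], store: dict[str, list[float]], k: int = 1) -> list[str]:
    """Return the k document ids whose vectors are most similar to the query."""
    ranked = sorted(
        store, key=lambda doc_id: cosine_similarity(query, store[doc_id]), reverse=True
    )
    return ranked[:k]


# Made-up 3-dimensional vectors; real embedding models return far more dimensions.
store = {
    "doc_malware": [0.9, 0.1, 0.0],
    "doc_bbq": [0.0, 0.2, 0.9],
}
print(top_k([1.0, 0.0, 0.1], store))  # → ['doc_malware']
```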

---

### LLM Limitations

- **Hallucination**: Models may fabricate plausible but incorrect information.
- **Context limitations**: Only a fixed number of tokens can be processed per prompt.
- **Data leakage risk**: LLMs trained on sensitive data may inadvertently reveal it.
- **Stateless**: Models forget previous interactions unless provided memory mechanisms.

---

### Understanding Tokenization

LLMs process input/output as **tokens**, not words. Token count affects:

- Cost (in usage-based billing models)
- Whether your full prompt fits in the model’s context window
- Output truncation risk

Useful tools:
- [Claude Tokenizer](https://claude-tokenizer.vercel.app/)
- [TikToken Tokenizer](https://tiktokenizer.vercel.app/)
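
For a quick budget check before calling a model, a character-based estimate can be enough. This sketch assumes roughly four characters per token, a common rule of thumb for English text that is not exact for any particular tokenizer. The 8192-token window in the example is a hypothetical value, not a property of a specific model.

```python
import math

CHARS_PER_TOKEN = 4  # rough rule of thumb for English text; an assumption, not a tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_context(prompt: str, max_output_tokens: int, context_window: int) -> bool:
    """Check that the prompt plus the reserved output budget fits the window."""
    return estimate_tokens(prompt) + max_output_tokens <= context_window


prompt = "why is the sky blue?"
print(estimate_tokens(prompt))  # → 5
print(fits_context(prompt, max_output_tokens=256, context_window=8192))  # → True
```

For billing or hard limits, use the tokenizer that matches the model rather than this estimate.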

## Calling Ollama via LangChain - Zero Shot

More: [ChatOllama API reference](https://python.langchain.com/api_reference/ollama/chat_models/langchain_ollama.chat_models.ChatOllama.html)

In [None]:
from IPython.display import Markdown
from langchain_ollama import ChatOllama 

# Note: Ensure Ollama is running locally and the gemma3 model is installed
# Run: ollama pull gemma3
# If Ollama is not running, you'll get a connection error

llm = ChatOllama(
    model="gemma3",
    temperature=0.8,
    # base_url="http://remote-ip:port", # Ollama default port: 11434
    # timeout=5,
    num_predict=256,  # Use num_predict instead of max_tokens for Ollama
    # other params ...
)

messages = [
    ("system", "You are a helpful assistant."),
    ("human", "why is the sky blue?"),
]

Markdown(llm.invoke(messages).content)

That’s a fantastic question! The blue color of the sky is a really cool phenomenon caused by something called **Rayleigh scattering**. Here’s a breakdown of how it works:

1. **Sunlight is Made of All Colors:** Sunlight might look white to us, but it’s actually composed of all the colors of the rainbow – red, orange, yellow, green, blue, indigo, and violet.

2. **Entering the Atmosphere:** When sunlight enters the Earth's atmosphere, it bumps into tiny air molecules (mostly nitrogen and oxygen).

3. **Scattering of Light:** This bumping causes the light to scatter in different directions.  This scattering is much *more* effective with shorter wavelengths of light – blue and violet light have shorter wavelengths than red and orange.

4. **Rayleigh Scattering in Action:** Rayleigh scattering describes how light is scattered by particles smaller than its wavelength. Because blue light has a shorter wavelength, it’s scattered *much* more strongly than other colors. It’s like throwing a small ball (blue light) and a large ball (red light) at a bumpy wall - the small ball will bounce off in many more directions.

5. **Why Blue, Not Violet?** Violet light

## Using Prompt Templates and LangChain Expression Syntax

This example introduces the use of a `ChatPromptTemplate` combined with an Ollama-backed LLM using LangChain's expression syntax. This pattern allows for reusable, parameterized prompts and demonstrates how to integrate structured prompting into your pipeline. The `|` operator chains the prompt and model together, creating a modular and declarative workflow.

We’ll also simulate a multi-role setup (e.g., system + human) within the prompt string.


In [3]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM

template = """Question: {question}

Answer: Let's think step by step."""

prompt = ChatPromptTemplate.from_template(template)

model = OllamaLLM(model="gemma3")

chain = prompt | model

Markdown(chain.invoke({"question": "System: You are a cybersecurity expert and AI assistant to AWS.\n\nHuman: What is the LummaC2 malware?"}))

Okay, let's break down LummaC2. It's a particularly nasty piece of malware that’s been making waves in the cybersecurity world, primarily targeting organizations in the energy sector. Here’s a step-by-step breakdown of what you need to know:

**1. What is LummaC2?**

* **Initial Infection Vector:** LummaC2 typically begins with spear-phishing emails. These emails are highly targeted and often appear to be legitimate communications, designed to trick users into clicking malicious links or opening infected attachments.
* **Trojan Horse:**  The initial email contains a Trojan horse - a malicious file disguised as something innocuous.  Once executed, the Trojan installs a backdoor onto the victim’s system.
* **C2 (Command and Control) Infrastructure:**  This is the key element. LummaC2 isn't just about stealing data; it’s about persistent, ongoing control of compromised systems.  It establishes a communication channel – a Command and Control (C2) server – allowing the attackers to remotely control the infected machines.

**2. Key Characteristics & Functionality**

* **Sophisticated C2 Communication:** LummaC2 uses a highly sophisticated C2 infrastructure. It doesn’t just rely on simple HTTP requests. It employs a layered approach:
    * **DNS Tunneling:**  This is a critical feature. It uses DNS queries to establish a covert communication channel, bypassing traditional firewall restrictions. The malware queries a malicious domain, transmitting commands and receiving instructions.
    * **TLS/SSL Encryption:**  The communication between the infected machine and the C2 server is encrypted using TLS/SSL, making it very difficult to intercept and analyze.
    * **Dynamic C2 Domains:** The attackers use a pool of domain names and dynamically rotate them to evade detection by security vendors.
* **Data Exfiltration:**  Once control is established, LummaC2 is capable of exfiltrating data – stealing sensitive information.
* **Lateral Movement:** It's not content to just control one machine. It actively scans the network for other vulnerable systems and attempts to move laterally, expanding its reach.
* **Persistence:** LummaC2 is designed to remain persistent on infected systems, ensuring continued control.



**3. Targeting & Sector Focus**

* **Energy Sector:** Initially, LummaC2 was primarily associated with attacks on organizations within the energy industry (specifically, oil and gas). This is what initially drew significant attention.
* **Industrial Control Systems (ICS):** A major concern is that it can target and compromise ICS, which controls critical infrastructure like power plants, refineries, and pipelines.

**4. Detection and Mitigation**

* **DNS Monitoring:** Because of its use of DNS tunneling, monitoring DNS traffic is crucial.  Look for unusual DNS queries.
* **Network Traffic Analysis:** Analyzing network traffic for suspicious patterns – especially encrypted traffic – can uncover C2 communication.
* **Endpoint Detection and Response (EDR):** EDR solutions are important for detecting and responding to threats on individual endpoints.
* **Threat Intelligence:** Stay up-to-date on the latest threat intelligence about LummaC2 and its associated C2 infrastructure.

---

**Resources for Further Research:**

* **CrowdStrike Report:** [https://www.crowdstrike.com/blog/lumma-c2-malware-targeting-energy-sector/](https://www.crowdstrike.com/blog/lumma-c2-malware-targeting-energy-sector/)
* **SecurityWeek Article:** [https://www.securityweek.com/lumma-c2-malware-targeting-energy-sector-a-deep-dive/](https://www.securityweek.com/lumma-c2-malware-targeting-energy-sector)


Do you want me to delve deeper into a specific aspect of LummaC2, such as:

*   Its technical details (e.g., specific C2 protocols)?
*   Specific detection techniques?
*   Its impact on the energy sector?

## Memory

It's nice to be able to ask a question and get an answer, but that seems pretty transactional and impersonal. I'd like the LLM I'm talking to to show me they are listening to me and paying attention to what I am saying. I am, after all, human and I need to be loved or at least feel like it.

But LLMs are cold and heartless (technically, they are stateless, but we are talking about my feelings here). This means that they don’t ‘remember’ interactions from prompt to prompt. You can fine-tune them to persist data, but without updating the weights through expensive training, they don’t ‘remember’ anything beyond what was fixed at the model checkpoint. This means that if I just asked what the LummaC2 malware was and then followed up with a question like "What industry was primarily targeted by this malware?", the model would not be able to answer within the context of LummaC2. Since show beats tell, let's see how that works firsthand.

## Now let's ask the follow-up question...

In [4]:
Markdown(llm.invoke("System: You are a cybersecurity expert and AI assistant.\n\nHuman: What industry was primarily targeted by this malware?").content)

Please provide me with the details about the malware you're referring to! I need information like:

*   **The name of the malware:** (e.g., WannaCry, Emotet, etc.)
*   **A description of its capabilities:** (What did it do? What kind of damage did it cause?)
*   **Any known attack patterns or techniques:** (How did it spread? What vulnerabilities did it exploit?)

Once you give me this information, I can analyze it and tell you which industry was primarily targeted. 

**I can’t answer your question without knowing what malware you’re asking about.**

## LangChain Memory Components: Enabling Stateful Conversations
The memory component of LangChain allows the LLM to become 'stateful'. This quality is quite useful when developing applications driven by LLMs. For instance, a conversational system or a chatbot is required to recall past interactions to maintain a conversation. Without memory, the system would not be able to handle follow-up messages or recollect key pieces of information mentioned earlier in the conversation. 

In this section, we will explore the memory modules provided by LangChain. LangChain offers several types of memory modules depending on the task and the properties of the LLM. We will incorporate memory into chains and examine how it changes the model's responses.

In [5]:
from langchain_ollama import OllamaLLM
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
import pandas as pd
import warnings

# Suppress deprecation warnings from the legacy chain and memory classes
warnings.filterwarnings("ignore")

# Initialize Ollama with the Gemma model
llm = OllamaLLM(model="gemma3")

In [6]:
# Set up conversation memory
cbm_memory = ConversationBufferMemory()

# Adding memory to a conversational chain
chain_with_buffer_memory = ConversationChain(llm=llm, memory=cbm_memory)

# Helper function to prompt and print memory state
def prompt_and_print_memory(prompts, chain_with_memory):
    # Store the responses
    responses = {"input": [], "history": [], "response": []}

    # Repeatedly prompting the chain and observing the memory
    for prompt in prompts:
        response = chain_with_memory.invoke({"input": prompt})
        responses["input"].append(prompt)
        # Read history from the chain's own memory rather than a global variable
        responses["history"].append(chain_with_memory.memory.load_memory_variables({}))
        responses["response"].append(response["response"])

    # Display responses in a dataframe
    df = pd.DataFrame.from_dict(responses)
    with pd.option_context("display.max_colwidth", None):
        display(df)

# Sequence of prompts for demonstration
cbm_prompts = [
    "What is the LummaC2 malware?",
    "What industry was primarily targeted by this malware?",
]

# Invoke helper function
prompt_and_print_memory(cbm_prompts, chain_with_buffer_memory)


Unnamed: 0,input,history,response
0,What is the LummaC2 malware?,"{'history': 'Human: What is the LummaC2 malware? AI: Okay, let’s talk about LummaC2! It’s a really interesting piece of malware, and it’s been a significant focus for cybersecurity researchers, particularly those at Mandiant. Essentially, LummaC2 is a sophisticated, modular malware campaign primarily targeting the industrial sector, specifically focusing on manufacturing companies. Here’s what I know, broken down into key aspects: * **Origin & Attribution:** Initially, it was attributed to the APT28 group, which is a known Russian state-sponsored hacking group associated with Fancy Bear and Cozy Bear. However, recent research has suggested a more complex situation – it’s possible that APT28 *utilized* LummaC2, or that it’s a variant developed by a group with connections to them, but perhaps not a direct operational extension. This is still being investigated actively. * **Modular Architecture:** This is a key characteristic. LummaC2 isn’t just a single piece of malware; it’s comprised of multiple modules, each performing a specific function. These modules include:  * **Command and Control (C2):** This is the central hub, used to communicate with the infected systems and issue commands. It uses a layered C2 infrastructure, making it difficult to detect and disrupt.  * **Data Exfiltration:** Modules designed to steal sensitive data, including intellectual property, trade secrets, and operational data. They employ various techniques like DNS tunneling and encrypted channels.  * **Credential Harvesting:** Modules designed to steal usernames and passwords from systems, expanding the attackers’ access.  * **Lateral Movement:** Modules designed to move through the network, gaining access to more systems. This often involves exploiting vulnerabilities or using stolen credentials.  
* **Custom Modules:** A particularly concerning aspect is the discovery of ""custom modules"" – specifically designed tools tailored to target specific industries and systems. These modules aren’t readily available on the dark web, suggesting they were developed specifically for this campaign. One example was a device control module for programmable logic controllers (PLCs). * **Targets:** As I mentioned before, the initial focus was on the manufacturing sector, with particular interest in companies involved in aerospace, defense, and automotive industries. They targeted systems managing production, quality control, and supply chain operations. * **Techniques:** They employed several techniques, including:  * **Supply Chain Attacks:** Targeting vendors and suppliers of industrial control systems.  * **Phishing:** Using targeted phishing emails to lure victims into installing malware.  * **Exploiting Vulnerabilities:** Leveraging known vulnerabilities in industrial control systems and other software. * **Current Status:** While initially a significant threat, the activity surrounding LummaC2 has decreased. However, researchers continue to monitor the threat landscape for any resurgence or related activity. Do you want me to delve deeper into a specific aspect of LummaC2, like the PLCs, the C2 infrastructure, or perhaps the techniques they used? I can also access more recent reports and analyses if you'd like. Would you like me to show you a timeline of key events related to LummaC2?'}","Okay, let’s talk about LummaC2! It’s a really interesting piece of malware, and it’s been a significant focus for cybersecurity researchers, particularly those at Mandiant. 
Essentially, LummaC2 is a sophisticated, modular malware campaign primarily targeting the industrial sector, specifically focusing on manufacturing companies.\n\nHere’s what I know, broken down into key aspects:\n\n* **Origin & Attribution:** Initially, it was attributed to the APT28 group, which is a known Russian state-sponsored hacking group associated with Fancy Bear and Cozy Bear. However, recent research has suggested a more complex situation – it’s possible that APT28 *utilized* LummaC2, or that it’s a variant developed by a group with connections to them, but perhaps not a direct operational extension. This is still being investigated actively.\n\n* **Modular Architecture:** This is a key characteristic. LummaC2 isn’t just a single piece of malware; it’s comprised of multiple modules, each performing a specific function. These modules include:\n\n * **Command and Control (C2):** This is the central hub, used to communicate with the infected systems and issue commands. It uses a layered C2 infrastructure, making it difficult to detect and disrupt.\n * **Data Exfiltration:** Modules designed to steal sensitive data, including intellectual property, trade secrets, and operational data. They employ various techniques like DNS tunneling and encrypted channels.\n * **Credential Harvesting:** Modules designed to steal usernames and passwords from systems, expanding the attackers’ access.\n * **Lateral Movement:** Modules designed to move through the network, gaining access to more systems. This often involves exploiting vulnerabilities or using stolen credentials.\n * **Custom Modules:** A particularly concerning aspect is the discovery of ""custom modules"" – specifically designed tools tailored to target specific industries and systems. These modules aren’t readily available on the dark web, suggesting they were developed specifically for this campaign. 
One example was a device control module for programmable logic controllers (PLCs).\n\n* **Targets:** As I mentioned before, the initial focus was on the manufacturing sector, with particular interest in companies involved in aerospace, defense, and automotive industries. They targeted systems managing production, quality control, and supply chain operations.\n\n* **Techniques:** They employed several techniques, including:\n\n * **Supply Chain Attacks:** Targeting vendors and suppliers of industrial control systems.\n * **Phishing:** Using targeted phishing emails to lure victims into installing malware.\n * **Exploiting Vulnerabilities:** Leveraging known vulnerabilities in industrial control systems and other software.\n\n* **Current Status:** While initially a significant threat, the activity surrounding LummaC2 has decreased. However, researchers continue to monitor the threat landscape for any resurgence or related activity.\n\nDo you want me to delve deeper into a specific aspect of LummaC2, like the PLCs, the C2 infrastructure, or perhaps the techniques they used? I can also access more recent reports and analyses if you'd like. Would you like me to show you a timeline of key events related to LummaC2?"
1,What industry was primarily targeted by this malware?,"{'history': 'Human: What is the LummaC2 malware? AI: Okay, let’s talk about LummaC2! It’s a really interesting piece of malware, and it’s been a significant focus for cybersecurity researchers, particularly those at Mandiant. Essentially, LummaC2 is a sophisticated, modular malware campaign primarily targeting the industrial sector, specifically focusing on manufacturing companies. Here’s what I know, broken down into key aspects: * **Origin & Attribution:** Initially, it was attributed to the APT28 group, which is a known Russian state-sponsored hacking group associated with Fancy Bear and Cozy Bear. However, recent research has suggested a more complex situation – it’s possible that APT28 *utilized* LummaC2, or that it’s a variant developed by a group with connections to them, but perhaps not a direct operational extension. This is still being investigated actively. * **Modular Architecture:** This is a key characteristic. LummaC2 isn’t just a single piece of malware; it’s comprised of multiple modules, each performing a specific function. These modules include:  * **Command and Control (C2):** This is the central hub, used to communicate with the infected systems and issue commands. It uses a layered C2 infrastructure, making it difficult to detect and disrupt.  * **Data Exfiltration:** Modules designed to steal sensitive data, including intellectual property, trade secrets, and operational data. They employ various techniques like DNS tunneling and encrypted channels.  * **Credential Harvesting:** Modules designed to steal usernames and passwords from systems, expanding the attackers’ access.  * **Lateral Movement:** Modules designed to move through the network, gaining access to more systems. This often involves exploiting vulnerabilities or using stolen credentials.  
* **Custom Modules:** A particularly concerning aspect is the discovery of ""custom modules"" – specifically designed tools tailored to target specific industries and systems. These modules aren’t readily available on the dark web, suggesting they were developed specifically for this campaign. One example was a device control module for programmable logic controllers (PLCs). * **Targets:** As I mentioned before, the initial focus was on the manufacturing sector, with particular interest in companies involved in aerospace, defense, and automotive industries. They targeted systems managing production, quality control, and supply chain operations. * **Techniques:** They employed several techniques, including:  * **Supply Chain Attacks:** Targeting vendors and suppliers of industrial control systems.  * **Phishing:** Using targeted phishing emails to lure victims into installing malware.  * **Exploiting Vulnerabilities:** Leveraging known vulnerabilities in industrial control systems and other software. * **Current Status:** While initially a significant threat, the activity surrounding LummaC2 has decreased. However, researchers continue to monitor the threat landscape for any resurgence or related activity. Do you want me to delve deeper into a specific aspect of LummaC2, like the PLCs, the C2 infrastructure, or perhaps the techniques they used? I can also access more recent reports and analyses if you'd like. Would you like me to show you a timeline of key events related to LummaC2? Human: What industry was primarily targeted by this malware? AI: Okay, let’s get specific about the industry focus! The primary industry targeted by LummaC2 was, and still largely is, the manufacturing sector. However, it wasn’t a blanket targeting – it was heavily concentrated within specific sub-sectors. 
Specifically, the initial and most significant targets were companies involved in: * **Aerospace:** There were several reported incidents targeting aerospace firms, focusing on systems related to aircraft design, manufacturing, and maintenance. * **Defense:** Similar to aerospace, defense contractors were a key area of interest, particularly those involved in weapons systems and related technologies. * **Automotive:** Manufacturers and suppliers within the automotive industry were also targeted, with an emphasis on systems controlling production lines and managing supply chains. While there were instances of targeting companies in other sectors – like electronics – the manufacturing industries, particularly aerospace and defense, were the most heavily impacted and remain the core focus of the LummaC2 campaign. I can pull up some specific examples of companies that were identified as targets, if you’d like. Would you like me to do that, or perhaps explore the specific types of systems within these industries that were most vulnerable to LummaC2 attacks?'}","Okay, let’s get specific about the industry focus! The primary industry targeted by LummaC2 was, and still largely is, the manufacturing sector. 
However, it wasn’t a blanket targeting – it was heavily concentrated within specific sub-sectors.\n\nSpecifically, the initial and most significant targets were companies involved in:\n\n* **Aerospace:** There were several reported incidents targeting aerospace firms, focusing on systems related to aircraft design, manufacturing, and maintenance.\n* **Defense:** Similar to aerospace, defense contractors were a key area of interest, particularly those involved in weapons systems and related technologies.\n* **Automotive:** Manufacturers and suppliers within the automotive industry were also targeted, with an emphasis on systems controlling production lines and managing supply chains.\n\nWhile there were instances of targeting companies in other sectors – like electronics – the manufacturing industries, particularly aerospace and defense, were the most heavily impacted and remain the core focus of the LummaC2 campaign.\n\nI can pull up some specific examples of companies that were identified as targets, if you’d like. Would you like me to do that, or perhaps explore the specific types of systems within these industries that were most vulnerable to LummaC2 attacks?"


## Tools
### We'll use a tool supplied with LangChain and a custom tool we create

[DuckDuckGo](https://duckduckgo.com/) is an internet privacy company best known for its private search engine. The company emphasizes privacy and anonymity as key principles behind all of its products.

Let's create a DuckDuckGo tool that is capable of retrieving results from a web search.

In [7]:
from IPython.display import Markdown
from langchain.tools import Tool
from langchain_community.tools import DuckDuckGoSearchRun

# Create the DuckDuckGo search runner
duckduckgo_search = DuckDuckGoSearchRun()

# Wrap the search runner in a Tool; the agent reads the description to decide when to use it
duckduckgo_tool = Tool(
    name="DuckDuckGoSearch",
    func=duckduckgo_search.run,
    description="useful for when you need to answer questions about current weather and other current events",
)

# Test the DuckDuckGo tool (run() avoids the deprecated direct-call syntax)
Markdown(duckduckgo_tool.run("What is a honeybee?"))

The best-known honey bee species is the western honey bee (Apis mellifera), which was domesticated and farmed (i.e. beekeeping) for honey production and crop pollination. Sep 1, 2025 · A honeybee is any of a small group of social bees that make honey . All honeybees live together in nests or hives. There are two honeybee sexes, male and female, and two female castes. The best-known honey bee species is the western honey bee (Apis mellifera), which was domesticated and farmed (i.e. beekeeping) for honey production and crop pollination. Simply, a Honey Bee is a small vegetarian insect which lives in a highly structured colony with thousands of its sisters (and a few brothers along with one Queen), all working toward the goal of storing enough food (honey) for the winter when flowers are not present. Honey bees store honey in their honeycombs, and it is collected from wild bee colonies, or from hives of domesticated bees, a practice known as beekeeping or apiculture. Honeybees are important pollinators for flowers, fruits, and vegetables . They live on stored honey and pollen all winter and cluster into a ball to conserve warmth. All honeybees are social and... View all Honey bees, also known as “honeybees,” are a group of insect species in the genus Apis . These insects are eusocial, which means they form large, complex societies. They are best known for building hives to store honey, and it is common to farm them for this reason. Aug 2, 2023 · Honey bees, scientifically known as Apis mellifera, are social insects that belong to the family Apidae. These creatures have a highly evolved and complex societal structure, living in well-organized colonies with distinct roles for each member. Honeybees are important pollinators for flowers, fruits, and vegetables . They live on stored honey and pollen all winter and cluster into a ball to conserve warmth. All honeybees are social and... Talking about honeybees is what we do! 
There is so much to say and learn (we couldn’t fit it on this page if we tried) but here’s a taste of what ... Additionally, honeybees in Ethiopia have made a slow recovery following a two year war. ... when jolted with static from flies, aphids, honeybees ...

## Custom Tools


You can define custom tools using the `@tool` decorator.

When writing custom tools, it is critical to write clear docstrings, use Python's type annotation syntax, and apply the `@tool` decorator supplied by LangChain (imported in the next cell). The agent uses the docstring and the type annotations to reason about what a tool is useful for and how to call it. If your agent isn't picking the correct tools, or is calling them incorrectly, these are the usual culprits.

Keep in mind that a custom tool can do anything you can express in code, including calling the APIs of much larger systems. In effect, the possibilities are unbounded.
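To make the role of the docstring and type annotations concrete, here is a minimal sketch in plain Python of the kind of metadata a tool decorator can collect from a function. The names `describe_tool` and `word_count` are hypothetical and exist only for this illustration; this is not LangChain's implementation, just an approximation of the information an agent ends up reasoning over.

```python
import inspect
from typing import get_type_hints


def describe_tool(func):
    """Collect the name, description, and argument types of a function."""
    hints = get_type_hints(func)
    return_type = hints.pop("return", None)
    return {
        "name": func.__name__,
        # inspect.getdoc returns the docstring with indentation cleaned up
        "description": inspect.getdoc(func) or "",
        "args": {name: hint.__name__ for name, hint in hints.items()},
        "returns": return_type.__name__ if return_type else None,
    }


# Hypothetical example tool: a clear docstring and type annotations
def word_count(text: str) -> int:
    """Count the whitespace-separated words in the input text."""
    return len(text.split())


spec = describe_tool(word_count)
print(spec)
```

If the docstring were missing or the annotations omitted, the `description` and `args` entries would be empty, which is roughly why an agent struggles to pick or call a poorly documented tool.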

Here is a simple custom tool that returns the current date.

In [8]:
from datetime import date

from langchain.tools import tool


@tool
def curr_date(text: str) -> str:
    """Return today's date. Use this for any question that requires
    knowing today's date. The input should always be an empty string.
    Any date math should happen outside this function."""
    return str(date.today())

In [9]:

# Define the date tool
date_tool = Tool(
    name="DateTool", func=curr_date, description="Useful to retrieve the current date"
)

# Test the date tool (run() avoids the deprecated direct-call syntax)
print(date_tool.run(""))

2025-09-07


## Agents

Finally, we get to the good stuff: agents. Generally speaking, an agent has a model and a set of tools at its disposal. It takes a prompt, breaks it down into the steps needed to produce a meaningful response and, using the descriptions of its tools, decides which tool is most likely to accomplish each step. It then executes those tools, pulls in the resulting context, and responds to your prompt.

They are simple to build and powerful to use.

Let's see if we can solve the dot counting problem now...

In [10]:
from IPython.display import Markdown
from langchain.agents import AgentType, initialize_agent
from langchain_experimental.tools import PythonREPLTool
from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="gemma3",
    temperature=0.1,
    num_predict=256,  # Use num_predict instead of max_tokens for Ollama
)

# Define the Python REPL tool
python_repl = PythonREPLTool()

# Initialize the agent using the local model and the Python REPL tool
python_agent = initialize_agent(
    [python_repl],
    llm,  # The local Gemma model initialized above
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
)

# Test the Python Agent with a simple input
result = python_agent.invoke(
    {"input": "How many dots are in the string ......................................?"}
)

# Display the result using Markdown
Markdown(result['output'])



> Entering new AgentExecutor chain...


Python REPL can execute arbitrary code. Use with caution.


Thought: The string contains a series of dots. I need to count the number of dots.
Action: Python_REPL
Action Input: `string = "....................................."; print(len(string))`
Observation: 37

Thought: Thought: I now know the final answer
Final Answer: 37

> Finished chain.


37
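Because the agent copies the string into its `Action Input`, any transcription slip changes the answer, so it is worth checking the count without the model in the loop. A minimal sketch in plain Python (the helper name `count_dots` is hypothetical, and the 12-dot string is constructed for the example rather than taken from the prompt above):

```python
def count_dots(text: str) -> int:
    """Return the number of '.' characters in text."""
    return text.count(".")


# Build a prompt with a known number of dots so the result is easy to verify
prompt = "How many dots are in the string " + "." * 12 + "?"
print(count_dots(prompt))  # prints 12
```

To check a real run, paste the exact string from your own prompt in place of the constructed one and compare the result with the agent's `Final Answer`.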

## Pulling it all together: letting the model reason its way to the answer we need

### Agent with Automatic Tool Selection
The real power of agents shows when they are allowed to reason about which tools can solve the different tasks or sub-tasks needed to respond to a prompt. This is what makes them seem intelligent, but remember that they are not truly intelligent: they are performing sophisticated pattern recognition.

The agent reasoning flow works like this:

1. **User Input** → The user submits a query or request
2. **LLM Reasoning** → The LLM analyzes the request and determines an approach
3. **Tool Selection** → Based on tool descriptions, the LLM selects the appropriate tool(s)
4. **Tool Execution** → The selected tools are executed with parameters from the LLM
5. **Result Synthesis** → The LLM combines tool outputs into a coherent response

Hopefully, the model decides correctly and you get great results. Sometimes you have to prompt the model to use the right tools.

For example, if a model is trying to solve a math problem without using your math-specific tool, you can just add "You are bad at math so always use the wolfram tool to solve math problems" in the prompt. Simple guidance like this is usually enough to get the model working the way you intend.

This reasoning ability to select appropriate tools creates a flexible system that can handle diverse queries by combining specialized tools with general language understanding. The agent can break down complex problems, identify which tools would be most helpful for each component, and then integrate the results into a coherent response.
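The flow above can be sketched without any LLM at all. In the toy example below, simple word overlap between the query and each tool description stands in for the model's reasoning; that substitution is an assumption made purely for illustration, and names such as `ToyTool`, `select_tool`, `run_agent`, and `EchoSearch` are hypothetical. A real agent delegates the selection step to the LLM.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass
class ToyTool:
    name: str
    description: str
    func: Callable[[str], str]


def select_tool(query: str, tools: list[ToyTool]) -> ToyTool:
    """Stand-in for LLM reasoning: pick the tool whose description
    shares the most words with the query."""
    query_words = set(query.lower().split())

    def overlap(tool: ToyTool) -> int:
        return len(query_words & set(tool.description.lower().split()))

    return max(tools, key=overlap)


def run_agent(query: str, tools: list[ToyTool]) -> str:
    tool = select_tool(query, tools)        # tool selection
    observation = tool.func(query)          # tool execution
    return f"[{tool.name}] {observation}"   # result synthesis


toy_tools = [
    ToyTool("DateTool", "retrieve the current date", lambda _: str(date.today())),
    ToyTool("EchoSearch", "search the web for weather and news",
            lambda q: f"results for: {q}"),
]

print(run_agent("what is the current date", toy_tools))
print(run_agent("search for weather in Washington DC", toy_tools))
```

Even in this toy version, the tool descriptions drive the choice, which is why vague or overlapping descriptions tend to produce wrong tool selections.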

In [11]:
from langchain.agents import initialize_agent

# Define a list of available tools for the agent
tools = [
    duckduckgo_tool,
    date_tool,
]

# Initialize the agent with access to all the tools in the list
agent_executor = initialize_agent(
    tools, 
    llm, 
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, 
    verbose=True, 
    handle_parsing_errors=True,
)

In [12]:
agent_executor.invoke(
    {
        "input": """
        System: You are not good at determining dates and times.  When you are asked about weather reports, you should first make sure you know what date is being asked about.
        If you are asked about weather and a day was not specified, you should use the date_tool to get today's date and then use the duckduckgo_tool to find out the weather for the date.
        When asking about weather, you should always include the date in your search for weather so you get the weather report for the correct date.
        
        Human: I am in Washington DC.  Will I need an umbrella today?
        """
    }
)



> Entering new AgentExecutor chain...
I need to find out the weather in Washington DC today to determine if I need an umbrella. Since the date was not specified, I should use the DateTool to get today's date and then use the DuckDuckGoSearch tool to find the weather for that date.
Action: DateTool
Action Input: ''
Observation: 2025-09-07
Thought: Okay, I now have today's date as 2025-09-07. I will use the DuckDuckGoSearch tool to find the weather in Washington DC for this date.
Action: DuckDuckGoSearch
Action Input: 'weather in Washington DC on 2025-09-07'
Observation: Washington , District of Columbia weather forecast for Sunday, September 7 , 2025 . Get the latest on temperature, precipitation, wind speed, and UV. Plan your day with accurate weather updates. Washington , D . C . Weather Forecast for September 2025 is based on long term prognosis and previous years' statistical data. 4 days ago · It features all 

{'input': "\n        System: You are not good at determining dates and times.  When you are asked about weather reports, you should first make sure you know what date is being asked about.\n        If you are asked about weather and a day was not specified, you should use the date_tool to get today's date and then use the duckduckgo_tool to find out the weather for the date.\n        When asking about weather, you should always include the date in your search for weather so you get the weather report for the correct date.\n\n        Human: I am in Washington DC.  Will I need an umbrella today?\n        ",
 'output': "The weather forecast for Washington DC on 2025-09-07 includes temperature, precipitation, wind speed, and UV information. It is based on long-term prognosis and previous years' statistical data."}

# RAG Demonstration with ChromaDB

The cells below build a Retrieval-Augmented Generation (RAG) system with ChromaDB as the vector store. The code loads PDF documents from `./data/pdfs`, splits them into overlapping chunks, creates vector embeddings with the local `nomic-embed-text` model served by Ollama, and stores them in a ChromaDB database persisted to disk. It then answers questions with a `RetrievalQA` chain: for each question, the retriever fetches the five most similar chunks, and the LLM (`gemma3`) generates an answer from that retrieved context. If no PDFs are found, a single sample document is used as a fallback so the cells still run. Because the vector database is persisted, it can be reused in future sessions without regenerating the embeddings.
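To see what chunking with overlap does before running it on real PDFs, here is a simplified sketch in plain Python. It uses fixed-size character windows only; `RecursiveCharacterTextSplitter`, used in the code below with `chunk_size=1000` and `chunk_overlap=200`, additionally prefers to break on separators such as paragraph and sentence boundaries, which this sketch ignores. The function name `chunk_text` and the tiny sizes are assumptions chosen for readability.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

Each chunk repeats the last four characters of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk. That is the motivation for a non-zero overlap.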

In [13]:
import os
import pandas as pd
from IPython.display import Markdown, display

# LangChain imports
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain_community.llms import Ollama
from chromadb.config import Settings

In [14]:
# Initialize Ollama LLM (Gemma model)
# Note: Ensure both gemma3 and nomic-embed-text models are pulled:
# ollama pull gemma3
# ollama pull nomic-embed-text
llm = Ollama(model="gemma3")

# Define embeddings model using Ollama with a compatible embedding model
embedding_model = OllamaEmbeddings(model="nomic-embed-text")

# Create directory for PDFs
pdf_dir = "./data/pdfs"
os.makedirs(pdf_dir, exist_ok=True)
print(f"PDF directory is at: {os.path.abspath(pdf_dir)}")

# Text splitter for chunking documents
recur_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    separators=["\n\n", "\n", ".", "!", "?", ",", " ", ""],
    length_function=len
)

# Load and process PDFs using PyMuPDFLoader for each file in the directory
from langchain_core.documents import Document

data = []
for filename in os.listdir(pdf_dir):
    if filename.endswith(".pdf"):
        loader = PyMuPDFLoader(file_path=os.path.join(pdf_dir, filename))
        data.extend(loader.load())

# Ensure there's data, create a demo if not
if len(data) == 0:
    data = [
        Document(page_content="Sample text about LummaC2.", metadata={"source": "sample.pdf", "page": 1}),
    ]

# Split documents into chunks
data_splits = recur_splitter.split_documents(data)

# Create ChromaDB vector store (default embedded mode, persisted to ./chroma_llm_training; avoids server/tenant setup)
vectordb = Chroma.from_documents(
    documents=data_splits,
    embedding=embedding_model,
    collection_name="rag-demo",
    persist_directory="./chroma_llm_training"
)

# Define RAG QA prompt template
qa_template = """
You are a helpful assistant answering questions based on the provided context.
Use the following pieces of context to answer the user's question. If unsure, state you don't know.

Context:
{context}

Question: {question}

Answer:
"""
qa_prompt_template = PromptTemplate.from_template(qa_template)

# Define RetrievalQA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectordb.as_retriever(search_kwargs={"k": 5}),
    return_source_documents=True,
    chain_type="stuff",
    chain_type_kwargs={"prompt": qa_prompt_template},
)

# Test questions
questions = [
    "What is LummaC2?",
    "What is a ClickFix?",
    "Why is the sky blue?"
]

# Run questions through the QA chain
for question in questions:
    print(f"\nQuestion: {question}")
    response = qa_chain.invoke({"query": question})
    display(Markdown(response["result"]))

    print("\nSources:")
    seen_sources = set()
    for doc in response["source_documents"][:2]:
        source_info = f"{os.path.basename(doc.metadata['source'])}, Page {doc.metadata['page']}"
        if source_info not in seen_sources:
            print(source_info)
            seen_sources.add(source_info)

PDF directory is at: /Users/schwartz/src/genai-essentials/data/pdfs

Question: What is LummaC2?


Based on the provided context, LummaC2 is a malware. It is identified by the following:

*   An executable file named LummaC2.exe (SHA256 hashes: 2F31D00FEEFE181F2D8B69033B382462FF19C35367753E6906ED80F815A7924F, 4D74F8E12FF69318BE5EB383B4E56178817E84E83D3607213160276A7328AB5D)
*   DLL binaries: iphlpapi.dll and winhttp.dll

The context indicates that it is being tracked by FBI and CISA.


Sources:
aa25-141b-threat-actors-deploy-lummac2-malware-to-exfiltrate-sensitive-data-from-organizations.pdf, Page 8

Question: What is a ClickFix?


Based on the provided context, a ClickFix is a social engineering tactic used by malicious websites. These websites masquerade as legitimate services like Google Chrome, Facebook, PDFSimpli, and reCAPTCHA to trick users into revealing information or installing malware.


Sources:
clickfix-attacks-sector-alert-tlpclear.pdf, Page 1

Question: Why is the sky blue?


I do not know. The provided context discusses BlueDelta’s activities, including credential harvesting and the use of backdoors like MASEPIE. It does not contain information about why the sky is blue.


Sources:
CTA-RU-2024-0530.pdf, Page 18
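
The last question shows that a retriever always returns its nearest chunks, even when nothing in the corpus is relevant: a source was listed for a question about the sky. One possible mitigation is to drop results below a similarity cutoff before they reach the prompt. The sketch below uses toy vectors in plain Python; the vectors, the `0.5` cutoff, and the helper names are illustrative assumptions, not values taken from this notebook.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunks, k=2, min_score=0.5):
    """Return up to k (score, text) pairs scoring at least min_score."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(reverse=True)  # highest similarity first
    return [(score, text) for score, text in scored[:k] if score >= min_score]


# Toy "embeddings": real ones come from the embedding model
toy_chunks = [
    ("chunk about malware", [1.0, 0.0, 0.0]),
    ("chunk about phishing", [0.8, 0.6, 0.0]),
]

print(retrieve([1.0, 0.1, 0.0], toy_chunks))  # both chunks pass the cutoff
print(retrieve([0.0, 0.0, 1.0], toy_chunks))  # unrelated query: empty list
```

LangChain retrievers expose a comparable option through `as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": ...})`; a suitable threshold depends on the embedding model and would need to be tuned against your own documents.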
