# MISP & PyMISP - Threat Intelligence Analysis Workshop

## What is MISP?

MISP (originally the Malware Information Sharing Platform) is an open-source threat intelligence platform for sharing, storing, and correlating Indicators of Compromise (IOCs) and other threat intelligence data. Its development is led by CIRCL (Computer Incident Response Center Luxembourg), and it has become a cornerstone of collaborative cybersecurity efforts worldwide.

## Core MISP Concepts and Terminology

### Events and Attributes
- **Events**: Container objects that represent security incidents, campaigns, or threat intelligence reports
- **Attributes**: Individual pieces of threat intelligence (IP addresses, domains, file hashes, URLs, etc.)
- **Objects**: Complex structures that group related attributes together (e.g., a file object containing hash, filename, and size)
- **Relationships**: Connections between events, attributes, and objects that show how threats relate to each other
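
To make the nesting of these concepts concrete, here is a minimal sketch of an event as a plain Python dictionary, loosely following the shape of MISP's JSON format (`Event` containing `Attribute` and `Object` lists). The event content and the `iter_values` helper are hypothetical and exist only for illustration; nothing here comes from a real instance.

```python
import json

# Hypothetical, hand-written event used only to show the nesting;
# none of these values come from the training instance.
event = {
    "Event": {
        "info": "Example phishing campaign",
        "Attribute": [
            {"type": "domain", "category": "Network activity",
             "value": "login.example.test", "to_ids": True},
        ],
        "Object": [
            {
                "name": "file",
                "Attribute": [
                    {"object_relation": "filename", "type": "filename",
                     "value": "invoice.pdf.exe", "to_ids": False},
                    {"object_relation": "size-in-bytes", "type": "size-in-bytes",
                     "value": "48128", "to_ids": False},
                ],
            }
        ],
    }
}


def iter_values(event_dict):
    """Yield (type, value) for standalone attributes, then for object attributes."""
    body = event_dict["Event"]
    for attr in body.get("Attribute", []):
        yield attr["type"], attr["value"]
    for obj in body.get("Object", []):
        for attr in obj.get("Attribute", []):
            yield attr["type"], attr["value"]


values = list(iter_values(event))
print(json.dumps(values, indent=2))
```

Attributes inside an object carry an `object_relation` that names their role in the group, which is what distinguishes them from standalone attributes of the same type.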

### MISP Galaxies and Taxonomies
- **Taxonomies**: Classification systems used to tag and categorize threat intelligence
- **Tags**: Labels applied to events and attributes using taxonomies and galaxy information
- **Galaxies**: Knowledge bases containing structured threat intelligence (threat actors, attack patterns, tools, etc.)
- **Clusters**: Specific entries within galaxies (e.g., "APT29" within the threat-actor galaxy)

### MITRE ATT&CK Integration
- **Techniques**: Specific attack methods catalogued in the MITRE ATT&CK framework
- **Tactics**: High-level categories of attack behavior (e.g., Persistence, Defense Evasion)
- **Procedures**: Specific implementations of techniques by threat actors
- **Sub-techniques**: More granular variations of main techniques

## PyMISP Library

PyMISP is the official Python library for interacting with MISP instances:
- **API Wrapper**: Provides Python methods for the MISP REST API endpoints
- **Object-Oriented**: Represents MISP data as Python objects for easy manipulation
- **Search Capabilities**: Powerful querying with filters, boolean logic, and temporal ranges
- **Data Export**: Multiple formats for integration with other security tools

## Key Features of MISP

### Threat Intelligence Sharing
- **Community Collaboration**: Share IOCs and threat data across organizations
- **Automated Feeds**: Consume and distribute threat intelligence through standardized formats
- **Access Control**: Granular permissions for data sharing and visibility

### Data Correlation
- **Automatic Correlation**: MISP automatically identifies relationships between similar attributes
- **Manual Relationships**: Analysts can create explicit connections between threats
- **Contextual Enrichment**: Add metadata and context to raw threat indicators
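
The idea behind automatic correlation can be sketched offline as grouping attribute values by the events they appear in. This is a simplified illustration, not MISP's actual correlation engine; the `correlate` helper and the sample pairs are made up.

```python
from collections import defaultdict

# Made-up (event_id, attribute_value) pairs, not real data.
observations = [
    (1, "198.51.100.7"),
    (1, "evil.example.test"),
    (2, "198.51.100.7"),
    (3, "other.example.test"),
]


def correlate(pairs):
    """Return values seen in more than one event, mapped to the sorted event ids."""
    events_by_value = defaultdict(set)
    for event_id, value in pairs:
        events_by_value[value].add(event_id)
    return {
        value: sorted(ids)
        for value, ids in events_by_value.items()
        if len(ids) > 1
    }


correlations = correlate(observations)
print(correlations)
```

Only `198.51.100.7` is reported because it is the only value shared by two events; values seen once have nothing to correlate with.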

### Integration Capabilities
- **API Access**: RESTful API for programmatic access and automation
- **Export Formats**: STIX, CSV, JSON, XML, and custom formats
- **SIEM Integration**: Direct feeds to security information and event management systems

## Documentation

- API Specs: https://www.misp-project.org/openapi/
- Cheatsheet: https://www.misp-project.org/misp-training/cheatsheet.pdf
- MISP REST API: https://github.com/MISP/misp-training/blob/main/a.7-rest-API/Training%20-%20Using%20the%20API%20in%20MISP.ipynb
    - query: https://github.com/MISP/misp-training/blob/main/a.7-rest-API/query-misp.ipynb
- PyMISP: https://github.com/MISP/PyMISP/blob/main/docs/tutorial/FullOverview.ipynb

## Exercises
### PyMISP

### PyMISP Setup and Authentication

Before we can query MISP for threat intelligence, we need to establish a connection to a MISP instance. This process involves:

**Authentication Requirements:**
- **API Key (AuthKey)**: A unique identifier that grants access to the MISP instance
- **Base URL**: The web address of the MISP server
- **SSL Configuration**: Security settings for encrypted communication

**Training Instance Details:**
For this workshop, we'll use the community training instance at `https://training.misp-community.org`. This is a public MISP instance designed for learning and testing purposes.

**Security Considerations:**
- API keys provide access equivalent to your user account permissions
- Never share API keys or commit them to version control
- The training instance uses a self-signed certificate, so this workshop disables SSL certificate verification
- In production environments, use a certificate the client trusts and keep verification enabled
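
One way to keep the key out of notebooks and version control is to read the connection settings from environment variables. The sketch below assumes hypothetical variable names (`MISP_BASEURL`, `MISP_API_KEY`, `MISP_VERIFY_TLS`); they are a convention chosen for this example, not something MISP or PyMISP defines.

```python
import os


def load_misp_settings(env=None):
    """Return (url, key, verify_tls) from environment-style settings.

    The variable names are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    url = env.get("MISP_BASEURL", "")
    key = env.get("MISP_API_KEY", "")
    if not url or not key:
        raise RuntimeError("Set MISP_BASEURL and MISP_API_KEY before connecting")
    # Verify certificates by default; opting out must be explicit.
    verify_tls = env.get("MISP_VERIFY_TLS", "1") != "0"
    return url, key, verify_tls


# Example with an in-memory mapping instead of the real environment.
settings = load_misp_settings({
    "MISP_BASEURL": "https://training.misp-community.org",
    "MISP_API_KEY": "example-key-not-real",
    "MISP_VERIFY_TLS": "0",
})
print(settings[0], settings[2])
```

The three values could then be passed to the constructor as `pymisp.PyMISP(url, key, ssl=verify_tls)`.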

**Connection Process:**
The code below will authenticate with the MISP instance and verify the connection by retrieving version information.

In [1]:
import pymisp
import urllib3
import getpass

MISP_BASEURL = "https://training.misp-community.org"
MISP_API_KEY = getpass.getpass("Enter your MISP AuthKey:")

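# Silence the warning urllib3 emits for each unverified HTTPS request
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

misp = pymisp.PyMISP(
    MISP_BASEURL,
    MISP_API_KEY,
    ssl=False,  # Skip certificate verification (training instance only)
)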
print(f"Connected to MISP {misp.root_url} running version: {misp.version['version']}")

Enter your MISP AuthKey: ········


Connected to MISP https://training.misp-community.org running version: 2.5.17.2


### Exercise 1.0: Retrieve User Account Information

**Objective**: Understand your MISP account details and permission levels.

**Key Concepts:**
- **User ID**: Unique numerical identifier for your account in the MISP database
- **Email**: Your account's contact email and login identifier
- **Role**: Determines your permissions and access levels within MISP

**MISP Role Types:**
- **Read Only**: Can view shared threat intelligence but cannot create or modify data
- **User**: Standard access with ability to create events and add attributes
- **Publisher**: Can publish events for community sharing
- **Org Admin**: Administrative rights within your organization
- **Site Admin**: Full administrative access to the MISP instance

**Why User Information Matters:**
Understanding your account details helps you know what operations you can perform and what data you can access. Different roles have different capabilities for creating, modifying, and sharing threat intelligence.

**API Endpoint**: `/users/view/me`

The code below retrieves comprehensive user information including account details, role permissions, and user settings.

In [2]:
user, role, settings = misp.get_user(pythonify=True, expanded=True)
print(f"User ID: {user.id}, Email: {user.email}, Role: {role.name}")

User ID: 635, Email: luciano.righetti@circl.lu, Role: admin


### Exercise 1.1: Retrieve a Specific MISP Event by UUID

**Objective**: Learn how to query individual MISP events using their unique identifiers.

**Key Concepts:**
- **UUID**: A 36-character string that uniquely identifies each MISP event
- **Event Information**: Human-readable description of the security incident or threat
- **Event Structure**: Events contain attributes, objects, tags, and relationships
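
Before sending a query, it can be useful to check that a string really is a 36-character UUID. The sketch below uses only the standard library; the `is_canonical_uuid` helper is a hypothetical name introduced here.

```python
import uuid


def is_canonical_uuid(text):
    """True if text is a UUID in the hyphenated 36-character form."""
    try:
        # uuid.UUID also accepts other spellings (e.g. without hyphens),
        # so compare against the canonical string form.
        return str(uuid.UUID(text)) == text.lower()
    except (ValueError, AttributeError, TypeError):
        return False


print(is_canonical_uuid("83a7add9-76d7-47ef-9f4b-ebd07fbe880d"))
print(is_canonical_uuid("not-a-uuid"))
```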

**Understanding MISP Events:**
Events are the primary containers for threat intelligence in MISP. Each event represents:
- A security incident (e.g., malware infection, data breach)
- A threat campaign (e.g., APT group activity)
- A collection of related indicators
- Contextual information about threats

**Event Components:**
- **Metadata**: Creation date, last modification, threat level
- **Attributes**: Individual IOCs (IPs, domains, hashes, etc.)
- **Objects**: Structured data groupings
- **Tags**: Classification and categorization
- **Relationships**: Connections to other events

**API Endpoint**: `/events/view/[uuid]`

The code below retrieves a specific event and displays its basic information including the event description and UUID.

In [3]:
event = misp.get_event("83a7add9-76d7-47ef-9f4b-ebd07fbe880d", pythonify=True)
print(event.info, event.uuid)

Kobalos - Linux threat to high performance computing infrastructure 83a7add9-76d7-47ef-9f4b-ebd07fbe880d


### Exercise 1.2: Search Events by Threat Actor Attribution

**Objective**: Demonstrate how to search for events attributed to specific threat actors using MISP galaxies.

**Key Concepts:**
- **Threat Actors**: Individuals or groups that conduct cyber attacks
- **MISP Galaxies**: Structured knowledge bases containing threat intelligence
- **Galaxy Clusters**: Specific entries within galaxies (e.g., individual threat actors)
- **Attribution**: The process of linking attacks to specific threat actors

**Understanding MISP Galaxies:**
Galaxies are knowledge databases that provide structured information about:
- **Threat Actors**: APT groups, cybercriminal organizations, nation-state actors
- **Attack Patterns**: MITRE ATT&CK techniques and tactics
- **Tools**: Malware families, attack tools, and software
- **Sectors**: Industry verticals and organizational types

**Threat Actor Intelligence:**
- **Group Names**: Both public names and aliases used by threat actors
- **Attribution Confidence**: Levels of certainty in linking attacks to actors
- **Historical Activity**: Timeline of observed campaigns and attacks
- **Targeting Patterns**: Preferred victims and attack methods

**About "Deadeye Jackal":**
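Deadeye Jackal is the name CrowdStrike uses for the group better known as the Syrian Electronic Army (SEA), which is why the search results for this tag include an SEA investigation. Other well-known threat actor names include APT29, Lazarus Group, and FIN7.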

**Tag Format**: `misp-galaxy:threat-actor="Deadeye Jackal"`
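
Because the tag contains embedded double quotes, building it by hand is error-prone. The following sketch shows one way to compose and split tags of this shape; `galaxy_tag` and `parse_galaxy_tag` are hypothetical helpers, not PyMISP functions.

```python
import re

_TAG_RE = re.compile(r'^misp-galaxy:([^=]+)="(.*)"$')


def galaxy_tag(galaxy, cluster):
    """Compose a galaxy cluster tag such as misp-galaxy:threat-actor="Name"."""
    return f'misp-galaxy:{galaxy}="{cluster}"'


def parse_galaxy_tag(tag):
    """Return (galaxy, cluster) for a galaxy tag, or None for any other tag."""
    match = _TAG_RE.match(tag)
    return match.groups() if match else None


tag = galaxy_tag("threat-actor", "Deadeye Jackal")
print(tag)
print(parse_galaxy_tag(tag))
```

Using single-quoted Python strings or an f-string avoids the backslash escaping that a double-quoted string literal would need.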

**API Endpoints**: 
- `/events/index` - Basic event listing with filters
- `/events/restSearch` - Advanced search capabilities

The code below searches for all events tagged with the specified threat actor and returns the matching events.

In [4]:
events = misp.search("events", tag="misp-galaxy:threat-actor=\"Deadeye Jackal\"", pythonify=True)
print(events)

[<MISPEvent(info=Investigation Syrian Electronic Army Activities - Domain(s) Take over via Melbourne IT registrar), <MISPEvent(info=Another Foo bar)]


### Exercise 1.3: Extract URL Attributes for Protective Tools

**Objective**: Learn how to query specific attribute types and understand the distinction between intelligence context and detection-ready indicators.

**Key Concepts:**
- **Attributes**: Individual pieces of threat intelligence within MISP events
- **Attribute Types**: Standardized categories like URL, IP, domain, hash, filename
- **to_ids Flag**: Boolean indicator showing if an attribute is suitable for detection systems

**Understanding the to_ids Flag:**
This critical flag determines how attributes should be used:
- **to_ids = 1 (True)**: Attribute is validated and suitable for automated detection/blocking
- **to_ids = 0 (False)**: Attribute is for intelligence context only, not for blocking
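
The effect of the flag can be illustrated offline with a few hand-written attribute dictionaries. The sample values are invented, and the helper accepts `True`, `1`, and `"1"` on the assumption that the flag may arrive either as a boolean or as a 0/1 value depending on how the data was exported.

```python
# Invented sample attributes; not taken from the training instance.
attributes = [
    {"type": "url", "value": "http://malicious.example.test/gate.php", "to_ids": True},
    {"type": "url", "value": "https://blog.example.test/report.pdf", "to_ids": False},
    {"type": "url", "value": "http://phish.example.test/login", "to_ids": "1"},
    {"type": "url", "value": "https://vendor.example.test/advisory", "to_ids": 0},
]


def is_detection_ready(attr):
    """True when the attribute is flagged for automated detection."""
    return attr.get("to_ids") in (True, 1, "1")


blocklist = [a["value"] for a in attributes if is_detection_ready(a)]
context_only = [a["value"] for a in attributes if not is_detection_ready(a)]
print(blocklist)
print(context_only)
```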

**Why the Distinction Matters:**
- **False Positive Prevention**: Avoid blocking legitimate infrastructure
- **Quality Control**: Ensure only validated indicators reach security tools
- **Context Preservation**: Maintain reference information without triggering alerts
- **Operational Safety**: Prevent disruption of legitimate business operations

**URL Attributes in Threat Intelligence:**
URLs can represent various types of threats:
- **Malicious URLs**: Command and control servers, malware distribution sites
- **Phishing URLs**: Fake websites designed to steal credentials
- **Exploit URLs**: Sites hosting exploit kits or malicious code
- **Reference URLs**: External sources and research links (typically to_ids=0)

**Protective Tools Integration:**
URLs marked with to_ids=1 can be used in:
- Web proxy blocking lists
- DNS filtering systems
- Browser security extensions
- Network security appliances

**API Endpoint**: `/attributes/restSearch`

The code below queries for URL attributes, but note that it currently retrieves all URLs regardless of the to_ids flag. For protective tools, you would typically add `to_ids=1` to the search parameters.

In [5]:
urls = misp.search("attributes", type_attribute=["url"], pythonify=True)
for url in urls:
    print(url.value)

https://analyst1.com/file-assets/RANSOM-MAFIA-ANALYSIS-OF-THE-WORLD%E2%80%99S-FIRST-RANSOMWARE-CARTEL.pdf
http://securechannel.org/gateway/authenticate.jsp
http://hostconfig.xyz/system/configure.php
https://log-collector.app/report/status.json
http://update-checker.com/verify/sync.php
https://cdn-status.net/api/v1/connect
https://telemetry-cloud.info/ping/update.bin
http://dns-sync.cc/resolve/lookup.php
https://img-cache.pro/assets/img_load.php
http://net-monitor.co/data/fetch_logs.php
https://proxy-update.site/init/connect.php
http://securechannel.org/gateway/authenticate.jsp
http://hostconfig.xyz/system/configure.php
https://log-collector.app/report/status.json
http://update-checker.com/verify/sync.php
https://cdn-status.net/api/v1/connect
https://telemetry-cloud.info/ping/update.bin
http://dns-sync.cc/resolve/lookup.php
https://img-cache.pro/assets/img_load.php
http://net-monitor.co/data/fetch_logs.php
https://proxy-update.site/init/connect.php
http://securechannel.org/gateway/authe

### Exercise 1.4: Query Recent Published IP Addresses

**Objective**: Demonstrate temporal filtering capabilities and understand MISP's publication workflow for current threat intelligence.

**Key Concepts:**
- **Publication Status**: Whether an event has been validated and released for sharing
- **Publication Timestamp**: When an event was officially published
- **Temporal Filtering**: Searching for threats within specific time ranges
- **IP Attributes**: Network addresses associated with threat activity

**Understanding MISP Publication Workflow:**
MISP events are either unpublished or published, and MISP records when each state change happens:
- **Draft (unpublished)**: Events still being developed, not yet ready for sharing
- **Published**: Events that have been validated and approved for community sharing
- **Timestamp Tracking**: MISP tracks both creation and publication times

**Why Recent Published Data Matters:**
- **Current Threats**: Focus on active and emerging threats
- **Operational Relevance**: Prioritize recently observed threat infrastructure
- **Detection Effectiveness**: Ensure security tools block current threats
- **Intelligence Freshness**: Maintain up-to-date situational awareness

**Temporal Filter Formats:**
- **Relative Periods**: "7d" (7 days), "30d" (30 days), "12h" (12 hours)
- **Specific Dates**: ISO format dates for precise time ranges
- **Publication vs Creation**: Different timestamps for different purposes
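
To show what a relative period means in practice, the sketch below converts a short-hand string into a cutoff time on the client side. It assumes the unit letters `d`, `h`, `m`, and `s` and handles nothing else; `parse_relative` and `cutoff` are hypothetical helpers, and the server performs its own interpretation when it receives such a string.

```python
import re
from datetime import datetime, timedelta, timezone

# Unit letters handled by this sketch (an assumption, see lead-in).
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_relative(period):
    """Convert a string such as '7d' or '12h' into a timedelta."""
    match = re.fullmatch(r"(\d+)([dhms])", period)
    if match is None:
        raise ValueError(f"unsupported period: {period!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def cutoff(period, now=None):
    """Return the earliest timestamp that still falls inside the period."""
    now = now or datetime.now(timezone.utc)
    return now - parse_relative(period)


reference = datetime(2024, 1, 8, tzinfo=timezone.utc)
print(cutoff("7d", now=reference).isoformat())
```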

**IP Address Types:**
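- **ip-src**: Source IP address, typically the attacker's address as seen in inbound connections
- **ip-dst**: Destination IP address, typically attacker-controlled infrastructure (such as a C2 server) that compromised hosts connect to
- **Both Types**: Querying both covers inbound and outbound IP indicators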

**Operational Applications:**
Recent published IP addresses can be used for:
- Firewall rule updates
- Intrusion detection system signatures
- Network monitoring and alerting
- Threat hunting activities
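
Before IP values reach a firewall or IDS, a sanity check can keep private, loopback, or malformed entries out of the rule set. This is a minimal sketch using the standard `ipaddress` module; the `routable_ips` helper and the sample list are illustrative, and a real pipeline would likely add allow-lists for its own infrastructure.

```python
import ipaddress


def routable_ips(values):
    """Keep only syntactically valid, globally routable IP addresses."""
    kept = []
    for value in values:
        try:
            address = ipaddress.ip_address(value.strip())
        except ValueError:
            # Not a single IP address (for example free text or a CIDR range).
            continue
        if address.is_global:
            kept.append(str(address))
    return kept


# Mixed sample: one public address plus entries that should be dropped.
sample = ["81.17.24.130", "192.168.1.10", "127.0.0.1", "not-an-ip", "10.0.0.5"]
print(routable_ips(sample))
```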

**API Endpoint**: `/attributes/restSearch`

The code below searches for IP addresses that were published in the last 7 days and are currently in published status, ensuring we get current, validated threat intelligence.

In [6]:
ips = misp.search("attributes", type_attribute=["ip-src", "ip-dst"], publish_timestamp="7d", published=True, pythonify=True)
for ip in ips:
    print(ip.value)

81.17.24.130
194.26.29.251
194.26.29.84
185.245.85.251
185.245.84.227
179.43.189.218
179.43.187.47
179.43.175.108
179.43.175.38
179.43.162.55
179.43.133.202
154.21.20.82
112.132.218.45
112.51.253.153
90.131.156.107
79.124.8.66
46.101.242.222
5.226.139.66
179.43.142.42
179.43.176.60
111.111.111.111
45.141.87.11
194.26.29.95
194.26.29.98
62.173.140.223


### Exercise 1.5: Advanced Multi-Criteria Search with MITRE ATT&CK and Sector Targeting

**Objective**: Demonstrate complex search capabilities using boolean logic to combine MITRE ATT&CK techniques with sector-specific targeting.

**Key Concepts:**
- **Boolean Logic**: AND, OR, NOT operations for combining search criteria
- **MITRE ATT&CK**: Framework cataloging adversary tactics and techniques
- **Sector Targeting**: Industry-specific threat intelligence
- **Multi-Tag Filtering**: Searching using multiple galaxy classifications simultaneously

**Understanding MITRE ATT&CK Integration:**
MISP integrates deeply with the MITRE ATT&CK framework:
- **Techniques**: Specific attack methods (e.g., T1554 - Compromise Client Software Binary)
- **Tactics**: High-level attack categories (e.g., Persistence, Defense Evasion)
- **Sub-techniques**: More granular variations of main techniques
- **Procedure Examples**: Real-world implementations by threat actors

**About T1554 - Compromise Client Software Binary:**
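- **ID**: T1554. It should not be confused with T1154, which was the unrelated "Trap" technique in older ATT&CK versions
- **Naming**: Newer ATT&CK releases call this technique "Compromise Host Software Binary"; the galaxy tag used in this exercise keeps the earlier name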
- **Tactic**: Persistence (maintaining access to systems)
- **Description**: Modifying legitimate software to include malicious functionality
- **Sophistication**: Requires significant technical skill and system access

**Sector Classification:**
MISP uses sector galaxies to categorize targets:
- **Government, Administration**: Public sector organizations
- **Financial Services**: Banks, credit unions, financial institutions
- **Healthcare**: Hospitals, clinics, medical organizations
- **Critical Infrastructure**: Power, water, transportation systems
- **Education**: Schools, universities, research institutions

**Why Government Sector + T1554 Matters:**
This combination represents sophisticated attacks against high-value targets:
- **Advanced Persistent Threats (APTs)**: Often used by nation-state actors
- **High Impact**: Government systems contain sensitive national information
- **Persistence Goals**: Long-term access for espionage or disruption
- **Detection Challenges**: Modified legitimate software can evade detection

**Boolean Search Logic:**
The search uses AND logic, so an event is returned only if it carries both tags:
1. The government sector targeting tag
2. The T1554 attack technique tag
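
The AND semantics can be reproduced locally as a subset test on each event's tags. The sketch below runs on invented events and is only meant to show the logic; the server evaluates the real query, and `matches_all` is a hypothetical helper.

```python
SECTOR = 'misp-galaxy:sector="Government, Administration"'
TECHNIQUE = 'misp-galaxy:mitre-attack-pattern="Compromise Client Software Binary - T1554"'

# Invented events with their tag lists.
sample_events = [
    {"info": "Event A", "tags": [SECTOR, TECHNIQUE, "tlp:white"]},
    {"info": "Event B", "tags": [SECTOR]},
    {"info": "Event C", "tags": [TECHNIQUE, "tlp:green"]},
]


def matches_all(event_tags, required_tags):
    """True when every required tag is present on the event (AND logic)."""
    return set(required_tags) <= set(event_tags)


matching = [
    e["info"] for e in sample_events
    if matches_all(e["tags"], [SECTOR, TECHNIQUE])
]
print(matching)
```

Event B and Event C each carry only one of the two tags, so neither is returned.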

**Tag Formats:**
- Sector: `misp-galaxy:sector="Government, Administration"`
- Technique: `misp-galaxy:mitre-attack-pattern="Compromise Client Software Binary - T1554"`

**API Endpoint**: `/events/restSearch`

The code below performs a complex search combining sector targeting with specific MITRE ATT&CK techniques to find relevant government-focused threats.

In [7]:
events = misp.search("events", tag={
        "AND": [
            "misp-galaxy:sector=\"Government, Administration\"",
            "misp-galaxy:mitre-attack-pattern=\"Compromise Client Software Binary - T1554\""
        ]
    }, pythonify=True)
print(events)

[<MISPEvent(info=Kobalos - Linux threat to high performance computing infrastructure)]


### Task - Extract domains from MISP
- Objective: Use PyMISP to find detection-ready URL and domain attributes from the training instance and extract unique domains.

- Instructions:
    1. Search attributes for type `url` and `domain` with `to_ids=1`, `publish_timestamp="30d"`, and `published=True`.
    2. For each `url` result, extract the hostname (e.g., with `urllib.parse.urlparse`); `domain` values can be used as they are. Collect the unique domains.
    3. Print the sorted list of domains and the total count.

- Hint: use a `set` to deduplicate domains and `sorted()` before printing.
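
As a starting point for step 2, the sketch below shows hostname extraction and deduplication on a few invented values, without the MISP query itself. The `to_domain` helper is a hypothetical name, and treating any value without `://` as a bare domain is a simplifying assumption.

```python
from urllib.parse import urlparse


def to_domain(value):
    """Return the lowercase hostname of a URL, or the value itself for a bare domain."""
    host = urlparse(value).hostname if "://" in value else value
    return host.lower() if host else None


# Invented values standing in for attribute values returned by a search.
raw_values = [
    "http://Example.test/gateway/login.php",
    "https://example.test:8443/api",
    "cdn.example.test",
    "http://",
]

domains = set()
for value in raw_values:
    domain = to_domain(value)
    if domain:
        domains.add(domain)

print(sorted(domains), len(domains))
```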

In [None]:

# Your code here...