[Urlscan] Create an enrichment connector #2155

Merged
merged 6 commits into from
May 29, 2024
4 changes: 4 additions & 0 deletions internal-enrichment/urlscan-enrichment/.dockerignore
@@ -0,0 +1,4 @@
src/config.yml
src/__pycache__
src/logs
src/*.gql
4 changes: 4 additions & 0 deletions internal-enrichment/urlscan-enrichment/.gitignore
@@ -0,0 +1,4 @@
config.yml
__pycache__
logs
*.gql
15 changes: 8 additions & 7 deletions internal-enrichment/urlscan-enrichment/Dockerfile
@@ -1,14 +1,15 @@
FROM python:3.11-alpine
ENV CONNECTOR_TYPE=INTERNAL_ENRICHMENT

# Install Python modules
RUN apk --no-cache add git build-base libmagic libffi-dev libxml2-dev libxslt-dev
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install --no-cache-dir -r /tmp/requirements.txt
# Copy the worker
COPY src /opt/opencti-connector-urlscan-enrichment

# Copy the connector
COPY src /opt/connector
WORKDIR /opt/connector
# Install Python modules
# hadolint ignore=DL3003
RUN apk --no-cache add git build-base libmagic libffi-dev && \
cd /opt/opencti-connector-urlscan-enrichment && \
pip3 install --no-cache-dir -r requirements.txt && \
apk del git build-base

# Expose and entrypoint
COPY entrypoint.sh /
104 changes: 103 additions & 1 deletion internal-enrichment/urlscan-enrichment/README.md
@@ -1 +1,103 @@
# URLScan Enrichment connector
# OpenCTI URLScan Enrichment Connector

## Introduction

URLScan (https://urlscan.io/) is an online service that allows you to scan URLs to analyze and detect potential security threats. It provides a platform where users can submit links to be scanned to obtain information about the page's content, loaded external resources, potential threats, and other relevant security details.

## Requirements

- pycti

## Configuration variables

There are a number of configuration options, which are set either in `docker-compose.yml` (for Docker) or in `config.yml` (for manual deployment).
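
A minimal sketch of how a setting might be resolved from either source. The helper name `get_setting` is hypothetical, and the precedence shown (environment variable first, then `config.yml`, then a default) is an assumption rather than something this README states. A plain dictionary stands in for the parsed `config.yml`.

```python
import os


def get_setting(env_name, yaml_path, config, default=None, environ=os.environ):
    """Return the environment value if set, else the nested config value, else default.

    Hypothetical helper: the precedence order is an assumption.
    """
    if env_name in environ:
        return environ[env_name]
    node = config
    for key in yaml_path:
        # Stop as soon as the path leaves the nested dictionaries.
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Stand-in for a parsed config.yml (manual deployment).
config = {"urlscan_enrichment": {"visibility": "unlisted"}}

visibility = get_setting(
    "URLSCAN_ENRICHMENT_VISIBILITY",
    ["urlscan_enrichment", "visibility"],
    config,
    default="public",
)
```

With Docker, the value would come from the `URLSCAN_ENRICHMENT_VISIBILITY` variable in `docker-compose.yml`; with a manual deployment, from the `visibility` key in `config.yml`.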

### OpenCTI environment variables

Below are the parameters you'll need to set for OpenCTI:

| Parameter | config.yml | Docker environment variable | Mandatory | Description |
|---------------|------------|-----------------------------|-----------|------------------------------------------------------|
| OpenCTI URL | url | `OPENCTI_URL` | Yes | The URL of the OpenCTI platform. |
| OpenCTI Token | token | `OPENCTI_TOKEN` | Yes | The default admin token set in the OpenCTI platform. |

### Base connector environment variables

Below are the parameters you'll need to set for running the connector properly:

| Parameter | config.yml | Docker environment variable | Default | Mandatory | Description |
|-------------------|-------------------|---------------------------------|-----------|-----------|--------------------------------------------------------------------------------------------------|
| Connector ID | id | `CONNECTOR_ID` | / | Yes | A unique `UUIDv4` identifier for this connector instance. |
| Connector Name | name | `CONNECTOR_NAME` | `URLScan` | Yes | Name of the connector. |
| Connector Scope | scope | `CONNECTOR_SCOPE` | / | Yes | Scope of the connector. Available values: one submission type (`url`, `hostname`, or `domain-name`) plus the search types `ipv4-addr` and `ipv6-addr`. |
| Run and Terminate | run_and_terminate | `CONNECTOR_RUN_AND_TERMINATE` | `False` | No | If `True`, the connector runs once and then exits. Accepted values: `True` or `False`. |
| Log Level | log_level | `CONNECTOR_LOG_LEVEL` | / | Yes | Determines the verbosity of the logs. Options are `debug`, `info`, `warn`, or `error`. |
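
The scope rule can be checked before the connector starts. The sketch below is illustrative only: `check_scope` is a hypothetical helper, not part of the connector, and it flags the case where auto-enrichment is combined with more than one submission type, which can cause looping ingestions.

```python
SUBMISSION_TYPES = {"url", "hostname", "domain-name"}
SEARCH_TYPES = {"ipv4-addr", "ipv6-addr"}


def check_scope(scope, auto):
    """Return a list of warnings for a comma-separated CONNECTOR_SCOPE value."""
    entries = {item.strip().lower() for item in scope.split(",") if item.strip()}
    warnings = []

    # Anything outside the documented submission and search types is suspicious.
    unknown = entries - SUBMISSION_TYPES - SEARCH_TYPES
    if unknown:
        warnings.append("unknown scope entries: " + ", ".join(sorted(unknown)))

    # Several submission types plus auto-enrichment may create an ingestion loop.
    submissions = entries & SUBMISSION_TYPES
    if auto and len(submissions) > 1:
        warnings.append(
            "several submission types with auto-enrichment may loop: "
            + ", ".join(sorted(submissions))
        )
    return warnings


for message in check_scope("url,domain-name,ipv4-addr", auto=True):
    print(message)
```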

### URLScan Enrichment connector environment variables

Below are the parameters you'll need to set for URLScan Enrichment connector:

| Parameter | config.yml | Docker environment variable | Default | Mandatory | Description |
|--------------------------------------|-------------------------|---------------------------------------------------|-----------|------------|-------------------------------------------------------------------------------------------------------------------------|
| URLScan Enr. Api Key | api_key | `URLSCAN_ENRICHMENT_API_KEY` | / | Yes | URLScan API Key |
| URLScan Enr. Api Base Url | api_base_url | `URLSCAN_ENRICHMENT_API_BASE_URL` | / | Yes | URLScan Base Url |
| URLScan Enr. Import Screenshot | import_screenshot | `URLSCAN_ENRICHMENT_IMPORT_SCREENSHOT` | `true` | Yes | Whether to import the screenshot of the scan submitted to URLScan into OpenCTI. |
| URLScan Enr. Visibility | visibility | `URLSCAN_ENRICHMENT_VISIBILITY` | `public` | Yes | URLScan offers several levels of visibility for submitted scans: `public`, `unlisted`, `private` |
| URLScan Enr. Search filtered by date | search_filtered_by_date | `URLSCAN_ENRICHMENT_SEARCH_FILTERED_BY_DATE` | `>now-1y` | Yes | Date filter applied to URLScan searches. Examples: `>now-1h`, `>now-1d`, `>now-1y`, `[2022 TO 2023]`, `[2022/01/01 TO 2023/12/01]` |
| URLScan Enr. Max TLP | max_tlp | `URLSCAN_ENRICHMENT_MAX_TLP` | / | Yes | Do not send any data to URLScan if the TLP of the observable is greater than MAX_TLP |
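
The Max TLP rule amounts to comparing two markings on an ordered scale. The following is a minimal sketch, not the connector's actual code: the names `TLP_RANK` and `is_allowed` are hypothetical, and the ranking assumes `TLP:CLEAR` and `TLP:WHITE` are equivalent.

```python
# Assumed ordering, from least to most restrictive.
TLP_RANK = {
    "TLP:CLEAR": 0,
    "TLP:WHITE": 0,
    "TLP:GREEN": 1,
    "TLP:AMBER": 2,
    "TLP:AMBER+STRICT": 3,
    "TLP:RED": 4,
}


def is_allowed(observable_tlp, max_tlp):
    """Return True if the observable's TLP does not exceed the configured maximum."""
    return TLP_RANK[observable_tlp.upper()] <= TLP_RANK[max_tlp.upper()]


# With URLSCAN_ENRICHMENT_MAX_TLP=TLP:AMBER, a TLP:RED observable is not sent.
if not is_allowed("TLP:RED", "TLP:AMBER"):
    print("Observable skipped: TLP is greater than MAX_TLP")
```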


## Deployment

### Docker Deployment

Before building the Docker image, set the version of pycti in `requirements.txt` to match the version of OpenCTI you are running, for example `pycti==6.1.3`. Otherwise pip installs the latest version, and the OpenCTI SDK sometimes fails to initialize.

Build a Docker Image using the provided `Dockerfile`.

Example:

```shell
# Replace the IMAGE NAME with the appropriate value
docker build . -t [IMAGE NAME]:latest
```

Make sure to replace the environment variables in `docker-compose.yml` with the appropriate configurations for your
environment. Then, start the Docker container with the provided `docker-compose.yml`:

```shell
docker compose up -d
# -d for detached
```

### Manual Deployment

Create a file `config.yml` based on the provided `config.yml.sample`.

Replace the configuration variables (especially the "**ChangeMe**" variables) with the appropriate configurations for
your environment.

Install the required Python dependencies (preferably in a virtual environment):

```shell
pip3 install -r requirements.txt
```

Then, start the connector from `urlscan-enrichment/src`:

```shell
python3 main.py
```

## Usage

After installation, the connector requires minimal interaction. Its behavior is controlled by the settings in your `docker-compose.yml` or `config.yml`.

## Warnings

- If `CONNECTOR_AUTO` (`auto` in `config.yml`) is set to `true`, choose only one submission scope (`url`, `hostname`, or `domain-name`) to avoid looping ingestions.
- Example of a looping ingestion: the scope includes both URL and domain name. Enriching a URL retrieves many entities, including domain names. Those domain names are then enriched too and can return new URLs, which creates an infinite loop.

- For IPv4 and IPv6 observables, the connector only generates an external reference in OpenCTI that links to a URLScan search. You can refine the search period with the `search_filtered_by_date` setting.

- While the analysis is still in progress, the Result API endpoint will respond with an HTTP status code of 404. The connector's polling logic is to wait 10 seconds and retry 12 times, for a maximum wait time of 2 minutes, until the analysis is complete or the maximum wait time is reached.
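
The polling behavior described in the last warning can be sketched as follows. This is a hedged illustration, not the connector's code: `wait_for_result` and `fetch_status` are hypothetical names, and `fetch_status` stands in for whatever call queries the Result API and returns the HTTP status code.

```python
import time


def wait_for_result(fetch_status, retries=12, delay=10, sleep=time.sleep):
    """Poll until fetch_status() stops returning 404 or the retries run out.

    Returns the first non-404 status code, or None if every attempt returned 404.
    With the defaults, the total wait is 12 * 10 seconds = 2 minutes.
    """
    for _ in range(retries):
        status = fetch_status()
        if status != 404:
            return status
        # Analysis still in progress: wait before the next attempt.
        sleep(delay)
    return None


# Usage with a fake status source: two 404 responses, then a 200.
responses = iter([404, 404, 200])
final_status = wait_for_result(lambda: next(responses), sleep=lambda seconds: None)
```

Passing `sleep` as a parameter is a design choice for this sketch only; it lets the loop be exercised without actually waiting.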
29 changes: 17 additions & 12 deletions internal-enrichment/urlscan-enrichment/docker-compose.yml
@@ -3,15 +3,20 @@ services:
connector-urlscan-enrichment:
image: opencti/connector-urlscan-enrichment:6.1.5
environment:
- CONNECTOR_NAME=connector-urlscanio
- CONNECTOR_SCOPE=Url,Domain-Name,Hostname
- OPENCTI_URL=http://opencti:8080
- OPENCTI_TOKEN= ChangeMe
- CONNECTOR_ID= ChangeMe
- CONNECTOR_CONFIDENCE_LEVEL=100 # From 0 (Unknown) to 100 (Fully trusted).
- CONNECTOR_LOG_LEVEL=info
- CONNECTOR_AUTO=true
- URLSCAN_API_KEY=ChangeMe
- CONNECTOR_WANT_RESULTS=true
- CONNECTOR_DOMAIN_ENRICHMENT_COUNT=5 # Maximum Number of domain enrichment results added to notes.
restart: always
# OpenCTI's generic execution parameters:
- OPENCTI_URL=http://localhost
- OPENCTI_TOKEN=ChangeMe
# Connector's generic execution parameters:
- CONNECTOR_ID=ChangeMe
- CONNECTOR_NAME=Urlscan
- CONNECTOR_SCOPE=url,ipv4-addr,ipv6-addr
- CONNECTOR_AUTO=false
- CONNECTOR_LOG_LEVEL=error
# Connector's custom execution parameters:
- URLSCAN_ENRICHMENT_API_KEY=ChangeMe
- URLSCAN_ENRICHMENT_API_BASE_URL=https://urlscan.io/api/v1/
- URLSCAN_ENRICHMENT_IMPORT_SCREENSHOT=true
- URLSCAN_ENRICHMENT_VISIBILITY=public # Available values : public, unlisted, private
- URLSCAN_ENRICHMENT_SEARCH_FILTERED_BY_DATE=>now-1y # Available : ">now-1h", ">now-1d", ">now-1y", "[2022 TO 2023]", "[2022/01/01 TO 2023/12/01]"
- URLSCAN_ENRICHMENT_MAX_TLP=TLP:AMBER # Required, Available values: TLP:CLEAR, TLP:WHITE, TLP:GREEN, TLP:AMBER, TLP:AMBER+STRICT, TLP:RED
restart: always
7 changes: 5 additions & 2 deletions internal-enrichment/urlscan-enrichment/entrypoint.sh
@@ -1,4 +1,7 @@
#!/bin/sh

# Start the connector (WORKDIR is /opt/connector as set in the Dockerfile)
python3 main.py
# Correct working directory
cd /opt/opencti-connector-urlscan-enrichment

# Start the connector
python3 main.py
19 changes: 19 additions & 0 deletions internal-enrichment/urlscan-enrichment/src/config.yml.sample
@@ -0,0 +1,19 @@
opencti:
url: "http://localhost:8080"
token: "ChangeMe"

connector:
id: "ChangeMe"
type: "INTERNAL_ENRICHMENT"
name: "UrlScan"
scope: "url,ipv4-addr,ipv6-addr" # Available values => scope-submission: url or hostname or domain-name / scope-search: ipv4-addr,ipv6-addr
auto: false # Enable/disable auto-enrichment of observables
log_level: "error"

urlscan_enrichment:
api_key: "ChangeMe"
api_base_url: "https://urlscan.io/api/v1/" # Required
import_screenshot: false
visibility: "public" # Available values : public, unlisted, private
search_filtered_by_date: ">now-2d" # Available : ">now-1d", ">now-1y", "[2022 TO 2023]", "[2022/01/01 TO 2023/12/01]"
max_tlp: "TLP:AMBER" # Required, Available values: TLP:CLEAR, TLP:WHITE, TLP:GREEN, TLP:AMBER, TLP:AMBER+STRICT, TLP:RED