SmartData-Polito/Holoscope

Holoscope – Distributed Edge Platform for Cybersecurity Data Collection and Machine Learning

Overview

Holoscope is a distributed edge platform for cybersecurity data collection and machine learning. It provides four capabilities:

  1. Passive Measurement Probes: Deployment of passive network probes for performance measurements, including darknet monitoring, to track ongoing network scanning activities.

  2. Active Cybersecurity Probes: Deployment of a honeynet (a distributed network of honeypots at the edge) that monitors ongoing network attacks. It includes low-interaction honeypots (Cowrie, Nginx) and a reactive network telescope (UDP Responder) that crafts stateless replies to incoming traffic.

  3. Testbed for Vulnerabilities: Deployment of vulnerable applications that can be exploited by internal nodes or serve as high-interaction honeypots.

  4. Federated Learning: Distributed training of ML tasks using data collected by the above probes. This federated learning approach allows the development of ML models on distributed data without requiring direct data exchange between nodes.
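
To make item 4 concrete, here is a minimal sketch of federated averaging, one common way to combine models trained on separate nodes. It is an assumption that Holoscope aggregates this way; the README only states that training is federated and that no raw data is exchanged. The function name `federated_average` and the `(weights, num_examples)` update format are hypothetical.

```python
from typing import List, Sequence, Tuple


def federated_average(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Average model weights from several nodes, weighted by example count.

    Each update is (weights, num_examples). Only these weights leave a node;
    the traffic collected by its probes stays local.
    """
    if not updates:
        raise ValueError("at least one update is required")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    dim = len(updates[0][0])
    if any(len(weights) != dim for weights, _ in updates):
        raise ValueError("all weight vectors must have the same length")

    averaged = [0.0] * dim
    for weights, n in updates:
        for i, w in enumerate(weights):
            # A node with more local examples contributes proportionally more.
            averaged[i] += w * n / total
    return averaged
```

For example, a node with 1 example and weights `[1.0, 2.0]` combined with a node with 3 examples and weights `[3.0, 4.0]` yields `[2.5, 3.5]`.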

The platform is designed to be lightweight, stable, and scalable. It is built on K3s for Kubernetes management and on Ansible for automated deployment and configuration.

If you are interested in joining the network, open an issue or contact us.


Project Structure

├── applications/             # Containerized applications
│   ├── clickhouse/           # ClickHouse database for network event storage
│   ├── collector-sync/       # Packet capture and log synchronization
│   ├── cowrie/               # SSH/Telnet honeypot
│   ├── darknet/              # Darknet monitoring probes
│   ├── idarkvec/             # Federated learning IP reputation application
│   ├── l4responder/          # Layer 4 response simulator
│   ├── nginx/                # Honeypot web server
│   ├── toolbox/              # Network management DaemonSet (iptables, forwarding)
│   └── udp-responder/        # Reactive network telescope (UDP/TCP responder)
├── infrastructure/           # Platform infrastructure
│   ├── ansible/              # Ansible automation
│   │   ├── inventory/        # Environment configurations
│   │   ├── playbooks/        # Deployment playbooks
│   │   └── roles/            # Ansible roles
│   ├── helm/                 # Helm charts for Kubernetes
│   └── vagrant/              # Local development environment
└── README.md

Architecture

  • K3s: A lightweight Kubernetes distribution for running services on edge nodes
  • Ansible: Automation tool for managing deployment and configuration
  • Docker: Containerization of applications
  • Helm: Kubernetes package manager for application deployment
  • Vagrant: Local development environment setup

Available Applications

Security Monitoring

  • Darknet: Passive network monitoring and scanning detection
  • L4Responder: Layer 4 protocol response simulation
  • UDP Responder: Reactive network telescope that listens for incoming UDP/TCP packets, crafts stateless replies, and logs traffic to ClickHouse
  • Cowrie: SSH/Telnet honeypot
  • Nginx: Honeypot web server logging HTTP/HTTPS reconnaissance activity
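
The "stateless reply" behaviour described for the UDP Responder can be illustrated with a short sketch. This is not the project's implementation: the reply policy (echoing at most 32 bytes), the event fields, and the names `craft_reply`, `handle_datagram`, and `serve_once` are all assumptions made for illustration. The point is that the reply depends only on the incoming datagram, so the responder keeps no per-client state.

```python
import socket
from datetime import datetime, timezone
from typing import Dict, Tuple


def craft_reply(payload: bytes) -> bytes:
    """Placeholder reply policy (assumption): echo at most 32 bytes."""
    return payload[:32]


def handle_datagram(payload: bytes, src: Tuple[str, int]) -> Tuple[bytes, Dict]:
    """Build the reply and a log record from a single datagram.

    Nothing is remembered between calls, which is what makes the
    responder stateless.
    """
    reply = craft_reply(payload)
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "src_ip": src[0],
        "src_port": src[1],
        "payload_len": len(payload),
        "reply_len": len(reply),
    }
    return reply, event


def serve_once(sock: socket.socket) -> Dict:
    """Receive one datagram on a bound UDP socket, answer it, return the event."""
    payload, src = sock.recvfrom(65535)
    reply, event = handle_datagram(payload, src)
    sock.sendto(reply, src)
    return event
```

In a real deployment the returned event would be written to a store such as ClickHouse; that step is left out here because the README does not describe the schema.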

Data Infrastructure

  • ClickHouse: Columnar database for storing and querying network event data
  • Collector-Sync: Packet capture on worker nodes with centralized log synchronization
  • Toolbox: Privileged DaemonSet for managing iptables rules and network forwarding on each node

Machine Learning

  • IDarkVec: Federated learning platform for IP reputation with Flower server/client architecture

Prerequisites

Development Environment (Single Node)

  • Operating System: Linux (e.g., Ubuntu 24.04)
  • Resources: Adequate CPU and memory for running at least 3 VMs for testing
  • Dependencies:
    • Vagrant
    • VirtualBox or Libvirt
    • Ansible
    • Docker

Production Environment (Multiple Nodes)

  • Operating System: Linux on all nodes
  • Resources:
    • Master: At least 2 CPUs, 16 GB of memory, and sufficient storage (about 20 GB for retaining honeypot logs)
    • Agents: Similar to master requirements
  • Network: Secure connectivity between nodes (WireGuard VPN supported)

Quick Start

1. Development Environment Setup

For local development using Vagrant:

cd infrastructure/vagrant
./install.sh  # Automated setup on Linux
# or manually:
vagrant up

See the Vagrant README for detailed setup instructions.

2. Production Deployment

Step 1: Configure Inventory

Edit the inventory files to match your environment:

# Development environment
infrastructure/ansible/inventory/environments/dev/hosts.yml

# Production environment  
infrastructure/ansible/inventory/environments/prod/hosts.yml

Step 2: Initialize Infrastructure

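Deploy the K3s cluster and basic infrastructure. The commands in this and the following steps use the dev inventory; for production, pass inventory/environments/prod/hosts.yml instead: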

cd infrastructure/ansible
ansible-playbook -i inventory/environments/dev/hosts.yml playbooks/site.yml --ask-vault-password

Step 3: Deploy Container Registry

Set up a local container registry for storing application images:

ansible-playbook -i inventory/environments/dev/hosts.yml playbooks/registry.yml

Step 4: Build Application Images

Build and push all application Docker images to the registry:

ansible-playbook -i inventory/environments/dev/hosts.yml playbooks/build.yml

Step 5: Deploy Applications

Deploy selected applications based on your hosts.yml configuration:

ansible-playbook -i inventory/environments/dev/hosts.yml playbooks/deploy.yml
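
Steps 2 to 5 always run the same four playbooks in the same order, so they can be chained in a small wrapper. The sketch below is a convenience under stated assumptions, not part of the repository: the helper names `build_commands` and `run_pipeline` are hypothetical, the script is assumed to run from the repository root, and `--ask-vault-password` is passed only to `site.yml` because that is the only command shown with it above.

```python
import subprocess
from typing import List

# Playbooks in the order given in Steps 2-5, with any extra flags.
PLAYBOOKS = [
    ("playbooks/site.yml", ["--ask-vault-password"]),
    ("playbooks/registry.yml", []),
    ("playbooks/build.yml", []),
    ("playbooks/deploy.yml", []),
]


def build_commands(env: str = "dev") -> List[List[str]]:
    """Return the ansible-playbook invocations for the given environment."""
    if env not in ("dev", "prod"):
        raise ValueError(f"unknown environment: {env!r}")
    inventory = f"inventory/environments/{env}/hosts.yml"
    return [
        ["ansible-playbook", "-i", inventory, playbook, *extra]
        for playbook, extra in PLAYBOOKS
    ]


def run_pipeline(env: str = "dev", dry_run: bool = True,
                 ansible_dir: str = "infrastructure/ansible") -> None:
    """Print the commands (dry run) or execute them, stopping on first failure."""
    for cmd in build_commands(env):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True, cwd=ansible_dir)
```

Calling `run_pipeline("prod")` prints the four commands without executing anything; pass `dry_run=False` to run them.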

Configuration

Environment Selection

The platform supports multiple environments:

  • Development: inventory/environments/dev/
  • Production: inventory/environments/prod/

Application Selection

Configure which applications to deploy by editing the group variables in your inventory:

# Example: infrastructure/ansible/inventory/environments/dev/group_vars/all.yml
deploy_applications:
  - cowrie
  - darknet
  - idarkvec
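
A typo in `deploy_applications` is easy to miss, so it can help to check the list before deploying. The sketch below assumes that each entry must match a directory name under `applications/` as listed in Project Structure; the README does not state this explicitly, although its own example (`cowrie`, `darknet`, `idarkvec`) is consistent with it. The helper name `unknown_applications` is hypothetical.

```python
from typing import Iterable, List

# Directory names under applications/ as listed in Project Structure.
KNOWN_APPLICATIONS = {
    "clickhouse",
    "collector-sync",
    "cowrie",
    "darknet",
    "idarkvec",
    "l4responder",
    "nginx",
    "toolbox",
    "udp-responder",
}


def unknown_applications(selected: Iterable[str]) -> List[str]:
    """Return the selected names that match no known application, in order."""
    return [name for name in selected if name not in KNOWN_APPLICATIONS]
```

An empty result means every selected name was recognised; for instance, `unknown_applications(["cowrie", "cowry"])` returns `["cowry"]`.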

Network Configuration

The platform relies on the following network configuration:

  • WireGuard VPN: For secure inter-node communication
  • Darknet Monitoring: For passive/active network experiments
  • Network Policies: Kubernetes network policies for application isolation

Add New Nodes

ansible-playbook -i inventory/environments/dev/hosts.yml playbooks/add_node.yml

Reset Cluster

ansible-playbook -i inventory/environments/dev/hosts.yml playbooks/reset.yml
