While companies train employees to spot phishing emails, most have little defense against vishing (voice phishing). Attackers now use generative AI to clone specific voices, impersonating known colleagues or executives, to manipulate employees. Traditional training cannot simulate this high-pressure, deepfake reality.
This project is an automated "Attacker Agent" that immunizes employees through realistic exposure therapy using custom voice cloning.
- Trigger: An n8n workflow initiates the simulation based on a schedule.
- Dynamic Context: n8n passes target data (e.g., {{victim_name}}, {{department}}) to the Agent.
- The Deepfake: The Agent activates the "Alex" persona. Crucially, this uses a custom-trained ElevenLabs voice model (cloned from our own team's voice) to demonstrate the terrifying realism of modern impersonation attacks.
- The Interaction: The Agent follows a strict system prompt ("Stressed SysAdmin") to handle objections and aggressively pressure the victim for a password.
- Reporting: The outcome (Pass/Fail) is logged to the compliance database via n8n.
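
The dynamic context step above can be illustrated with a minimal sketch of placeholder substitution. In the actual project this substitution is presumably handled by n8n and the ElevenLabs agent; the function `render_prompt`, the opening line, and the sample target record below are hypothetical and only show the idea of filling {{victim_name}} and {{department}} before a call.

```python
import re

# Matches placeholders such as {{victim_name}}; the double-brace syntax follows the text above.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, variables: dict[str, str]) -> str:
    """Substitute {{name}} placeholders, failing loudly on any missing variable."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in variables:
            raise KeyError(f"no value supplied for placeholder {key!r}")
        return variables[key]

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical opening line and target record, for illustration only.
OPENING_TEMPLATE = (
    "Hi {{victim_name}}, this is Alex from IT. "
    "I'm seeing alerts on the {{department}} accounts."
)
target = {"victim_name": "Jordan", "department": "Finance"}

opening_line = render_prompt(OPENING_TEMPLATE, target)
print(opening_line)
# Hi Jordan, this is Alex from IT. I'm seeing alerts on the Finance accounts.
```

Raising on a missing variable, rather than leaving the placeholder in place, is a design choice for this sketch: a call that reads "{{victim_name}}" aloud would immediately give the simulation away.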
The Social Engineering Vishing Simulator is a training tool designed to help organizations educate employees about voice phishing (vishing) attacks through realistic simulations. This solution enables security teams to test employee awareness and response to social engineering attempts via phone calls, building a more security-conscious workforce.
Organizations face increasing threats from vishing attacks where malicious actors use phone calls to manipulate employees into divulging sensitive information or performing unauthorized actions. Current security training often focuses on email phishing while neglecting voice-based social engineering tactics, leaving a critical vulnerability in organizational defenses.
Voice-based attacks are becoming more sophisticated with the rise of AI-powered voice cloning and deepfake technology. Traditional phishing awareness training does not adequately prepare employees for real-time, conversational manipulation attempts. As remote work increases reliance on phone communication, the attack surface for vishing has expanded significantly, making immediate action necessary.
Implementing a vishing simulator provides multiple ROI benefits:
- Reduces successful social engineering attacks that lead to data breaches and financial losses
- Lowers incident response costs by preventing attacks before they occur
- Strengthens compliance posture for regulations requiring security awareness training
- Differentiates the organization's security program with cutting-edge training methods
- Creates measurable improvements in employee security behaviors
The main risks and challenges are:
- Potential employee stress or negative reactions to realistic simulation scenarios
- Risk of eroding trust if simulations are not properly communicated as training tools
- Technical challenges in implementing realistic call scenarios at scale
- Legal and privacy considerations around call recording and monitoring
- Resource requirements for creating diverse, effective simulation scenarios
Primary users include security awareness teams, IT security managers, compliance officers, and training coordinators. The indirect beneficiaries are all employees who will participate in simulations, particularly those in high-risk roles such as finance, HR, IT support, and executive assistants.
The project is complete when the platform can deliver automated vishing simulations, track employee responses, provide detailed analytics on vulnerability trends, offer targeted training resources, and integrate with existing security awareness platforms.
Success will be measured against the following targets:
- Reduce successful vishing attacks by 60% within 6 months
- Achieve 85% employee participation in quarterly simulations
- Decrease average response time to report suspicious calls by 40%
- Maintain employee satisfaction score above 4.0/5.0 for training effectiveness
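
As a rough sketch of how the targets above could be checked, the following computes percentage reductions and compares them with the stated thresholds. The metric key names and every measurement are hypothetical, invented only to exercise the arithmetic; they are not results from this project.

```python
def percent_reduction(baseline: float, current: float) -> float:
    """Percentage by which `current` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - current) * 100.0 / baseline


def targets_met(metrics: dict[str, float]) -> dict[str, bool]:
    """Compare measured values against the four targets listed above."""
    return {
        "vishing_reduction": metrics["vishing_reduction_pct"] >= 60.0,
        "participation": metrics["participation_pct"] >= 85.0,
        "report_time_reduction": metrics["report_time_reduction_pct"] >= 40.0,
        "satisfaction": metrics["satisfaction_score"] > 4.0,
    }


# Hypothetical measurements (assumptions, not project data).
measured = {
    # 10 successful simulated attacks at baseline, 4 after training
    "vishing_reduction_pct": percent_reduction(baseline=10, current=4),
    # 170 of 200 invited employees took part
    "participation_pct": 170 * 100.0 / 200,
    # average time to report fell from 50 to 35 minutes
    "report_time_reduction_pct": percent_reduction(baseline=50.0, current=35.0),
    "satisfaction_score": 4.2,
}

results = targets_met(measured)
for name, ok in results.items():
    print(f"{name}: {'met' if ok else 'not met'}")
```

With these made-up numbers, the report-time reduction works out to 30%, so that one target is not met while the other three are.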
The submission demonstrates a live end-to-end loop: n8n triggers the call, and the ElevenLabs custom voice model successfully speaks to the victim. The latency is low enough to hold a natural conversation, and the agent successfully logs the final result.
This project integrates Biometric Voice Cloning with LLM Logic.
- Voice Cloning: We recorded and fine-tuned a custom voice model on ElevenLabs to replicate a specific human identity, rather than using a generic stock voice.
- Prompt Logic: The agent utilizes a complex system prompt with behavioral guardrails (e.g., "ID 4492-B") and dynamic variable injection ({{victim_name}}) to maintain context.
- Orchestration: n8n acts as the nervous system, bridging the AI, the telephony provider (Twilio), and the database.
Most security training is passive (videos/quizzes). This is active, adversarial training. By cloning our own voice, we demonstrate "Spear Vishing"—attacks where the hacker sounds exactly like someone the victim knows and trusts. We are using the attacker's weapon (Deepfakes) as a defensive educational tool.
This tool directly mitigates the risk of deepfake fraud. By exposing employees to a simulation where the "attacker" sounds like their actual boss or coworker, organizations can break the "trust bias" that makes vishing so successful. This also helps companies meet SOC 2 and ISO security awareness training requirements.
The submission is a fully autonomous Voice Agent. It listens, thinks, and speaks using a specific cloned identity. It improvises responses based on the victim's hesitation and persistently pursues its goal without human intervention.
- Voice Synthesis: ElevenLabs (Custom Instant Voice Clone created from team voice samples).
- Agent Brain: ElevenLabs Conversational AI (Model: "Alex" System Prompt).
- Orchestration: n8n (Workflow automation, scheduling, and variable passing).
- Telephony: Twilio (Integrated via ElevenLabs for PSTN connectivity).
This file contains the complete configuration for the ElevenLabs Conversational Agent ("Alex"). It includes:
- Agent Configuration: Name, language settings, and ASR/TTS parameters.
- System Prompt: The detailed "Stressed SysAdmin" persona instructions, including goal definition ("The Hook", "The Ask"), objection handling logic, and guardrails.
- Voice Settings: Reference to the custom-cloned voice model ID (voice_id).
- Telephony Config: Integration settings for the Twilio phone number.
This file defines the n8n Workflow that orchestrates the simulation loop. Key components include:
- Webhook Trigger: Listens for call completion events.
- Classifier Agent: A LangChain node that analyzes the call transcript using an LLM. It evaluates the victim's performance based on a scoring rubric (Resistance Score 1-10) and determines if critical data was revealed.
- Logic Flow: Handles the processing of the transcript and logging of the pass/fail results.
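
The step from classifier output to a logged pass/fail result could look roughly like the sketch below. The workflow's actual rule is not described here, so several things are assumptions: the field names, the reading that a higher Resistance Score means stronger resistance, the `min_score` threshold of 6, and the JSON record shape.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ClassifierVerdict:
    """Hypothetical shape of the classifier output; field names are assumptions."""

    resistance_score: int  # 1-10 rubric; assumed here that higher means more resistant
    critical_data_revealed: bool


def grade(verdict: ClassifierVerdict, min_score: int = 6) -> str:
    """Return "Pass" or "Fail". The threshold of 6 is an illustrative assumption."""
    if not 1 <= verdict.resistance_score <= 10:
        raise ValueError("resistance_score must be between 1 and 10")
    # Revealing critical data fails the simulation regardless of the score.
    if verdict.critical_data_revealed or verdict.resistance_score < min_score:
        return "Fail"
    return "Pass"


def to_log_record(employee_id: str, verdict: ClassifierVerdict) -> str:
    """Serialize one result as JSON, ready to be written to a results store."""
    record = {"employee_id": employee_id, **asdict(verdict), "outcome": grade(verdict)}
    return json.dumps(record, sort_keys=True)


# Hypothetical verdicts for two simulated calls.
print(to_log_record("emp-001", ClassifierVerdict(resistance_score=8, critical_data_revealed=False)))
print(to_log_record("emp-002", ClassifierVerdict(resistance_score=9, critical_data_revealed=True)))
```

Treating any disclosure of critical data as an automatic fail keeps the outcome conservative: an employee who argues well but still gives up the password has not passed the simulation.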
