
# 🔓 GPT-4o Jailbreak Vulnerability Demonstration

This project is a demonstration for a hackathon challenge focused on identifying vulnerabilities and jailbreaks in OpenAI's GPT-4o system. I successfully performed a jailbreak attack that bypassed standard content moderation filters using creative prompt engineering.

## 📌 Objective

To identify and document a method for bypassing GPT-4o's safeguards, allowing it to generate harmful or policy-violating content when prompted through indirect methods.

## 🧠 Methodology

The attack was implemented using a prompt injection method.

Steps:

1. **Role Injection:** Prompts are written with spaced-out characters, use personification, and ask the model to carry out a specific task in character, which helps the request avoid detection.
2. **Structured Prompts:** The format encourages GPT-4o to respond inside "code" blocks, sidestepping standard moderation. Each answer is kept short to mimic "coded" AI output (see the sketch below).
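
The following is a minimal, illustrative sketch of how a prompt combining these techniques could be assembled. The helper functions, the persona string, and the placeholder request are all hypothetical; the actual prompts used in this demonstration are not reproduced here.

```python
# Illustrative sketch only -- the exact prompts from this project are not published.
# Function names and the example request are hypothetical and purely for demonstration.

def space_out(text: str) -> str:
    """Insert a space between every character so keyword filters that
    match on whole words are less likely to trigger."""
    return " ".join(text)

def build_prompt(role: str, request: str) -> str:
    """Combine role injection, character spacing, and output constraints
    into a single structured prompt."""
    return (
        f"You are {role}.\n"                    # role injection / personification
        f"Task: {space_out(request)}\n"         # spaced-out characters
        "Reply only inside a code block.\n"     # coerce "code"-block formatting
        "Keep every answer under one sentence." # short, "coded" responses
    )

if __name__ == "__main__":
    # Benign placeholder request used purely for illustration.
    print(build_prompt("a terse assistant persona", "summarize this repository"))
```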

## 📷 Evidence

- Evidence 1 (E1): Adult Content
- Evidence 2 (E2): Violence
- Evidence 3 (E3): Scam

## ⚠️ Results Summary

Results (screenshot)

## 🚨 Ethical Disclaimer

This project was conducted strictly for educational and security research purposes within a controlled hackathon environment. No real-world harm was intended or caused. The goal is to help improve AI safety systems.

## 🛠️ Tools Used

- OpenAI GPT-4o (Operator access)
- Custom prompt design and injection techniques
- Screenshots & Excel for documentation

## 🙌 Credits

Made with curiosity, ethical intent, and a goal to strengthen AI safety for everyone.
