Skip to content

Iskandeur/decipher

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

3 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Decipher πŸ”

An advanced Python tool for automatic cipher detection and decryption using frequency analysis and pattern recognition. Supports both Caesar ciphers and substitution ciphers in English and French.

🎯 Features

  • Automatic Cipher Detection: Determines if text is encrypted based on frequency patterns
  • Caesar Cipher Breaking: Tests all 26 possible shifts with language scoring
  • Advanced Substitution Cipher Analysis: Multiple intelligent methods including:
    • Frequency analysis (English & French)
    • Expert pattern recognition
    • Word pattern analysis
    • Iterative optimization
    • πŸ€– AI-Powered Refinement: Gemini AI post-processing to perfect decoded text
  • Simple Frequency Viewer: Standalone tool for letter frequency analysis and visualization
  • Language Pattern Comparison: Compare text patterns with English and French frequencies
  • Bilingual Support: Handles both English and French texts
  • Interactive CLI: Easy-to-use command-line interface
  • Visualization: Optional frequency graphs using matplotlib

πŸš€ Quick Start

Installation

git clone https://github.com/Iskandeur/decipher.git
cd decipher
pip install -r requirements.txt

Optional: AI Enhancement

For advanced AI-powered cipher solving, the tool can integrate with Google Gemini AI:

# 1. Copy the example file and add your API key
cp .env.example .env
# Then edit .env with your actual API key

# 2. The tool will automatically load the API key from .env
# AI support is optional - the tool works fully without it

Note: The .env file is automatically ignored by git for security. Get your API key from Google AI Studio.

Basic Usage

# Analyze a cipher from command line
python cipher_analyzer.py "WKH TXLFN EURZQ IRA MXPSV RYHU WKH ODCB GRJ"

# Analyze a cipher from file
python cipher_analyzer.py -f encrypted_text.txt

# Run demo with sample Caesar cipher
python cipher_analyzer.py --demo

# Show frequency graph
python cipher_analyzer.py --demo -g

# Use the frequency viewer for detailed analysis
python frequency_viewer.py "sample text here"
python frequency_viewer.py -f sample_texts/english_sample.txt

# Enable AI assistance for complex ciphers
python cipher_analyzer.py "complex cipher text" --ai

πŸ“– Examples

Caesar Cipher (English)

$ python cipher_analyzer.py "WKH TXLFN EURZQ IRA MXPSV RYHU WKH ODCB GRJ"

βœ… BEST TRANSLATION (Caesar Cipher (English)):
THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG
Caesar Shift: 3

AI-Powered Refinement

$ python cipher_analyzer.py "complex substitution cipher" --ai

πŸ”¬ Advanced substitution analysis...
    🎯 Expert manual analysis...
    Expert result preview: c etait oge qmorgee d avril frmide et claire...
    πŸ€– Gemini AI refinement...
    AI refinement preview: c etait une chargee d avril froide et claire...
    Readability improvement: 6 β†’ 8 readable words

βœ… BEST TRANSLATION (Expert Manual Analysis + AI Refinement):
c etait une chargee d avril froide et claire les barrières s'engageaient

Substitution Cipher (French)

$ python cipher_analyzer.py -f test_french_cipher.txt

βœ… BEST TRANSLATION (Expert Manual Analysis):
c etait oge qmorgee d avril frmide et claire les bmrlmxes smggaiegt...
Confidence Score: 7682.1
Detected Language: French

Frequency Analysis

$ python frequency_viewer.py "The quick brown fox jumps over the lazy dog"

πŸ” LUKIN E NIMI KON - Frequency Viewer
==================================================
Analyzing: The quick brown fox jumps over the lazy dog

πŸ“Š FREQUENCY ANALYSIS
============================================================
Total letters analyzed: 35

Letter | Count | Percentage | English Expected | French Expected
------------------------------------------------------------
  o    |    4  |   11.4%    |       7.5%      |      5.4%
  u    |    3  |    8.6%    |       2.8%      |      6.3%
  r    |    3  |    8.6%    |       6.0%      |      6.6%
  h    |    2  |    5.7%    |       6.1%      |      0.9%
  e    |    2  |    5.7%    |      12.7%      |     14.7%

🎯 LANGUAGE SIMILARITY SCORES
==============================
English similarity: -156.8
French similarity: -189.2
β†’ Text appears more similar to English patterns

πŸ› οΈ Command Line Options

Cipher Analyzer

usage: cipher_analyzer.py [-h] [-f FILE] [-g] [-l {auto,english,french}] [--demo] [--freq-only] [--ai] [text]

positional arguments:
  text                  Text to analyze (or use --file)

options:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  Read text from file
  -g, --graph           Show frequency graph
  -l {auto,english,french}, --language {auto,english,french}
                        Target language for analysis (default: auto-detect)
  --demo                Run with demo Caesar cipher
  --freq-only           Show only frequency analysis (no decryption)
  --ai                  Enable Gemini AI refinement to perfect decoded text

Frequency Viewer

usage: frequency_viewer.py [-h] [-f FILE] [--no-graph] [--no-table] [--version] [text]

positional arguments:
  text          Text to analyze (or use --file)

options:
  -h, --help    show this help message and exit
  -f FILE, --file FILE
                Read text from file
  --no-graph    Skip showing the comparison graph
  --no-table    Skip showing the frequency table
  --version     show program's version number and exit

🧠 How It Works

1. Encryption Detection

The tool analyzes letter frequency patterns and word readability to determine if text appears encrypted.

2. Caesar Cipher Analysis

  • Tests all 26 possible shifts
  • Scores results against English and French frequency patterns
  • Returns the most linguistically coherent result

3. Substitution Cipher Analysis

The tool employs multiple sophisticated methods:

  • Frequency Analysis: Maps most frequent cipher letters to expected language frequencies
  • Expert Pattern Recognition: Uses proven linguistic patterns (e.g., "ju" β†’ "le" in French)
  • Word Pattern Analysis: Identifies common word structures and endings
  • Iterative Optimization: Refines mappings through systematic testing
  • πŸ€– AI Refinement: Gemini AI conservatively fixes obvious letter errors while preserving original structure and meaning

4. Language Detection

Automatically detects whether the source text is English or French based on:

  • Letter frequency patterns
  • Common word recognition
  • Language-specific patterns

πŸ“ Project Structure

decipher/
β”œβ”€β”€ cipher_analyzer.py      # Main cipher analysis tool
β”œβ”€β”€ frequency_viewer.py     # Standalone frequency analysis tool
β”œβ”€β”€ manual_decoder.py       # Specialized decoder for complex ciphers
β”œβ”€β”€ sample_texts/           # Sample text files for testing
β”‚   β”œβ”€β”€ english_sample.txt  # English text example
β”‚   β”œβ”€β”€ french_sample.txt   # French text example
β”‚   └── cipher_sample.txt   # Encrypted text example
β”œβ”€β”€ requirements.txt        # Python dependencies
└── README.md              # This file

πŸ”¬ Technical Details

Supported Cipher Types

  • Caesar Cipher: Simple shift cipher (ROT-N)
  • Substitution Cipher: Each letter mapped to another letter

Language Support

  • English: Full frequency analysis and word pattern recognition
  • French: Specialized patterns for French linguistic structures

Scoring Algorithm

The tool uses a multi-factor scoring system:

  • Letter frequency similarity to expected language patterns
  • Word recognition bonuses
  • Language-specific pattern bonuses
  • Linguistic coherence scoring

πŸ”§ Dependencies

  • Python 3.6+
  • matplotlib (for frequency graphs)
  • google-genai (for AI-powered text refinement, optional)
  • python-dotenv (for environment variable management)
  • Standard library: argparse, collections, string, sys

πŸ“Š Sample Results

Caesar Cipher Success Rate

  • English: ~95% accuracy for texts >20 characters
  • French: ~90% accuracy for texts >20 characters

Substitution Cipher Success Rate

  • Simple substitutions: ~80% for texts >100 characters
  • Complex substitutions: ~60% with expert patterns

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

🎯 Use Cases

  • Educational: Learn about cryptography and frequency analysis
  • Historical: Analyze historical encrypted documents
  • Puzzles: Solve cryptographic puzzles and challenges
  • Research: Study linguistic patterns in different languages

⚠️ Limitations

  • Works best with longer texts (>50 characters)
  • Modern encryption methods are not supported
  • Performance depends on text quality and language consistency
  • Some substitution ciphers may require manual refinement
  • AI features require internet connection and valid API key

πŸ“ˆ Future Enhancements

  • Support for additional languages (German, Spanish, Italian)
  • Polyalphabetic cipher support (VigenΓ¨re)
  • Web interface
  • Statistical analysis reports
  • Machine learning improvements

Made with ❀️ for cryptography enthusiasts

About

An advanced Python tool for automatic cipher detection and decryption using frequency analysis and pattern recognition.

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages