A comprehensive Python library for text formatting, transformation, and analysis. TextPrettify provides specialized, easy-to-use classes for manipulating and analyzing text strings for common use cases.
- BasicFormatter: Core text operations (whitespace, slugify, reading time, capitalization, truncation, punctuation, word counting)
- CaseFormatter: Case conversions (snake_case, camelCase, PascalCase, CONSTANT_CASE, kebab-case, Title Case)
- TransformationFormatter: Text transformations (reversal, line operations, find/replace, highlighting, acronyms, wrapping)
- GenerationFormatter: Text generation (Lorem Ipsum, number spelling, currency, percentages)
- NormalizationFormatter: Text normalization (Unicode, accents, smart quotes)
- CharacterAnalyzer: Character-level analysis (counts, types)
- SentenceAnalyzer: Sentence extraction and analysis
- ReadabilityAnalyzer: Readability metrics (Flesch Reading Ease, Flesch-Kincaid Grade)
- StatisticsAnalyzer: Word statistics and frequency analysis
- LanguageAnalyzer: Basic language detection
# From PyPI (when published)
pip install textprettify
# From source
git clone https://github.com/mmssajith/TextPrettify.git
cd TextPrettify
pip install -e .
# With development dependencies
pip install -e ".[dev]"
from textprettify import BasicFormatter
# Clean up messy whitespace
formatter = BasicFormatter(" Hello World ")
print(formatter.remove_extra_whitespace()) # "Hello World"
# Create URL-friendly slugs
formatter = BasicFormatter("My Awesome Post!")
print(formatter.slugify()) # "my-awesome-post"
# Estimate reading time
formatter = BasicFormatter("Lorem ipsum " * 200)
print(formatter.get_reading_time()) # "2 mins read"
# Capitalize with exceptions
formatter = BasicFormatter("a tale of two cities")
print(formatter.capitalize_words(exceptions=['a', 'of'])) # "A Tale of Two Cities"
# Truncate long text
formatter = BasicFormatter("The quick brown fox jumps over the lazy dog")
print(formatter.truncate(max_length=20)) # "The quick brown..."
# Count words
formatter = BasicFormatter("Hello world hello")
print(formatter.count_words()) # 3
print(formatter.count_words(unique=True)) # 2
from textprettify import CaseFormatter
formatter = CaseFormatter("Hello World")
print(formatter.to_snake_case()) # "hello_world"
print(formatter.to_camel_case()) # "helloWorld"
print(formatter.to_pascal_case()) # "HelloWorld"
print(formatter.to_constant_case()) # "HELLO_WORLD"
print(formatter.to_kebab_case()) # "hello-world"
print(formatter.to_title_case(exceptions=['the', 'of'])) # "Hello World"
from textprettify import TransformationFormatter
# Text reversal
formatter = TransformationFormatter("Hello World")
print(formatter.reverse_characters()) # "dlroW olleH"
print(formatter.reverse_words()) # "World Hello"
# Line operations
formatter = TransformationFormatter("apple\nbanana\napple\ncherry")
print(formatter.deduplicate_lines()) # "apple\nbanana\ncherry"
print(formatter.sort_lines()) # "apple\napple\nbanana\ncherry"
# Find and replace
formatter = TransformationFormatter("Hello World, hello Python")
print(formatter.find_and_replace('hello', 'Hi', case_sensitive=False))
# "Hi World, Hi Python"
# Regex replace
formatter = TransformationFormatter("I have 5 apples and 10 oranges")
print(formatter.find_and_replace(r'\d+', 'X', regex=True))
# "I have X apples and X oranges"
# Extract acronyms
formatter = TransformationFormatter("NASA and FBI are USA organizations")
print(formatter.extract_acronyms()) # ['NASA', 'FBI', 'USA']
# Text wrapping
formatter = TransformationFormatter("Very long text here...")
print(formatter.wrap_text(width=40))
from textprettify import GenerationFormatter
# Lorem Ipsum
lorem = GenerationFormatter.lorem_ipsum(paragraphs=2)
print(lorem)
# Spell out numbers
formatter = GenerationFormatter("I have 5 apples and 10 oranges")
print(formatter.spell_out_numbers()) # "I have five apples and ten oranges"
# Format currency
formatter = GenerationFormatter("The price is 1234.5")
print(formatter.format_currency()) # "The price is $1,234.50"
print(formatter.format_currency('€')) # "The price is €1,234.50"
# Format percentages
formatter = GenerationFormatter("Success rate is 0.95")
print(formatter.format_percentage()) # "Success rate is 95.0%"
from textprettify import NormalizationFormatter
# Remove accents
formatter = NormalizationFormatter("café résumé")
print(formatter.remove_accents()) # "cafe resume"
# Unicode normalization
formatter = NormalizationFormatter("café")
print(formatter.normalize_unicode('NFC'))
# Smart quotes
formatter = NormalizationFormatter('"Hello World"')
print(formatter.to_smart_quotes()) # “Hello World”
print(formatter.to_straight_quotes()) # '"Hello World"'
from textprettify import (
CharacterAnalyzer,
SentenceAnalyzer,
ReadabilityAnalyzer,
StatisticsAnalyzer,
LanguageAnalyzer
)
text = "Python is a high-level programming language. It's easy to learn."
# Character analysis
char_analyzer = CharacterAnalyzer(text)
counts = char_analyzer.get_all_counts()
print(f"Total characters: {counts['total']}")
print(f"Letters: {counts['letters']}")
print(f"Digits: {counts['digits']}")
# Sentence analysis
sent_analyzer = SentenceAnalyzer(text)
print(f"Sentences: {sent_analyzer.count()}")
print(f"Average length: {sent_analyzer.average_length()} words")
# Readability metrics
read_analyzer = ReadabilityAnalyzer(text)
scores = read_analyzer.get_scores()
print(f"Reading ease: {scores['reading_ease']}")
print(f"Grade level: {scores['grade_level']}")
print(f"Interpretation: {read_analyzer.interpret_reading_ease()}")
# Text statistics
stats_analyzer = StatisticsAnalyzer(text)
stats = stats_analyzer.get_statistics()
print(f"Total words: {stats['word_count']}")
print(f"Unique words: {stats['unique_word_count']}")
print(f"Lexical diversity: {stats['lexical_diversity']}")
# Word frequency
word_freq = stats_analyzer.word_frequency(top_n=5)
print(f"Top 5 words: {word_freq}")
# Language detection
lang_analyzer = LanguageAnalyzer(text)
result = lang_analyzer.detect()
print(f"Language: {lang_analyzer.get_language_name()} ({result['language']})")
print(f"Confidence: {result['confidence']}")
BasicFormatter(text: str)
Methods:
- remove_extra_whitespace() -> str: Remove extra whitespace
- slugify(separator: str = '-', lowercase: bool = True) -> str: Convert to URL slug
- get_reading_time(words_per_minute: int = 200, include_unit: bool = True) -> str | int: Estimate reading time
- capitalize_words(exceptions: list[str] = None) -> str: Capitalize words with exceptions
- truncate(max_length: int, suffix: str = '...', whole_words: bool = True) -> str: Truncate text
- remove_punctuation(keep: str = None) -> str: Remove punctuation
- count_words(unique: bool = False) -> int: Count words
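To make the slugify and get_reading_time descriptions concrete, here is a minimal standard-library sketch of how such operations could work. It is not TextPrettify's implementation: the function names, the ASCII folding step, and the round-up rule for reading time are assumptions made for illustration, and the library may behave differently on edge cases.

```python
import math
import re
import unicodedata


def slugify_sketch(text: str, separator: str = "-", lowercase: bool = True) -> str:
    """Illustrative slugify: fold to ASCII, drop punctuation, join words."""
    # Decompose accented characters, then drop anything that is not ASCII.
    folded = unicodedata.normalize("NFKD", text)
    folded = folded.encode("ascii", "ignore").decode("ascii")
    if lowercase:
        folded = folded.lower()
    # Runs of letters and digits are words; everything else is a boundary.
    words = re.findall(r"[A-Za-z0-9]+", folded)
    return separator.join(words)


def reading_time_sketch(text: str, words_per_minute: int = 200) -> int:
    """Illustrative reading time in whole minutes (assumed: round up, minimum 1)."""
    word_count = len(text.split())
    return max(1, math.ceil(word_count / words_per_minute))


print(slugify_sketch("My Awesome Post!"))         # my-awesome-post
print(reading_time_sketch("Lorem ipsum " * 200))  # 2
```

Splitting on non-alphanumeric runs rather than deleting punctuation character by character keeps words separated when they are joined only by punctuation, such as "rock&roll".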
CaseFormatter(text: str)
Methods:
- to_snake_case() -> str: Convert to snake_case
- to_camel_case() -> str: Convert to camelCase
- to_pascal_case() -> str: Convert to PascalCase
- to_constant_case() -> str: Convert to CONSTANT_CASE
- to_kebab_case() -> str: Convert to kebab-case
- to_title_case(exceptions: list[str] = None) -> str: Convert to Title Case
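Case conversions like these usually reduce to two steps: split the input into words, then rejoin the words with a convention-specific rule. The sketch below shows that idea with the standard library only. The helper names are hypothetical and the word-splitting rule is an assumption, not a description of how CaseFormatter is implemented.

```python
import re


def split_words(text: str) -> list[str]:
    """Split on non-alphanumerics and on lower-to-upper boundaries."""
    # Insert a space at each camelCase boundary: "helloWorld" -> "hello World".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    return re.findall(r"[A-Za-z0-9]+", spaced)


def to_snake(text: str) -> str:
    """Lowercase every word and join with underscores."""
    return "_".join(word.lower() for word in split_words(text))


def to_camel(text: str) -> str:
    """Lowercase the first word, capitalize the rest, join without separators."""
    words = [word.lower() for word in split_words(text)]
    if not words:
        return ""
    return words[0] + "".join(word.capitalize() for word in words[1:])


print(to_snake("Hello World"))      # hello_world
print(to_camel("some-kebab-name"))  # someKebabName
```

Once the splitter exists, the remaining conventions follow the same pattern with a different join character or capitalization rule.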
TransformationFormatter(text: str)
Methods:
- reverse_characters() -> str: Reverse character order
- reverse_words() -> str: Reverse word order
- add_letter_spacing(separator: str = ' ') -> str: Add spacing between letters
- remove_blank_lines() -> str: Remove blank lines
- deduplicate_lines() -> str: Remove duplicate lines
- sort_lines(reverse: bool = False) -> str: Sort lines
- find_and_replace(pattern: str, replacement: str, case_sensitive: bool = True, regex: bool = False) -> str: Find and replace text
- highlight_markdown(words: list[str], style: str) -> str: Highlight words in markdown
- highlight_html(words: list[str], tag: str) -> str: Highlight words in HTML
- extract_acronyms() -> list[str]: Extract acronyms
- wrap_text(width: int) -> str: Wrap text to width
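The case_sensitive and regex flags on find_and_replace interact in a way that is easier to see in code. The following is a rough sketch of one way to honor both flags with Python's re module, plus an order-preserving line deduplication. The function names are hypothetical and the behavior shown is an assumption; TransformationFormatter may handle these cases differently.

```python
import re


def find_and_replace_sketch(text: str, pattern: str, replacement: str,
                            case_sensitive: bool = True, regex: bool = False) -> str:
    """Replace matches of pattern, treating it as literal text unless regex=True."""
    flags = 0 if case_sensitive else re.IGNORECASE
    if regex:
        return re.sub(pattern, replacement, text, flags=flags)
    # Literal mode: escape the pattern, and pass a function so backslashes in
    # the replacement are not parsed as group references.
    return re.sub(re.escape(pattern), lambda _match: replacement, text, flags=flags)


def deduplicate_lines_sketch(text: str) -> str:
    """Drop repeated lines, keeping the first occurrence of each in order."""
    # dict preserves insertion order, so fromkeys() acts as an ordered set.
    return "\n".join(dict.fromkeys(text.splitlines()))


print(find_and_replace_sketch("Hello World, hello Python", "hello", "Hi",
                              case_sensitive=False))  # Hi World, Hi Python
print(find_and_replace_sketch("I have 5 apples and 10 oranges", r"\d+", "X",
                              regex=True))            # I have X apples and X oranges
```

Escaping the pattern in literal mode matters: without it, a search for "." would match every character.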
GenerationFormatter(text: str)
Static Methods:
- lorem_ipsum(paragraphs: int = 1, sentences_per_paragraph: int = 5) -> str: Generate Lorem Ipsum
Instance Methods:
- spell_out_numbers(max_number: int = 100) -> str: Spell out numbers
- format_currency(symbol: str = '$') -> str: Format currency
- format_percentage(decimals: int = 1) -> str: Format percentages
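As an illustration of what formatting numbers inside running text involves, the sketch below rewrites numeric substrings using a regular expression and Python's format specifiers. The function names and the rules for which numbers qualify (any number for currency, only decimals for percentages) are assumptions for this example, not the documented behavior of GenerationFormatter.

```python
import re

# Assumption: any integer or decimal in the text is a currency amount.
NUMBER = re.compile(r"\d+(?:\.\d+)?")
# Assumption: only numbers written with a decimal point are ratios.
DECIMAL = re.compile(r"\d+\.\d+")


def format_currency_sketch(text: str, symbol: str = "$") -> str:
    """Rewrite each number with a symbol, thousands separators, and two decimals."""
    return NUMBER.sub(lambda m: f"{symbol}{float(m.group()):,.2f}", text)


def format_percentage_sketch(text: str, decimals: int = 1) -> str:
    """Rewrite each decimal ratio as a percentage with the given precision."""
    return DECIMAL.sub(lambda m: f"{float(m.group()) * 100:.{decimals}f}%", text)


print(format_currency_sketch("The price is 1234.5"))     # The price is $1,234.50
print(format_percentage_sketch("Success rate is 0.95"))  # Success rate is 95.0%
```

For real monetary values, decimal.Decimal avoids the rounding surprises of binary floats; float is used here only to keep the sketch short.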
NormalizationFormatter(text: str)
Methods:
- normalize_unicode(form: str = 'NFC') -> str: Normalize Unicode (NFC, NFD, NFKC, NFKD)
- remove_accents() -> str: Remove accents from text
- to_smart_quotes() -> str: Convert to smart quotes
- to_straight_quotes() -> str: Convert to straight quotes
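The normalization forms listed above come from the Unicode standard, and Python exposes them through unicodedata.normalize. The sketch below shows the difference between NFC and NFD and a common way to strip accents on top of NFD. The helper names are hypothetical, and this is an illustration of the technique rather than NormalizationFormatter's actual code.

```python
import unicodedata


def remove_accents_sketch(text: str) -> str:
    """Strip accents by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def to_straight_quotes_sketch(text: str) -> str:
    """Map curly single and double quotes to their straight ASCII forms."""
    table = str.maketrans({
        "\u201c": '"', "\u201d": '"',  # left and right double quotes
        "\u2018": "'", "\u2019": "'",  # left and right single quotes
    })
    return text.translate(table)


word = "caf\u00e9"  # "café" with a precomposed é
print(len(unicodedata.normalize("NFC", word)))  # 4: é is one code point
print(len(unicodedata.normalize("NFD", word)))  # 5: e plus a combining accent
print(remove_accents_sketch("caf\u00e9 r\u00e9sum\u00e9"))  # cafe resume
```

Two strings that look identical can compare unequal if one is NFC and the other NFD, which is the usual reason to normalize before comparing or storing text.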
See the Quick Start section above for analyzer usage examples.
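For readers curious about the readability metrics, the two Flesch formulas are simple functions of word, sentence, and syllable counts. The sketch below implements the standard published formulas. The syllable counter is a naive vowel-group heuristic chosen for brevity, and the sentence and word splitting rules are likewise assumptions, so the scores may differ from what ReadabilityAnalyzer reports.

```python
import re


def count_syllables(word: str) -> int:
    """Naive heuristic: one syllable per run of vowels, at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


text = "Python is a high-level programming language. It's easy to learn."
sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
words = re.findall(r"[A-Za-z']+", text)
syllables = sum(count_syllables(w) for w in words)

print(round(flesch_reading_ease(len(words), len(sentences), syllables), 1))
print(round(flesch_kincaid_grade(len(words), len(sentences), syllables), 1))
```

Both formulas reward short sentences and short words, so they measure surface complexity rather than how clearly the text is written.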
# Run all tests
pytest
# Run with coverage
pytest --cov=textprettify
# Run specific test file
pytest tests/formatters/test_basic_formatter.py
# Run specific test class
pytest tests/formatters/test_basic_formatter.py::TestSlugify
Check out the examples/ directory for comprehensive usage examples:
- basic_usage.py: Basic formatting operations
- text_transformation_example.py: Case conversions and transformations
- text_generation_example.py: Text generation and manipulation
- text_analysis_example.py: Text analysis and statistics
- blog_post_formatter.py: Format blog post metadata
- url_generator.py: Generate clean URLs from titles
textprettify/
├── formatters/
│ ├── basic_formatter.py
│ ├── case_formatter.py
│ ├── transformation_formatter.py
│ ├── generation_formatter.py
│ └── normalization_formatter.py
├── analyzers/
│ ├── character_analyzer.py
│ ├── sentence_analyzer.py
│ ├── readability_analyzer.py
│ ├── statistics_analyzer.py
│ └── language_analyzer.py
tests/
├── formatters/
└── analyzers/
examples/
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
# Clone the repository
git clone https://github.com/mmssajith/TextPrettify.git
cd TextPrettify
# Install in development mode with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run tests with coverage
pytest --cov=textprettify --cov-report=html
This project is licensed under the MIT License - see the LICENSE file for details.
Sajith
This project uses pre-commit hooks to maintain code quality:
# Install pre-commit hooks
pre-commit install
# Run hooks manually on all files
pre-commit run --all-files
Configured hooks:
- Ruff: Fast Python linter and formatter
- Mypy: Static type checking
The pre-commit hooks will run automatically on every commit. You can also run them manually:
# Run all hooks
pre-commit run --all-files
# Run specific hook
pre-commit run ruff --all-files
pre-commit run mypy --all-files
See CHANGELOG.md for detailed version history.
Latest Release: v0.2.0 - Added comprehensive text analysis tools, pre-commit hooks, and enhanced formatters.