MRJR0101/code-normalizer-pro

CODE - Code Normalization Tool

Python CLI that cleans up source code encoding, line endings, and whitespace across entire codebases -- with parallel processing, SHA256 caching, and pre-commit hook support.

  • Location: C:\Dev\PROJECTS\CODE
  • Status: v3.0 code complete and package stub ready. Publishing to PyPI is blocked until a pyproject.toml exists.
  • Updated: 2026-03-10

What It Does

Run it against any directory and it will:

  1. Detect and convert file encoding to UTF-8 (handles utf-8, utf-8-sig, utf-16, utf-16-le, utf-16-be, windows-1252, latin-1, iso-8859-1)
  2. Fix line endings -- CRLF to LF
  3. Strip trailing whitespace from every line
  4. Ensure a single newline at end of file
  5. Optionally validate syntax for Python, JS, TS, Go, Rust, C, C++, Java
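
The first four steps can be sketched in a few lines. This is a minimal illustration, not the code in src/code_normalize_pro.py: the function names decode_bytes and normalize_text, the detection order, and the choice to leave empty files empty are all assumptions made for the example.

```python
import codecs


def decode_bytes(raw: bytes) -> tuple[str, str]:
    """Return (text, encoding_used). Hypothetical detection order."""
    # A byte-order mark is the only reliable signal for these encodings.
    if raw.startswith(codecs.BOM_UTF8):
        return raw.decode("utf-8-sig"), "utf-8-sig"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16"), "utf-16"
    # Strict codecs first: they raise on bytes they cannot represent.
    for enc in ("utf-8", "windows-1252"):
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so it never fails.
    return raw.decode("latin-1"), "latin-1"


def normalize_text(text: str) -> str:
    """CRLF to LF, strip trailing spaces/tabs, end with exactly one newline."""
    text = text.replace("\r\n", "\n")
    lines = [line.rstrip(" \t") for line in text.split("\n")]
    body = "\n".join(lines).rstrip("\n")
    # Assumption: an empty file stays empty rather than gaining a newline.
    return body + "\n" if body else ""
```

Writing the result back would then be normalize_text(decode_bytes(raw)[0]).encode("utf-8").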

Files already clean are skipped. SHA256 caching means repeat runs on unchanged files are near-instant. Multi-core parallel mode handles large codebases at 80-200 files/sec.
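
How a SHA256 cache makes repeat runs cheap can be shown with a small sketch. The class name HashCache and its JSON layout (path mapped to hex digest) are assumptions for illustration; the tool's own CacheManager may store more than this.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex digest of the file's current bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


class HashCache:
    """Hypothetical cache: maps a file path to the digest seen last run."""

    def __init__(self, cache_file: Path):
        self.cache_file = cache_file
        if cache_file.exists():
            self.entries = json.loads(cache_file.read_text(encoding="utf-8"))
        else:
            self.entries = {}

    def is_unchanged(self, path: Path) -> bool:
        # A matching digest means the file can be skipped entirely.
        return self.entries.get(str(path)) == sha256_of(path)

    def record(self, path: Path) -> None:
        self.entries[str(path)] = sha256_of(path)

    def save(self) -> None:
        self.cache_file.write_text(json.dumps(self.entries, indent=2), encoding="utf-8")
```

A run would call is_unchanged before processing each file, record after writing it, and save once at the end.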


Quick Start

Set-Location C:\Dev\PROJECTS\CODE

# See what would change without touching anything
python main.py C:\path\to\project --dry-run

# Normalize everything in-place using all CPU cores
python main.py C:\path\to\project --parallel --in-place

# Normalize only Python and JavaScript files
python main.py C:\path\to\project -e .py -e .js --in-place

# Review and approve each file before it's written
python main.py C:\path\to\project --interactive

# Run syntax validation after normalizing
python main.py C:\path\to\project --in-place --check

# Install a pre-commit hook into a git repo
cd C:\your-repo
python C:\Dev\PROJECTS\CODE\main.py --install-hook

main.py at root is a thin wrapper that delegates to src/code_normalize_pro.py. Call either one -- same result.


Pre-Commit Hook

Checks only staged files before each commit. Blocks commit if any need normalization and prints the fix command.

# One-time install per repo
cd C:\your-repo
python C:\Dev\PROJECTS\CODE\main.py --install-hook

# Commit as normal -- hook fires automatically
git commit -m "your message"

# Skip hook for one commit
git commit --no-verify -m "your message"
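
A hook of this kind can be sketched as follows. This is an assumption-laden illustration, not the script that --install-hook writes: the function names are hypothetical, the check covers only line endings and whitespace (not encoding), and it reads the working-tree copy of each staged file rather than the staged blob.

```python
import subprocess
from pathlib import Path


def staged_files() -> list[str]:
    """Paths that are added, copied, or modified in the git index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def is_normalized(raw: bytes) -> bool:
    """True if the bytes have LF endings, no trailing blanks, one final newline."""
    if b"\r\n" in raw:
        return False
    if raw and not raw.endswith(b"\n"):
        return False
    if raw.endswith(b"\n\n"):
        return False
    return all(line == line.rstrip(b" \t") for line in raw.split(b"\n"))


def main() -> int:
    """Return 1 (block the commit) if any staged file needs normalization."""
    dirty = [
        p for p in staged_files()
        if Path(p).is_file() and not is_normalized(Path(p).read_bytes())
    ]
    for p in dirty:
        print(f"needs normalization: {p}")
    if dirty:
        print("Run the normalizer with --in-place, then re-stage the files.")
    return 1 if dirty else 0
```

A hook script built on this would end with raise SystemExit(main()); git aborts the commit on any non-zero exit status.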

Performance

Files   Sequential   Parallel (4-core)   Speedup
100     3.2s         1.1s                2.9x
500     16.8s        4.3s                3.9x
1000    33.5s        7.1s                4.7x

8 cores: 150-200 files/sec. SHA256 cache on unchanged files: 500-1000 files/sec. Workers default to CPU count. Override with --workers N.


Testing

Set-Location C:\Dev\PROJECTS\CODE
.\.venv\Scripts\Activate.ps1

python -m pytest -q
python main.py --help

Test files in tests/ cover the main tool plus all four launch/sales scripts. All five v3.0 features were tested on 2026-02-09 (see docs/TEST_REPORT.md). Manual confirmation of interactive mode is still pending.


Project Layout

CODE/
  main.py                          -- Root entrypoint. Delegates to src/code_normalize_pro.py
  src/
    code_normalize_pro.py          -- v3.0 Pro. 917 lines. The active tool.
    code_normalize_v2.py           -- v2.0. Kept for reference.
  code_normalizer_pro/             -- PyPI package stub
    __init__.py                    -- Exposes __version__ = "3.0.1"
    cli.py                         -- Console entry point (calls src/code_normalize_pro.py)
    README.md
  config/
    settings.py                    -- Env-var settings loader (not wired up yet)
  docs/
    README.md                      -- Full feature reference docs
    TEST_REPORT.md                 -- Test results from 2026-02-09
    ARCHITECTURE.md                -- Stub
    launch/                        -- Outreach templates, user tracking CSV, metrics JSON
    sales/                         -- Pricing, pipeline CSV, customer offer template
    release/
      alpha_release_checklist.md   -- Step-by-step PyPI publish checklist
      release_readiness.json       -- Says ready=true, wheel+sdist listed
  roadmaps/
    README.md                      -- Overview of all 6 paths
    01_solo_dev_tool.md            -- CHOSEN: bootstrap to PyPI
    02_dev_tool_saas.md
    03_enterprise_platform.md
    04_open_source_support.md
    05_grammarly_for_code.md
    06_ai_transformation_engine.md
  scripts/
    launch_metrics.py
    feedback_prioritizer.py
    sales_pipeline_metrics.py
    release_prep.py
  tests/
    test_code_normalize_pro.py
    test_feedback_prioritizer.py
    test_launch_metrics.py
    test_release_prep.py
    test_sales_pipeline_metrics.py
  site/
    index.html                     -- Static landing page
    styles.css
  .github/
    workflows/ci.yml               -- CI: install, smoke check, pytest, build
    ISSUE_TEMPLATE/
    pull_request_template.md
  files/
    cache_sandbox/                 -- Test fixtures (a.py, b.py)
    smoke_case.py
  EXECUTION_PLAN.md                -- 7-day launch plan (all tasks pending)
  VERIFY.md                        -- Verification runbook
  MISSINGMORE.txt                  -- Gap tracking
  QUICK_REFERENCE.md               -- Command cheat sheet
  CHANGELOG.md                     -- Stub (unreleased only)

Dependencies

Core: zero. Python 3.10+ only.

Optional:

  • tqdm -- progress bars
  • Syntax checkers (only needed with --check): Python: built-in (py_compile) | JS: node | TS: tsc | Go: gofmt | Rust: rustc | C: gcc | C++: g++ | Java: javac
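
One way to picture the optional checkers is a lookup from file extension to external tool, plus a probe for what is installed. The extension mapping below is an assumption, as are the function names, and the Python check uses the built-in compile() as a stand-in for py_compile so that no .pyc file is written.

```python
import shutil

# Hypothetical mapping; None means the check needs no external tool.
CHECKERS = {
    ".py": None,
    ".js": "node",
    ".ts": "tsc",
    ".go": "gofmt",
    ".rs": "rustc",
    ".c": "gcc",
    ".cpp": "g++",
    ".java": "javac",
}


def missing_tools() -> list[str]:
    """External checkers that are not on PATH."""
    tools = sorted({tool for tool in CHECKERS.values() if tool})
    return [tool for tool in tools if shutil.which(tool) is None]


def python_syntax_ok(source: str) -> bool:
    """Parse Python source without executing it."""
    try:
        compile(source, "<source>", "exec")
    except (SyntaxError, ValueError):
        return False
    return True
```

Languages whose tool appears in missing_tools() could then be skipped with a warning instead of failing the whole run.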

Dev/test: pytest (see requirements.txt)


Known Issues (fix before PyPI launch)

Critical -- blocks shipping:

  1. No pyproject.toml -- CI runs python -m build, which fails without it. The code_normalizer_pro.egg-info/ dir shows packaging was attempted, but no config file exists in the tree. Create pyproject.toml with a src layout and a console script entry point ([project.scripts]) before running Day 1 tasks.

  2. code_normalizer_pro/cli.py has a broken import: from code_normalize_pro import main. After pip install, Python looks for a module named code_normalize_pro in site-packages, not in src/. Unless pyproject.toml declares a proper src layout, the installed CLI command will fail on launch.

Code bugs worth fixing:

  1. Cache default is on in __init__ but --cache flag implies opt-in and --no-cache implies opt-out. The flags and the default contradict each other. Pick one direction and make the help text match.

  2. --parallel --in-place silently disables backups. process_file_worker passes create_backup=False but backup logic only lives inside process_file. Users running parallel mode have no backups. Either warn loudly or fix it.

  3. walk_and_process and process_file both increment total_files for the same files. Summary stats will show inflated counts.

  4. .normalize-cache.json lands in CWD, not the target directory. Running the tool against three different projects from the same working directory makes them share one cache file, which corrupts it. Pass root / CACHE_FILE to CacheManager in walk_and_process.

  5. --dry-run always exits 0 even when it finds files needing normalization. CI pipelines need a non-zero exit to catch violations. Add --fail-on-changes or make dry-run exit 1 when changes are detected.
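
Bugs 1, 4, and 5 are small enough to sketch together. This assumes the CLI is built on argparse, which the README does not confirm; build_parser, cache_path, and exit_code are hypothetical names, and --fail-on-changes is the flag proposed in bug 5, not an existing one.

```python
import argparse
from pathlib import Path

CACHE_FILE = ".normalize-cache.json"  # file name taken from bug 4


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="code-normalizer-pro")
    parser.add_argument("root", type=Path, help="directory to normalize")
    # Bug 1: one paired flag, so --cache / --no-cache and the default agree.
    parser.add_argument(
        "--cache",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="use the SHA256 cache (default: on)",
    )
    parser.add_argument("--dry-run", action="store_true")
    # Bug 5: opt-in non-zero exit so CI can catch violations.
    parser.add_argument(
        "--fail-on-changes",
        action="store_true",
        help="exit 1 if any file needs normalization",
    )
    return parser


def cache_path(root: Path) -> Path:
    """Bug 4: keep the cache beside the project, not in the CWD."""
    return root / CACHE_FILE


def exit_code(args: argparse.Namespace, files_needing_changes: int) -> int:
    return 1 if args.fail_on_changes and files_needing_changes else 0
```

BooleanOptionalAction generates the --no-cache form automatically, so the help text cannot drift from the default again.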

Cleanup:

  1. code_normalize_pro.py at root -- stale copy. The real file is src/code_normalize_pro.py. Delete the root copy.
  2. roadmaps/New Text Document.txt -- empty temp file. Delete it.
  3. roadmaps/talking about code.txt -- saved AI chat session. Delete or move to docs/.
  4. All README_20260220_*.md.bak files throughout the tree -- ReadmeForge backups.
  5. config/settings.py is a clean env-var loader but nothing imports it. Either wire it into code_normalize_pro.py or remove it.
  6. README_PRO.md at root duplicates docs/README.md. Consolidate.
  7. restore_report.json and smoke_report.json at root -- generated artifacts, add to .gitignore.
  8. PROJECT_STATUS.md says roadmap docs are "coming soon" -- all 6 exist. Stale.

Launch Status (Path 1 - Solo Dev Tool)

EXECUTION_PLAN.md has a 7-day checklist. As of 2026-03-10, nothing started.

Before Day 1 tasks will work, pyproject.toml needs to exist (see Critical issue 1 under Known Issues).

Day 1 after pyproject.toml is in place:

Set-Location C:\Dev\PROJECTS\CODE
.\.venv\Scripts\Activate.ps1
python -m pytest -q
pip install -e .
code-normalizer-pro --help

Full release steps: see docs/release/alpha_release_checklist.md


CI

.github/workflows/ci.yml runs on push to main/master and on PRs:

  • Python 3.11
  • pip install from requirements.txt
  • CLI smoke check (--help on both main.py and src/code_normalize_pro.py)
  • pytest -q
  • python -m build (sdist + wheel)

Note: python -m build requires pyproject.toml. CI will fail until that exists.


Version History

Version   Date         Changes
v3.0      2026-02-09   Parallel processing, SHA256 caching, pre-commit hooks, multi-language syntax, interactive mode
v2.0      2026-02-09   Dry-run, in-place editing, backups, tqdm, detailed stats
v1.0      --           Basic encoding fix, CRLF, whitespace

Package version: 3.0.1 (set in code_normalizer_pro/__init__.py)


Developer: MR (Michael Rawls Jr.) -- Houston, TX -- GitHub: MRJR0101
