v0.2.3 - Add parallel `postprocess` (#173)

github-actions released this 22 Nov 20:44
· 18 commits to main since this release
3dd54b3

🌟 Summary

v0.2.3 makes the mkdocs Ultralytics plugin faster and more robust 🚀 by adding optional parallel HTML postprocessing and a smarter, thread-safe cache for GitHub author data.


📊 Key Changes

  • ⚙️ Parallel postprocess_site execution

    • New arguments: use_processes: bool = True and workers: int | None = None.
    • HTML files can now be processed in parallel using:
      • Multiple processes (ProcessPoolExecutor) or
      • Multiple threads (ThreadPoolExecutor), depending on use_processes.
    • Uses the workers value when one is given; otherwise picks a worker count automatically based on the number of CPU cores.
  • 🧱 Shared worker state for process pools

    • Introduces _WORKER_STATE plus the helpers _set_worker_state and _process_file, so large read-only data (like config and Git metadata) is set up once per worker instead of being sent with every task.
    • Reduces overhead and improves performance when using process-based parallelism.
  • 📈 Improved progress handling & logging

    • Uses a single global TQDM progress bar for all workers.
    • Enables logging for the single-worker (sequential) path; disables per-task logging in parallel pools to stay safe and pickle-friendly.
    • Prints a console message stating how many HTML files will be processed, the number of workers, and the mode (thread or process).
  • 🧵 Thread-safe, cached GitHub author lookups

    • Adds a global, shared cache:
      • _AUTHOR_CACHE, _AUTHOR_CACHE_MTIME, and _CACHE_LOCK in plugin/utils.py.
    • get_github_username_from_email:
      • Wraps cache access and updates in a lock to avoid data races when running in parallel.
      • Avoids duplicate GitHub REST API calls and redundant avatar URL resolution.
    • get_github_usernames_from_file:
      • Loads mkdocs_github_authors.yaml only once per process.
      • Tracks modification time so it can reload if the file changes.
      • Only writes the YAML file back when the cache actually changed, reducing I/O contention.
  • 🛡 More robust GitHub lookup behavior

    • Safely handles:
      • Empty email strings (logs a warning when verbose).
      • GitHub noreply emails by deriving the username directly and resolving the avatar URL once.
    • Ensures cache writes are consistent and guarded by a lock.
  • 🧪 Minor structural optimizations

    • Simplified markdown index (md_index) creation in postprocess_site using a dictionary comprehension.
    • Centralized common parameters into task_kwargs for cleaner worker submission logic.
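
The parallel execution and shared worker state items above can be illustrated with a minimal sketch. This is not the plugin's actual code: postprocess_all, the site_name key, and the worker-count heuristic are assumptions made for illustration, and only the names _WORKER_STATE, _set_worker_state, _process_file, use_processes, and workers come from these notes. It shows one common way to pick an executor class and to hand read-only data to each worker once through the pool initializer.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

# Module-level state, filled once per worker instead of being sent with every task.
_WORKER_STATE = {}


def _set_worker_state(state):
    """Pool initializer: store the shared read-only data in this worker."""
    _WORKER_STATE.update(state)


def _process_file(path):
    """Handle one file; only the small path argument travels with each task."""
    # Stand-in for real HTML postprocessing (assumption for illustration).
    return f"{_WORKER_STATE['site_name']}:{path}"


def postprocess_all(paths, state, use_processes=True, workers=None):
    """Hypothetical driver: run _process_file over paths sequentially or in a pool."""
    # Assumed heuristic: never more workers than files or CPU cores.
    workers = workers or min(len(paths), os.cpu_count() or 1)
    if workers <= 1:
        # Sequential path: no pool, so set the state directly.
        _set_worker_state(state)
        return [_process_file(p) for p in paths]
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls(
        max_workers=workers, initializer=_set_worker_state, initargs=(state,)
    ) as pool:
        # map() returns results in the same order as the input paths.
        return list(pool.map(_process_file, paths))
```

When the process-based mode is used from a script, the call usually needs to sit under an `if __name__ == "__main__":` guard on platforms that start workers with spawn (Windows and macOS by default). Threads avoid that requirement and the pickling cost, which is one reason a use_processes switch is handy.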

🎯 Purpose & Impact

  • 🚀 Faster documentation builds

    • Parallel processing of HTML files speeds up mkdocs builds, most noticeably for large sites with many pages.
    • Reduced overhead for Git metadata and author resolution in process pools.
  • 🤝 More stable parallel execution

    • Thread-safe caching and careful locking prevent race conditions and YAML file corruption when running with multiple workers.
    • Less risk of hitting GitHub rate limits due to repeated identical API calls.
  • 📉 Lower I/O and API usage

    • Only writes mkdocs_github_authors.yaml when needed.
    • Caches avatar URLs and usernames, so repeated builds or pages with the same authors don’t re-query GitHub.
  • 🔧 Flexible configuration for different environments

    • workers lets you tune performance for CI, local development, or resource-constrained machines.
    • use_processes lets you choose between process-based isolation (often faster for CPU-heavy tasks) and threads (lighter-weight, easier debugging).
  • 📚 Better user experience

    • Clearer progress output during postprocess_site.
    • More reliable author attribution and avatars in generated docs, with fewer intermittent failures under load.
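
The caching behavior described above (a lock around cache access, reloading only when the file's modification time changes, and writing only when something changed) can be sketched roughly as follows. This is an illustrative approximation, not the code in plugin/utils.py: the functions load_cache, lookup, and save_cache are hypothetical, the resolve callback stands in for the GitHub API call, and JSON is used in place of YAML so the example needs only the standard library.

```python
import json
import threading
from pathlib import Path

_AUTHOR_CACHE = {}
_AUTHOR_CACHE_MTIME = None
_CACHE_LOCK = threading.Lock()


def load_cache(path):
    """Reload from disk only if the file exists and its mtime differs from the last load."""
    global _AUTHOR_CACHE_MTIME
    path = Path(path)
    with _CACHE_LOCK:
        mtime = path.stat().st_mtime if path.exists() else None
        if mtime is not None and mtime != _AUTHOR_CACHE_MTIME:
            _AUTHOR_CACHE.clear()
            _AUTHOR_CACHE.update(json.loads(path.read_text()))
            _AUTHOR_CACHE_MTIME = mtime
        return dict(_AUTHOR_CACHE)  # Copy, so callers cannot mutate shared state.


def lookup(email, resolve):
    """Return (username, changed); resolve(email) runs only on a cache miss."""
    with _CACHE_LOCK:
        if email in _AUTHOR_CACHE:
            return _AUTHOR_CACHE[email], False
        # Holding the lock during resolve() is simple and prevents duplicate
        # lookups, at the cost of serializing misses (a deliberate trade-off here).
        _AUTHOR_CACHE[email] = resolve(email)
        return _AUTHOR_CACHE[email], True


def save_cache(path, changed):
    """Write the cache back only when it actually changed; return whether a write happened."""
    global _AUTHOR_CACHE_MTIME
    if not changed:
        return False
    path = Path(path)
    with _CACHE_LOCK:
        path.write_text(json.dumps(_AUTHOR_CACHE, indent=2))
        # Remember our own write so the next load_cache() does not reload it.
        _AUTHOR_CACHE_MTIME = path.stat().st_mtime
    return True
```

Note that a threading.Lock only protects threads within one process. With process pools each worker holds its own copy of the cache, which matches the "once per process" wording in these notes.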

In short: v0.2.3 focuses on performance, scalability, and robustness for documentation postprocessing, particularly when running mkdocs with parallel workers ⚡📚.

What's Changed

Full Changelog: v0.2.2...v0.2.3