In [11]:
%pip install jsonschema pathspec

Note: you may need to restart the kernel to use updated packages.


# Documentation for Directory Structure Automation

## Overview
This document details the process of creating, optimizing, and updating a Python script to manage directory structures. The focus is on implementing an automated and robust system for:
- Reading and updating directory structures.
- Generating and maintaining a JSON representation of the structure.
- Ensuring minimal redundancy and efficient file handling.

The document includes brainstorming ideas, optimizations, and implementation changes that were incorporated throughout the development process.

---

## Initial Requirements

### Goals:
1. Automate the creation of a directory structure.
2. Generate a JSON file to represent the directory structure.
3. Allow updates to the JSON file to reflect changes in the directory.
4. Avoid modifying existing files or directories unnecessarily.
5. Handle potential errors, including malformed JSON or missing files.

---

## Brainstorming and Challenges

### Key Questions Addressed:
1. **How to handle existing files and directories?**
   - Avoid overwriting existing directories or files unless explicitly needed.
   - Ensure efficient comparison of current and saved directory structures.

2. **How to optimize the update process?**
   - Compare the current structure with the saved structure in the JSON file.
   - Update only if differences are detected.

3. **How to handle errors?**
   - Manage scenarios where the JSON file is missing or corrupted.
   - Ensure graceful fallback and minimal disruption to the workflow.

4. **How to stay compatible across platforms?**
   - Ensure the script works seamlessly on Windows, Linux, and macOS.
   - Normalize directory paths for compatibility.
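
The path normalization mentioned in the last point can be sketched as follows. This is a minimal illustration, not the notebook's own code: `to_posix` and `normalized_relpath` are hypothetical helper names, and the sketch assumes that no file name contains a literal backslash.

```python
import os

def to_posix(rel_path: str) -> str:
    """Convert Windows separators to forward slashes (hypothetical helper)."""
    # Caveat: on POSIX systems a backslash is a legal file-name character,
    # so this replacement assumes no file names contain one.
    return rel_path.replace("\\", "/")

def normalized_relpath(path: str, root: str) -> str:
    """Return `path` relative to `root`, written with forward slashes."""
    return to_posix(os.path.relpath(path, root))

# A Windows-style relative path and a path built with the local separator
# both end up in the same forward-slash form.
print(to_posix("backend\\agents\\agent.py"))
print(normalized_relpath(os.path.join("project", "docs", "index.md"), "project"))
```

Storing paths in one separator style means a structure extracted on Windows can be compared with, or matched against, one produced on Linux or macOS.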

---

## Changes and Optimizations

### Initial Implementation:
- A script to create directory structures and save them to a JSON file.
- Basic support for generating placeholder files in directories.

### Optimizations:
1. **Addition of README.md Files:**
   - Automatically create a `README.md` file in each main directory.
   - Populate the file with a brief description of the directory's purpose.

2. **Efficient Updates to JSON File:**
   - Compare the current directory structure with the saved JSON.
   - Update the JSON file only if changes are detected.
   - Skip redundant writes to minimize processing time.

3. **Cross-Platform Improvements:**
   - Normalize directory paths using `os.path.relpath` and convert Windows separators (`\`) to forward slashes (`/`) so that paths compare consistently on Windows and Unix-based systems.

4. **Enhanced Error Handling:**
   - Gracefully handle missing or corrupted JSON files.
   - Provide clear error messages and fallback mechanisms.

5. **Dynamic Clearing of Terminal Output:**
   - Clear the terminal screen before displaying prompts, using `cls` on Windows and `clear` on Linux/macOS.

6. **Modularization:**
   - Split functionalities into separate functions to improve readability and reusability.
   - Functions include: `get_directory_structure`, `generate_structure_file`, `create_project_structure`, `update_json_file`, and `load_structure_from_file`.
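
The "update only if changes are detected" optimization could look roughly like the sketch below. `write_if_changed` is a hypothetical helper introduced for illustration; it is not one of the functions listed above, and it assumes the saved file holds the same nested dictionary format that the scan produces.

```python
import json
import os
import tempfile

def write_if_changed(structure: dict, json_path: str) -> bool:
    """Write `structure` to `json_path` only if it differs from the saved copy.

    Returns True when the file was written, False when it was left untouched.
    Hypothetical helper, not part of the notebook.
    """
    try:
        with open(json_path, "r", encoding="utf-8") as f:
            saved = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        saved = None  # missing or corrupted file: treat as "needs writing"
    if saved == structure:
        return False  # nothing changed, skip the redundant write
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(structure, f, indent=4)
    return True

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "structure.json")
    first = write_if_changed({"src": {"": ["main.py"]}}, path)    # file missing
    second = write_if_changed({"src": {"": ["main.py"]}}, path)   # identical
    third = write_if_changed({"src": {"": ["main.py", "util.py"]}}, path)

print(first, second, third)  # True False True
```

Dictionary comparison ignores key order but list comparison does not, so file lists would need to be sorted before comparing if the scan order is not stable.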

---

## Final Implementation

### Core Features:
1. **Directory Scanning and Structure Generation:**
   - Recursively scans directories to generate a nested dictionary representation of the structure.

2. **JSON File Management:**
   - Reads the existing JSON file.
   - Updates the file only if changes are detected in the directory structure.

3. **README.md Generation:**
   - Creates a `README.md` file in each main directory with relevant descriptions.

4. **Cross-Platform Compatibility:**
   - Supports Windows, Linux, and macOS.

5. **Error Handling:**
   - Manages missing files, malformed JSON, and other edge cases gracefully.
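
To make the nested dictionary representation concrete, here is a small self-contained sketch of a recursive scan. It follows the convention used by `get_directory_structure` in the code cell below (files of a directory are collected in a list under the empty-string key), but `scan` itself is a simplified, hypothetical stand-in that skips `.gitignore` handling and sorts entries for a stable result.

```python
import os
import tempfile

def scan(path: str) -> dict:
    """Return a nested dict for `path`: subdirectories map to dicts,
    and the files of each directory are listed under the "" key."""
    node = {}
    with os.scandir(path) as entries:
        # Sorting is an addition for reproducible output; os.scandir
        # itself yields entries in arbitrary order.
        for entry in sorted(entries, key=lambda e: e.name):
            if entry.is_dir():
                node[entry.name] = scan(entry.path)
            else:
                node.setdefault("", []).append(entry.name)
    return node

# Build a tiny throwaway tree and scan it.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "backend", "agents"))
    open(os.path.join(root, "backend", "agents", "agent.py"), "w").close()
    open(os.path.join(root, "README.md"), "w").close()
    demo = scan(root)

print(demo)
```

Because the result is built only from dicts, lists, and strings, it can be passed directly to `json.dump`.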

---

## Implementation Details

### Key Functions:
1. **`get_directory_structure`:**
   - Scans a directory and generates a nested dictionary structure.

2. **`generate_structure_file`:**
   - Saves the generated structure to a JSON file.

3. **`create_project_structure`:**
   - Creates directories and placeholder files based on a given dictionary structure.

4. **`update_json_file`:**
   - Updates the JSON file to reflect the latest directory structure, deleting outdated content if necessary.

5. **`load_structure_from_file`:**
   - Loads the JSON file and returns its contents, handling errors for missing or malformed files.
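
One way to realize the fallback behavior described for missing or malformed files is sketched below. `load_structure_or_default` is a hypothetical name, and returning a default structure instead of raising is an assumption about the desired behavior, not something the notebook code confirms.

```python
import json
import logging
import os
import tempfile

def load_structure_or_default(file_path: str, default=None) -> dict:
    """Load a structure from JSON, falling back to `default` (or an empty
    dict) when the file is missing or cannot be parsed. Hypothetical helper."""
    fallback = {} if default is None else default
    try:
        with open(file_path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        logging.warning("Structure file %s not found; using fallback.", file_path)
    except json.JSONDecodeError as e:
        logging.warning("Malformed JSON in %s (%s); using fallback.", file_path, e)
    return fallback

with tempfile.TemporaryDirectory() as tmp:
    # Case 1: the file does not exist.
    missing = load_structure_or_default(os.path.join(tmp, "missing.json"))
    # Case 2: the file exists but is not valid JSON.
    broken_path = os.path.join(tmp, "broken.json")
    with open(broken_path, "w", encoding="utf-8") as f:
        f.write("{ not valid json")
    broken = load_structure_or_default(broken_path, default={"src": {}})

print(missing, broken)
```

Whether to fall back silently or stop with an error depends on the workflow: a fallback suits the extract path, while creating a project from an unreadable definition is probably better treated as a hard failure.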

---

## Future Enhancements

### Potential Improvements:
1. **Logging:**
   - Add detailed logging for operations such as file creation, updates, and errors.

2. **Parallel Processing:**
   - Use multithreading to improve performance when scanning large directories.

3. **Customizable Directory Templates:**
   - Allow users to define custom templates for directory and file structures.

4. **Interactive CLI:**
   - Enhance the CLI with options for interactive configuration of directories and files.
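
As an illustration of the customizable-templates idea, the sketch below merges user-supplied templates over a set of defaults. Everything here is an assumption made for the example: the names `DEFAULT_TEMPLATES`, `merge_templates`, and `render`, and the choice to supply user templates as a JSON object mapping file names to content.

```python
import json

# Defaults in the same "{directory}" placeholder style as the notebook's TEMPLATES.
DEFAULT_TEMPLATES = {
    "README.md": "# {directory}\n\nThis directory contains resources related to {directory}.",
}

def merge_templates(defaults: dict, user_json: str) -> dict:
    """Return the defaults overridden by templates parsed from a JSON string."""
    user = json.loads(user_json)
    return {**defaults, **user}

def render(templates: dict, file_name: str, directory: str) -> str:
    """Fill the {directory} placeholder; unknown file names render as empty."""
    # str.replace is used instead of str.format so that other braces in a
    # template (for example in code or JSON content) are left untouched.
    return templates.get(file_name, "").replace("{directory}", directory)

user_templates = '{"main.py": "# {directory} entry point\\n"}'
templates = merge_templates(DEFAULT_TEMPLATES, user_templates)

print(render(templates, "main.py", "backend"))
print(render(templates, "README.md", "backend"))
```

Keeping user templates in a separate file would let a team change the generated boilerplate without editing the script itself.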

---

## Conclusion
This project automates directory structure management: it scans a directory into a nested dictionary, stores that representation as JSON, and recreates the layout without overwriting files or directories that already exist. Change-aware updates, error handling for missing or malformed JSON, and path normalization keep it usable on Windows, Linux, and macOS. The future enhancements outlined above would extend it to larger directory trees and user-defined layouts.



In [2]:
import os
import json
import platform
import logging
from concurrent.futures import ThreadPoolExecutor
from jsonschema import validate, ValidationError
from pathspec import PathSpec

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
    handlers=[logging.FileHandler("structure_tool.log"), logging.StreamHandler()],
)

# JSON structure schema for validation
STRUCTURE_SCHEMA = {
    "type": "object",
    "patternProperties": {
        ".*": {
            "anyOf": [
                {"type": "object"},
                {"type": "array", "items": {"type": "string"}}
            ]
        }
    }
}

# Templates for common files
TEMPLATES = {
    "README.md": "# {directory}\n\nThis directory contains resources related to {directory}.",
    "main.py": "# Entry point for the application\n\nif __name__ == '__main__':\n    print('Hello, World!')"
}

def load_gitignore_patterns(root_dir):
    """
    Loads .gitignore patterns using pathspec for efficient pattern matching.
    """
    gitignore_path = os.path.join(root_dir, ".gitignore")
    if os.path.exists(gitignore_path):
        with open(gitignore_path, 'r') as gitignore:
            return PathSpec.from_lines("gitwildmatch", gitignore)
    return None

def should_ignore(path, spec):
    """
    Checks if a path should be ignored based on the .gitignore spec.
    """
    return spec.match_file(path) if spec else False

def validate_structure(structure):
    """
    Validates the JSON structure against the predefined schema.
    """
    try:
        validate(instance=structure, schema=STRUCTURE_SCHEMA)
        logging.info("JSON structure is valid.")
    except ValidationError as e:
        logging.error(f"Invalid JSON structure: {e.message}")
        raise

def get_directory_structure(root_dir):
    """
    Optimized directory traversal using os.scandir.
    """
    spec = load_gitignore_patterns(root_dir)
    structure = {}

    def scan_dir(path, current):
        with os.scandir(path) as it:
            for entry in it:
                relative_path = os.path.relpath(entry.path, root_dir).replace("\\", "/")
                if should_ignore(relative_path, spec):
                    continue
                if entry.is_dir():
                    current[entry.name] = {}
                    scan_dir(entry.path, current[entry.name])
                else:
                    current.setdefault("", []).append(entry.name)

    scan_dir(root_dir, structure)
    return structure

def generate_structure_file(root_dir, output_file="structure.json"):
    """
    Generates a JSON file representing the directory structure of the given root directory.
    """
    structure = get_directory_structure(root_dir)
    with open(output_file, 'w') as f:
        json.dump(structure, f, indent=4)
    logging.info(f"Directory structure saved to {output_file}")

def create_file_with_template(file_path, directory_name):
    """
    Creates a file with predefined content based on its name.
    """
    file_name = os.path.basename(file_path)
    content = TEMPLATES.get(file_name, "")
    with open(file_path, 'w') as f:
        f.write(content.format(directory=directory_name))

def create_project_structure(base_path, structure, dry_run=False):
    """
    Creates a project directory structure or prints it in dry-run mode.
    Skips existing directories and files to avoid redundant operations.
    """
    def create_files_and_dirs(base, items):
        for key, value in items.items():
            current_path = os.path.join(base, key)
            if isinstance(value, dict):  # A directory with subdirectories
                if not os.path.exists(current_path):
                    if dry_run:
                        logging.info(f"Would create directory: {current_path}")
                    else:
                        os.makedirs(current_path, exist_ok=True)
                        logging.info(f"Created directory: {current_path}")
                    # Create a README.md for new directories
                    readme_path = os.path.join(current_path, "README.md")
                    if not os.path.exists(readme_path):
                        if dry_run:
                            logging.info(f"Would create file: {readme_path}")
                        else:
                            create_file_with_template(readme_path, key)
                create_files_and_dirs(current_path, value)
            elif isinstance(value, list):  # A directory with files
                if not os.path.exists(current_path):
                    if dry_run:
                        logging.info(f"Would create directory: {current_path}")
                    else:
                        os.makedirs(current_path, exist_ok=True)
                        logging.info(f"Created directory: {current_path}")
                for file in value:
                    file_path = os.path.join(current_path, file)
                    if not os.path.exists(file_path):
                        if dry_run:
                            logging.info(f"Would create file: {file_path}")
                        else:
                            open(file_path, 'w').close()
                            logging.info(f"Created file: {file_path}")
                    else:
                        logging.info(f"Skipped existing file: {file_path}")

    create_files_and_dirs(base_path, structure)

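def load_structure_from_file(file_path):
    """
    Loads a project structure from a JSON file.
    Logs and re-raises if the file is missing or contains malformed JSON.
    """
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError) as e:
        logging.error(f"Could not load structure from '{file_path}': {e}")
        raise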


In [3]:
def main():
    os_type = platform.system()
    logging.info(f"Operating System detected: {os_type}")
    print("Choose an option:")
    print(" - create: Create a new project structure from a JSON file")
    print(" - extract: Extract an existing directory structure to a JSON file")
    print(" - dry-run: Preview the directory structure without creating it")
    choice = input("Enter your choice (create, extract, dry-run): ").strip().lower()

    if choice == "create":
        json_file_path = input("Enter the path to the JSON file defining the structure: ").strip()
        if not json_file_path:
            logging.error("No JSON file path provided.")
            return
        base_directory = input("Enter the base directory for the new project: ").strip()
        if not base_directory:
            logging.error("No base directory provided.")
            return

        if os.path.isfile(json_file_path):
            structure = load_structure_from_file(json_file_path)
            validate_structure(structure)
            create_project_structure(base_directory, structure)
            logging.info(f"Project structure created at {os.path.abspath(base_directory)}")
        else:
            logging.error(f"The file '{json_file_path}' does not exist.")

    elif choice == "extract":
        root_directory = input("Enter the directory to scan: ").strip()
        if not root_directory:
            logging.error("No root directory provided.")
            return
        output_file = input("Enter the output file name (default: structure.json): ").strip() or "structure.json"

        if os.path.isdir(root_directory):
            generate_structure_file(root_directory, output_file)
        else:
            logging.error(f"The directory '{root_directory}' does not exist.")

    elif choice == "dry-run":
        json_file_path = input("Enter the path to the JSON file defining the structure: ").strip()
        if not json_file_path:
            logging.error("No JSON file path provided.")
            return
        base_directory = input("Enter the base directory to preview the structure: ").strip()
        if not base_directory:
            logging.error("No base directory provided.")
            return

        if os.path.isfile(json_file_path):
            structure = load_structure_from_file(json_file_path)
            validate_structure(structure)
            create_project_structure(base_directory, structure, dry_run=True)
        else:
            logging.error(f"The file '{json_file_path}' does not exist.")

    else:
        logging.error("Invalid choice. Please enter 'create', 'extract', or 'dry-run'.")


In [5]:
if __name__ == "__main__":
    os_type = platform.system()
    if os_type == "Windows":
        os.system("cls")
    elif os_type in ["Linux", "Darwin"]:
        os.system("clear")
    main()


2025-01-25 21:34:40,966 - INFO - Operating System detected: Windows


Choose an option:
 - create: Create a new project structure from a JSON file
 - extract: Extract an existing directory structure to a JSON file
 - dry-run: Preview the directory structure without creating it


2025-01-25 21:35:09,689 - INFO - JSON structure is valid.
2025-01-25 21:35:09,689 - INFO - Created directory: C:\Users\sathish\Downloads\AB\backend\agents
2025-01-25 21:35:09,691 - INFO - Created file: C:\Users\sathish\Downloads\AB\backend\agents\__init__.py
2025-01-25 21:35:09,691 - INFO - Created file: C:\Users\sathish\Downloads\AB\backend\agents\agent.py
2025-01-25 21:35:09,692 - INFO - Created file: C:\Users\sathish\Downloads\AB\backend\agents\multi_agent_manager.py
2025-01-25 21:35:09,692 - INFO - Created file: C:\Users\sathish\Downloads\AB\backend\agents\agent_state_manager.py
2025-01-25 21:35:09,693 - INFO - Created file: C:\Users\sathish\Downloads\AB\backend\agents\agent_collaboration.py
2025-01-25 21:35:09,693 - INFO - Created file: C:\Users\sathish\Downloads\AB\backend\agents\bureaucracy_agent.py
2025-01-25 21:35:09,694 - INFO - Created directory: C:\Users\sathish\Downloads\AB\backend\langchain_tools
2025-01-25 21:35:09,695 - INFO - Created directory: C:\Users\sathish\Downloa