# Folder Analysis Tool - Testing Notebook

This notebook tests and demonstrates the folder analysis functionality.
The core logic lives in the `src` package; the notebook imports those functions and exercises them.

## Overview
- Test the folder structure analysis functions
- Demonstrate different output formats
- Validate functionality with various folder structures
- Show examples of programmatic usage

## 1. Import Required Libraries

First, let's import the standard libraries we'll need for testing.

In [1]:
import os
import sys
import tempfile
import shutil
from pathlib import Path
from io import StringIO

## 2. Import Custom Functionality from src Package

Add the project root to the Python path so the `src` package is importable, then import the folder analysis functions.

In [2]:
# Add the parent directory to the Python path so we can import from src
notebook_dir = Path.cwd()
project_root = notebook_dir.parent
sys.path.insert(0, str(project_root))

# Import our custom folder analysis functions
from src.folder_analyzer import (
    analyze_folder,
    print_tree_structure, 
    get_folder_structure_as_string,
    create_argument_parser
)

print("Successfully imported folder analysis functions!")
print(f"Project root: {project_root}")
print(f"Current working directory: {notebook_dir}")

Successfully imported folder analysis functions!
Project root: c:\Users\leopo\Documents\GitHub\folder-analysis-tool
Current working directory: c:\Users\leopo\Documents\GitHub\folder-analysis-tool\notebooks
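
Note that `Path.cwd().parent` is the project root only when the notebook is launched from the `notebooks/` folder. As a minimal sketch of a more defensive lookup, assuming the project root is the nearest ancestor that contains a `src` directory, the hypothetical helper `find_project_root` (not part of the `src` package) walks upward from a starting path:

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "src") -> Optional[Path]:
    """Return the nearest ancestor of `start` (or `start` itself) containing `marker`.

    Hypothetical helper for illustration; it is not part of the src package.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    return None  # No ancestor contains the marker directory


root = find_project_root(Path.cwd())
print(f"Detected project root: {root}")
```

If the helper returns `None`, it is safer to stop and report the problem than to insert a guessed path into `sys.path`.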


## 3. Set Up Test Environment

Create a sample directory structure for testing our folder analysis functionality.

In [3]:
# Create a temporary directory for testing
test_base_dir = Path(tempfile.mkdtemp(prefix="folder_analysis_test_"))
print(f"Created test directory: {test_base_dir}")

# Create a sample directory structure
sample_structure = {
    "documents": {
        "reports": ["report1.pdf", "report2.docx"],
        "presentations": ["slides.pptx"],
        "data": {
            "csv_files": ["data1.csv", "data2.csv", "data3.csv"],
            "json_files": ["config.json", "metadata.json"]
        }
    },
    "projects": {
        "project_a": {
            "src": ["main.py", "utils.py"],
            "tests": ["test_main.py"],
            "docs": ["README.md"]
        },
        "project_b": {
            "scripts": ["script1.sh", "script2.bat"],
            "output": []  # Empty directory
        }
    },
    "media": {
        "images": ["photo1.jpg", "photo2.png", "logo.svg"],
        "videos": ["video1.mp4"]
    }
}

def create_structure(base_path, structure):
    """Recursively create the directory structure"""
    for name, content in structure.items():
        current_path = base_path / name
        if isinstance(content, dict):
            # It's a directory
            current_path.mkdir(exist_ok=True)
            create_structure(current_path, content)
        elif isinstance(content, list):
            # It's a directory with files
            current_path.mkdir(exist_ok=True)
            for file_name in content:
                (current_path / file_name).touch()

# Create the test structure
create_structure(test_base_dir, sample_structure)
print("Sample directory structure created successfully!")

# Show what we created
print("\nTest directory contents:")
for item in sorted(test_base_dir.rglob("*")):
    relative_path = item.relative_to(test_base_dir)
    if item.is_dir():
        print(f"📁 {relative_path}/")
    else:
        print(f"📄 {relative_path}")

Created test directory: C:\Users\leopo\AppData\Local\Temp\folder_analysis_test_ixalu9ve
Sample directory structure created successfully!

Test directory contents:
📁 documents/
📁 documents\data/
📁 documents\data\csv_files/
📄 documents\data\csv_files\data1.csv
📄 documents\data\csv_files\data2.csv
📄 documents\data\csv_files\data3.csv
📁 documents\data\json_files/
📄 documents\data\json_files\config.json
📄 documents\data\json_files\metadata.json
📁 documents\presentations/
📄 documents\presentations\slides.pptx
📁 documents\reports/
📄 documents\reports\report1.pdf
📄 documents\reports\report2.docx
📁 media/
📁 media\images/
📄 media\images\logo.svg
📄 media\images\photo1.jpg
📄 media\images\photo2.png
📁 media\videos/
📄 media\videos\video1.mp4
📁 projects/
📁 projects\project_a/
📁 projects\project_a\docs/
📄 projects\project_a\docs\README.md
📁 projects\project_a\src/
📄 projects\project_a\src\main.py
📄 projects\project_a\src\utils.py
📁 projects\project_a\tests/
📄 projects\project_a\tests\test_main.py
📁 pr
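
One way to check the listing above against the specification is to count the directories and files that the nested dictionary describes. The sketch below assumes the same convention as `sample_structure` (a dict value is a directory of subdirectories, a list value is a directory of files); `count_expected` and the small `demo_structure` are hypothetical names used only for illustration.

```python
def count_expected(structure):
    """Return (directories, files) described by a nested structure dict."""
    dirs = 0
    files = 0
    for content in structure.values():
        dirs += 1  # Every key is a directory, whether its value is a dict or a list
        if isinstance(content, dict):
            sub_dirs, sub_files = count_expected(content)
            dirs += sub_dirs
            files += sub_files
        elif isinstance(content, list):
            files += len(content)
    return dirs, files


demo_structure = {
    "docs": {"guides": ["intro.md", "setup.md"], "empty": []},
    "media": ["logo.svg"],
}

expected_dirs, expected_files = count_expected(demo_structure)
print(f"Expected: {expected_dirs} directories, {expected_files} files")
```

Applied to `sample_structure`, the same function gives 17 directories and 18 files, which can be compared with the counts obtained from `test_base_dir.rglob("*")`.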

## 4. Test Core Functionality

Now let's test the main functions from our folder analyzer module.

In [4]:
# Test 1: Basic tree structure printing
print("=== Test 1: Basic tree structure printing ===")
print("Testing print_tree_structure function:")
print()

print_tree_structure(test_base_dir)

=== Test 1: Basic tree structure printing ===
Testing print_tree_structure function:

folder_analysis_test_ixalu9ve/
├── documents/
│   ├── data/
│   │   ├── csv_files/
│   │   │   ├── data1.csv
│   │   │   ├── data2.csv
│   │   │   └── data3.csv
│   │   └── json_files/
│   │       ├── config.json
│   │       └── metadata.json
│   ├── presentations/
│   │   └── slides.pptx
│   └── reports/
│       ├── report1.pdf
│       └── report2.docx
├── media/
│   ├── images/
│   │   ├── logo.svg
│   │   ├── photo1.jpg
│   │   └── photo2.png
│   └── videos/
│       └── video1.mp4
└── projects/
    ├── project_a/
    │   ├── docs/
    │   │   └── README.md
    │   ├── src/
    │   │   ├── main.py
    │   │   └── utils.py
    │   └── tests/
    │       └── test_main.py
    └── project_b/
        ├── output/
        └── scripts/
            ├── script1.sh
            └── script2.bat
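
For readers curious how such output can be produced, here is a minimal sketch of a recursive tree renderer. It is not the implementation in `src.folder_analyzer`; the function name `render_tree` and the ordering rule (directories first, then files, each sorted case-insensitively) are assumptions chosen to resemble the output above.

```python
import tempfile
from pathlib import Path


def render_tree(directory, prefix=""):
    """Return the tree lines for everything below `directory`."""
    entries = sorted(
        Path(directory).iterdir(),
        key=lambda p: (p.is_file(), p.name.lower()),  # Directories sort first
    )
    lines = []
    for index, entry in enumerate(entries):
        is_last = index == len(entries) - 1
        connector = "└── " if is_last else "├── "
        suffix = "/" if entry.is_dir() else ""
        lines.append(f"{prefix}{connector}{entry.name}{suffix}")
        if entry.is_dir():
            # Continue the vertical rule only while siblings remain below
            child_prefix = prefix + ("    " if is_last else "│   ")
            lines.extend(render_tree(entry, child_prefix))
    return lines


# Small self-contained demo in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "demo"
    (root / "docs").mkdir(parents=True)
    (root / "docs" / "README.md").touch()
    (root / "main.py").touch()
    demo_lines = [f"{root.name}/", *render_tree(root)]
    print("\n".join(demo_lines))
```

The prefix passed to each recursive call is what keeps the vertical rules aligned: a `│   ` segment is added while the parent still has siblings to draw, and four spaces once it is the last entry.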


In [5]:
# Force reload the module to get the latest changes
import importlib
import src.folder_analyzer
importlib.reload(src.folder_analyzer)

# Re-import the functions
from src.folder_analyzer import (
    analyze_folder,
    print_tree_structure, 
    get_folder_structure_as_string,
    create_argument_parser
)

print("Module reloaded successfully!")

Module reloaded successfully!


In [6]:
# Test 2: Get structure as string (programmatic use)
print("=== Test 2: Get structure as string ===")
print("Testing get_folder_structure_as_string function:")
print()

structure_string = get_folder_structure_as_string(test_base_dir)
print("Structure as string:")
print(structure_string)

# Show that we can manipulate the string programmatically
lines = structure_string.strip().split('\n')
print("\nStructure analysis:")
print(f"Total lines: {len(lines)}")
print(f"Directories: {len([line for line in lines if line.endswith('/')])}")
# Parenthesize the connector test: `and` binds tighter than `or`
print(f"Files: {len([line for line in lines if not line.endswith('/') and ('├──' in line or '└──' in line)])}")

=== Test 2: Get structure as string ===
Testing get_folder_structure_as_string function:

Structure as string:
folder_analysis_test_ixalu9ve/
├── documents/
│   ├── data/
│   │   ├── csv_files/
│   │   │   ├── data1.csv
│   │   │   ├── data2.csv
│   │   │   └── data3.csv
│   │   └── json_files/
│   │       ├── config.json
│   │       └── metadata.json
│   ├── presentations/
│   │   └── slides.pptx
│   └── reports/
│       ├── report1.pdf
│       └── report2.docx
├── media/
│   ├── images/
│   │   ├── logo.svg
│   │   ├── photo1.jpg
│   │   └── photo2.png
│   └── videos/
│       └── video1.mp4
└── projects/
    ├── project_a/
    │   ├── docs/
    │   │   └── README.md
    │   ├── src/
    │   │   ├── main.py
    │   │   └── utils.py
    │   └── tests/
    │       └── test_main.py
    └── project_b/
        ├── output/
        └── scripts/
            ├── script1.sh
            └── script2.bat


Structure analysis:
Total lines: 36
Directories: 18
Files: 18
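
String matching on the connectors is easy to get wrong, so a small parser can make the counts more dependable. The following sketch assumes the format shown above: four characters of indentation per level, a `├── ` or `└── ` connector before each name, and a trailing `/` on directories. `parse_tree_line` is a hypothetical helper, not part of the `src` package.

```python
def parse_tree_line(line):
    """Split one tree line into (depth, name, is_dir); the root line has depth 0."""
    for connector in ("├── ", "└── "):
        position = line.find(connector)
        if position != -1:
            depth = position // 4 + 1  # Each level indents by four characters
            name = line[position + len(connector):]
            break
    else:
        depth, name = 0, line  # No connector: this is the root line
    is_dir = name.endswith("/")
    return depth, name.rstrip("/"), is_dir


sample_text = """demo/
├── docs/
│   └── README.md
└── main.py"""

parsed = [parse_tree_line(line) for line in sample_text.splitlines()]
file_count = sum(1 for _, _, is_dir in parsed if not is_dir)
dir_count = sum(1 for _, _, is_dir in parsed if is_dir)
print(f"Directories: {dir_count}, Files: {file_count}")
```

Because each line is classified exactly once, the file and directory counts always add up to the total number of lines.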


In [7]:
# Test 3: analyze_folder function with output to file
print("=== Test 3: Analyze folder with output to file ===")

# Create an output directory
output_dir = test_base_dir / "analysis_output"
output_dir.mkdir(exist_ok=True)

print("Testing analyze_folder function with file output:")
print()

# Analyze and save to file
output_file_path = analyze_folder(test_base_dir, str(output_dir))

print(f"\nOutput file created: {output_file_path}")

# Read and display the contents of the output file
if output_file_path and Path(output_file_path).exists():
    with open(output_file_path, 'r', encoding='utf-8') as f:
        file_contents = f.read()
    
    print("\nContents of the output file:")
    print("=" * 40)
    print(file_contents)
    print("=" * 40)

=== Test 3: Analyze folder with output to file ===
Testing analyze_folder function with file output:

Analyzing folder structure of: C:\Users\leopo\AppData\Local\Temp\folder_analysis_test_ixalu9ve
Results will be saved to: C:\Users\leopo\AppData\Local\Temp\folder_analysis_test_ixalu9ve\analysis_output\folder_structure.txt
folder_analysis_test_ixalu9ve/
├── analysis_output/
│   └── folder_structure.txt
├── documents/
│   ├── data/
│   │   ├── csv_files/
│   │   │   ├── data1.csv
│   │   │   ├── data2.csv
│   │   │   └── data3.csv
│   │   └── json_files/
│   │       ├── config.json
│   │       └── metadata.json
│   ├── presentations/
│   │   └── slides.pptx
│   └── reports/
│       ├── report1.pdf
│       └── report2.docx
├── media/
│   ├── images/
│   │   ├── logo.svg
│   │   ├── photo1.jpg
│   │   └── photo2.png
│   └── videos/
│       └── video1.mp4
└── projects/
    ├── project_a/
    │   ├── docs/
    │   │   └── README.md
    │   ├── src/
    │   │   ├── main.py
    │   │   └── uti

## 5. Analyze Sample Data

Let's analyze individual subdirectories of the test structure, then the real project directory.

In [8]:
# Analyze specific subdirectories
subdirs_to_analyze = ["documents", "projects", "media"]

for subdir in subdirs_to_analyze:
    subdir_path = test_base_dir / subdir
    if subdir_path.exists():
        print(f"=== Analysis of {subdir} subdirectory ===")
        print_tree_structure(subdir_path)
        print("\n" + "-" * 50 + "\n")

=== Analysis of documents subdirectory ===
documents/
├── data/
│   ├── csv_files/
│   │   ├── data1.csv
│   │   ├── data2.csv
│   │   └── data3.csv
│   └── json_files/
│       ├── config.json
│       └── metadata.json
├── presentations/
│   └── slides.pptx
└── reports/
    ├── report1.pdf
    └── report2.docx

--------------------------------------------------

=== Analysis of projects subdirectory ===
projects/
├── project_a/
│   ├── docs/
│   │   └── README.md
│   ├── src/
│   │   ├── main.py
│   │   └── utils.py
│   └── tests/
│       └── test_main.py
└── project_b/
    ├── output/
    └── scripts/
        ├── script1.sh
        └── script2.bat

--------------------------------------------------

=== Analysis of media subdirectory ===
media/
├── images/
│   ├── logo.svg
│   ├── photo1.jpg
│   └── photo2.png
└── videos/
    └── video1.mp4

--------------------------------------------------



In [9]:
# Analyze our actual project directory
print("=== Analysis of our actual project directory ===")
print("Let's analyze the structure of our folder-analysis-tool project:")
print()

actual_project_dir = project_root
print_tree_structure(actual_project_dir)

=== Analysis of our actual project directory ===
Let's analyze the structure of our folder-analysis-tool project:

folder-analysis-tool/
├── notebooks/
│   ├── config.json
│   └── test_folder_analyzer.ipynb
├── output/
│   └── folder_structure.txt
├── src/
│   ├── __init__.py
│   ├── config_manager.py
│   └── folder_analyzer.py
├── config.json
├── main.py
└── README.md


## 6. Display Results and Visualizations

Let's compute summary statistics (item counts, file extensions, depth, and directory sizes) for the directories analyzed above.

In [10]:
def analyze_directory_stats(directory_path):
    """Generate statistics about a directory structure"""
    path = Path(directory_path)
    
    if not path.exists():
        return None
    
    stats = {
        'total_items': 0,
        'directories': 0,
        'files': 0,
        'file_extensions': {},
        'max_depth': 0,
        'largest_files': [],
        'directory_sizes': {}
    }
    
    # Analyze the directory structure
    for item in path.rglob("*"):
        stats['total_items'] += 1
        
        # Calculate depth
        depth = len(item.relative_to(path).parts)
        stats['max_depth'] = max(stats['max_depth'], depth)
        
        if item.is_dir():
            stats['directories'] += 1
            # Count the direct children of this directory.
            # Note: the key is the bare name, so same-named directories overwrite each other.
            dir_item_count = len(list(item.iterdir()))
            stats['directory_sizes'][item.name] = dir_item_count
        else:
            stats['files'] += 1
            
            # Track file extensions
            ext = item.suffix.lower()
            if ext:
                stats['file_extensions'][ext] = stats['file_extensions'].get(ext, 0) + 1
            
            # Track file sizes (if possible)
            try:
                size = item.stat().st_size
                stats['largest_files'].append((item.name, size))
            except OSError:  # PermissionError is a subclass of OSError
                pass
    
    # Sort largest files
    stats['largest_files'].sort(key=lambda x: x[1], reverse=True)
    stats['largest_files'] = stats['largest_files'][:5]  # Top 5
    
    return stats

# Generate statistics for our test directory
print("=== Directory Statistics Analysis ===")
test_stats = analyze_directory_stats(test_base_dir)

if test_stats:
    print(f"📊 **Summary Statistics for Test Directory:**")
    print(f"   • Total items: {test_stats['total_items']}")
    print(f"   • Directories: {test_stats['directories']}")
    print(f"   • Files: {test_stats['files']}")
    print(f"   • Maximum depth: {test_stats['max_depth']}")
    
    print(f"\n📁 **File Extensions Found:**")
    for ext, count in sorted(test_stats['file_extensions'].items()):
        print(f"   • {ext}: {count} files")
    
    print(f"\n📂 **Directory Sizes (item count):**")
    for dir_name, count in sorted(test_stats['directory_sizes'].items(), key=lambda x: x[1], reverse=True)[:10]:
        print(f"   • {dir_name}: {count} items")

# Also analyze our project directory
print(f"\n{'='*50}")
project_stats = analyze_directory_stats(project_root)

if project_stats:
    print(f"📊 **Summary Statistics for Project Directory:**")
    print(f"   • Total items: {project_stats['total_items']}")
    print(f"   • Directories: {project_stats['directories']}")
    print(f"   • Files: {project_stats['files']}")
    print(f"   • Maximum depth: {project_stats['max_depth']}")
    
    print(f"\n📁 **File Extensions Found:**")
    for ext, count in sorted(project_stats['file_extensions'].items()):
        print(f"   • {ext}: {count} files")

=== Directory Statistics Analysis ===
📊 **Summary Statistics for Test Directory:**
   • Total items: 37
   • Directories: 18
   • Files: 19
   • Maximum depth: 4

📁 **File Extensions Found:**
   • .bat: 1 files
   • .csv: 3 files
   • .docx: 1 files
   • .jpg: 1 files
   • .json: 2 files
   • .md: 1 files
   • .mp4: 1 files
   • .pdf: 1 files
   • .png: 1 files
   • .pptx: 1 files
   • .py: 3 files
   • .sh: 1 files
   • .svg: 1 files
   • .txt: 1 files

📂 **Directory Sizes (item count):**
   • documents: 3 items
   • csv_files: 3 items
   • images: 3 items
   • project_a: 3 items
   • media: 2 items
   • projects: 2 items
   • data: 2 items
   • reports: 2 items
   • json_files: 2 items
   • project_b: 2 items

==================================================
📊 **Summary Statistics for Project Directory:**
   • Total items: 88
   • Directories: 29
   • Files: 59
   • Maximum depth: 6

📁 **File Extensions Found:**
   • .idx: 1 files
   • .ipynb: 1 files
   • .json: 2 files
   • .md: 1 files
   • .pack: 1 files
   • .py: 4 files
   • 
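
The `.idx` and `.pack` extensions in the project statistics suggest that `rglob("*")` also walks hidden directories such as `.git`, which the tree output for the project does not list. If the statistics should cover only visible entries, one option is to skip any path with a dot-prefixed component. This is a sketch under that assumption; `iter_visible` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def iter_visible(root):
    """Yield paths under `root`, skipping anything inside a dot-prefixed entry."""
    root = Path(root)
    for item in root.rglob("*"):
        parts = item.relative_to(root).parts
        if any(part.startswith(".") for part in parts):
            continue  # Hidden file, hidden directory, or something inside one
        yield item


# Demo: a fake repository with a hidden .git directory
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / ".git" / "objects").mkdir(parents=True)
    (repo / ".git" / "objects" / "data.pack").touch()
    (repo / "src").mkdir()
    (repo / "src" / "main.py").touch()
    visible = sorted(p.relative_to(repo).as_posix() for p in iter_visible(repo))
    print(visible)
```

Swapping `path.rglob("*")` for such a generator inside `analyze_directory_stats` would bring the statistics closer to what the tree view shows.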

In [11]:
# Cleanup: Remove the temporary test directory
print("=== Cleanup ===")
print(f"Cleaning up temporary test directory: {test_base_dir}")

try:
    shutil.rmtree(test_base_dir)
    print("✅ Cleanup completed successfully!")
except Exception as e:
    print(f"⚠️ Warning: Could not clean up test directory: {e}")

print("\n" + "="*60)
print("🎉 **All tests completed successfully!**")
print("="*60)
print("\n**Summary of what we tested:**")
print("✅ Basic tree structure printing")
print("✅ Getting structure as a string for programmatic use")
print("✅ Analyzing folders with output to files")
print("✅ Statistical analysis of directory structures")
print("✅ Analysis of both test and real project directories")
print("\n**The folder analysis tool is working correctly!**")

=== Cleanup ===
Cleaning up temporary test directory: C:\Users\leopo\AppData\Local\Temp\folder_analysis_test_ixalu9ve
✅ Cleanup completed successfully!

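============================================================
🎉 **All tests completed successfully!**
============================================================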

**Summary of what we tested:**
✅ Basic tree structure printing
✅ Getting structure as a string for programmatic use
✅ Analyzing folders with output to files
✅ Statistical analysis of directory structures
✅ Analysis of both test and real project directories

**The folder analysis tool is working correctly!**
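
If a cell fails before this point, the directory created by `tempfile.mkdtemp` is left behind. As an alternative sketch, `tempfile.TemporaryDirectory` used as a context manager removes the directory even when the body raises; the file names below are placeholders.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory(prefix="folder_analysis_test_") as tmp:
    scratch_dir = Path(tmp)
    (scratch_dir / "reports").mkdir()
    (scratch_dir / "reports" / "report1.pdf").touch()
    created = sorted(p.name for p in scratch_dir.rglob("*"))
    print(f"Created inside {scratch_dir.name}: {created}")

# The directory is removed on leaving the with-block, even after an exception
print(f"Still exists after the block: {scratch_dir.exists()}")
```

The trade-off is that everything using the directory has to live inside one cell, which is why a notebook that spreads its tests over several cells may still prefer `mkdtemp` with an explicit cleanup step.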


In [12]:
# Run analyze_folder on the actual project directory and save the result under output/
target_dir = project_root
output_dir = os.path.join(project_root, "output")

analyze_folder(target_dir, output_dir)

Analyzing folder structure of: c:\Users\leopo\Documents\GitHub\folder-analysis-tool
Results will be saved to: c:\Users\leopo\Documents\GitHub\folder-analysis-tool\output\folder_structure.txt
folder-analysis-tool/
├── notebooks/
│   ├── config.json
│   └── test_folder_analyzer.ipynb
├── output/
│   └── folder_structure.txt
├── src/
│   ├── __init__.py
│   ├── config_manager.py
│   └── folder_analyzer.py
├── config.json
├── main.py
└── README.md
Analysis complete! Results saved to: c:\Users\leopo\Documents\GitHub\folder-analysis-tool\output\folder_structure.txt


'c:\\Users\\leopo\\Documents\\GitHub\\folder-analysis-tool\\output\\folder_structure.txt'