# SciTeX IO Module - File Operations Made Simple

The `scitex.io` module provides unified file I/O operations with automatic format detection, path management, and reproducibility features.

## Key Features
- **Unified Interface**: Single `load()` and `save()` functions for multiple formats
- **Auto Format Detection**: Automatically detects file format from extension
- **Path Management**: Relative paths and symlink creation for reproducibility
- **Output Organization**: Automatic directory organization by file type

In [None]:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Timestamp: "2025-07-02 07:47:00 (ywatanabe)"
# File: ./examples/01_scitex_io.ipynb
# ----------------------------------------
import os
__FILE__ = "./examples/01_scitex_io.ipynb"
__DIR__ = os.path.dirname(__FILE__)
# ----------------------------------------

import scitex as stx
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

print("SciTeX IO Module Demo")
print(f"SciTeX version: {stx.__version__}")

## 1. Basic Loading and Saving

The core functions `stx.io.load()` and `stx.io.save()` work with multiple file formats automatically.

In [None]:
# Create sample data
data_dict = {
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'score': [85.5, 92.3, 78.9]
}
df = pd.DataFrame(data_dict)
array_data = np.random.randn(10, 3)

print("Sample DataFrame:")
print(df)
print(f"\nSample array shape: {array_data.shape}")

In [None]:
# Save data in different formats - stx.io.save auto-detects format
stx.io.save(df, './data/sample_data.csv')
stx.io.save(df, './data/sample_data.json')
stx.io.save(array_data, './data/array_data.npy')
stx.io.save(data_dict, './data/dict_data.pkl')

print("✅ Saved data in multiple formats")
print("  - CSV: ./data/sample_data.csv")
print("  - JSON: ./data/sample_data.json")
print("  - NumPy: ./data/array_data.npy")
print("  - Pickle: ./data/dict_data.pkl")

In [None]:
# Load data back - stx.io.load auto-detects format
loaded_df_csv = stx.io.load('./data/sample_data.csv')
loaded_df_json = stx.io.load('./data/sample_data.json')
loaded_array = stx.io.load('./data/array_data.npy')
loaded_dict = stx.io.load('./data/dict_data.pkl')

print("✅ Loaded data from multiple formats")
print(f"CSV DataFrame shape: {loaded_df_csv.shape}")
print(f"JSON DataFrame shape: {loaded_df_json.shape}")
print(f"Array shape: {loaded_array.shape}")
print(f"Dict keys: {list(loaded_dict.keys())}")
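
This notebook does not show how `stx.io.save()` and `stx.io.load()` pick a format internally. As a rough illustration only, dispatching on the file extension could look like the sketch below. It uses only the standard library and covers just JSON and pickle; `save_by_extension`, `load_by_extension`, `SAVERS`, and `LOADERS` are hypothetical names, not part of the SciTeX API.

```python
import json
import pickle
import tempfile
from pathlib import Path


def _save_json(obj, path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f)


def _load_json(path):
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def _save_pickle(obj, path):
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def _load_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical registries: lower-cased file extension -> handler function.
SAVERS = {".json": _save_json, ".pkl": _save_pickle}
LOADERS = {".json": _load_json, ".pkl": _load_pickle}


def save_by_extension(obj, path):
    """Choose a saver from the extension and create missing parent directories."""
    path = Path(path)
    saver = SAVERS.get(path.suffix.lower())
    if saver is None:
        raise ValueError(f"Unsupported extension: {path.suffix!r}")
    path.parent.mkdir(parents=True, exist_ok=True)
    saver(obj, path)
    return path


def load_by_extension(path):
    """Choose a loader from the extension."""
    path = Path(path)
    loader = LOADERS.get(path.suffix.lower())
    if loader is None:
        raise ValueError(f"Unsupported extension: {path.suffix!r}")
    return loader(path)


if __name__ == "__main__":
    record = {"name": ["Alice", "Bob"], "age": [25, 30]}
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("record.json", "record.pkl"):
            saved = save_by_extension(record, Path(tmp) / "data" / name)
            # A round trip should give back an equal object for both formats.
            print(saved.suffix, load_by_extension(saved) == record)
```

Keeping the handlers in a dictionary means a new format only needs one more entry in each registry, and an unknown extension fails loudly instead of silently writing the wrong format.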

## 2. Symlink Creation for Reproducibility

When `symlink_from_cwd=True` is passed, `stx.io.save()` also creates a symlink in the current working directory that points to the saved file, which supports reproducibility.

In [None]:
# Save with symlink creation, requested via symlink_from_cwd=True
results = np.random.randn(100)
stx.io.save(results, './output/experiment_results.npy', symlink_from_cwd=True)

# This creates both:
# 1. The actual file: ./output/experiment_results.npy
# 2. A symlink from CWD: experiment_results.npy -> ./output/experiment_results.npy

print("✅ Saved with symlink creation")
print("  File: ./output/experiment_results.npy")
print("  Symlink: experiment_results.npy -> ./output/experiment_results.npy")
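
For readers who want to see what such a link amounts to on disk, here is a minimal standard-library sketch. It is not SciTeX's implementation; `link_from_dir` is a hypothetical helper, and using a relative link target is an assumption made for this example.

```python
import os
import tempfile
from pathlib import Path


def link_from_dir(target, link_dir):
    """Create (or refresh) a symlink in link_dir that points at target."""
    target = Path(target).resolve()
    link_dir = Path(link_dir).resolve()
    link = link_dir / target.name
    # Remove a stale symlink so the call can be repeated safely.
    # A regular file with the same name is left alone and raises FileExistsError below.
    if link.is_symlink():
        link.unlink()
    # Store a relative target so the link keeps working if the whole tree is moved.
    link.symlink_to(os.path.relpath(target, start=link_dir))
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "output").mkdir()
        target = root / "output" / "experiment_results.txt"
        target.write_text("42\n")

        link = link_from_dir(target, root)
        print(link.name, "->", os.readlink(link))
        print("Same content through the link:", link.read_text() == target.read_text())
```

On Windows, creating symlinks may require elevated privileges or developer mode, so this sketch is best tried on Linux or macOS.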

## 3. Figure Saving with SciTeX

`stx.io.save()` also accepts Matplotlib figures. As the cell below reports, it writes the image file and exports the plotted data as a CSV file next to it.

In [None]:
# Create a sample plot
fig, ax = plt.subplots(figsize=(8, 6))
x = np.linspace(0, 10, 100)
y = np.sin(x) * np.exp(-x/10)

ax.plot(x, y, 'b-', linewidth=2, label='Damped sine')
ax.set_xlabel('Time (s)')
ax.set_ylabel('Amplitude')
ax.set_title('Damped Sine Wave')
ax.legend()
ax.grid(True, alpha=0.3)

plt.tight_layout()

# Save figure using stx.io.save, then display it
stx.io.save(fig, './figures/damped_sine.png', symlink_from_cwd=True)
plt.show()

print("✅ Figure saved with data export")
print("  Figure: ./figures/damped_sine.png")
print("  Data: ./figures/damped_sine.csv (plot data)")
print("  Symlink: damped_sine.png -> ./figures/damped_sine.png")

## 4. Advanced Features

### Multiple Format Support

In [None]:
# Demonstrate various supported formats
formats_demo = {
    'csv': './data/demo.csv',
    'json': './data/demo.json',
    'npy': './data/demo.npy',
    'pkl': './data/demo.pkl',
    'txt': './data/demo.txt',
}

sample_data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

for format_name, file_path in formats_demo.items():
    if format_name == 'csv':
        # Convert to DataFrame so the CSV has named columns
        df_demo = pd.DataFrame(sample_data, columns=['A', 'B', 'C'])
        stx.io.save(df_demo, file_path)
    else:
        # JSON, NPY, PKL, and TXT receive the NumPy array directly
        stx.io.save(sample_data, file_path)

    print(f"✅ Saved {format_name.upper()}: {file_path}")

print("\n📁 All formats saved successfully!")

### Path Management and Organization

In [None]:
# SciTeX automatically organizes outputs by type
demo_data = np.random.randn(50, 3)
demo_df = pd.DataFrame(demo_data, columns=['Feature1', 'Feature2', 'Feature3'])

# Data files go to ./data/
stx.io.save(demo_df, './data/processed_features.csv')
stx.io.save(demo_data, './data/raw_features.npy')

# Cache files
cache_data = {'processed': True, 'timestamp': '2025-07-02'}
stx.io.save(cache_data, './cache/processing_status.pkl')

# Configuration
config = {'batch_size': 32, 'learning_rate': 0.001}
stx.io.save(config, './config/model_params.json')

print("✅ Organized outputs by type:")
print("  📊 Data: ./data/")
print("  💾 Cache: ./cache/")
print("  ⚙️  Config: ./config/")
print("  📈 Figures: ./figures/")
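
The cell above spells out the target directory in every path. If you prefer to derive the directory from the file type, one possible sketch is shown below. `organized_path` and `DIR_BY_EXTENSION` are hypothetical names for illustration, not SciTeX functions. The one-directory-per-extension mapping is a simplification, since this notebook also writes `.pkl` and `.json` files to other directories.

```python
from pathlib import Path

# Hypothetical layout loosely mirroring the directories used in this notebook.
DIR_BY_EXTENSION = {
    ".csv": "data",
    ".npy": "data",
    ".pkl": "cache",
    ".json": "config",
    ".png": "figures",
}


def organized_path(filename, root=".", default_dir="misc"):
    """Return root/<type directory>/<file name>, chosen from the file extension."""
    name = Path(filename).name  # drop any directory part the caller passed in
    subdir = DIR_BY_EXTENSION.get(Path(name).suffix.lower(), default_dir)
    return Path(root) / subdir / name


if __name__ == "__main__":
    for fname in ("processed_features.csv", "processing_status.pkl",
                  "model_params.json", "damped_sine.png", "notes.txt"):
        print(f"{fname:24s} -> {organized_path(fname).as_posix()}")
```

The returned path can then be passed as the second argument to `stx.io.save()`, exactly like the hand-written paths above.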

## 5. Integration with Other SciTeX Modules

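The IO module is designed to work alongside other SciTeX modules. The example below keeps to NumPy and Matplotlib and shows one analysis saving its raw data, summary statistics, and figure through `stx.io.save()`.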

In [None]:
# Example: Statistical analysis with automatic saving
data = np.random.randn(1000)

# Compute statistics
stats_result = {
    'mean': np.mean(data),
    'std': np.std(data),
    'median': np.median(data),
    'min': np.min(data),
    'max': np.max(data)
}

# Save raw data and statistics
stx.io.save(data, './data/random_sample.npy')
stx.io.save(stats_result, './results/statistics.json')

# Create and save visualization
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# Histogram
ax1.hist(data, bins=50, alpha=0.7, color='skyblue', edgecolor='black')
ax1.set_title('Data Distribution')
ax1.set_xlabel('Value')
ax1.set_ylabel('Frequency')
ax1.axvline(stats_result['mean'], color='red', linestyle='--', label=f"Mean: {stats_result['mean']:.3f}")
ax1.legend()

# Box plot
ax2.boxplot(data)
ax2.set_title('Data Box Plot')
ax2.set_ylabel('Value')

plt.tight_layout()
stx.io.save(fig, './figures/data_analysis.png')

print("✅ Complete analysis saved:")
print("  📊 Raw data: ./data/random_sample.npy")
print("  📈 Statistics: ./results/statistics.json")
print("  🎨 Visualization: ./figures/data_analysis.png")

plt.show()

## Summary

The SciTeX IO module provides:

✅ **Unified Interface**: Single functions for multiple formats  
✅ **Auto Detection**: Automatic format detection from file extensions  
✅ **Reproducibility**: Symlink creation and organized outputs  
✅ **Integration**: Seamless work with other SciTeX modules  
✅ **Organization**: Automatic directory structure by file type  

This makes file operations simple, consistent, and reproducible across your scientific computing workflows.