# SciTeX IO Translator - Basic Examples

This notebook demonstrates the basic functionality of the SciTeX IO Translator MCP server.

## Overview

The IO Translator helps you:
- Convert standard Python code to SciTeX format
- Convert SciTeX code back to standard Python
- Validate code compliance
- Extract IO patterns from existing code

## Setup

Note: In actual usage, the server communicates with its client over the Model Context Protocol (MCP).
The cells below do not call the server; they only show the expected inputs and outputs.

In [None]:
# In real usage with Claude Desktop, you would ask:
# "Can you convert this code to SciTeX format?"
# Here we'll demonstrate the transformations

## Example 1: Basic Pandas Translation

In [None]:
# Original Python code
original_code = '''
import pandas as pd
import numpy as np

# Load data
df = pd.read_csv('experiment_data.csv')
print(f"Loaded {len(df)} rows")

# Process data
df_summary = df.groupby('condition').agg({
    'value': ['mean', 'std', 'count']
})

# Save results
df_summary.to_csv('summary_stats.csv')
'''

print("ORIGINAL CODE:")
print(original_code)

In [None]:
# Expected SciTeX translation
scitex_code = '''
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Timestamp: "2025-07-02 07:00:00 (ywatanabe)"
# File: ./analysis.py
# ----------------------------------------
import os
__FILE__ = "./analysis.py"
__DIR__ = os.path.dirname(__FILE__)
# ----------------------------------------

import scitex as stx

def main(CONFIG):
    """Main function."""
    # Load data
    df = stx.io.load('./experiment_data.csv')
    print(f"Loaded {len(df)} rows")
    
    # Process data
    df_summary = df.groupby('condition').agg({
        'value': ['mean', 'std', 'count']
    })
    
    # Save results
    stx.io.save(df_summary, './data/summary_stats.csv', symlink_from_cwd=True)
    return 0

def run_main():
    """Run main function with proper setup."""
    import sys
    import matplotlib.pyplot as plt
    CONFIG, sys.stdout, sys.stderr, plt, CC = stx.gen.start(
        sys, plt, verbose=True
    )
    main(CONFIG)
    stx.gen.close(CONFIG, verbose=True)

if __name__ == "__main__":
    run_main()
'''

print("SCITEX TRANSLATION:")
print(scitex_code)

### Key Changes:
1. Added SciTeX header with timestamp and file info
2. `pd.read_csv()` → `stx.io.load()`
3. `.to_csv()` → `stx.io.save()`
4. Added relative path prefix `./`
5. Organized output to `./data/` directory
6. Added `symlink_from_cwd=True` for easy access
7. Wrapped code in proper main function structure
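
Changes 2 to 6 can be pictured as simple text rewrite rules. The following is a minimal sketch under that assumption, not the translator's actual implementation (which may well operate on the syntax tree instead of on text). The names `LOAD_RULE`, `SAVE_RULE`, `add_prefix`, and `translate_io` are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical rewrite rules, for illustration only.
LOAD_RULE = re.compile(
    r"""\bpd\.read_csv\(\s*(?P<q>['"])(?P<path>[^'"]+)(?P=q)\s*\)"""
)
SAVE_RULE = re.compile(
    r"""^(?P<indent>[ \t]*)(?P<obj>\w+)\.to_csv\(\s*(?P<q>['"])(?P<path>[^'"]+)(?P=q)\s*\)""",
    re.MULTILINE,
)


def add_prefix(path: str, out_dir: str = "") -> str:
    """Return a './'-prefixed relative path, optionally placed under out_dir."""
    name = path[2:] if path.startswith("./") else path
    return f"./{out_dir}{name}"


def translate_io(source: str) -> str:
    """Rewrite simple pd.read_csv(...) and obj.to_csv(...) calls."""
    # pd.read_csv('x.csv') -> stx.io.load('./x.csv')
    source = LOAD_RULE.sub(
        lambda m: f"stx.io.load('{add_prefix(m.group('path'))}')", source
    )
    # obj.to_csv('x.csv') -> stx.io.save(obj, './data/x.csv', symlink_from_cwd=True)
    source = SAVE_RULE.sub(
        lambda m: (
            f"{m.group('indent')}stx.io.save({m.group('obj')}, "
            f"'{add_prefix(m.group('path'), 'data/')}', symlink_from_cwd=True)"
        ),
        source,
    )
    return source


snippet = (
    "df = pd.read_csv('experiment_data.csv')\n"
    "df_summary.to_csv('summary_stats.csv')\n"
)
translated = translate_io(snippet)
print(translated)
```

The sketch only handles calls with a single literal path argument; calls with keyword arguments or computed paths are left untouched.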

## Example 2: Matplotlib Visualization Translation

In [None]:
# Original matplotlib code
matplotlib_code = '''
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Generate sample data
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)

# Create figure
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 6))

# First subplot
ax1.plot(x, y1, 'b-', label='sin(x)')
ax1.set_xlabel('X values')
ax1.set_ylabel('Y values')
ax1.set_title('Sine Wave')
ax1.legend()
ax1.grid(True)

# Second subplot
ax2.plot(x, y2, 'r-', label='cos(x)')
ax2.set_xlabel('X values')
ax2.set_ylabel('Y values')
ax2.set_title('Cosine Wave')
ax2.legend()
ax2.grid(True)

plt.tight_layout()
plt.savefig('trigonometric_functions.png', dpi=150)
plt.show()
'''

print("ORIGINAL MATPLOTLIB CODE:")
print(matplotlib_code)

In [None]:
# SciTeX translation with enhanced features
scitex_matplotlib = '''
import scitex as stx
import numpy as np
import matplotlib.pyplot as plt  # needed for plt.tight_layout() and plt.show() below

def main():
    """Main function."""
    # Generate sample data
    x = np.linspace(0, 10, 100)
    y1 = np.sin(x)
    y2 = np.cos(x)
    
    # Create figure
    fig, (ax1, ax2) = stx.plt.subplots(2, 1, figsize=(8, 6))
    
    # First subplot
    ax1.plot(x, y1, 'b-', label='sin(x)')
    ax1.set_xyt('X values', 'Y values', 'Sine Wave')  # Combined setter
    ax1.legend()
    ax1.grid(True)
    
    # Second subplot  
    ax2.plot(x, y2, 'r-', label='cos(x)')
    ax2.set_xyt('X values', 'Y values', 'Cosine Wave')  # Combined setter
    ax2.legend()
    ax2.grid(True)
    
    plt.tight_layout()
    
    # Save with SciTeX features
    stx.io.save(fig, './figures/trigonometric_functions.png', 
                dpi=150, symlink_from_cwd=True)
    
    # Also save plot data for reproducibility
    plot_data = stx.pd.DataFrame({
        'x': x,
        'sin_x': y1,
        'cos_x': y2
    })
    stx.io.save(plot_data, './figures/trigonometric_functions_data.csv',
                symlink_from_cwd=True)
    
    plt.show()
    return 0
'''

print("SCITEX MATPLOTLIB TRANSLATION:")
print(scitex_matplotlib)

### Key Matplotlib Enhancements:
1. `plt.subplots()` → `stx.plt.subplots()`
2. `ax.set_xlabel()`, `ax.set_ylabel()`, `ax.set_title()` → a single `ax.set_xyt()` call
3. `plt.savefig()` → `stx.io.save()` with path organization
4. Plot data saved alongside the figure for reproducibility
5. Output organized in `./figures/` directory
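
The reverse direction (SciTeX back to standard Python) is not shown in this notebook. As a rough illustration of what it could involve for change 2, the sketch below expands a `set_xyt(x, y, t)` line into the three standard setters using the standard-library `ast` module. The function `expand_set_xyt` is hypothetical, and the argument order (x label, y label, title) is assumed from the example above. It requires Python 3.9+ for `ast.unparse`.

```python
import ast

SETTERS = ("set_xlabel", "set_ylabel", "set_title")


def expand_set_xyt(line: str) -> list[str]:
    """Expand one `ax.set_xyt(x, y, t)` line into three setter calls.

    Any line that is not a plain three-argument set_xyt call is returned
    unchanged. Trailing comments on an expanded line are dropped.
    """
    stripped = line.strip()
    indent = line[: len(line) - len(line.lstrip())]
    try:
        tree = ast.parse(stripped)
    except SyntaxError:
        return [line]  # e.g. one line of a multi-line statement
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.Expr):
        return [line]
    call = tree.body[0].value
    if not (
        isinstance(call, ast.Call)
        and isinstance(call.func, ast.Attribute)
        and call.func.attr == "set_xyt"
        and len(call.args) == 3
        and not call.keywords
    ):
        return [line]
    target = ast.unparse(call.func.value)  # e.g. "ax1" or "axes[0, 0]"
    return [
        f"{indent}{target}.{setter}({ast.unparse(arg)})"
        for setter, arg in zip(SETTERS, call.args)
    ]


line = "    ax1.set_xyt('X values', 'Y values', 'Sine Wave')  # Combined setter"
expanded = expand_set_xyt(line)
print("\n".join(expanded))
```

Parsing with `ast` rather than a regular expression keeps the sketch robust to axis expressions such as `axes[0, 0]` and to commas inside label strings.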

## Example 3: Path Management

In [None]:
# Code with problematic paths
path_example = r'''
import pandas as pd
import numpy as np
import os

# Absolute paths (bad practice)
data1 = pd.read_csv('/home/researcher/projects/exp1/data.csv')
data2 = pd.read_csv('C:\\Users\\Lab\\Desktop\\measurements.csv')

# Unorganized outputs
data1.to_csv('processed.csv')
np.save('results.npy', data1.values)

# Parent directory access
reference = pd.read_csv('../shared_data/reference.csv')
reference.to_csv('../../outputs/reference_processed.csv')
'''

print("ORIGINAL CODE WITH PATH ISSUES:")
print(path_example)

In [None]:
# SciTeX translation with improved paths
scitex_paths = '''
import scitex as stx

def main(CONFIG):
    """Main function."""
    # All paths converted to relative with ./
    data1 = stx.io.load('./data/exp1_data.csv')
    data2 = stx.io.load('./data/measurements.csv')
    
    # Organized outputs by type
    stx.io.save(data1, './data/processed.csv', symlink_from_cwd=True)
    stx.io.save(data1.values, './arrays/results.npy', symlink_from_cwd=True)
    
    # Reference data handled properly
    reference = stx.io.load(CONFIG.paths.REFERENCE_DATA)
    stx.io.save(reference, './data/reference_processed.csv', symlink_from_cwd=True)
    
    return 0
'''

print("SCITEX PATH TRANSLATION:")
print(scitex_paths)

# Also generates CONFIG/PATH.yaml
config_yaml = '''
# Auto-generated path configuration
paths:
  REFERENCE_DATA: './data/shared/reference.csv'
  DATA_DIR: './data/'
  FIGURES_DIR: './figures/'
  ARRAYS_DIR: './arrays/'
'''

print("\nGENERATED CONFIG/PATH.yaml:")
print(config_yaml)
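
The output-path rules in this example (and the `./cache/` rule used in Example 5) can be approximated with `pathlib`. This is a sketch under assumptions: the extension-to-directory table `OUTPUT_DIRS` is inferred from the paths shown in this notebook, and the helper names are hypothetical. It covers output paths only and does not reproduce how the translator renames inputs (for example `exp1/data.csv` becoming `./data/exp1_data.csv`).

```python
from pathlib import PurePosixPath, PureWindowsPath

# Assumed mapping, inferred from the example paths in this notebook.
OUTPUT_DIRS = {".csv": "data", ".png": "figures", ".npy": "arrays", ".pkl": "cache"}


def is_absolute(path: str) -> bool:
    """True if the path is absolute under POSIX or Windows rules."""
    return PurePosixPath(path).is_absolute() or PureWindowsPath(path).is_absolute()


def organize_output(path: str) -> str:
    """Map an output path to './<type dir>/<file name>' by extension."""
    # PureWindowsPath treats both '/' and '\' as separators, so it can
    # extract the file name from either path style.
    name = PureWindowsPath(path).name
    directory = OUTPUT_DIRS.get(PureWindowsPath(name).suffix.lower())
    return f"./{directory}/{name}" if directory else f"./{name}"


for original in [
    "processed.csv",
    "results.npy",
    "../../outputs/reference_processed.csv",
]:
    print(f"{original!r} -> {organize_output(original)!r}")
```

Checking both path flavors in `is_absolute` lets the same code flag `/home/...` and `C:\...` paths regardless of the operating system it runs on.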

## Example 4: Validation and Compliance Checking

In [None]:
# Code to validate
code_to_validate = '''
import scitex as stx
import pandas as pd  # Mixed imports

# Missing proper header
df = pd.read_csv('data.csv')  # Should use stx.io.load
result = stx.io.load('./processed.csv')  # Good

# Bad path
df.to_csv('/tmp/output.csv')  # Absolute path

# Good save
stx.io.save(result, './data/final.csv', symlink_from_cwd=True)
'''

print("CODE TO VALIDATE:")
print(code_to_validate)

In [None]:
# Expected validation results
validation_results = {
    "compliant": False,
    "issues": {
        "errors": [
            "Missing shebang: #!/usr/bin/env python3",
            "Missing encoding declaration",
            "Missing timestamp in header",
            "Missing __FILE__ definition",
            "Missing __DIR__ definition"
        ],
        "warnings": [
            "Use stx.io.load() instead of pd.read_csv()",
            "Use stx.io.save() instead of .to_csv()",
            "Avoid absolute paths; use relative paths starting with ./"
        ],
        "suggestions": [
            "Consider organizing outputs by type: ./figures/, ./data/, ./cache/",
            "Consider using run_main() pattern for proper SciTeX initialization"
        ]
    }
}

print("VALIDATION RESULTS:")
print(f"Compliant: {validation_results['compliant']}")
print(f"\nErrors ({len(validation_results['issues']['errors'])}):")
for error in validation_results['issues']['errors']:
    print(f"  ❌ {error}")
print(f"\nWarnings ({len(validation_results['issues']['warnings'])}):")
for warning in validation_results['issues']['warnings']:
    print(f"  ⚠️  {warning}")
print(f"\nSuggestions ({len(validation_results['issues']['suggestions'])}):")
for suggestion in validation_results['issues']['suggestions']:
    print(f"  💡 {suggestion}")
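
A report with the same shape as the errors and warnings above could be produced by simple pattern checks. The sketch below is one possible approach, not the server's validator: the regular expressions and the `validate` function are assumptions made for illustration, and the "suggestions" category is omitted.

```python
import re

# Header items that must be present (reported as errors when missing).
HEADER_CHECKS = [
    (re.compile(r"^#!/usr/bin/env python3", re.MULTILINE),
     "Missing shebang: #!/usr/bin/env python3"),
    (re.compile(r"^# -\*- coding: utf-8 -\*-", re.MULTILINE),
     "Missing encoding declaration"),
    (re.compile(r"^# Timestamp:", re.MULTILINE),
     "Missing timestamp in header"),
    (re.compile(r"^__FILE__\s*=", re.MULTILINE),
     "Missing __FILE__ definition"),
    (re.compile(r"^__DIR__\s*=", re.MULTILINE),
     "Missing __DIR__ definition"),
]

# Discouraged patterns (reported as warnings when found).
USAGE_CHECKS = [
    (re.compile(r"\bpd\.read_csv\("),
     "Use stx.io.load() instead of pd.read_csv()"),
    (re.compile(r"\.to_csv\("),
     "Use stx.io.save() instead of .to_csv()"),
    (re.compile(r"""\(\s*['"](/|[A-Za-z]:\\)"""),
     "Avoid absolute paths; use relative paths starting with ./"),
]


def validate(source: str) -> dict:
    """Return a report with errors (missing header items) and warnings."""
    errors = [msg for pattern, msg in HEADER_CHECKS if not pattern.search(source)]
    warnings = [msg for pattern, msg in USAGE_CHECKS if pattern.search(source)]
    return {
        "compliant": not errors and not warnings,
        "issues": {"errors": errors, "warnings": warnings},
    }


sample = (
    "import scitex as stx\n"
    "df = pd.read_csv('data.csv')\n"
    "df.to_csv('/tmp/output.csv')\n"
)
report = validate(sample)
print(report["compliant"], len(report["issues"]["errors"]), len(report["issues"]["warnings"]))
```

Text-level checks like these can be fooled by matches inside comments or strings; a stricter validator would inspect the parsed syntax tree.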

## Example 5: Complex Scientific Workflow

In [None]:
# Complex scientific analysis
scientific_workflow = r'''
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import pickle
import json

# Load experimental data
data = pd.read_csv('raw_measurements.csv')
metadata = json.load(open('experiment_metadata.json'))

# Preprocessing
data['normalized'] = (data['signal'] - data['signal'].mean()) / data['signal'].std()
data = data[data['quality'] > 0.8]

# Statistical analysis
groups = data.groupby('condition')
results = {}

for name, group in groups:
    results[name] = {
        'mean': group['normalized'].mean(),
        'std': group['normalized'].std(),
        'n': len(group)
    }

# Perform t-test
t_stat, p_value = stats.ttest_ind(
    data[data['condition'] == 'control']['normalized'],
    data[data['condition'] == 'treatment']['normalized']
)

# Visualization
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Raw data
axes[0, 0].plot(data['time'], data['signal'])
axes[0, 0].set_xlabel('Time (s)')
axes[0, 0].set_ylabel('Signal')
axes[0, 0].set_title('Raw Signal')

# Histogram
axes[0, 1].hist(data['normalized'], bins=30, alpha=0.7)
axes[0, 1].set_xlabel('Normalized Value')
axes[0, 1].set_ylabel('Count')
axes[0, 1].set_title('Distribution')

# Box plot
data_by_condition = [group['normalized'].values for name, group in groups]
axes[1, 0].boxplot(data_by_condition, labels=results.keys())
axes[1, 0].set_ylabel('Normalized Signal')
axes[1, 0].set_title(f'Comparison (p={p_value:.3f})')

# Results summary
axes[1, 1].axis('off')
summary_text = f"T-statistic: {t_stat:.3f}\np-value: {p_value:.3f}\n"
for name, res in results.items():
    summary_text += f"\n{name}:\n  mean: {res['mean']:.3f}\n  std: {res['std']:.3f}\n  n: {res['n']}"
axes[1, 1].text(0.1, 0.5, summary_text, transform=axes[1, 1].transAxes,
                fontsize=10, verticalalignment='center')

plt.tight_layout()
plt.savefig('analysis_results.png', dpi=300)

# Save results
data.to_csv('processed_data.csv', index=False)
with open('statistical_results.pkl', 'wb') as f:
    pickle.dump(results, f)
with open('results_summary.json', 'w') as f:
    json.dump({'t_stat': t_stat, 'p_value': p_value, 'groups': results}, f, indent=2)
'''

print("COMPLEX SCIENTIFIC WORKFLOW:")
print(scientific_workflow[:1000] + "\n... [truncated]")

In [None]:
# SciTeX translation preview
print("SCITEX TRANSLATION HIGHLIGHTS:")
print("\n1. File I/O:")
print("   pd.read_csv() → stx.io.load()")
print("   json.load() → stx.io.load()")
print("   pickle.dump() → stx.io.save()")
print("\n2. Visualization:")
print("   plt.subplots() → stx.plt.subplots()")
print("   ax.set_xlabel/ylabel/title() → ax.set_xyt()")
print("   plt.savefig() → stx.io.save()")
print("\n3. Path Organization:")
print("   'analysis_results.png' → './figures/analysis_results.png'")
print("   'processed_data.csv' → './data/processed_data.csv'")
print("   'statistical_results.pkl' → './cache/statistical_results.pkl'")
print("\n4. Enhanced Features:")
print("   + Automatic plot data export")
print("   + Symlink creation for outputs")
print("   + Config extraction for parameters")
print("   + Proper error handling")
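
The File I/O highlights replace several format-specific calls with one `load()` call. One way such a unified loader can work is by dispatching on the file extension. The sketch below shows that idea with standard-library readers only. It is not SciTeX's implementation; the `LOADERS` table and `load` function are hypothetical, and CSV files are returned as lists of dicts rather than DataFrames to keep the example dependency-free.

```python
import csv
import json
import pickle
import tempfile
from pathlib import Path


def _load_csv(path: Path) -> list[dict]:
    with path.open(newline="") as f:
        return list(csv.DictReader(f))


def _load_json(path: Path):
    with path.open() as f:
        return json.load(f)


def _load_pickle(path: Path):
    with path.open("rb") as f:
        return pickle.load(f)


# Hypothetical extension-to-reader table.
LOADERS = {".csv": _load_csv, ".json": _load_json, ".pkl": _load_pickle}


def load(path):
    """Load a file with the reader registered for its extension."""
    path = Path(path)
    try:
        loader = LOADERS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"Unsupported file type: {path.suffix!r}") from None
    return loader(path)


# Round-trip a small JSON file through the unified loader.
with tempfile.TemporaryDirectory() as tmp:
    meta_path = Path(tmp) / "experiment_metadata.json"
    meta_path.write_text(json.dumps({"subject": "demo"}))
    metadata = load(meta_path)
print(metadata)
```

Each reader opens its file in a `with` block, which also avoids the unclosed file handle left by `json.load(open(...))` in the original workflow.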

## Summary

The SciTeX IO Translator provides:

1. **Automatic Code Translation**
   - Standard Python → SciTeX format
   - SciTeX → Standard Python (for sharing)

2. **Path Management**
   - Converts absolute → relative paths
   - Organizes outputs by type
   - Creates symlinks for easy access

3. **Enhanced Functionality**
   - Combined matplotlib setters
   - Automatic data export
   - Config file generation

4. **Validation & Compliance**
   - Header format checking
   - Import style validation
   - Path convention enforcement
   - Best practice suggestions

## Next Steps

- Try translating your own code
- Run validation on existing scripts
- Explore other MCP servers for additional functionality