# Integrated MCP Server Workflow

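This notebook walks through a complete research workflow in which several SciTeX MCP servers work together. Each step stores its pipeline code as a string and prints it; no server is called and no data is processed here.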

## Scenario: Neuroscience Data Analysis Pipeline

We'll analyze EEG data from a cognitive experiment:
1. **IO Translator**: Convert legacy analysis code to SciTeX
2. **Config Server**: Extract and manage experimental parameters
3. **DSP Server**: Process neural signals
4. **Stats Server**: Perform statistical analysis
5. **PLT Server**: Create publication figures
6. **PD Server**: Manage and export results
7. **Orchestrator**: Coordinate the steps and package the project for reproducibility

## Step 1: Legacy Code Translation

In [None]:
# Original legacy analysis script
legacy_code = '''
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal, stats
import mne

# Load EEG data
subjects = ['S01', 'S02', 'S03', 'S04', 'S05']
results = []

for subj in subjects:
    # Load raw data
    raw = mne.io.read_raw_fif(f'/data/eeg/{subj}_raw.fif', preload=True)
    events = mne.read_events(f'/data/eeg/{subj}_events.eve')
    
    # Preprocessing
    raw.filter(1, 40)  # Bandpass filter
    raw.notch_filter(50)  # Remove line noise
    
    # Epoch data
    epochs = mne.Epochs(raw, events, event_id={'target': 1, 'standard': 2},
                       tmin=-0.2, tmax=0.8, baseline=(-0.2, 0))
    
    # Compute ERPs
    erp_target = epochs['target'].average()
    erp_standard = epochs['standard'].average()
    
    # Extract P300 amplitude
    # Crop by time so the window respects tmin and the sampling rate
    p300_window = (0.3, 0.5)
    p300_amp = erp_target.copy().crop(*p300_window).data.max()
    
    results.append({
        'subject': subj,
        'p300_amplitude': p300_amp,
        'n_trials': len(epochs)
    })

# Save results
df_results = pd.DataFrame(results)
df_results.to_csv('/results/p300_analysis.csv', index=False)

# Plot grand average
plt.figure(figsize=(10, 6))
# ... plotting code ...
plt.savefig('/figures/grand_average_erp.png')
'''

print("LEGACY NEUROSCIENCE ANALYSIS CODE")
print("=" * 50)
print(legacy_code[:800] + "\n... [truncated]")
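
Window-based amplitude extraction depends on mapping a latency window in seconds onto sample indices, and that mapping has to account for both the epoch start (`tmin`) and the sampling rate. The sketch below shows the arithmetic in plain Python. `window_to_samples` is a hypothetical helper, not part of MNE or SciTeX, and the 1000 Hz rate is assumed for illustration.

```python
def window_to_samples(window, tmin, sfreq):
    """Map a (start, stop) window in seconds to sample indices.

    Indices are relative to the first sample of an epoch that begins
    at `tmin` seconds and is sampled at `sfreq` Hz.
    """
    start, stop = window
    if start < tmin or stop < start:
        raise ValueError("window must start at or after tmin and be ordered")
    # Shift by tmin first, then scale by the sampling rate.
    return round((start - tmin) * sfreq), round((stop - tmin) * sfreq)


# Epochs start 200 ms before the stimulus, so a 0.3-0.5 s window
# begins 500 samples into the epoch rather than 300.
p300_slice = window_to_samples((0.3, 0.5), tmin=-0.2, sfreq=1000)
print(p300_slice)
```

Forgetting the `tmin` shift moves the window 200 ms earlier than intended, which is easy to miss because the code still runs.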

In [None]:
# Step 1: Use IO Translator to convert to SciTeX
print("STEP 1: TRANSLATING TO SCITEX")
print("=" * 50)

# In practice this step would be an MCP tool call along these lines
# (shown commented out; nothing is executed in this notebook):
# result = mcp.call_tool("scitex-io-translator", "translate_to_scitex", {
#     "source_code": legacy_code,
#     "target_modules": ["io", "dsp", "plt", "pd"],
#     "add_config_support": True
# })

# Expected outcome of the translation step
print("✓ Code translated to SciTeX format")
print("✓ Paths converted to relative")
print("✓ Configuration extracted to CONFIG/")
print("✓ Added reproducibility features")
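
One of the translation outcomes listed above is the conversion of absolute paths to relative ones. How the IO Translator does this is not shown in this notebook. As a rough illustration under that caveat, the hypothetical `relativize_paths` helper below rewrites quoted absolute POSIX paths with a regular expression.

```python
import re

# A quoted string whose content starts with "/" and has no spaces or quotes.
_ABSOLUTE_PATH = re.compile(r"""(?P<q>['"])(?P<path>/[^'"\s]+)(?P=q)""")


def relativize_paths(source):
    """Rewrite quoted absolute paths such as '/data/x.csv' to './data/x.csv'."""
    return _ABSOLUTE_PATH.sub(
        lambda m: f"{m['q']}.{m['path']}{m['q']}",
        source,
    )


line = "df_results.to_csv('/results/p300_analysis.csv', index=False)"
print(relativize_paths(line))
```

A real translator would more likely work on the parsed syntax tree than on raw text, since a regular expression cannot tell a file path from any other string that happens to start with a slash.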

## Step 2: Configuration Management

In [None]:
# Translated SciTeX code with config management
scitex_code = '''
import scitex as stx

def main(CONFIG):
    """EEG P300 analysis pipeline."""
    # Initialize experiment
    exp = stx.config.load_experiment(
        config_dir='./CONFIG',
        validate=True,
        schema='neuroscience'
    )
    
    # Process all subjects
    results = []
    
    for subj_id in CONFIG.subjects:
        # Load data with automatic format detection
        raw_data = stx.io.load(CONFIG.paths.raw_data_pattern.format(subj_id))
        events = stx.io.load(CONFIG.paths.events_pattern.format(subj_id))
        
        # Apply preprocessing pipeline
        processed = stx.dsp.preprocess_eeg(
            raw_data,
            sampling_rate=CONFIG.eeg.sampling_rate,
            filters=CONFIG.eeg.preprocessing.filters,
            reference=CONFIG.eeg.reference,
            bad_channels='auto',
            artifact_rejection=CONFIG.eeg.preprocessing.artifact_rejection
        )
        
        # Extract epochs
        epochs = stx.dsp.create_epochs(
            processed,
            events,
            event_mapping=CONFIG.experiment.event_codes,
            time_window=CONFIG.analysis.epoch_window,
            baseline=CONFIG.analysis.baseline,
            reject_criteria=CONFIG.eeg.rejection_thresholds
        )
        
        # Compute ERPs and extract features
        erp_results = stx.dsp.compute_erp_features(
            epochs,
            components=CONFIG.analysis.erp_components,
            methods=CONFIG.analysis.extraction_methods
        )
        
        results.append({
            'subject_id': subj_id,
            **erp_results,
            'n_trials': len(epochs),
            'quality_metrics': stx.dsp.assess_data_quality(epochs)
        })
    
    # Create results dataframe
    df_results = stx.pd.force_df(results)
    
    return df_results
'''

print("SCITEX CODE WITH CONFIGURATION MANAGEMENT")
print("=" * 50)
print(scitex_code)
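
`load_experiment` is called with `validate=True`, but the validation rules themselves are not shown. As a sketch of the kind of check a schema might enforce, the hypothetical `validate_components` function below verifies that every ERP component window falls inside the epoch window and declares a known polarity. The dictionary layout and the `LPP` entry are assumptions for illustration.

```python
def validate_components(analysis):
    """Return a list of human-readable problems found in an analysis config."""
    problems = []
    epoch_start, epoch_stop = analysis["epoch_window"]
    for name, spec in analysis["erp_components"].items():
        start, stop = spec["window"]
        if not (epoch_start <= start < stop <= epoch_stop):
            problems.append(f"{name}: window {spec['window']} outside epoch window")
        if spec.get("polarity") not in {"positive", "negative"}:
            problems.append(f"{name}: unknown polarity {spec.get('polarity')!r}")
    return problems


analysis = {
    "epoch_window": [-0.2, 1.0],
    "erp_components": {
        "P300": {"window": [0.25, 0.50], "polarity": "positive"},
        # Made-up entry whose window overruns the epoch.
        "LPP": {"window": [0.60, 1.20], "polarity": "positive"},
    },
}
for problem in validate_components(analysis):
    print(problem)
```

Returning a list of problems instead of raising on the first one lets a user fix every configuration error in a single pass.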

In [None]:
# Generated configuration files
config_files = {
    'CONFIG/experiment.yaml': '''
experiment:
  name: "P300 Oddball Paradigm"
  type: "event-related-potential"
  date: "2024-01-15"
  
  event_codes:
    target: 1
    standard: 2
    response: 3
    
  paradigm:
    n_blocks: 4
    trials_per_block: 100
    target_probability: 0.2
    isi: [1000, 1500]  # Inter-stimulus interval range (ms)
''',
    
    'CONFIG/eeg.yaml': '''
eeg:
  sampling_rate: 1000
  n_channels: 64
  reference: "average"
  
  preprocessing:
    filters:
      highpass: 0.1
      lowpass: 40
      notch: [50, 100]  # Line noise and harmonics
    
    artifact_rejection:
      method: "ica"
      n_components: 20
      eog_channels: ["EOG1", "EOG2"]
    
  rejection_thresholds:
    peak_to_peak: 150  # microvolts
    flat: 5            # microvolts
    
  channels_of_interest:
    - "Fz"
    - "Cz"
    - "Pz"
''',
    
    'CONFIG/analysis.yaml': '''
analysis:
  epoch_window: [-0.2, 1.0]  # seconds
  baseline: [-0.2, 0.0]
  
  erp_components:
    N100:
      window: [0.08, 0.12]
      channels: ["Fz", "Cz"]
      polarity: "negative"
      
    P300:
      window: [0.25, 0.50]
      channels: ["Cz", "Pz"]
      polarity: "positive"
      
    N400:
      window: [0.35, 0.45]
      channels: ["Cz", "Pz"]
      polarity: "negative"
  
  extraction_methods:
    - "peak_amplitude"
    - "mean_amplitude"
    - "peak_latency"
    - "area_under_curve"
  
  statistical:
    alpha: 0.05
    correction: "fdr_bh"
    bootstrap_iterations: 1000
'''
}

print("GENERATED CONFIGURATION FILES")
print("=" * 50)
for filename, content in config_files.items():
    print(f"\n{filename}:")
    print(content)
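
The pipeline code reads these settings with attribute access such as `CONFIG.eeg.sampling_rate`. How SciTeX builds that object is not shown here. One simple way to get the same access pattern from parsed YAML (a nested dictionary) is to convert it recursively to `types.SimpleNamespace`. The `to_namespace` helper is hypothetical, and the dictionary stands in for a parsed `CONFIG/eeg.yaml`.

```python
from types import SimpleNamespace


def to_namespace(obj):
    """Recursively convert dicts to SimpleNamespace for attribute access."""
    if isinstance(obj, dict):
        return SimpleNamespace(**{key: to_namespace(value) for key, value in obj.items()})
    if isinstance(obj, list):
        return [to_namespace(item) for item in obj]
    return obj


# Stand-in for the result of parsing CONFIG/eeg.yaml.
parsed = {
    "eeg": {
        "sampling_rate": 1000,
        "reference": "average",
        "preprocessing": {"filters": {"highpass": 0.1, "lowpass": 40, "notch": [50, 100]}},
    }
}
CONFIG = to_namespace(parsed)
print(CONFIG.eeg.sampling_rate)
print(CONFIG.eeg.preprocessing.filters.notch)
```

A misspelled key then fails with an `AttributeError` at the point of use, which is usually easier to trace than a silently missing dictionary entry.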

## Step 3: Signal Processing with DSP Server

In [None]:
# DSP server processing pipeline
dsp_pipeline = '''
# Step 3: Advanced signal processing
import numpy as np  # needed for the frequency grid below
def process_eeg_signals(df_epochs, CONFIG):
    """Apply advanced DSP techniques to EEG data."""
    
    # Time-frequency analysis
    tf_results = stx.dsp.time_frequency_analysis(
        df_epochs,
        method='morlet',
        frequencies=np.logspace(0, 1.7, 30),  # 1-50 Hz
        n_cycles=7,
        output='complex',  # For phase analysis
        n_jobs=-1
    )
    
    # Phase-amplitude coupling
    pac_results = stx.dsp.compute_pac(
        df_epochs,
        phase_freqs=CONFIG.analysis.pac.phase_frequencies,
        amp_freqs=CONFIG.analysis.pac.amplitude_frequencies,
        method='tort',
        n_surrogates=200
    )
    
    # Connectivity analysis
    connectivity = stx.dsp.compute_connectivity(
        df_epochs,
        method=['coherence', 'plv', 'wpli'],
        frequencies=CONFIG.analysis.frequency_bands,
        n_cycles=5,
        time_resolved=True
    )
    
    # Source localization (only when an MRI template is configured)
    sources = None
    if CONFIG.paths.get('mri_template'):
        sources = stx.dsp.estimate_sources(
            df_epochs,
            forward_model=CONFIG.paths.forward_model,
            method='mne',
            regularization='auto'
        )
    
    # Feature extraction for machine learning
    ml_features = stx.dsp.extract_eeg_features(
        df_epochs,
        feature_sets=[
            'spectral_power',
            'hjorth_parameters',
            'fractal_dimension',
            'sample_entropy',
            'wavelet_energy'
        ],
        standardize=True
    )
    
    return {
        'time_frequency': tf_results,
        'pac': pac_results,
        'connectivity': connectivity,
        'sources': sources,  # None when no MRI template is configured
        'ml_features': ml_features
    }
'''

print("DSP SERVER PROCESSING PIPELINE")
print("=" * 50)
print(dsp_pipeline)
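
Of the feature sets listed in `extract_eeg_features`, the Hjorth parameters are simple enough to show in full. This is a plain-Python sketch of the standard definitions (activity, mobility, complexity), not SciTeX's implementation. The `hjorth_parameters` function and the 10 Hz test signal are assumptions for illustration.

```python
import math
from statistics import pvariance


def hjorth_parameters(signal, sfreq):
    """Return (activity, mobility, complexity) for a 1-D signal.

    Derivatives are approximated by first differences scaled by the
    sampling rate, so mobility is in radians per second.
    """
    d1 = [(b - a) * sfreq for a, b in zip(signal, signal[1:])]
    d2 = [(b - a) * sfreq for a, b in zip(d1, d1[1:])]
    activity = pvariance(signal)
    mobility = math.sqrt(pvariance(d1) / activity)
    complexity = math.sqrt(pvariance(d2) / pvariance(d1)) / mobility
    return activity, mobility, complexity


# A pure 10 Hz sine: mobility should be close to 2*pi*10 and complexity close to 1.
sfreq = 1000
sine = [math.sin(2 * math.pi * 10 * n / sfreq) for n in range(sfreq)]
activity, mobility, complexity = hjorth_parameters(sine, sfreq)
print(round(activity, 3), round(mobility, 1), round(complexity, 2))
```

The sine wave is a useful sanity check because its mobility equals its angular frequency and its complexity is 1 by definition, so deviations point at an implementation error.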

## Step 4: Statistical Analysis with Stats Server

In [None]:
# Statistical analysis pipeline
stats_pipeline = '''
# Step 4: Comprehensive statistical analysis
def analyze_erp_statistics(df_results, dsp_results, CONFIG):
    """Perform statistical analysis of ERP data."""
    
    # Between-group comparisons (only when a grouping column exists).
    # The default keeps the later 'effect_sizes' lookups valid without a group factor.
    group_stats = {'effect_sizes': None}
    if 'group' in df_results.columns:
        group_stats = stx.stats.compare_groups(
            df_results,
            dependent_vars=['p300_amplitude', 'p300_latency'],
            grouping_var='group',
            covariates=['age', 'gender'],
            tests=['ancova', 'permutation'],
            n_permutations=5000,
            effect_sizes=['cohens_d', 'eta_squared'],
            confidence_level=0.95
        )
    
    # Within-subject comparisons
    condition_stats = stx.stats.repeated_measures_analysis(
        data=df_results,
        within_factors=['condition', 'time'],
        dependent_var='amplitude',
        subject_var='subject_id',
        sphericity_correction='greenhouse-geisser',
        post_hoc='bonferroni',
        plot_interactions=True
    )
    
    # Time-frequency statistics
    tf_stats = stx.stats.cluster_permutation_test(
        dsp_results['time_frequency'],
        contrast='target-standard',
        n_permutations=1000,
        threshold='tfce',  # Threshold-free cluster enhancement
        tail='two-sided',
        adjacency='temporal-spectral',
        n_jobs=-1
    )
    
    # Connectivity statistics
    conn_stats = stx.stats.network_based_statistic(
        dsp_results['connectivity'],
        contrast='condition',
        threshold=3.0,
        n_permutations=5000,
        method='nbs'
    )
    
    # Machine learning classification
    ml_results = stx.stats.classification_analysis(
        features=dsp_results['ml_features'],
        labels=df_results['condition'],
        models=['svm', 'random_forest', 'lda'],
        cv_strategy='stratified_kfold',
        n_splits=5,
        scoring=['accuracy', 'f1', 'roc_auc'],
        feature_selection='mutual_info',
        n_features=20,
        permutation_test=True
    )
    
    # Effect size and power analysis
    power_analysis = stx.stats.compute_achieved_power(
        data=df_results,
        effect_sizes=group_stats['effect_sizes'],
        alpha=CONFIG.analysis.statistical.alpha,
        design='mixed',
        include_sensitivity=True
    )
    
    return {
        'group_comparisons': group_stats,
        'repeated_measures': condition_stats,
        'time_frequency_stats': tf_stats,
        'connectivity_stats': conn_stats,
        'classification': ml_results,
        'power_analysis': power_analysis
    }
'''

print("STATISTICAL ANALYSIS PIPELINE")
print("=" * 50)
print(stats_pipeline)
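
Several calls above rely on permutation testing (`n_permutations=...`). The underlying idea can be shown with a paired sign-flip test: under the null hypothesis the sign of each subject's condition difference is arbitrary, so the observed mean is compared against the means obtained by flipping signs. The sketch below enumerates every sign pattern exactly, which is only feasible for small samples. `sign_flip_p_value` and the difference values are hypothetical and not taken from `stx.stats`.

```python
from itertools import product
from statistics import mean


def sign_flip_p_value(differences):
    """Exact two-sided p-value for a paired sign-flip permutation test."""
    observed = abs(mean(differences))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(differences)):
        n_total += 1
        flipped = [s * d for s, d in zip(signs, differences)]
        # Small tolerance so ties are counted despite rounding error.
        if abs(mean(flipped)) >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_total


# Made-up target-minus-standard amplitude differences for four subjects.
p_value = sign_flip_p_value([1.0, 2.0, 3.0, 4.0])
print(p_value)
```

With larger samples the full enumeration (2 to the power of n patterns) becomes impractical, which is why permutation tests usually draw a fixed number of random sign patterns instead.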

## Step 5: Visualization with PLT Server

In [None]:
# Publication-ready visualization
visualization_pipeline = '''
# Step 5: Create publication figures
def create_publication_figures(df_results, dsp_results, stats_results, CONFIG):
    """Generate publication-ready figures."""
    
    # Set publication style
    stx.plt.set_publication_style(
        journal='neuroimage',
        column_width='double',
        color_palette='colorblind_safe'
    )
    
    # Figure 1: Grand average ERPs
    fig1 = stx.plt.create_erp_figure(
        layout='multi_channel',
        figsize='auto'
    )
    
    fig1.plot_grand_average_erp(
        df_results,
        conditions=['target', 'standard'],
        channels=CONFIG.eeg.channels_of_interest,
        time_window=[-0.2, 0.8],
        baseline=CONFIG.analysis.baseline,
        show_sem=True,
        show_topo_maps=[0.1, 0.3, 0.4],  # Time points for topographies
        mark_components=CONFIG.analysis.erp_components,
        add_difference_wave=True
    )
    
    # Add statistical markers
    fig1.add_statistical_markers(
        stats_results['repeated_measures'],
        method='cluster',
        alpha=CONFIG.analysis.statistical.alpha
    )
    
    # Figure 2: Time-frequency results
    fig2 = stx.plt.create_tf_figure(
        n_conditions=2,
        n_channels=3,
        figsize=(12, 8)
    )
    
    fig2.plot_time_frequency_maps(
        dsp_results['time_frequency'],
        baseline=CONFIG.analysis.baseline,
        vmin=-3, vmax=3,  # Z-scored power
        cmap='RdBu_r',
        contour=stats_results['time_frequency_stats']['clusters'],
        colorbar_label='Power (z-score)'
    )
    
    # Figure 3: Connectivity results
    fig3 = stx.plt.create_connectivity_figure(
        style='circular',
        figsize=(10, 10)
    )
    
    fig3.plot_connectivity_matrix(
        dsp_results['connectivity']['wpli'],
        frequency_band='alpha',
        threshold='significant',
        node_colors='lobe',
        edge_cmap='viridis',
        show_labels=True
    )
    
    # Figure 4: Statistical summary
    fig4 = stx.plt.create_results_figure(
        layout='dashboard',
        figsize=(15, 10)
    )
    
    # Panel A: Component amplitudes
    fig4.panels['A'].plot_component_comparison(
        df_results,
        components=['N100', 'P300'],
        show_individual=True,
        show_stats=True,
        violin=True
    )
    
    # Panel B: Classification results
    fig4.panels['B'].plot_classification_results(
        stats_results['classification'],
        show_confusion_matrix=True,
        show_roc_curves=True,
        show_feature_importance=True
    )
    
    # Panel C: Effect sizes
    fig4.panels['C'].plot_effect_sizes(
        stats_results['group_comparisons']['effect_sizes'],
        show_ci=True,
        reference_lines=[0.2, 0.5, 0.8],  # Small, medium, large
        sort_by_magnitude=True
    )
    
    # Panel D: Power analysis
    fig4.panels['D'].plot_power_curves(
        stats_results['power_analysis'],
        show_achieved=True,
        show_required_n=True
    )
    
    # Save all figures
    figures = {
        'figure_1_erp': fig1,
        'figure_2_timefreq': fig2,
        'figure_3_connectivity': fig3,
        'figure_4_summary': fig4
    }
    
    for name, fig in figures.items():
        stx.io.save(
            fig,
            f'./figures/{name}',
            formats=['png', 'pdf', 'svg'],
            dpi=300,
            metadata={
                'experiment': CONFIG.experiment.name,
                'n_subjects': len(CONFIG.subjects),
                'analysis_date': stx.gen.timestamp()
            },
            export_data=True,
            symlink_from_cwd=True
        )
    
    return figures
'''

print("VISUALIZATION PIPELINE")
print("=" * 50)
print(visualization_pipeline)
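
The `show_sem=True` option in the ERP figure implies a grand average with a standard-error band across subjects. Below is a minimal sketch of that computation, assuming one averaged waveform per subject sampled at the same time points. `grand_average_with_sem` is a hypothetical helper and the amplitudes are made up.

```python
import math
from statistics import mean, stdev


def grand_average_with_sem(waveforms):
    """Return (grand_average, sem) lists computed across subjects per time point."""
    n_subjects = len(waveforms)
    if n_subjects < 2:
        raise ValueError("need at least two subjects to estimate the SEM")
    grand, sem = [], []
    for samples in zip(*waveforms):  # one tuple of subject values per time point
        grand.append(mean(samples))
        sem.append(stdev(samples) / math.sqrt(n_subjects))
    return grand, sem


# Two subjects, three time points (microvolts, invented values).
waveforms = [
    [1.0, 2.0, 4.0],
    [3.0, 4.0, 4.0],
]
grand, sem = grand_average_with_sem(waveforms)
band = [(g - s, g + s) for g, s in zip(grand, sem)]
print(grand)
print(band)
```

The `band` tuples are the lower and upper edges a plotting call would shade around the grand-average line.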

## Step 6: Results Management with PD Server

In [None]:
# Results management and export
results_management = '''
# Step 6: Organize and export results
def manage_results(all_results, CONFIG):
    """Organize, validate, and export all results."""
    
    # Combine all results into structured format
    results_db = stx.pd.create_results_database(
        erp_data=all_results['erp'],
        dsp_results=all_results['dsp'],
        stats_results=all_results['stats'],
        metadata=CONFIG.experiment,
        schema='neuroscience_erp'
    )
    
    # Quality control
    qc_report = stx.pd.quality_control(
        results_db,
        checks=[
            'completeness',      # All subjects processed
            'outlier_detection', # Statistical outliers
            'consistency',       # Cross-measure consistency
            'replication'        # Internal replication
        ],
        generate_report=True
    )
    
    # Create results summary table
    summary_table = stx.pd.create_summary_table(
        results_db,
        group_by=['condition', 'component'],
        metrics=['mean', 'std', 'ci95', 'effect_size', 'p_value'],
        format_numbers=True,
        highlight_significant=True
    )
    
    # Export to multiple formats
    export_manager = stx.pd.ResultsExporter(
        results_db,
        output_dir='./results/final/'
    )
    
    # Scientific data formats
    export_manager.to_hdf5(
        'complete_results.h5',
        compression='gzip',
        include_metadata=True
    )
    
    export_manager.to_mat(
        'results_for_matlab.mat',
        scipy_compatible=True
    )
    
    # Publication formats
    export_manager.to_excel(
        'results_tables.xlsx',
        sheets={
            'Summary': summary_table,
            'Individual_ERPs': results_db['erp_data'],
            'Statistics': results_db['stats_summary'],
            'QC_Report': qc_report
        },
        formatting='publication'
    )
    
    # Generate manuscript tables
    manuscript_tables = stx.pd.create_manuscript_tables(
        summary_table,
        table_format='apa',  # APA style
        caption_template='professional',
        number_format='3.2f',
        p_value_format='exact',
        save_as=['latex', 'docx', 'rtf']
    )
    
    # Create data package for sharing
    data_package = stx.pd.create_bids_package(
        results_db,
        package_name='erp_p300_study',
        include_raw=False,  # Only derivatives
        include_code=True,
        include_config=True,
        readme_template='comprehensive',
        license='CC-BY-4.0'
    )
    
    # Generate citations and methods
    methods_text = stx.pd.generate_methods_section(
        pipeline_config=CONFIG,
        software_versions=stx.gen.get_environment_info(),
        include_equations=True,
        citation_style='apa'
    )
    
    stx.io.save(methods_text, './manuscript/methods_section.md',
                symlink_from_cwd=True)
    
    return {
        'summary': summary_table,
        'qc_report': qc_report,
        'data_package': data_package
    }
'''

print("RESULTS MANAGEMENT PIPELINE")
print("=" * 50)
print(results_management)
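
The manuscript table options (`table_format='apa'`, `p_value_format='exact'`) point at APA-style reporting, where exact p-values are given to three decimals without a leading zero and very small values are reported as `p < .001`. The formatter below is a sketch of that convention. `format_p_value` is a hypothetical name, not a SciTeX function.

```python
def format_p_value(p):
    """Format a p-value in APA style: three decimals, no leading zero."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must be between 0 and 1")
    if p < 0.001:
        return "p < .001"
    # Drop the leading zero because a p-value cannot exceed 1.
    return "p = " + f"{p:.3f}".lstrip("0")


for value in (0.0004, 0.034, 0.2):
    print(format_p_value(value))
```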

## Complete Integrated Workflow

In [None]:
# Complete integrated workflow
complete_workflow = '''
import scitex as stx

def run_complete_analysis():
    """Execute complete EEG analysis workflow using MCP servers."""
    
    # Step 1: Initialize project with orchestrator
    project = stx.orchestrator.initialize_project(
        name="P300_ERP_Analysis",
        type="neuroscience",
        create_structure=True
    )
    
    # Step 2: Translate and validate code
    translated = stx.io_translator.process_codebase(
        source_dir="./legacy_scripts/",
        target_modules=["all"],
        validate=True
    )
    
    # Step 3: Load configuration
    CONFIG = stx.config.load_validated(
        config_dir="./CONFIG/",
        schema="neuroscience_erp"
    )
    
    # Step 4: Run main analysis
    with stx.gen.progress_monitor("EEG Analysis Pipeline"):
        
        # Data loading and preprocessing
        erp_results = main(CONFIG)
        
        # Signal processing
        dsp_results = process_eeg_signals(erp_results, CONFIG)
        
        # Statistical analysis
        stats_results = analyze_erp_statistics(
            erp_results, dsp_results, CONFIG
        )
        
        # Visualization
        figures = create_publication_figures(
            erp_results, dsp_results, stats_results, CONFIG
        )
        
        # Results management
        final_results = manage_results(
            {
                'erp': erp_results,
                'dsp': dsp_results,
                'stats': stats_results,
                'figures': figures
            },
            CONFIG
        )
    
    # Step 5: Generate final report
    report = stx.orchestrator.generate_project_report(
        project,
        include_figures=True,
        include_stats=True,
        include_methods=True,
        format='html'
    )
    
    # Step 6: Package for reproducibility
    stx.orchestrator.create_reproducibility_package(
        project,
        include_data=True,
        include_environment=True,
        create_docker=True,
        create_singularity=True
    )
    
    print("\n" + "="*50)
    print("ANALYSIS COMPLETE")
    print("="*50)
    print(f"Results saved to: {project.results_dir}")
    print(f"Figures saved to: {project.figures_dir}")
    print(f"Report available at: {report.path}")
    print(f"Reproducibility package: {project.package_path}")
    
    return final_results


if __name__ == "__main__":
    # Run with SciTeX initialization
    stx.gen.run_main(run_complete_analysis)
'''

print("COMPLETE INTEGRATED WORKFLOW")
print("=" * 50)
print(complete_workflow)
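
The workflow above is a chain in which each stage consumes the outputs of earlier stages. Stripped of the SciTeX calls, that hand-off pattern can be sketched as a list of named stages sharing a results dictionary. Everything in this block (`run_stages` and the toy stage functions) is hypothetical and only illustrates the data flow.

```python
import time


def run_stages(stages, context=None):
    """Run (name, function) stages in order, storing each output under its name."""
    context = dict(context or {})
    for name, stage in stages:
        started = time.perf_counter()
        context[name] = stage(context)
        print(f"{name}: done in {time.perf_counter() - started:.3f}s")
    return context


# Toy stages standing in for preprocessing, statistics, and reporting.
stages = [
    ("erp", lambda ctx: [4.2, 5.1, 3.9]),
    ("stats", lambda ctx: {"mean": sum(ctx["erp"]) / len(ctx["erp"])}),
    ("report", lambda ctx: f"mean amplitude = {ctx['stats']['mean']:.2f}"),
]
results = run_stages(stages)
print(results["report"])
```

Keeping every stage's output in one dictionary mirrors the `all_results` argument passed to `manage_results`, and makes it straightforward to rerun a single stage from saved intermediate results.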

## Workflow Summary

This integrated workflow demonstrates how SciTeX MCP servers work together:

### 1. **Code Translation (IO Translator)**
   - Converted legacy analysis scripts to SciTeX format
   - Extracted configuration to separate files
   - Added reproducibility features

### 2. **Configuration Management (Config Server)**
   - Centralized experimental parameters
   - Validated configuration schema
   - Kept parameters out of the code, which makes sweeps and optimization possible

### 3. **Signal Processing (DSP Server)**
   - Advanced filtering and preprocessing
   - Time-frequency analysis
   - Connectivity and source estimation
   - Feature extraction for ML

### 4. **Statistical Analysis (Stats Server)**
   - Group, repeated-measures, and cluster-based permutation tests
   - Multiple comparison corrections
   - Effect size calculations
   - Machine learning classification

### 5. **Visualization (PLT Server)**
   - Publication-ready figures
   - Multi-panel layouts
   - Statistical annotations
   - Automatic data export

### 6. **Data Management (PD Server)**
   - Results organization
   - Quality control
   - Multi-format export
   - BIDS compliance

### 7. **Project Orchestration (Orchestrator)**
   - Workflow coordination
   - Progress monitoring
   - Report generation
   - Reproducibility packaging

## Benefits of Integration

1. **Seamless Workflow**: Each server handles its specialized domain
2. **Consistency**: Shared configuration and data formats
3. **Reproducibility**: Every step tracked and documented
4. **Scalability**: Parallel processing where applicable
5. **Quality Assurance**: Built-in validation at each step
6. **Publication Ready**: Outputs formatted for journals

## Next Steps

- Customize configuration for your specific experiment
- Add domain-specific analysis modules
- Integrate with compute clusters for large datasets
- Share reproducibility packages with collaborators