# Primavera XER File to CSV Converter

This notebook helps you parse Primavera P6 XER files and convert them to CSV format.

## What are XER files?
- Tab-delimited text files from Primavera P6
- Contain multiple tables (tasks, resources, calendars, etc.)
- Each table can be exported as a separate CSV
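
To make the format concrete, here is a minimal sketch of how an XER file can be split into tables. It assumes the usual XER layout, where `%T` lines name a table, `%F` lines list its fields, `%R` lines hold one record each, and `%E` ends the file. The function `split_xer_tables` and the sample text are hypothetical illustrations; this is not the `XERParser` implementation used in the rest of the notebook.

```python
def split_xer_tables(text):
    """Split XER text into {table_name: [row dicts]} (illustrative only)."""
    tables = {}
    current, fields = None, []
    for line in text.splitlines():
        parts = line.split("\t")
        marker = parts[0]
        if marker == "%T":
            # Start of a new table
            current = parts[1]
            tables[current] = []
            fields = []
        elif marker == "%F":
            # Field (column) names for the current table
            fields = parts[1:]
        elif marker == "%R" and current is not None:
            values = parts[1:]
            # Pad short rows so every field has a value
            values += [""] * (len(fields) - len(values))
            tables[current].append(dict(zip(fields, values)))
        elif marker == "%E":
            # End-of-file marker
            break
    return tables


# Made-up sample data, not taken from a real project
sample = "\n".join([
    "ERMHDR\t19.12",
    "%T\tPROJECT",
    "%F\tproj_id\tproj_short_name",
    "%R\t1\tDEMO",
    "%T\tTASK",
    "%F\ttask_id\tproj_id\ttask_name",
    "%R\t10\t1\tExcavate",
    "%R\t11\t1\tPour concrete",
    "%E",
])
demo_tables = split_xer_tables(sample)
print({name: len(rows) for name, rows in demo_tables.items()})
```

Each table keeps its own field list, which is why every table maps naturally to one CSV file.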

## Usage
1. Set the path to your XER file
2. Run the cells to parse and explore
3. Export specific tables or all tables to CSV

## Setup

In [None]:
import sys
from pathlib import Path
import pandas as pd

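# Add the project root to sys.path so `src` is importable
# (assumes the notebook runs from a direct subdirectory of the project root)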
project_root = Path.cwd().parent
sys.path.insert(0, str(project_root))

from src.utils.xer_parser import XERParser

print("✅ Setup complete")

## Configuration

Set your input XER file path and output directory here:

In [None]:
# INPUT: Path to your XER file
XER_FILE_PATH = "/workspaces/mxi-samsung/data/raw/SAMSUNG-TFAB1-11-20-25- Live-3.xer"

# OUTPUT: Directory for CSV exports
OUTPUT_DIR = "/workspaces/mxi-samsung/data/output/xer_exports"

print(f"Input file: {XER_FILE_PATH}")
print(f"Output directory: {OUTPUT_DIR}")
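
Before parsing, it can help to confirm that the input file exists and that the output directory is in place. The helper below is a sketch of that check using only `pathlib`; the name `check_paths` is hypothetical and not part of `XERParser`.

```python
from pathlib import Path


def check_paths(xer_path, output_dir):
    """Return (input_path, output_path), creating the output directory if needed."""
    input_path = Path(xer_path)
    if not input_path.is_file():
        raise FileNotFoundError(f"XER file not found: {input_path}")
    output_path = Path(output_dir)
    # parents=True also creates missing intermediate directories
    output_path.mkdir(parents=True, exist_ok=True)
    return input_path, output_path
```

Calling `check_paths(XER_FILE_PATH, OUTPUT_DIR)` right after the configuration cell would surface a mistyped path early, before any parsing starts.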

## Parse XER File

This will read the XER file and parse all tables into memory:

In [None]:
# Initialize parser
parser = XERParser(XER_FILE_PATH)

# Parse the file
print("Parsing XER file...")
tables = parser.parse()

print(f"\n✅ Successfully parsed {len(tables)} tables")

## Explore Available Tables

See what tables are available in your XER file:

In [None]:
# List all tables
print("Available tables:\n")
for i, table_name in enumerate(parser.list_tables(), 1):
    df = parser.get_table(table_name)
    print(f"{i:2d}. {table_name:<20} ({len(df):>6} rows, {len(df.columns):>3} columns)")

## Get Detailed Summary

View a detailed summary including column names for each table:

In [None]:
summary = parser.summary()

print(f"File: {summary['file_path']}")
print(f"Total tables: {summary['total_tables']}\n")

for table_name, info in summary['tables'].items():
    print(f"\n{table_name}:")
    print(f"  Rows: {info['rows']}")
    print(f"  Columns: {info['columns']}")
    print(f"  Fields: {', '.join(info['column_names'][:10])}{'...' if info['columns'] > 10 else ''}")

## Preview Specific Tables

### Tasks Table

In [None]:
tasks = parser.get_tasks()
if tasks is not None:
    print(f"Tasks table: {len(tasks)} rows\n")
    display(tasks.head(10))
else:
    print("No TASK table found in XER file")

### Projects Table

In [None]:
projects = parser.get_projects()
if projects is not None:
    print(f"Projects table: {len(projects)} rows\n")
    display(projects.head(10))
else:
    print("No PROJECT table found in XER file")

### Resources Table

In [None]:
resources = parser.get_resources()
if resources is not None:
    print(f"Resources table: {len(resources)} rows\n")
    display(resources.head(10))
else:
    print("No RSRC table found in XER file")

### Preview Any Table

Set `TABLE_NAME` to any table name from the list above:

In [None]:
# Change this to any table name from the list above
TABLE_NAME = "TASK"

df = parser.get_table(TABLE_NAME)
if df is not None:
    print(f"{TABLE_NAME} table:")
    print(f"  Rows: {len(df)}")
    print(f"  Columns: {len(df.columns)}")
    print(f"  Fields: {list(df.columns)}\n")
    display(df.head(20))
else:
    print(f"Table '{TABLE_NAME}' not found")

## Export to CSV

### Export a Single Table

In [None]:
# Export specific table
TABLE_TO_EXPORT = "TASK"  # Change as needed
OUTPUT_FILE = f"{OUTPUT_DIR}/{TABLE_TO_EXPORT}.csv"

parser.export_table_to_csv(TABLE_TO_EXPORT, OUTPUT_FILE)

### Export All Tables

This will create a separate CSV file for each table:

In [None]:
print("Exporting all tables to CSV...\n")
parser.export_all_to_csv(OUTPUT_DIR)
print("\n✅ Export complete!")
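
To sanity-check an export, the row counts of the written files can be compared with the row counts listed earlier. The sketch below assumes the export writes UTF-8 `.csv` files with a header row directly into the output directory; `count_csv_rows` is a hypothetical helper, not part of `XERParser`.

```python
import csv
from pathlib import Path


def count_csv_rows(directory):
    """Return {file name: number of data rows} for every CSV in a directory."""
    counts = {}
    for csv_path in sorted(Path(directory).glob("*.csv")):
        # Assumption: files are UTF-8 encoded and start with a header row
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the header row, if any
            counts[csv_path.name] = sum(1 for _ in reader)
    return counts
```

Running `count_csv_rows(OUTPUT_DIR)` after the export returns one entry per file, which can be checked against the table listing.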

## Data Analysis Examples

### Task Statistics

In [None]:
tasks = parser.get_tasks()
if tasks is not None:
    print("Task Statistics:\n")
    print(f"Total tasks: {len(tasks)}")

    # Check for common fields
    if 'task_name' in tasks.columns:
        print("\nSample task names:")
        print(tasks['task_name'].head(10).to_string())

    if 'status_code' in tasks.columns:
        print("\nTasks by status:")
        print(tasks['status_code'].value_counts())

    # Show all available columns
    print(f"\nAvailable columns ({len(tasks.columns)}):")
    print(list(tasks.columns))

### Filter and Export Subset

In [None]:
# Example: Export only active tasks
tasks = parser.get_tasks()
if tasks is not None and 'status_code' in tasks.columns:
    active_tasks = tasks[tasks['status_code'] == 'TK_Active']
    # Make sure the output directory exists before writing
    Path(OUTPUT_DIR).mkdir(parents=True, exist_ok=True)
    output_file = f"{OUTPUT_DIR}/active_tasks.csv"
    active_tasks.to_csv(output_file, index=False)
    print(f"Exported {len(active_tasks)} active tasks to {output_file}")
else:
    print("Cannot filter tasks - check available columns")

## Custom Data Processing

Add your own processing logic here:

In [None]:
# Your custom code here
# Example: Merge tables, filter data, create summaries, etc.
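
As a starting point, the sketch below joins tasks to their projects and counts tasks per status. It uses small stand-in DataFrames so it runs on its own; in the notebook the inputs would come from `parser.get_tasks()` and `parser.get_projects()`. The column names `proj_id`, `proj_short_name`, and `status_code` are assumptions about your file, so check them against the column listings above before adapting it.

```python
import pandas as pd

# Stand-in data; replace with parser.get_tasks() / parser.get_projects()
tasks_demo = pd.DataFrame({
    "task_id": ["10", "11", "12"],
    "proj_id": ["1", "1", "2"],
    "status_code": ["TK_Active", "TK_Complete", "TK_Active"],
})
projects_demo = pd.DataFrame({
    "proj_id": ["1", "2"],
    "proj_short_name": ["DEMO-A", "DEMO-B"],
})

# Left join keeps every task, even if its project row is missing
merged = tasks_demo.merge(projects_demo, on="proj_id", how="left")

# One row per project, one column per status, values are task counts
status_summary = (
    merged.groupby(["proj_short_name", "status_code"])
    .size()
    .unstack(fill_value=0)
)
print(status_summary)
```

The IDs are kept as strings here on the assumption that values read from a text file arrive as strings. If your parser converts them to numbers, both sides of the join must use the same type.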
