# CLARISSA Tutorial 01: ECLIPSE Deck Fundamentals

**Learning Objectives:**
- Understand ECLIPSE deck structure and section ordering
- Parse and validate keyword syntax
- Generate syntactically correct deck sections
- Check OPM Flow compatibility

**Prerequisites:** Basic Python knowledge, familiarity with reservoir simulation concepts

**Estimated Time:** 45 minutes

## CLARISSA System Architecture

CLARISSA is a 6-layer system that translates natural language into executable simulation decks:

| Layer | Function |
|-------|----------|
| User Interface | Voice, Text, Web, API inputs |
| Translation | NL Parser, Confidence Scoring, Rollback |
| Knowledge | Vector Store, Corrections DB, Analog Database |
| Core | LLM (Planning), RL Agent, Neuro-Symbolic Constraints |
| Validation | Syntax, Semantic, Physics checks |
| Simulation | OPM Flow, Eclipse Export, Result Parser |

## What is a Deck?

A **deck** (historically called a "card deck" from punch card days) is a structured text file that defines:

- **What** to simulate (fluid system, rock properties)
- **Where** (grid geometry, regions)
- **When** (schedule of operations)
- **How** (numerical controls, output requests)

ECLIPSE decks follow a strict section ordering that we must understand to generate valid input.

## Section 1: Deck Structure and Section Ordering

In [None]:
from enum import Enum, auto
from dataclasses import dataclass, field
from typing import List, Dict, Optional, Tuple
import re

class DeckSection(Enum):
    """ECLIPSE deck sections in required order."""
    RUNSPEC = auto()    # Run specification - dimensions, phases, features
    GRID = auto()       # Grid geometry and properties
    EDIT = auto()       # Grid property modifications (optional)
    PROPS = auto()      # Rock and fluid properties
    REGIONS = auto()    # Region definitions (optional)
    SOLUTION = auto()   # Initial conditions
    SUMMARY = auto()    # Output requests (optional)
    SCHEDULE = auto()   # Well operations and time stepping

# Valid section transitions
SECTION_ORDER = list(DeckSection)
print(f"Valid section order: {' -> '.join(s.name for s in SECTION_ORDER)}")

In [None]:
def validate_section_order(sections: List[DeckSection]) -> Tuple[bool, Optional[str]]:
    """Validate that sections appear in correct order.
    
    Args:
        sections: List of sections in order they appear
        
    Returns:
        (is_valid, error_message)
    """
    if not sections:
        return False, "No sections found"
    
    # Check required sections
    required = {DeckSection.RUNSPEC, DeckSection.GRID, DeckSection.PROPS, 
                DeckSection.SOLUTION, DeckSection.SCHEDULE}
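    missing = required - set(sections)
    if missing:
        # List missing sections in deck order so the message is deterministic
        missing_names = [s.name for s in SECTION_ORDER if s in missing]
        return False, f"Missing required sections: {missing_names}"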
    
    # Check order
    section_indices = {s: i for i, s in enumerate(SECTION_ORDER)}
    prev_idx = -1
    for section in sections:
        curr_idx = section_indices[section]
        if curr_idx < prev_idx:
            return False, f"{section.name} appears after a later section"
        prev_idx = curr_idx
    
    return True, None

# Test with valid order
valid_sections = [DeckSection.RUNSPEC, DeckSection.GRID, DeckSection.PROPS, 
                  DeckSection.SOLUTION, DeckSection.SCHEDULE]
is_valid, error = validate_section_order(valid_sections)
print(f"Valid deck: {is_valid}")

# Test with invalid order
invalid_sections = [DeckSection.GRID, DeckSection.RUNSPEC, DeckSection.PROPS, 
                    DeckSection.SOLUTION, DeckSection.SCHEDULE]
is_valid, error = validate_section_order(invalid_sections)
print(f"Invalid deck: {is_valid}, Error: {error}")
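
The validator works on a list of sections, so something has to find the section headers in the raw deck text first. The sketch below is one way to do that; the helper name `find_sections` is an assumption for illustration, and it simply treats any line that consists only of a section name as a header.

```python
from typing import List

# Section names, in the same order as the DeckSection enum
SECTION_NAMES = ["RUNSPEC", "GRID", "EDIT", "PROPS",
                 "REGIONS", "SOLUTION", "SUMMARY", "SCHEDULE"]

def find_sections(deck_text: str) -> List[str]:
    """Return section headers in the order they appear in the deck."""
    found = []
    for raw_line in deck_text.splitlines():
        line = raw_line.split("--", 1)[0].strip()  # drop comments
        if line in SECTION_NAMES:
            found.append(line)
    return found

sample = """
RUNSPEC
DIMENS
  10 10 5 /
GRID
-- SCHEDULE in a comment is ignored
PROPS
SOLUTION
SCHEDULE
"""
section_names = find_sections(sample)
print(section_names)
```

Each name can then be mapped to the enum with `DeckSection[name]` and the resulting list passed to `validate_section_order`.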

## Section 2: Keyword Syntax Parsing

ECLIPSE keywords follow specific syntax rules:

- Keywords are uppercase, max 8 characters
- Data records follow the keyword
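- Records end with `/`; a keyword that takes a variable number of records is closed by a `/` on its own line
- `n*value` repeats `value` n times (e.g. `3*0.25`); `n*` with no value leaves n items at their defaults (e.g. `1*` defaults one item)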
- `--` indicates comments
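
To make the repeat and default rules concrete, here is a minimal sketch that expands single tokens. The helper name `expand_token` and the choice to represent a defaulted item as `None` are assumptions made for this example only.

```python
import re
from typing import List, Optional

# Matches 'n*' optionally followed by a value, e.g. '3*0.25' or '4*'
_REPEAT = re.compile(r"^(\d+)\*(.*)$")

def expand_token(token: str) -> List[Optional[str]]:
    """Expand one data token.

    'n*value' -> value repeated n times
    'n*'      -> n defaulted items (represented here as None)
    Any other token is returned unchanged.
    """
    match = _REPEAT.match(token)
    if not match:
        return [token]
    count, value = int(match.group(1)), match.group(2)
    return [value if value else None] * count

record = "PROD1 OPEN ORAT 500 4* 1000"
expanded = [item for tok in record.split() for item in expand_token(tok)]
print(expanded)
```

The `4*` token expands to four defaulted items, so the record above has nine items in total.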

In [None]:
@dataclass
class KeywordRecord:
    """A single record within an ECLIPSE keyword."""
    values: List[str]
    
    def expand_repeats(self) -> List[str]:
        """Expand repeat notation (e.g., '3*0.25' -> ['0.25', '0.25', '0.25']).

        Default markers such as '4*' expand to four '1*' entries.
        """
        expanded = []
        for val in self.values:
            count, star, value = val.partition('*')
            if star and count.isdigit():
                # 'n*value' repeats value; 'n*' alone means n defaulted items
                expanded.extend([value or '1*'] * int(count))
            else:
                expanded.append(val)
        return expanded

@dataclass
class Keyword:
    """An ECLIPSE keyword with its data records."""
    name: str
    records: List[KeywordRecord] = field(default_factory=list)
    
    def __str__(self) -> str:
        lines = [self.name]
        for record in self.records:
            lines.append('  ' + ' '.join(record.values) + ' /')
        lines.append('/')
        return '\n'.join(lines)

In [None]:
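def parse_keyword_block(text: str) -> List[Keyword]:
    """Parse ECLIPSE keyword blocks from text.

    Simplified: a record is collected only if its closing '/' is on the
    same line, and any all-uppercase line whose first token has no digits
    is treated as a keyword.
    """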
    keywords = []
    lines = text.strip().split('\n')
    
    current_keyword = None
    current_records = []
    
    for line in lines:
        # Remove comments
        if '--' in line:
            line = line[:line.index('--')]
        line = line.strip()
        
        if not line:
            continue
            
        # Check for keyword (uppercase, starts at beginning)
        if line.isupper() and not any(c.isdigit() for c in line.split()[0]):
            if current_keyword:
                keywords.append(Keyword(current_keyword, current_records))
            current_keyword = line.split()[0]
            current_records = []
            # Check for inline data
            if len(line.split()) > 1:
                data = line.split()[1:]
                if data[-1] == '/':
                    current_records.append(KeywordRecord(data[:-1]))
        elif line == '/':
            if current_keyword:
                keywords.append(Keyword(current_keyword, current_records))
                current_keyword = None
                current_records = []
        elif line.endswith('/'):
            values = line[:-1].strip().split()
            if values:
                current_records.append(KeywordRecord(values))
    
    if current_keyword:
        keywords.append(Keyword(current_keyword, current_records))
    
    return keywords

# Test parsing
test_block = '''
WELSPECS
  PROD1 G1 10 10 8335 OIL /
  INJ1  G1 1  1  8335 WATER /
/

COMPDAT
  PROD1 10 10 1 5 OPEN 1* 0.5 /
/
'''

keywords = parse_keyword_block(test_block)
for kw in keywords:
    print(f"Keyword: {kw.name}")
    for rec in kw.records:
        print(f"  Record: {rec.values}")
        print(f"  Expanded: {rec.expand_repeats()}")
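
The line-based parser above suits well keywords, where each record sits on one line. Table keywords such as SWOF spread one record over several lines with a single closing `/`. One possible way to handle both layouts is to join the lines and split on `/`. The sketch below assumes that no quoted string contains `/` or `--`; `split_records` is a hypothetical helper, not part of the parser above.

```python
from typing import List

def split_records(body: str) -> List[List[str]]:
    """Split keyword data into records on '/' terminators.

    Assumes no quoted string contains '/' or '--' (a simplification).
    """
    # Strip comments line by line, then join everything into one token stream
    cleaned = " ".join(line.split("--", 1)[0] for line in body.splitlines())
    records = []
    for chunk in cleaned.split("/"):
        tokens = chunk.split()
        if tokens:  # skip the empty chunk left by a lone closing '/'
            records.append(tokens)
    return records

swof_body = """
-- Sw    Krw     Krow    Pcow
   0.20  0.00    1.00    0.0
   0.80  0.50    0.00    0.0
/
"""
welspecs_body = """
  PROD1 G1 10 10 8335 OIL /
  INJ1  G1 1  1  8335 WATER /
/
"""
table_records = split_records(swof_body)
well_records = split_records(welspecs_body)
print(table_records)
print(well_records)
```

The SWOF body comes back as a single record of eight values, and the WELSPECS body as two records of six values each.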

## Section 3: Deck Generation

Now let's build functions to generate valid deck sections.

In [None]:
@dataclass
class RunspecData:
    """Data for RUNSPEC section."""
    title: str = "CLARISSA Generated Model"
    nx: int = 10
    ny: int = 10
    nz: int = 5
    # Oil-water only: generate_props() below supplies no gas tables (PVDG/SGOF)
    phases: List[str] = field(default_factory=lambda: ['OIL', 'WATER'])
    metric: bool = False

def generate_runspec(data: RunspecData) -> str:
    """Generate RUNSPEC section."""
    lines = [
        "RUNSPEC",
        "",
        "TITLE",
        data.title,
        "",
        "-- Phases",
    ]
    
    for phase in data.phases:
        lines.append(phase)
    
    lines.extend([
        "",
        "-- Units",
        "METRIC" if data.metric else "FIELD",
        "",
        "-- Grid dimensions",
        "DIMENS",
        f"  {data.nx} {data.ny} {data.nz} /",
        ""
    ])
    
    return "\n".join(lines)

# Test RUNSPEC generation
runspec_data = RunspecData(title="5-Spot Waterflood", nx=20, ny=20, nz=5)
print(generate_runspec(runspec_data))

In [None]:
@dataclass
class GridData:
    """Data for GRID section."""
    dx: float = 100.0  # ft or m
    dy: float = 100.0
    dz: float = 20.0
    tops: float = 8000.0  # Top depth
    poro: float = 0.2
    permx: float = 100.0  # mD
    permy: float = 100.0
    permz: float = 10.0
    nx: int = 10
    ny: int = 10
    nz: int = 5

def generate_grid(data: GridData) -> str:
    """Generate GRID section with Cartesian grid."""
    total_cells = data.nx * data.ny * data.nz
    top_cells = data.nx * data.ny
    
    lines = [
        "GRID",
        "",
        "-- Cell dimensions",
        "DX",
        f"  {total_cells}*{data.dx} /",
        "DY",
        f"  {total_cells}*{data.dy} /",
        "DZ",
        f"  {total_cells}*{data.dz} /",
        "",
        "-- Top depth (first layer only)",
        "TOPS",
        f"  {top_cells}*{data.tops} /",
        "",
        "-- Porosity",
        "PORO",
        f"  {total_cells}*{data.poro} /",
        "",
        "-- Permeability",
        "PERMX",
        f"  {total_cells}*{data.permx} /",
        "PERMY",
        f"  {total_cells}*{data.permy} /",
        "PERMZ",
        f"  {total_cells}*{data.permz} /",
        ""
    ]
    
    return "\n".join(lines)

# Test GRID generation
grid_data = GridData(nx=20, ny=20, nz=5, permx=150.0)
print(generate_grid(grid_data))
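
The repeat counts in the GRID section follow directly from the grid dimensions: per-cell arrays need `nx * ny * nz` values, while TOPS only needs `nx * ny` values for the top layer. The sketch below works through that arithmetic for the 20 x 20 x 5 example and also derives bulk and pore volume. The helper `grid_summary` is an assumption for illustration and only applies to a uniform grid.

```python
def grid_summary(nx: int, ny: int, nz: int,
                 dx: float, dy: float, dz: float, poro: float) -> dict:
    """Cell counts and volumes for a uniform Cartesian grid."""
    total_cells = nx * ny * nz   # values needed for DX, DY, DZ, PORO, PERMX/Y/Z
    top_cells = nx * ny          # values needed for TOPS (top layer only)
    bulk_volume = total_cells * dx * dy * dz
    return {
        "total_cells": total_cells,
        "top_cells": top_cells,
        "bulk_volume": bulk_volume,
        "pore_volume": bulk_volume * poro,
    }

summary = grid_summary(nx=20, ny=20, nz=5, dx=100.0, dy=100.0, dz=20.0, poro=0.2)
print(summary)
```

With these inputs there are 2000 cells in total and 400 in the top layer, which matches the `2000*` and `400*` prefixes in the generated section.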

In [None]:
def generate_props() -> str:
    """Generate PROPS section with oil-water (dead oil) properties.

    No gas tables (PVDG, SGOF) are included.
    """
    return '''PROPS

-- Water-Oil relative permeability
SWOF
-- Sw    Krw      Krow     Pcow
   0.20  0.0000   1.0000   0.0
   0.30  0.0200   0.6000   0.0
   0.40  0.0500   0.3500   0.0
   0.50  0.1000   0.2000   0.0
   0.60  0.2000   0.0900   0.0
   0.70  0.3500   0.0200   0.0
   0.80  0.5000   0.0000   0.0
/

-- PVT data
PVTW
-- Pref   Bw       Cw         Vw       Cv
   4000   1.012    3.0E-6     0.5      0.0 /

PVDO
-- P       Bo        Vo
   1000    1.200     2.5
   2000    1.150     2.0
   3000    1.100     1.5
   4000    1.050     1.2
   5000    1.020     1.0
/

ROCK
-- Pref  Cr
   4000  3.0E-6 /

DENSITY
-- Oil    Water   Gas
   45.0   64.0    0.06 /
'''

print(generate_props())
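
Saturation tables are a common source of simulator input errors, so a simple check before writing the deck can help. The sketch below tests the usual expectations for an SWOF table: Sw strictly increasing, Krw non-decreasing and starting at zero, Krow non-increasing and ending at zero. The helper `check_swof` is hypothetical, and a simulator may apply further checks that are not covered here.

```python
from typing import List, Tuple

Row = Tuple[float, float, float, float]  # (Sw, Krw, Krow, Pcow)

def check_swof(rows: List[Row]) -> List[str]:
    """Return a list of problems found in an SWOF table (empty if none)."""
    problems = []
    for prev, curr in zip(rows, rows[1:]):
        if curr[0] <= prev[0]:
            problems.append(f"Sw not strictly increasing at Sw={curr[0]}")
        if curr[1] < prev[1]:
            problems.append(f"Krw decreases at Sw={curr[0]}")
        if curr[2] > prev[2]:
            problems.append(f"Krow increases at Sw={curr[0]}")
    if rows and rows[0][1] != 0.0:
        problems.append("Krw should be zero at the first saturation")
    if rows and rows[-1][2] != 0.0:
        problems.append("Krow should be zero at the last saturation")
    return problems

# Same values as the SWOF table in generate_props()
swof_rows = [
    (0.20, 0.00, 1.00, 0.0),
    (0.30, 0.02, 0.60, 0.0),
    (0.40, 0.05, 0.35, 0.0),
    (0.50, 0.10, 0.20, 0.0),
    (0.60, 0.20, 0.09, 0.0),
    (0.70, 0.35, 0.02, 0.0),
    (0.80, 0.50, 0.00, 0.0),
]
swof_problems = check_swof(swof_rows)
print(swof_problems or "SWOF table passes the basic checks")
```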

In [None]:
@dataclass
class EquilData:
    """Data for equilibration."""
    datum_depth: float = 8000.0
    datum_pressure: float = 4000.0
    woc_depth: float = 9000.0  # Water-oil contact
    goc_depth: float = 7000.0  # Gas-oil contact (above reservoir)

def generate_solution(data: EquilData) -> str:
    """Generate SOLUTION section."""
    return f'''SOLUTION

EQUIL
-- Datum   Pres    WOC     Pcow  GOC     Pcog  Init
   {data.datum_depth}  {data.datum_pressure}  {data.woc_depth}  0  {data.goc_depth}  0  1 /
'''

equil_data = EquilData(datum_depth=8500, datum_pressure=3800)
print(generate_solution(equil_data))
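
It can be useful to see where the contacts sit relative to the grid before running a model. The sketch below compares the GOC and WOC depths with the top and base of a uniform grid (depth increasing downward). The helper `describe_contacts` is an assumption for illustration, and it ignores capillary transition zones.

```python
from typing import List

def describe_contacts(top: float, bottom: float,
                      goc: float, woc: float) -> List[str]:
    """Describe where the fluid contacts sit relative to the grid interval."""
    notes = []
    if goc > woc:
        notes.append("GOC is deeper than WOC: contacts are inverted")
    if goc <= top:
        notes.append("GOC at or above grid top: no initial gas cap")
    if woc >= bottom:
        notes.append("WOC at or below grid base: no initial water leg")
    return notes

# Grid from the complete-deck example: tops = 8500, nz = 5, dz = 20
grid_top = 8500.0
grid_base = grid_top + 5 * 20.0
contact_notes = describe_contacts(grid_top, grid_base, goc=7000.0, woc=9000.0)
for note in contact_notes:
    print(note)
```

With the default `EquilData` contacts, the whole grid lies between the GOC and the WOC, so the model starts in the oil zone.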

In [None]:
@dataclass
class Well:
    """Well specification."""
    name: str
    i: int  # I-location
    j: int  # J-location
    k1: int = 1  # Top completion
    k2: int = 5  # Bottom completion
    well_type: str = "PROD"  # PROD or INJ
    phase: str = "OIL"  # OIL, WATER, GAS
    rate: float = 1000.0  # STB/D or MSCF/D
    bhp_limit: float = 1000.0  # psi

def generate_schedule(wells: List[Well], end_time: int = 365) -> str:
    """Generate SCHEDULE section."""
    lines = ["SCHEDULE", ""]
    
    # Well specifications
    lines.append("WELSPECS")
    for w in wells:
        group = "G1"
        lines.append(f"  {w.name:8} {group} {w.i:3} {w.j:3} 1* {w.phase} /")
    lines.extend(["/", ""])
    
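    # Completions
    # Items 7-8 (saturation table, connection factor) are defaulted with 2*;
    # 0.5 is then item 9, the wellbore diameter
    lines.append("COMPDAT")
    for w in wells:
        lines.append(f"  {w.name:8} {w.i:3} {w.j:3} {w.k1:3} {w.k2:3} OPEN 2* 0.5 /")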
    lines.extend(["/", ""])
    
    # Production controls
    producers = [w for w in wells if w.well_type == "PROD"]
    if producers:
        lines.append("WCONPROD")
        for w in producers:
            lines.append(f"  {w.name:8} OPEN ORAT {w.rate:.0f} 4* {w.bhp_limit:.0f} /")
        lines.extend(["/", ""])
    
    # Injection controls
    # Note: the injector BHP limit is fixed at 5000; w.bhp_limit is only
    # used for producers
    injectors = [w for w in wells if w.well_type == "INJ"]
    if injectors:
        lines.append("WCONINJE")
        for w in injectors:
            lines.append(f"  {w.name:8} {w.phase} OPEN RATE {w.rate:.0f} 1* 5000 /")
        lines.extend(["/", ""])
    
    # Time steps
    lines.append("TSTEP")
    lines.append(f"  {end_time}*1 /")
    lines.extend(["", "END"])
    
    return "\n".join(lines)

# Create 5-spot pattern
wells = [
    Well("PROD1", 10, 10, well_type="PROD", rate=500),
    Well("INJ1", 1, 1, well_type="INJ", phase="WATER", rate=600),
    Well("INJ2", 1, 20, well_type="INJ", phase="WATER", rate=600),
    Well("INJ3", 20, 1, well_type="INJ", phase="WATER", rate=600),
    Well("INJ4", 20, 20, well_type="INJ", phase="WATER", rate=600),
]
print(generate_schedule(wells, end_time=730))
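
`generate_schedule` does not know the grid dimensions, so a well placed outside the grid would only be caught by the simulator. The sketch below shows one way to check locations up front. It takes plain tuples so that it stands alone, and the helper name `check_well_locations` is an assumption for illustration.

```python
from typing import List, Tuple

WellLocation = Tuple[str, int, int, int, int]  # (name, i, j, k1, k2), 1-based

def check_well_locations(wells: List[WellLocation],
                         nx: int, ny: int, nz: int) -> List[str]:
    """Return a list of wells whose location or completion is off the grid."""
    problems = []
    for name, i, j, k1, k2 in wells:
        if not (1 <= i <= nx and 1 <= j <= ny):
            problems.append(f"{name}: location ({i}, {j}) outside {nx}x{ny} grid")
        if not (1 <= k1 <= k2 <= nz):
            problems.append(f"{name}: layers {k1}-{k2} invalid for nz={nz}")
    return problems

# Same locations as the 5-spot pattern above
five_spot = [
    ("PROD1", 10, 10, 1, 5),
    ("INJ1", 1, 1, 1, 5),
    ("INJ2", 1, 20, 1, 5),
    ("INJ3", 20, 1, 1, 5),
    ("INJ4", 20, 20, 1, 5),
]
ok_problems = check_well_locations(five_spot, nx=20, ny=20, nz=5)
small_grid_problems = check_well_locations(five_spot, nx=10, ny=10, nz=5)
print(ok_problems)
print(small_grid_problems)
```

On the 20 x 20 x 5 grid all five wells pass. On a 10 x 10 grid (the `GridData` defaults), the three injectors with an index of 20 are reported.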

## Section 4: Complete Deck Generation

In [None]:
def generate_complete_deck(
    runspec: RunspecData,
    grid: GridData,
    equil: EquilData,
    wells: List[Well],
    simulation_days: int = 365
) -> str:
    """Generate a complete ECLIPSE deck."""
    sections = [
        generate_runspec(runspec),
        generate_grid(grid),
        generate_props(),
        generate_solution(equil),
        generate_schedule(wells, simulation_days)
    ]
    return "\n".join(sections)

# Generate complete deck
runspec = RunspecData(title="5-Spot Waterflood Tutorial", nx=20, ny=20, nz=5)
grid = GridData(nx=20, ny=20, nz=5, permx=150, tops=8500)
equil = EquilData(datum_depth=8500, datum_pressure=3800)

complete_deck = generate_complete_deck(runspec, grid, equil, wells, 730)
print(f"Generated deck: {len(complete_deck)} characters")
print("\n--- First 2000 characters ---")
print(complete_deck[:2000])
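
`RunspecData` and `GridData` each carry their own `nx`, `ny`, `nz`, so the two can drift apart and produce a deck whose DIMENS record disagrees with the array sizes in GRID. The sketch below is a small consistency check that could run before generation. It works on any two objects with those attributes, and `check_dimensions` is a hypothetical helper; the demonstration uses `SimpleNamespace` stand-ins rather than the tutorial's dataclasses.

```python
from types import SimpleNamespace
from typing import List

def check_dimensions(runspec, grid) -> List[str]:
    """Compare nx, ny, nz on two objects and report any mismatches."""
    mismatches = []
    for attr in ("nx", "ny", "nz"):
        a, b = getattr(runspec, attr), getattr(grid, attr)
        if a != b:
            mismatches.append(f"{attr}: RUNSPEC has {a}, GRID has {b}")
    return mismatches

runspec_like = SimpleNamespace(nx=20, ny=20, nz=5)
grid_like = SimpleNamespace(nx=20, ny=20, nz=5)    # matches RUNSPEC
default_grid = SimpleNamespace(nx=10, ny=10, nz=5) # GridData defaults

matching = check_dimensions(runspec_like, grid_like)
mismatched = check_dimensions(runspec_like, default_grid)
print(matching)
print(mismatched)
```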

## Section 5: OPM Flow Compatibility Check

In [None]:
# Keywords supported by OPM Flow (subset)
OPM_SUPPORTED_KEYWORDS = {
    'RUNSPEC': {'TITLE', 'DIMENS', 'OIL', 'WATER', 'GAS', 'FIELD', 'METRIC', 'START'},
    'GRID': {'DX', 'DY', 'DZ', 'TOPS', 'PORO', 'PERMX', 'PERMY', 'PERMZ', 'NTG', 'ACTNUM'},
    'PROPS': {'SWOF', 'SGOF', 'PVTW', 'PVDO', 'PVDG', 'ROCK', 'DENSITY'},
    'SOLUTION': {'EQUIL', 'PRESSURE', 'SWAT'},
    'SCHEDULE': {'WELSPECS', 'COMPDAT', 'WCONPROD', 'WCONINJE', 'TSTEP', 'END'}
}

def check_opm_compatibility(deck_text: str) -> List[str]:
    """Check deck for OPM Flow compatibility.
    
    Returns list of warnings for keywords outside OPM_SUPPORTED_KEYWORDS.
    A line counts as a keyword only if it is a single uppercase token of
    at most 8 characters, so data records (e.g. well names) are skipped.
    """
    warnings = []
    skip_next = False  # True right after TITLE, whose next line is free text
    
    all_supported = set(OPM_SUPPORTED_KEYWORDS)  # section headers
    for kws in OPM_SUPPORTED_KEYWORDS.values():
        all_supported.update(kws)
    
    for line in deck_text.split('\n'):
        line = line.split('--', 1)[0].strip()  # drop comments
        if not line:
            continue
        if skip_next:
            skip_next = False
            continue
        
        # Keyword lines: one token, letters/digits only, starting with a letter
        if not re.fullmatch(r'[A-Z][A-Z0-9]{0,7}', line):
            continue
        if line == 'TITLE':
            skip_next = True
        elif line not in all_supported:
            warnings.append(f"Keyword not in supported subset: {line}")
    
    return warnings

# Test compatibility
warnings = check_opm_compatibility(complete_deck)
if warnings:
    print("Compatibility warnings:")
    for w in warnings:
        print(f"  - {w}")
else:
    print("Deck is OPM Flow compatible!")

## Summary

In this tutorial, we learned:

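1. **Deck Structure**: ECLIPSE decks have a strict section order (RUNSPEC -> GRID -> PROPS -> SOLUTION -> SCHEDULE, with the optional EDIT, REGIONS and SUMMARY sections in between)
2. **Keyword Syntax**: Keywords are uppercase, records end with `/`, `n*value` repeats a value and `n*` defaults n items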
3. **Validation**: We can validate section ordering and keyword syntax programmatically
4. **Generation**: CLARISSA generates each section from structured data
5. **OPM Compatibility**: We check for keywords supported by the open-source simulator

**Next Tutorial**: [02_OPM_Flow_Integration.ipynb](02_OPM_Flow_Integration.ipynb) - Running simulations with OPM Flow