# Type-Safe CIF Parsing

This notebook demonstrates type checking with cif_parser.

Any CIF value may be numeric, text, or a special marker, so the type stubs mark most accessors as `Optional`. They still provide IDE autocomplete and static type checking.

You can verify type correctness with:
```bash
mypy examples/type_checking_example.py
pyright examples/type_checking_example.py
```

In [None]:
from typing import Dict, List, Optional, Union

import cif_parser

## Sample CIF Content

In [None]:
cif_content = """
data_crystal
_cell_length_a    10.5
_cell_length_b    10.5
_cell_length_c    15.2
_cell_angle_alpha 90.0
_title 'Example Crystal Structure'
_temperature_kelvin ?

loop_
_atom_site_label
_atom_site_type_symbol
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
C1   C   0.1234  0.2345  0.3456  1.00
C2   C   0.2345  0.3456  0.4567  1.00
N1   N   0.3456  0.4567  0.5678  0.95
O1   O   0.4567  0.5678  0.6789  1.00
"""

## Parse the Document

The type checker knows that `parse()` returns `Document` and `first_block()` returns `Optional[Block]`.

In [None]:
doc: cif_parser.Document = cif_parser.parse(cif_content)
block: Optional[cif_parser.Block] = doc.first_block()

if block is None:
    raise ValueError("No blocks found!")

# Type checker knows block.name is str
print(f"Block name: {block.name}")

## 1. Type-Safe Numeric Value Access

The type checker knows:
- `is_numeric` is `bool`
- `numeric` is `Optional[float]`

In [None]:
cell_a: Optional[cif_parser.Value] = block.get_item("_cell_length_a")

if cell_a is not None:
    if cell_a.is_numeric:
        length: Optional[float] = cell_a.numeric
        if length is not None:
            # Now type checker knows length is definitely float
            doubled: float = length * 2.0
            print(f"Cell length a: {length}")
            print(f"Doubled: {doubled}")
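
The nested checks above can be folded into a small helper. The following is a minimal sketch, not part of `cif_parser`: `NumericLike`, `numeric_or`, and `FakeValue` are hypothetical names, and `FakeValue` is only a stand-in so the cell runs without the library. It assumes nothing beyond the `is_numeric` and `numeric` attributes used above.

```python
from typing import Optional, Protocol


class NumericLike(Protocol):
    """Structural type: anything exposing the two attributes used above."""

    @property
    def is_numeric(self) -> bool: ...

    @property
    def numeric(self) -> Optional[float]: ...


def numeric_or(value: Optional[NumericLike], default: float) -> float:
    """Return value.numeric when it is available, otherwise the default."""
    if value is None or not value.is_numeric:
        return default
    number = value.numeric
    # The None check narrows Optional[float] to float for the type checker.
    return default if number is None else number


class FakeValue:
    """Hypothetical stand-in for a parsed value, for demonstration only."""

    def __init__(self, number: Optional[float]) -> None:
        self.is_numeric = number is not None
        self.numeric = number


print(numeric_or(FakeValue(10.5), 0.0))
print(numeric_or(None, 0.0))
```

Because `NumericLike` is a `Protocol`, any object with matching attributes should satisfy it structurally, so the helper does not need to import the library's own classes.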

## 2. Type-Safe Text Value Access

In [None]:
title: Optional[cif_parser.Value] = block.get_item("_title")

if title is not None and title.is_text:
    text: Optional[str] = title.text
    if text is not None:
        # Type checker knows text is str
        upper: str = text.upper()
        print(f"Title: {text}")
        print(f"Uppercase: {upper}")

## 3. Handling Special Values

CIF has special values for unknown (`?`) and not applicable (`.`).

In [None]:
temp: Optional[cif_parser.Value] = block.get_item("_temperature_kelvin")

if temp is not None:
    if temp.is_unknown:
        print("Temperature: Unknown (?)")
    elif temp.is_not_applicable:
        print("Temperature: Not applicable (.)")
    elif temp.is_numeric and temp.numeric is not None:
        print(f"Temperature: {temp.numeric} K")
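
To make the four cases explicit, here is a simplified sketch that classifies raw, unquoted tokens. It is not how `cif_parser` works internally: `Kind` and `classify_token` are hypothetical names, and the sketch ignores quoting and standard uncertainties such as `10.5(3)`, and accepts anything Python's `float()` can parse.

```python
from enum import Enum


class Kind(Enum):
    """The four cases distinguished in the cell above."""

    UNKNOWN = "unknown"
    NOT_APPLICABLE = "not_applicable"
    NUMERIC = "numeric"
    TEXT = "text"


def classify_token(token: str) -> Kind:
    """Classify a raw, unquoted token (simplified; see assumptions above)."""
    if token == "?":
        return Kind.UNKNOWN
    if token == ".":
        return Kind.NOT_APPLICABLE
    try:
        float(token)
    except ValueError:
        # Anything float() rejects is treated as text in this sketch.
        return Kind.TEXT
    return Kind.NUMERIC


for token in ["?", ".", "10.5", "C1"]:
    print(f"{token!r}: {classify_token(token).name}")
```

Checking the special markers before attempting numeric conversion mirrors the branch order in the cell above.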

## 4. Type-Safe Loop Iteration

The type checker knows:
- `loop.tags` is `List[str]`
- Iteration yields `Dict[str, Value]`

In [None]:
loop: Optional[cif_parser.Loop] = block.get_loop(0)

if loop is not None:
    tags: List[str] = loop.tags
    print(f"Loop has {len(tags)} columns: {tags[:3]}...")

    # Count carbon atoms
    carbon_count: int = 0
    for row in loop:
        # row is Dict[str, Value]
        atom_type: cif_parser.Value = row["_atom_site_type_symbol"]
        if atom_type.is_text and atom_type.text == "C":
            carbon_count += 1

    print(f"Found {carbon_count} carbon atoms")
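
The carbon count generalizes to a tally over every symbol. This is a sketch under stated assumptions: `TextLike`, `count_symbols`, and `FakeText` are hypothetical names, and `FakeText` stands in for a parsed value so the cell runs without the library. Only the `is_text` and `text` attributes shown above are assumed.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Optional, Protocol


class TextLike(Protocol):
    """Structural type for values exposing is_text and text."""

    @property
    def is_text(self) -> bool: ...

    @property
    def text(self) -> Optional[str]: ...


def count_symbols(rows: Iterable[Mapping[str, TextLike]], tag: str) -> Dict[str, int]:
    """Count how often each text value appears in the given column."""
    counts: Counter = Counter()
    for row in rows:
        value = row.get(tag)
        # Skip rows where the tag is missing or the value is not text.
        if value is not None and value.is_text and value.text is not None:
            counts[value.text] += 1
    return dict(counts)


class FakeText:
    """Hypothetical stand-in for a text value, for demonstration only."""

    def __init__(self, text: str) -> None:
        self.is_text = True
        self.text: Optional[str] = text


demo_rows = [{"_atom_site_type_symbol": FakeText(s)} for s in ["C", "C", "N", "O"]]
print(count_symbols(demo_rows, "_atom_site_type_symbol"))
```

`Mapping` is used instead of `Dict` in the parameter type because `Mapping` is covariant in its value type, so rows holding a concrete value class are accepted by the type checker.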

## 5. Converting to Python Types

`to_python()` returns `Union[str, float, None]`, which is convenient when handing rows to DuckDB or pandas.

In [None]:
# Re-get loop
loop = block.get_loop(0)

if loop is not None:
    rows: List[Dict[str, Union[str, float, None]]] = []

    for row in loop:
        python_row: Dict[str, Union[str, float, None]] = {}

        for tag, value in row.items():
            python_value: Union[str, float, None] = value.to_python()
            python_row[tag] = python_value

        rows.append(python_row)

    print(f"Converted {len(rows)} rows to Python types")
    if rows:  # an empty loop has no first row to inspect
        print(f"First row keys: {list(rows[0].keys())[:3]}...")
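
Column-oriented tools often prefer one list per tag instead of one dict per row. The sketch below pivots rows that are already plain Python values, shaped like the `rows` list built above. `Cell` and `rows_to_columns` are hypothetical names, not part of `cif_parser`.

```python
from typing import Dict, List, Union

Cell = Union[str, float, None]


def rows_to_columns(rows: List[Dict[str, Cell]]) -> Dict[str, List[Cell]]:
    """Pivot a list of row dicts into a dict of equally long columns."""
    columns: Dict[str, List[Cell]] = {}
    for index, row in enumerate(rows):
        for tag, cell in row.items():
            if tag not in columns:
                # Back-fill a tag first seen in a later row.
                columns[tag] = [None for _ in range(index)]
            columns[tag].append(cell)
        for column in columns.values():
            # Pad tags that this row did not provide.
            if len(column) <= index:
                column.append(None)
    return columns


demo = [
    {"_atom_site_label": "C1", "_atom_site_fract_x": 0.1234},
    {"_atom_site_label": "N1", "_atom_site_fract_x": 0.3456},
]
print(rows_to_columns(demo))
```

A dict of equal-length lists like this can be passed directly to `pandas.DataFrame(...)` if pandas is installed.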

## 6. Document Iteration

In [None]:
block_count: int = len(doc)
print(f"Document has {block_count} block(s)")

for blk in doc:
    # Type checker knows blk is Block
    name: str = blk.name
    num_items: int = len(blk.item_keys)
    print(f"  - Block '{name}' with {num_items} items")

## 7. Dictionary-Style Block Access

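Access blocks by index or name. The type checker knows both forms return `Block`, but Python type checkers do not track exceptions: a missing index raises `IndexError` and a missing name raises `KeyError` at runtime, so handle them explicitly.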

In [None]:
# Access by index
try:
    first_block: cif_parser.Block = doc[0]
    print(f"Block 0: {first_block.name}")
except IndexError:
    print("No block at index 0")

# Access by name
try:
    named_block: cif_parser.Block = doc["crystal"]
    print(f"Block 'crystal': {named_block.name}")
except KeyError:
    print("Block 'crystal' not found")
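
If you prefer an `Optional` result over exception handling at each call site, the two `try` blocks can be wrapped once. This is a sketch, not part of `cif_parser`: `Indexable`, `get_or_none`, and `FakeDocument` are hypothetical names, and `FakeDocument` only mimics index and name lookup so the cell runs without the library.

```python
from typing import List, Optional, Protocol, TypeVar, Union

T_co = TypeVar("T_co", covariant=True)


class Indexable(Protocol[T_co]):
    """Anything that supports lookup by integer index or string name."""

    def __getitem__(self, key: Union[int, str]) -> T_co: ...


def get_or_none(container: Indexable[T_co], key: Union[int, str]) -> Optional[T_co]:
    """Return container[key], or None when the lookup fails."""
    try:
        return container[key]
    except (IndexError, KeyError):
        return None


class FakeDocument:
    """Hypothetical stand-in that mimics index and name lookup."""

    def __init__(self, names: List[str]) -> None:
        self._names = names

    def __getitem__(self, key: Union[int, str]) -> str:
        if isinstance(key, int):
            return self._names[key]  # raises IndexError when out of range
        if key not in self._names:
            raise KeyError(key)
        return key


demo_doc = FakeDocument(["crystal"])
print(get_or_none(demo_doc, 0))
print(get_or_none(demo_doc, "missing"))
```

Returning `Optional` moves the failure case into the type system, so the checker can flag call sites that forget the `None` check.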

## Summary

Key type-safety features:

- `Optional` return types for lookups that may fail
- `is_numeric`, `is_text`, `is_unknown`, and `is_not_applicable` boolean checks
- `to_python()` for easy conversion to native types
- Full IDE autocomplete support