# Lab 21: Pedigree Rendering and Visualization

## Overview

In this lab, we'll explore the pedigree rendering and visualization techniques used in Bonsai v3. These techniques help users interpret the results of genetic genealogy analyses: a good rendering makes complex pedigree structures easier to read and draws attention to the genetic relationships that matter.

In [None]:
# 🧬 Google Colab Setup - Run this cell first!
import os
import sys
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import networkx as nx
from IPython.display import display, HTML, Markdown

def is_colab():
    '''Check if running in Google Colab'''
    try:
        import google.colab
        return True
    except ImportError:
        return False

if is_colab():
    print("🔬 Setting up Google Colab environment...")
    
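    # Install dependencies
    print("📦 Installing packages...")
    # System packages go first: pygraphviz needs the Graphviz headers (graphviz-dev) to build
    !apt-get update -qq && apt-get install -qq samtools bcftools tabix graphviz graphviz-dev
    # The `graphviz` Python package is required by the rendering cells later in this lab
    !pip install -q pysam biopython scikit-allel networkx pygraphviz graphviz seaborn plotly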
    
    # Create directories
    !mkdir -p /content/class_data /content/results
    
    # Download essential class data
    print("📥 Downloading class data...")
    S3_BASE = "https://computational-genetic-genealogy.s3.us-east-2.amazonaws.com/class_data/"
    data_files = [
        "pedigree.fam", "pedigree.def", 
        "merged_opensnps_autosomes_ped_sim.seg",
        "merged_opensnps_autosomes_ped_sim-everyone.fam",
        "ped_sim_run2.seg", "ped_sim_run2-everyone.fam"
    ]
    
    for file in data_files:
        !wget -q -O /content/class_data/{file} {S3_BASE}{file}
        print(f"  ✅ {file}")
    
    # Define utility functions
    def setup_environment():
        return "/content/class_data", "/content/results"
    
    def save_results(dataframe, filename, description="results"):
        os.makedirs("/content/results", exist_ok=True)
        full_path = f"/content/results/{filename}"
        dataframe.to_csv(full_path, index=False)
        display(HTML(f'''
        <div style="padding: 10px; background-color: #e3f2fd; border-left: 4px solid #2196f3; margin: 10px 0;">
            <p><strong>💾 Results saved!</strong> To download: 
            <code>from google.colab import files; files.download('{full_path}')</code></p>
        </div>
        '''))
        return full_path
    
    def save_plot(plt, filename, description="plot"):
        os.makedirs("/content/results", exist_ok=True)
        full_path = f"/content/results/{filename}"
        plt.savefig(full_path, dpi=300, bbox_inches='tight')
        plt.show()
        display(HTML(f'''
        <div style="padding: 10px; background-color: #e8f5e8; border-left: 4px solid #4caf50; margin: 10px 0;">
            <p><strong>📊 Plot saved!</strong> To download: 
            <code>from google.colab import files; files.download('{full_path}')</code></p>
        </div>
        '''))
        return full_path
    
    print("✅ Colab setup complete! Ready to explore genetic genealogy.")
    
else:
    print("🏠 Local environment detected")
    def setup_environment():
        return "class_data", "results"
    def save_results(df, filename, description=""):
        os.makedirs("results", exist_ok=True)
        path = f"results/{filename}"
        df.to_csv(path, index=False)
        return path
    def save_plot(plt, filename, description=""):
        os.makedirs("results", exist_ok=True)
        path = f"results/{filename}"
        plt.savefig(path, dpi=300, bbox_inches='tight')
        plt.show()
        return path

# Set up paths and configure visualization
DATA_DIR, RESULTS_DIR = setup_environment()
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_context("notebook")

In [None]:
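# Setup Bonsai module paths
def is_jupyterlite():
    '''Check if running in JupyterLite (Pyodide reports the platform as "emscripten")'''
    return sys.platform == "emscripten"

if not is_jupyterlite():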
    # In local environment, add the utils directory to system path
    utils_dir = os.getenv('PROJECT_UTILS_DIR', os.path.join(os.path.dirname(DATA_DIR), 'utils'))
    bonsaitree_dir = os.path.join(utils_dir, 'bonsaitree')
    
    # Add to path if it exists and isn't already there
    if os.path.exists(bonsaitree_dir) and bonsaitree_dir not in sys.path:
        sys.path.append(bonsaitree_dir)
        print(f"Added {bonsaitree_dir} to sys.path")
else:
    # In JupyterLite, use a simplified approach
    print("⚠️ Running in JupyterLite: Some Bonsai functionality may be limited.")
    print("This notebook is primarily designed for local execution where the Bonsai codebase is available.")

In [None]:
# Helper functions for exploring modules
import importlib
import inspect

def display_module_classes(module_name):
    """Display classes and their docstrings from a module"""
    try:
        # Import the module
        module = importlib.import_module(module_name)
        
        # Find all classes
        classes = inspect.getmembers(module, inspect.isclass)
        
        # Filter classes defined in this module (not imported)
        classes = [(name, cls) for name, cls in classes if cls.__module__ == module_name]
        
        # Print info for each class
        for name, cls in classes:
            print(f"\n## {name}")
            
            # Get docstring
            doc = inspect.getdoc(cls)
            if doc:
                print(f"Docstring: {doc}")
            else:
                print("No docstring available")
            
            # Get methods
            methods = inspect.getmembers(cls, inspect.isfunction)
            if methods:
                print("\nMethods:")
                for method_name, method in methods:
                    if not method_name.startswith('_'):  # Skip private methods
                        print(f"- {method_name}")
    except ImportError as e:
        print(f"Error importing module {module_name}: {e}")
    except Exception as e:
        print(f"Error processing module {module_name}: {e}")

def display_module_functions(module_name):
    """Display functions and their docstrings from a module"""
    try:
        # Import the module
        module = importlib.import_module(module_name)
        
        # Find all functions
        functions = inspect.getmembers(module, inspect.isfunction)
        
        # Filter functions defined in this module (not imported)
        functions = [(name, func) for name, func in functions if func.__module__ == module_name]
        
        # Print info for each function
        for name, func in functions:
            if name.startswith('_'):  # Skip private functions
                continue
                
            print(f"\n## {name}")
            
            # Get signature
            sig = inspect.signature(func)
            print(f"Signature: {name}{sig}")
            
            # Get docstring
            doc = inspect.getdoc(func)
            if doc:
                print(f"Docstring: {doc}")
            else:
                print("No docstring available")
    except ImportError as e:
        print(f"Error importing module {module_name}: {e}")
    except Exception as e:
        print(f"Error processing module {module_name}: {e}")

def view_function_source(module_name, function_name):
    """Display the source code of a function"""
    try:
        # Import the module
        module = importlib.import_module(module_name)
        
        # Get the function
        func = getattr(module, function_name)
        
        # Get the source code
        source = inspect.getsource(func)
        
        # Print the source code
        from IPython.display import display, Markdown
        display(Markdown(f"```python\n{source}\n```"))
    except ImportError as e:
        print(f"Error importing module {module_name}: {e}")
    except AttributeError:
        print(f"Function {function_name} not found in module {module_name}")
    except Exception as e:
        print(f"Error processing function {function_name}: {e}")

## Check Bonsai Installation

Let's verify that the Bonsai v3 module is available for import:

In [None]:
try:
    from utils.bonsaitree.bonsaitree import v3
    print("✅ Successfully imported Bonsai v3 module")
except ImportError as e:
    print(f"❌ Failed to import Bonsai v3 module: {e}")
    print("This lab requires access to the Bonsai v3 codebase.")
    print("Make sure you've properly set up your environment with the Bonsai repository.")

## Lab 21: Pedigree Rendering and Visualization

Visualizing genetic pedigrees is a critical component of genetic genealogy applications. Effective visualization helps users understand complex family structures, identify patterns, and interpret genetic relationships. Bonsai v3 includes dedicated rendering functionality in its codebase.

In this lab, we'll explore:

1. **Graph-Based Rendering**: How Bonsai represents and renders pedigrees using directed graphs
2. **Pedigree Rendering API**: Understanding the rendering functions in the Bonsai v3 codebase
3. **Customizing Visualizations**: Adding colors, labels, and highlighting to pedigrees
4. **Practical Applications**: Using pedigree visualizations to support genetic genealogy analysis

We'll be working with the actual rendering code from the Bonsai v3 codebase, examining how it utilizes Graphviz to create effective pedigree visualizations.

## Part 1: Graph-Based Rendering

Pedigrees are naturally represented as directed graphs, where nodes represent individuals and edges represent parent-child relationships. In the Bonsai v3 codebase, the rendering functionality uses Graphviz to create these visualizations.
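
To make the graph view concrete, here is a minimal sketch that turns a pedigree into Graphviz DOT text by hand. It assumes the pedigree is stored as a dictionary mapping each individual to a dictionary of its parents; `up_dict_to_dot` is a hypothetical helper written for this illustration, not part of Bonsai, and it only produces text (no Graphviz installation is needed to run it).

```python
def up_dict_to_dot(up_dct, graph_name="pedigree"):
    """Return Graphviz DOT source for a child -> parents dictionary (hypothetical helper)."""
    # Collect every ID that appears either as a child or as a parent
    nodes = set(up_dct)
    for parents in up_dct.values():
        nodes.update(parents)

    lines = [f"digraph {graph_name} {{"]
    for node in sorted(nodes):
        lines.append(f'    "{node}";')
    # One directed edge per parent-child pair, pointing from parent to child
    for child in sorted(up_dct):
        for parent in sorted(up_dct[child]):
            lines.append(f'    "{parent}" -> "{child}";')
    lines.append("}")
    return "\n".join(lines)


# Toy pedigree: individual 3 is the child of founders 1 and 2
toy_pedigree = {3: {1: 1, 2: 1}, 1: {}, 2: {}}
print(up_dict_to_dot(toy_pedigree))
```

The printed text can be pasted into any DOT viewer; the rendering functions examined in this lab build an equivalent graph through the `graphviz` Python package instead of assembling strings.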

Let's start by examining the rendering module in the Bonsai v3 codebase:

In [ ]:
# Import the rendering module from Bonsai v3
try:
    from utils.bonsaitree.bonsaitree.v3 import rendering
    
    # Display the source code for the rendering function
    # (inspect is imported here so this cell runs on its own)
    import inspect
    print(inspect.getsource(rendering.render_ped))
    
    print("✅ Successfully imported Bonsai v3 rendering module")
except ImportError as e:
    print(f"❌ Failed to import Bonsai v3 rendering module: {e}")
    print("This lab requires access to the Bonsai v3 codebase.")
    
# Check for Graphviz installation
try:
    import graphviz
    print("✅ Graphviz is installed")
    
    # Display version info (this is the version of the Python wrapper, not of the Graphviz binaries)
    graphviz_version = graphviz.__version__
    print(f"graphviz Python package version: {graphviz_version}")
except ImportError:
    print("❌ Graphviz is not installed. Install with: pip install graphviz")
    print("Additionally, the Graphviz system package may be required.")
    print("On Ubuntu/Debian: sudo apt-get install graphviz")
    print("On macOS: brew install graphviz")
    print("On Windows: download from http://www.graphviz.org/download/")
    
# Also import pedigrees module for helper functions
try:
    from utils.bonsaitree.bonsaitree.v3 import pedigrees
    print("✅ Successfully imported Bonsai v3 pedigrees module")
except ImportError as e:
    print(f"❌ Failed to import Bonsai v3 pedigrees module: {e}")

### 1.1 Understanding Pedigree Data Structures

Before we can visualize a pedigree, we need to understand how Bonsai v3 represents pedigree data. The key data structure is the "up-node dictionary", which maps each individual to their parent(s).
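
As a small illustration of the format, the sketch below inverts an up-node dictionary into a parent-to-children mapping and lists the founders. Both helpers (`invert_up_dict` and `find_founders`) are hypothetical stand-ins written in plain Python for this example; Bonsai's own functions may return a different structure.

```python
from collections import defaultdict


def invert_up_dict(up_dct):
    """Build a parent -> {child: degree} mapping from an up-node dictionary."""
    down = defaultdict(dict)
    for child, parents in up_dct.items():
        for parent, degree in parents.items():
            down[parent][child] = degree
    return dict(down)


def find_founders(up_dct):
    """Return the IDs that have no recorded parents."""
    ids = set(up_dct)
    for parents in up_dct.values():
        ids.update(parents)
    # An ID is a founder if it has no entry, or an empty entry, in up_dct
    return {i for i in ids if not up_dct.get(i)}


# Toy pedigree: 3 is the child of 1 and 2, and 4 is the child of 3
toy_up = {3: {1: 1, 2: 1}, 4: {3: 1}}
print(invert_up_dict(toy_up))         # {1: {3: 1}, 2: {3: 1}, 3: {4: 1}}
print(sorted(find_founders(toy_up)))  # [1, 2]
```

Note that individuals 1 and 2 never appear as keys in `toy_up`, yet they are still part of the pedigree; any code that enumerates individuals has to look at the parent dictionaries as well as the keys.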

Let's examine the pedigree-related functions in the Bonsai v3 codebase:

In [ ]:
# Display key functions from the pedigrees module
display_module_functions('utils.bonsaitree.bonsaitree.v3.pedigrees')

# Let's look at a few specific functions that are relevant for visualization
key_functions = [
    'get_all_id_set',
    'reverse_node_dict',
    'get_subtree_node_set'
]

import inspect  # needed for getsource below

print("\nExamining key functions for visualization:")
for func_name in key_functions:
    if hasattr(pedigrees, func_name):
        print(f"\n{func_name} source code:")
        print(inspect.getsource(getattr(pedigrees, func_name)))
    else:
        print(f"Function {func_name} not found in the pedigrees module")

### 1.2 Creating a Sample Pedigree

Now, let's create a sample pedigree that we can visualize. We'll use the up-node dictionary format used by Bonsai v3:

In [ ]:
# Create a sample pedigree using the up-node dictionary format
def create_sample_pedigree():
    """
    Create a sample pedigree for demonstration.
    
    The structure is as follows:
        1       2       3       4
         \     /         \     /
          \   /           \   /
           \ /             \ /
            5               6
             \             /
              \           /
               \         /
                \       /
                 \     /
                  \   /
                   \ /
                    7
                   / \
                  /   \
                 /     \
                8       9
    
    Returns:
        Dictionary in Bonsai's up-node format {id: {parent1: degree1, parent2: degree2}, ...}
    """
    # The up-node dictionary maps each individual to their parents
    # The format is {individual_id: {parent_id1: 1, parent_id2: 1}, ...}
    # where the value 1 represents a direct parent-child relationship (degree 1)
    
    up_dict = {
        # Individual 5 has parents 1 and 2
        5: {1: 1, 2: 1},
        
        # Individual 6 has parents 3 and 4
        6: {3: 1, 4: 1},
        
        # Individual 7 has parents 5 and 6
        7: {5: 1, 6: 1},
        
        # Individuals 8 and 9 are children of 7
        8: {7: 1},
        9: {7: 1},
        
        # Founders (individuals 1, 2, 3, 4) don't have parents
        1: {},
        2: {},
        3: {},
        4: {}
    }
    
    return up_dict

# Create the sample pedigree
sample_pedigree = create_sample_pedigree()

# Examine the pedigree
print("Sample pedigree (up-node dictionary):")
for individual, parents in sample_pedigree.items():
    parent_str = ", ".join([f"{p}" for p in parents]) if parents else "none (founder)"
    print(f"Individual {individual} - Parents: {parent_str}")

# Get all IDs in the pedigree using the Bonsai function
all_ids = pedigrees.get_all_id_set(sample_pedigree)
print(f"\nAll IDs in the pedigree: {all_ids}")

# Get the founder set (individuals with no parents in the pedigree)
if hasattr(pedigrees, 'get_founder_set'):
    founders = pedigrees.get_founder_set(sample_pedigree)
else:
    # Fallback: founders are the individuals whose parent dictionary is empty
    founders = {ind for ind, parents in sample_pedigree.items() if not parents}
print(f"Founders: {founders}")

# Create the down-node dictionary (mapping parents to children)
down_dict = pedigrees.reverse_node_dict(sample_pedigree)
print("\nDown-node dictionary (parents → children):")
for parent, children in down_dict.items():
    print(f"Individual {parent} - Children: {', '.join([str(c) for c in children])}")

### 1.3 Basic Pedigree Visualization with Bonsai's render_ped

Now let's use Bonsai's `render_ped` function to visualize our sample pedigree:

In [ ]:
# Set up a directory for rendering output
import os
render_dir = os.path.join(RESULTS_DIR, 'pedigree_renders')

# Create the directory if it doesn't exist
if not os.path.exists(render_dir):
    os.makedirs(render_dir)
    print(f"Created directory: {render_dir}")
else:
    print(f"Using existing directory: {render_dir}")
    
# Render the pedigree using Bonsai's render_ped function
try:
    # Create a simple label dictionary for better readability
    label_dict = {i: f"Person {i}" for i in sample_pedigree.keys()}
    
    # Render the pedigree
    rendering.render_ped(
        up_dct=sample_pedigree,
        name="sample_pedigree",
        out_dir=render_dir,
        label_dict=label_dict
    )
    
    # Display the rendered image
    from IPython.display import Image, display
    display(Image(os.path.join(render_dir, "sample_pedigree.png")))
    
    print(f"✅ Pedigree rendered successfully to {render_dir}/sample_pedigree.png")
except Exception as e:
    print(f"❌ Failed to render pedigree: {e}")
    
    # If the rendering fails, let's create a simple visualization using networkx
    try:
        import networkx as nx
        import matplotlib.pyplot as plt
        
        print("Falling back to NetworkX visualization...")
        
        # Create a directed graph from the up-node dictionary
        G = nx.DiGraph()
        
        # Add nodes
        for node in sample_pedigree:
            G.add_node(node)
        
        # Add edges from parents to children
        for child, parents in sample_pedigree.items():
            for parent in parents:
                G.add_edge(parent, child)
        
        # Create the visualization
        plt.figure(figsize=(10, 8))
        pos = nx.shell_layout(G)
        nx.draw(G, pos, with_labels=True, node_color='skyblue', 
                node_size=1000, font_size=12, font_weight='bold', 
                arrows=True, arrowsize=20)
        plt.title("Sample Pedigree (NetworkX Fallback)")
        plt.axis('off')
        plt.show()
    except Exception as nested_e:
        print(f"❌ NetworkX fallback also failed: {nested_e}")

## Part 2: Pedigree Rendering API

Now that we've seen a basic visualization, let's explore the pedigree rendering API in Bonsai v3 more deeply. The `render_ped` function provides several options for customizing the visualization:

In [ ]:
# Examine the render_ped function signature
import inspect

if hasattr(rendering, 'render_ped'):
    # Get the signature of the render_ped function
    sig = inspect.signature(rendering.render_ped)
    
    print("render_ped function signature:")
    print(f"rendering.render_ped{sig}")
    
    # Get parameter documentation
    doc = inspect.getdoc(rendering.render_ped)
    if doc:
        print("\nDocumentation:")
        print(doc)
    else:
        print("\nNo documentation available.")
        
    # Summarize the parameters
    print("\nParameters:")
    for param_name, param in sig.parameters.items():
        param_type = param.annotation if param.annotation != inspect.Parameter.empty else "Not specified"
        default = param.default if param.default != inspect.Parameter.empty else "Required"
        print(f"- {param_name}: {param_type} (Default: {default})")
else:
    print("render_ped function not found in the rendering module.")

### 2.1 Customizing Node Colors

One way to enhance pedigree visualizations is to use different colors to represent different attributes of individuals. Let's customize our visualization by coloring nodes based on sex and highlighting a focal individual:

In [ ]:
# Create a more complex pedigree with sex information
def create_sample_pedigree_with_sex():
    """
    Create a sample pedigree with sex information.
    
    Returns:
        Tuple of (up_dict, sex_dict)
    """
    # Define the pedigree structure (same as before)
    up_dict = {
        5: {1: 1, 2: 1},
        6: {3: 1, 4: 1},
        7: {5: 1, 6: 1},
        8: {7: 1},
        9: {7: 1},
        1: {},
        2: {},
        3: {},
        4: {}
    }
    
    # Define sex for each individual (M=male, F=female)
    sex_dict = {
        1: 'M',  # Male
        2: 'F',  # Female
        3: 'M',  # Male
        4: 'F',  # Female
        5: 'M',  # Male
        6: 'F',  # Female
        7: 'M',  # Male
        8: 'M',  # Male
        9: 'F'   # Female
    }
    
    return up_dict, sex_dict

# Create the sample pedigree with sex information
sample_pedigree, sex_dict = create_sample_pedigree_with_sex()

# Create a color dictionary based on sex
color_dict = {
    id_val: 'skyblue' if sex == 'M' else 'pink' 
    for id_val, sex in sex_dict.items()
}

# Choose a focal individual to highlight
focal_id = 7  # Person 7 (connecting two families)

# Enhanced labels with sex information
label_dict = {
    i: f"Person {i} ({sex_dict[i]})"
    for i in sample_pedigree.keys()
}

# Render the pedigree with customized colors and a focal individual
try:
    # Render the pedigree
    rendering.render_ped(
        up_dct=sample_pedigree,
        name="colored_pedigree",
        out_dir=render_dir,
        color_dict=color_dict,
        label_dict=label_dict,
        focal_id=focal_id
    )
    
    # Display the rendered image
    from IPython.display import Image, display
    display(Image(os.path.join(render_dir, "colored_pedigree.png")))
    
    print(f"✅ Colored pedigree rendered successfully")
except Exception as e:
    print(f"❌ Failed to render colored pedigree: {e}")
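
Sex is only one attribute that can drive node colors. As a sketch of another option, the code below colors individuals by generation, counting founders as generation 0. The helper `generation_depths` and the palette are assumptions made for this example (they are not part of Bonsai), and the helper assumes the pedigree contains no cycles.

```python
def generation_depths(up_dct):
    """Map each individual to a depth: founders are 0, others are 1 + their deepest parent."""
    depths = {}

    def depth(node):
        if node not in depths:
            parents = up_dct.get(node, {})
            depths[node] = (1 + max(depth(p) for p in parents)) if parents else 0
        return depths[node]

    for node in up_dct:
        depth(node)
    return depths


# Arbitrary choice of Graphviz color names, one per generation (reused cyclically)
palette = ["lightyellow", "palegreen", "lightblue", "plum"]

toy_up = {5: {1: 1, 2: 1}, 7: {5: 1, 6: 1}, 8: {7: 1}}
depths = generation_depths(toy_up)
generation_colors = {node: palette[d % len(palette)] for node, d in depths.items()}

for node in sorted(generation_colors):
    print(node, depths[node], generation_colors[node])
```

A dictionary like `generation_colors` has the same shape as the `color_dict` built above, so it could be passed to the renderer in its place.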

### 2.2 Working with Different Output Formats

Graphviz supports multiple output formats. Let's modify our approach to generate different format outputs and examine them:

In [ ]:
# Since Bonsai's render_ped function doesn't directly support format selection,
# we'll create a wrapper function that renders the pedigree and then converts to different formats

import subprocess

def render_pedigree_in_formats(up_dict, name, out_dir, formats=None, **kwargs):
    """
    Render a pedigree in multiple formats.
    
    Args:
        up_dict: The pedigree as an up-node dictionary
        name: Base name for the output files
        out_dir: Directory to save the rendered images
        formats: List of formats to generate (e.g., ["png", "svg", "pdf"])
        **kwargs: Additional arguments to pass to rendering.render_ped
        
    Returns:
        Dict mapping formats to file paths
    """
    if formats is None:
        formats = ["png"]  # Default to PNG only
    
    # Ensure the output directory exists
    if not os.path.exists(out_dir):
        os.makedirs(out_dir)
    
    # Generate the initial rendering with Bonsai's render_ped
    # This produces a file in the default format (usually PNG)
    rendering.render_ped(
        up_dct=up_dict,
        name=name,
        out_dir=out_dir,
        **kwargs
    )
    
    # Base path without extension
    base_path = os.path.join(out_dir, name)
    
    # Check which formats we need to generate
    output_files = {}
    
    # The initial rendering produces a DOT file and the default format
    dot_file = f"{base_path}"  # DOT file without extension
    default_format = "png"  # Usually PNG
    
    # Add the default format to our outputs
    output_files[default_format] = f"{base_path}.{default_format}"
    
    # Convert to additional formats if requested
    for fmt in formats:
        if fmt != default_format:
            output_file = f"{base_path}.{fmt}"
            try:
                # Use Graphviz command-line tools to convert
                subprocess.run(
                    ["dot", f"-T{fmt}", f"-o{output_file}", f"{dot_file}"],
                    check=True,
                    capture_output=True
                )
                output_files[fmt] = output_file
                print(f"Generated {fmt.upper()} format: {output_file}")
            except subprocess.CalledProcessError as e:
                print(f"Failed to generate {fmt.upper()} format: {e}")
                print(f"Error output: {e.stderr.decode() if e.stderr else 'None'}")
            except FileNotFoundError:
                print(f"Command 'dot' not found. Make sure Graphviz is installed.")
    
    return output_files

# Render the pedigree in multiple formats
try:
    # Render in PNG and SVG formats
    output_files = render_pedigree_in_formats(
        up_dict=sample_pedigree,
        name="multi_format_pedigree",
        out_dir=render_dir,
        formats=["png", "svg"],  # Removed PDF to avoid issues on some systems
        color_dict=color_dict,
        label_dict=label_dict,
        focal_id=focal_id
    )
    
    # Display the PNG rendering
    if "png" in output_files:
        from IPython.display import Image
        display(Image(output_files["png"]))
    
    # Display the SVG rendering (if available)
    if "svg" in output_files:
        from IPython.display import SVG
        display(SVG(output_files["svg"]))
except Exception as e:
    print(f"❌ Multi-format rendering failed: {e}")
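
The conversion step relies on the `dot` command-line tool, where `-T` selects the output format and `-o` names the output file. The sketch below only assembles the argument lists so they can be inspected before anything is executed; `build_dot_commands` and the file name `pedigree.gv` are hypothetical and used purely for illustration.

```python
import os


def build_dot_commands(dot_source_path, formats):
    """Return one `dot` argument list per requested format, without running anything."""
    base, _ = os.path.splitext(dot_source_path)
    commands = {}
    for fmt in formats:
        output_path = f"{base}.{fmt}"
        # -T selects the output format, -o names the output file
        commands[fmt] = ["dot", f"-T{fmt}", "-o", output_path, dot_source_path]
    return commands


for fmt, cmd in build_dot_commands("pedigree.gv", ["svg", "pdf"]).items():
    print(fmt, "->", " ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` once Graphviz is installed and the DOT source file exists.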

## Part 3: Customizing Visualizations

Let's explore more advanced customization options for pedigree visualizations:

### 3.1 Creating a Custom Pedigree Rendering Function

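Bonsai's `render_ped` function covers the common case, but the calls above only set node colors, labels, and a focal individual. Let's create an enhanced version that also controls node shapes, edge colors, node and font sizes, layout direction, and output format: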

In [ ]:
def enhanced_render_pedigree(
    up_dct, 
    name, 
    out_dir, 
    color_dict=None, 
    label_dict=None, 
    focal_id=None,
    shape_dict=None,
    edge_colors=None,
    node_size=None,
    font_size=None,
    direction="BT",  # TB=top to bottom (ancestors at top), BT=bottom to top (ancestors at bottom)
    format="png"
):
    """
    Enhanced pedigree rendering function with more customization options.
    
    Args:
        up_dct: Up-node dictionary {id: {parent1: 1, parent2: 1}, ...}
        name: Base name for the output file
        out_dir: Directory to save the rendered image
        color_dict: Dictionary mapping node IDs to colors
        label_dict: Dictionary mapping node IDs to labels
        focal_id: ID of the focal individual to highlight
        shape_dict: Dictionary mapping node IDs to shapes ('box', 'circle', 'ellipse', etc.)
        edge_colors: Dictionary mapping (parent_id, child_id) tuples to edge colors
        node_size: Base size for nodes (can be adjusted for individual nodes using shape attributes)
        font_size: Font size for node labels
        direction: Graph direction ('TB'=top to bottom, 'BT'=bottom to top, 'LR'=left to right, 'RL'=right to left)
        format: Output format (png, svg, pdf, etc.)
        
    Returns:
        Path to the rendered image
    """
    # Ensure the output directory exists
    if not os.path.exists(out_dir):
        os.makedirs(out_dir)
    
    # Get all node IDs
    all_id_set = pedigrees.get_all_id_set(up_dct)
    
    # Set default values for dictionaries
    if color_dict is None:
        color_dict = {i: 'dodgerblue' for i in all_id_set}
    
    if label_dict is None:
        label_dict = {i: str(i) for i in all_id_set}
    
    if shape_dict is None:
        shape_dict = {i: 'box' for i in all_id_set}
    
    if edge_colors is None:
        edge_colors = {}
    
    # Set default values for other properties
    if node_size is None:
        node_size = 0.5
    
    if font_size is None:
        font_size = 10
    
    # Apply highlighting to the focal individual
    # (build a new dictionary so the caller's color_dict is not modified)
    if focal_id is not None:
        color_dict = {**color_dict, focal_id: 'red'}
    
    # Create a new directed graph
    dot = graphviz.Digraph(name)
    
    # Set graph attributes
    dot.attr(rankdir=direction)  # Direction of the graph
    
    # Add nodes with attributes
    for node in all_id_set:
        # Define node attributes
        attrs = {
            'color': 'black',  # Node outline (border) color
            'fillcolor': color_dict.get(node, 'white'),  # Fill color
            'style': 'filled',  # Style
            'label': label_dict.get(node, str(node)),  # Label
            'shape': shape_dict.get(node, 'box'),  # Shape
            'width': str(node_size),  # Width
            'height': str(node_size),  # Height
            'fontsize': str(font_size)  # Font size
        }
        
        # Add the node with attributes
        dot.node(str(node), **attrs)
    
    # Add edges with attributes
    for child, parents in up_dct.items():
        for parent in parents:
            # Define edge attributes
            edge_color = edge_colors.get((parent, child), 'black')
            
            # Add the edge
            dot.edge(
                str(parent), 
                str(child), 
                color=edge_color,
                arrowhead='none'  # No arrowheads for cleaner look
            )
    
    # Render the graph
    output_path = dot.render(directory=out_dir, format=format).replace('\\', '/')
    
    return output_path

# Let's use our enhanced rendering function
try:
    # Create a more detailed set of rendering options
    
    # Custom labels with additional information
    enhanced_labels = {
        i: f"Person {i}\n{'Male' if sex_dict[i] == 'M' else 'Female'}" 
        for i in sample_pedigree.keys()
    }
    
    # Custom shapes based on sex
    shapes = {
        i: 'box' if sex_dict[i] == 'M' else 'ellipse' 
        for i in sample_pedigree.keys()
    }
    
    # Edge colors for specific relationships
    edge_colors = {
        # Highlight the path from grandparents to focal individual
        (1, 5): 'blue',
        (2, 5): 'blue',
        (5, 7): 'blue',
        (3, 6): 'green',
        (4, 6): 'green',
        (6, 7): 'green'
    }
    
    # Render the enhanced pedigree
    output_path = enhanced_render_pedigree(
        up_dct=sample_pedigree,
        name="enhanced_pedigree",
        out_dir=render_dir,
        color_dict=color_dict,
        label_dict=enhanced_labels,
        focal_id=focal_id,
        shape_dict=shapes,
        edge_colors=edge_colors,
        node_size=1.0,
        font_size=12,
        direction="TB",  # Top to bottom (ancestors at top)
        format="png"
    )
    
    # Display the rendered image
    from IPython.display import Image
    display(Image(output_path))
    
    print(f"✅ Enhanced pedigree rendered successfully to {output_path}")
    
    # Also render in bottom-to-top direction (ancestors at bottom)
    output_path_bt = enhanced_render_pedigree(
        up_dct=sample_pedigree,
        name="enhanced_pedigree_bt",
        out_dir=render_dir,
        color_dict=color_dict,
        label_dict=enhanced_labels,
        focal_id=focal_id,
        shape_dict=shapes,
        edge_colors=edge_colors,
        node_size=1.0,
        font_size=12,
        direction="BT",  # Bottom to top (ancestors at bottom)
        format="png"
    )
    
    # Display the bottom-up rendering
    display(Image(output_path_bt))
    
    print(f"✅ Bottom-up pedigree rendered successfully to {output_path_bt}")
except Exception as e:
    print(f"❌ Enhanced rendering failed: {e}")
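
The edge colors above were listed by hand. For larger pedigrees it may be easier to compute them; the sketch below walks upward from a focal individual and collects every parent-child edge on its ancestral lines. `ancestral_edges` is a hypothetical helper for this illustration and works on any up-node dictionary.

```python
def ancestral_edges(up_dct, focal_id):
    """Collect every (parent, child) edge on a path from an ancestor down to focal_id."""
    edges = set()
    seen = set()
    stack = [focal_id]
    while stack:
        child = stack.pop()
        if child in seen:
            continue  # already expanded (an ancestor reachable through two lines)
        seen.add(child)
        for parent in up_dct.get(child, {}):
            edges.add((parent, child))
            stack.append(parent)
    return edges


toy_up = {5: {1: 1, 2: 1}, 6: {3: 1, 4: 1}, 7: {5: 1, 6: 1}, 8: {7: 1}}
path_colors = {edge: "blue" for edge in ancestral_edges(toy_up, 7)}
print(sorted(path_colors))
# [(1, 5), (2, 5), (3, 6), (4, 6), (5, 7), (6, 7)]
```

The resulting dictionary uses `(parent_id, child_id)` tuples as keys, which is the shape `enhanced_render_pedigree` expects for its `edge_colors` argument; the edge from 7 to its child 8 is left out because it lies below the focal individual.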

### 3.2 Visualizing Pedigree Subtrees

In many genetic genealogy applications, it's useful to visualize specific subtrees of a larger pedigree. Let's implement a function to extract and visualize subtrees:
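
As a warm-up, the extraction step on its own can be sketched without any rendering. The hypothetical helper `descendant_subdict` below keeps a root individual and all of its descendants, and trims parent links that point outside that set; this trimming is a design choice for the example, and other definitions of a subtree are equally reasonable.

```python
def descendant_subdict(up_dct, root_id):
    """Return the up-node entries for root_id and all of its descendants."""
    # Index children by parent once so the walk does not rescan the whole pedigree
    children = {}
    for child, parents in up_dct.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)

    keep = {root_id}
    queue = [root_id]
    while queue:
        node = queue.pop()
        for child in children.get(node, []):
            if child not in keep:
                keep.add(child)
                queue.append(child)

    # Keep only the parent links that stay inside the subtree
    return {
        node: {p: d for p, d in up_dct.get(node, {}).items() if p in keep}
        for node in keep
    }


toy_up = {5: {1: 1, 2: 1}, 7: {5: 1, 6: 1}, 8: {7: 1}, 9: {7: 1}}
print(descendant_subdict(toy_up, 7))
```

Here the root 7 ends up with an empty parent dictionary, so it is drawn as the top of its own small pedigree with children 8 and 9.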

In [ ]:
# Let's create a larger pedigree for demonstration
def create_extended_pedigree():
    """
    Create a larger pedigree for demonstration.
    
    This extends our sample pedigree with additional branches.
    
    Returns:
        Dictionary in Bonsai's up-node format
    """
    # Start with our existing pedigree
    pedigree = create_sample_pedigree()
    
    # Add more individuals and relationships
    pedigree.update({
        # Add siblings to Person 7
        10: {5: 1, 6: 1},  # Sibling of 7
        11: {5: 1, 6: 1},  # Sibling of 7
        
        # Add siblings to Person 5
        12: {1: 1, 2: 1},  # Sibling of 5
        
        # Add a nuclear family connected to Person 1
        13: {},  # New founder
        14: {1: 1, 13: 1},  # Child of 1 and 13
        15: {14: 1},  # Child of 14
        16: {14: 1}   # Child of 14
    })
    
    return pedigree

# Create the extended pedigree
extended_pedigree = create_extended_pedigree()

# Use Bonsai's functions to extract subtrees
def visualize_subtree(up_dct, root_id, name, out_dir, **kwargs):
    """
    Extract and visualize a subtree from a pedigree.
    
    Args:
        up_dct: Full pedigree as an up-node dictionary
        root_id: ID of the root individual for the subtree
        name: Base name for the output file
        out_dir: Directory to save the rendered image
        **kwargs: Additional arguments to pass to enhanced_render_pedigree
        
    Returns:
        Tuple of (sub_dict, output_path)
    """
    # Extract the subtree using Bonsai's functions
    if hasattr(pedigrees, 'get_subdict'):
        # Use Bonsai's get_subdict function if available
        sub_dict = pedigrees.get_subdict(up_dct, root_id)
    else:
        # Otherwise, implement a simplified version
        # This only includes descendants of root_id
        sub_dict = {root_id: up_dct.get(root_id, {})}
        
        # Function to recursively add descendants
        def add_descendants(node_id):
            for child_id in list(up_dct.keys()):
                parents = up_dct.get(child_id, {})
                if node_id in parents:
                    if child_id not in sub_dict:
                        sub_dict[child_id] = parents
                        add_descendants(child_id)
        
        # Add all descendants
        add_descendants(root_id)
    
    # Visualize the subtree
    output_path = enhanced_render_pedigree(
        up_dct=sub_dict,
        name=name,
        out_dir=out_dir,
        **kwargs
    )
    
    return sub_dict, output_path

# Define sex information for the extended pedigree
extended_sex_dict = {
    1: 'M', 2: 'F', 3: 'M', 4: 'F', 5: 'M', 6: 'F', 7: 'M', 8: 'M', 9: 'F',
    10: 'M', 11: 'F', 12: 'F', 13: 'F', 14: 'M', 15: 'F', 16: 'M'
}

# Create color dictionary based on sex
extended_color_dict = {
    id_val: 'skyblue' if sex == 'M' else 'pink' 
    for id_val, sex in extended_sex_dict.items()
}

# Create shape dictionary based on sex
extended_shapes = {
    id_val: 'box' if sex == 'M' else 'ellipse' 
    for id_val, sex in extended_sex_dict.items()
}

# Create label dictionary
extended_labels = {
    id_val: f"P{id_val}\n({'M' if sex == 'M' else 'F'})" 
    for id_val, sex in extended_sex_dict.items()
}

# Visualize the full extended pedigree
try:
    # Render the full pedigree
    full_output = enhanced_render_pedigree(
        up_dct=extended_pedigree,
        name="extended_pedigree",
        out_dir=render_dir,
        color_dict=extended_color_dict,
        label_dict=extended_labels,
        shape_dict=extended_shapes,
        direction="TB"
    )
    
    # Display the full pedigree
    print("Full Extended Pedigree:")
    display(Image(full_output))
    
    # Visualize subtrees for different root individuals
    root_ids = [1, 5, 7, 14]
    
    for root_id in root_ids:
        # Extract and visualize the subtree
        sub_dict, output_path = visualize_subtree(
            up_dct=extended_pedigree,
            root_id=root_id,
            name=f"subtree_{root_id}",
            out_dir=render_dir,
            color_dict=extended_color_dict,
            label_dict=extended_labels,
            shape_dict=extended_shapes,
            focal_id=root_id,
            direction="TB"
        )
        
        # Display the subtree
        print(f"Subtree rooted at Person {root_id}:")
        display(Image(output_path))
except Exception as e:
    print(f"❌ Subtree visualization failed: {e}")
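
The fallback branch above finds descendants by recursively rescanning the whole dictionary at every step. As a minimal sketch of the same idea, the descendant subtree can also be collected with a single inversion of the up-node dictionary followed by a breadth-first walk. The function name `extract_descendant_subtree` and the toy pedigree are illustrative assumptions, not part of Bonsai's API.

```python
from collections import deque

def extract_descendant_subtree(up_dct, root_id):
    """Return the up-node sub-dictionary for root_id and all of its descendants."""
    # Invert the up-node dict once: parent -> list of children
    children = {}
    for child_id, parents in up_dct.items():
        for parent_id in parents:
            children.setdefault(parent_id, []).append(child_id)

    # Breadth-first walk down from the root
    sub_dict = {root_id: dict(up_dct.get(root_id, {}))}
    queue = deque([root_id])
    while queue:
        node_id = queue.popleft()
        for child_id in children.get(node_id, []):
            if child_id not in sub_dict:
                sub_dict[child_id] = dict(up_dct[child_id])
                queue.append(child_id)
    return sub_dict

# Toy pedigree (hypothetical IDs): 1 and 2 are founders, 3 is their child,
# 4 marries in, and 5 is the child of 3 and 4.
toy = {1: {}, 2: {}, 3: {1: 1, 2: 1}, 4: {}, 5: {3: 1, 4: 1}}
subtree = extract_descendant_subtree(toy, 3)
print(sorted(subtree))  # [3, 5]
```

Note that each kept individual retains both parent entries, so a renderer may still draw married-in parents (Person 4 here) even though they are not keys of the sub-dictionary.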

## Part 4: Practical Applications

Now let's look at some practical applications of pedigree visualization in genetic genealogy:

### 4.1 Visualizing IBD Sharing in a Pedigree

A key application in genetic genealogy is visualizing IBD sharing between individuals in a pedigree:

In [ ]:
import random

# Create simulated IBD data for our pedigree
def create_simulated_ibd_data(pedigree, sex_dict):
    """
    Create simulated IBD sharing data between individuals in a pedigree.
    
    Args:
        pedigree: Up-node dictionary
        sex_dict: Dictionary mapping individual IDs to sexes ('M' or 'F')
        
    Returns:
        Dictionary mapping pairs of individuals to IBD sharing metrics
    """
    # Create a down-node dictionary for easier relationship inference
    down_dict = pedigrees.reverse_node_dict(pedigree)
    
    # Define a function to check relationships
    def are_parent_child(id1, id2):
        """Check if id1 is a parent of id2 or vice versa."""
        return id1 in pedigree.get(id2, {}) or id2 in pedigree.get(id1, {})
    
    def are_siblings(id1, id2):
        """Check if id1 and id2 share at least one parent (full or half siblings)."""
        parents1 = set(pedigree.get(id1, {}).keys())
        parents2 = set(pedigree.get(id2, {}).keys())
        return len(parents1.intersection(parents2)) > 0
    
    def are_grandparent_grandchild(id1, id2):
        """Check if id1 is a grandparent of id2 or vice versa."""
        # Check if id1 is a grandparent of id2
        for parent_id in pedigree.get(id2, {}):
            if id1 in pedigree.get(parent_id, {}):
                return True
        
        # Check if id2 is a grandparent of id1
        for parent_id in pedigree.get(id1, {}):
            if id2 in pedigree.get(parent_id, {}):
                return True
        
        return False
    
    def are_cousins(id1, id2):
        """Check if id1 and id2 are first cousins."""
        # Get parents of id1 and id2
        parents1 = set(pedigree.get(id1, {}).keys())
        parents2 = set(pedigree.get(id2, {}).keys())
        
        # Check if any parent of id1 is a sibling of any parent of id2
        for p1 in parents1:
            for p2 in parents2:
                if are_siblings(p1, p2):
                    return True
        
        return False
    
    # Create a dictionary to store IBD sharing data
    ibd_data = {}
    
    # Get all individual IDs
    individuals = list(pedigree.keys())
    
    # Generate IBD data for each pair of individuals
    for i, id1 in enumerate(individuals):
        for id2 in individuals[i+1:]:  # Only process each pair once
            # Initialize sharing data
            shared_segments = []
            total_cm = 0
            
            # Determine relationship type and adjust expected sharing
            if are_parent_child(id1, id2):
                # Parent-child: ~3400 cM shared
                expected_cm = 3400
                min_segments = 22  # One per chromosome
                max_segments = 25
            elif are_siblings(id1, id2):
                # Siblings: ~2550 cM shared on average
                expected_cm = 2550
                min_segments = 35
                max_segments = 45
            elif are_grandparent_grandchild(id1, id2):
                # Grandparent-grandchild: ~1700 cM shared
                expected_cm = 1700
                min_segments = 20
                max_segments = 30
            elif are_cousins(id1, id2):
                # First cousins: ~850 cM shared
                expected_cm = 850
                min_segments = 10
                max_segments = 20
            else:
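                # More distant or unrelated: less sharing.
                # Pairs with a shared parent were already handled as siblings,
                # so test for any shared ancestor (or a direct ancestor) instead.
                ancestor_sets = []
                for start_id in (id1, id2):
                    found, stack = set(), list(pedigree.get(start_id, {}))
                    while stack:
                        anc = stack.pop()
                        if anc not in found:
                            found.add(anc)
                            stack.extend(pedigree.get(anc, {}))
                    ancestor_sets.append(found)
                anc1, anc2 = ancestor_sets
                relatives = bool(anc1 & anc2) or id1 in anc2 or id2 in anc1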
                
                if relatives:
                    # Distant relatives
                    expected_cm = 100
                    min_segments = 2
                    max_segments = 6
                else:
                    # Unrelated
                    expected_cm = 0
                    min_segments = 0
                    max_segments = 1
            
            # Add some random variation to the expected sharing
            if expected_cm > 0:
                total_cm = max(0, expected_cm * (0.9 + 0.2 * random.random()))
                
                # Generate random segments
                num_segments = random.randint(min_segments, max_segments)
                
                # Distribute total cM across segments
                segment_cms = []
                remaining_cm = total_cm
                for _ in range(num_segments - 1):
                    segment_cm = remaining_cm * random.random() * 0.3  # Take up to 30% of remaining
                    segment_cms.append(segment_cm)
                    remaining_cm -= segment_cm
                
                # Add the last segment with remaining cM
                if remaining_cm > 0:
                    segment_cms.append(remaining_cm)
                
                # Create the segment objects
                for segment_cm in segment_cms:
                    if segment_cm >= 7:  # Only include segments >= 7 cM
                        chrom = random.randint(1, 22)
                        start_pos = random.randint(1000000, 100000000)
                        end_pos = start_pos + random.randint(5000000, 50000000)
                        segment = {
                            'id1': id1,
                            'id2': id2,
                            'chromosome': str(chrom),
                            'start_pos': start_pos,
                            'end_pos': end_pos,
                            'cm': segment_cm,
                            'snps': int(segment_cm * 70)  # Approximate SNP count
                        }
                        shared_segments.append(segment)
            
            # Store the data
            pair = (min(id1, id2), max(id1, id2))
            ibd_data[pair] = {
                'segments': shared_segments,
                'total_cm': sum(seg['cm'] for seg in shared_segments),
                'num_segments': len(shared_segments)
            }
    
    return ibd_data

# Create simulated IBD data for our extended pedigree
ibd_data = create_simulated_ibd_data(extended_pedigree, extended_sex_dict)

# Display summary of the IBD data
print("IBD Sharing Summary:")
print(f"{'Individual 1':<14} {'Individual 2':<14} {'Total cM':<10} {'Segments':<10}")
print("-" * 51)

# Sort by total cM
sorted_pairs = sorted(ibd_data.items(), key=lambda x: x[1]['total_cm'], reverse=True)

for (id1, id2), data in sorted_pairs[:10]:  # Show top 10 relationships
    print(f"{id1:<14} {id2:<14} {data['total_cm']:<10.1f} {data['num_segments']:<10}")

# Visualize IBD sharing in the pedigree
def visualize_pedigree_with_ibd(pedigree, ibd_data, focal_id=None, min_cm=50):
    """
    Visualize a pedigree with IBD sharing information.
    
    Args:
        pedigree: Up-node dictionary
        ibd_data: Dictionary mapping pairs of individuals to IBD sharing metrics
        focal_id: Optional ID to highlight IBD sharing from this individual
        min_cm: Minimum cM threshold for displaying IBD connections
        
    Returns:
        Path to the rendered image
    """
    # Get all IDs in the pedigree
    all_ids = pedigrees.get_all_id_set(pedigree)
    
    # Create a new directed graph
    dot = graphviz.Digraph('pedigree_with_ibd')
    
    # Set graph attributes
    dot.attr(rankdir='TB')  # Top to bottom
    
    # Add nodes (individuals)
    for id_val in all_ids:
        # Define node attributes (unknown sex 'U' gets a neutral style)
        sex = extended_sex_dict.get(id_val, 'U')
        color = {'M': 'skyblue', 'F': 'pink'}.get(sex, 'lightgray')
        shape = {'M': 'box', 'F': 'ellipse'}.get(sex, 'diamond')
        
        # Highlight focal individual
        if id_val == focal_id:
            color = 'red'
        
        dot.node(
            str(id_val),
            label=f"P{id_val}\n({sex})",
            fillcolor=color,
            style='filled',
            shape=shape
        )
    
    # Add parent-child edges
    for child, parents in pedigree.items():
        for parent in parents:
            dot.edge(
                str(parent),
                str(child),
                color='black',
                style='solid',
                penwidth='1'
            )
    
    # Add IBD sharing edges
    for (id1, id2), data in ibd_data.items():
        total_cm = data['total_cm']
        
        # Skip pairs below the threshold; when a focal_id is given, also skip pairs that do not involve it
        if total_cm < min_cm or (focal_id is not None and focal_id not in (id1, id2)):
            continue
        
        # Calculate edge attributes based on total cM
        # Thicker edges for more sharing
        penwidth = 0.5 + min(5, total_cm / 500)
        
        # Color based on total cM: blue for low sharing, shifting to red for high sharing
        intensity = min(255, int(50 + (total_cm / 3500) * 205))
        color = f"#{intensity:02x}00{255-intensity:02x}"  # Blue (low) to red (high)
        
        # Add the IBD edge
        dot.edge(
            str(id1),
            str(id2),
            color=color,
            style='dashed',
            penwidth=str(penwidth),
            constraint='false',  # Don't use this edge for layout
            label=f"{total_cm:.1f} cM"
        )
    
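    # Render the graph (use a distinct filename per focal individual so that
    # successive calls do not overwrite each other's output)
    filename = 'pedigree_with_ibd' if focal_id is None else f'pedigree_with_ibd_focal_{focal_id}'
    output_path = dot.render(
        filename=filename,
        directory=render_dir,
        format='png'
    ).replace('\\', '/')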
    
    return output_path

# Visualize the pedigree with IBD sharing
try:
    # Full pedigree with all IBD relationships
    output_path = visualize_pedigree_with_ibd(
        pedigree=extended_pedigree,
        ibd_data=ibd_data,
        min_cm=50  # Only show relationships with at least 50 cM shared
    )
    
    # Display the rendered image
    print("Pedigree with IBD Sharing:")
    display(Image(output_path))
    
    # Pedigree with IBD sharing from a specific individual
    focal_id = 7  # Person 7
    
    output_path_focal = visualize_pedigree_with_ibd(
        pedigree=extended_pedigree,
        ibd_data=ibd_data,
        focal_id=focal_id,
        min_cm=20  # Lower threshold for focal individual
    )
    
    # Display the rendered image
    print(f"Pedigree with IBD Sharing from Person {focal_id}:")
    display(Image(output_path_focal))
except Exception as e:
    print(f"❌ Failed to visualize pedigree with IBD: {e}")
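
The expected totals hard-coded in the simulation (3400, 1700, and 850 cM) follow a simple halving rule: each additional meiosis on the path between two people halves the expected sharing, and having two common ancestors instead of one doubles it. Here is a rough sketch of that rule. The function name `expected_total_cm` and the 3400 cM genome length are assumptions chosen to match the constants used in the cell above. Full siblings are a special case: the rule gives 3400 cM because regions shared on both haplotypes are counted twice, whereas the ~2550 cM figure above counts those regions once.

```python
GENOME_CM = 3400  # assumed autosomal length, matching the parent-child value above

def expected_total_cm(meioses, common_ancestors=1, genome_cm=GENOME_CM):
    """Approximate expected IBD sharing (cM) for a relationship.

    meioses: number of meioses on the path between the two people through
             one common ancestor (parent-child = 1, first cousins = 4)
    common_ancestors: 1 for lineal or half relationships, 2 for full relationships
    """
    if meioses < 1:
        raise ValueError("meioses must be at least 1")
    return genome_cm * common_ancestors * 2.0 ** (1 - meioses)

# Relationship name, meioses, number of common ancestors
examples = [
    ("parent-child", 1, 1),
    ("grandparent", 2, 1),
    ("half siblings", 2, 1),
    ("first cousins", 4, 2),
]
for rel_name, m, a in examples:
    print(f"{rel_name:<15} {expected_total_cm(m, a):7.1f} cM")
```

These are averages only; real sharing varies around them, which is what the random factor in the simulation loosely imitates.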

### 4.2 Visualizing Chromosome Painting

Another useful visualization in genetic genealogy is chromosome painting, which shows where along each chromosome an individual shares IBD segments with each relative:

In [ ]:
def create_chromosome_painting(individual_id, ibd_data, figsize=(15, 10)):
    """
    Create a chromosome painting visualization for an individual.
    
    Args:
        individual_id: ID of the individual to visualize
        ibd_data: Dictionary mapping pairs of individuals to IBD sharing data
        figsize: Figure size (width, height)
        
    Returns:
        Matplotlib figure
    """
    # Extract segments involving the individual
    segments = []
    
    for (id1, id2), data in ibd_data.items():
        if id1 == individual_id or id2 == individual_id:
            # Get the other individual's ID
            other_id = id2 if id1 == individual_id else id1
            
            # Add segments to the list
            for segment in data['segments']:
                segments.append({
                    'chromosome': segment['chromosome'],
                    'start_pos': segment['start_pos'],
                    'end_pos': segment['end_pos'],
                    'cm': segment['cm'],
                    'other_id': other_id
                })
    
    # If no segments, return an empty figure
    if not segments:
        fig, ax = plt.subplots(figsize=figsize)
        ax.text(0.5, 0.5, "No IBD segments found", ha='center', va='center', fontsize=14)
        ax.axis('off')
        return fig
    
    # Sort segments by chromosome and position
    segments.sort(key=lambda s: (int(s['chromosome']) if s['chromosome'].isdigit() else 999, s['start_pos']))
    
    # Get the unique chromosomes
    chromosomes = sorted(set(s['chromosome'] for s in segments), 
                        key=lambda x: int(x) if x.isdigit() else 999)
    
    # Approximate chromosome lengths (in base pairs)
    chrom_lengths = {
        '1': 248956422, '2': 242193529, '3': 198295559, 
        '4': 190214555, '5': 181538259, '6': 170805979,
        '7': 159345973, '8': 145138636, '9': 138394717,
        '10': 133797422, '11': 135086622, '12': 133275309,
        '13': 114364328, '14': 107043718, '15': 101991189,
        '16': 90338345, '17': 83257441, '18': 80373285,
        '19': 58617616, '20': 64444167, '21': 46709983,
        '22': 50818468, 'X': 156040895, 'Y': 57227415
    }
    
    # Create a figure with one subplot per chromosome
    fig, axs = plt.subplots(len(chromosomes), 1, figsize=figsize, 
                           squeeze=False, sharex=True, gridspec_kw={'hspace': 0.3})
    axs = axs.flatten()
    
    # Create a color map for each unique "other_id"
    # (tab20 provides 20 distinct colors, so colors repeat only beyond 20 relatives)
    other_ids = sorted(set(s['other_id'] for s in segments))
    colors = plt.cm.tab20.colors
    color_map = {other_id: colors[i % len(colors)] for i, other_id in enumerate(other_ids)}
    
    # Draw segments on each chromosome
    for i, chrom in enumerate(chromosomes):
        ax = axs[i]
        
        # Get chromosome length
        chrom_length = chrom_lengths.get(chrom, 200000000)
        
        # Draw chromosome backbone
        ax.plot([0, chrom_length], [0, 0], 'k-', linewidth=2)
        
        # Filter segments for this chromosome
        chrom_segments = [s for s in segments if s['chromosome'] == chrom]
        
        # Draw segments
        for segment in chrom_segments:
            other_id = segment['other_id']
            color = color_map[other_id]
            
            # Draw a thick line for the segment
            ax.plot(
                [segment['start_pos'], segment['end_pos']],
                [0, 0],
                '-',
                linewidth=10,
                color=color,
                solid_capstyle='butt',
                alpha=0.7
            )
            
            # Add label if segment is large enough
            if segment['cm'] > 20:
                ax.text(
                    (segment['start_pos'] + segment['end_pos']) / 2,
                    0.1,
                    f"P{other_id} ({segment['cm']:.1f} cM)",
                    ha='center',
                    va='bottom',
                    fontsize=8,
                    rotation=0,
                    bbox=dict(facecolor='white', alpha=0.7, edgecolor='none')
                )
        
        # Set y-axis limits and remove ticks
        ax.set_ylim(-0.5, 0.5)
        ax.set_yticks([])
        
        # Add chromosome label
        ax.text(
            -0.02 * chrom_length,
            0,
            f"Chr {chrom}",
            ha='right',
            va='center',
            fontsize=10,
            fontweight='bold'
        )
        
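        # All subplots share one x-axis, so use a common range wide enough for the
        # longest chromosome shown (per-chromosome limits would overwrite each other)
        max_length = max(chrom_lengths.get(c, 200000000) for c in chromosomes)
        ax.set_xlim(-0.05 * max_length, 1.05 * max_length)
        
        # Only show x-axis tick labels on the bottom plot
        if i < len(chromosomes) - 1:
            ax.tick_params(labelbottom=False)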
    
    # Format x-axis ticks as Mb
    def format_mb(x, pos):
        return f"{x / 1_000_000:.0f}"
    
    axs[-1].xaxis.set_major_formatter(plt.FuncFormatter(format_mb))
    axs[-1].set_xlabel('Position (Mb)')
    
    # Add a legend
    handles = [plt.Line2D([0], [0], color=color_map[i], lw=4) for i in other_ids]
    labels = [f"Person {i}" for i in other_ids]
    fig.legend(handles, labels, loc='upper right', bbox_to_anchor=(0.95, 0.95))
    
    # Add title
    fig.suptitle(f"Chromosome Painting for Person {individual_id}", fontsize=16, y=0.98)
    
    plt.tight_layout()
    plt.subplots_adjust(top=0.95, right=0.95)
    
    return fig

# Create chromosome paintings for selected individuals
try:
    # Select individuals for chromosome painting
    individuals_to_visualize = [7, 8, 9, 5]
    
    for individual_id in individuals_to_visualize:
        fig = create_chromosome_painting(individual_id, ibd_data)
        
        # Save the figure
        output_path = os.path.join(render_dir, f"chromosome_painting_{individual_id}.png")
        fig.savefig(output_path, dpi=120, bbox_inches='tight')
        
        # Display the figure
        print(f"Chromosome Painting for Person {individual_id}:")
        display(Image(output_path))
        plt.close(fig)  # Close the figure to free memory
except Exception as e:
    print(f"❌ Failed to create chromosome painting: {e}")
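
In the painting above, every segment is drawn at the same height, so segments from different relatives that overlap on a chromosome are painted on top of each other. One possible refinement, sketched here under the assumption that segments are simple `(start, end)` tuples, is to assign overlapping segments to separate "lanes" with a greedy pass. The helper `assign_lanes` is hypothetical and not part of Bonsai.

```python
def assign_lanes(intervals):
    """Greedily assign each (start, end) interval to the lowest free lane.

    Returns a list of lane indices in the same order as the input.
    """
    # Process intervals from left to right by start position
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    lane_ends = []                # lane_ends[k] = end of the last interval in lane k
    lanes = [0] * len(intervals)
    for i in order:
        start, end = intervals[i]
        for lane, last_end in enumerate(lane_ends):
            if start >= last_end:         # fits after the previous interval in this lane
                lane_ends[lane] = end
                lanes[i] = lane
                break
        else:
            lane_ends.append(end)         # no free lane: open a new one
            lanes[i] = len(lane_ends) - 1
    return lanes

# Hypothetical segments on one chromosome (start, end in base pairs)
segments = [(10_000_000, 40_000_000), (30_000_000, 60_000_000), (45_000_000, 70_000_000)]
lanes = assign_lanes(segments)
print(lanes)  # [0, 1, 0]
```

The lane index could then be used as a small vertical offset when plotting each segment, instead of drawing everything at `y = 0`.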

## Summary

In this lab, we explored pedigree rendering and visualization techniques used in Bonsai v3. Key takeaways include:

1. **Graph-Based Rendering**: We examined how Bonsai v3 represents pedigrees as directed graphs and uses Graphviz for visualization.

2. **Pedigree Rendering API**: We learned about the `render_ped` function in Bonsai's rendering module and its parameters for customizing pedigree visualizations.

3. **Customization Options**: We explored how to enhance pedigree visualizations with colors, shapes, labels, and highlighted edges to represent different attributes and relationships.

4. **Practical Applications**: We demonstrated how to visualize IBD sharing in the context of a pedigree and create chromosome paintings to show segment sharing between individuals.

These visualization techniques are essential tools for genetic genealogy, helping users understand complex family structures and genetic relationships. The Bonsai v3 codebase provides a solid foundation for creating these visualizations, and we've seen how to extend and customize them for specific applications.
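
As a compact recap of the graph-based idea, the sketch below turns an up-node dictionary into Graphviz DOT source by emitting one parent-to-child edge per entry. It builds the DOT text by hand purely for illustration; `up_dict_to_dot` is a hypothetical helper and is not how Bonsai's `render_ped` is implemented.

```python
def up_dict_to_dot(up_dct, name="pedigree"):
    """Build DOT source with one parent -> child edge per up-node entry."""
    lines = [f"digraph {name} {{", "    rankdir=TB;"]

    # Collect every individual, including parents that are not keys themselves
    node_ids = set(up_dct)
    for parents in up_dct.values():
        node_ids.update(parents)

    for node_id in sorted(node_ids):
        lines.append(f'    "{node_id}";')
    for child_id in sorted(up_dct):
        for parent_id in sorted(up_dct[child_id]):
            lines.append(f'    "{parent_id}" -> "{child_id}";')
    lines.append("}")
    return "\n".join(lines)

# Toy pedigree (hypothetical IDs): two founders and one child
toy = {1: {}, 2: {}, 3: {1: 1, 2: 1}}
dot_source = up_dict_to_dot(toy)
print(dot_source)
```

The resulting text can be saved to a `.dot` file and rendered with the Graphviz `dot` command-line tool.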

In [ ]:
# Convert this notebook to PDF using poetry
!poetry run jupyter nbconvert --to pdf Lab21_Pedigree_Rendering.ipynb

# Note: PDF conversion requires LaTeX to be installed on your system
# If you encounter errors, you may need to install it:
# On Ubuntu/Debian: sudo apt-get install texlive-xetex
# On macOS with Homebrew: brew install texlive