# Interactive Table Editor for Databricks using ipyaggrid

This notebook demonstrates how to create an interactive editable grid for Databricks tables with:
- Column selection/filtering
- Dropdown menus for specific columns
- Date picker for date columns
- Handling widget sync issues

Compatible with Databricks Runtime 17.1 ML

## 1. Installation and Setup

First, let's install ipyaggrid and its dependencies:

In [None]:
%pip install "ipyaggrid==0.3.0" "pandas==1.5.3" "notebook>=5.3" "ipywidgets>=7.6.0"

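# Enable the widget extensions. This only applies to the classic Jupyter
# Notebook front end, so a failure is reported instead of aborting the cell.
import subprocess
for package in ("ipyaggrid", "widgetsnbextension"):
    try:
        subprocess.check_call(["jupyter", "nbextension", "enable", "--py", "--sys-prefix", package])
    except (OSError, subprocess.CalledProcessError) as exc:
        print(f"Could not enable the {package} extension: {exc}")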

In [None]:
# Import required libraries
import json
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
import time

# Import ipyaggrid
from ipyaggrid import Grid
from ipywidgets import Button, HBox, VBox, HTML, Output

# For monitoring changes
import logging
logging.basicConfig(level=logging.INFO)

## 2. Connect to Databricks Table

Let's set up a function to read data from a Databricks table into a pandas DataFrame:

In [None]:
def get_databricks_table(table_name, limit=1000):
    """
    Read a Databricks table into a pandas DataFrame.
    
    Parameters:
    -----------
    table_name : str
        Full name of the table (e.g., 'database_name.table_name')
    limit : int, optional
        Maximum number of rows to retrieve
        
    Returns:
    --------
    pandas.DataFrame
        Table data as a pandas DataFrame
    """
    # In Databricks, the `spark` session is predefined in every notebook.
    # spark.table() avoids interpolating the table name into a SQL string.
    return spark.table(table_name).limit(int(limit)).toPandas()

# Example usage:
# df = get_databricks_table('default.your_table_name')

# For testing purposes, let's create a sample DataFrame
def create_sample_data():
    """Create sample data to test the grid"""
    data = {
        'id': range(1, 11),
        'name': [f'Item {i}' for i in range(1, 11)],
        'category': np.random.choice(['Type A', 'Type B', 'Type C'], 10),
        'status': np.random.choice(['Active', 'Inactive', 'Pending'], 10),
        'date_created': [(datetime.now() - timedelta(days=i*5)).strftime('%Y-%m-%d') for i in range(10)],
        'value': np.random.randint(100, 10000, 10),
        'notes': [f'Note for item {i}' for i in range(1, 11)]
    }
    return pd.DataFrame(data)

# Create sample data
df = create_sample_data()
df.head()
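
Because `get_databricks_table` receives the table name from the caller, it can help to check the name before it reaches Spark. The following is a minimal sketch, assuming names consist of one to three dot-separated parts made of letters, digits, and underscores. `validate_table_name` is a hypothetical helper, not part of any library, and it would reject names that need backtick quoting:

```python
import re

# Hypothetical helper: accepts 1-3 dot-separated identifiers
# (table, schema.table, or catalog.schema.table).
_IDENTIFIER = r"[A-Za-z_][A-Za-z0-9_]*"
_TABLE_NAME = re.compile(rf"{_IDENTIFIER}(\.{_IDENTIFIER}){{0,2}}")


def validate_table_name(table_name):
    """Return the name unchanged if it looks safe, otherwise raise ValueError."""
    if not isinstance(table_name, str) or not _TABLE_NAME.fullmatch(table_name):
        raise ValueError(f"Unexpected table name: {table_name!r}")
    return table_name
```

One possible use is `get_databricks_table(validate_table_name(name))`, so that a malformed name fails early with a clear message.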

## 3. Configure ipyaggrid Column Definitions

Now let's configure the grid with column-specific editors:

In [None]:
def configure_grid_columns(df):
    """
    Configure grid columns with appropriate editors and settings.
    
    Parameters:
    -----------
    df : pandas.DataFrame
        DataFrame to be displayed in the grid
        
    Returns:
    --------
    list
        Column definitions for ipyaggrid
    """
    column_defs = [
        # ID column - non-editable
        {
            'headerName': 'ID',
            'field': 'id',
            'editable': False,
            'width': 80,
            'filter': 'agNumberColumnFilter'
        },
        # Name column - text editable
        {
            'headerName': 'Name',
            'field': 'name',
            'editable': True,
            'width': 150,
            'filter': 'agTextColumnFilter'
        },
        # Category column - dropdown selection
        {
            'headerName': 'Category',
            'field': 'category',
            'editable': True,
            'width': 120,
            'cellEditor': 'agSelectCellEditor',
            'cellEditorParams': {
                'values': ['Type A', 'Type B', 'Type C', 'Type D']
            },
            'filter': 'agSetColumnFilter'
        },
        # Status column - dropdown selection
        {
            'headerName': 'Status',
            'field': 'status',
            'editable': True,
            'width': 120,
            'cellEditor': 'agSelectCellEditor',
            'cellEditorParams': {
                'values': ['Active', 'Inactive', 'Pending', 'Completed']
            },
            'filter': 'agSetColumnFilter'
        },
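        # Date column - date picker
        {
            'headerName': 'Date Created',
            'field': 'date_created',
            'editable': True,
            'width': 150,
            # The column holds 'YYYY-MM-DD' strings, which matches AG Grid's
            # agDateStringCellEditor. It ships only with newer AG Grid releases;
            # if the version bundled with ipyaggrid lacks it, remove this line
            # to fall back to the default text editor.
            'cellEditor': 'agDateStringCellEditor',
            'filter': 'agDateColumnFilter',
            'filterParams': {
                # The date filter needs a comparator for string values. Assumes
                # ipyaggrid evaluates strings starting with "function" as JavaScript.
                'comparator': '''function(filterDate, cellValue) {
                    if (!cellValue) return -1;
                    var p = cellValue.split("-");
                    var cellDate = new Date(Number(p[0]), Number(p[1]) - 1, Number(p[2]));
                    if (cellDate < filterDate) return -1;
                    if (cellDate > filterDate) return 1;
                    return 0;
                }'''
            }
        },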
        # Value column - numeric
        {
            'headerName': 'Value',
            'field': 'value',
            'editable': True,
            'width': 120,
            'filter': 'agNumberColumnFilter',
            'valueFormatter': 'data.value ? "$" + data.value.toLocaleString() : ""'
        },
        # Notes column - text area
        {
            'headerName': 'Notes',
            'field': 'notes',
            'editable': True,
            'width': 250,
            'cellEditor': 'agLargeTextCellEditor',
            'cellEditorParams': {
                'maxLength': 500,  # Maximum allowed length of text
                'rows': 5,         # Number of rows in the textarea
                'cols': 50         # Number of columns in the textarea
            },
            'filter': 'agTextColumnFilter'
        }
    ]
    
    return column_defs
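
The column definitions above are hard-coded for the sample data. As a rough sketch, they could instead be derived from the DataFrame itself, assuming that text columns with only a few distinct values should become dropdowns. `build_column_defs`, `read_only`, and `dropdown_threshold` are hypothetical names; the editor and filter names are the ones already used in this notebook:

```python
import pandas as pd


def build_column_defs(df, read_only=("id",), dropdown_threshold=5):
    """Derive AG Grid column definitions from a DataFrame (illustrative only)."""
    column_defs = []
    for col in df.columns:
        col_def = {
            "headerName": col.replace("_", " ").title(),
            "field": col,
            "editable": col not in read_only,
        }
        if pd.api.types.is_numeric_dtype(df[col]):
            col_def["filter"] = "agNumberColumnFilter"
        else:
            col_def["filter"] = "agTextColumnFilter"
            values = sorted(df[col].dropna().astype(str).unique())
            # Few distinct values: offer them as dropdown choices.
            if col_def["editable"] and len(values) <= dropdown_threshold:
                col_def["cellEditor"] = "agSelectCellEditor"
                col_def["cellEditorParams"] = {"values": values}
        column_defs.append(col_def)
    return column_defs


sample = pd.DataFrame({
    "id": [1, 2, 3],
    "status": ["Active", "Pending", "Active"],
    "value": [10, 20, 30],
})
defs = build_column_defs(sample)
```

With this approach the dropdown only offers values that already occur in the data, so extra choices such as `'Completed'` would still have to be added by hand.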

## 4. Create Grid with Custom Configuration

In [None]:
# Output for displaying messages
output_area = Output()

# Define grid options
def create_grid(df):
    # Configure grid columns
    column_defs = configure_grid_columns(df)
    
    # Define grid options. Row data is passed to Grid through grid_data,
    # so it is not repeated here.
    grid_options = {
        'columnDefs': column_defs,
        # Current AG Grid versions configure sorting, filtering and resizing
        # per column; defaultColDef applies the settings to every column.
        'defaultColDef': {'sortable': True, 'filter': True, 'resizable': True},
        'rowSelection': 'multiple',
        'suppressRowClickSelection': True,
        'pagination': True,
        'paginationAutoPageSize': True,
        'undoRedoCellEditing': True,  # Enable undo/redo functionality
        'undoRedoCellEditingLimit': 10, # Store up to 10 undo/redo actions
        'stopEditingWhenCellsLoseFocus': True,
        'enterMovesDown': True,
        'singleClickEdit': False
    }
    
    # Create grid. Column definitions are already part of grid_options,
    # so they are not passed a second time.
    grid = Grid(
        grid_data=df,
        grid_options=grid_options,
        index=True,  # Show row index
        quick_filter=True,  # Enable quick filtering
        export_csv=True,    # Enable CSV export
        export_excel=True,  # Enable Excel export
        show_toggle_edit=True,  # Show toggle edit button
        theme='ag-theme-alpine',  # Use Alpine theme for better contrast
        columns_fit='auto',
        sync_grid=True,
        sync_on_edit=True,  # Sync data when edited
        height=500         # Set grid height
    )
    
    return grid

## 5. Handle Data Changes and Sync Issues

In [None]:
# Create function to handle data changes
def handle_changes(grid, df):
    """
    Handle data changes and sync between grid and Databricks.
    
    Parameters:
    -----------
    grid : ipyaggrid.Grid
        The grid instance
    df : pandas.DataFrame
        The original DataFrame
    """
    # Create buttons for operations
    save_button = Button(description="Save Changes", button_style="success")
    refresh_button = Button(description="Refresh Grid", button_style="info")
    status_html = HTML("<p>Ready to edit data.</p>")
    
    # Define button click handlers
    def on_save_clicked(b):
        with output_area:
            output_area.clear_output()
            try:
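                # With sync_grid=True and sync_on_edit=True, ipyaggrid sends each
                # committed edit back to the kernel. Clicking this button takes
                # focus away from the grid, so stopEditingWhenCellsLoseFocus
                # commits any cell that is still open. No sleep is used here:
                # blocking the kernel would also block the incoming sync message.
                updated_data = grid.grid_data_out.get('grid')
                if updated_data is None:
                    # No sync has arrived yet, so nothing has been edited
                    updated_data = df
                
                # Show the current grid contents
                print(f"Grid currently holds {len(updated_data)} rows")
                display(updated_data.head())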
                
                # In a real scenario, you'd update the Databricks table here
                # Example:
                # updated_df_spark = spark.createDataFrame(updated_data)
                # updated_df_spark.createOrReplaceTempView("temp_updated_table")
                # spark.sql("MERGE INTO your_table t USING temp_updated_table s ON t.id = s.id WHEN MATCHED THEN UPDATE SET ...")
                
                status_html.value = f"<p style='color:green'>Data saved successfully at {datetime.now().strftime('%H:%M:%S')}</p>"
            except Exception as e:
                print(f"Error saving data: {str(e)}")
                status_html.value = f"<p style='color:red'>Error: {str(e)}</p>"
    
    def on_refresh_clicked(b):
        with output_area:
            output_area.clear_output()
            try:
                # Refresh data from source
                # In real scenario: refreshed_df = get_databricks_table('your_table')
                refreshed_df = create_sample_data()  # For demo purposes
                
                # Push the refreshed data to the front end
                grid.update_grid_data(refreshed_df)
                
                print("Grid data refreshed from source")
                status_html.value = f"<p style='color:blue'>Grid refreshed at {datetime.now().strftime('%H:%M:%S')}</p>"
            except Exception as e:
                print(f"Error refreshing data: {str(e)}")
                status_html.value = f"<p style='color:red'>Error: {str(e)}</p>"
    
    # Attach handlers to buttons
    save_button.on_click(on_save_clicked)
    refresh_button.on_click(on_refresh_clicked)
    
    # Return UI components
    return VBox([
        HBox([save_button, refresh_button, status_html]),
        output_area
    ])
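
The save handler above works with the whole grid rather than with the rows that actually changed. One way to isolate edited rows before writing anything back is to compare the grid output with the original DataFrame on the key column. This is a minimal sketch, assuming both frames share a unique `id` key and the same columns; `find_changed_rows` is a hypothetical helper:

```python
import pandas as pd


def find_changed_rows(original, updated, key="id"):
    """Return rows of `updated` that are new or differ from `original`."""
    # Compare as strings so values that round-trip through the browser
    # (e.g. 100 vs "100") are not reported as changes.
    orig = original.set_index(key).astype(str)
    new = updated.set_index(key).astype(str)
    # Keys that exist only in the grid output are new rows.
    is_new = ~new.index.isin(orig.index)
    common = new.index[~is_new]
    # For keys present in both frames, flag rows where any column differs.
    differs = (orig.loc[common, new.columns] != new.loc[common]).any(axis=1)
    changed_keys = list(new.index[is_new]) + list(differs[differs].index)
    return updated[updated[key].isin(changed_keys)]


original = pd.DataFrame({"id": [1, 2, 3], "status": ["Active", "Pending", "Active"]})
edited = pd.DataFrame({"id": [1, 2, 3, 4],
                       "status": ["Active", "Completed", "Active", "Pending"]})
changed = find_changed_rows(original, edited)
print(changed)
```

Passing only `changed` to the write step keeps the update small and makes it easier to show the user what is about to be saved.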


## 6. Create and Display Grid

In [None]:
# Create grid
grid = create_grid(df)

# Create controls for grid
controls = handle_changes(grid, df)

# Display grid and controls
display(VBox([grid, controls]))

## 7. Workarounds for Known Sync Issues

Here are some common widget sync issues with ipyaggrid and their workarounds:

### Common Issues and Workarounds:

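1. **Cell edits not being registered:**
   - Commit any open cell editor before reading data; `stopEditingWhenCellsLoseFocus` does this as soon as focus leaves the grid, for example when a button is clicked
   - Avoid `time.sleep()` inside widget callbacks: it blocks the kernel, so the sync message from the browser cannot be processed while the callback waits
   - Read the latest synced data from `grid.grid_data_out['grid']`

2. **Widget state not syncing between frontend and backend:**
   - Use the `sync_grid=True` and `sync_on_edit=True` options
   - For critical changes, add explicit buttons rather than relying on automatic sync
   - Call `grid.get_grid()` to request a fresh export of the grid contents; the result arrives in `grid.grid_data_out['grid']` once the kernel is free to process it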

3. **Browser compatibility issues:**
   - Chrome and Firefox work best with ipyaggrid
   - Safari may have issues with date pickers and certain editors
   - For Internet Explorer users, consider using simpler editors

4. **Performance with large datasets:**
   - Limit the number of rows loaded initially (use pagination)
   - Only edit a subset of rows at once
   - Use the `columnDefs` option to limit visible columns

5. **Focus and keyboard navigation issues:**
   - Set `enterMovesDown` to control navigation behavior
   - Use `stopEditingWhenCellsLoseFocus: true` to commit changes when clicking elsewhere
   - Set `singleClickEdit: false` to require double-clicks for editing

### Tips for Databricks Integration:

1. When saving changes back to Databricks:
   - Use temporary views and MERGE statements for atomic updates (`MERGE INTO` requires the target to be a Delta table)
   - Consider batching changes to reduce network overhead
   - Add validation before committing changes

2. For date handling:
   - Ensure consistent date formats between frontend and backend
   - Parse dates explicitly using pandas when needed

3. For dropdown values:
   - Consider dynamically populating dropdown options from actual data
   - Use Spark SQL to get distinct values: `[row.category for row in spark.sql("SELECT DISTINCT category FROM your_table").collect()]`
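
The validation and date-handling tips above can be combined into one check that runs before any write. This is a sketch, assuming the column names and allowed values of the sample data; `validate_edits` and `ALLOWED_VALUES` are hypothetical names:

```python
import pandas as pd

# Hypothetical rules; in practice these could come from
# SELECT DISTINCT queries against the source table.
ALLOWED_VALUES = {
    "category": {"Type A", "Type B", "Type C", "Type D"},
    "status": {"Active", "Inactive", "Pending", "Completed"},
}


def validate_edits(df, date_column="date_created", date_format="%Y-%m-%d"):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for column, allowed in ALLOWED_VALUES.items():
        bad = df.loc[~df[column].isin(allowed), column]
        for row_label, value in bad.items():
            problems.append(f"row {row_label}: {column}={value!r} is not allowed")
    # errors="coerce" turns unparseable dates into NaT instead of raising.
    parsed = pd.to_datetime(df[date_column], format=date_format, errors="coerce")
    for row_label in df.index[parsed.isna()]:
        problems.append(f"row {row_label}: {date_column} is not a valid date")
    return problems
```

A save handler could call `validate_edits(updated_data)` first and skip the write, showing the messages instead, whenever the returned list is not empty.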

## 8. Complete Databricks Integration Example

Here's a more complete example of how to integrate with an actual Databricks table:

In [None]:
def update_databricks_table(updated_df, table_name):
    """
    Update a Databricks table with changes from the grid.
    
    Parameters:
    -----------
    updated_df : pandas.DataFrame
        Updated DataFrame from the grid
    table_name : str
        Full name of the table to update (e.g., 'database_name.table_name')
    """
    try:
        # Convert pandas DataFrame to Spark DataFrame
        sdf = spark.createDataFrame(updated_df)
        
        # Register as temp view for SQL operations
        sdf.createOrReplaceTempView("updated_data_temp")
        
        # Example MERGE operation to update the table
        # Adjust the key columns and SET clause as needed for your table
        spark.sql(f"""
        MERGE INTO {table_name} t
        USING updated_data_temp s
        ON t.id = s.id
        WHEN MATCHED THEN UPDATE SET 
            t.name = s.name,
            t.category = s.category,
            t.status = s.status,
            t.date_created = s.date_created,
            t.value = s.value,
            t.notes = s.notes
        WHEN NOT MATCHED THEN INSERT *
        """)
        
        return True, "Table updated successfully"
    
    except Exception as e:
        return False, f"Error updating table: {str(e)}"

# Example of how to use this function with the save button:
'''
def on_save_clicked(b):
    with output_area:
        output_area.clear_output()
        try:
            # Read the latest synced grid contents; fall back to the original
            # DataFrame if nothing has been edited yet
            updated_data = grid.grid_data_out.get('grid')
            if updated_data is None:
                updated_data = df
            
            # Update Databricks table
            success, message = update_databricks_table(updated_data, 'default.your_table_name')
            
            if success:
                status_html.value = f"<p style='color:green'>{message} at {datetime.now().strftime('%H:%M:%S')}</p>"
            else:
                status_html.value = f"<p style='color:red'>{message}</p>"
                
        except Exception as e:
            print(f"Error: {str(e)}")
            status_html.value = f"<p style='color:red'>Error: {str(e)}</p>"
'''
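
The MERGE statement above lists every column by hand. As a sketch, it could be generated from the DataFrame's columns instead, assuming a single key column and identical column names in source and target. `build_merge_sql` is a hypothetical helper that only builds the SQL string and does not talk to Spark:

```python
def build_merge_sql(table_name, columns, key="id", source_view="updated_data_temp"):
    """Build a MERGE statement that upserts `columns` by `key`."""
    update_columns = [c for c in columns if c != key]
    if key not in columns or not update_columns:
        raise ValueError("columns must include the key and at least one other column")
    # One "t.col = s.col" assignment per non-key column.
    set_clause = ",\n    ".join(f"t.`{c}` = s.`{c}`" for c in update_columns)
    return (
        f"MERGE INTO {table_name} t\n"
        f"USING {source_view} s\n"
        f"ON t.`{key}` = s.`{key}`\n"
        f"WHEN MATCHED THEN UPDATE SET\n    {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT *"
    )


print(build_merge_sql("default.your_table_name", ["id", "name", "status"]))
```

Inside `update_databricks_table`, the result could be passed to `spark.sql(build_merge_sql(table_name, list(updated_df.columns)))`, which keeps the statement in step with the grid's columns.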

## 9. Browser-Specific Workarounds

Different browsers may require specific workarounds:

In [None]:
# Detect browser type (simplified example)
def detect_browser_js():
    return """
    function detectBrowser() {
        const ua = navigator.userAgent;
        if (ua.indexOf("Chrome") > -1) return "Chrome";
        if (ua.indexOf("Safari") > -1) return "Safari";
        if (ua.indexOf("Firefox") > -1) return "Firefox";
        if (ua.indexOf("MSIE") > -1 || ua.indexOf("Trident") > -1) return "IE";
        return "Unknown";
    }
    
    // Store browser info in window for access
    window.browserInfo = detectBrowser();
    
    // Apply browser-specific fixes
    if (window.browserInfo === "Safari") {
        // Safari-specific fixes
        // e.g., adjust date picker behavior
        console.log("Applied Safari-specific fixes");
    } else if (window.browserInfo === "IE") {
        // IE-specific fixes
        console.log("Applied IE-specific fixes");
    }
    
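    // window.browserInfo now holds the detected browser name. A top-level
    // return is omitted because it is a syntax error outside a function.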
    """

# Example of how to use this detection in your notebook:
'''
from IPython.display import Javascript, display

# Execute browser detection
display(Javascript(detect_browser_js()))

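# Browser-specific grid tweaks have to run before the grid is built. This sketch
# assumes ipyaggrid's js_pre_grid argument (a list of JavaScript strings) runs
# with gridOptions in scope; check the ipyaggrid documentation for your version.
safari_fix = """
    if (window.browserInfo === "Safari") {
        // Safari-specific grid configuration
        gridOptions.defaultColDef = gridOptions.defaultColDef || {};
        gridOptions.defaultColDef.cellEditorParams = gridOptions.defaultColDef.cellEditorParams || {};
        gridOptions.defaultColDef.cellEditorParams.browserFix = true;
    }
"""
# grid = Grid(grid_data=df, grid_options=grid_options, js_pre_grid=[safari_fix])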
'''

## Conclusion

This notebook provides a template for creating an interactive, editable grid for Databricks tables using ipyaggrid. It includes:

1. Custom editors for different column types (text, dropdown, date picker)
2. Solutions for handling widget sync issues
3. Browser compatibility workarounds
4. Integration with Databricks tables

Key tips for avoiding sync issues:

- Make sure open cell edits are committed before retrieving grid data
- Avoid `time.sleep()` in widget callbacks; it blocks the kernel and delays the very sync it is meant to wait for
- Use explicit save buttons rather than relying on automatic sync
- Apply browser-specific fixes where needed
- Use the grid's built-in undo/redo functionality

Remember that the performance of ipyaggrid will depend on the size of your dataset, so consider using pagination and limiting the number of rows loaded initially.