# nwgrep Highlighting with Pandas

This notebook demonstrates highlighting matching rows in pandas DataFrames using `nwgrep`.

When `highlight=True`, nwgrep returns a pandas `Styler` object (`pandas.io.formats.style.Styler`) that renders matching rows with a yellow background in Jupyter notebooks.

In [1]:
from __future__ import annotations

import pandas as pd

from nwgrep import nwgrep, register_grep_accessor

# Create sample dataframe
df = pd.DataFrame({
    "timestamp": ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"],
    "level": ["INFO", "ERROR", "WARN", "ERROR"],
    "message": [
        "Application started",
        "Database connection failed",
        "Slow query detected",
        "Retry attempt failed",
    ],
})

print("Original DataFrame:")
df

Original DataFrame:


|   | timestamp | level | message |
|---|---|---|---|
| 0 | 2024-01-01 | INFO | Application started |
| 1 | 2024-01-02 | ERROR | Database connection failed |
| 2 | 2024-01-03 | WARN | Slow query detected |
| 3 | 2024-01-04 | ERROR | Retry attempt failed |


## Basic Highlighting

Highlight rows containing "ERROR" with a yellow background:

In [2]:
# Highlight matching rows
result = nwgrep(df, "ERROR", highlight=True)
result

|   | timestamp | level | message |
|---|---|---|---|
| 1 | 2024-01-02 | ERROR | Database connection failed |
| 3 | 2024-01-04 | ERROR | Retry attempt failed |
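
To make the idea of row highlighting concrete, here is a minimal sketch in plain pandas. It is not nwgrep's implementation; the helper names `row_mask` and `highlight_css` are hypothetical, and literal substring matching is an assumption made for illustration. The sketch builds a boolean mask of matching rows and then a same-shaped frame of CSS strings, which is the form `Styler.apply(..., axis=None)` expects.

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"],
    "level": ["INFO", "ERROR", "WARN", "ERROR"],
    "message": ["Application started", "Database connection failed",
                "Slow query detected", "Retry attempt failed"],
})


def row_mask(frame: pd.DataFrame, pattern: str) -> pd.Series:
    """True for each row where any cell contains `pattern` as literal text."""
    as_text = frame.astype(str)
    hits = as_text.apply(lambda col: col.str.contains(pattern, regex=False))
    return hits.any(axis=1)


def highlight_css(frame: pd.DataFrame, pattern: str, color: str = "yellow") -> pd.DataFrame:
    """Return a frame of CSS strings with the same shape as `frame`."""
    css = pd.DataFrame("", index=frame.index, columns=frame.columns)
    css.loc[row_mask(frame, pattern), :] = f"background-color: {color}"
    return css


css = highlight_css(df, "ERROR")
# Index labels of the rows that received a background color.
matching_rows = css.index[css["level"] != ""].tolist()
```

In a notebook, `df.style.apply(highlight_css, axis=None, pattern="ERROR")` would turn this CSS frame into a styled table. Note that this sketch keeps every row and only colors the matches, which may differ from what nwgrep displays.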


## Highlighting with Case-Insensitive Search

With `case_sensitive=False`, the lowercase pattern "error" still matches the uppercase "ERROR" values:

In [3]:
# Case-insensitive highlighting
result = nwgrep(df, "error", case_sensitive=False, highlight=True)
result

|   | timestamp | level | message |
|---|---|---|---|
| 1 | 2024-01-02 | ERROR | Database connection failed |
| 3 | 2024-01-04 | ERROR | Retry attempt failed |


## Highlighting with Regex Patterns

Highlight rows matching a regex pattern. Every row below matches because each `timestamp` value is a date:

In [4]:
# Highlight rows with dates
result = nwgrep(df, r"\d{4}-\d{2}-\d{2}", regex=True, highlight=True)
result

|   | timestamp | level | message |
|---|---|---|---|
| 0 | 2024-01-01 | INFO | Application started |
| 1 | 2024-01-02 | ERROR | Database connection failed |
| 2 | 2024-01-03 | WARN | Slow query detected |
| 3 | 2024-01-04 | ERROR | Retry attempt failed |
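
The difference between regex and literal matching can be checked with plain pandas. The sketch below uses `Series.str.contains` directly rather than nwgrep, so it only illustrates the general behaviour: read as a regex, the date pattern matches every timestamp, while read as literal text it matches none of them.

```python
import pandas as pd

timestamps = pd.Series(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"])
date_pattern = r"\d{4}-\d{2}-\d{2}"

# Interpreted as a regular expression: four digits, dash, two digits, dash, two digits.
as_regex = timestamps.str.contains(date_pattern, regex=True)

# Interpreted literally: looks for the characters backslash, "d", "{4}", and so on.
as_literal = timestamps.str.contains(date_pattern, regex=False)
```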


## Highlighting with Column Filtering

Restrict the search to specific columns and highlight only the rows that match there:

In [5]:
# Only search in the 'message' column
result = nwgrep(df, "connection", case_sensitive=False, columns=["message"], highlight=True)
result

|   | timestamp | level | message |
|---|---|---|---|
| 1 | 2024-01-02 | ERROR | Database connection failed |
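
As a rough illustration of column filtering, the following plain-pandas sketch limits the search to a chosen subset of columns before building the row mask. The helper `row_mask_in` is hypothetical and is not part of nwgrep.

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"],
    "level": ["INFO", "ERROR", "WARN", "ERROR"],
    "message": ["Application started", "Database connection failed",
                "Slow query detected", "Retry attempt failed"],
})


def row_mask_in(frame: pd.DataFrame, pattern: str, columns: list[str],
                case: bool = True) -> pd.Series:
    """True for rows where any of the selected columns contains `pattern`."""
    subset = frame[columns].astype(str)
    hits = subset.apply(lambda col: col.str.contains(pattern, case=case, regex=False))
    return hits.any(axis=1)


# Searching "message" finds the row; searching only "level" finds nothing.
in_message = row_mask_in(df, "connection", ["message"], case=False)
in_level = row_mask_in(df, "connection", ["level"], case=False)
```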


## Using the Accessor

You can also use the `.grep()` accessor with highlighting:

In [6]:
# Register the accessor
register_grep_accessor()

# Use df.grep() with highlighting
df.grep("failed", case_sensitive=False, highlight=True)

|   | timestamp | level | message |
|---|---|---|---|
| 1 | 2024-01-02 | ERROR | Database connection failed |
| 3 | 2024-01-04 | ERROR | Retry attempt failed |
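
For readers curious how a callable accessor such as `df.grep(...)` can exist at all, pandas offers `pd.api.extensions.register_dataframe_accessor`. The sketch below registers a hypothetical `minigrep` accessor that filters rows by literal substring. It is an assumption-laden illustration of the mechanism, not a description of what `register_grep_accessor()` does internally.

```python
import pandas as pd


@pd.api.extensions.register_dataframe_accessor("minigrep")
class MiniGrepAccessor:
    """Hypothetical accessor: df.minigrep(pattern) returns the matching rows."""

    def __init__(self, pandas_obj: pd.DataFrame) -> None:
        self._frame = pandas_obj

    def __call__(self, pattern: str, case_sensitive: bool = True) -> pd.DataFrame:
        text = self._frame.astype(str)
        hits = text.apply(
            lambda col: col.str.contains(pattern, case=case_sensitive, regex=False)
        )
        # Keep rows where at least one cell contains the pattern.
        return self._frame[hits.any(axis=1)]


df = pd.DataFrame({
    "level": ["INFO", "ERROR", "WARN", "ERROR"],
    "message": ["Application started", "Database connection failed",
                "Slow query detected", "Retry attempt failed"],
})

failed = df.minigrep("failed", case_sensitive=False)
```

Defining `__call__` on the accessor class is what allows the `df.minigrep(...)` call syntax instead of the more common namespace style (`df.accessor.method(...)`).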


## Highlighting with Invert Match

Highlight rows that DON'T match the pattern (like `grep -v`):

In [7]:
# Highlight rows that DON'T contain 'ERROR'
result = nwgrep(df, "ERROR", invert=True, highlight=True)
result

|   | timestamp | level | message |
|---|---|---|---|
| 0 | 2024-01-01 | INFO | Application started |
| 2 | 2024-01-03 | WARN | Slow query detected |
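
Inverting a row-level match is subtly different from inverting each cell. The plain-pandas sketch below (not nwgrep code) shows that `grep -v` style inversion means "no cell in the row matched", which is consistent with rows 0 and 2 in the output above.

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"],
    "level": ["INFO", "ERROR", "WARN", "ERROR"],
})

# Cell-level hits: True wherever a cell contains the literal text "ERROR".
hits = df.astype(str).apply(lambda col: col.str.contains("ERROR", regex=False))

# grep -v style: keep rows where NO cell matched.
no_cell_matches = ~hits.any(axis=1)

# A different question: rows where at least one cell failed to match.
# This is true for every row here, because no timestamp contains "ERROR".
some_cell_misses = (~hits).any(axis=1)
```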


## Multiple Patterns

Highlight rows matching any of multiple patterns (OR logic):

In [8]:
# Highlight rows with 'ERROR' OR 'WARN'
result = nwgrep(df, ["ERROR", "WARN"], highlight=True)
result

|   | timestamp | level | message |
|---|---|---|---|
| 1 | 2024-01-02 | ERROR | Database connection failed |
| 2 | 2024-01-03 | WARN | Slow query detected |
| 3 | 2024-01-04 | ERROR | Retry attempt failed |
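
OR logic over several patterns can be expressed in at least two ways in plain pandas. The sketch below shows both as an illustration; which approach nwgrep uses is not stated in this notebook.

```python
import re

import pandas as pd

levels = pd.Series(["INFO", "ERROR", "WARN", "ERROR"])
patterns = ["ERROR", "WARN"]

# Approach 1: build one literal mask per pattern and OR them together.
mask = pd.Series(False, index=levels.index)
for pattern in patterns:
    mask |= levels.str.contains(pattern, regex=False)

# Approach 2: join the escaped literals into a single alternation regex.
alternation = "|".join(re.escape(pattern) for pattern in patterns)
mask_regex = levels.str.contains(alternation, regex=True)
```

Escaping with `re.escape` keeps characters such as `.` or `(` in a literal pattern from being interpreted as regex syntax in the second approach.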


## Notes

- With `highlight=True`, nwgrep returns a pandas `Styler` object (`pandas.io.formats.style.Styler`) with a yellow background on matching rows
- Pandas highlighting uses the `Styler` built into pandas, so nwgrep needs no additional highlighting dependency (pandas styling itself requires Jinja2)
- The highlighting is compatible with all other nwgrep options (regex, case sensitivity, column filtering, etc.)
- The `highlight` parameter is incompatible with `count=True`; combining them raises a `ValueError`