Add string extraction utilities for parsing structured text #20

@NewGraphEnvironment

Summary

Add utility functions for extracting text segments from strings using regex patterns.

Functions

  • ngr_str_extract_between() - Extract text between two regex delimiters from a character vector
  • ngr_str_df_extract() - Apply multiple regex extractions to a data frame column, creating new columns for each segment

Use Case

Parsing semi-structured text like grant descriptions where fields are separated by labels:

"Grant Amount: $400,000 Intake Year: 2025 Region: Fraser Basin"
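As a rough illustration of the idea, here is a minimal base R sketch of pulling out the text between two labels. The helper name `extract_between_sketch()` and its arguments are hypothetical; this issue does not specify the signature of `ngr_str_extract_between()`, so treat this as one possible approach rather than the proposed implementation.

```r
# Hypothetical helper for illustration only; not the ngr package API.
extract_between_sketch <- function(x, start, end, squish = TRUE) {
  # (?i) makes the match case-insensitive; ":?" allows an optional colon
  # after the start label; the lazy capture stops at the end label or at
  # the end of the string.
  pattern <- paste0("(?i)", start, "\\s*:?\\s*(.*?)\\s*(?=", end, "|$)")
  hits <- regmatches(x, regexec(pattern, x, perl = TRUE))
  out <- vapply(
    hits,
    function(h) if (length(h) >= 2) h[2] else NA_character_,
    character(1)
  )
  # Optional whitespace normalization, similar in spirit to a "squish".
  if (squish) out <- trimws(gsub("\\s+", " ", out))
  out
}

x <- "Grant Amount: $400,000 Intake Year: 2025 Region: Fraser Basin"
amount <- extract_between_sketch(x, "grant amount", "intake year")
year   <- extract_between_sketch(x, "Intake Year", "Region")
region <- extract_between_sketch(x, "Region", "$")
```

Elements with no match come back as `NA`, so the output keeps the same length as the input vector.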

Features

  • Case-insensitive matching
  • Optional whitespace normalization (squish)
  • Handles optional colons after labels
  • Named list input creates correspondingly named output columns
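The named-list behaviour could look something like the base R sketch below. The function name `df_extract_sketch()`, its arguments, the shape of the `segments` list, and the second data row are all assumptions made for illustration; the actual interface of `ngr_str_df_extract()` is not described in this issue.

```r
# Hypothetical sketch; names and arguments are assumptions, not the ngr API.
df_extract_sketch <- function(df, col, segments) {
  for (nm in names(segments)) {
    seg <- segments[[nm]]  # c(start_label, end_label)
    # Case-insensitive, optional colon after the label, lazy capture up to
    # the end label or the end of the string.
    pattern <- paste0("(?i)", seg[1], "\\s*:?\\s*(.*?)\\s*(?=", seg[2], "|$)")
    hits <- regmatches(df[[col]], regexec(pattern, df[[col]], perl = TRUE))
    vals <- vapply(
      hits,
      function(h) if (length(h) >= 2) h[2] else NA_character_,
      character(1)
    )
    # Collapse runs of whitespace and trim the ends.
    df[[nm]] <- trimws(gsub("\\s+", " ", vals))
  }
  df
}

# Illustrative data: the second row is made up to exercise the
# case-insensitive and optional-colon handling.
grants <- data.frame(
  description = c(
    "Grant Amount: $400,000 Intake Year: 2025 Region: Fraser Basin",
    "grant amount $25,000   intake year: 2024 region:  Skeena"
  ),
  stringsAsFactors = FALSE
)

# Each list name becomes the name of a new output column.
segments <- list(
  amount = c("Grant Amount", "Intake Year"),
  year   = c("Intake Year", "Region"),
  region = c("Region", "$")
)

result <- df_extract_sketch(grants, "description", segments)
```

Keying the output columns off the list names keeps the extraction rules and the resulting schema in one place, which is the main appeal of the named-list input.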
