pstr - A Regex Engine in Go


pstr is a simple regex engine written in Go from scratch for educational purposes. It can parse basic regular expressions, convert them into a Nondeterministic Finite Automaton (NFA), and perform matching on input strings.


🚀 Features

  • Basic Regex Parsing: Supports literals, ( ) groups, [ ] character classes, and quantifiers like *, +, ?, and {m,n}.
  • NFA Engine: Converts parsed regex tokens into an NFA state machine.
  • String Matching: Checks if an input string is valid according to the generated NFA.
  • Interactive CLI: A simple command-line interface to test regex patterns in real-time.
  • Exposed API: An API endpoint to check regex patterns programmatically.
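
To make the NFA step concrete, here is a minimal sketch of Thompson-style simulation for the pattern (a|b)*c. The `state` type, the `closure` and `match` helpers, and the hand-wired automaton in `buildExample` are assumptions made for illustration; they are not pstr's actual data structures or functions.

```go
package main

import "fmt"

// state is a hypothetical NFA node, not pstr's real type.
type state struct {
	accept  bool
	epsilon []*state          // transitions that consume no input
	next    map[byte][]*state // transitions that consume one byte
}

// closure adds s and every state reachable through epsilon edges to set.
func closure(s *state, set map[*state]bool) {
	if set[s] {
		return
	}
	set[s] = true
	for _, e := range s.epsilon {
		closure(e, set)
	}
}

// match reports whether the whole input is accepted starting from start.
func match(start *state, input string) bool {
	current := map[*state]bool{}
	closure(start, current)
	for i := 0; i < len(input); i++ {
		following := map[*state]bool{}
		for s := range current {
			// Reading from a nil map is safe and yields no targets.
			for _, t := range s.next[input[i]] {
				closure(t, following)
			}
		}
		current = following
	}
	for s := range current {
		if s.accept {
			return true
		}
	}
	return false
}

// buildExample hand-wires an NFA for (a|b)*c.
func buildExample() *state {
	accept := &state{accept: true}
	loop := &state{next: map[byte][]*state{}}
	loop.next['a'] = []*state{loop}
	loop.next['b'] = []*state{loop}
	loop.next['c'] = []*state{accept}
	return &state{epsilon: []*state{loop}}
}

func main() {
	nfa := buildExample()
	fmt.Println(match(nfa, "ababc")) // true
	fmt.Println(match(nfa, "abab"))  // false
}
```

Tracking a set of active states, rather than backtracking, keeps the matching time proportional to the input length times the number of states.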

🛠 Tech Stack

  • Go

🏗️ Getting Started

Prerequisites

  • Go 1.24 or newer

▶️ Running the CLI

  1. Clone the repository:

    git clone https://github.com/rubuy-74/pstr
    cd pstr
  2. Run the interactive CLI: You can use the provided shell script:

    ./run.sh

    Or run it directly with Go:

    go run cmd/pstr/main.go
  3. Test a pattern: The application will prompt you to enter a regex and then a string to check against it.

    Example Interaction:

    Enter regex
    > (a|b)*c
    Enter string to check
    > ababc
    Congratulations, the string is VALID
    

▶️ Running the API

  1. Run the API server:

    go run cmd/pstr/main.go

    The server will start on port 3000.

  2. Send a POST request: You can use curl or any API client to send a POST request to the /check endpoint.

    Example with curl:

    curl -X POST -H "Content-Type: application/json" -d '{"regex": "(a|b)*c", "string": "ababc"}' http://localhost:3000/check

    Expected Response (Success):

    {
        "valid": true
    }

    Expected Response (Error):

    {
        "error": "failed to parse regex",
        "message": "missing left operand for | operator at position 0"
    }
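
The same request can be sent from Go. The following is a minimal client sketch that assumes only what the curl example and the two documented responses show: a POST to /check on port 3000 with `regex` and `string` fields, answered with either `valid` or `error` plus `message`. The type and function names (`checkRequest`, `checkResponse`, `decodeCheckResponse`, `check`) are hypothetical and not part of pstr.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// checkRequest mirrors the JSON body used in the curl example.
type checkRequest struct {
	Regex  string `json:"regex"`
	String string `json:"string"`
}

// checkResponse covers both documented response shapes.
type checkResponse struct {
	Valid   bool   `json:"valid"`
	Error   string `json:"error"`
	Message string `json:"message"`
}

// decodeCheckResponse parses a response body returned by /check.
func decodeCheckResponse(body []byte) (checkResponse, error) {
	var res checkResponse
	err := json.Unmarshal(body, &res)
	return res, err
}

// check posts a pattern and an input string to baseURL + "/check".
func check(baseURL, regex, input string) (checkResponse, error) {
	payload, err := json.Marshal(checkRequest{Regex: regex, String: input})
	if err != nil {
		return checkResponse{}, err
	}
	resp, err := http.Post(baseURL+"/check", "application/json", bytes.NewReader(payload))
	if err != nil {
		return checkResponse{}, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return checkResponse{}, err
	}
	return decodeCheckResponse(body)
}

func main() {
	res, err := check("http://localhost:3000", "(a|b)*c", "ababc")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	if res.Error != "" {
		fmt.Printf("%s: %s\n", res.Error, res.Message)
		return
	}
	fmt.Println("valid:", res.Valid)
}
```

Using one struct for both response shapes keeps the sketch short; a non-empty `Error` field signals the failure case.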

🧪 Testing

Note: This testing section was created using Cursor AI to provide comprehensive test coverage and reliability verification.

The project includes extensive test suites to ensure reliability and prevent crashes on edge cases. All tests verify that the regex engine handles invalid inputs gracefully instead of crashing.

▶️ Running Tests

Run All Tests (Basic)

go test ./...
  • Runs all tests in all packages
  • Shows only pass/fail status
  • Fast and clean output

Run All Tests (Verbose)

go test ./... -v
  • Shows detailed output for each test
  • Great for debugging and seeing what's being tested
  • Shows individual test case results

Run Tests for Specific Package

go test ./internal/parser/
go test ./internal/state_machine/
go test ./internal/

Run Specific Test Functions

go test ./... -run="TestEmptyInputValidation"
go test ./... -run="TestProcessRepeatEdgeCases"
go test ./... -run="TestPanicRecovery"

Run Tests with Coverage

go test ./... -cover
  • Shows test coverage percentage
  • Helps identify untested code

Run Tests with Race Detection

go test ./... -race
  • Detects race conditions in concurrent code
  • Important for production code

Run Tests with Benchmarking

go test ./... -bench=.
  • Runs benchmark functions (if any are defined) in addition to the regular tests
  • Reports the time per operation for each benchmark

Generate Coverage Report

go test ./... -coverprofile=coverage.out
go tool cover -html=coverage.out
  • Generates detailed coverage report
  • Opens HTML report in browser

📊 Test Coverage

The project includes comprehensive test suites covering:

  • Integration Tests: Complete pipeline testing with edge cases
  • Parser Tests: Original functionality + reliability tests
  • State Machine Tests: NFA creation and validation
  • Reliability Tests: Edge cases that previously caused crashes
  • Panic Recovery Tests: Ensures no crashes on invalid inputs
  • Memory Safety Tests: Validates safe memory access

Total Test Cases: 50+ individual test cases covering all reliability fixes.

🎯 Test Categories

Input Validation Tests

  • Empty string handling
  • Whitespace-only inputs
  • Invalid operator usage

Array Bounds Safety Tests

  • processRepeat() with no preceding tokens
  • processOr() with missing operands
  • processBrackets() with empty/invalid content
  • processGroup() with empty/unclosed groups
  • ToNFA() with empty token lists
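
The kind of guard these tests exercise can be sketched as follows: check the slice length before reading the preceding token, and return an error instead of indexing out of range. The `token` type and the `lastOperand` helper are hypothetical stand-ins, not pstr's actual code; only the error wording is taken from the API error example in this README.

```go
package main

import "fmt"

// token is a hypothetical stand-in for pstr's token type.
type token struct {
	kind  string
	value string
}

// lastOperand pops the most recent token. It returns an error when the
// operator found at pos has nothing to its left.
func lastOperand(tokens []token, op string, pos int) (token, []token, error) {
	if len(tokens) == 0 {
		return token{}, tokens, fmt.Errorf("missing left operand for %s operator at position %d", op, pos)
	}
	last := len(tokens) - 1
	return tokens[last], tokens[:last], nil
}

func main() {
	// An operator with no preceding token yields an error, not a panic.
	_, _, err := lastOperand(nil, "|", 0)
	fmt.Println(err) // missing left operand for | operator at position 0

	// With tokens available, the last one is returned and removed.
	tok, rest, err := lastOperand([]token{{"literal", "a"}, {"literal", "b"}}, "*", 2)
	fmt.Println(tok.value, len(rest), err) // b 1 <nil>
}
```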

Type Safety Tests

  • Token.ToNFA() with invalid type assertions
  • Safe type assertion handling
  • Panic recovery verification
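
Both ideas in this list map onto standard Go idioms: the comma-ok form of a type assertion, and a deferred `recover` that turns a panic into an error. The sketch below shows them in isolation; `literalValue` and `safely` are hypothetical helpers, not functions from pstr.

```go
package main

import "fmt"

// literalValue uses the comma-ok form, so a value of the wrong dynamic
// type produces an error instead of a panic.
func literalValue(v interface{}) (string, error) {
	s, ok := v.(string)
	if !ok {
		return "", fmt.Errorf("expected string token value, got %T", v)
	}
	return s, nil
}

// safely runs fn and converts any panic raised inside it into an error.
func safely(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	_, err := literalValue(42)
	fmt.Println(err) // expected string token value, got int

	err = safely(func() {
		var tokens []int
		_ = tokens[0] // indexing an empty slice panics at run time
	})
	fmt.Println(err != nil) // true
}
```

A test can then assert that the returned error is non-nil, which verifies that invalid input fails gracefully rather than crashing the process.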

Error Handling Tests

  • Proper error propagation
  • Meaningful error messages
  • Graceful failure handling

Edge Case Tests

  • Malformed regex patterns
  • Unclosed brackets/groups/ranges
  • Invalid range syntax
  • Complex combinations

💡 Recommended Usage

  • Daily Development: go test ./...
  • Debugging Issues: go test ./... -v
  • Before Commits: go test ./... -race -cover
  • CI/CD Pipeline: go test ./... -v -cover

📁 Project Structure

├── cmd/
│   └── pstr/
│       └── main.go          # API endpoint and CLI entry point
├── internal/
│   ├── models/
│   │   ├── state/
│   │   │   └── state.go       # NFA state data structures
│   │   ├── token/
│   │   │   └── token.go       # Regex token data structures
│   │   └── token_type/
│   │       └── token_type.go  # Enum for token types
│   ├── parser/
│   │   ├── parser.go        # Regex string to token parsing
│   │   ├── parser_test.go   # Tests for the parser
│   │   └── reliability_test.go # Reliability and edge case tests
│   ├── state_machine/
│   │   ├── state_machine.go # Token to NFA conversion and matching logic
│   │   └── state_machine_test.go # State machine tests
│   ├── integration_test.go  # End-to-end integration tests
│   └── utils/
│       └── utils.go         # Utility functions
├── go.mod                   # Go module definition
├── run.sh                   # Script to run the CLI
└── TODO.md                  # Project goals and references

📜 License

This project is licensed under the MIT License.


Made with ❤️ by rubuy-74
