Skip to content

bug: unsafe byte indexing in parse_value can panic instead of returning error #44

@nathanjmcdougall

Description

@nathanjmcdougall

Problem

In \src/document.rs\ (line ~97), \parse_value\ uses direct byte indexing:

\
ust
let raw = &source[span.0..span.1];
let line_start = source[..span.0].rfind('\n').map(|nl| nl + 1).unwrap_or(0);
\\

If yamlpath ever returns a span that doesn't align to UTF-8 character boundaries, Rust panics (thread panic -> Python process crash) instead of returning a recoverable error.

Inconsistency

The \�xtract()\ method in the same file (line ~60) already uses the safe pattern:

\
ust
source.get(start..end)
.map(|s| s.to_string())
.ok_or_else(|| PyErr::new::<PyValueError, _>(
"Feature location is not aligned to UTF-8 character boundaries"
))
\\

Proposed fix

Replace direct indexing in \parse_value\ with .get()\ + error return, matching the existing \�xtract()\ pattern:

\
ust
// Before:
let raw = &source[span.0..span.1];

// After:
let raw = source.get(span.0..span.1).ok_or_else(|| {
PyErr::new::<pyo3::exceptions::PyValueError, _>(
"Feature span is not valid in source"
)
})?;
\\

Same treatment for \source[..span.0]\ on line 101.

Performance note

.get()\ compiles to the same bounds check as direct indexing — zero performance difference on the happy path. The only difference is the failure mode (None vs panic).

Severity

Medium — low probability (requires upstream yamlpath bug or multi-byte UTF-8 edge case) but catastrophic impact (unrecoverable process crash).

Metadata

Metadata

Labels

No labels
No labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions