lex: triple-quoted string literals with PEP-257 dedent #565
Merged
Triple-quoted ("""...""") strings desugar to the existing single-
quoted literal in the lexer, so logos' string regex consumes them and
every downstream stage (parser, interpolation, escape decoding) stays
unchanged. A scanning pass finds the closing """ and emits a
synthesised single-quoted form with raw newlines preserved.
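The scanning pass described above can be sketched roughly as follows. This is a minimal illustration, not the PR's lexer code: the function name `desugar_triple` and its signature are hypothetical, the close is located with a plain substring search (so a backslash-escaped quote directly before the closing `"""` is not handled), and escaping a lone `"` in the body is an assumption about how the embedded-quote case could be kept compatible with the single-quoted regex.

```rust
/// Hypothetical sketch: given the source text and the byte offset of an
/// opening `"""`, return the synthesised single-quoted literal and the offset
/// just past the closing `"""`. Returns `None` if the literal is unterminated.
fn desugar_triple(source: &str, open: usize) -> Option<(String, usize)> {
    if !source.get(open..)?.starts_with("\"\"\"") {
        return None;
    }
    let body_start = open + 3;
    let rest = &source[body_start..];
    // Simplification: the first `"""` after the opener is taken as the close.
    let close_rel = rest.find("\"\"\"")?;
    let body = &rest[..close_rel];

    let mut out = String::with_capacity(body.len() + 2);
    out.push('"');
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        match c {
            // Copy escape sequences through untouched so the downstream
            // escape decoder sees exactly what the author wrote.
            '\\' => {
                out.push('\\');
                if let Some(next) = chars.next() {
                    out.push(next);
                }
            }
            // Assumption: a lone `"` is escaped so the single-quoted form
            // does not terminate early.
            '"' => out.push_str("\\\""),
            // Raw newlines and everything else are preserved as-is.
            _ => out.push(c),
        }
    }
    out.push('"');
    Some((out, body_start + close_rel + 3))
}
```

Because the output is an ordinary single-quoted literal, the stages after the lexer would not need to know the triple-quoted form exists, which is the property the commit message relies on.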
When the closing """ sits on its own line, strip_triple_indent drops
the leading newline and removes the common leading whitespace from each
content line, matching Python PEP 257 and Rust's indoc! macro. The
terminating newline of the last content line is preserved. Inline form
(closing on a content line) keeps the body verbatim with no dedent.
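As a rough illustration of the dedent rule above, here is a minimal sketch. It is not the PR's `strip_triple_indent`: the name `dedent_triple_body` is hypothetical, and three details are assumptions rather than facts from the PR: blank lines are ignored when computing the common indent, the closing line's own indentation is discarded, and indentation is measured in bytes of spaces and tabs.

```rust
/// Hypothetical stand-in for the dedent step. `body` is the text between the
/// opening and closing `"""`.
fn dedent_triple_body(body: &str) -> String {
    let is_ws = |c: char| c == ' ' || c == '\t';

    // Inline form: no newline, or content shares the closing line. Keep verbatim.
    let last_nl = match body.rfind('\n') {
        Some(i) => i,
        None => return body.to_string(),
    };
    if !body[last_nl + 1..].chars().all(is_ws) {
        return body.to_string();
    }

    // Keep the newline that terminates the last content line, drop the
    // closing line's indentation, then drop the newline after the opener.
    let content = &body[..=last_nl];
    let content = content.strip_prefix('\n').unwrap_or(content);

    // Common leading whitespace across non-blank lines (assumption).
    let common = content
        .lines()
        .filter(|l| !l.trim().is_empty())
        .map(|l| l.len() - l.trim_start_matches(is_ws).len())
        .min()
        .unwrap_or(0);

    let mut out = String::with_capacity(content.len());
    for line in content.split_inclusive('\n') {
        let lead = line.len() - line.trim_start_matches(is_ws).len();
        // Never strip more whitespace than the line actually has.
        out.push_str(&line[lead.min(common)..]);
    }
    out
}
```

Under these assumptions, a body of `"\n    line one\n    line two\n    "` yields `"line one\nline two\n"`, while an inline body such as `"a\n  b"` is returned unchanged.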
Span attribution maps every emitted byte back to its original source
byte so diagnostics still point at the right location.
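One way to keep such a byte-level mapping is a side table that records, for each emitted byte, the source offset it came from. The sketch below is an assumption about the general shape only; the type `Mapped` and its methods are hypothetical and the PR's actual data structure is not shown here.

```rust
/// Hypothetical container: the emitted text plus, for every emitted byte,
/// the offset of the source byte it is attributed to.
struct Mapped {
    text: String,
    origin: Vec<usize>,
}

impl Mapped {
    fn new() -> Self {
        Mapped { text: String::new(), origin: Vec::new() }
    }

    /// Copy `source[start..end]` and record each byte's own source offset.
    fn push_source(&mut self, source: &str, start: usize, end: usize) {
        self.text.push_str(&source[start..end]);
        self.origin.extend(start..end);
    }

    /// Emit synthesised text (for example a quote character) and attribute
    /// all of its bytes to a single source offset.
    fn push_synth(&mut self, s: &str, at: usize) {
        self.text.push_str(s);
        self.origin.extend(std::iter::repeat(at).take(s.len()));
    }

    /// Translate a non-empty emitted byte range back to a source byte range.
    fn to_source(&self, start: usize, end: usize) -> (usize, usize) {
        (self.origin[start], self.origin[end - 1] + 1)
    }
}
```

With source `"ab   cd"`, copying bytes 0..2 and 5..7 (skipping the stripped whitespace) emits `"abcd"`, and the emitted range 2..4 maps back to source bytes 5..7, so a diagnostic on `cd` would still point at its original position.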
Nine regression tests pin behaviour across the VM and Cranelift JIT:
single-line form, inline multi-line, dedented multi-line, content-byte
verification, escape decoding, {name} interpolation (single and multi-
line), empty body, and embedded single quote. A backend drift can't
silently re-break the surface.
Adds examples/triple-quoted-strings.ilo so the examples_engines harness
exercises the feature on every engine and so agents reading the
examples directory see the canonical shape in context.
SPEC.md gains a Triple-quoted strings subsection under String Literals covering raw newlines, dedent rules, escape passthrough, interpolation parity, and the embedded-single-quote edge case. ai.txt gets the token-minimal agent-spec entry inline next to the existing escape table.
❌ 2 Tests Failed:
Summary
Pending #34: triple-quoted ("""...""") multi-line string literals. Same surface as "..." (escapes, {name} interpolation) plus raw newlines inside the literal. When the closing """ sits on its own line, the leading newline is dropped and the common leading whitespace is stripped from each content line; the terminating \n of the last content line is preserved (Python PEP 257 / Rust indoc! convention).
Manifesto framing: agents writing multi-line content today resort to cat-concatenation with embedded \n escapes, which is verbose and easy to get wrong. """...""" lets the body carry raw newlines and applies indent stripping, so indented source produces clean output.
Repro
Before:
After:
Both produce "line one\nline two\n". The second form is shorter and reads as the value it produces.
What's in the diff
""", scans for the matching close, and emits a synthesised single-quoted form with raw newlines preserved. strip_triple_indent handles the dedent and leading-newline drop when the closing """ is on its own line. Span attribution maps every emitted byte back to its source byte for diagnostic accuracy.
examples/triple-quoted-strings.ilo so the examples_engines harness exercises the feature on every engine.
Test plan
cargo test --release --features cranelift --test regression_triple_quoted_strings - 9 passing
cargo test --release --features cranelift --test examples_engines - example runs on every engine
cargo fmt && cargo clippy --release --features cranelift --all-targets -- -D warnings - clean
regression_reserved_names_doc::spec_reserved_short_names_match_builtin_registry (b64/hex SPEC drift, unrelated to this PR)
Follow-ups