This crate provides useful infrastructure for compiler diagnostics, which are intended to be shared/reused across various Miden components that perform some type of compilation, e.g. AirScript, the (in development) Miden IR, and Miden Assembly.
See miden-parsing for parsing utilities that
build on top of the low-level tools provided here. A complete compiler frontend is expected to
build a lexer on top of that crate's Scanner
type, and implement it's Parser
trait. Many
of the details involved in producing an AST with source spans are handled by that crate, but it
is entirely possible to forge your own alternative.
This crate provides functionality for two distinct, but inter-related use cases:
- Tracking compiler sources, locations/spans within those sources, and supporting infra
for associating spans to Rust data structures, e.g.
#[derive(Spanned)]
- Constructing, emitting, and displaying/capturing compiler diagnostics, optionally decorated
with source spans for rendering
rustc
-like messages/warnings/errors.
We build upon some of the primitives provided by the codespan crate to provide a rich set of functionality around tracking sources, locations, and spans in as efficient a way as possible. The intent is to ensure that decorating compiler structures with source locations is as cheap as possible, while preserving the ability to easily obtain useful information about those structures, such as what file/line/column a given object was derived from.
The following are the key features that support this use case:
SourceId
is a compact reference to a specific source file that was loaded into memorySourceIndex
is a compact reference to a specific location in some source fileSourceSpan
is a compact structure which refers to a specific range of locations in some source file. This type is the most common value type you will interact with, and is used to generate pretty diagnostics that point to a specific range of characters in a source file to which the diagnostic pertains.Span<T>
, is a type used to associate aSourceSpan
with a typeT
non-invasively; derefs toT
, and implements a variety of other traits that delegate toT
in a pass-through fashion, e.g.PartialEq
Spanned
is a trait which types may implement to produce aSourceSpan
upon request. TheSpan<T>
type implements this, and it is automatically implemented for allBox<T>
whereT: Spanned
.- The
CodeMap
is a thread-safe datastructure that is intended to be constructed once by a compiler driver and shared across all its child threads. It stores files read into memory, de-duplicating by the name of the source file (whether real or synthetic). It provides APIs which can be used to obtain useful high-level information from aSourceId
,SourceIndex
, orSourceSpan
, such as the file name, line and column numbers; as well as obtain a slice of the original source content.
We build upon some utilities provided by the codespan_reporting crate to provide a richer set of features for generating, displaying and/or capturing compiler diagnostics.
The following are the key features that support this use case:
Diagnostic
is a type that represents a compiler diagnostic, with a severity, a message, with an (optional) set of labels/notes.ToDiagnostic
is a trait which represents the ability to generate aDiagnostic
from a type. In practice this is used with errors which are converted to diagnostics when compilation should proceed because an error is non-fatal. For example, during parsing/semantic analysis you typically want to capture as many errors as possible before failing the compilation task, rather than exiting on the first error encountered.Severity
represents whether a diagnostic is an error, warning, bug, or simple note.Label
is used to associate a source location with a diagnostic, with some descriptive text. Labels come in primary/secondary flavors, which affect how the labels are ordered/rendered when displayed.InFlightDiagnostic
provides a fluent, builder-pattern API for constructing and emitting diagnostics on the fly, see examples below.Emitter
is a trait that can be used to control how diagnostics are emitted. This can be used to do useful things such as disable diagnostics, capture them for tests, or control how they are displayed; the built-inNullEmitter
,CaptureEmitter
, andDefaultEmitter
types perform those respective functions.DiagnosticsHandler
is a thread-safe type meant to be constructed once by the compiler driver and then shared across all threads of execution. It can be configured to use a particularEmitter
implementation, control what the minimum severity of emitted diagnostics are, convert warnings to errors, and more.
use miden_diagnostics::{SourceSpan, Span, Spanned};
#[derive(Clone, PartialEq, Eq, Hash, Spanned)]
pub struct Ident(#[span] Span<String>);
#[derive(Spanned)]
pub enum Expr {
Var(#[span] Ident),
Int(#[span] Span<i64>),
Let(#[span] Let),
Binary(#[span] BinaryExpr),
Unary(#[span] UnaryExpr),
}
#[derive(Spanned)]
pub struct Let {
pub span: SourceSpan,
pub var: Ident,
pub body: Box<Expr>,
}
#[derive(Spanned)]
pub struct BinaryExpr {
pub span: SourceSpan,
pub op: BinaryOp,
pub lhs: Box<Expr>,
pub rhs: Box<Expr>,
}
#[derive(Copy, Clone, PartialEq, Eq)]
pub enum BinaryOp {
Add,
Sub,
Mul,
Div,
}
#[derived(Spanned)]
pub struct UnaryExpr {
pub span: SourceSpan,
pub op: UnaryOp,
pub rhs: Box<Expr>,
}
#[derive(Copy, Clone, PartialEq, Eq)]
pub enum UnaryOp {
Neg,
}
use std::sync::Arc;
use miden_diagnostics::*;
const INPUT_FILE = r#"
let x = 42
in
let y = x * 2
in
x + y
"#;
pub fn main() -> Result<(), ()> {
// The codemap is where parsed inputs are stored and is the base for all source locations
let codemap = Arc::new(CodeMap::new());
// The emitter defines how diagnostics will be emitted/displayed
let emitter = Arc::new(DefaultEmitter::new(ColorChoice::Auto));
// The config provides some control over what diagnostics to display and how
let config = DiagnosticsConfig::default();
// The diagnostics handler itself is used to emit diagnostics
let diagnostics = Arc::new(DiagnosticsHandler::new(config, codemap.clone(), emitter));
// In our example, we're adding an input to the codemap, and then requesting the compiler compile it
codemap.add("nofile", INPUT_FILE.to_string());
compiler::compile(codemap, diagnostics, "nofile")
}
mod compiler {
use miden_diagnostics::*;
pub fn compile<F: Into<FileName>>(codemap: Arc<CodeMap>, diagnostics: Arc<DiagnosticsHandler>, filename: F) -> Result<(), ()> {
let filename = filename.into();
let file = codemap.get_by_name(&filename).unwrap();
// The details of parsing are left as an exercise for the reader, but it is expected
// that for Miden projects that this crate will be combined with `miden-parsing` to
// handle many of the details involved in producing a stream of tokens from raw sources
//
// In this case, we're parsing an Expr, or returning an error that has an associated source span
match parser::parse(file.source(), &diagnostics)? {
Ok(_expr) => Ok(()),
Err(err) => {
diagnostics.diagnostic(Severity::Error)
.with_message("parsing failed")
.with_primary_label(err.span(), err.to_string())
.emit();
Err(())
}
}
}
}