Parser and typed AST for Gmail-style email search queries.
Backend-agnostic: it produces a portable [QueryNode], and you pick the
search engine.
use mail_query::{parse, FilterKind, QueryField, QueryNode};
let ast = parse("from:alice subject:\"deploy\" is:unread after:2026-01-01")?;
// AST is now a Boxed tree of And/Or/Not + leaves. Walk it with the
// Visitor trait to translate to tantivy / meilisearch / SQL FTS / IMAP
// SEARCH / whatever backend you prefer.
# Ok::<_, mail_query::ParseError>(())

Every Rust email project re-implements this parser. The closest alternatives:
- query-parser, search-query-parser: generic, toy syntax, no email operators.
- tantivy-query-grammar::UserInputAst: Lucene vocabulary, not Gmail. Pulls heavy deps. Not #[non_exhaustive], so any spec addition is a breaking change for downstream pin-on-major users.
mail-query is the focused Gmail-vocabulary parser the ecosystem
doesn't yet have.
- Parses Gmail's documented operator surface from
  https://support.google.com/mail/answer/7190:
  - Address fields: from:, to:, cc:, bcc:, deliveredto:, rfc822msgid:, list:
  - Content fields: subject:, body:, filename:
  - is: and has: filters
  - label: and category:
  - size:, larger:, smaller: with unit suffixes (5M, 200K)
  - after:, before:, date:, older:, newer:, older_than:, newer_than: with both specific dates and relative durations (older_than:5d)
  - AND / OR / NOT / - / parentheses / brace groups
  - AROUND<n> for word proximity
- Recognises +word as an exact-match (no-stemming) hint, mirroring Gmail's syntax.
- Round-trips: parse(node.to_string())? == node (structural equality, not byte identity).
- Walks the AST via a [Visitor] trait so backend authors can translate to their own query language.
- Exposes extension points for caller-specific filters: register names
  via [ParserOptions::register_custom_filter] and they route through [FilterKind::Custom].
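As an illustration of the unit suffixes accepted by size:, larger: and smaller:, here is a minimal sketch of turning 5M or 200K into a byte count. The function parse_size is hypothetical and not part of this crate's API, and the choice of binary multiples (1 K = 1024 bytes) is an assumption; the crate may define the units differently.

```rust
/// Hypothetical helper, not part of mail-query's API.
/// Parses values such as "5M" or "200K" into a byte count.
/// ASSUMPTION: K and M are binary multiples (1024 and 1024 * 1024).
fn parse_size(input: &str) -> Option<u64> {
    let input = input.trim();
    // Split at the first non-digit character; everything after it is the suffix.
    let split = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    let (digits, suffix) = input.split_at(split);
    // An empty digit run (for example a bare "M") fails to parse and yields None.
    let amount: u64 = digits.parse().ok()?;
    let multiplier: u64 = match suffix.to_ascii_uppercase().as_str() {
        "" => 1,
        "K" => 1024,
        "M" => 1024 * 1024,
        _ => return None,
    };
    // Reject values that would overflow instead of wrapping silently.
    amount.checked_mul(multiplier)
}
```

Returning Option keeps the sketch short; a real parser would more likely report a typed error that names the offending suffix.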
- It does not execute queries. The output is a portable AST; you pick the backend.
- It does not resolve older_than:5d to a concrete date at parse time. The AST carries DateValue::Relative { amount, unit }; backends call ParserOptions::now_provider at execution time. This is what lets a saved query mean the same thing tomorrow as today and lets the AST round-trip without embedding a date.
- It does not parse IMAP SEARCH grammar (RFC 3501 §6.4.4); that's a separate, future crate. The vocabularies overlap but the grammars do not.
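To make the deferred-resolution point concrete, the sketch below resolves a relative date against a "today" value supplied at execution time. The types here are simplified stand-ins, not the crate's definitions: dates are plain day counts rather than NaiveDate, the resolve function is hypothetical, and months and years use fixed-length approximations purely for illustration.

```rust
/// Stand-in for the crate's duration unit (assumed shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Unit {
    Day,
    Month,
    Year,
}

/// Stand-in for the crate's DateValue. Absolute dates are modelled as a
/// day count since an arbitrary epoch instead of a calendar date.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DateValue {
    Absolute(i64),
    Relative { amount: u32, unit: Unit },
}

/// Hypothetical execution-time resolver: the caller supplies `today`,
/// so the same AST yields a different cutoff on different days.
/// ASSUMPTION: 30-day months and 365-day years, for illustration only.
fn resolve(value: DateValue, today: i64) -> i64 {
    match value {
        DateValue::Absolute(day) => day,
        DateValue::Relative { amount, unit } => {
            let days_per_unit = match unit {
                Unit::Day => 1,
                Unit::Month => 30,
                Unit::Year => 365,
            };
            today - i64::from(amount) * days_per_unit
        }
    }
}
```

Because the AST stores only the amount and unit, resolving the same value with today = 100 and today = 101 gives cutoffs of 95 and 96 for older_than:5d, which is the saved-query behaviour described above.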
These are decisions where the crate is narrower or more opinionated than the Gmail surface.
- older_than:5d is Relative, not a resolved NaiveDate. See above.
- +word is a distinct AST variant Exact, not Text. The no-stemming hint is preserved so backends can act on it.
- OR has lower precedence than AND. a b OR c parses as (a AND b) OR c. Matches Gmail's documented behaviour and Lucene convention.
- Unknown filters error by default. A bare is:my-app-flag returns [ParseError::UnknownFilter] unless the caller has registered it via [ParserOptions::register_custom_filter]. This is the default-strict posture; opt in to widen.
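The precedence rule can be illustrated with a toy recursive-descent parser. This is a sketch of the general technique, not the crate's implementation: Node, parse_toy and the helper functions are hypothetical, and the grammar covers only bare terms, implicit AND by adjacency, and the literal token OR.

```rust
/// Toy AST, a stand-in for the crate's QueryNode.
#[derive(Debug, PartialEq)]
enum Node {
    Term(String),
    And(Box<Node>, Box<Node>),
    Or(Box<Node>, Box<Node>),
}

/// Parses whitespace-separated terms. Returns None on empty or malformed input.
fn parse_toy(input: &str) -> Option<Node> {
    let tokens: Vec<&str> = input.split_whitespace().collect();
    let mut pos = 0;
    let node = parse_or(&tokens, &mut pos)?;
    // Every token must have been consumed.
    if pos == tokens.len() { Some(node) } else { None }
}

/// Lowest precedence: an OR-separated chain of AND groups.
fn parse_or(tokens: &[&str], pos: &mut usize) -> Option<Node> {
    let mut left = parse_and(tokens, pos)?;
    while tokens.get(*pos) == Some(&"OR") {
        *pos += 1;
        let right = parse_and(tokens, pos)?;
        left = Node::Or(Box::new(left), Box::new(right));
    }
    Some(left)
}

/// Higher precedence: adjacent terms bind together before any OR is seen.
fn parse_and(tokens: &[&str], pos: &mut usize) -> Option<Node> {
    let mut left = parse_term(tokens, pos)?;
    while let Some(&tok) = tokens.get(*pos) {
        if tok == "OR" {
            break;
        }
        let right = parse_term(tokens, pos)?;
        left = Node::And(Box::new(left), Box::new(right));
    }
    Some(left)
}

/// A single bare term; a dangling OR is rejected.
fn parse_term(tokens: &[&str], pos: &mut usize) -> Option<Node> {
    let tok = *tokens.get(*pos)?;
    if tok == "OR" {
        return None;
    }
    *pos += 1;
    Some(Node::Term(tok.to_string()))
}
```

Because parse_or calls parse_and for each operand, the AND group a b is built first and only then combined with c, giving (a AND b) OR c for the input a b OR c.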
Filter names Gmail adds over time, color-star variants beyond the
common set, or your application's own is:owed-reply — register them
once at construction time:
use mail_query::{parse_with, FilterKind, ParserOptions, QueryNode};
let mut options = ParserOptions::new();
options.register_custom_filters(["owed-reply", "reply-later"]);
let ast = parse_with("is:owed-reply", &options)?;
assert_eq!(
ast,
QueryNode::Filter(FilterKind::Custom("owed-reply".into()))
);
# Ok::<_, mail_query::ParseError>(())

The crate canonicalises names to lowercase + hyphenated form, so
is:owed_reply and is:Owed-Reply both resolve to
Custom("owed-reply").
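A minimal sketch of that canonicalisation, assuming it amounts to ASCII lowercasing plus replacing underscores with hyphens. The function canonicalise is hypothetical; the crate's actual normalisation may handle more cases.

```rust
/// Hypothetical helper, not the crate's API.
/// ASSUMPTION: canonical form = ASCII lowercase with '_' mapped to '-'.
fn canonicalise(name: &str) -> String {
    name.chars()
        .map(|c| if c == '_' { '-' } else { c.to_ascii_lowercase() })
        .collect()
}
```

Under that assumption, owed_reply and Owed-Reply both map to owed-reply, matching the behaviour described above.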
use mail_query::{parse, FilterKind, Visitor};
#[derive(Default)]
struct CountFilters(usize);
impl Visitor for CountFilters {
fn visit_filter(&mut self, _: &FilterKind) {
self.0 += 1;
}
}
let ast = parse("from:alice is:unread OR has:attachment")?;
let mut counter = CountFilters::default();
counter.walk(&ast);
assert_eq!(counter.0, 2);
# Ok::<_, mail_query::ParseError>(())

The default walk implementation recurses into And / Or / Not
and dispatches to typed visit_* hooks for leaves. Override only what
you need.
Every public enum is #[non_exhaustive]. New variants (for new Gmail
operators) are non-breaking additions. Pattern-matching callers must
include a _ => … arm.
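The wildcard requirement looks like this in practice. The enum below is a hypothetical stand-in for one of the crate's public enums, and describe is an illustrative function. Note that the compiler only enforces the wildcard arm for matches written in a different crate from the one defining the enum, so within this single-file sketch the arm is a convention rather than a requirement.

```rust
/// Hypothetical stand-in for a public, non-exhaustive enum such as FilterKind.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq)]
enum Filter {
    Unread,
    Starred,
    Custom(String),
}

/// Handles the variants the caller cares about and falls back for the rest.
fn describe(filter: &Filter) -> String {
    match filter {
        Filter::Unread => "unread messages".to_string(),
        Filter::Custom(name) => format!("custom filter {name}"),
        // Catch-all: covers Starred today and any variant added in a later
        // release, so a minor-version upgrade keeps compiling.
        _ => "unsupported filter".to_string(),
    }
}
```

Deciding up front what the fallback arm should do (skip, error, or pass through) is the main design choice for downstream callers.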
The full coverage matrix lives in
testdata/coverage.md. Each fixture is a
language-neutral JSON file under testdata/conformance/
so a future port to another language can adopt the same corpus.
Three tests enforce the integrity of the corpus:
- Every fixture file is referenced in coverage.md.
- Every contract-critical fixture exists on disk.
- The actual parser output matches expected_ast (or expected_error) for every fixture.
cargo test --all-features

- serde: adds Serialize / Deserialize derives to every AST type, with chrono/serde enabled for NaiveDate. Default off.
The crate has two required dependencies (chrono with clock only and
thiserror) and no transitive runtime cost beyond that.
Future work (out of scope for v0.1.0):
- Tantivy interop: From<tantivy_query_grammar::UserInputAst> and back. Behind a feature flag so the heavy deps stay opt-in.
- IMAP SEARCH grammar (RFC 3501 §6.4.4) parser to the same AST as a normalisation layer.
- WASM build for an npm package consuming the same conformance corpus.
If you want them, open an issue.
- File bug reports at https://github.com/planetaryescape/mail-query/issues.
- Patches that change behaviour must add or update a fixture in testdata/conformance/ and a row in testdata/coverage.md.
MIT OR Apache-2.0. See LICENSE-MIT and LICENSE-APACHE.