implement raw strings (r#"foo"#) #9674

Merged
merged 7 commits into from Oct 8, 2013
36 changes: 32 additions & 4 deletions doc/rust.md
@@ -239,13 +239,14 @@ literal : string_lit | char_lit | num_lit ;

~~~~~~~~ {.ebnf .gram}
char_lit : '\x27' char_body '\x27' ;
string_lit : '"' string_body * '"' ;
string_lit : '"' string_body * '"' | 'r' raw_string ;

char_body : non_single_quote
| '\x5c' [ '\x27' | common_escape ] ;

string_body : non_double_quote
| '\x5c' [ '\x22' | common_escape ] ;
raw_string : '"' raw_string_body '"' | '#' raw_string '#' ;

common_escape : '\x5c'
| 'n' | 'r' | 't' | '0'
@@ -267,9 +268,10 @@ which must be _escaped_ by a preceding U+005C character (`\`).

A _string literal_ is a sequence of any Unicode characters enclosed within
two `U+0022` (double-quote) characters, with the exception of `U+0022`
itself, which must be _escaped_ by a preceding `U+005C` character (`\`).
itself, which must be _escaped_ by a preceding `U+005C` character (`\`),
or a _raw string literal_.

Some additional _escapes_ are available in either character or string
Some additional _escapes_ are available in either character or non-raw string
literals. An escape starts with a `U+005C` (`\`) and continues with one of
the following forms:

@@ -285,9 +287,35 @@ the following forms:
* A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
(`r`), or `U+0074` (`t`), denoting the unicode values `U+000A` (LF),
`U+000D` (CR) or `U+0009` (HT) respectively.
* The _backslash escape_ is the character U+005C (`\`) which must be
* The _backslash escape_ is the character `U+005C` (`\`) which must be
escaped in order to denote *itself*.

Raw string literals do not process any escapes. They start with the character
`U+0072` (`r`), followed by zero or more of the character `U+0023` (`#`) and a
`U+0022` (double-quote) character. The _raw string body_ is not defined in the
EBNF grammar above: it can contain any sequence of Unicode characters and is
terminated only by another `U+0022` (double-quote) character, followed by the
same number of `U+0023` (`#`) characters that preceded the opening `U+0022`
(double-quote) character.

Review comment (Member): Is it worth putting an example or two here? e.g. r"foo\bar" r####"string "### still in the string"#### or something.

All Unicode characters contained in the raw string body represent themselves:
the characters `U+0022` (double-quote) (except when followed by at least as
many `U+0023` (`#`) characters as were used to start the raw string literal) and
`U+005C` (`\`) do not have any special meaning.
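
Reviewer note (not part of the patch): the termination rule described above can be sketched as a small scanner. It is written in present-day Rust rather than the 2013 dialect used in this diff, and `RawStr` and `scan_raw_string` are hypothetical names chosen for illustration; this is not the lexer code the PR adds.

```rust
/// One raw string literal scanned from the start of some source text.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
struct RawStr<'a> {
    /// The text between the delimiters, taken verbatim.
    body: &'a str,
    /// Number of `#` characters used on each side.
    hashes: usize,
    /// Total number of bytes the literal occupies in the source.
    len: usize,
}

/// Scan a raw string literal at the start of `src`.
/// Returns `None` if `src` does not start with one or it is unterminated.
fn scan_raw_string(src: &str) -> Option<RawStr<'_>> {
    // The literal starts with `r`, then zero or more `#`, then `"`.
    let rest = src.strip_prefix('r')?;
    let hashes = rest.bytes().take_while(|&b| b == b'#').count();
    let rest = rest[hashes..].strip_prefix('"')?;

    // The body ends at the first `"` followed by (at least) `hashes` `#`.
    let terminator = format!("\"{}", "#".repeat(hashes));
    let end = rest.find(&terminator)?;

    Some(RawStr {
        body: &rest[..end],
        hashes,
        len: 1 + hashes + 1 + end + terminator.len(),
    })
}
```

Given the input `r##"foo #"# bar"## tail`, the sketch returns the body `foo #"# bar` with `hashes == 2`; an input whose closing quote is not followed by the matching run of `#` yields `None`.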

Examples for string literals:

~~~
"foo"; r"foo"; // foo
"\"foo\""; r#""foo""#; // "foo"

"foo #\"# bar";
r##"foo #"# bar"##; // foo #"# bar

"\x52"; "R"; r"R"; // R
"\\x52"; r"\x52"; // \x52
~~~
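
Reviewer note (not part of the patch): the equivalences listed in the examples above, plus the two literals suggested in the review comment, can be checked mechanically. The sketch below assumes present-day Rust, where the raw string syntax is the same; `equivalent_literals` is a hypothetical helper name.

```rust
/// Pairs of (cooked literal, raw literal) that should denote the same string.
fn equivalent_literals() -> Vec<(&'static str, &'static str)> {
    vec![
        // Examples from the documentation hunk above.
        ("foo", r"foo"),
        ("\"foo\"", r#""foo""#),
        ("foo #\"# bar", r##"foo #"# bar"##),
        ("\x52", r"R"),
        ("\\x52", r"\x52"),
        // Examples suggested in the review comment.
        ("foo\\bar", r"foo\bar"),
        (
            "string \"### still in the string",
            r####"string "### still in the string"####,
        ),
    ]
}
```

Comparing the two halves of each pair with `assert_eq!` should succeed for every entry, since a raw literal leaves `\` and any `"` that is not followed by the full `#` run untouched.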

#### Number literals

~~~~~~~~ {.ebnf .gram}
7 changes: 6 additions & 1 deletion doc/tutorial.md
@@ -353,7 +353,12 @@ whose literals are written between single quotes, as in `'x'`.
Just like C, Rust understands a number of character escapes, using the backslash
character, such as `\n`, `\r`, and `\t`. String literals,
written between double quotes, allow the same escape sequences.
More on strings [later](#vectors-and-strings).

On the other hand, raw string literals do not process any escape sequences.
They are written as `r##"blah"##`, with a matching number of zero or more `#`
before the opening and after the closing quote, and can contain any sequence of
characters except their closing delimiter. More on strings
[later](#vectors-and-strings).
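
Reviewer note (not part of the patch): because the body may contain anything except its own closing delimiter, the number of `#` has to be chosen to fit the content. A minimal sketch in present-day Rust of picking the smallest workable count follows; `raw_quote` is a hypothetical helper, not something this PR provides.

```rust
/// Wrap `s` in a raw string literal, using the fewest `#` characters
/// that keep every `"` inside `s` from being read as the closing delimiter.
fn raw_quote(s: &str) -> String {
    let mut needed: usize = 0;
    let mut chars = s.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '"' {
            // Count the `#` characters that directly follow this quote.
            let mut run: usize = 0;
            while chars.peek() == Some(&'#') {
                chars.next();
                run += 1;
            }
            // The delimiter needs strictly more `#` than any such run.
            needed = needed.max(run + 1);
        }
    }

    let hashes = "#".repeat(needed);
    format!("r{0}\"{1}\"{0}", hashes, s)
}
```

For the text `foo #"# bar` this produces `r##"foo #"# bar"##`, matching the reference-manual example in this PR, and text without any `"` needs no `#` at all.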

The nil type, written `()`, has a single value, also written `()`.

1 change: 1 addition & 0 deletions src/etc/vim/syntax/rust.vim
@@ -148,6 +148,7 @@ syn match rustFormat display "%%" contained
syn match rustSpecial display contained /\\\([nrt\\'"]\|x\x\{2}\|u\x\{4}\|U\x\{8}\)/
syn match rustStringContinuation display contained /\\\n\s*/
syn region rustString start=+"+ skip=+\\\\\|\\"+ end=+"+ contains=rustTodo,rustFormat,rustSpecial,rustStringContinuation
syn region rustString start='r\z(#*\)"' end='"\z1'

syn region rustAttribute start="#\[" end="\]" contains=rustString,rustDeriving
syn region rustDeriving start="deriving(" end=")" contained contains=rustTrait
2 changes: 1 addition & 1 deletion src/librustc/front/test.rs
@@ -407,7 +407,7 @@ fn mk_test_desc_and_fn_rec(cx: &TestCtxt, test: &Test) -> @ast::Expr {
debug2!("encoding {}", ast_util::path_name_i(path));

let name_lit: ast::lit =
nospan(ast::lit_str(ast_util::path_name_i(path).to_managed()));
nospan(ast::lit_str(ast_util::path_name_i(path).to_managed(), ast::CookedStr));

let name_expr = @ast::Expr {
id: ast::DUMMY_NODE_ID,
2 changes: 1 addition & 1 deletion src/librustc/metadata/creader.rs
@@ -142,7 +142,7 @@ fn visit_view_item(e: @mut Env, i: &ast::view_item) {
let ident = token::ident_to_str(&ident);
let meta_items = match path_opt {
None => meta_items.clone(),
Some(p) => {
Some((p, _path_str_style)) => {
let p_path = Path(p);
match p_path.filestem() {
Some(s) =>
2 changes: 1 addition & 1 deletion src/librustc/metadata/encoder.rs
@@ -1446,7 +1446,7 @@ fn encode_meta_item(ebml_w: &mut writer::Encoder, mi: @MetaItem) {
}
MetaNameValue(name, value) => {
match value.node {
lit_str(value) => {
lit_str(value, _) => {
ebml_w.start_tag(tag_meta_item_name_value);
ebml_w.start_tag(tag_meta_item_name);
ebml_w.writer.write(name.as_bytes());
4 changes: 2 additions & 2 deletions src/librustc/middle/check_const.rs
@@ -86,7 +86,7 @@ pub fn check_pat(v: &mut CheckCrateVisitor, p: @Pat, _is_const: bool) {
match e.node {
ExprVstore(
@Expr { node: ExprLit(@codemap::Spanned {
node: lit_str(_),
node: lit_str(*),
_}),
_ },
ExprVstoreUniq
@@ -120,7 +120,7 @@ pub fn check_expr(v: &mut CheckCrateVisitor,
"disallowed operator in constant expression");
return;
}
ExprLit(@codemap::Spanned {node: lit_str(_), _}) => { }
ExprLit(@codemap::Spanned {node: lit_str(*), _}) => { }
ExprBinary(*) | ExprUnary(*) => {
if method_map.contains_key(&e.id) {
sess.span_err(e.span, "user-defined operators are not \
2 changes: 1 addition & 1 deletion src/librustc/middle/const_eval.rs
@@ -475,7 +475,7 @@ pub fn eval_const_expr_partial<T: ty::ExprTyProvider>(tcx: &T, e: &Expr)

pub fn lit_to_const(lit: &lit) -> const_val {
match lit.node {
lit_str(s) => const_str(s),
lit_str(s, _) => const_str(s),
lit_char(n) => const_uint(n as u64),
lit_int(n, _) => const_int(n),
lit_uint(n, _) => const_uint(n),
2 changes: 1 addition & 1 deletion src/librustc/middle/trans/consts.rs
@@ -71,7 +71,7 @@ pub fn const_lit(cx: &mut CrateContext, e: &ast::Expr, lit: ast::lit)
}
ast::lit_bool(b) => C_bool(b),
ast::lit_nil => C_nil(),
ast::lit_str(s) => C_estr_slice(cx, s)
ast::lit_str(s, _) => C_estr_slice(cx, s)
}
}

2 changes: 1 addition & 1 deletion src/librustc/middle/trans/expr.rs
@@ -705,7 +705,7 @@ fn trans_rvalue_dps_unadjusted(bcx: @mut Block, expr: &ast::Expr,
args.iter().enumerate().map(|(i, arg)| (i, *arg)).collect();
return trans_adt(bcx, repr, 0, numbered_fields, None, dest);
}
ast::ExprLit(@codemap::Spanned {node: ast::lit_str(s), _}) => {
ast::ExprLit(@codemap::Spanned {node: ast::lit_str(s, _), _}) => {
return tvec::trans_lit_str(bcx, expr, s, dest);
}
ast::ExprVstore(contents, ast::ExprVstoreSlice) |
8 changes: 4 additions & 4 deletions src/librustc/middle/trans/tvec.rs
@@ -205,7 +205,7 @@ pub fn trans_slice_vstore(bcx: @mut Block,

// Handle the &"..." case:
match content_expr.node {
ast::ExprLit(@codemap::Spanned {node: ast::lit_str(s), span: _}) => {
ast::ExprLit(@codemap::Spanned {node: ast::lit_str(s, _), span: _}) => {
return trans_lit_str(bcx, content_expr, s, dest);
}
_ => {}
@@ -296,7 +296,7 @@ pub fn trans_uniq_or_managed_vstore(bcx: @mut Block, heap: heap, vstore_expr: &a
heap_exchange => {
match content_expr.node {
ast::ExprLit(@codemap::Spanned {
node: ast::lit_str(s), span
node: ast::lit_str(s, _), span
}) => {
let llptrval = C_cstr(bcx.ccx(), s);
let llptrval = PointerCast(bcx, llptrval, Type::i8p());
@@ -357,7 +357,7 @@ pub fn write_content(bcx: @mut Block,
let _indenter = indenter();

match content_expr.node {
ast::ExprLit(@codemap::Spanned { node: ast::lit_str(s), _ }) => {
ast::ExprLit(@codemap::Spanned { node: ast::lit_str(s, _), _ }) => {
match dest {
Ignore => {
return bcx;
@@ -490,7 +490,7 @@ pub fn elements_required(bcx: @mut Block, content_expr: &ast::Expr) -> uint {
//! Figure out the number of elements we need to store this content

match content_expr.node {
ast::ExprLit(@codemap::Spanned { node: ast::lit_str(s), _ }) => {
ast::ExprLit(@codemap::Spanned { node: ast::lit_str(s, _), _ }) => {
s.len()
},
ast::ExprVec(ref es, _) => es.len(),
2 changes: 1 addition & 1 deletion src/librustc/middle/ty.rs
@@ -3266,7 +3266,7 @@ pub fn expr_kind(tcx: ctxt,
ast::ExprDoBody(*) |
ast::ExprBlock(*) |
ast::ExprRepeat(*) |
ast::ExprLit(@codemap::Spanned {node: lit_str(_), _}) |
ast::ExprLit(@codemap::Spanned {node: lit_str(*), _}) |
ast::ExprVstore(_, ast::ExprVstoreSlice) |
ast::ExprVstore(_, ast::ExprVstoreMutSlice) |
ast::ExprVec(*) => {
2 changes: 1 addition & 1 deletion src/librustc/middle/typeck/check/mod.rs
@@ -2259,7 +2259,7 @@ pub fn check_expr_with_unifier(fcx: @mut FnCtxt,
match expr.node {
ast::ExprVstore(ev, vst) => {
let typ = match ev.node {
ast::ExprLit(@codemap::Spanned { node: ast::lit_str(_), _ }) => {
ast::ExprLit(@codemap::Spanned { node: ast::lit_str(*), _ }) => {
let tt = ast_expr_vstore_to_vstore(fcx, ev, vst);
ty::mk_estr(tcx, tt)
}
4 changes: 2 additions & 2 deletions src/librustdoc/clean.rs
@@ -1008,7 +1008,7 @@ impl Clean<ViewItemInner> for ast::view_item_ {
fn clean(&self) -> ViewItemInner {
match self {
&ast::view_item_extern_mod(ref i, ref p, ref mi, ref id) =>
ExternMod(i.clean(), p.map(|x| x.to_owned()), mi.clean(), *id),
ExternMod(i.clean(), p.map(|&(ref x, _)| x.to_owned()), mi.clean(), *id),
&ast::view_item_use(ref vp) => Import(vp.clean())
}
}
@@ -1114,7 +1114,7 @@ impl ToSource for syntax::codemap::Span {

fn lit_to_str(lit: &ast::lit) -> ~str {
match lit.node {
ast::lit_str(st) => st.to_owned(),
ast::lit_str(st, _) => st.to_owned(),
ast::lit_char(c) => ~"'" + std::char::from_u32(c).unwrap().to_str() + "'",
ast::lit_int(i, _t) => i.to_str(),
ast::lit_uint(u, _t) => u.to_str(),
4 changes: 2 additions & 2 deletions src/librustpkg/util.rs
@@ -406,7 +406,7 @@ impl<'self> Visitor<()> for ViewItemVisitor<'self> {
// ignore metadata, I guess
ast::view_item_extern_mod(lib_ident, path_opt, _, _) => {
let lib_name = match path_opt {
Some(p) => p,
Some((p, _)) => p,
None => self.sess.str_of(lib_ident)
};
debug2!("Finding and installing... {}", lib_name);
@@ -513,7 +513,7 @@ pub fn find_and_install_dependencies(context: &BuildContext,

pub fn mk_string_lit(s: @str) -> ast::lit {
Spanned {
node: ast::lit_str(s),
node: ast::lit_str(s, ast::CookedStr),
span: dummy_sp()
}
}
11 changes: 9 additions & 2 deletions src/libsyntax/ast.rs
@@ -680,11 +680,17 @@ pub enum mac_ {
mac_invoc_tt(Path,~[token_tree],SyntaxContext), // new macro-invocation
}

#[deriving(Clone, Eq, Encodable, Decodable, IterBytes)]
pub enum StrStyle {
CookedStr,
RawStr(uint)
}

pub type lit = Spanned<lit_>;

#[deriving(Clone, Eq, Encodable, Decodable, IterBytes)]
pub enum lit_ {
lit_str(@str),
lit_str(@str, StrStyle),
lit_char(u32),
lit_int(i64, int_ty),
lit_uint(u64, uint_ty),
@@ -862,6 +868,7 @@ pub enum asm_dialect {
#[deriving(Clone, Eq, Encodable, Decodable, IterBytes)]
pub struct inline_asm {
asm: @str,
asm_str_style: StrStyle,
clobbers: @str,
inputs: ~[(@str, @Expr)],
outputs: ~[(@str, @Expr)],
@@ -1027,7 +1034,7 @@ pub enum view_item_ {
// optional @str: if present, this is a location (containing
// arbitrary characters) from which to fetch the crate sources
// For example, extern mod whatever = "github.com/mozilla/rust"
view_item_extern_mod(Ident, Option<@str>, ~[@MetaItem], NodeId),
view_item_extern_mod(Ident, Option<(@str, StrStyle)>, ~[@MetaItem], NodeId),
view_item_use(~[@view_path]),
}
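
Reviewer note (not part of the patch): for readers unfamiliar with the 2013 syntax, the shape of the AST change in this file can be sketched in present-day Rust. `StrStyle`, `Lit`, `str_value` and `to_source` below are simplified stand-ins chosen for illustration, and the pretty-printing function is an assumed example of a consumer that would need the style; the hunks shown here only demonstrate callers that ignore it, as in `lit_str(s, _)`.

```rust
/// Stand-in for the patch's `StrStyle`: how a string literal was written.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StrStyle {
    /// An ordinary `"..."` literal with escapes processed.
    Cooked,
    /// A raw literal; the payload is the number of `#` on each side.
    Raw(usize),
}

/// Simplified stand-in for the literal enum, reduced to two variants.
#[derive(Clone, Debug, PartialEq)]
enum Lit {
    Str(String, StrStyle),
    Int(i64),
}

/// Most consumers only want the string value and ignore the style,
/// mirroring the `lit_str(s, _)` patterns throughout this diff.
fn str_value(lit: &Lit) -> Option<&str> {
    match lit {
        Lit::Str(s, _) => Some(s.as_str()),
        _ => None,
    }
}

/// Assumed example of a consumer that does need the style:
/// turning a literal back into source text.
fn to_source(lit: &Lit) -> String {
    match lit {
        Lit::Str(s, StrStyle::Cooked) => format!("\"{}\"", s.escape_default()),
        Lit::Str(s, StrStyle::Raw(n)) => {
            let hashes = "#".repeat(*n);
            format!("r{0}\"{1}\"{0}", hashes, s)
        }
        Lit::Int(i) => i.to_string(),
    }
}
```

With a value of `a\b`, the cooked form prints as `"a\\b"` while `StrStyle::Raw(1)` prints as `r#"a\b"#`, which is why the style has to travel with the string once both spellings exist.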

6 changes: 3 additions & 3 deletions src/libsyntax/attr.rs
@@ -67,7 +67,7 @@ impl AttrMetaMethods for MetaItem {
match self.node {
MetaNameValue(_, ref v) => {
match v.node {
ast::lit_str(s) => Some(s),
ast::lit_str(s, _) => Some(s),
_ => None,
}
},
@@ -127,7 +127,7 @@ impl AttributeMethods for Attribute {
/* Constructors */

pub fn mk_name_value_item_str(name: @str, value: @str) -> @MetaItem {
let value_lit = dummy_spanned(ast::lit_str(value));
let value_lit = dummy_spanned(ast::lit_str(value, ast::CookedStr));
mk_name_value_item(name, value_lit)
}

@@ -153,7 +153,7 @@ pub fn mk_attr(item: @MetaItem) -> Attribute {

pub fn mk_sugared_doc_attr(text: @str, lo: BytePos, hi: BytePos) -> Attribute {
let style = doc_comment_style(text);
let lit = spanned(lo, hi, ast::lit_str(text));
let lit = spanned(lo, hi, ast::lit_str(text, ast::CookedStr));
let attr = Attribute_ {
style: style,
value: @spanned(lo, hi, MetaNameValue(@"doc", lit)),
18 changes: 12 additions & 6 deletions src/libsyntax/ext/asm.rs
@@ -44,6 +44,7 @@ pub fn expand_asm(cx: @ExtCtxt, sp: Span, tts: &[ast::token_tree])
tts.to_owned());

let mut asm = @"";
let mut asm_str_style = None;
let mut outputs = ~[];
let mut inputs = ~[];
let mut cons = ~"";
@@ -58,8 +59,11 @@ pub fn expand_asm(cx: @ExtCtxt, sp: Span, tts: &[ast::token_tree])
while continue_ {
match state {
Asm => {
asm = expr_to_str(cx, p.parse_expr(),
"inline assembly must be a string literal.");
let (s, style) =
expr_to_str(cx, p.parse_expr(),
"inline assembly must be a string literal.");
asm = s;
asm_str_style = Some(style);
}
Outputs => {
while *p.token != token::EOF &&
@@ -70,7 +74,7 @@ pub fn expand_asm(cx: @ExtCtxt, sp: Span, tts: &[ast::token_tree])
p.eat(&token::COMMA);
}

let constraint = p.parse_str();
let (constraint, _str_style) = p.parse_str();
p.expect(&token::LPAREN);
let out = p.parse_expr();
p.expect(&token::RPAREN);
@@ -93,7 +97,7 @@ pub fn expand_asm(cx: @ExtCtxt, sp: Span, tts: &[ast::token_tree])
p.eat(&token::COMMA);
}

let constraint = p.parse_str();
let (constraint, _str_style) = p.parse_str();
p.expect(&token::LPAREN);
let input = p.parse_expr();
p.expect(&token::RPAREN);
@@ -111,14 +115,15 @@ pub fn expand_asm(cx: @ExtCtxt, sp: Span, tts: &[ast::token_tree])
p.eat(&token::COMMA);
}

let clob = format!("~\\{{}\\}", p.parse_str());
let (s, _str_style) = p.parse_str();
let clob = format!("~\\{{}\\}", s);
clobs.push(clob);
}

cons = clobs.connect(",");
}
Options => {
let option = p.parse_str();
let (option, _str_style) = p.parse_str();

if "volatile" == option {
volatile = true;
@@ -175,6 +180,7 @@ pub fn expand_asm(cx: @ExtCtxt, sp: Span, tts: &[ast::token_tree])
id: ast::DUMMY_NODE_ID,
node: ast::ExprInlineAsm(ast::inline_asm {
asm: asm,
asm_str_style: asm_str_style.unwrap(),
clobbers: cons.to_managed(),
inputs: inputs,
outputs: outputs,