indented multiline strings using """ ... """ #70

Closed
StefanKarpinski opened this Issue Jun 21, 2011 · 19 comments

@StefanKarpinski
The Julia Language member

Stolen from Python but with some differences:

function jabberwock()
  """
  'Twas brillig, and the slithy toves
    Did gyre and gimble in the wabe;
  All mimsy were the borogoves,
    And the mome raths outgrabe.
  """
end

julia> print(jabberwock())
'Twas brillig, and the slithy toves
  Did gyre and gimble in the wabe;
All mimsy were the borogoves,
  And the mome raths outgrabe.

The point of the """ construct is to make it easy to embed snippets of text or code in a readable, nice way inside of Julia code. To that end, it is similar to "here documents" (shell, Perl, Ruby) as well as Python's multiline strings, which use the same delimiters. However, a couple of additional semantics make these more pleasant to use:

  • If the initial """ is on a line followed only by whitespace, that whitespace will be stripped, including the newline.
  • Lines following the opening """ token — up to and including the line on which the closing """ occurs — must begin with the same indentation sequence (identical whitespace characters) as the line on which the opening """ occurs.
  • This common indentation sequence will be stripped from each line of the multiline quote before the content is further processed — conceptually, it is part of the surrounding Julia code, rather than part of the quoted string.

After this whitespace stripping is applied, all normal string interpretation is performed as for a " string, including unescaping and interpolation. Moreover, you can prefix """ with an identifier as you can with " strings to invoke macro-based custom-string forms. Thus, a Q""" multiline string has no interpolation performed and r""" is a handy multiline regex literal (maybe the x extended regex format should always be on).
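As a rough illustration of the three stripping rules listed above, here is a minimal sketch in Python. It is not Julia code and not a description of how any Julia parser works. The function name `strip_block_indent` and its two arguments are hypothetical: `body` is assumed to be the raw text between the opening and closing `"""`, and `indent` is assumed to be the whitespace at the start of the line holding the opening `"""`.

```python
def strip_block_indent(body: str, indent: str) -> str:
    """Apply the proposed stripping rules to the raw text of a block.

    body   -- raw text between the opening and closing delimiters (assumed input)
    indent -- indentation of the line containing the opening delimiter (assumed input)
    """
    first, newline, rest = body.partition("\n")
    if not newline:
        # Single-line string: nothing to strip.
        return body

    out = []
    # Rule 1: if only whitespace follows the opening delimiter, drop it
    # together with the newline; otherwise keep that first fragment.
    if first.strip() != "":
        out.append(first)

    # Rules 2 and 3: every following line, including the one holding the
    # closing delimiter, must start with `indent`, which is then removed.
    for number, line in enumerate(rest.split("\n"), start=2):
        if not line.startswith(indent):
            raise ValueError(f"line {number} does not start with the block indentation")
        out.append(line[len(indent):])

    return "\n".join(out)
```

Under these assumptions, the body of the `jabberwock` example with a two-space `indent` would come out unindented, with the relative indentation of the second and fourth lines preserved. Read strictly, the rules also reject an empty line inside the block, which is something a real implementation would probably want to relax.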

@ViralBShah
The Julia Language member

Why do we have two issues for multiline quotes? I like ###. I think it is a good idea not to introduce a new symbol for quotes. Perhaps some of the rules can be applied, though.

@JeffBezanson
The Julia Language member

This is for strings, not comments.

@StefanKarpinski
The Julia Language member

There were accidentally two identical issues opened for this, but I think Viral is talking about the one for comments vs. this one which is for strings. Having a real multiline string syntax is really handy, which is why all of Perl, Python and Ruby have them. It's not essential, however, which is why this is a v2.0 issue. I just wanted to get some thoughts down regarding the matter while I had them on my mind.

@seanjensengrey

I think it would be nice if comments and strings could be handled similarly. I could see cases where comments could be meta data for downstream tools.

Regardless, http://en.wikipedia.org/wiki/Here_document (multi-line strings) is a huge huge win for

  • including data
  • including DSLs (OpenCL kernels, SQL, etc)

Java and C really got this wrong.

@JeffreySarnoff

Multiline strings can wait for version 2.
Multiline comments are much more pressing.

I like {: :} (it looks like Julia to me).

@pao
The Julia Language member

Have I missed something? Multiline strings work fine now as long as you escape inner quotes. While not having to do that would be nice, I think the title here might be a bit confusing on that point.

@JeffreySarnoff

My ignorance -- everything I had seen was written in a way that suggested multi-line strings were not available.
What happens to unassigned multi-line strings that are placed within a load()ed file?
Do they act as multi-line comments, or is there some side effect?

@StefanKarpinski
The Julia Language member

@pao, multiline strings work fine, but they're not ideal for including text in indented code. That's what my proposal for """ multiline strings is about. Imo, no language I know of gets this right since the indentation level of the surrounding code is always ignored, forcing the embedded string to be jarringly unindented.

@pao
The Julia Language member

@StefanKarpinski Understood; I think the bug title isn't quite clear based on Jeffrey's question, not that this isn't a good idea. I just haven't figured out what to retitle it to, or I'd have just changed it. Maybe "better multiline..." or "fancy multiline..."?

@seanjensengrey

Data isn't included in files that often; @StefanKarpinski, I think you might be putting too much aesthetic weight on indented text. Lua gets this pretty much perfect; Python gets it nearly perfect.

In Python you can use """\ and it won't insert a leading newline, so

text_block = """\
this is a block
of text"""

will contain only a single newline, after the word "block", with no other formatting applied. It is easy to know that what you type is what you will get.
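To make the difference concrete, here is a small Python check of the behaviour described above. The variable name `with_leading_newline` is only an illustrative addition.

```python
# With the backslash, the newline right after the opening quotes is a
# line continuation and does not become part of the string.
text_block = """\
this is a block
of text"""

# Without the backslash, the string starts with a newline.
with_leading_newline = """
this is a block
of text"""

assert text_block == "this is a block\nof text"
assert with_leading_newline == "\nthis is a block\nof text"
assert text_block.count("\n") == 1
```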

If there isn't another low-complexity way of getting verbatim data into the source, maybe an annotation-driven approach, where the contents of files are injected into variables at runtime or startup, would work.

@StefanKarpinski
The Julia Language member

I end up having to include data in code quite often, which is what drove the proposed design of this issue.

@seanjensengrey

How about the native heredoc syntax leaves the string unadulterated and then something like http://docs.python.org/library/textwrap.html can clean up the block if one chooses to indent it?

Embedding data into the code is a very important issue for me as well and I would love to see this make it into Julia.
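For reference, this is roughly what the textwrap route looks like in Python today: the literal is left untouched by the language, and `textwrap.dedent` removes the common leading whitespace afterwards. The `jabberwock` function here is just a Python stand-in for the example at the top of the thread, shortened to two lines.

```python
import textwrap


def jabberwock() -> str:
    # The literal keeps the indentation of the surrounding code;
    # dedent() removes the whitespace common to all lines, and the
    # whitespace-only final line is reduced to an empty one.
    return textwrap.dedent("""\
        'Twas brillig, and the slithy toves
          Did gyre and gimble in the wabe;
        """)


poem = jabberwock()
```

The relative indentation of the second line survives, since only the prefix shared by every line is stripped. The cost is an explicit function call at each use site, which is the trade-off being weighed in this thread.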

@JeffreySarnoff

The textwrap approach is ok; how about a function-encapsulatable text transformer [chain] approach, as Julia has now (focused on numerics)?

Also, when this is decided, if something like this is used

... end

myInlineData = """

"""

Is there any reason not to treat triple-quoted blocks that are present at the top level of a file and unassigned as comments?


@catawbasam

+1 for triple quoted strings. They're great when working with HTML, XML, SQL, JSON etc. Given that Julia is REPL-oriented, it is reasonable to expect users to be using data literals quite a bit.

@timholy
The Julia Language member

At least in files, I've noticed that you can do this:

function test()
    str = 
"Hello,
it's a nice
  day"
end

julia> test()
"Hello,\nit's a nice\n  day"

It's not identical to what Stefan proposed (particularly for the first line), but it gets you most of the way there.

@timholy
The Julia Language member

For clarity, I should have also shown:

julia> print(test())
Hello,
it's a nice
  day

@StefanKarpinski
The Julia Language member

This is a matter of doing some annoying lexer hacking, but one of us should probably bite the bullet and just do it.

@catawbasam

Yeah, the multi-line string is nice -- but the scenario I'm looking at is cutting/pasting text without having to worry about delimiting the quotes, or having to try and read ugly delimited text literals.

As a workaround I've been (mis)using command expressions like below, but feeling kind of guilty about it:

julia> function strescape(myExpr::Expr)
         estr = myExpr.args[2]
         if typeof(estr)==ASCIIString
             return estr
         else
           estr2=strip( estr.args[2] )
           return estr2
         end
       end

julia> s1= :`my "weird" aunt's string`
:( @@cmd "my \"weird\" aunt's string" )

julia> strescape(s1)
"my \"weird\" aunt's string"

[pao: speaking of triple quoting, it works in GitHub Markdown too!]

@vtjnash
The Julia Language member

possibly relevant: c015463#commitcomment-2505245

@nolta added a commit that closed this issue Feb 19, 2013
strip leading whitespace from triple-quoted strings (closes #70)
For example,

    s = """
        a
         b
        """

is now equivalent to "a\n b".
865ea16
@nolta closed this in 865ea16 Feb 19, 2013