here string syntax #2337

Open
JasonMorgan opened this issue Sep 22, 2016 · 13 comments · May be fixed by #17483
Labels
In-PR Indicates that a PR is out for the issue Issue-Discussion the issue may not have a clear classification yet. The issue may generate an RFC or may be reclassif Issue-Enhancement the issue is more of a feature request than a bug KeepOpen The bot will ignore these and not auto-close Up-for-Grabs Up-for-grabs issues are not high priorities, and may be opportunities for external contributors WG-Language parser, language semantics

Comments

@JasonMorgan

Issue

It would be nice if here-strings could close without necessarily needing to be at the start of a line. Specifically, it would be nice if I could indent the closing side of a here-string to match the opening indentation.

ex:

  @"
    Some wacky
    Multiline
    String
  "@

instead of this:

  @"
    Some wacky
    Multiline
    String
"@

Environment data

Name                           Value
----                           -----
PSVersion                      6.0.0-alpha
PSEdition                      Core
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
BuildVersion                   3.0.0.0
GitCommitId                    v6.0.0-alpha.9
CLRVersion
WSManStackVersion              3.0
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
@lzybkr lzybkr added WG-Language parser, language semantics Issue-Discussion the issue may not have a clear classification yet. The issue may generate an RFC or may be reclassif labels Sep 22, 2016
@lzybkr
Member

lzybkr commented Sep 22, 2016

Great idea. I actually implemented this at one point (V2 or V3, I forget) but we didn't end up taking the change.

It is a potential breaking change - though probably an obscure one. The most likely breakage we envisioned was in scripts that generate PowerShell - though that's still a bit of a stretch. Imagine some script like this:

$indent = 4
$t = [regex]::Replace(@"
    `$x = @"
        hello world
    "@
    `$x
"@, "^ {$indent}", "", [System.Text.RegularExpressions.RegexOptions]::Multiline)
iex $t

In other words, maybe the generator wants a nicely indented here string in a here string, and unindents them before executing.

@SteveL-MSFT SteveL-MSFT added Issue-Enhancement the issue is more of a feature request than a bug Up-for-Grabs Up-for-grabs issues are not high priorities, and may be opportunities for external contributors labels Nov 4, 2016
@bergmeister
Contributor

I would also welcome the ability to define a here-string in one line (for convenient copy-paste-ability), because sometimes I have strings with both single quotes and dollar signs in them, where neither single nor double quotes work and I have to escape the special characters.

@lzybkr
Member

lzybkr commented May 22, 2018

It seems weird to call a single line string a here string.

One syntax we discussed but never implemented was very similar to a Rust raw string literal. In PowerShell it might have looked like:

PS> "abc '@' def" -eq @'
abc ''@' def
'@ # Today's here string syntax
True
PS> "abc '@' def" -eq @''abc '@' def''@ # Proposed raw string #1
True
PS> "abc ''@'' def" -eq @'''abc ''@'' def'''@ # Proposed raw string #2
True

The point being that the delimiter is a variable number of characters, so you don't have to worry about escaping or doubling up inside the string literal - just add more quotes to the delimiter when you hit a case where you'd otherwise double up the quote to get a literal quote in the string value.
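
A minimal sketch of how a script generator could pick the delimiter length under this proposal. It assumes the closing delimiter is N single quotes followed by @, and that N only needs to exceed the longest run of quotes that directly precedes an @ in the content. Get-RawStringQuoteCount is a hypothetical helper, not part of PowerShell, and the raw string syntax itself was never implemented:

```powershell
# Hypothetical helper: the raw string syntax it targets is only a proposal.
function Get-RawStringQuoteCount {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyString()]
        [string] $Content
    )

    $longest = 0
    # Find every run of single quotes that is immediately followed by '@'.
    foreach ($match in [regex]::Matches($Content, "'+(?=@)")) {
        if ($match.Length -gt $longest) { $longest = $match.Length }
    }

    # Use one more quote than the longest run, so the content cannot
    # close the literal early.
    $longest + 1
}

Get-RawStringQuoteCount "abc '@' def"     # 2, as in proposed raw string #1
Get-RawStringQuoteCount "abc ''@'' def"   # 3, as in proposed raw string #2
```

Content that begins or ends with a quote is not handled by this sketch.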

@bergmeister
Contributor

bergmeister commented May 22, 2018

That idea of Rust-style raw string literals would be awesome, except that it cannot deal well with strings that start with a (single) quote, so a different character should be chosen in my opinion. Do you think this would be technically difficult to implement?

@lzybkr
Member

lzybkr commented May 22, 2018

New delimiters should be straightforward.

To handle the quote problem - you could emulate markdown which ignores whitespace immediately after/before the delimiter, e.g.:

PS> "'a'" -eq @' 'a' '@
True

This of course might be a problem if you want leading/trailing whitespace in your string, but you'll have this problem no matter what character you choose - single quote, space, etc.

@vexx32
Collaborator

vexx32 commented Jan 8, 2019

Looking at the prior PR, @SteveL-MSFT @lzybkr, it looks like the implementation was rejected due to it being insufficiently thorough.

Can we get a committee decision or recommendation on the desired implementation for this? I note @SteveL-MSFT mentioned that we may want to have a consistent indentation level of whitespace that is snipped at the start of each line, determined by the string terminator, and @iSazonov posited we could use visible characters to define this level as well.

I'm more than happy to attack this problem. It seems to me that it could be handled at the tokenizer level without much difficulty, but I'd like to have a concrete definition of the expectation here.

Would we need an RFC drafted and approved for this first?

@SteveL-MSFT
Member

@vexx32 I think a mini-RFC would be good to cover the desired usage and compatibility. The main concern with the associated PR, which was not approved, is that it was a breaking change if the whitespace was intended. The recommendation is to propose a new token that makes the new behavior explicit and leaves the existing behavior unmodified.

@HumanEquivalentUnit
Contributor

HumanEquivalentUnit commented Mar 27, 2019

To handle the quote problem - you could emulate markdown which ignores whitespace immediately after/before the delimiter, e.g.: PS> "'a'" -eq @' 'a' '@
This of course might be a problem if you want leading/trailing whitespace in your string, but you'll have this problem no matter what character you choose - single quote, space, etc.

Would it be too awful if you had to declare how many delimiters you intended to use?

@[2]'''a'
''@

(using brackets because @2'' clashes with splatting a variable named 2).

or in the end header, there's room for a lot more non-clashing ideas, although it would mean a parsing lookahead:

    @'''a'
    ''@{hereStringDelimiterLength=2; hereStringStyle=IndentedIgnoreWhitespace}@

@Jaykul
Contributor

Jaykul commented Mar 27, 2019

Is it worth pointing out that PowerShell already handles here-strings as starting with just TWO characters? The third character is completely superfluous (mandatory, but totally unnecessary). Try writing $x = @"here" and see what PowerShell says:

No characters are allowed after a here-string header but before the end of the line.

The here-string header is just the first two characters, but there's a requirement that the actual content be on the next line.
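
One way to see that message without running anything is to hand the snippet to the public parser API and print the errors it collects. This is only a sketch: the variable names are arbitrary, and the exact error IDs and wording may differ between PowerShell versions:

```powershell
# Parse (without executing) a here-string header that is followed by text
# on the same line, and collect the syntax errors the parser reports.
$source = '$x = @"here"'
$tokens = $null
$errors = $null
$null = [System.Management.Automation.Language.Parser]::ParseInput($source, [ref] $tokens, [ref] $errors)

# Each ParseError carries an ErrorId and a human-readable Message.
$errors | ForEach-Object { '{0}: {1}' -f $_.ErrorId, $_.Message }
```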

Let's talk about the real problem

  1. Why do you want to do this?
  2. Why do you care about where the "@ has to go?

I suspect what's really going on is that people are using editors like VS Code, which does code folding based on indenting instead of braces, so needing to put "@ in the first column breaks the code folding.

We need to address the tooling, not the language.

I mean, I know it's awkward to have strings stuck up against the margins in the code, but I also know that breaking code folding is the main reason it bothers me. If we make the here-strings suddenly allow white space (we should definitely allow it after the @" too, because it's just easier not to worry about invisible things, and it won't break any existing code), the very next day someone's going to realize that what we really wanted was the same amount of leading space that comes before the "@ to be trimmed off of each line in the string, so our multi-line strings are lined up relative to where we are in the code, not relative to the start of the line!

We'll end up seeing stuff like this all over the place in PowerShell code:

  @"
    Some wacky
    Multiline
    String
  "@ -replace '(?m)^\s{2}'

Or worse, people will start indenting with tabs, so they can use spaces "inside" the indent ...

@rkeithhill
Collaborator

VS Code which does code folding based on indenting

Not any more. The PowerShell extension does code folding based on the AST (not indenting) since the v1.10 release thanks to @glennsarti.

@brendan-sherrin

brendan-sherrin commented Feb 5, 2020

  1. Q: Why do you want to do this?
    A: I use here-strings for sending nested JSON events to Splunk. I generate the event formatting in JSON, interpolate the variables into the string, and then use Send-SplunkEvent with the JSON as -inputobject.
  2. Q: Why do you care about where the "@ has to go?
    A: I'm generally doing this a few levels indented, inside a loop or if/then, and it breaks the indenting of that level.

An example is the code below:

$JSONEvent = @"
{
    "accountId": "$($Account)",
    "category": "$($Check.Category)",
    "id": "$($Check.ID)",
    "name": "$($Check.Name)",
    "time": "$($Checkitems.Timestamp)",
    $($Event)
}
"@ #This needs to be at the start of the line, without indentation

This is part of a script to get AWS Trusted Advisor data for each account. I fill in the usual suspects (account, check name, etc.), then drop a munged $Event on the end of the object, as that varies in field count/name for each check. The section the above is a part of is 2 indents in at this point; it'd just be nice to have this stay in the same indent.

#Edit, didn't preserve the code formatting..

@mklement0
Contributor

mklement0 commented Jul 17, 2020

@bergmeister and @lzybkr, I love the idea of a single-line raw string literal, which would also help with the problem #13068 is trying to solve.

Given that this issue is primarily about allowing indented here-string end delimiters, I've created a separate proposal in #13204, based on @lzybkr's ideas.

@michaeltlombardi
Contributor

michaeltlombardi commented May 10, 2022

Was working on something off to the side yesterday that reminded me of this issue. For me, it's not just about the code folding but the readability/scannability of the code; here-strings are the only thing I've run across that mandate breaking indentation behavior and it's jarring to see it and then have to move your eyes back to where everything else is.

That people are writing indentation trimmers for here-strings is indicative of the problem being long-standing.
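
As one illustration of such a trimmer, the sketch below works with today's syntax: the closing '@ stays in column 0 and a multiline regex strips a fixed indent from each line. The function name and the eight-space indent are assumptions made for the example:

```powershell
function Get-Message {
    # The content is indented to match the surrounding code,
    # but the terminator still has to sit in the first column.
    $text = @'
        Some wacky
        Multiline
        String
'@ -replace '(?m)^ {8}'

    $text
}

Get-Message
```

The fixed width has to be kept in sync with the indentation by hand, which is part of why this approach is fragile.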

If a new pairing of symbols is required for indentation-trimming here-strings, maybe we could use one of the following:

Function Get-HereStringExample {
    param()

    begin {}

    process {
        @{
            Current = @'
The here-strings we have now.
'@
            PossibleOne = @~'
                A here-string we could have
            '~@ # The ~ indicates trimming on either end; 
            PossibleTwo = @@'
                A here-string we could have
            '@@ # When there's more than one @, trim indent
        }
    }

    end {}
}

I slightly favor a trim character directive (no real preference on what character is used / the specific of the syntax, more about the usefulness) over adding @s. Consider these scenarios:

    ...
    @~'
        A multi-line string,
        with the leading whitespace trimmed
    '@

A multi-line string,
with the leading whitespace trimmed

    ...
    @'
        A multi-line string,
        with the trailing whitespace trimmed
    '~@

        A multi-line string,
        with the trailing whitespace trimmed

    ...
    @~'
        A multi-line string,
        with the leading & trailing whitespace trimmed
    '~@

A multi-line string,
with the leading & trailing whitespace trimmed

This behavior is similar to the whitespace chomping behavior in some templating languages (go templates, Ruby ERB, jinja, etc) - in those cases, they're trimming whitespace around the tag, but the same principle can be applied to inside the control tag for a PowerShell here-string without much cognitive load, I think.

  • A trim directive on the opening tag trims leading whitespace line by line. The amount to trim is the whitespace before the first non-whitespace character on the first line that contains non-whitespace characters; up to that much whitespace is then trimmed from each remaining line.
  • A trim directive on the closing tag trims all whitespace after the last non-whitespace character.
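
The two rules above can be emulated on an ordinary string today. This is only a sketch under stated assumptions: Remove-HereStringIndent and its -Leading and -Trailing switches are hypothetical names, the closing-tag rule is read as trimming the end of the whole string (a per-line reading is also possible), and line endings are normalized to LF:

```powershell
# Hypothetical helper that emulates the proposed '~' trim directives.
function Remove-HereStringIndent {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyString()]
        [string] $Text,

        [switch] $Leading,
        [switch] $Trailing
    )

    $lines = $Text -split '\r?\n'

    if ($Leading) {
        # Indent width = whitespace before the first non-whitespace character
        # on the first line that contains any non-whitespace characters.
        $width = 0
        foreach ($line in $lines) {
            if ($line -match '^(\s*)\S') {
                $width = $Matches[1].Length
                break
            }
        }

        # Trim up to that much leading whitespace from every line.
        $pattern = '^\s{0,' + $width + '}'
        $lines = $lines | ForEach-Object { $_ -replace $pattern }
    }

    $result = $lines -join "`n"

    if ($Trailing) {
        # Drop all whitespace after the last non-whitespace character.
        $result = $result.TrimEnd()
    }

    $result
}

$sample = @'
        A multi-line string,
          with one line indented further
'@

Remove-HereStringIndent -Text $sample -Leading
```

Because only up to the detected width is removed, the extra two spaces on the second line of the sample survive, so relative indentation inside the string is kept.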

@MartinGC94 MartinGC94 linked a pull request Jun 4, 2022 that will close this issue
22 tasks
@ghost ghost added the In-PR Indicates that a PR is out for the issue label Jun 4, 2022
@microsoft-github-policy-service microsoft-github-policy-service bot added Resolution-No Activity Issue has had no activity for 6 months or more and removed Resolution-No Activity Issue has had no activity for 6 months or more labels Feb 10, 2024
@SteveL-MSFT SteveL-MSFT added KeepOpen The bot will ignore these and not auto-close WG-NeedsReview Needs a review by the labeled Working Group labels Apr 29, 2024
@SteveL-MSFT SteveL-MSFT removed the WG-NeedsReview Needs a review by the labeled Working Group label Apr 29, 2024