Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Syntax for embedding cargo-script manifests #3503

Merged
merged 60 commits into from
Apr 23, 2024

Conversation

epage
Copy link
Contributor

@epage epage commented Sep 26, 2023

Rendered

This is for the T-lang side of #3502

Example:

#!/usr/bin/env cargo

---
[dependencies]
clap = { version = "4.2", features = ["derive"] }
---

use clap::Parser;

#[derive(Parser, Debug)]
#[clap(version)]
struct Args {
    #[clap(short, long, help = "Path to config")]
    config: Option<std::path::PathBuf>,
}

fn main() {
    let args = Args::parse();
    println!("{:?}", args);
}

FCP

@epage epage added the T-lang Relevant to the language team, which will review and decide on the RFC. label Sep 26, 2023
@epage epage mentioned this pull request Sep 26, 2023
@thomcc
Copy link
Member

thomcc commented Sep 27, 2023

In PL/Rust (a Rust subset that works as a postgres procedural language handler) we use a somewhat hacky syntax like (see https://github.com/tcdi/plrust/blob/main/doc/src/dependencies.md for example source)

[dependencies]
rand = "0.8"

[code]
use rand::Rng;
Ok(Some(rand::thread_rng().gen()))

I greatly prefer the approach in this RFC (and will likely push PL/Rust transition to it if the RFC is accepted) but it's probably worth noting as prior art.

@bstrie

This comment was marked as resolved.

@est31
Copy link
Member

est31 commented Sep 27, 2023

If you think this RFC a little bit further, then you could generalize this to make rustc ignore all ``` delimitered blocks, to say that rustc should just ignore all these sections and leave them to other tools similarly to how it does it already for the shebang. And then you'd see how similar triple backtick blocks are to /* */ comments. From there to just adding `/* */` comments to the AST it's only a small step.

IMO it's bad style to add cargo specific extensions to Rust's syntax. Just say that ``` delimitered blocks are ignored by rustc and that parser implementations are suggested to add them.

@epage
Copy link
Contributor Author

epage commented Sep 27, 2023

In PL/Rust (a Rust subset that works as a postgres procedural language handler) we use a somewhat hacky syntax like (see https://github.com/tcdi/plrust/blob/main/doc/src/dependencies.md for example source)

Thanks!

I've added this as prior art and would love more interoperability (one of the stated motivations for the cargo script RFC)

Comment on lines 306 to 316
### Alternative 7: Extended Shebang

````rust
#!/usr/bin/env cargo
# ```cargo
# [dependencies]
# foo = "1.2.3"
# ```

fn main() {}
````
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bstrie (moved here for context / easier to follow)

I would like to consider alternative 7, the extended shebang. I don't think the backticks and the redundant cargo specifier should be necessary, producing this:

#!/usr/bin/env cargo
# [dependencies]
# foo = "1.2.3"

fn main() {}

I think this looks quite good. It's fewer lines than the proposed syntax, and mirrors both the shebang syntax and the attribute syntax.

I understand that people might want this to be generalizable/extensible, but this could suffice for now and any discussion about how to generalize the "info" portion can be left for a future discussion. If people think that it's important to make it generalizable right now, then I'd be interested to hear some concrete use cases.

EDIT: although I suppose an unmentioned downside of this is that # [dependencies] might look a bit too much like an attribute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you mentioned, there is the syntax ambiguity and backticks let us "escape" this block of # lines.

Yes, the cargo specifier is redundant to the reader but each use of cargo is for a different purpose

  • One is for execution
  • One is for parsing

I've noted in the Future Possibilities that we could relax the requirement on having cargo in the infostring in the future. I would lean towards defaulting to cargo rather than parsing the shebang because shebang parsing is messy.

We also wanted to leave the door open (slightly) for adding additional frontmatter blocks, like if we decide to embed lockfiles.

@epage
Copy link
Contributor Author

epage commented Sep 27, 2023

If you think this RFC a little bit further, then you could generalize this to make rustc ignore all ``` delimitered blocks, to say that rustc should just ignore all these sections and leave them to other tools similarly to how it does it already for the shebang. And then you'd see how similar triple backtick blocks are to /* */ comments. From there to just adding `/* */` comments to the AST it's only a small step.

Is this meant to positive suggest something or to point out a slipper slope? I'm not too sure the intent.

IMO it's bad style to add cargo specific extensions to Rust's syntax. Just say that ``` delimitered blocks are ignored by rustc and that parser implementations are suggested to add them.

I'm not seeing how this is different than the suggested future state. We are starting with it being locked to cargo initially as we work out the design / usage and then can remove that restriction which is noted Future Possibilities.

[drawbacks]: #drawbacks

- A new concept for Rust syntax, adding to overall cognitive load
- Requires people escape markdown code fences with an extra backtick which they are likely not used to doing (or aware even exists)
Copy link
Contributor

@jplatte jplatte Sep 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this refer to markdown inside multiline TOML strings? It's not really clear. Do any of Cargo's manifest fields even support markdown syntax?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be referring to general usage outside of Rust source files, for example if I wanted to express this syntax here in this comment I would have to escape the backticks somehow (or indent by four spaces, but that doesn't allow syntax highlighting).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooohh I see. In that case there's also the option of using the (older, I think?) syntax of prefixing the .rs snippet with four spaces instead of ````, though I'm not sure it it's as well-known as code fences, and it has the downside of not supporting language tags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

394387b adds some context. This is about sharing snippets over github or zulip. Since you are putting a code fence inside of a code fence, the outer one needs to use 4 backticks

- Parsers are available to make this work (e.g. `syn`)

Downsides
- The `cargo` macro would need to come from somewhere (`std`?) which means it is taking on `cargo`-specific knowledge
Copy link
Contributor

@bstrie bstrie Sep 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the macro approach, I don't think it would be necessary to embed any Cargo-specific knowledge in std. In every other approach the data here is stored in a glorified comment, which means we're fine if it gets thrown away as far as Rust is concerned. The macro here could simply expand to nothing, and trust that other tooling will parse the macro body as needed (which is easier than it sounds, since the macro body should just be treated as raw tokens rather than anything that needs parsing). Rather than calling it cargo, call it build! or meta! or something. Although I suppose the fact that it will still need to lex to Rust tokens might be limiting compared to a string or a comment, unless we want to make it magical.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added mention of meta! in 42077b7. I don't bother exploring it due to the other problems with macros.

# Future possibilities
[future-possibilities]: #future-possibilities

- Treat `cargo` as the default infostring

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the main reason for not including this in this proposal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I clarified this a little in 842f722

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, we want to start with the absolute minimal approach and see what we feel needs to be relaxed from there (which is backwards compatible) rather than make a lot of assumptions and then regret them,

@Aloso

This comment was marked as resolved.

@kurtbuilds
Copy link

kurtbuilds commented Feb 29, 2024

The question when adding something should always be, "why should it be included?", not "Why not include it?" In this case, the specific question is, why should it become Rust syntax?

I argue above, that it shouldn't. It becomes a 9th (I think?) comment syntax, leading to all kinds of downstream questions. Can I put it elsewhere? Does it work like docstrings? What's the single-line analog of it? All questions that add complexity to learning the language.

You raise an excellent point in your post, though. If there's comments around or before the frontmatter - what impact does that have. It would imply that the frontmatter handler has to know something about comment syntax to handle those cases correctly. At which point, I'd ask the question, why not just put the "frontmatter" block inside existing comment blocks? It doesn't need the full weight of a rust syntax parser, just enough to handle comment blocks - which seems like no more or no less machinery than a --- block parser.


EDIT: There's another prior art discussion that I don't think has been brought up before. There's large precedent for comments being used for somewhat-out-of-band tooling. Directives for formatters and linters is the main one. Putting a Cargo.toml within a plain old comment block fits very nicely within that pattern and mental model.

@epage
Copy link
Contributor Author

epage commented Mar 18, 2024

@dlight

Could this become Rust syntax? Or at least, be recognized by rustc in order to give high quality error messages (or to just ignore it and assume it has been handled by an external tool).

That is what this proposal is about is making --- blocks Rust syntax.

(Sorry if this has been proposed already) I think that there should at least be an idiom to optionally wrap this into a Rust comment, like this:

See https://github.com/epage/rfcs/blob/frontmatter/text/3503-frontmatter.md#alternative-5-regular-comment

@epage
Copy link
Contributor Author

epage commented Mar 18, 2024

@kurtbuilds

At which point, I'd ask the question, why not just put the "frontmatter" block inside existing comment blocks? It doesn't need the full weight of a rust syntax parser, just enough to handle comment blocks - which seems like no more or no less machinery than a --- block parser.

This does not solve any of problems laid out in the RFC

  • We are still adding structure to an unstructured syntax
  • At minimum, we then need to be able to handle the given types of Rust comments
    • Unless we put further restrictions on the comment, we have to deal with not just comments but the full Rust grammar.
    • If we put restrictions on the comment, we are taking something with few rules and added special rules to it, making it harder to intuit and work with as there is a gap in power between how the tool they are using can generally be used and how it can be used in this one case.

EDIT: There's another prior art discussion that I don't think has been brought up before. There's large precedent for comments being used for somewhat-out-of-band tooling. Directives for formatters and linters is the main one. Putting a Cargo.toml within a plain old comment block fits very nicely within that pattern and mental model.

Besides shebang, which has syntactic meaning, the Rust ecosystem doesn't rely on this precedence. For example, rustfmt and clippy use attributes over comments.

@LunarLambda
Copy link

A good argument against embedding the manifest in comments is that it will make it harder for tooling like treesitter (used in various text editors for syntax handling) to handle them properly.

For example, many editors have no issues highlighting code blocks in Markdown with the appropriate language, since code blocks are a separate syntactical element from the rest. However, as of today I think only a couple editors (I'm only aware of VS Code) highlights markdown syntax embedded in Rust doc-comments, which is harder to do since it's embedded inside of an existing syntactical element (comments), which are commonly line-based and not block-based as well. (How many people write /** doc comments?).

Additionally, changing and controlling (e.g. where it can appear, etc.) a separate syntax element on Rust's side of the equation is a lot easier than trying to do the same with comments.

Allowing tools to provide good integration with the feature without technical compromises is an important aspect of this feature's design after all.

@nikomatsakis
Copy link
Contributor

@rfcbot reviewed -- thanks for your patience and hard work on this, @epage

@rfcbot rfcbot added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Apr 11, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented Apr 11, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. label Apr 11, 2024
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Apr 21, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented Apr 21, 2024

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@jsgf
Copy link

jsgf commented Apr 21, 2024

This all looks great and my initial questions and concerns were all addressed by the text of the rfc itself.

My principle concern is that we can support non cargo, both so that non cargo can ignore this and also make use of the machinery for its own purposes. I think the infostrings will be fine though I guess "missing" will default to cargo, or maybe depend on that the interpreter is.

For cargo, isn't config.toml also a plausible thing here? For example for target specific code, or use of special rustc flags?

@kennytm
Copy link
Member

kennytm commented Apr 22, 2024

For cargo, isn't config.toml also a plausible thing here? For example for target specific code, or use of special rustc flags?

I suppose that has been addressed in the Future possibilities part.

#!/usr/bin/env -S cargo +nightly -Zscript
--- Cargo.toml
[dependencies]
serde = "1"
---

--- .cargo/config.toml
[profile.dev]
panic = "abort"
---

use serde::Deserialize;
...

@epage
Copy link
Contributor Author

epage commented Apr 22, 2024

For cargo, isn't config.toml also a plausible thing here? For example for target specific code, or use of special rustc flags?

The important part for this RFC is that we leave room for future expansion if its needed. If nothing else, there is the chance of needing support for Cargo.lock (though I personally hope we can find ways to avoid that)

For what gets supported for embedding, the more relevant RFC is #3502. Specifically, the Cargo team wanted all of this to focus more on an MVP. We don't even support workspaces in the first iteration. On that RFC, configs are worthwhile to discuss as for how likely of a future possibility they are. My quick note on that though is they are unlikely to ever be supported. Configs are environmental and not project configuration. It doesn't make sense to embed them in the project and it mixes up their roles (which are already muddled enough as-is). Instead, the focus should be on rust-lang/cargo#12738. Again, if you want to discuss this further, #3502 is a better place for this.

@jsgf
Copy link

jsgf commented Apr 23, 2024

Really I was just commenting on the wording:

At least for cargo's use cases, the only other file that we would consider supporting is Cargo.lock...

"Only" seems pretty strong there. But you're right that's in the weeds as far as this RFC goes.

Copy link
Member

@joshtriplett joshtriplett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has passed FCP. Merging. Thank you!

@joshtriplett joshtriplett merged commit 130caeb into rust-lang:master Apr 23, 2024
@epage epage deleted the frontmatter branch April 23, 2024 16:18
@traviscross traviscross removed the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label May 1, 2024
@jwnrt
Copy link

jwnrt commented May 4, 2024

I have a question about the alternatives that use macros (4 and 6). These are blocked on:

Either we expose syns lesser parse errors or we skip errors, deferring to rustc’s, but then have the wrong behavior on commands that don’t invoke rustc, like cargo metadata

I agree that syn isn't a good option here but using rustc as the parser seems ideal. Do you happen to know the reason that commands like cargo metadata cannot invoke rustc when they're used on a cargo script file? It seems reasonable to require rustc be installed.

I hope I'm understanding correctly that this would work similarly to how you can run rustc --print crate-name on a file containing ![crate_name = "foo"] and get foo back without rustc doing a full compilation. I thought maybe rustc defaulting to the 2015 edition would cause problems here, but that command works fine on files containing async functions (for example) and also worked on files using dependencies.

(aside: sorry if it's bad etiquette to discuss an RFC after it's accepted, I can take this elsewhere if necessary)

@epage
Copy link
Contributor Author

epage commented May 4, 2024

Some approaches to parsing include

  • syn for which we have to decide between reporting poor errors or sometimes ignoring the manifest when rustc isn't invoked to error on our behalf
  • rustc_lexer (and maybe more parsing mechanisms) this would get in the way of this being focused on any editor tool being able to operate on
  • creating a new mechanism for dumping any attribute with rustc --print. This wasn't considered in the original design. An important aspect is that we be able to edit the embedded manifest. We'd have to write this to include span information of the attribute's string which is quite specialized (since this is a generic attribute dumping feature and attributes can come in many forms) and would require special handling for the print to generate parseable content for cargo.

(aside: sorry if it's bad etiquette to discuss an RFC after it's accepted, I can take this elsewhere if necessary)

In my experience, re-litigating RFCs without there being dramatically new information is not very productive.

@jwnrt
Copy link

jwnrt commented May 4, 2024

We'd have to write this to include span information of the attribute's string which is quite specialized (since this is a generic attribute dumping feature and attributes can come in many forms) and would require special handling for the print to generate parseable content for cargo.

This is definitely an issue. The output of rustc --print would ideally be the "processed" string that doesn't include escaped characters etc, but the span would need to be the "raw" string which could even include things like concat!(...). It doesn't feel impossible to solve, though.

In my experience, re-litigating RFCs without there being dramatically new information is not very productive.

Understood, I really don't want to waste everyone's time. I don't have new information beyond the concerns raised previously and should have gotten involved before the FCP.

I'll stick my concern at the bottom of this comment but don't really want to restart discussion here. I suppose a new RFC would be needed to propose an alternative.


My concern is that this syntax feels unlike any existing concept in Rust. It conflicts with prior art, by which I mean other tools (rustfmt, clippy, rustdoc) that interface using attributes and other embedded languages (asm and markdown) which go into macros and comments/attributes respectively. Subjectively it "feels ugly" which I know doesn't carry much weight, but even in Markdown it feels like a bodge that breaks everywhere outside Zola and Hugo. I do feel that an attribute and a cargo! { ... } macro for convenience would be more explicit and integrate better with other Rust concepts like include_str.

Apologies if this is too critical of the proposal that I know you put a lot of time and work into.

@epage
Copy link
Contributor Author

epage commented May 4, 2024

Have you used it in practice or is this from only reading the RFC?

@jwnrt
Copy link

jwnrt commented May 4, 2024

Just the examples in the RFC. I'm enjoying cargo-script but haven't tried the frontmatter syntax.

@epage
Copy link
Contributor Author

epage commented May 4, 2024

I'd recommend having first hand experience with it as well as making sure you are familiar with all of the background information in the section leading up to describing each of the options.

For context where I'm coming from, I used the doc comment syntax for about 6 months with 4 months for backtick and dash frontmatter each. Just over the last 8 months, I've written about 50 scripts from scratch as I use it to reproduce nearly every bug report I get.

bors added a commit to rust-lang/cargo that referenced this pull request May 6, 2024
fix(toml): Remove unstable rejrected frontmatter syntax for cargo script

### What does this PR try to resolve?

With rust-lang/rfcs#3503 approved, we no longer need to allow easy, high fidelity experiments with alternative cargo script syntax.

### How should we test and review this PR?

### Additional information

We still need to improve the experience for users writing bad syntax but that can come later.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-lang Relevant to the language team, which will review and decide on the RFC. to-announce
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet