Add use and a binary format to WIT.md #141

Merged 13 commits into WebAssembly:main on Feb 3, 2023

alexcrichton
Collaborator

@alexcrichton alexcrichton commented Dec 6, 2022

This commit brings the `WIT.md` file up-to-date with recent developments in WIT, chiefly the `use` syntax. I've added a new "introduction section" which is a higher-level explanation of the format than the lexical/syntactic structure. I've also edited the syntactic structure to clarify a few old references and remove `resource`, `future`, and `stream` types for now. (resources to be added soon I suspect)

Additionally I've added a description of a binary format for WIT based on the component model. This binary format is intended to be the "types only" mode of the `wasm-tools component` tooling, although it doesn't match the current implementation and the `wasm-tools` repo will need to be updated. This representation, however, provides the means by which to understand what it means for a component to have the type of a `world`, namely it's a subtype of the binary representation's component type.

Closes #140

@alexcrichton alexcrichton changed the title Modernize the WIT.md description Add use and a binary format to WIT.md Dec 6, 2022
@alexcrichton
Collaborator Author

I realized after opening that the original title didn't even mention use, which is one of the primary purposes of this PR, so I've updated it.

@fibonacci1729 fibonacci1729 mentioned this pull request Dec 6, 2022
@ricochet

ricochet commented Dec 7, 2022

A few minor nits and one q, otherwise this looks good to me.

Out of scope and not a blocker to merge: It'll be interesting to see the next round of this with URLs. Something we can (one day) do with components that we can't do with many other dependency resolution technologies is link multiple versions. This MVP reasonably doesn't grapple with that.

* Allow preceding `default` keyword
* Remove `export foo: u32`
* Specify more support in `interface-type` to include all the `use-from`
  possibilities such as `id`, `strlit`, and `id 'in' strlit`
Member

@lukewagner lukewagner left a comment

On first pass, lgtm, nice job writing this up!

@alexcrichton
Collaborator Author

I talked with Luke a bit more today, and we concluded that the current idea of slurping up everything into one WIT document is probably not actually going to work. This loses filesystem structure when merging into a document and doesn't actually provide a means by which to name everything in the binary representation within a component. An existing bug from my description here is that there's not a great way to name conflicting interfaces when they are placed in the binary representation.

Another downside is that the source of an import is lost. This is significant when a registry is considered, for example, because when you import something from wasi:random the fact that you imported it from there is lost and no longer meaningful by the time it's merged into one document. From a code-generation perspective this is ok, but from a "module ecosystem of wits" perspective it is not.

I wanted to basically post this here as a status update. I would like to try to come up with a design to fix this but it will require some thinking about how this all works out.

* The top-level item is now a "wit package"
* A package is a collection of documents, and documents are a file
* `use` can happen between interfaces of a file, between interfaces
  across documents in a package, and across packages
* Documents are always a flat list of interfaces/worlds/etc
* The binary encoding now is updated to accommodate restrictions for
  resources, this new flat list, and such.
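
As a rough illustration of the structure in the list above (a package holds documents, and each document is a flat list of interfaces and worlds), here is a minimal Rust sketch. The type names `Package`, `Document`, and `Item` and the `find` helper are hypothetical and are not taken from `wasm-tools`; they only model the shape described in the commit message.

```rust
/// An interface or a world, identified only by name in this sketch.
#[derive(Debug, PartialEq)]
pub enum Item {
    Interface(String),
    World(String),
}

impl Item {
    /// Name of the item regardless of its kind.
    pub fn name(&self) -> &str {
        match self {
            Item::Interface(name) | Item::World(name) => name.as_str(),
        }
    }
}

/// One document (a single file): a flat list of items, no nesting.
pub struct Document {
    pub name: String,
    pub items: Vec<Item>,
}

/// The top-level unit: a collection of documents.
pub struct Package {
    pub documents: Vec<Document>,
}

impl Package {
    /// Looks up an item by document name and item name (hypothetical helper).
    pub fn find(&self, document: &str, item: &str) -> Option<&Item> {
        self.documents
            .iter()
            .find(|d| d.name == document)?
            .items
            .iter()
            .find(|i| i.name() == item)
    }
}
```

Because documents are flat, a lookup only ever needs two names: the document and the item inside it.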
@alexcrichton
Collaborator Author

Ok I've pushed a relatively large update here. After talking with @lukewagner we've settled on something which I think makes sense for all known and predicted use cases. There were a lot of subtle updates here and there so another once-over would be much appreciated!

* Remove strings as targets-for-`use` and instead always use from
  identifiers. This pushes resolution and such up a layer where it's
  going to be happening anyway between packages.

* Move the module that's being used from to the start of `use` to better
  assist with auto-completion in IDEs possibly in the future. This way
  we as humans will type `use foo.bar.{` and an IDE should have enough
  context by that point to auto-complete what to use.

* Add a `pkg` and `self` path anchor which paths start from to indicate
  that they're rooted in the current package or current document.

* Continue to use `.` as a separator between items in a `use` statement.
@alexcrichton
Collaborator Author

Another weekend, more thinking, another Monday, yet more thinking. I've pushed an update to the use syntax now to address some concerns I have plus concerns from the last wit-component meeting last Friday:

  • The "thing that's being used from" is now syntactically placed first. Previously it was `use { ... } from ...`, which isn't amenable to IDE auto-completion, whereas reversing the order should make it easier on IDEs since the set of what-to-autocomplete narrows down as you type characters. This is a far-future concern since IDE support will take a while, but it seems as good a reason as any to not paint the bikeshed that particular shade.

  • I've tried to game out what this looks like in the low-level tooling and have opted to completely remove strings from the `use` syntax. Syntactically this changed but semantically you can do the same thing. What was previously written as `use { a-type } from "wasi:types"` is now written as `use wasi.types.{a-type}`. From a parsing and tooling perspective this clearly identifies `wasi` as an identifier that needs to be resolved to a package by some outer context. For example, registry tooling would be externally configured to say what the `wasi` identifier points to, and that configuration would be communicated to the WIT parsing phase.
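
A minimal sketch of the "outer context" idea from the second bullet, assuming a simple map from dependency identifier to package location. `DependencyConfig`, its methods, and the `registry/wasi` location used in the usage note are made up for illustration and do not reflect how any existing registry tooling is configured.

```rust
use std::collections::HashMap;

/// Hypothetical external configuration: dependency identifier -> package location.
pub struct DependencyConfig {
    packages: HashMap<String, String>,
}

impl DependencyConfig {
    pub fn new() -> Self {
        DependencyConfig {
            packages: HashMap::new(),
        }
    }

    /// Records where the package named `id` lives (the location format is an assumption).
    pub fn register(&mut self, id: &str, location: &str) {
        self.packages.insert(id.to_string(), location.to_string());
    }

    /// Resolves the leading identifier of a dotted path such as `wasi.types`.
    pub fn resolve(&self, path: &str) -> Result<&str, String> {
        // Only the first segment names the dependency; the rest is left to the parser.
        let id = path.split('.').next().unwrap_or("");
        self.packages
            .get(id)
            .map(String::as_str)
            .ok_or_else(|| format!("unknown dependency `{}`", id))
    }
}
```

With `wasi` registered as `registry/wasi`, resolving `wasi.types` yields that location, while an unregistered identifier produces an error instead of a guess.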

@ricochet

The syntactic change from string to package identifiers makes a ton of sense from an implementation perspective. My initial take was that it loses a little bit of the human readability we were hoping to get out of wit. Additionally, too many years in Java gave me an initial knee-jerk reaction to seeing pkg.types.foo as looking a lot like com.mycompany.types.foo. But I got past that :)

The indication of current path via self vs pkg as the root is an unfamiliar paradigm to me. Something else to consider is a dot prefix to indicate local self, e.g. use .types.{errno as my-errno}.

@alexcrichton
Collaborator Author

Yeah the prior draft here used .types for "use from a sibling file called types.wit" which was Python-inspired. The current use of pkg and self is Rust-inspired where Rust has use crate::... and use self::... for similar constructs where crate is the root of the crate and self is the current module.

Once strings were removed from the prior draft, there was no distinction between using from a dependency and using from an interface within the same document; both would have looked like use foo.bar, which is something I wanted to at least initially avoid to prevent confusion if possible. In Rust one of the major things in the 2018 edition, at least from my perspective, was "things make a lot more sense if use paths are always anchored at a known set of roots", for example your dependencies or the crate root.

One of the difficulties with a Python-like interface is name resolution within a file itself, for example:

interface foo {
    type a = u8
}

interface bar {
    use self.foo.a
}

Previously this would have been use foo.a but that can possibly be confused with a foo dependency as well. If the use .foo.a syntax were taken to indicate a sibling foo.wit file I'm not sure what would be left for specifying a local intra-file use.
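
To illustrate the anchoring rule described in this comment, the sketch below classifies a dotted `use` path by its first segment: `pkg` is rooted in the current package, `self` in the current document, and anything else is treated as a dependency name. `UseRoot` and `classify` are hypothetical names, and the `.{...}` item list is assumed to have been stripped beforehand; this is not the parser in `wasm-tools`.

```rust
/// Where the first segment of a `use` path is rooted.
#[derive(Debug, PartialEq)]
pub enum UseRoot<'a> {
    /// `pkg.…`: rooted in the current package.
    Package,
    /// `self.…`: rooted in the current document.
    Document,
    /// Any other identifier: a dependency resolved by outer tooling.
    Dependency(&'a str),
}

/// Splits a dotted path into its root and the remaining segments.
/// Returns `None` when the path has no leading identifier.
pub fn classify(path: &str) -> Option<(UseRoot<'_>, Vec<&str>)> {
    let mut segments = path.split('.');
    let first = segments.next().filter(|s| !s.is_empty())?;
    let root = match first {
        "pkg" => UseRoot::Package,
        "self" => UseRoot::Document,
        other => UseRoot::Dependency(other),
    };
    Some((root, segments.collect()))
}
```

Since every path starts from one of a fixed set of roots, `self.foo.a` and a dependency path such as `foo.a` can never be mistaken for each other.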

Collaborator

@guybedford guybedford left a comment

There's a lot to digest here, but it's a fascinating writeup! Hope I'm not rehashing too much in the comments!

import shared2: pkg.shared2

import foo: interface {
use pkg.shared1.{a-type}
Collaborator

Should this be self.shared1?

Collaborator Author

I don't think so in that this is referencing the shared1.wit file (although it may not be entirely clear here)

Collaborator

@guybedford guybedford Dec 16, 2022

I read this as meaning that a feature of the disambiguation is that one could use any name here, unrelated to the actual WIT file name, along the lines of:

world my-world {
  import whatever: pkg.shared1

  import foo: interface {
    use self.whatever.{a-type}
  }
}

Member

This is a good question, and a bit subtle. What's happening with use is that you're explicitly reaching outside the current world to select an interface and then relying on the machinery of use to add it to the containing world.

If you want to refer to an already imported interface in your same world, I think (but might be wrong, or this might not be implemented yet, so Alex check me on this) you could just use the identifier of the import directly, so, e.g., you could write:

world my-world {
  import shared1: pkg.shared1
  import shared2: pkg.shared2

  import foo: interface {
    type a-type = shared1.a-type
  }
  import bar: interface {
    type other-type = shared2.other-type
  }
}

If that's right, it might be useful to include both examples?

Collaborator Author

It's correct yeah that reaching and importing from a specific named import isn't supported yet. Currently a path of the form foo.bar refers to the foo dependency and the bar document within there, so I think there will be some nuance implementing this.

That being said there's a fair amount of inference about how precisely a world is structured, so it's hopefully unlikely that the desire to do this comes up any time soon. For example unless an interface is being both imported and exported any references to it are unambiguously an import.

Member

Thinking about how this could be integrated in the future, we'd probably naturally want to allow `foo.bar` (and `shared1.a-type` in the example above), which raises the ambiguity Alex mentions. Given that inter-document-intra-package references are qualified with `pkg.`, perhaps it makes sense to prefix inter-document-inter-package references as well, e.g. with `ext.` or `dep.`. Once we have prefixed external references, then I think that frees up intra-document references to optionally drop the `self.` prefix when it is unambiguous (i.e., when there is no shadowing by an interface- or world-local name), keeping `self.` as an option for when there is ambiguity (shadowing).

Collaborator Author

Ok I've been trying to think about this a good deal recently, and I'd like to propose an alternative. I actually had a really long comment written up about all the various designs for use and trying to rationalize that with my historical experience with the Rust module system, but I ended up throwing it away when I realized my conclusion didn't make sense. The crux of it was that, at least in Rust, names aren't inherited between module namespaces. For example in Rust you have:

type SomeType = u32;

mod foo {
    pub fn foo() -> SomeType { 2 } // ERROR: no name `SomeType` defined
}

where each mod has its own fresh new namespace. Personally I think this is somewhat important because it prevents names from spilling over everywhere and raising the risk of "action at a distance" where an edit in one location breaks something far away.
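
For comparison, here is a variant of the Rust snippet above that does compile, assuming the alias is imported explicitly through the `super` anchor. The alias is also marked `pub` here, purely to keep the sketch free of visibility concerns.

```rust
pub type SomeType = u32;

mod foo {
    // A module does not inherit names from its parent, so the alias has to
    // be brought into scope explicitly via the `super` anchor.
    use super::SomeType;

    pub fn foo() -> SomeType {
        2
    }
}
```

The explicit `use super::SomeType` is the Rust analogue of anchoring a WIT path at a known root rather than relying on an inherited outer namespace.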

Now a reasonable question is what does any of this have to do with worlds here. My thinking is that if we allowed:

world my-world {
  import shared: pkg.shared1

  import foo: interface {
    use shared.{a-type}
  }
}

then why not also allow something along the lines of

interface foo { type t = u32 }

interface bar {
    use foo.{t} // as-is this PR otherwise requires `self.foo.{t}`
}

This feels like it's re-introducing the problem of inheriting namespaces again, however, which I'm worried about enabling. For example if interface blocks inherit their document's outer namespace, why not have documents inherit their package's outer namespace? That gets confusing though because then you wouldn't know at-a-glance what use foo.bar.{t} refers to -- is foo a document in your package, a dependency, or an interface foo in this document?

So one idea I had to perhaps solve this, instead of the world I wrote above consider this instead:

world my-world {
  import shared: pkg.shared1

  import foo: interface {
    use world.shared.{a-type}
  }
}

basically all paths in use start today with either pkg, self, or an identifier which is a dependency. An addition, for worlds, would be a world keyword which refers to the current world. Or we could add something like import.foo.bar or export.foo.bar as well since imports/exports have disjoint namespaces.

That would mean no change from what's implemented today, but planning to use import and export prefixes on imports-from-worlds in the future. How does that sound?

Member

Having explicit import/export prefix be the plan of record makes sense too, so I'm fine just going with that, which I guess resolves this issue.

@alexcrichton
Collaborator Author

The changes described in this PR should now be fully implemented as of bytecodealliance/wasm-tools#867 and will be released to crates.io shortly.

Member

@lukewagner lukewagner left a comment

Re-reading this, it still looks good, just a few comments/nits/questions from me. I see there are still some outstanding questions from @guybedford that it'd be good to close out. Once those are done, perhaps we can have a last call for comments, wait a few days, then merge this and iterate in future PRs/issues like usual.

}
```

This form indicates that the identifier `package` corresponds to some externally

Thanks for the description!
Maybe you want to mention that dependencies at the moment should reside under the folder wit/deps/

Collaborator Author

FWIW the deps/* folder is more of a temporary convention of sorts as implemented by wit-component rather than something intended to be codified here. It's envisioned that this is where registry tooling comes into play and figures things out for you. For example a bindings generation tool would have external configuration about dependency information that wouldn't require a deps/* folder.

@lukewagner
Member

lukewagner commented Feb 2, 2023

Ok, looks like there aren't any open issues and things have been quiet; I guess I'll merge at the end of the week if nothing else pops up.

@alexcrichton
Collaborator Author

Ok I'm going to go ahead and merge this. I'm going to follow up with a few more words about types-in-worlds but that mostly only affects the wasm-encoding. Thanks again everyone for the discussion here!

@alexcrichton alexcrichton merged commit 891c41c into WebAssembly:main Feb 3, 2023
@alexcrichton alexcrichton deleted the udpate-wit branch February 3, 2023 20:07
Development

Successfully merging this pull request may close these issues.

Adding a use directive to *.wit
7 participants