Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for Substitutions #220

Merged
merged 6 commits into from
May 9, 2021
Merged

Conversation

stuebinm
Copy link
Contributor

this adds subsititutions, which work similar to the Dhall.substitutions mechanism of the Haskell dhall package, which take a hashmap of simple types that should be available within dhall:

#[derive(Debug, Clone, Deserialize, Serialize, StaticType, Eq, PartialEq)]
enum Foo {
    X(u64),
    Y(i64),
}

let mut substs = collections::HashMap::new();
substs.insert("Foo".to_string(), Foo::static_type());

assert_eq!(from_str("Foo.X 1")
           .substitute_names(substs)
           .static_type_annotation()
           .parse::<Foo>()
           .unwrap(),
           Foo::X(1)

The idea behind this is to make it easy to add programmer-defined types which may be used in configs for programs, without forcing the program's users to re-import the same type definitions each time (or the programmers to keep the dhall-based definitions in sync with their rust types, since these are now generated automatically).

Note that the current implementation is a bit hacky in places — some types which feel like they should be internal to the dhall create are now used in serde_dhall (even if, strictly speaking, they are not), and instead of implementing substitutions during type checking or in the semantics, it just wraps the AST into additional let-bindings before anything else is done with it — I needed this for a little project of mine and was in a bit of a hurry, but thought I'd still share this here in case you want to merge it (if I find the time, I'll probably clean the code up a bit more in the next week or so).

this adds subsititutions, which work similar to the
`Dhall.substitutions` mechanism of the Haskell dhall package, and which
can be used e.g. like this:

```rust
serde_dhall.from_str(...)
  .substitute_names(hashmap!["Newtype" => static_type])
  .parse::<SimpleType>()?
```

The idea behind this is to make it easy to add programmer-defined types
which may be used in configs for programs, without forcing the program's
users to re-import the same type definitions each time (or the
programmers to keep the dhall-based definitions in sync with their rust
types, since these are now generated automatically).

Caveats so far:
 - makes some of the code ugly (dhall internals are now used in
   serde_dhall/src/lib.rs, for example)
 - haven't tested error messages so far
 - some unecessary copying of strings and static type values
@Nadrieril
Copy link
Owner

Oh this is cool! You're adding custom builtins, that's a very useful feature.
The implementation looks great, I would have done it like this too. I think wrapping in Lets isn't that hacky, it expresses exactly what we want.

I'm less confident about the API. "substitution" is compiler terminology that I don't expect people to be familiar with. I would prefer something like with_builtin_types. And I want to leave the possibility for builtin values and maybe functions later. Here's the API I have in mind, where with_builtin_type adds one entry to the HashMap:

impl Deserializer {
	pub fn with_builtin_type(self, name: String, ty: SimpleType) -> Self {...}
	pub fn with_builtin_types(self, substs: impl IntoIterator<Item=(String, SimpleType)>) -> Self {...}
}

How do you feel about that?

This does three things:
1. rename the substitute_names function into inject_types, and makes
   it accept anything that implements IntoIterator instead of just HashMaps
2. adds an extra function to inject just a single type
3. makes these functions chainable; before, each call to
   substitute_names would discard previous invocations. However, this
   currently comes at the cost of a lot of ugly copying.

also, more tests!
@stuebinm
Copy link
Contributor Author

stuebinm commented May 1, 2021

I like it, especially using IntoIterator rather than just Hashmaps! I also agree about the term "substitution"; I named it that mostly to stay consistent with the haskell implementation (though I guess it's more common in the haskell world to have lots of jargon lying around 🙈)

I've implemented this, though I named the two functions you suggested inject_types and inject_single_type — I feel "inject" describes better what actually happens ("transferring" a rust type into dhall), while with_builtin sounds more like adding a new fundamental type that can't be decomposed further in dhall (like an extra type just for regexes or sth).

However, I'm not really happy with the implementation — I've made these functions chainable so several can be called in a row, but this requires a lot of copying behind the scenes, since all types are still stored in a HashMap in Deserializer, and each call to an inject function duplicates everything (is there some nice immutable collection type we could use here instead of HashMap?). Unfortunately I'm not familiar enough with rust iterators to know if there's some better way — I considered just storing the bare iterators in the Deserializer struct and only collect it once Deserializer::_parse is called, but this idea seems to lead to a lot of Box<Iterator> and trouble because that doesn't implement the Debug trait, so I gave it up for now ...

@Nadrieril
Copy link
Owner

Nadrieril commented May 7, 2021

this requires a lot of copying behind the scenes

You don't need to copy anything, you have full ownership of self so you can insert or extend entries into the HashMap mutably :) Essentially:

pub fn with_builtin_type(mut self, name: String, ty: SimpleType) -> Self {
	self.builtins.insert(name, ty);
}

I feel "inject" describes better what actually happens ("transferring" a rust type into dhall)

inject is better than substitution, but I still don't think someone would know what it does from the name :/

with_builtin sounds more like adding a new fundamental type that can't be decomposed further in dhall (like an extra type just for regexes or sth).

I see what you mean, but builtins don't all have to be magical (e.g. Optional/build isn't). For me "builtin" means "some special value/type provided by the dhall runtime". So I prefer with_builtin_type/with_builtin_types to inject_type. Later we might add other things like with_builtin_value, with_builtin_function, with_builtin_opaque_type, etc, where the name "builtin" maybe is clearer.

at request of Nadrieril.
@stuebinm
Copy link
Contributor Author

stuebinm commented May 8, 2021 via email

@Nadrieril
Copy link
Owner

Cool, thank you!

@Nadrieril Nadrieril merged commit e66b036 into Nadrieril:master May 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants