Use "in" as a field name (despite its keyword status) #31

Closed
chepner opened this Issue Mar 16, 2017 · 10 comments

@chepner
Contributor

chepner commented Mar 16, 2017

Is there any way to produce a record with a field name "in"? (I fear not, since it's a keyword for let expressions.) I was playing with the idea of using Dhall to produce a YAML file (with dhall-to-yaml) for use with Swagger (http://swagger.io), which as part of defining parameters that an API endpoint can accept, uses an object key "in" to specify where a parameter can occur.

For example,

{ swagger = "2.0"
, parameters = {
    foo = {
      name = "foo"
    , in = "path"
    , type = "string"
    }
  }
  ...
}

to produce

    swagger: "2.0"
    parameters:
        foo:
            name: "foo"
            in: "path"
            type: "string" 
    ...

@chepner chepner changed the title from Use "in" as a field name to Use "in" as a field name (despite its keyword status) Mar 16, 2017

@Gabriel439

Collaborator

Gabriel439 commented Mar 16, 2017

There isn't a way to name a field `in`, but there may still be another solution to this problem: extend the dhall-json library to accept a function for transforming record field names.

For example, the dhall library recently added this feature when decoding Dhall records into Haskell records:

https://github.com/Gabriel439/Haskell-Dhall-Library/blob/master/src/Dhall.hs#L428

So it would make sense to do the same for the dhall-json library, too, by parametrizing the dhallToJSON function on the same InterpretOptions record.

However, that change alone would require you to compile your own custom dhall-to-yaml executable, so I could go a step further and add some options to the current dhall-to-yaml executable for common record field transformations, such as stripping off a fixed prefix.
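
To make the "stripping off a fixed prefix" idea concrete, here is a rough sketch using only the base library. It does not use the real dhall or dhall-json API: `stripFieldPrefix`, `renameFields`, and the association-list `Record` type are hypothetical names invented for illustration, and the `_` prefix is just an assumed convention.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- | Hypothetical field-name modifier: drop a fixed prefix when present,
-- otherwise leave the name unchanged.
stripFieldPrefix :: String -> String -> String
stripFieldPrefix prefix name = fromMaybe name (stripPrefix prefix name)

-- | A record modelled as an association list of field names and values.
-- This is a stand-in for whatever representation a converter really uses.
type Record = [(String, String)]

-- | Apply a name modifier to every field of a record.
renameFields :: (String -> String) -> Record -> Record
renameFields modify fields = [ (modify name, value) | (name, value) <- fields ]

main :: IO ()
main = mapM_ print (renameFields (stripFieldPrefix "_") parameter)
  where
    -- "_in" is the assumed workaround spelling for the reserved word "in".
    parameter = [("name", "foo"), ("_in", "path"), ("type", "string")]
```

With a modifier like this, a field written as `_in` on the Dhall side would come out as `in` in the generated output, while names without the prefix pass through untouched.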

@scott-fleischman

scott-fleischman commented Apr 28, 2017

It would also be handy to have a way to put dashes in record names, such as extra-package-dbs and extra-deps used in stack.yaml files.

While a transformation function could work, it may also be beneficial to allow some sort of escaped or quoted record field name, perhaps like the @ verbatim identifiers in C# or string field names in JSON.

@Gabriel439

Collaborator

Gabriel439 commented Apr 28, 2017

I like the idea of quoted record field names. That should be pretty easy to implement and it solves this problem cleanly

@markus1189

Collaborator

markus1189 commented Apr 28, 2017

In Scala you can use backticks; I personally like that as syntax:

val `val` = 1
val `class` = 2
@Gabriel439

Collaborator

Gabriel439 commented Apr 30, 2017

So for @scott-fleischman's use case, I can actually just add `-` as a valid identifier character since there is no binary subtraction operator in Dhall (and I have no desire to add one). That means that `abc-def` is unambiguously an identifier and not `abc - def`.

@Gabriel439

Collaborator

Gabriel439 commented Apr 30, 2017

So I have some new reservations about this proposal after attempting a first pass at implementing this.

Let's assume for simplicity that we begin from @markus1189's proposal and use backticks to escape identifiers. The next decision is: what set of characters do you permit within backticks?

There are three choices that I'm aware of:

  • You permit all Unicode characters (perhaps also with a mechanism to escape backticks), meaning that you can have identifiers like this:

    λ(`â̸̛͈͓̤͙̦̊̃̀͠͝ş̶̛̯̹̮̘̪̯͇̼̑̈́̓̿̌̆͋̔̐͗͛̊͌͝d̸̝̠̳̻͆f̴̢̤̥̮͈̖̰͕̼͍́͑͑̈́̈́̊̚ͅ` : Type)  `â̸̛͈͓̤͙̦̊̃̀͠͝ş̶̛̯̹̮̘̪̯͇̼̑̈́̓̿̌̆͋̔̐͗͛̊͌͝d̸̝̠̳̻͆f̴̢̤̥̮͈̖̰͕̼͍́͑͑̈́̈́̊̚ͅ`  `â̸̛͈͓̤͙̦̊̃̀͠͝ş̶̛̯̹̮̘̪̯͇̼̑̈́̓̿̌̆͋̔̐͗͛̊͌͝d̸̝̠̳̻͆f̴̢̤̥̮͈̖̰͕̼͍́͑͑̈́̈́̊̚ͅ`

    That leads to a risk of people using Unicode to hide or obscure malicious code. Also, it conflicts with the goal that a user should be able to easily understand configuration files in their normalized form

  • You permit the same set of characters but use backticks primarily to avoid conflicting with reserved identifiers

    The problem with this approach is that it doesn't buy you anything new over the existing support for field modifiers when desugaring record fields. For example, I could just surround field names that conflict with reserved identifiers with underscores, like this:

    { _in_ = 2, _Type_ = True }

    ... and then supply a field modification function that detects and strips surrounding underscores

  • You do something in between: permit more characters than before but explicitly whitelist which characters

    The problem with this is that it doesn't require escaped identifiers either. We could just add these characters directly to the supported identifier character set already (like -, for example)

So my inclination here is to still rely on field modifiers instead of verbatim identifiers to solve the original problem, but to add - to the set of permitted identifier characters for @scott-fleischman's use case
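
As a rough illustration of the underscore convention from the second option, a field modification function might look like the sketch below. This is base-only Haskell, not the dhall library's actual API; `stripUnderscores` is a hypothetical name, and the rule that both a leading and a trailing underscore must be present is an assumption.

```haskell
-- | Hypothetical field modifier: if a name is wrapped in one leading and
-- one trailing underscore, remove both; otherwise return it unchanged.
-- The length check keeps "__" from collapsing into an empty field name.
stripUnderscores :: String -> String
stripUnderscores name = case name of
    '_' : rest | length rest > 1, last rest == '_' -> init rest
    _ -> name

main :: IO ()
main = mapM_ (putStrLn . stripUnderscores) ["_in_", "_Type_", "name", "_private"]
```

Here `_in_` and `_Type_` would be emitted as `in` and `Type`, while `name` and `_private` are left alone because they are not wrapped on both sides.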

Gabriel439 added a commit that referenced this issue Apr 30, 2017

Add `-` as a valid identifier character
Part of #31, requested by @scott-fleischman

You can now use `-` in identifier names for all but the first character.
That means that this is now legal code:

```
{ resolver = "lts-8.1", extra-deps = [ "pipes-4.3.1" ] }
```

This primarily serves people who want to use "kebab-case" for their
identifier names

Normally languages do not support dash in identifier names due to
ambiguity with the subtraction operator.  For example, an identifier
like `extra-deps` could be interpreted as `extra` minus `deps`.
However, Dhall does not support subtraction and only supports unary
negation so there is no ambiguity.
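
The "all but the first character" rule can be sketched as a pair of character predicates. This is only an illustration, not the parser's actual code: the exact character classes (ASCII letters and `_` at the start; those plus digits and `-` afterwards) are assumptions, and `isHeadChar`, `isTailChar`, and `isIdentifier` are hypothetical names.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- | Assumed rule for the first character: an ASCII letter or underscore.
isHeadChar :: Char -> Bool
isHeadChar c = isAsciiLower c || isAsciiUpper c || c == '_'

-- | Assumed rule for later characters: the above plus digits and '-'.
isTailChar :: Char -> Bool
isTailChar c = isHeadChar c || isDigit c || c == '-'

-- | A name is an identifier when its first character satisfies the head
-- rule and every remaining character satisfies the tail rule.
isIdentifier :: String -> Bool
isIdentifier []       = False
isIdentifier (c : cs) = isHeadChar c && all isTailChar cs

main :: IO ()
main = mapM_ (print . isIdentifier) ["extra-deps", "abc-def", "-deps"]
```

Under these assumptions `extra-deps` and `abc-def` are accepted as single identifiers, while `-deps` is rejected because the dash is not allowed in the first position.
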
@markus1189

Collaborator

markus1189 commented Apr 30, 2017

I think the argument for approach 2) is that you can have a pipeline into a format that is not Dhall:
dhall ---normalize---> yaml/json/... without having to go through Haskell code at all, just using the dhall command-line compiler. That's at least something that I would like to have.

@Gabriel439

Collaborator

Gabriel439 commented Apr 30, 2017

So for each dhall-* library the command line compiler is provided as a convenience to handle the most common case, but the Haskell API is still the recommended way to customize the behavior

@Gabriel439

Collaborator

Gabriel439 commented Apr 30, 2017

Actually, I've changed my mind on this. I'll support the second option since it seems like an unusually common use case that shouldn't require going to the Haskell API

Gabriel439 added a commit that referenced this issue Apr 30, 2017

Escaped field names. Fixes #31
You can now escape a field or variable name using backticks, like this:

```haskell
    let `let` = 2
in  let `in`  = True
in  { `True` = `let`
    , `Type` = λ(`Kind` : Bool) → `Kind` && `in`
    }
```

The main purpose of this is to support arbitrary record field names when
marshalling Dhall into other file formats (such as JSON).  For example,
this Dhall configuration file:

```haskell
{ swagger = "2.0"
, parameters = {
    foo = {
      name = "foo"
    , `in` = "path"
    , type = "string"
    }
  }
  ...
}
```

... can be used to generate this YAML file:

```yaml
    swagger: "2.0"
    parameters:
        foo:
            name: "foo"
            in: "path"
            type: "string"
```
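
For intuition, the escaping convention can be sketched as a pair of helper functions: one that adds backticks when a label clashes with a reserved name, and one that recovers the plain name a converter would emit as the output key. This is not the code from the commit. `reserved`, `escapeLabel`, and `unescapeLabel` are hypothetical names, and the reserved list is just the set of names used in the example above, not the language's full keyword list.

```haskell
-- | Hypothetical reserved names, taken from the example above.
reserved :: [String]
reserved = ["let", "in", "Type", "Kind", "True"]

-- | Wrap a label in backticks only when it clashes with a reserved name.
escapeLabel :: String -> String
escapeLabel label
    | label `elem` reserved = "`" ++ label ++ "`"
    | otherwise             = label

-- | Strip one pair of surrounding backticks, if present, to recover the
-- plain field name. A bare "``" is left alone to avoid an empty name.
unescapeLabel :: String -> String
unescapeLabel label = case label of
    '`' : rest | length rest > 1, last rest == '`' -> init rest
    _ -> label

main :: IO ()
main = mapM_ (putStrLn . escapeLabel) ["name", "in", "type"]
```

In this sketch only `in` gains backticks, and `unescapeLabel` maps it back to `in`, which is the key that should appear in the generated file.
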
@Gabriel439

Collaborator

Gabriel439 commented Apr 30, 2017

Alright. I created a pull request that implements @markus1189's proposed syntax: #43

Let me know if that solves your use case

@Gabriel439 Gabriel439 closed this in 7db9515 May 7, 2017
