Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prelude: Add standard representation for weakly-typed JSON values #586

Merged
merged 4 commits into from Jun 20, 2019

Conversation

Projects
None yet
6 participants
@Gabriel439
Copy link
Contributor

commented Jun 11, 2019

This adds a new JSON type to the Prelude for representing unstructured JSON data
in Dhall.

The goal is to provide a way for users to have a way to pass through
unstructured data when integrating tools that have schema-free subsets
of their APIs. An example of this is "pass-through" API fields which pass
through arbitrary JSON data without interpreting that data in any way.

For people familiar with Haskell, this newly-added JSON type is the Dhall
analog of Data.Aeson.Value.

This will be paired with a matching change to the dhall-json package
to recognize this type and support converting it back and forth to
Data.Aeson.Values.

Gabriel439 added some commits Jun 10, 2019

Add standard representation for weakly-typed JSON values
This adds a new `JSON` type for representing unstructured JSON data
in Dhall.

The goal is to provide a way for users to have a way to pass through
unstructured data when integrating tools that have schema-free subsets
of their APIs.  An example of this is "pass-through" API fields which pass
through arbitrary JSON data without interpreting that data in any way.

For people familiar with Haskell, this newly-added JSON type is the Dhall
analog of `Data.Aeson.Value`.

This will be paired with a matching change to the `dhall-json` package
to recognize this type and support converting it back and forth to
`Data.Aeson.Value`s.
Fix Prelude lint failures
`./scripts/lint-prelude.sh` needs the `--checksum` flag to `rsync` to
work correctly
→ λ(object : List { mapKey : Text, mapValue : JSON } → JSON)
→ λ(array : List JSON → JSON)
→ λ(bool : Bool → JSON)
→ λ(null : JSON)

This comment has been minimized.

Copy link
@philandstuff

philandstuff Jun 11, 2019

Collaborator

Is there a reason you use a Church encoding rather than a union type such as <null | bool Bool | array List JSON | ... > here? The latter is more what I would have expected. Also, I think it would make some of the files here much simpler?

(This is less a review and more a genuine question about how you choose to write Dhall.)

This comment has been minimized.

Copy link
@joneshf

joneshf Jun 11, 2019

Collaborator

JSON is recursive, and Dhall doesn't support user-land recursive types. The encoding in this PR is one way to encode recursive types in Dhall.

@f-f

f-f approved these changes Jun 11, 2019

Copy link
Member

left a comment

🎉

@singpolyma

This comment has been minimized.

Copy link
Collaborator

commented Jun 11, 2019

@singpolyma

This comment has been minimized.

Copy link
Collaborator

commented Jun 11, 2019

@Nadrieril

This comment has been minimized.

Copy link
Collaborator

commented Jun 11, 2019

Is there a reason for taking the arguments one by one instead of taking a record ? Something like let JSON/Type = forall(JSON: Type) -> forall(constructors: { string: Text -> JSON, ... }) -> JSON. Then the record type could be factored out to reduce duplication

@Gabriel439

This comment has been minimized.

Copy link
Contributor Author

commented Jun 11, 2019

@Nadrieril @philandstuff: The reason for the current representation is because it produces the most compact normal form.

Here is the current proposed normal form:

  λ(JSON : Type)
 λ(string : Text  JSON)
 λ(number : Double  JSON)
 λ(object : List { mapKey : Text, mapValue : JSON }  JSON)
 λ(array : List JSON  JSON)
 λ(bool : Bool  JSON)
 λ(null : JSON)
 object
  [ { mapKey = "foo", mapValue = null }
  , { mapKey = "bar", mapValue = array [ number 1.0, bool True ] }
  ]

Here is with a record of handlers:

  λ(JSON : Type)
 λ ( json
    : { array :
          List JSON  JSON
      , bool :
          Bool  JSON
      , null :
          JSON
      , number :
          Double  JSON
      , object :
          List { mapKey : Text, mapValue : JSON }  JSON
      , string :
          Text  JSON
      }
    )
 json.object
  [ { mapKey = "foo", mapValue = json.null }
  , { mapKey =
        "bar"
    , mapValue =
        json.array [ json.number 1.0, json.bool True ]
    }
  ]

... and here is with a sum type:

  λ(JSON : Type)
 λ ( wrap
    :   < array :
            List JSON
        | bool :
            Bool
        | null
        | number :
            Double
        | object :
            List { mapKey : Text, mapValue : JSON }
        | string :
            Text
        >
       JSON
    )
 wrap
  ( < array :
        List JSON
    | bool :
        Bool
    | null
    | number :
        Double
    | object :
        List { mapKey : Text, mapValue : JSON }
    | string :
        Text
    >.object
    [ { mapKey =
          "foo"
      , mapValue =
          wrap
          < array :
              List JSON
          | bool :
              Bool
          | null
          | number :
              Double
          | object :
              List { mapKey : Text, mapValue : JSON }
          | string :
              Text
          >.null
      }
    , { mapKey =
          "bar"
      , mapValue =
          wrap
          ( < array :
                List JSON
            | bool :
                Bool
            | null
            | number :
                Double
            | object :
                List { mapKey : Text, mapValue : JSON }
            | string :
                Text
            >.array
            [ wrap
              ( < array :
                    List JSON
                | bool :
                    Bool
                | null
                | number :
                    Double
                | object :
                    List { mapKey : Text, mapValue : JSON }
                | string :
                    Text
                >.number
                1.0
              )
            , wrap
              ( < array :
                    List JSON
                | bool :
                    Bool
                | null
                | number :
                    Double
                | object :
                    List { mapKey : Text, mapValue : JSON }
                | string :
                    Text
                >.bool
                True
              )
            ]
          )
      }
    ]
  )

Also, the nice thing about the proposed normal is that you can understand it by ignoring the leading lambdas:

object
  [ { mapKey = "foo", mapValue = null }
  , { mapKey = "bar", mapValue = array [ number 1.0, bool True ] }
  ]

... although the record-based normal form is similar in this regard (albeit giving the appearance that all of the constructors are "qualified").

@Gabriel439

This comment has been minimized.

Copy link
Contributor Author

commented Jun 11, 2019

@singpolyma: Regarding JSON/Type, basically I realized in the course of implementing this that it was a mistake to put Map.dhall at the top-level instead of at ./Map/Type. I plan to put up a separate pull request to make ./Map.dhall a synonym for ./Map/Type, but the main reason I changed my mind is because:

  • Once we support mixed records of types and terms we need a place to store the type and the obvious place is Prelude.JSON.Type
  • We also need to be able to give the type a name when bound to a local variable, and the name JSON is already taken for the package. One possible name to use is JSON/Type (consistent with the path within the Prelude)

Gabriel439 added a commit to dhall-lang/dhall-haskell that referenced this pull request Jun 13, 2019

Add `dhall-json` support for weakly-typed JSON values
Fixes dhall-lang/dhall-lang#409

This matches the standard representation for arbitrary JSON proposed in
dhall-lang/dhall-lang#586
@Gabriel439

This comment has been minimized.

Copy link
Contributor Author

commented Jun 13, 2019

Here's the matching change to the dhall-json package: dhall-lang/dhall-haskell#1007

@Nadrieril

This comment has been minimized.

Copy link
Collaborator

commented Jun 13, 2019

I personally find that the slightly larger normal forms in the encoding with a record of handlers would be worth it, in terms of ease of reading and writing.
I'd go even one step further and say that humans rarely want full normal forms in command outputs; instead, they want e.g. variable names to be preserved and multiline literals to stay multiline. Thus keeping type aliases in command output could be reasonable. In this hypothesis, the encoding with a record becomes a clear win in readability.

@singpolyma

This comment has been minimized.

Copy link
Collaborator

commented Jun 13, 2019

@Gabriel439

This comment has been minimized.

Copy link
Contributor Author

commented Jun 13, 2019

@Nadrieril: Even if we were to keep type aliases in output, it's still not clear to me how wrapping things in a record improves things

@Nadrieril

This comment has been minimized.

Copy link
Collaborator

commented Jun 15, 2019

@Gabriel439 : your example would become:

let JSON/Constructors = \(JSON: Type) -> { string: Text -> JSON, ... })
let JSON/Type = forall(JSON: Type) -> forall(constructors: JSON/Constructors JSON) -> JSON
in \(JSON: Type) -> \(json: JSON/Constructors JSON) -> json.object
  [ { mapKey = "foo", mapValue = json.null }
  , { mapKey =
        "bar"
    , mapValue =
        json.array [ json.number 1.0, json.bool True ]
    }
  ]

Assuming a more complex example with several JSON values, I find it would make the result more legible, since there would be minimal noise around each value. With the other encoding, every value needs a rather long introduction before getting to the actual contents a reader is interested in.

@Nadrieril

This comment has been minimized.

Copy link
Collaborator

commented Jun 15, 2019

I actually have a deeper point in mind by suggesting this encoding. The reason is that I believe a record encoding is strictly more extensible than an uncurried one :
If we were to add a new constructor to the type, with the record encoding most producing code should keep working, but with the uncurried form it would all break. Consuming code would have to be modified the same in both cases. Even writing a conversion function between the versions of the datatype would be easier with records.

So even though we might be pretty sure we won't ever add a constructor to JSON/Type, I want to advocate for record encodings becoming a best practise for Church-like encodings in general because I believe it to be a safer default.

@Gabriel439

This comment has been minimized.

Copy link
Contributor Author

commented Jun 18, 2019

@Nadrieril: So I've been thinking about this and I came to the conclusion to go with the record encoding you suggested, although for a slightly different future-proofing reason: the record encoding is insensitive to the order of JSON constructors (since record field order doesn't matter) and the normal form is not too much larger (it's basically just adding json. before each constructor). For the curried encoding, reordering the constructors is a breaking change.

However, in practice I don't think users need to ever be aware of what encoding we use if they use the constructors exposed by the Prelude.JSON package. In other words, instead of writing:

let JSON/Constructors = \(JSON: Type) -> { string: Text -> JSON, ... })
let JSON/Type = forall(JSON: Type) -> forall(constructors: JSON/Constructors JSON) -> JSON
in \(JSON: Type) -> \(json: JSON/Constructors JSON) -> json.object
  [ { mapKey = "foo", mapValue = json.null }
  , { mapKey =
        "bar"
    , mapValue =
        json.array [ json.number 1.0, json.bool True ]
    }
  ]

... they should be writing:

let JSON = https://prelude.dhall-lang.org/Prelude/JSON

in  JSON.object
    [ { mapKey = "foo", mapValue = JSON.null }
    , { mapKey =
          "bar"
      , mapValue =
          JSON.array [ JSON.number 1.0, JSON.bool True ]
      }
    ]

Gabriel439 added some commits Jun 18, 2019

Use record encoding
... as suggested by @Nadrieril

@Gabriel439 Gabriel439 merged commit 7a2acce into master Jun 20, 2019

1 check passed

hydra Hydra build #15878 of dhall-lang:586:dhall-lang
Details

@Gabriel439 Gabriel439 deleted the gabriel/json_pure_dhall_2 branch Jun 20, 2019

Gabriel439 added a commit to dhall-lang/dhall-haskell that referenced this pull request Jun 20, 2019

Add `dhall-json` support for weakly-typed JSON values (#1007)
Fixes dhall-lang/dhall-lang#409

This matches the standard representation for arbitrary JSON proposed in
dhall-lang/dhall-lang#586

philandstuff added a commit that referenced this pull request Jun 22, 2019

Fix normalization test for prelude/JSON/Type
In #586, after some debate we went with a record encoding rather than a
curried encoding, but we forgot to update a test to reflect this.

philandstuff added a commit that referenced this pull request Jun 23, 2019

Fix normalization test for prelude/JSON/Type (#599)
In #586, after some debate we went with a record encoding rather than a
curried encoding, but we forgot to update a test to reflect this.

Gabriel439 added a commit that referenced this pull request Jun 27, 2019

Gabriel439 added a commit that referenced this pull request Jun 29, 2019

@Gabriel439

This comment has been minimized.

Copy link
Contributor Author

commented Jul 7, 2019

I also discovered another benefit of the record form, which is that it is more readable when it is α-normalized (such as when fetched from cache), since the record field names are preserved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.