
Discussion: Support for importing from JSON #121

Open
Gabriella439 opened this issue Mar 20, 2018 · 51 comments

Comments

@Gabriella439 (Contributor)

One pretty heavily requested feature is importing JSON values directly into Dhall. The most commonly requested reasons for doing this are:

  • Interop with existing JSON infrastructure (i.e. reusing shared JSON configuration files or API endpoints between tools)
  • Taking advantage of Dhall's declarative import system to orchestrate tying together multiple heterogeneous inputs

I'm open to the idea although I probably won't implement it until the import semantics are standardized (any day now :) ). In the meantime, though, I can still gather feedback on whether Dhall should support this feature and to flesh out what it might look like if it were proposed.

This would most likely be similar in spirit to Dhall's existing support for importing raw text using as Text. In other words, you would be able to write ./foo.json as JSON to import a JSON file into Dhall. However, there are still some open issues that need to be resolved.

For those who are in favor of this feature, the easiest way to drive this discussion is to discuss how Dhall should behave when importing the following JSON expressions which cover most of the corner cases that I'm aware of:

[ 1, null ]
[ 1, true ]
[ { "x": 1 }, { "x":1, "y":true } ]

For each of these imports, should Dhall:

  • reject the import?
  • accept the import with a type annotation (i.e. a sum type or Optional type)?
    • if so, what type annotation(s) would allow the import to succeed?
  • both of the above (i.e. be strict without a type annotation and more lenient with a type annotation)?
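For concreteness, here is one hypothetical set of annotations under which the first and third imports could succeed (nothing here is standardized; the null-to-None and missing-field-to-None mappings are assumptions, and the file names are invented):

```dhall
-- Hypothetical: each null becomes None, making the list homogeneous
./example1.json as JSON : List (Optional Natural)

-- Hypothetical: the record missing "y" gets y = None
./example3.json as JSON : List { x : Natural, y : Optional Bool }
```

Under the strictest reading, all three imports would simply be rejected, with or without an annotation.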
@aleator commented Mar 27, 2018

My humble opinion about this is that Dhall should always require a type annotation, regardless of how 'guessable' the imported type is. The rationale is that even though your list stores only ints now, it may end up with booleans later, and it feels inappropriate to implicitly guess when importing files. A better approach would be a separate dhall-guess-the-type-of-this-json tool for this purpose.

Also, I think the only implicit conversion that is sensible is to equate null and missing values with Optional.
Thus [ 1, null ] could be importable using List (Optional Integer) and [ { "x": 1 }, { "x":1, "y":true } ] as List { x : Integer, y : Optional Bool }, while [ 1, true ] would never be importable (at least until Dhall grows enough dependent typing for the type to contain an interpretation function).

@Profpatsch (Member) commented Mar 27, 2018

I don’t think the idea is to guess types. From #326:

./foo.json as JSON : List { name : Text, age : Natural }

@aleator commented Mar 27, 2018

Well, since the options outlined above are "reject" and "accept with type annotation", I thought that there would be a form where a type annotation wasn't necessary, i.e. the type would arise from the imported data. Sorry about my confusion.

@Profpatsch (Member) commented Mar 27, 2018

I think the last section was about how JSON should be typed in Dhall by default (yet still with deterministic rules for what gets which type).

For example the aforementioned

./foo.json as JSON : List { name : Text, age : Natural }

would accept [ { "name": "me", "age": 23 } ] but not
[ { "name": "me", "age": 23, "occupation": "programmer" }, { "name": "Mel" } ], while just

./foo.json as JSON

could accept the latter and would type it as List { name : Text, age : Optional Natural, occupation : Optional Text }.

@Gabriel439 I personally think dynamically adding optionals would only make sense if Dhall can infer the needed fields from usage, which it can’t. Since everywhere else types are not optional (and can’t be inferred) I think this would break consistency (and maybe bring up the expectation that it infers from usage).

@Gabriella439 (Contributor, Author)

Yeah, my personal preference is for a mandatory type signature, too. I just didn't want to bias the discussion at the very beginning. My reasoning is that it would be very weird for this:

[ 1 ]

... to have an inferred type of List Integer, whereas this:

[ 1, null ]

... has an inferred type of List (Optional Integer). Adding an element to a list shouldn't change its type and (like other people mentioned) wouldn't be consistent with other design decisions in Dhall.

However, there is still the question of whether or not Dhall should allow importing this JSON:

[ 1, true ]

... using a type annotation with a sum type like this:

./foo.json as JSON : List < Left : Integer | Right : Bool >

The main downside of that proposal that I'm aware of is that you have to specify what happens if you start nesting sum types or if you have sum types with multiple constructors that wrap the same type. My inclination is to still reject that, but I just wanted to mention it because dhall-to-json does support this in the opposite direction:

$ dhall-to-json
let Either = constructors < Left : Integer | Right : Bool >

in  [ Either.Left 1, Either.Right True ]
<Ctrl-D>
[ 1, true ]
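To make the ambiguity concrete: with two constructors wrapping the same type, the reverse mapping has no unique answer (hypothetical annotation, invented constructor names):

```dhall
-- Given the JSON [ 1 ], should 1 decode to A 1 or B 1?
-- There is no principled way to choose, which is an argument
-- for rejecting such annotations.
./foo.json as JSON : List < A : Natural | B : Natural >
```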

@aleator commented Mar 27, 2018

Well, I was referring to "typed by default" as guessing. I think there are two use cases for dhall-from-json:

  1. To get some static piece of data easily into Dhall. This is, in my mind, best served by an external conversion program, which you run just once.

  2. You want to use different bits of JSON as input to a Dhall program, or the data you are importing changes sometimes. In this case you probably want direct support in Dhall. However, no single JSON snippet is going to be able to tell you the exact shape of the (future) data, so the defaulting mechanism is probably not such a hot idea.

As a final thought, how about adding import plugins to Dhall (using a scheme similar to pandoc's)? You would supply dhall with a program/script that can output Dhall expressions and then import other bits of data through that script. For example, you could do something like:

echo './myCSV using CsvConverter { name : Text, age : Natural }' | dhall --plugin=CsvConverter

This would allow testing different JSON import schemes or interacting with other more task specific data sources. Successful data providing plugins could be merged to dhall after they've seen some real world use. (This could also handle things like dhall-lang/dhall-haskell#292)

@Gabriella439 (Contributor, Author)

Yeah, I like the plugin idea, although I would prefer to do it through the Haskell API instead of the command line

@Profpatsch (Member) commented Mar 28, 2018

Untagged unions should be different from sum types in my opinion.

I wouldn’t have expected dhall-to-json to throw the tags away to be honest.

@aleator commented Mar 28, 2018

Command line vs. Haskell API depends on who you wish to write plugins. I would guess that today most dhall is consumed by Haskell programs and the plugin is easiest and safest to add there.

However, if you use dhall from command line a lot then you'll need to build your own binary. Not a problem for Haskell users but probably a bit of a hurdle for the rest.

@Gabriella439 (Contributor, Author)

Keep in mind that the long term goal of Dhall is language bindings other than Haskell. So ideally there would be a language binding in that user's preferred language that they could use to customize the import resolution behavior.

The main reason I want to avoid a plugin API is that then I have to standardize the semantics and interface for plugins and every Dhall implementation would need to support that standard plugin semantics.

Note that in the long run I don't want users to have to use any binaries at all. The integration with their preferred language should be through a library rather than a subprocess call to some dhall or dhall-to-json executable.

In other words: I agree with the goal that users shouldn't have to build their own binaries, but I believe that the correct solution to that goal is to finish standardizing import semantics in order to create more language bindings rather than make the binaries of one implementation hyper-customizable.

@aleator commented Apr 12, 2018

I ran into another case where having some kind of extended importing would be useful.

I'm using Dhall to describe some course exercises. Some exercises need bibliography links, and all I have is a large BibTeX file. In this case I converted the bibliography, partly and by hand, into Dhall so I could import the required entries.

It would've been nice if I could've imported the .bib file directly. Doing the bib-to-dhall conversion means that the .bib file is no longer the primary data source, and that I need to write a converter from Dhall back to bib to make use of the entries that I converted into Dhall.

Perhaps extending the syntax so that import foo using <dhall-expression> is valid would be a start for doing something like this?

@marcb commented Sep 30, 2018

I've been pondering this for a little while, and I feel that as JSON imports should be typed, but that a tool should exist that takes a corpus of example/expected payloads and produces a type, to ease the use of as JSON.

The tool could also possibly take two corpora, reflecting both valid and error JSON responses, to allow a union type covering both circumstances...

@madjar commented Nov 30, 2018

I've spent the afternoon making a toy json-to-dhall tool (https://gist.github.com/madjar/252c517644c0e13ef28a2a7ca71f5fa4). It's very prototypey code, and supports just the most basic types, as well as optionals and dynamic maps (mapKey/mapValue).

The question is: if I want to transform this into something that's actually useful, where should it live:

  • Some external project?
  • As part of the dhall-json package?
  • In some other form?

@Gabriella439 (Contributor, Author)

@madjar: We want to add this to the language standard and once it's there then it will live in all implementations of the standard using the as JSON keyword (i.e. in the dhall-haskell project, for example). The first step is to review your code and see if that matches how people expect the as JSON feature to behave. I will try to review your code more closely tonight.

The key thing to emphasize is that the standardization process and agreeing upon the desired behavior is the bottleneck here because once it is standardized then I expect it will be pretty straightforward to update the implementation to match.

@madjar commented Nov 30, 2018

@Gabriel439 If you review it closely, then I'll have to apologize for the quality. It was kind of rushed this afternoon. The approach I've taken is the one described in dhall-lang/dhall-json#23 (comment), under "Convert and type check together", which is to recursively traverse both the JSON Value and the Dhall Expr, accepting only values that exactly match the given type.

Having this tool made converting a JSON file and defining its Dhall type quite nice, allowing me to incrementally add the missing parts to the type definition while getting quick feedback.

But I understand that you see this not as tooling, but as part of the language, thus requiring more standardization than "whatever the tool does". I'll familiarize myself with the processes of the project, then.

Thanks!

@Nadrieril (Member)

My opinion on this is that this would be extremely cool.
Regarding type annotations, here is what I had in mind: importing without a type annotation should be OK, because when we write [ "a", "b" ] in Dhall we don't need a type annotation, so requiring one to import a similar bit of JSON seems unnatural. However, without a type annotation the type checker would be very strict and disallow any kind of mixing of types. Essentially, it would parse the JSON as it would a similar Dhall expression.
This has a nice side effect: doing echo "./data.json as JSON" | dhall type would give a type for the JSON payload.
Now when type annotations are added, the data can be more flexible, for example transforming nulls into Nones etc. as mentioned above. But I feel this can be left for a later stage, since it would be rather more complex.
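A sketch of the behavior described above (hypothetical; neither the strict defaulting nor the lenient annotated form is standardized, and the file names are invented):

```dhall
-- Strict, annotation-free: [ "a", "b" ] imports as List Text,
-- exactly as the corresponding Dhall literal would type-check
let names = ./names.json as JSON

-- [ 1, null ] would be rejected without an annotation, but a later,
-- more lenient stage could accept it with one:
let xs = ./mixed.json as JSON : List (Optional Natural)

in  { names = names, xs = xs }
```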

@ari-becker (Collaborator) commented Apr 8, 2019

I agree that a as JSON mechanism should take some kind of type definition as a parameter, instead of trying to magically generate a type from the parsed JSON. I think that such a notion is more "Dhall-ish", which is to say, that input should be type-checked instead of blindly trusted.

With that said, I don't think that the type inputs for as JSON should be statically defined, e.g. let Strings = List Text in ./data.json as (JSON Strings). I think that this inherently limits the value of an as JSON language feature due to the dynamic nature of much JSON output.

Consider, for instance, the dhall-terraform-output script which takes Terraform's JSON output and assembles both a type and a record from that output. Because the record keys are variable, it's not possible to define a Dhall type for arbitrary Terraform JSON output ahead of time (or rather, it is, but it would be fragile). However, this doesn't mean that Terraform's JSON output doesn't follow a predictable pattern, and ideally, upon parsing Terraform's JSON output, it would be best to verify that the JSON output fits that pattern, and possibly even get a type that fits the predicted pattern.

What's the best way to do that? I don't know. Maybe, instead of as (JSON Type), we have some kind of as (JSON (? -> Type)), the idea being that it takes a function that produces a type instead of a static type? I have a feeling that a solution would veer on dependent typing, which the language standard doesn't support (yet?), and that's a Pandora's box of its own.

Probably it would be best to start with as (JSON Type), be strict about accepted input, and do so with the explicit caveat that it's still a partial solution and doesn't expect to be universally useful for any kind of JSON input.

@alexanderkjeldaas

I think requiring type annotations will make it mostly impossible to import large JSON structures like CloudFormation data.

@Gabriella439 (Contributor, Author)

@alexanderkjeldaas: Wouldn't they also fail to import without type annotations? Usually those kinds of JSON files mix records of various types

@antislava

@alexanderkjeldaas : Please, try the (new) json-to-dhall tool (in https://github.com/dhall-lang/dhall-haskell) and share your experience/issues. The tool requires a type annotation (schema) but does support union types and should be able to handle situations when different types are "mixed".

@feliksik

Json-to-dhall is great!

I'm not sure how 'done' it is considered to be? Does this unlock this issue, i.e. creating the syntax for the core language and command-line dhall utility to import X as JSON and import X as YAML?

This would be great!

@feliksik

The decoder idea sounds powerful, and seems a special case of this:

let myJson = ./my.json as Text
let decodeJson : Text -> MyType = ./json.dhalldecoder as decoder
let parsed : MyType = decodeJson myJson

Is that correct?

If this is the case, we can also solve the text manipulation issue via plugins/decoders:
#631

I'm sure there are some concerns here :-)

It is a powerful idea, but the decoding thing may possibly benefit from some more maturation.

How bad would it be to have syntactic sugar like import as X? Sure, it's arbitrary which formats to import, but it seems relatively simple to implement and support.

But how do you deal with the security challenges?

@joneshf (Collaborator) commented Jul 17, 2019

I think the issue in question is #613. In particular, this comment.

@jneira (Collaborator) commented Jul 17, 2019

@feliksik: I would actually be fine baking in language support for JSON specifically instead of waiting for a more general solution. Dhall's future as a language is already intimately tied to its ability to displace JSON/YAML, so I think it's fine to special-case support for JSON

I think we could support custom encoders (including for JSON and YAML) and direct importing only from JSON and YAML, due to their special status.
For example, once the dhall executable supports dhall --to-json (see dhall-lang/dhall-haskell#1096) it would be a little bit weird to have:

/path/to/dhall < import json as { path = "/path/to/dhall", args = "--from-json" }

@Gabriella439 (Contributor, Author)

The biggest issue is that an import could run an arbitrary executable. However, we could do something similar to the referential sanity check (i.e. only local imports can run executables, since they are trusted anyway). After all, we already trust local imports to send environment variables in custom headers (i.e. toMap { Authorization = "token ${env:GITHUB_TOKEN}" }). So I'm not too concerned about that, but I can see people complaining about it if they weren't already familiar with Dhall's threat model.

The second biggest issue is that relying on external executables complicates Dhall's distribution model (compared to native as JSON support). It's no worse than what we have today (since users currently have to separately install {json,yaml}-to-dhall) but it would be a much smoother user experience if JSON support was built into the language. In my experience, ease of distribution has a large impact on adoption rates.

@gregwebs commented Sep 1, 2019

Take a look at how cue is doing this: https://cuelang.org/
They can work with existing YAML to check it or import it.

Meanwhile, I found that although dhall-kubernetes is great, I am spending a massive amount of time converting just a single existing valid configuration from YAML to Dhall. Solely due to this waste of time, I will have to try using cue instead.

TypeScript took off I think in large part because of the ease of transition:

  • One can create separate type definition files and apply them to existing JavaScript libraries that do not come with definitions. You can type all the interfaces you use without changing the code.
  • JavaScript files can be immediately imported into TypeScript (used as TypeScript files), and you can slowly add better type annotations.

Ideally dhall would support not just importing but also applying type definitions to existing files. Either feature though would help work with the existing world.

If Dhall is going to be designed to interoperate with the outside world, then it does need to work with different formats. Plugins need to be able to produce a common data structure that preserves file location information so that users can get good error messages. It might be possible for now to tell users to convert their YAML, etc. to JSON, with the caveat that they won't get good information about the file location of their error.

@Gabriella439 (Contributor, Author)

@gregwebs: We're pretty close. We've had yaml-to-dhall and json-to-dhall for a while now and I think we've worked out most of the design for them. I think the main remaining step is upstreaming them into the language

@singpolyma (Collaborator)

@gregwebs Would neither of the yaml-to-dhall tools work for your use case if you're converting an existing file?

@gregwebs commented Sep 1, 2019

Yes, in theory yaml-to-dhall will work for me; I didn't realize it was in dhall-json. That could make the process of using existing files go from hours to minutes!

In practice, however, it doesn't actually work for dhall-kubernetes. This is because most of the K8s fields are actually optional. dhall-kubernetes is designed so that one writes definitions with the help of defaults, so you will write:

defaults.Deployment //  { metadata = defaults.ObjectMeta }

However, yaml-to-dhall doesn't know about these defaults and complains about missing fields.

I think this is a separate issue, reported here. But there is some relation, since dhall-kubernetes works fine when writing in Dhall but is unable to import YAML. It seems like yaml-to-dhall needs a --defaults flag.

@Nadrieril (Member)

I think we all agree that ./foo.json as JSON : { x : Text, y : List Natural } is unambiguous in the absence of unions and Optionals. Should we standardize just that and decide later if we want to allow more features? Or do we think the design space is unclear enough that we don't want to commit to even that?

@Gabriella439 (Contributor, Author)

@Nadrieril: The ambiguity was not an issue for me. I think this is worth standardizing now

@sjakobi (Collaborator) commented Mar 22, 2020

./foo.json as JSON : { x: Text, y: List Natural }

The use of as here is a bit inconsistent with as Text and as Location. In those cases, the result is actually of type Text or Location. With JSON we're just specifying the source format, but the result has a different type.

How about using e.g. from instead?

./foo.json from JSON : { x: Text, y: List Natural }

Or maybe fromFormat?!

@sjakobi (Collaborator) commented Mar 22, 2020

Another idea:

./foo.json from JSON as { x: Text, y: List Natural }

@Gabriella439 (Contributor, Author)

@sjakobi: Yeah, I like the from JSON as … syntax

@philandstuff (Collaborator)

I prefer

./foo.json from JSON : { x: Text, y: List Natural }

because using a colon to indicate “this thing has that type” is a well-established part of the language (type assertions, empty lists, empty merge, etc.), whereas the existing meaning of as is something else (as stated upthread).

@SiriusStarr (Collaborator)

I would love the ability to import from JSON and agree that the use of : to indicate type is probably the best option.

Would one have to fully specify the type of the JSON to import it, or could one specify only the desired structure (with anything else being polymorphic)? E.g. if I have the JSON

{
  "firstName": "John",
  "lastName": "Smith",
  "age": 27
}

but only care about the name fields, would this import allow you to do

./person.json from JSON : { firstName: Text, lastName: Text }

or would you have to do

./person.json from JSON : { firstName: Text, lastName: Text, age: Natural }

@Profpatsch (Member) commented Mar 26, 2020

Would love to see ./foo.json as JSON : { … } soon (don’t introduce another keyword please, just require a type annotation like with empty lists).

@Gabriella439 (Contributor, Author)

I created a separate issue to track the idea of customizable parsers: #989

... since that's one way we might address this (by implementing JSON support within the language)

@Gabriella439 (Contributor, Author) commented Aug 20, 2020

One way we can make progress on this issue is to split it into two steps:

  • Standardize support for importing a JSON value as a plain Prelude.JSON.Type
  • Standardize a keyword that can convert Prelude.JSON.Type to a more strongly-typed representation
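A sketch of how the two steps could fit together (the as JSON behavior and the fromJSON keyword are hypothetical and invented here; Prelude.JSON.Type itself is real):

```dhall
-- Step 1 (hypothetical): import the file as the weakly-typed
-- Prelude JSON representation, with no annotation required
let raw : (https://prelude.dhall-lang.org/JSON/Type) = ./config.json as JSON

-- Step 2 (hypothetical keyword): refine it into a strongly-typed
-- value, failing at import time if the shape doesn't match
in  fromJSON raw : { name : Text, port : Natural }
```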

@mujx commented Apr 19, 2021

@Gabriel439 If I understood correctly, the issue is blocked by the lack of an implementation / spec?

Seems like an easy way to move forward is to have something like the following (which was already suggested)

./file.json as JSON : FileConfig

which doesn't introduce any new keyword, nor does it need support for Prelude.JSON.Type, since the import would be converted immediately with dhallFromJSON.

@Gabriella439 (Contributor, Author)

@mujx: Yeah, this requires a change to the standard since it cannot be implemented entirely within dhall-json
