Discussion: Support for importing from JSON #121

Open
Gabriel439 opened this Issue Mar 20, 2018 · 15 comments

Gabriel439 commented Mar 20, 2018

One pretty heavily requested feature is importing JSON values directly into Dhall. The most common requested reasons for doing this are:

  • Interop with existing JSON infrastructure (i.e. reusing shared JSON configuration files or API endpoints between tools)
  • Taking advantage of Dhall's declarative import system to orchestrate tying together multiple heterogeneous inputs

I'm open to the idea although I probably won't implement it until the import semantics are standardized (any day now :) ). In the meantime, though, I can still gather feedback on whether Dhall should support this feature and to flesh out what it might look like if it were proposed.

This would most likely be similar in spirit to Dhall's existing support for importing raw text using as Text. In other words, you would be able to write ./foo.json as JSON to import a JSON file into Dhall. However, there are still some open issues that need to be resolved.

For those who are in favor of this feature, the easiest way to drive this discussion is to discuss how Dhall should behave when importing the following JSON expressions which cover most of the corner cases that I'm aware of:

[ 1, null ]
[ 1, true ]
[ { "x": 1 }, { "x":1, "y":true } ]

For each of these imports, should Dhall:

  • reject the import?
  • accept the import with a type annotation (i.e. a sum type or Optional type)?
    • if so, what type annotation(s) would allow the import to succeed?
  • both of the above (i.e. be strict without a type annotation and more lenient with a type annotation)?

aleator commented Mar 27, 2018

My humble opinion is that Dhall should always require a type annotation, regardless of how 'guessable' the imported type is. The rationale is that even though your list stores only ints now, it may end up with booleans later, and it feels inappropriate to guess implicitly when importing files. It would be better to make a separate dhall-guess-the-type-of-this-json-bit tool for this purpose.

Also, I think that the only sensible implicit conversion is to equate null and missing values with Optional.
Thus [ 1, null ] could be importable using List (Optional Integer) and [ { "x": 1 }, { "x":1, "y":true } ] as List { x : Integer, y : Optional Bool }, while [ 1, true ] would never be importable (at least until dhall grows enough dt for the type to contain an interpretation function).
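The rule described above (null and missing fields are accepted only where the annotation says Optional) can be sketched as a small checker. This is only an illustration in Python, not Dhall's implementation: the tuple-based type encoding and the function name `matches` are assumptions made up for this sketch.

```python
import json

# Hypothetical type encoding used only in this sketch (not Dhall's real AST):
#   "Integer", "Bool", "Text"      primitive types
#   ("List", t)                    List t
#   ("Optional", t)                Optional t
#   ("Record", {"x": t, ...})      { x : t, ... }

def matches(value, ty):
    """Return True if a decoded JSON value fits the annotated type."""
    if isinstance(ty, tuple):
        tag, arg = ty
        if tag == "Optional":
            # null maps to the empty Optional; anything else must fit the inner type
            return value is None or matches(value, arg)
        if tag == "List":
            return isinstance(value, list) and all(matches(v, arg) for v in value)
        if tag == "Record":
            if not isinstance(value, dict):
                return False
            if set(value) - set(arg):
                return False  # a field that the annotation does not mention
            for name, field_ty in arg.items():
                if name in value:
                    if not matches(value[name], field_ty):
                        return False
                elif not (isinstance(field_ty, tuple) and field_ty[0] == "Optional"):
                    return False  # a missing field is only allowed when Optional
            return True
        raise ValueError(f"unknown type constructor: {tag}")
    if ty == "Bool":
        return isinstance(value, bool)
    if ty == "Integer":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, int) and not isinstance(value, bool)
    if ty == "Text":
        return isinstance(value, str)
    raise ValueError(f"unknown type: {ty}")


print(matches(json.loads("[1, null]"), ("List", ("Optional", "Integer"))))  # True
print(matches(json.loads("[1, true]"), ("List", ("Optional", "Integer"))))  # False
```

Under these assumptions the third example is also accepted when `y` is annotated as `("Optional", "Bool")`, while `[ 1, true ]` has no annotation in this encoding that accepts it.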

Member

Profpatsch commented Mar 27, 2018

I don’t think the idea is to guess types. From #326:

./foo.json as JSON : List { name : Text, age : Natural }

aleator commented Mar 27, 2018

Well, since the outlined options above are "reject" and "accept with type annotation", I thought that there would be a form where a type annotation wasn't necessary, i.e. the type would arise from the imported data. Sorry about my confusion.

Member

Profpatsch commented Mar 27, 2018

I think the last section was about how JSON should be typed in dhall by default (yet still with deterministic rules for what gets which type).

For example the aforementioned

./foo.json as JSON : List { name : Text, age : Natural }

would accept [ { "name": "me", "age": 23 } ] but not
[ { "name": "me", "age": 23, "occupation": "programmer" }, { "name": "Mel" } ], while just

./foo.json as JSON

could accept the latter and would type it as List { name : Text, age : Optional Natural, occupation : Optional Text }.
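To make the "typed by default" idea concrete, here is a rough Python sketch of one deterministic rule: a field that is missing from some records becomes Optional. The function names are hypothetical, only flat records with Text, Bool and non-negative whole-number fields are handled, and nothing here reflects an agreed Dhall design.

```python
import json

def primitive_type(value):
    """Map a JSON scalar to a Dhall type name (deliberately small subset)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "Bool"
    if isinstance(value, int) and value >= 0:
        return "Natural"
    if isinstance(value, str):
        return "Text"
    raise TypeError(f"unsupported JSON value in this sketch: {value!r}")

def infer_record_list(records):
    """Infer a type for a JSON array of flat objects, as Dhall-like text."""
    seen = {}    # field name -> inferred primitive type
    counts = {}  # field name -> number of records that contain the field
    for record in records:
        for name, value in record.items():
            ty = primitive_type(value)
            if seen.setdefault(name, ty) != ty:
                raise TypeError(f"field {name!r} has conflicting types")
            counts[name] = counts.get(name, 0) + 1
    fields = []
    for name, ty in seen.items():
        if counts[name] < len(records):
            ty = f"Optional {ty}"  # absent somewhere, so it becomes Optional
        fields.append(f"{name} : {ty}")
    return "List { " + ", ".join(fields) + " }"


data = json.loads(
    '[ { "name": "me", "age": 23, "occupation": "programmer" }, { "name": "Mel" } ]'
)
print(infer_record_list(data))
# List { name : Text, age : Optional Natural, occupation : Optional Text }
print(infer_record_list(data[:1]))
# List { name : Text, age : Natural, occupation : Text }
```

The second call shows the cost of inferring from data: dropping one element changes the inferred type of every remaining field.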

@Gabriel439 I personally think dynamically adding optionals would only make sense if Dhall can infer the needed fields from usage, which it can’t. Since everywhere else types are not optional (and can’t be inferred) I think this would break consistency (and maybe bring up the expectation that it infers from usage).

Contributor Author

Gabriel439 commented Mar 27, 2018

Yeah, my personal preference is for a mandatory type signature, too. I just didn't want to bias the discussion at the very beginning. My reasoning is that it would be very weird for this:

[ 1 ]

... to have an inferred type of List Integer, whereas this:

[ 1, null ]

... has an inferred type of List (Optional Integer). Adding an element to a list shouldn't change its type and (like other people mentioned) wouldn't be consistent with other design decisions in Dhall.

However, there is still the question of whether or not Dhall should allow importing this JSON:

[ 1, true ]

... using a type annotation with a sum type like this:

./foo.json as JSON : List < Left : Integer | Right : Bool >

The main downside of that proposal that I'm aware of is that you have to specify what happens if you start nesting sum types or if you have sum types with multiple constructors that wrap the same type. My inclination is to still reject that, but I just wanted to mention it because dhall-to-json does support this in the opposite direction:

$ dhall-to-json
let Either = constructors < Left : Integer | Right : Bool >

in  [ Either.Left 1, Either.Right True ]
<Ctrl-D>
[ 1, true ]
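The ambiguity with constructors that wrap the same type can be sketched as follows. This is a hypothetical Python illustration of decoding an untagged JSON value against a union, not how dhall-to-json or any proposed importer works; the function names and the dict encoding of the union are assumptions.

```python
def matching_constructors(value, alternatives):
    """List the constructors of a union whose payload type fits the JSON value."""
    checks = {
        "Bool": lambda v: isinstance(v, bool),
        # bool is a subclass of int in Python, so exclude it explicitly
        "Integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
        "Text": lambda v: isinstance(v, str),
    }
    return [name for name, ty in alternatives.items() if checks[ty](value)]

def decode_untagged(value, alternatives):
    """Pick the constructor for an untagged value, or fail if the choice is unclear."""
    names = matching_constructors(value, alternatives)
    if len(names) != 1:
        raise ValueError(f"{value!r} matches {len(names)} constructors: {names}")
    return (names[0], value)


either = {"Left": "Integer", "Right": "Bool"}
print([decode_untagged(v, either) for v in [1, True]])
# [('Left', 1), ('Right', True)]

size = {"Width": "Integer", "Height": "Integer"}
print(matching_constructors(3, size))
# ['Width', 'Height']
```

With `< Left : Integer | Right : Bool >` every value has exactly one candidate, but as soon as two constructors wrap the same type the JSON no longer carries enough information to choose, so an importer would have to either reject the value or pick a rule such as "first match wins".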

aleator commented Mar 27, 2018

Well, I was referring to "typed by default" as guessing. I think there are two use cases for dhall-from-json:

  1. To get some static piece of data easily into dhall. This is, in my mind, best served by an external conversion program, which you run just once.

  2. You want to use different bits of json as input to a dhall program, or the data you are importing changes sometimes. Now, in this case you probably want direct support in dhall. However, in this case, no single json snippet is going to be able to tell you what the exact shape of the (future) data will be, so the defaulting mechanism is probably not such a hot idea.

As a final thought, how about adding import plugins (using similar scheme as in pandoc) to dhall? You would supply dhall with a program/script that can output dhall expressions and then import other bits of data through that script. For example, you could do something like:

echo './myCSV using CsvConverter {name:Text,Age:Natural}' | dhall --plugin=CsvConverter

This would allow testing different JSON import schemes or interacting with other more task specific data sources. Successful data providing plugins could be merged to dhall after they've seen some real world use. (This could also handle things like dhall-lang/dhall-haskell#292)

Contributor Author

Gabriel439 commented Mar 28, 2018

Yeah, I like the plugin idea, although I would prefer to do it through the Haskell API instead of the command line.

Member

Profpatsch commented Mar 28, 2018

Untagged unions should be different from sum types in my opinion.

I wouldn’t have expected dhall-to-json to throw the tags away to be honest.


aleator commented Mar 28, 2018

Command line vs. Haskell API depends on who you want to write plugins. I would guess that today most dhall is consumed by Haskell programs, and a plugin is easiest and safest to add there.

However, if you use dhall from the command line a lot then you'll need to build your own binary. That's not a problem for Haskell users, but probably a bit of a hurdle for the rest.

Contributor Author

Gabriel439 commented Mar 28, 2018

Keep in mind that the long term goal of Dhall is language bindings other than Haskell. So ideally there would be a language binding in that user's preferred language that they could use to customize the import resolution behavior.

The main reason I want to avoid a plugin API is that then I have to standardize the semantics and interface for plugins, and every Dhall implementation would need to support those standard plugin semantics.

Note that in the long run I don't want users to have to use any binaries at all. The integration with their preferred language should be through a library rather than a subprocess call to some dhall or dhall-to-json executable.

In other words: I agree with the goal that users shouldn't have to build their own binaries, but I believe that the correct solution to that goal is to finish standardizing import semantics in order to create more language bindings rather than make the binaries of one implementation hyper-customizable.


aleator commented Apr 12, 2018

I ran into another case where having some kind of extended importing would be useful.

I'm using dhall to describe some course exercises. Now, some exercises are in want of bibliography links and all I have is a large bibtex file. In this case I converted the bibliography, partly and by hand, into dhall so I could import the required entries.

It would've been nice if I could've imported the .bib file directly. Doing the bib->dhall conversion means that the .bib file is no longer the primary data source and that I need to write a converter from dhall to bib to make use of the entries that I converted into dhall.

Perhaps extending the syntax so that import foo using <dhall-expression> is valid would be a start for doing something like this?


marcb commented Sep 30, 2018

I've been pondering this for a little while, and I feel that as JSON should be typed, but that a tool should exist that takes a corpus of example / expected payloads and provides a type, to ease utilisation of as JSON.

The tool could also possibly take two corpora that reflect both valid and error JSON responses, so as to allow for a union type to cover both circumstances...


madjar commented Nov 30, 2018

I've spent the afternoon making a toy json-to-dhall tool (https://gist.github.com/madjar/252c517644c0e13ef28a2a7ca71f5fa4). It's very prototypey code, and only supports the most basic types, as well as optionals and dynamic maps (mapKey/mapValue).

The question is: if I want to transform this into something that's actually useful, where should it live:

  • Some external project?
  • As part of the dhall-json package?
  • In some other form?
Contributor Author

Gabriel439 commented Nov 30, 2018

@madjar: We want to add this to the language standard and once it's there then it will live in all implementations of the standard using the as JSON keyword (i.e. in the dhall-haskell project, for example). The first step is to review your code and see if that matches how people expect the as JSON feature to behave. I will try to review your code more closely tonight.

The key thing to emphasize is that the standardization process and agreeing upon the desired behavior is the bottleneck here because once it is standardized then I expect it will be pretty straightforward to update the implementation to match.


madjar commented Nov 30, 2018

@Gabriel439 If you review it closely, then I'll have to apologize for the quality. It was kind of rushed this afternoon. The approach I've taken is the one described in dhall-lang/dhall-json#23 (comment), under "Convert and type check together", which is to recursively traverse both the json Value and the dhall Expr, accepting only values that exactly match the given type.

Having this tool made converting a json file and defining its dhall type quite nice, allowing me to incrementally add the missing parts to the type definition while getting quick feedback.
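For readers who want a feel for "convert and type check together", here is a minimal Python sketch of that kind of traversal. It is not the code from the gist: the tuple-based type encoding and the names `to_dhall` and `render_type` are hypothetical, optionals and dynamic maps are left out, field names are assumed to be plain identifiers, and string quoting is simplified by reusing JSON's own quoting.

```python
import json

# Hypothetical type encoding for this sketch:
#   "Natural", "Bool", "Text", ("List", t), ("Record", {"field": t, ...})

def render_type(ty):
    """Render the sketch's type encoding as Dhall-like type text."""
    if isinstance(ty, str):
        return ty
    tag, arg = ty
    if tag == "List":
        inner = render_type(arg)
        if isinstance(arg, tuple) and arg[0] == "List":
            inner = f"({inner})"  # parenthesize nested applications
        return f"List {inner}"
    if tag == "Record":
        return "{ " + ", ".join(f"{k} : {render_type(t)}" for k, t in arg.items()) + " }"
    raise ValueError(f"unknown type constructor: {tag}")

def to_dhall(value, ty):
    """Walk the JSON value and the type together; fail unless they match exactly."""
    if ty == "Bool" and isinstance(value, bool):
        return "True" if value else "False"
    if ty == "Natural" and isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return str(value)
    if ty == "Text" and isinstance(value, str):
        return json.dumps(value)  # simplification: assumes JSON quoting is acceptable
    if isinstance(ty, tuple) and ty[0] == "List" and isinstance(value, list):
        if not value:
            return f"[] : {render_type(ty)}"  # an empty list needs its annotation
        return "[ " + ", ".join(to_dhall(v, ty[1]) for v in value) + " ]"
    if isinstance(ty, tuple) and ty[0] == "Record" and isinstance(value, dict):
        fields = ty[1]
        if set(value) != set(fields):
            raise TypeError(f"fields {sorted(value)} do not match {sorted(fields)}")
        return "{ " + ", ".join(f"{k} = {to_dhall(value[k], t)}" for k, t in fields.items()) + " }"
    raise TypeError(f"{value!r} does not match type {render_type(ty)}")


person = ("List", ("Record", {"name": "Text", "age": "Natural"}))
print(to_dhall(json.loads('[ { "name": "me", "age": 23 } ]'), person))
# [ { name = "me", age = 23 } ]
```

Because the error names the value and the type that failed to match, a loop of "run, read the error, extend the type" gives the kind of quick feedback described above.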

But I understand that you see this not as tooling, but as part of the language, thus requiring more standardization than "whatever the tool does". I'll familiarize myself with the processes of the project, then.

Thanks!
