Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Object is unordered #368

Closed
robrix opened this issue Mar 1, 2016 · 19 comments
Closed

Object is unordered #368

robrix opened this issue Mar 1, 2016 · 19 comments

Comments

@robrix
Copy link

robrix commented Mar 1, 2016

NB: I’m discussing the Object constructor used in Value. I haven’t yet looked into how this all works for Encoding.

Object is a synonym for HashMap Text Value, and as such is unordered. This is in keeping with JSON.org, at least, which describes objects as unordered.

However, JSON documents are obviously ordered, and that ordering may be important for performance when used with e.g. incremental parsers. Furthermore while one can represent unordered key/value pairs unambiguously with ordered ones, the reverse is not possible (without mapping keys or duplicating them in an array).

An ordered representation for Object would be valuable for my use case for both of these reasons.

@neongreen
Copy link

Encoding is ordered because it's just a Builder under the hood. So, if you write something like

instance ToJSON X where
  toEncoding X{..} = pairs ("b" .= xB <> "a" .= xA)

the order is going to be {"b": ..., "a": ...}.

@neongreen
Copy link

By the way, you can use aeson-pretty to output the keys in the order you want (it only solves a half of the problem, but still).

@robrix
Copy link
Author

robrix commented Mar 1, 2016

Thanks for the quick response! That short-circuits my investigation quite conveniently 😄

@mgsloan
Copy link

mgsloan commented Mar 1, 2016

I don't think it'd be worth breaking aeson API compatibility to add this. It would be cool to have ordering info somehow or other, though. This way, you can consume and modify json, and more easily see what was changed.

At that point, though, you might as well have something like aeson which also preserves layout. This'd make it possible to implement json refactoring.

@phadej
Copy link
Collaborator

phadej commented Mar 4, 2016

Coincidentally there is GetShopTV/swagger2#56, if OrdHashMap proves to be useful, we might split it into own package. It's still impossible to preserve ordering from decoded document, but let see.

@mgsloan
Copy link

mgsloan commented Mar 4, 2016

One interesting approach might be Map Text (Int, Value). This way, ordering can be preserved while still having the uniqueness of keys invariant.

@hvr
Copy link
Member

hvr commented Mar 5, 2016

@mgsloan and what would happen on key removal/addition? is there any invariant (or law) involved for the Int value?

@mgsloan
Copy link

mgsloan commented Mar 5, 2016

@hvr Values with the same Int value get ordered by Text key. It's fine if Ints are missing, they are just there to give priority values that determine the output order. I did consider Rational, for easy insertion between two existing values, but I think that's too clever / inefficient :)

@phadej
Copy link
Collaborator

phadej commented Mar 5, 2016

@hvr, @mgsloan check the OrdHashMap implementation in GetShopTV/swagger2#56 I mentioned. It's basically
(Int, Map k (Int, v)), where Int in pari determine the insertion order, and fst :: Int is the "next" insertion value.

As I mentioned, it helps with serialisation.

Deserialisation into it (and avoiding backwards-incompatible changes in aeson) is trickier. Yet I believe that using the approach (or similar) as in dcoutts-presented binary-(de)/serialisation in ZuriHac 2015, it could be possible:

  • productively parse ByteString into [JSONToken] (with ErrorToken) - should be trivial
  • make possible to write parsers for JSONToken i.e.
-- | `Parser` internals obviously changes
class FromJSON a where
    parseJSON :: Value -> Parser a

    fromEncoding :: Parser a
    fromEncoding = fromEncoding >>= parseJSON

My gut feeling says: that will give

  • flexibility
  • boost performance in parsing of primitives: arrays of strings / numbers, maybe even overall.

As I have juggled this idea for a while, writing fromEncoding for records will be tricky exercise, as we don't care about the order. But still, using generic derivation we know which fields we care about, and as parser is monadic, the code size doesn't need to be of size n! (i.e. all permutations statically encoded) :)

@phadej
Copy link
Collaborator

phadej commented Mar 24, 2016

There is http://hackage.haskell.org/package/insert-ordered-containers and I have positive experience with it when serialising Value (you have to be careful to use toEncoding - i.e. define it to your data if instances are written manually)

@bergmark
Copy link
Collaborator

bergmark commented May 9, 2016

It seems to me that there is an existing solution (use toEncoding) and that you can define your own type to do this externally from aeson. Is that correct? Any reason to keep this open?

@robrix
Copy link
Author

robrix commented May 9, 2016

I’d be ok with closing, personally.

@awalterschulze
Copy link

This solves preserving ordering for encoding, but I would like ordering to be preserved on decoding into a Value. Is this possible with aeson? Or is there still someway I can reuse the aeson parser to decode into my own order preserving structure?

@Lysxia
Copy link
Collaborator

Lysxia commented Nov 5, 2017

@awalterschulze Unfortunately that's not possible with aeson. The parser, which discards key order, is pretty opaque so there isn't a good way around this.

@awalterschulze
Copy link

Currently I am using Text.JSON which supports ordered parsing. Am I giving up a lot of speed offered by aeson? Or is this still a performant choice?

I am a noob haskell developer

@Lysxia
Copy link
Collaborator

Lysxia commented Nov 5, 2017

The reliance on lists and String instead of Vector and Text suggest it's not tuned for performance, but I can't tell at a glance how much slower it is or how much of a problem it is for katydid.

You may also like @phadej's suggestion above, of parsing a token stream (preserving a lot of information from the source); he made a PR since then. For validation that seems just right.

@awalterschulze
Copy link

Yes that looks perfect. I'll be looking forward to that merge.

@robrix
Copy link
Author

robrix commented Apr 3, 2019

Bitten by this again, this time in the decoding direction. It looks like @phadej’s RFC for a token parsing PR is stale; have there been any other developments in this space?

@laech
Copy link

laech commented Nov 21, 2021

I have ran into this issue as well, I need to create a JSON formatter as part of a program, so it must preserve the original user document's ordering.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants