Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add useSnakeCasedKeys to JSONDecoder.KeyDecodingStrategy #14039

Closed
wants to merge 1 commit into from

Conversation

norio-nomura
Copy link
Contributor

Adds JSONDecoder.KeyDecodingStrategy.useSnakeCasedKeys.

This will make following code works.

import Foundation

struct S: Codable {
    var myURLProperty: String
}

let encoder = JSONEncoder()
encoder.keyEncodingStrategy = .convertToSnakeCase
let data = try encoder.encode(S(myURLProperty: "property"))
let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .useSnakeCasedKeys
let decoded = try decoder.decode(S.self, from: data)

/cc: @itaiferber

Resolves SR-6629.

@itaiferber
Copy link
Contributor

itaiferber commented Jan 23, 2018

Hi Norio, thanks for putting this together! Unfortunately, this would have to go through internal API review for additional review before we could accept it.

@parkera What do you think about this? SR-6629 is the problem where you can't faithfully round-trip something like "myHTTPLink" → "my_http_link" → "myHttpLink".

Right now, the approach on decode is to look at the JSON payload and apply the transformation of JSON → CodingKey. For a CodingKey like myHTTPLink, a JSON string of my_http_link doesn't transform back into a valid CodingKey. The alternative approach is to look for keys in the JSON which match what we've got in the CodingKeys, e.g. take myHTTPLink and map it to my_http_link and look for that in the JSON. This works because we can apply the "encode" transformation again, but is inconsistent with what would happen if you wrote your own transformation (which necessarily goes from JSON → CodingKey).

(We could add API to do this, like Norio is suggesting, or potentially change the semantics of .convertFromSnakeCase, or just leave things be.)

@hartbit
Copy link
Contributor

hartbit commented Apr 30, 2018

@itaiferber For what's its worth, I tried using convertToSnakeCase/convertFromSnakeCase for the first time today and was really surprised that the behaviour was not what @norio-nomura is proposing. Converting the CodingKeys to snake_case instead of converting the JSON keys to camelCase seems like the only solution to avoid all those round-trip errors.

Edit: The fact that a change in key decoding strategy can break round-trip conversion feels like a serious issue to me. I would go as far as "correcting" the convertToSnakeCase strategy in 4.2, even if it slightly breaks backwards compatibility.

@itaiferber
Copy link
Contributor

@hartbit @norio-nomura I forgot to update this PR (and the SR) with some additional discussion and notes for thought. Even if we decide to special-case convertToSnakeCase and convertFromSnakeCase here and perform the conversions in the "favorable" direction, there are still other cases which can't necessarily be fixed.

Consider KeyedDecodingContainer<Key>.allKeys, which maps the contents of what is in the payload to a list of Keys. This is done by iterating the keys in the payload and creating a Key with the value of every found key via CodingKey.init?(stringValue:).

Right now, we do this by having two conversions — Key-to-String (Key.stringValue → case conversion) and String-to-Key (case conversion → Key.init?(stringValue:)). If we "fix" the issue by only using Key-to-String conversions as suggested, there's no real way to map from something already in the payload to a Key. One way to do this could be to require all CodingKeys to be CaseIterable and on every request for allKeys, take Key, iterate every case, converts every key to a string, applies the case conversion, and then iterate all of the keys in the payload and try to look them up in the converted table. This is both expensive, and not necessarily possible as not all CodingKeys can be CaseIterable (think _DictionaryCodingKey and other dynamic keys).

We could decide to "fix" this but still leave allKeys as-is, but this is still bad because you can't rely on iterating through allKeys to really know what's in the payload.


And at that, this is a special case for convertToSnakeCase/convertFromSnakeCase; we could not apply this to strategies which are even more lossy (e.g. .custom, for which we really only have one direction of conversion). There's a balance here between doing the "right" (i.e. magic) thing for some API consumers and being consistent.

One thing to keep in mind, I think, is that these conversions are by nature lossy. There are always going to be edge cases where we've lost too much information to round-trip the keys. The question is where is the balance between preserving more cases and being consistent across the board. I currently favor being consistent here.

@hartbit
Copy link
Contributor

hartbit commented Apr 30, 2018

Seems like a loose-loose situation where there is not great solution.

One way to do this could be to require all CodingKeys to be CaseIterable and on every request for allKeys, take Key, iterate every case, converts every key to a string, applies the case conversion, and then iterate all of the keys in the payload and try to look them up in the converted table.

Couldn't KeyedDecodingContainer cache the converted keys?

How about special casing the snake_case to camelCase conversion to recognise common abbreviations often found in Swift code and uppercasing them as the naming conventions suggests? The obvious that come to mind are: HTTP, URL, UUID, XML, JSON

@itaiferber
Copy link
Contributor

Yes, caching is possible, though unfortunately this doesn't solve the problem for keys which are not iterable.

Recognizing special words is certainly an option (and can be done transparently without updated API); the question is — what abbreviations are important enough to make it into that list? 🙂 All the ones you listed, I agree with. If we're doing HTTP, should we do HTTPS? (And what about other schemes?) It gets a bit hairier... What if people have other abbreviations they want that are not on our list? One option is to make an extensible list of known initialisms that anyone can extend in their process and just do that.

(I'm not trying to say that just because there isn't a "best" solution we shouldn't do anything; it's just about trying to figure out the best way going forward.)

Another solution is simply exposing JSONEncoder.KeyEncodingStrategy.toSnakeCase() and JSONDecoder.KeyDecodingStrategy.fromSnakeCase() and letting folks more easily do what they need to in .custom conversions.

@hartbit
Copy link
Contributor

hartbit commented May 1, 2018

I noticed another problem as a consequence of the current behaviour: resolving the round-trip issue in @norio-nomura's example requires giving the CondingKey a rawValue of myUrlProperty. I think this is very unintuitive and not easily discoverable as no domain (wether the Swift domain or JSON domain) ever declares a key with the value myUrlProperty: it feels outlandish to both domains.

The more I think about it the more I think that @norio-nomura's useSnakeCasedKeys has to be part of the solution, whatever the cost is of resolving KeyedDecodingContainer. By the way, isn't KeyedDecodingContainer.keys invisible to users to JSONEncoder and JSONDecoder? Can't we just return an empty array?

@itaiferber
Copy link
Contributor

No, we can't do that, unfortunately. JSONDecoder.container(keyedBy:) returns a KeyedDecodingContainer, which exposes allKeys. That value has to be consumable for types which iterate through the possible keys in the payload in order to decide how to decode; we can't simply return an empty array for the benefit of making these edge cases work.

Perhaps we should take this to the Swift Forums for discussion on possible ways forward. At this point, we're not going to be able to introduce new API into Swift 4.2, but maybe there are other directions we can take this.

@itaiferber
Copy link
Contributor

I started a thread on the Swift forums to discuss this in a more public setting. :)

@hartbit
Copy link
Contributor

hartbit commented May 4, 2018

Thx @itaiferber! Would it worth moving the discussion to Swift Evolution?

@itaiferber
Copy link
Contributor

I think it would be good to gather feedback first about potential directions before suggesting API to Swift Evolution, but eventually, yes

@mortenbekditlevsen
Copy link
Contributor

Was any conclusion on the issue reached through the discussion?

@itaiferber
Copy link
Contributor

There were three responses to the thread, all supportive of this approach; if we consider that to be enough signal to go ahead with this, we'll need to figure out how we want to review and schedule this API change. Given the amount of things going on at the moment, I'm not sure this will make the 4.2 release.

@parkera Any thoughts here?

@parkera
Copy link
Contributor

parkera commented Jun 15, 2018

Let's aim for 5.0. master is still the branch for that, but since we need time to do additional API review here, we will leave this open for a bit longer.

@mortenbekditlevsen
Copy link
Contributor

Hi there. I hope that it is ok that I bump this thread.
@parkera you mention that you hope to aim for 5.0. Now that we are nearing 5.0 do you think that there is any hope that this could make it?

@theblixguy
Copy link
Collaborator

Any updates on this? Could we merge this and cherry pick into 5.1 perhaps? Seems like there's a lot of support for this so it would be a shame if this gets abandoned.

@itaiferber
Copy link
Contributor

Any changes made here would need to go through API review, which won't make it into Swift 5.1, unfortunately. We will try to push on this internally, given bandwidth.

@theblixguy
Copy link
Collaborator

Okay! I hope it makes into Swift 5.2 then :)

@norio-nomura
Copy link
Contributor Author

norio-nomura commented Jul 7, 2019

FYI, I opened jpsim/Yams#200 that can become a PoC improving this PR.

@CodaFi
Copy link
Contributor

CodaFi commented Nov 21, 2019

It's been a while, so let's get this moving again. Because this change is API and ABI breaking (not that I anticipate there are many clients switching over this enum), it requires a formal proposal to Swift evolution. During that process, it's possible you may hear from the core team that this change is OK to take without a formal review. Luckily, you've already done the heavy lifting by providing an implementation.

I'm going to tag this as pending evolution discussion.

@CodaFi CodaFi added the swift evolution pending discussion Flag → feature: A feature that has a Swift evolution proposal currently in review label Nov 21, 2019
@itaiferber
Copy link
Contributor

@CodaFi Since JSONDecoder does not live in the standard library, this should not require full Swift evolution, but does require Foundation review. /CC @parkera to weigh in on changes here

@CodaFi CodaFi added potentially source breaking and removed swift evolution pending discussion Flag → feature: A feature that has a Swift evolution proposal currently in review labels Nov 22, 2019
@CodaFi
Copy link
Contributor

CodaFi commented Nov 22, 2019

I have checked, and I stand corrected. The overlays are Foundation’s and are not subject to evolution. Hopefully @parkera can get this reviewed in time.

@parkera
Copy link
Contributor

parkera commented Nov 22, 2019

cc @bendjones on this one

@shahmishal
Copy link
Member

Please update the base branch to main by Oct 5th otherwise the pull request will be closed automatically.

  • How to change the base branch: (Link)
  • More detail about the branch update: (Link)

@shahmishal shahmishal closed this Oct 5, 2020
@mortenbekditlevsen
Copy link
Contributor

Hi there,
Apologies for the bump, but unfortunately this PR was automatically closed - and I don't think it got the attention that it deserved.
The comments do mention internal reviews, so I am hoping that perhaps the issue is not yet off of the radar.
Otherwise: might I reopen a copy of the PR against main?

I am certain that the issue is very well understood, but I think that the current state causes a lot of confusion - even for experienced developers. The intention was to keep the key conversion in the encoder and decoder, but the current state is that this often 'leaks' into the CodingKey definition where you have to 'translate' something like imageURL to imageUrl for it to roundtrip to snake case correctly. I've seen quite a few bugs due to the misunderstanding that it would help to translate the key directly into snake case as:

  case imageURL = "image_url"

As mentioned elsewhere, the big issue is that the 'imageUrl' representation belongs neither to the source or the destination domains but must be added for the sake of the key coding strategy.

@parkera , @CodaFi , @bendjones

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants