Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Issue#1813: user defined input adapters #2145

Merged
merged 18 commits into from
Jun 5, 2020
Merged

Fix Issue#1813: user defined input adapters #2145

merged 18 commits into from
Jun 5, 2020

Conversation

FrancoisChabot
Copy link
Contributor

@FrancoisChabot FrancoisChabot commented May 27, 2020

This is an attempt to address #1813

This change adds generalized Iterator support for inputs. In effect, this is just 2 changes:

a) loosen the restriction on iterators having to be of continuous storage
b) harmonize the parse() API with the from_*() API.

All parse() and from_*() functions now have 3 (or 4, see below) overloads:

  1. parse(InputType): This will go into the specialized input adapters for FILE, istream or null-terminated string. As a fallback, it will perform a std-using ADL-lookup on begin(input) and end(input) to invoke 2.

  2. parse(iterator, iterator): This will instantiate an input adapter for that specific iterator type. This will also handle all of the contiguous memory scenarios, (using const char* as the iterator type). The compiler has enough information to optimize this appropriately. If the value_type of the iterator is a multi-byte character type, wchar to char conversion will be performed.

  3. parse(span_input_adapter): This exists specifically to support the existing parse({ptr, len}) syntax, but is marked as deprecated, as it's redundant with 2. apart from minor syntactic sugar.

User defined input is now a matter of implementing an arbitrary iterator type. It can also be done by having begin(T) and end(T) return an existing iterator type (like const char*)

From a very pedantic POV, this might be "technically" API breaking, because a lot of the SFINAE stuff that was in place to ensure that data is continuous in memory are now gone. Code that would have failed to compile will now result in functional, but not as optimized as it could be program. i.e. If someone tries to parse a std::list<char>, it'll work, and this may or may not be what was desired.


Pull request checklist

Read the Contribution Guidelines for detailed information.

  • Changes are described in the pull request, or an existing issue is referenced.
  • The test suite compiles and runs without error.
  • Code coverage is 100%. Test cases can be added by editing the test suite.
  • The source code is amalgamated; that is, after making changes to the sources in the include/nlohmann directory, run make amalgamate to create the single-header file single_include/nlohmann/json.hpp. The whole process is described here.

@FrancoisChabot FrancoisChabot marked this pull request as draft May 27, 2020 16:58
@coveralls
Copy link

coveralls commented May 27, 2020

Coverage Status

Coverage remained the same at 100.0% when pulling 0da131d on FrancoisChabot:1813-user-input into 9ec0e4c on nlohmann:develop.

@FrancoisChabot FrancoisChabot marked this pull request as ready for review May 28, 2020 07:38
@FrancoisChabot
Copy link
Contributor Author

FrancoisChabot commented May 28, 2020

Reviewing this further, I realized that there IS actually a non-trivial API breakage in there.

the previous from_*(A1&&, A2&&) functions could accept a pointer and a length (even though not a single unit test made use of that), and the current overload actually breaks that.

I see three possible resolutions:

  1. Leave it as is, it was an effectively undocumented feature.
  2. Add a 4th overload to the parse() and from_*() functions to handle parse(ptr, len)
  3. Add the 4th overload to from_*() functions, and mark it as deprecated.

The easiest fix is 1., the cleanest is 3., as it maintains backwards compatibility, and the API inconsistency is counterbalanced by the deprecation. (And I would ideally take the opportunity to mark the third overload: parse({ptr, len}) for deprecation as well while I'm at it tbh).

If you'd like, I can hold my nose and go for 2., but I think this unnecessarily increases the API surface since you can do parse(ptr, ptr+len) or parse({ptr, len}) already.

The one thing I'm really against is bringing back the overload just for from_*() without deprecation. I really believe the APIs should be made consistent.

edit: I've amended the PR with option 3, as well as adding deprecation warnings to the third overload. Feel free to ask for them to be removed if you think that's unreasonable.

Copy link
Owner

@nlohmann nlohmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@nlohmann nlohmann self-assigned this Jun 5, 2020
@nlohmann nlohmann added this to the Release 3.8.0 milestone Jun 5, 2020
@nlohmann nlohmann linked an issue Jun 5, 2020 that may be closed by this pull request
@nlohmann nlohmann merged commit 7444c7f into nlohmann:develop Jun 5, 2020
@nlohmann
Copy link
Owner

nlohmann commented Jun 5, 2020

Thanks so much!!!

@nlohmann
Copy link
Owner

nlohmann commented Jun 5, 2020

The code looks great - I may add some more examples and tests in the next days.

@nlohmann
Copy link
Owner

nlohmann commented Jun 5, 2020


🔖 Release item

This issue/PR will be part of the next release of the library. This template helps preparing the release notes.

Type

  • ✨ New Feature
  • 🐛 Bug Fix
  • ⚡️ Improvement
  • 🔨 Further Change
  • 🔥 Deprecated function

Description


@FrancoisChabot
Copy link
Contributor Author

That's awesome! I wasn't expecting this to actually make it in 3.8.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

input_adapter not user extensible
5 participants