Improve handling UTF encoded inputs #296

Merged
fktn-k merged 6 commits into develop from feature/improve_handling_utf_encodings
Mar 24, 2024

Conversation

@fktn-k (Owner) commented Mar 23, 2024

This PR improves the handling of UTF-encoded inputs by:

  • discarding the byte order mark (BOM) when an input adapter object is created, since neither the lexer nor the deserializer needs to know whether the input had one, and
  • preparing, in the input_handler class, a normalized input buffer that is guaranteed to be UTF-8 encoded before deserialization starts.
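
The two steps above can be illustrated with a minimal C++11 sketch. This is not the code from this PR: `utf_encoding`, `bom_info`, `detect_bom`, `append_utf8`, and `normalize_to_utf8` are hypothetical names chosen for illustration, and the sketch assumes that input without a BOM is already UTF-8. It also skips validation (unpaired surrogates and truncated trailing code units are not reported), which a real implementation would need to handle.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical names for illustration only; not the library's actual API.
enum class utf_encoding { utf8, utf16be, utf16le, utf32be, utf32le };

struct bom_info {
    utf_encoding encoding;
    std::size_t bom_size; // number of leading BOM bytes to skip
};

// Inspect the leading bytes for a BOM. Assumption: no BOM means UTF-8.
inline bom_info detect_bom(const std::string& in) {
    auto b = [&in](std::size_t i) -> unsigned {
        // 0x100 never matches a byte value, so short inputs fall through.
        return i < in.size() ? static_cast<unsigned char>(in[i]) : 0x100u;
    };
    // UTF-32 is checked first: FF FE 00 00 also begins with the UTF-16LE BOM.
    if (b(0) == 0x00 && b(1) == 0x00 && b(2) == 0xFE && b(3) == 0xFF) return {utf_encoding::utf32be, 4};
    if (b(0) == 0xFF && b(1) == 0xFE && b(2) == 0x00 && b(3) == 0x00) return {utf_encoding::utf32le, 4};
    if (b(0) == 0xEF && b(1) == 0xBB && b(2) == 0xBF) return {utf_encoding::utf8, 3};
    if (b(0) == 0xFE && b(1) == 0xFF) return {utf_encoding::utf16be, 2};
    if (b(0) == 0xFF && b(1) == 0xFE) return {utf_encoding::utf16le, 2};
    return {utf_encoding::utf8, 0};
}

// Append one Unicode code point to `out` as UTF-8 (no range validation).
inline void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Drop the BOM and return a buffer that is UTF-8 regardless of input encoding.
inline std::string normalize_to_utf8(const std::string& in) {
    const bom_info info = detect_bom(in);
    if (info.encoding == utf_encoding::utf8) {
        return in.substr(info.bom_size); // already UTF-8: only strip the BOM
    }
    const bool big_endian =
        info.encoding == utf_encoding::utf16be || info.encoding == utf_encoding::utf32be;
    const std::size_t width =
        (info.encoding == utf_encoding::utf16be || info.encoding == utf_encoding::utf16le) ? 2 : 4;

    // Read one code unit of `width` bytes starting at `pos`.
    auto unit = [&](std::size_t pos) -> std::uint32_t {
        std::uint32_t v = 0;
        for (std::size_t i = 0; i < width; ++i) {
            const std::uint32_t byte = static_cast<unsigned char>(in[pos + i]);
            v |= byte << (8 * (big_endian ? width - 1 - i : i));
        }
        return v;
    };

    std::string out;
    for (std::size_t pos = info.bom_size; pos + width <= in.size(); pos += width) {
        std::uint32_t cp = unit(pos);
        // Combine a UTF-16 surrogate pair into a single code point.
        if (width == 2 && cp >= 0xD800 && cp <= 0xDBFF && pos + 4 <= in.size()) {
            const std::uint32_t low = unit(pos + 2);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                pos += 2; // the low surrogate has been consumed as well
            }
        }
        append_utf8(out, cp);
    }
    return out;
}
```

With a helper of this kind, later stages only ever see BOM-free UTF-8, which is the property the description above attributes to the normalized buffer.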

Pull Request Checklist

Read the CONTRIBUTING.md file for detailed information.

  • Changes are described in the pull request or in a referenced issue.
  • The test suite compiles and runs without any error.
  • The code coverage on your branch is 100%.
  • The documentation is updated if you added/changed a feature.

Please don't

  • C++11 support varies between compilers and versions. Please note the list of supported compilers. Some compilers, such as GCC 4.7 (and earlier), Clang 3.3 (and earlier), or Microsoft Visual Studio 13.0 (and earlier), are known not to work due to missing or incomplete C++11 support. Please refrain from proposing changes that work around these compilers' limitations with #ifdefs or other means.
  • Please refrain from proposing changes that would break YAML specifications. If you propose a conformant extension of YAML to be supported by the library, please motivate this extension.
  • Please do not open pull requests that address multiple issues.

@fktn-k fktn-k added the improvement refactoring or optimization without public API changes label Mar 23, 2024
@fktn-k fktn-k added this to the Release v0.3.3 milestone Mar 23, 2024
@fktn-k fktn-k self-assigned this Mar 23, 2024
@coveralls commented Mar 23, 2024

Pull Request Test Coverage Report for Build 8407932666

Details

  • 184 of 184 (100.0%) changed or added relevant lines in 6 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 100.0%

Totals Coverage Status
Change from base Build 8407215565: 0.0%
Covered Lines: 2936
Relevant Lines: 2936

💛 - Coveralls

@fktn-k (Owner, Author) commented Mar 23, 2024

Waiting for #293 to be merged first, so that these lower-priority changes do not cause conflicts on that branch.

@fktn-k fktn-k changed the title Feature/improve handling utf encodings Improve handling UTF encoded inputs Mar 23, 2024
@fktn-k fktn-k force-pushed the feature/improve_handling_utf_encodings branch from 87bc5d8 to 3245397 Compare March 24, 2024 08:10
@fktn-k fktn-k merged commit 2efa8c9 into develop Mar 24, 2024
144 checks passed
@fktn-k fktn-k deleted the feature/improve_handling_utf_encodings branch March 24, 2024 08:23

2 participants