Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Toml fixes #3500

Merged
merged 22 commits into from Jul 7, 2021
Merged

Toml fixes #3500

merged 22 commits into from Jul 7, 2021

Conversation

bauhaus93
Copy link
Contributor

@bauhaus93 bauhaus93 commented Sep 23, 2020

Fixing TOML issues mentioned in #3490.

  • Make error messages less technical
  • Fix floats in exponential form (didn't allow leading zero after exponent, but is now allowed)
  • Allow empty keynames, which are valid but discouraged
  • Write info in README about which TOML standard is supported / what the current gaps are
  • Preserve whitespace in front of key names / around assignments
  • Proper string quoting on writing, don't assume just basic. First tries of that in TOML as default storage plugin #3478.

Additional fixes:

  • Fix multiline string reading. There was a problem, when a string contained non-terminating quotation characters. Also fixed function to not loop endlessly when encountering EOF during reading.
  • Check, if non-decimal conversion functions are handling overflows properly (They didn't, so added checks if a base 2/8/10/16 integer over/underflows).
  • More strict handling of truth values on writing.
  • Fix parser so that some invalid TOML files don't get parsed as valid. (Happens on invalid simple table/tablearray combinations or redefintions).
  • Add metakey preservation capabilties.

Metakey preservation problems:

  • Metakeys assigned to a table array root key. The table array root key (which holds the array and tomltype metakey of an table array) doesn't have any anchors in a TOML file.
  • Metakeys assigned to inline table elements (since no comments are allowed in direct inline table elements by the TOML standard). This is a general comment problem. Either we change an existing inlinetable to a simpletable or we may throw an error in that case.

Basics

These points need to be fulfilled for every PR:

  • Short descriptions of your changes are in the release notes
    (added as entry in doc/news/_preparation_next_release.md which
    contains _(my name)_)
    Please always add something to the release notes.
  • Details of what you changed are in commit messages
    (first line should have module: short statement syntax)
  • References to issues, e.g. close #X, are in the commit messages.
  • The buildservers are happy.
  • The PR is rebased with current master.

If you have any troubles fulfilling these criteria, please write
about the trouble as comment in the PR. We will help you.
But we cannot accept PRs that do not fulfill the basics.

Checklist

Check relevant points but please do not remove entries.
For docu fixes, spell checking, and similar none of these points below
need to be checked.

  • I added unit tests for my code
  • I fully described what my PR does in the documentation
    (not in the PR description)
  • I fixed all affected documentation
  • I added code comments, logging, and assertions as appropriate (see Coding Guidelines)
  • I updated all meta data (e.g. README.md of plugins and METADATA.ini)
  • I mentioned every code not directly written by me in THIRD-PARTY-LICENSES

Review

Reviewers will usually check the following:

Labels

If you are already Elektra developer:

  • Add the "work in progress" label if you do not want the PR to be reviewed yet.
  • Add the "ready to merge" label if the basics are fulfilled and you also
    say that everything is ready to be merged.

Copy link
Contributor

@markus2330 markus2330 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good, thank you for the improvements of the error messages. The unit tests are very nicely written, too.

@bauhaus93
Copy link
Contributor Author

@markus2330
I've added a for now very basic implementation of metakey preservation in this PR (Metakey names assumed to have no spaces - I have to find a proper way/format to differentiate between metakey name and value in the comment string; Also, special characters/escape sequences are not yet handled).

But at least the failing example from #3491 with the base64 metamode should work now as intended (without any need of knowing that metamode is on).

@markus2330
Copy link
Contributor

Thank you so much, great work 🎆

@kodebach kodebach mentioned this pull request Oct 3, 2020
10 tasks
@PhilippGackstatter PhilippGackstatter mentioned this pull request Oct 14, 2020
16 tasks
@mpranj
Copy link
Member

mpranj commented Nov 5, 2020

@bauhaus93 thank you for implementing so many improvements to the TOML plugin. This is even more important than #3098. 😉

@kodebach
Copy link
Member

@bauhaus93 @markus2330 Currently toml uses the tomltype metadata. I think it would be a good idea to generalize this into type/[plugin], where [plugin] is a storage plugin. Very few storage formats have a type system that exactly matches Elektra's, so storing the storage format's type separately makes sense. For example a JSON plugin would use one of these:

type/json = string
type/json = boolean
type/json = number
type/json = null

(array and object aren't needed, since the are inferred from KeySet structure and JSON doesn't have things like TOML's inline tables)
The corresponding type values would be:

type/json = string -> type = any
type/json = boolean -> type = boolean
type/json = number -> type = long_long [OR] type = double
type/json = null -> type = any

(string corresponds to any because in Elektra string means non-empty)

@markus2330
Copy link
Contributor

Please avoid making new proposals in PRs, this could be easily misunderstood. Status-quo is:

  • In the cases the types match (which is not soo rare) type should be used.
  • In cases where types cannot be mapped, internal/<plugin>/* (in this case internal/toml/type should be used), see:

libelektra/doc/METADATA.ini

Lines 1072 to 1074 in f370cc5

[internal/<plugin>/*]
status= implemented
description= Internal metadata to be ignored by other plugins.

and here also the meta-spec for type:

libelektra/doc/METADATA.ini

Lines 223 to 244 in f370cc5

[type]
type= enum
short
unsigned_short
long
unsigned_long
long_long
unsigned_long_long
float
double
long_double
char
boolean
octet
any
empty
string
status= implemented
usedby/plugin= range type
description= defines the type of the value, as specified in CORBA
Unlike check/type, type is also used for code generation and within the APIs
example= any

If unsupported types are encountered during kdbSet, please give the user an error.

@kodebach
Copy link
Member

Yes, internal/toml/type would also be a possibility. But even in this case, we should probably define in METADATA.ini that internal/<plugin>/type should be used exclusively for this purpose to avoid possible confusion.

@markus2330
Copy link
Contributor

Thank you, I clarified in d198a7e

src/plugins/toml/meta.c Outdated Show resolved Hide resolved
@kodebach
Copy link
Member

@markus2330 I think this PR is now in a working state. My suggestion would be to merge this and then I'll use new PR(s) to fix the remaining TOML issues (see #3910). If we do a release soon, we should wait until after the release with the merge, or remove the semi-functional comment-based metadata handling from the PR.

@markus2330
Copy link
Contributor

Yes, the idea is to release asap so that Harald can continue with his work, see #3908. I think its best if we remove the "semi-functional comment-based metadata handling from the PR" as the specs in TOML is not urgent and having a otherwise working TOML is important&urgent.

@kodebach kodebach marked this pull request as ready for review July 2, 2021 12:28
@kodebach kodebach requested a review from markus2330 July 2, 2021 12:29
@kodebach
Copy link
Member

kodebach commented Jul 2, 2021

I think this is now ready to be merged (assuming the build servers are happy).

I removed the metadata handling for now (new handling will be added in a new PR). I did not yet copy the string quoting stuff from #3478, because I found some issues with the existing code. I will add a new version in another PR.

Didn't allow leading zeros after e/E, but is now a valid.
Empty key names (written in empty quotes) now allowed for any kind of
keyname (in k/v assignments, simple tables, table arrays, inline
tables).
Tests contain one roundtrip file, which gets read and written and is
then checked if it's same under disregard of whitespace (and currently
also quotation characters).

Tests also contain some invalid TOML files, which should fail on read.

The parser considers some invalid TOML files as valid, mainly related to
invalid table/table array combinations/redefinitions.

Additionally, need to check the correct handling of multiline strings,
they somehow are not fine yet.
String quote counter would not be reset when reading a non-quote
character, which caused strings to be considered finished prematurely.
bauhaus93 and others added 14 commits July 7, 2021 09:50
Previously, did not check any range for integers on reading/writing.
Now added checks for that. On reading, will emit an error if a syntactically correct
integer exceeds the range of ulonglong/longlong. On writing, value will
not be seen as integer, but written as string instead.

Also fixed incorrect float checking in type.c (Out-of-range integers
would get interpreted as floats, because commas were optional).
On writing, a `boolean` typed key must now be either 0 or 1, otherwise
an error will be emitted.
Metakeys are now written as comments with a prefix on writing.
On reading, those special comments will get assigned as metakeys to a
key.

This is currently just a very basic implementation, which assumes that
the metakey name/values don't have any special characters / are binary.
Also, metakey names are currently expected to not contain any spaces.

Tests for this changes not yet added/adapted.
Fixed a general comment error, which occured within list elements.
Needed to add an additional newline before writing comments in certain
situations, otherwise a value preceding comment would get written as an
inline comment of the previous list value instead.
For now, no comments will be written in direct children of inline
tables.
Overall it works better, even if the implementation doesn't fully match
the spec.
@mpranj mpranj added this to the 0.9.7 milestone Jul 7, 2021
@mpranj mpranj merged commit 3c021db into ElektraInitiative:master Jul 7, 2021
@mpranj
Copy link
Member

mpranj commented Jul 7, 2021

Thank you so much @bauhaus93 and @kodebach! ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants