Share and discuss a good solution suggestion for TOML.stringify #854

Closed
LongTengDao opened this issue Oct 12, 2021 · 2 comments
@LongTengDao
Contributor

LongTengDao commented Oct 12, 2021

TOML's flexible syntax serves people who read and write it by hand very well, but it also makes serialization hard to design. While updating @ltd/j-toml over the past few days, I came up with some ideas:

Table

Considering how JS code is read and written, unmarked table objects should be serialized as dotted key/value pairs by default. The exceptions are a table at a level where dotted keys are impossible and an empty table; both are serialized in inline mode by default.

Use the Section helper function to mark a table as a block table, or use the inline helper function to mark the table as an inline table.

const TOML = require('@ltd/j-toml');

const { inline, Section } = TOML;

TOML.stringify({
    key: 'value',
    dotted: {
        key: 'value',
    },
    inlineTable: inline({ key: 'value' }),
    mix: {
        key: 'value',
        table: Section({
            key: 'value',
        }),
    },
    table: Section({
        key: 'value',
        table: Section({
            key: 'value',
        }),
    }),
});

key = 'value'
dotted.key = 'value'
inlineTable = { key = 'value' }
mix.key = 'value'

[mix.table]

key = 'value'

[table]

key = 'value'

[table.table]

key = 'value'
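
The marking idea can be illustrated without the library. The following is a minimal sketch, not the @ltd/j-toml implementation: the helpers `Section`, `inline` and `stringify` here are hypothetical stand-ins that record marks in `WeakSet`s, and only string values and nested tables are handled.

```javascript
// Hypothetical stand-ins for the helpers proposed above (NOT the library's code).
const sections = new WeakSet();
const inlines = new WeakSet();

// Mark a table as a block table ([header]) or as an inline table ({ ... }).
const Section = (table) => { sections.add(table); return table; };
const inline = (table) => { inlines.add(table); return table; };

const isTable = (v) => typeof v === 'object' && v !== null && !Array.isArray(v);

// Only strings and nested tables are supported, to keep the sketch short.
const inlineText = (table) => {
    const pairs = Object.entries(table).map(([k, v]) => `${k} = ${valueText(v)}`);
    return pairs.length ? `{ ${pairs.join(', ')} }` : '{ }';
};
const valueText = (v) => (isTable(v) ? inlineText(v) : `'${v}'`);

// Collect the pairs of one block table; nested sections are queued in `blocks`.
function collect(table, path, prefix, lines, blocks) {
    for (const [key, value] of Object.entries(table)) {
        const dotted = [...prefix, key];
        if (!isTable(value) || inlines.has(value) || Object.keys(value).length === 0) {
            lines.push(`${dotted.join('.')} = ${valueText(value)}`);
        } else if (sections.has(value)) {
            blocks.push([[...path, ...dotted], value]);
        } else {
            collect(value, path, dotted, lines, blocks); // unmarked: dotted keys
        }
    }
}

function stringify(root) {
    const out = [];
    const emit = (table, path) => {
        const lines = [];
        const blocks = [];
        collect(table, path, [], lines, blocks);
        if (path.length) out.push('', `[${path.join('.')}]`, '');
        out.push(...lines);
        for (const [childPath, child] of blocks) emit(child, childPath);
    };
    emit(root, []);
    return out.join('\n');
}

const tableText = stringify({
    key: 'value',
    dotted: { key: 'value' },
    inlineTable: inline({ key: 'value' }),
    mix: { key: 'value', table: Section({ key: 'value' }) },
    table: Section({ key: 'value', table: Section({ key: 'value' }) }),
});
console.log(tableText);
```

Because the marks live outside the objects, the data stays plain JavaScript and `JSON.stringify` or a spread copy never sees them. The cost is that a copied table loses its mark.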

Array

A non-empty array whose items are tables marked with Section should be serialized as an "array of tables".

Otherwise, an array should be treated as a static array and written across multiple lines by default. If single-line mode is wanted, mark the array with the inline helper function.

const TOML = require('@ltd/j-toml');

const { inline, Section } = TOML;

TOML.stringify({
    staticArray: [
        'string',
        { },
    ],
    staticArray_singleline: inline([ 1.0, 2n ]),
    arrayOfTables: [
        Section({
        }),
    ],
});

staticArray = [
    'string',
    { },
]
staticArray_singleline = [ 1.0, 2 ]

[[arrayOfTables]]

Comment

Another pain point is comments. Obviously, we don't want a configuration file that contains comments to lose all of them after it has been modified by a program.

In JavaScript (I don't know whether this is possible in other languages), use [commentFor(key)] as a key in the table: commentFor returns a symbol, and the value stored under it should be the comment text. In the serialized output, the comment is placed after the value that belongs to that key.

const TOML = require('@ltd/j-toml');

const { commentFor } = TOML;

TOML.stringify({
    [commentFor('key')]: ' this is a key/value pair',
    key: 'value',
    dotted: {
        [commentFor('key')]: ' this is a dotted key/value pair',
        key: 'value',
    },
    table: {
        [commentFor('header')]: ' this is a table header (but it cannot be a table in an array of tables)',
        header: Section({
        }),
    },
});

key = 'value' # this is a key/value pair
dotted.key = 'value' # this is a dotted key/value pair

[table.header] # this is a table header (but it cannot be a table in an array of tables)
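
For illustration, here is a minimal sketch of how a symbol-per-key scheme could work. It is an assumption about the mechanism, not the library's code: `commentFor` is re-implemented with a `Map` cache, and `stringifyPairs` is a hypothetical helper that handles only a flat table of strings.

```javascript
// Hypothetical sketch, not the @ltd/j-toml implementation.
const commentSymbols = new Map();

// Return the same symbol every time for the same key,
// so that the serializer can look the comment up again later.
function commentFor(key) {
    let symbol = commentSymbols.get(key);
    if (symbol === undefined) {
        symbol = Symbol(`comment for ${key}`);
        commentSymbols.set(key, symbol);
    }
    return symbol;
}

// Serialize a flat table of strings, appending each key's comment if present.
// Object.keys skips symbol keys, so comments never show up as data.
function stringifyPairs(table) {
    return Object.keys(table).map((key) => {
        const comment = table[commentFor(key)];
        const pair = `${key} = '${table[key]}'`;
        return comment === undefined ? pair : `${pair} #${comment}`;
    }).join('\n');
}

const commentedText = stringifyPairs({
    [commentFor('key')]: ' this is a key/value pair',
    key: 'value',
    other: 'value',
});
console.log(commentedText);
```

Symbol keys are invisible to `Object.keys`, `for...in` and `JSON.stringify`, which is why this approach keeps comments out of the way of ordinary code that walks the data.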

TOML.parse -> TOML.stringify

Data returned by TOML.parse should remember the writing style described above, so there is no need to mark it manually again when reserializing the modified data.
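
One way such memory could be kept is a side table keyed by the table objects themselves. This is only a sketch under that assumption: `remember` and `styleOf` are hypothetical names, and the parser step is simulated by hand.

```javascript
// Hypothetical side table: table object -> 'section' | 'inline' | 'dotted'.
const styles = new WeakMap();
const remember = (table, style) => { styles.set(table, style); return table; };
// Unmarked tables fall back to the proposed default, dotted keys.
const styleOf = (table) => styles.get(table) ?? 'dotted';

// Pretend a parser produced this after reading a `[server]` header.
const data = { server: remember({ port: 80 }, 'section') };

// Editing a value in place keeps the remembered style...
data.server.port = 8080;
const afterEdit = styleOf(data.server);

// ...but replacing the whole table with a new object loses it.
data.server = { port: 8080 };
const afterReplace = styleOf(data.server);

console.log(afterEdit, afterReplace);
```

The second half shows the limit of any identity-based memory: it survives in-place edits, while a freshly created object has to be marked again.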

Newline

options.newlineAround

  • type: 'document' | 'section' | 'header' | 'pairs' | 'pair'
  • default: 'header'

Controls where empty lines are inserted during serialization to improve readability.

  1. 'document': only make sure the document begins and ends with an empty line, for git diff (if the document is empty, only one empty line is kept);
  2. 'section': further ensure that sections (block tables) are separated by an empty line;
  3. 'header': further ensure that each block table's header and its key/value pairs are separated by an empty line;
  4. 'pairs': further ensure that the own key/value pairs of each block table are separated by an empty line (while dotted keys stay grouped together);
  5. 'pair': further ensure that all key/value pairs of each block table (including dotted keys) are separated by an empty line.

Of these, 'section' and 'header' are generally the best modes in practice. The former suits simple cases where sections don't contain each other, while the latter works well for both simple and nested documents, which is why it should be the default mode.
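
The cumulative nature of the five modes can be sketched as follows. The data model is an assumption made for this example only: each section is `{ header, groups }`, where every group holds either one own pair or a run of dotted-key lines that belong together, and `layout` is a hypothetical function that returns the output lines.

```javascript
// Hypothetical sketch of the five modes; each mode includes the ones before it.
const MODES = ['document', 'section', 'header', 'pairs', 'pair'];

// sections: [{ header: string | null, groups: string[][] }]
function layout(sections, mode = 'header') {
    const level = MODES.indexOf(mode);
    if (level < 0) throw new RangeError(`unknown mode: ${mode}`);
    const lines = ['']; // the document begins with an empty line
    // Add an empty line unless the previous line is already empty.
    const blank = () => { if (lines[lines.length - 1] !== '') lines.push(''); };
    sections.forEach(({ header, groups }, i) => {
        if (i > 0 && level >= 1) blank();           // 'section'
        if (header !== null) {
            lines.push(header);
            if (level >= 2) blank();                // 'header'
        }
        groups.forEach((group, j) => {
            if (j > 0 && level >= 3) blank();       // 'pairs'
            group.forEach((line, k) => {
                if (k > 0 && level >= 4) blank();   // 'pair'
                lines.push(line);
            });
        });
    });
    blank(); // the document ends with an empty line (only one if it is empty)
    return lines;
}

const doc = [
    { header: null, groups: [['a = 1'], ['b.x = 1', 'b.y = 2']] },
    { header: '[t]', groups: [['c = 3'], ['d = 4']] },
];
console.log(layout(doc, 'header').join('\n'));
```

Returning an array of lines sidesteps the question of how a leading or trailing empty line maps onto newline characters, which the description above leaves open.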

Big Problems

That still leaves strings, integers and floats. Like gymnastics scoring, their writing choices have no perfect specification (any attempt solves one problem by creating more). They are also atomic values, so there is no good way to preserve their style in the data produced by parse. Maybe I could record the choices on the parent table, as with comments, but the choice itself is hard to describe (e.g. 0b0_00_0000).

I tried some solutions in my library, but they are not good enough. For example:

underscore = 1_000
zero = 10.00
base = 0o777
mark = +10e10
multilineString = """
1\n2
3"""
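
The difficulty with atomic values can be seen directly in JavaScript, independent of any TOML library. The snippet below is only a demonstration of that point; the variable names are made up for the example.

```javascript
// Different spellings of a literal become the same value at run time,
// so the value alone cannot tell a serializer how it was written.
const sameUnderscore = 1_000 === 1000;
const sameZeros = 10.00 === 10;
const sameBase = 0o777 === 511;
const sameMark = +10e10 === 100000000000;
const sameString = '1\n2\n3' === `1
2
3`;

// Unlike a table object, a primitive cannot carry a mark in a WeakMap:
// using a number as a WeakMap key throws a TypeError.
let rejected = false;
try {
    new WeakMap().set(1000, 'underscore');
} catch {
    rejected = true;
}

console.log(sameUnderscore, sameZeros, sameBase, sameMark, sameString, rejected);
```

This is why the style has to be stored somewhere else, for example on the parent table, or the value has to be wrapped in an object, and both options bring their own trade-offs.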
@pradyunsg
Member

I'm not sure that this has any implications on the standards side, so I'm going to go ahead and close this.

Please feel welcome to file a discussion for discussing this. :)

@LongTengDao
Contributor Author

Oh, thank you! I just forgot that the new GitHub has "Discussions".
