Convert object to data file #5

Vladimir37 · 2017-05-07T07:37:04Z

Jomini converts data files generated by the Clausewitz engine into an object, but it cannot convert a JS object back into a Clausewitz engine data file. Why not add a method for the reverse conversion? This would facilitate the creation of various editors.

nickbabcock · 2017-05-07T17:16:55Z

Yes, I absolutely agree that reverse conversion (often called serialization) would be a huge boon. Unfortunately the conversion from the data file to an object is lossy, meaning that given certain objects, it's ambiguous what the correct serialization should be.

Given:

{
  "foo": [1, 2]
}

is the correct serialization:

foo=1
foo=2

or

foo = { 1 2 }

or even

foo = { 1.000 2.0 }

jomini currently doesn't have a strong enough vocabulary to roundtrip deserialize and then serialize without ambiguity.
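
To make the ambiguity concrete, here is a minimal sketch in plain JavaScript. The three writer functions are hypothetical and not part of jomini; each one turns the same object into one of the candidate outputs above, and nothing in the object itself says which one is right.

```javascript
// Three hypothetical writers that all start from the same JS value.
const value = { foo: [1, 2] };

// Repeat the key once per element.
const asRepeatedKeys = (obj) =>
  Object.entries(obj)
    .flatMap(([key, items]) => items.map((item) => `${key}=${item}`))
    .join("\n");

// Emit one brace-delimited list of integers.
const asIntList = (obj) =>
  Object.entries(obj)
    .map(([key, items]) => `${key} = { ${items.join(" ")} }`)
    .join("\n");

// Emit one brace-delimited list with a fixed number of decimals.
const asFloatList = (obj, digits = 1) =>
  Object.entries(obj)
    .map(([key, items]) => `${key} = { ${items.map((n) => n.toFixed(digits)).join(" ")} }`)
    .join("\n");

console.log(asRepeatedKeys(value));
console.log(asIntList(value)); // foo = { 1 2 }
console.log(asFloatList(value)); // foo = { 1.0 2.0 }
```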

Vladimir37 · 2017-05-08T08:54:52Z

It seems to me that this problem can be solved if Jomini returns the parsed data in a format like this:

{
  "type": <type>,
  "body": <body>
}

In this way,

  {
    "foo": {
      "type": "int_array",
      "body": [1, 2]
    }
  }

will be serialized to

foo = { 1 2 }

If the type field is float_array:

foo = { 1.0 2.0 }

If the type field is chain:

foo=1
foo=2

It seems to me that using the "type" field would eliminate the ambiguity.
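
A rough sketch of how a writer could consume that shape is below. The function name serializeTagged is made up for illustration, the tag names are the ones from this comment, and the float formatting (whole numbers get a trailing ".0") is an assumption rather than anything the game requires.

```javascript
// Hypothetical writer for { type, body } wrappers; not part of jomini.
function serializeTagged(doc) {
  const lines = [];
  for (const [key, { type, body }] of Object.entries(doc)) {
    switch (type) {
      case "int_array":
        lines.push(`${key} = { ${body.join(" ")} }`);
        break;
      case "float_array": {
        // Assumption: whole numbers are written with one decimal place.
        const floats = body.map((n) => (Number.isInteger(n) ? n.toFixed(1) : String(n)));
        lines.push(`${key} = { ${floats.join(" ")} }`);
        break;
      }
      case "chain":
        // One key=value line per element.
        for (const item of body) lines.push(`${key}=${item}`);
        break;
      default:
        throw new Error(`unknown type tag: ${type}`);
    }
  }
  return lines.join("\n");
}

console.log(serializeTagged({ foo: { type: "chain", body: [1, 2] } }));
```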

nickbabcock · 2017-05-08T11:47:20Z

You're absolutely right that there are ways to disambiguate the types (and your idea is a good one). The one downside is that instead of accessing a value like foo.bar, one would have to write foo.body.bar.body.

Vladimir37 · 2017-05-10T07:46:17Z

This problem can be solved if Jomini has two methods. For example:

jomini.parse - The existing method. The data is easy to read, but it cannot be serialized.
jomini.deserialization - Deserialization using body/type objects.

Thus, the data for easy viewing and the data intended for later serialization would be kept separate.

nickbabcock · 2017-05-10T13:35:52Z

Ideally there'd only be one method to ease differences in parsing. There may be a way to create a class where we keep all properties on the class + a jomini_type() method that is used in a save() method to disambiguate.

But this may be wishful thinking and creating two methods may be more practical in the short term.

C45tr0 · 2018-12-02T07:46:55Z

You could put all the information needed to disambiguate types into a meta object at the top level. That way clean access is preserved, and the meta object can still be parsed or defined when needed.

nickbabcock · 2018-12-02T19:26:48Z

Right it should be possible to hide the disambiguation away from the user (but still keep it available for serialization) 🤔
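
One way this could look, purely as a sketch: keep the hints under a Symbol key so they never show up in normal property access or enumeration. The names META, withMeta, and save are hypothetical, and the "chain" hint is borrowed from earlier in the thread.

```javascript
// Hypothetical symbol under which a parser could stash serialization hints.
const META = Symbol("jominiMeta");

// Attach hints as a non-enumerable symbol property, so Object.keys and
// JSON.stringify see only the data.
function withMeta(obj, hints) {
  Object.defineProperty(obj, META, { value: hints, enumerable: false });
  return obj;
}

// Write the object back out, consulting the hidden hints per key.
function save(obj) {
  const hints = obj[META] ?? {};
  const lines = [];
  for (const [key, value] of Object.entries(obj)) {
    if (Array.isArray(value) && hints[key] === "chain") {
      for (const item of value) lines.push(`${key}=${item}`);
    } else if (Array.isArray(value)) {
      lines.push(`${key} = { ${value.join(" ")} }`);
    } else {
      lines.push(`${key}=${value}`);
    }
  }
  return lines.join("\n");
}

const parsed = withMeta({ foo: [1, 2], bar: [3, 4] }, { foo: "chain" });
console.log(parsed.foo[0]); // 1
console.log(Object.keys(parsed)); // [ 'foo', 'bar' ]
console.log(save(parsed));
```

The trade-off is that hints stored this way are lost as soon as the object is cloned or passed through JSON, so a real implementation would need to decide how to handle that.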

Having said that, I don't have any plans for continued development, as the current method of parsing (using jison) exhausts heap space, so a rewrite would be necessary to make it viable to parse large files.

C45tr0 · 2018-12-02T20:54:57Z

Do you have any current thoughts for what you want to rewrite it in/to use?

nickbabcock · 2018-12-03T02:30:59Z

So this can still be written in js -- it'd just need to be some sort of hand written recursive descent parser (basically any JSON parser can be used for inspiration). I've written paradox parsers in C#, js, F#, and most recently (but not open sourced) rust. Each language has its own tradeoffs, so I don't think there'd be one solution that could rule all.
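
For a sense of scale, a hand-written recursive descent parser for a small subset of the format fits in a few dozen lines. The sketch below is an illustration only, not jomini's parser: it assumes just key=value pairs, quoted strings, bare words, numbers, and brace-delimited objects or lists, and it ignores comments, dates, operators other than =, and everything else real files contain.

```javascript
// Split the input into quoted strings, punctuation, and bare words.
function tokenize(text) {
  const tokens = [];
  const re = /"([^"]*)"|([={}])|([^\s={}"]+)/g;
  let m;
  while ((m = re.exec(text)) !== null) {
    if (m[1] !== undefined) tokens.push({ kind: "quoted", text: m[1] });
    else if (m[2] !== undefined) tokens.push({ kind: m[2], text: m[2] });
    else tokens.push({ kind: "bare", text: m[3] });
  }
  return tokens;
}

function parse(text) {
  const tokens = tokenize(text);
  let pos = 0;

  const isScalar = (tok) => tok !== undefined && (tok.kind === "bare" || tok.kind === "quoted");

  // Bare words that look like numbers become numbers; everything else stays a string.
  const toScalar = (tok) =>
    tok.kind === "bare" && /^-?\d+(\.\d+)?$/.test(tok.text) ? Number(tok.text) : tok.text;

  // value := scalar | "{" pairs "}" | "{" value* "}"
  function parseValue() {
    const tok = tokens[pos++];
    if (isScalar(tok)) return toScalar(tok);
    if (tok === undefined || tok.kind !== "{") throw new Error("expected a value");
    // One token of lookahead: `{ key = ...` is an object, anything else is a list.
    const next = tokens[pos + 1];
    const result = next !== undefined && next.kind === "=" ? parsePairs() : parseList();
    if (tokens[pos] === undefined) throw new Error("missing closing brace");
    pos++; // consume "}"
    return result;
  }

  function parseList() {
    const items = [];
    while (tokens[pos] !== undefined && tokens[pos].kind !== "}") items.push(parseValue());
    return items;
  }

  // pairs := (key "=" value)*, where a repeated key collapses into an array
  function parsePairs() {
    const obj = {};
    const repeated = new Set();
    while (tokens[pos] !== undefined && tokens[pos].kind !== "}") {
      const key = tokens[pos++];
      const eq = tokens[pos++];
      if (!isScalar(key) || eq === undefined || eq.kind !== "=") throw new Error("expected key=value");
      const value = parseValue();
      if (!Object.prototype.hasOwnProperty.call(obj, key.text)) obj[key.text] = value;
      else if (repeated.has(key.text)) obj[key.text].push(value);
      else {
        obj[key.text] = [obj[key.text], value];
        repeated.add(key.text);
      }
    }
    return obj;
  }

  const root = parsePairs();
  if (pos < tokens.length) throw new Error("unexpected closing brace");
  return root;
}

const sample = 'foo=1\nfoo=2\nbar = { 1 2 }\narmy = { name="army1" unit = { type=infantry } }';
console.log(JSON.stringify(parse(sample)));
```

Note that foo and bar come out as the same array even though they were written differently, which is the lossiness discussed at the top of this thread.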

soryy708 · 2019-12-29T20:32:12Z

Why a recursive descent parser?
I've written a parser in C++ that achieves this with regular expressions, which proves that the language is regular. What files did you look at when deciding it's a context free grammar?

soryy708 · 2019-12-30T10:02:54Z

I've begun a hand-rewrite of the parser, so that the output is optionally instrumented in a way that allows unambiguous serialization.
As a start, I've ported my C++ parser to JS.
https://github.com/soryy708/jomini/tree/parser
There's still some work to be done on the parser and tokenizer, so that the tests will pass.

nickbabcock · 2020-01-03T21:45:01Z

> What files did you look at when deciding it's a context free grammar?

I'm not too privy to computer science terminology, but I believe it is not a regular language, as the format allows arbitrary nesting of delimiters (objects can contain arrays of objects, nested to any depth). It's the same reason why JSON is not regular.

> I've begun a hand-rewrite of the parser, so that the output is optionally instrumented in a way that allows unambiguous serialization.
> As a start, I've ported my C++ parser to JS.

Excellent. I'm more than happy to see what you're thinking.

soryy708 · 2020-01-04T12:05:04Z

Apparently you've also made some hand-rolling progress a while ago:
https://github.com/nickbabcock/jomini/tree/handroll

soryy708 · 2020-01-09T16:02:56Z

Apparently this used to be powered by a hand-rolled parser before. parser.js (4c7ece2)
Why was it migrated to Jison?

soryy708 · 2020-01-09T16:51:10Z

Someone made an F# implementation here: https://github.com/tboby/cwtools/tree/master/CWToolsTests

soryy708 · 2020-01-09T16:57:57Z

Someone made a Python implementation here: https://github.com/Shadark/ClauseWizard/

nickbabcock · 2020-01-10T00:47:50Z

> Apparently this used to be powered by a hand-rolled parser before. parser.js (4c7ece2)
> Why was it migrated to Jison?

Haha, who knew!? Forgot that the commit is from 5 years ago. Looks like I may need to write more descriptive commit messages 😆

My assumption looking at those commits is that jison provided an easier API for development and users at that time. In hindsight, I wish I had iterated on the hand-rolled version, as jison seems unmaintained and a bit baroque, but oh well 🤷‍♂

> Someone made an F# implementation here: https://github.com/tboby/cwtools/tree/master/CWToolsTests

> Someone made a Python implementation here: https://github.com/Shadark/ClauseWizard/

Yeah, there are a lot of parsers out there. I've written my own fair share (C#, C# (2), F#, this one, and other closed-source implementations). Writing parsers for games you love is a great excuse to program 😄

soryy708 · 2020-01-10T12:35:51Z

Are any of these parsers fit for the purpose of unambiguous conversion from JSON back to Clausewitz format? Maybe the cheapest solution is to make a binding between C# and JavaScript (with edge and/or node-gyp)

nickbabcock · 2020-10-04T13:44:34Z

The latest release uses a parser that is functionally lossless so it would be possible to write out a structure (but not from a JS object).

It would be something along the lines of:

const out = parser.parseText(data, {}, (q) => {
  // update an EU4 save so that the player is england
  q.at("/player", "ENG");
  return q.writeTo(/* a writable stream? */);
});

While the latest release makes this feature possible to implement, I don't have a personal drive to implement it, so for now it would need to come from the community. I'm happy to guide anyone who decides to take up the mantle, but until there is a volunteer, I'm going to close this issue.

soryy708 · 2020-10-04T19:14:02Z

@nickbabcock sounds good, and I have some interest in implementing that. I don't know how to interface with your webasm implementation though. Does it have documentation?

nickbabcock · 2020-10-04T21:00:15Z

Excellent, I'll reopen the issue for further discussion.

The underlying parser has documentation.

One can derive inspiration from the code bases that convert binary data to plain text:

The binary data has a slightly different format so it won't be one to one but both text and binary formats functionally behave the same.

CharacterOverflow · 2021-05-22T14:36:48Z

I too started to take a peek into this. I unfortunately don't have a ton of experience, especially with web assembly, and have been pretty lost in trying to make this change.

I noticed @nickbabcock that another library of yours implements this feature: https://github.com/nickbabcock/Pdoxcl2Sharp

I'm considering using C# just for this feature in a tool I'm creating, but figured I'd ask if there's any kind of update coming on this soon or if there's a way I can help.

nickbabcock · 2021-05-22T23:33:45Z

The issue with converting js objects is that some fields will need to be enriched so that they can be written out properly. For instance, we'd want an object like

{
  army: Inflate([{ name: Quoted("army1") }, { name: Quoted("army2") }]),
  type: "western",
  cores: [Quoted("ENG"), Quoted("FRA")]
}

in order to write out:

army={ name="army1" }
army={ name="army2" }
type=western
cores={ "ENG" "FRA" }

In order to facilitate ergonomics, currently the object returned from parsing is not enriched. I would need to see / investigate how one could provide these enriched types without sacrificing ergonomics or performance. Feel free to share ideas or suggestions.
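
As a starting point for discussion, here is a minimal sketch of what such enriched values could look like. Quoted and Inflate are written here as small marker classes (so they need `new`), and writeValue / writeFields are made-up helper names; none of this is an existing jomini API.

```javascript
// Hypothetical marker classes; jomini does not export these.
class Quoted {
  constructor(value) {
    this.value = value;
  }
}

class Inflate {
  constructor(items) {
    this.items = items;
  }
}

// Render a single value: quoted string, list, nested object, or bare scalar.
function writeValue(value) {
  if (value instanceof Quoted) return `"${value.value}"`;
  if (Array.isArray(value)) return `{ ${value.map(writeValue).join(" ")} }`;
  if (typeof value === "object" && value !== null) return `{ ${writeFields(value).join(" ")} }`;
  return String(value);
}

// Render an object's fields; an Inflate value repeats its key once per item.
function writeFields(obj) {
  return Object.entries(obj).flatMap(([key, value]) =>
    value instanceof Inflate
      ? value.items.map((item) => `${key}=${writeValue(item)}`)
      : [`${key}=${writeValue(value)}`]
  );
}

const doc = {
  army: new Inflate([{ name: new Quoted("army1") }, { name: new Quoted("army2") }]),
  type: "western",
  cores: [new Quoted("ENG"), new Quoted("FRA")],
};

console.log(writeFields(doc).join("\n"));
```

The open question from above still stands: the parser would have to hand back these wrappers (or something like them) without making everyday reads more awkward.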

nickbabcock · 2021-05-29T01:27:28Z

I created a PR to allow one to create PDS text documents: #59

Please let me know your feedback and if that PR would close this issue.

Clashsoft · 2023-05-26T11:32:09Z

I have some basic code for writing arbitrary objects, in case anyone finds it useful.
The constants at the start are somewhat game-specific, but can be adapted.
Here I have what works for Stellaris custom empire designs.

const FLAT_ARRAY_KEYS = [
  'ethic',
  'trait',
];
const UNQUOTED_KEYS = [
  'gender',
];

/**
 * @param writer {Writer}
 * @param key {string}
 * @param value {any}
 */
function writeKeyValue(writer, key, value) {
  if (/^[a-zA-Z_]+$/.test(key)) {
    writer.write_unquoted(key);
  } else {
    writer.write_quoted(key);
  }
  writer.write_operator('=');
  writeAny(writer, value, key);
}

/**
 * @param writer {Writer}
 * @param obj {object}
 */
function writeObject(writer, obj) {
  writer.write_object_start();
  writeEntries(writer, obj);
  writer.write_end();
}

/**
 * @param writer {Writer}
 * @param obj {object}
 */
function writeEntries(writer, obj) {
  for (const [key, value] of Object.entries(obj)) {
    if (FLAT_ARRAY_KEYS.includes(key) && Array.isArray(value)) {
      for (const item of value) {
        writeKeyValue(writer, key, item);
      }
    } else {
      writeKeyValue(writer, key, value);
    }
  }
}

/**
 * @param writer {Writer}
 * @param obj {Array}
 */
function writeArray(writer, obj) {
  writer.write_array_start();
  for (const item of obj) {
    writeAny(writer, item);
  }
  writer.write_end();
}

/**
 * @param writer {Writer}
 * @param obj {any}
 * @param key {string}
 */
function writeAny(writer, obj, key = undefined) {
  if (Array.isArray(obj)) {
    writeArray(writer, obj);
  } else switch (typeof obj) {
    case 'string':
      if (UNQUOTED_KEYS.includes(key)) {
        writer.write_unquoted(obj);
      } else {
        writer.write_quoted(obj);
      }
      break;
    case 'number':
      if (Number.isInteger(obj)) {
        writer.write_integer(obj);
      } else {
        writer.write_f64(obj);
      }
      break;
    case 'boolean':
      writer.write_bool(obj);
      break;
    case 'object':
      if (obj instanceof Date) {
        writer.write_date(obj);
      } else if (obj) {
        writeObject(writer, obj);
      }
      break;
  }
}

nickbabcock closed this as completed Oct 4, 2020

nickbabcock reopened this Oct 4, 2020

nickbabcock mentioned this issue Jun 5, 2023

Mention writing arbitrary objects solution in readme #102

Merged

nickbabcock closed this as completed in #102 Jun 6, 2023
