Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize xml helper code #839

Merged
merged 7 commits into from
Mar 22, 2021
Merged

Generalize xml helper code #839

merged 7 commits into from
Mar 22, 2021

Conversation

DanRStevens
Copy link
Collaborator

Related to Issue #797, and PR #838.

It's been quite a little while since I worked on this (about 6 months). At this point I don't quite remember why I stopped and left it. I think there was something I was still working on, though maybe I just got busy with other stuff. It's a conflicting approach to PR #838. At any rate, I thought I'd push what I had as a point of comparison:
https://github.com/lairworks/nas2d-core/compare/generalizeXmlHelperCode?expand=1

Maybe some ideas will be worth taking from there.

This allows parsing of alternate file formats, and returning of
alternate parsed data structures.
It seems old versions of Clang are not able to match the auto return
type parameter in this context. The stock install of Clang on Ubuntu
18.04 is one of those old versions of Clang.
@Brett208
Copy link
Contributor

What are your thoughts about using element values instead of attributes to store data in dictionary format? More like what I did with PlanetAttributes.xml instead of the current save format?

I am generally happy with the concept of storing data in dictionary form and planned to move more towards parsing in this fashion as discussed in the issues. I would just prefer it be in elements. Maybe best for me to switch to attributes though to be more consistent though...

@DanRStevens
Copy link
Collaborator Author

Curious question, under what conditions would you consider using attributes?

I would certainly use element values for recursive structures. Outside of that, I find the choice is a little more arbitrary.

@DanRStevens
Copy link
Collaborator Author

Found some interesting discussion points here:
https://stackoverflow.com/questions/1096797/should-i-use-elements-or-attributes-in-xml

@Brett208
Copy link
Contributor

@DanRStevens,

Sorry for the long time between replies. I read the stack overflow you pointed to, and that article is striking the heart of what I'm talking about, in addition to the post it refers to at (https://stackoverflow.com/questions/33746/xml-attribute-vs-xml-element).

It seems like there are 2 camps, OPHD sort of falls in the first camp that extensively uses attributes and I am used to the other camp. When writing XML, I avoid attributes almost exclusively unless it is metadata that does not belong in the actual structure of the parsed data.

Current OPHD Serialization:

<OutpostHD_SaveGame version="0.31">
    <mines>
        <mine x="14" y="85" depth="0" active="0" yield="0" flags="001111" />
    </mines>
</OutpostHD_SaveGame>

How I would have written the snippet:

<OutpostHD_SaveGame version="0.31">
    <mines>
        <mine>
            <x>14</x>
            <y>85</y>
            <depth>0</depth>
            <active>0</active>
            <yield>0</yield>
            <flags>001111</flags>
        </mine>
    </mines>
</OutpostHD_SaveGame>

How I would have written it is certainly more verbose. Probably a individual preference on which is easier to read.

I would just accept the OPHD style here of using attributes, but I think it breaks down for large text fields. For example:

<Planets>
	<Planet>
		<PlanetType>Mercury</PlanetType>
		<ImagePath>planets/planet_d.png</ImagePath>
		<Hostility>High</Hostility>
		<MaxDepth>1</MaxDepth>
		<MaxMines>10</MaxMines>
		<MapImagePath>maps/merc_01</MapImagePath>
		<TilesetPath>tsets/mercury.png</TilesetPath>
		<Name>Mercury Type</Name>
		<MeanSolarDistance>0.4</MeanSolarDistance>
		<Description>Small, hot and not very friendly. This planet resembles our very own Mercury. Close to its parent star, few potential mine locations, very shallow dig depth, no magnetic field and no atmosphere. &#xA; &#xA; The Payoff? The few mines you do get are higher yield and you get plenty of solar energy.</Description>
	</Planet>
</Planets>

Versus how I think OPHD currently would format it using attributes:

<Planets>
	<Planet PlanetType="Mercury" ImagePath="planets/planet_d.png" Hostility="High" MaxDepth="1" MaxMines="10" MapImagePath="maps/merc_01" TilesetPath="tsets/mercury.png Name="Mercury Type" MeanSolarDistance="0.4" Description="Small, hot and not very friendly. This planet resembles our very own Mercury. Close to its parent star, few potential mine locations, very shallow dig depth, no magnetic field and no atmosphere. &#xA; &#xA; The Payoff? The few mines you do get are higher yield and you get plenty of solar energy.">
</Planets>

(These fields probably should use the CDATA escapes. You may want to copy into notepadd++ or something similar to view. GitHub doesn't wrap the text in 'code' mode, and if I don't put it in code mode, it doesn't want to display it for probably good security reasons.)

In the PlanetAttributes case the significant amount of text involved in the attributes reduces readability whereas the attributes looked cleaner with more numerical fields in the mine representation in the saved game.

I would have probably just stuck with using attributes, but it seems incorrect in my mind to put large text blocks into attributes such as the PlanetDescription.

In the past, I have hand written my XML field parsing, which means that a solution could be tailored either direction as needed. One of the downsides of hand writing the parsing is it tends to couple tightly with the implementation (XML in this case). I understand you are advocating for a universal dictionary approach in the project which would remove the tight coupling. Unfortunately, the universal approach becomes more complex if you create edge conditions such as if the text being serialized is a paragraph, use a child element, else always use attributes.

Based on what I've read so far, I'm leaning towards rewriting the PlanetAttributes to use attributes for all stored data and more align with the current code base.

Admittedly, I have read through some of the source code, played with some examples, and made a cursory read of most the associated issues, and have not really 'deep dived' the issue.

Anyways, if this all sounds good, I would recommend merging this PR and I'll rewrite the PlanetAttributes to match this style. Then I can split PlanetAttributes to just parse a dictionary and push the parsing into NAS2D to remove all xml calls from PlanetAttributes.

Thanks for reading. -Brett

@DanRStevens
Copy link
Collaborator Author

I would agree that attributes are not ideal for large blocks of text, and the CDATA concern is valid.

In terms of using a consistent approach, that might only be relevant for the writing code. It would be possible to adapt the reading code so it could ingest data from either an attribute or a child element. In the case of planet data, it's static data packaged with the game, which will only ever be read.

Previously there was some dual format read code to handle a custom "options" section, which was formatted a little differently. When loading the config file, it would parse either format, but then only save in the one consistent format. It was eventually removed though, after a transition period.

I would be open to more dual format reading code. That would allow the planet data to be read as is, or perhaps with a mixed form where only the longer description field is an element.

@DanRStevens
Copy link
Collaborator Author

Ok, I've had a chance to re-read the code. I remember a couple of things I still wanted to work on:

  1. The last commit changed the template parameters to parseXmlFileData, which required manually specifying the return type at the point of call. I wasn't super happy about it. It was needed to work around some compile problems in older versions of Clang, such as the one shipped with Ubuntu 18.04. Perhaps there's a way to write the method so the return type can be auto inferred, and won't cause compile issues with Clang. I never got around to figuring that out.

  2. Using templates means the function definitions need to be in the header file, which required the XML include to also be in the header file. With older code it was possible to only include the XML header from the .cpp file for the parsing code. I wasn't super happy about moving the include to the header file, which has a cascade include effect on all including files. This pollutes the namespace of those translation units, and leads to (slightly) slower compiles.

As mildly frustrating as those flaws are, they're probably not worth holding back the code updates. Maybe they can be fixed in a future branch.

I'm going to go with Brett's recommendation to merge. Particularly since other fresh items have been brought up that may depends on these or similar changes.

@DanRStevens DanRStevens merged commit 721b39b into master Mar 22, 2021
@DanRStevens DanRStevens deleted the generalizeXmlHelperCode branch March 22, 2021 12:29
@DanRStevens DanRStevens mentioned this pull request Mar 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants