New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from XML to S-Expressions file format #193

Merged
merged 13 commits into from Oct 30, 2017

Conversation

Projects
None yet
2 participants
@ubruhin
Member

ubruhin commented Oct 22, 2017

Example files:

Fixes #192

@ubruhin ubruhin added this to the 0.1 milestone Oct 22, 2017

@ubruhin ubruhin self-assigned this Oct 22, 2017

@dbrgn

This comment has been minimized.

Show comment
Hide comment
@dbrgn

dbrgn Oct 22, 2017

Contributor

Shouldn't dates be quoted? E.g. see here

Contributor

dbrgn commented Oct 22, 2017

Shouldn't dates be quoted? E.g. see here

@dbrgn

This comment has been minimized.

Show comment
Hide comment
@dbrgn

dbrgn Oct 22, 2017

Contributor

Some other remarks:

(name (locale "") "R-US")
(description (locale "") "Resistor American (IEEE 315)")
(keywords (locale "") "resistor,resistance")

I don't quite get that. Shouldn't that be (keywords "" "resistor,resistance")? Or if you want to make the locale optional, (keywords "resistor,resistance")?

I also wonder whether in some cases more newlines would be better:

(signal 6d776f4d-2a7c-4128-a98a-dbb1dd861411 (name "2") (role passive)
  (required true) (negated false) (clock false) (forced_net "")
)

Maybe this would be better like this:

(signal 6d776f4d-2a7c-4128-a98a-dbb1dd861411
  (name "2")
  (role passive)
  (required true)
  (negated false)
  (clock false)
  (forced_net "")
)

It would also help when doing diffs.

Regarding this one:

(disp none)

none is an enum variant, not null, right?

What about flags?

(signal 1c1c7abc-7b40-4f92-b533-f65604644db7 (name "1") (role passive)
  (required true) (negated false) (clock false) (forced_net "")
)

Maybe required could become a flag?

(signal 1c1c7abc-7b40-4f92-b533-f65604644db7 required (name "1") (role passive)
  (negated false) (clock false) (forced_net "")
)

On the other hand it might make the format less generic and harder to parse.

And a general comment, I'd use two spaces for indentation, one is a bit hard to see when dealing with larger structures.

Contributor

dbrgn commented Oct 22, 2017

Some other remarks:

(name (locale "") "R-US")
(description (locale "") "Resistor American (IEEE 315)")
(keywords (locale "") "resistor,resistance")

I don't quite get that. Shouldn't that be (keywords "" "resistor,resistance")? Or if you want to make the locale optional, (keywords "resistor,resistance")?

I also wonder whether in some cases more newlines would be better:

(signal 6d776f4d-2a7c-4128-a98a-dbb1dd861411 (name "2") (role passive)
  (required true) (negated false) (clock false) (forced_net "")
)

Maybe this would be better like this:

(signal 6d776f4d-2a7c-4128-a98a-dbb1dd861411
  (name "2")
  (role passive)
  (required true)
  (negated false)
  (clock false)
  (forced_net "")
)

It would also help when doing diffs.

Regarding this one:

(disp none)

none is an enum variant, not null, right?

What about flags?

(signal 1c1c7abc-7b40-4f92-b533-f65604644db7 (name "1") (role passive)
  (required true) (negated false) (clock false) (forced_net "")
)

Maybe required could become a flag?

(signal 1c1c7abc-7b40-4f92-b533-f65604644db7 required (name "1") (role passive)
  (negated false) (clock false) (forced_net "")
)

On the other hand it might make the format less generic and harder to parse.

And a general comment, I'd use two spaces for indentation, one is a bit hard to see when dealing with larger structures.

@ubruhin

This comment has been minimized.

Show comment
Hide comment
@ubruhin

ubruhin Oct 22, 2017

Member

Shouldn't dates be quoted?

I saw no reason to quote them because the format of dates is well known (contains only letters, digits, :, - and +) and thus is always a valid S-Expression token.

Member

ubruhin commented Oct 22, 2017

Shouldn't dates be quoted?

I saw no reason to quote them because the format of dates is well known (contains only letters, digits, :, - and +) and thus is always a valid S-Expression token.

@ubruhin

This comment has been minimized.

Show comment
Hide comment
@ubruhin

ubruhin Oct 22, 2017

Member

I don't quite get that. Shouldn't that be (keywords "" "resistor,resistance")? Or if you want to make the locale optional, (keywords "resistor,resistance")?

I expected that question 😃 I already spent some time to improve the format of localized strings (name, description, keywords), but unfortunately I didn't found a good solution.

Actually the main question is: How should the localizing of these attributes work? My current idea is that for each library element there are fields for "name", "description" and "keywords" without a specific locale. So if you create libraries only for yourself, you can for example fill these fields with German words. In addition, translated strings can optionally be added, e.g. for your libraries someone could add English translations.

For that system, the most reasonable file format would probably look like this:

(name "Hallo Welt")
(name (locale "en_US") "Hello World")

But this is really ugly to parse, because the text is not always at the same index of the node (index 0 without locale, index 1 with locale). Of course it would be possible to either take the count of child nodes, or the type of each child node into account, but it feels like a workaround to do that.

If the order of the child nodes is swapped, parsing gets easier, but IMO the file looks less intuitive:

(name "Hallo Welt")
(name "Hello World" (locale "en_US"))

Another solution is to always write down the locale, but without an extra child node, but this also looks strange:

(name "" "Hallo Welt")
(name "en_US" "Hello World")

Maybe it would make sense to make the locale mandatory, so every string has a locale defined?

(name (locale de_CH) "Hallo Welt")
(name (locale en_US) "Hello World")

But this would bring up a new question: Which locale should be used if the user's locale is not found? So if you have set your default locale to "cs_CZ", which of the two available locales "de_CH" and "en_US" should be displayed? When having default/fallback name/description/keywords, these will be used in such cases.

Do you have a good answer for these questions? :)

I also wonder whether in some cases more newlines would be better:
It would also help when doing diffs

It helps doing diffs on short files. But when adding more newlines, many files get much longer (e.g. >1000 lines), which is also not nice. I think creating nice looking files is a trade-off between length of lines and count of lines.

And a general comment, I'd use two spaces for indentation

Of course this would improve readability, but here I also see a trade-off between file size and readability. Optimal for file size would be zero indentation (and even no newlines), optimal for readability ~2-4 spaces. So for me, 1 space seems to be a good compromise (keep in mind that you will probably never write such files by hand, and even reading such files should not be daily business).

none is an enum variant, not null, right?

Right.

What about flags?

Looks interesting, need to think about it... ;)

EDIT: Thought about it ~10seconds 😄 and my answer is: It makes files less self-documented. If a flag is missing, it's not visible that there could be such a flag. A boolean value (e.g. (required false)) is much more expressive and also easier to parse.

Member

ubruhin commented Oct 22, 2017

I don't quite get that. Shouldn't that be (keywords "" "resistor,resistance")? Or if you want to make the locale optional, (keywords "resistor,resistance")?

I expected that question 😃 I already spent some time to improve the format of localized strings (name, description, keywords), but unfortunately I didn't found a good solution.

Actually the main question is: How should the localizing of these attributes work? My current idea is that for each library element there are fields for "name", "description" and "keywords" without a specific locale. So if you create libraries only for yourself, you can for example fill these fields with German words. In addition, translated strings can optionally be added, e.g. for your libraries someone could add English translations.

For that system, the most reasonable file format would probably look like this:

(name "Hallo Welt")
(name (locale "en_US") "Hello World")

But this is really ugly to parse, because the text is not always at the same index of the node (index 0 without locale, index 1 with locale). Of course it would be possible to either take the count of child nodes, or the type of each child node into account, but it feels like a workaround to do that.

If the order of the child nodes is swapped, parsing gets easier, but IMO the file looks less intuitive:

(name "Hallo Welt")
(name "Hello World" (locale "en_US"))

Another solution is to always write down the locale, but without an extra child node, but this also looks strange:

(name "" "Hallo Welt")
(name "en_US" "Hello World")

Maybe it would make sense to make the locale mandatory, so every string has a locale defined?

(name (locale de_CH) "Hallo Welt")
(name (locale en_US) "Hello World")

But this would bring up a new question: Which locale should be used if the user's locale is not found? So if you have set your default locale to "cs_CZ", which of the two available locales "de_CH" and "en_US" should be displayed? When having default/fallback name/description/keywords, these will be used in such cases.

Do you have a good answer for these questions? :)

I also wonder whether in some cases more newlines would be better:
It would also help when doing diffs

It helps doing diffs on short files. But when adding more newlines, many files get much longer (e.g. >1000 lines), which is also not nice. I think creating nice looking files is a trade-off between length of lines and count of lines.

And a general comment, I'd use two spaces for indentation

Of course this would improve readability, but here I also see a trade-off between file size and readability. Optimal for file size would be zero indentation (and even no newlines), optimal for readability ~2-4 spaces. So for me, 1 space seems to be a good compromise (keep in mind that you will probably never write such files by hand, and even reading such files should not be daily business).

none is an enum variant, not null, right?

Right.

What about flags?

Looks interesting, need to think about it... ;)

EDIT: Thought about it ~10seconds 😄 and my answer is: It makes files less self-documented. If a flag is missing, it's not visible that there could be such a flag. A boolean value (e.g. (required false)) is much more expressive and also easier to parse.

@ubruhin

This comment has been minimized.

Show comment
Hide comment
@ubruhin

ubruhin Oct 24, 2017

Member

Ok now I updated the format of localized strings by making the "locale" node optional:

(name "Hallo Welt")
(name (locale "en_US") "Hello World")

IMO the files look much better this way, but of course parsing feels a bit crappy now...

What do you think @dbrgn?

Btw, the links from the first post above always point to the latest file format, just check out these files again to see the difference.

Member

ubruhin commented Oct 24, 2017

Ok now I updated the format of localized strings by making the "locale" node optional:

(name "Hallo Welt")
(name (locale "en_US") "Hello World")

IMO the files look much better this way, but of course parsing feels a bit crappy now...

What do you think @dbrgn?

Btw, the links from the first post above always point to the latest file format, just check out these files again to see the difference.

@dbrgn

This comment has been minimized.

Show comment
Hide comment
@dbrgn

dbrgn Oct 25, 2017

Contributor

Don't you initially parse the S-Expression into a generic nested key-value tree? In that case parsing would not make a difference.

In general I like the approach with the optional locale node.

Contributor

dbrgn commented Oct 25, 2017

Don't you initially parse the S-Expression into a generic nested key-value tree? In that case parsing would not make a difference.

In general I like the approach with the optional locale node.

@ubruhin

This comment has been minimized.

Show comment
Hide comment
@ubruhin

ubruhin Oct 25, 2017

Member

Don't you initially parse the S-Expression into a generic nested key-value tree?

Yes, I do.

In that case parsing would not make a difference.

"Low-level" parsing does not make a difference, but serializing and deserializing of objects is a bit ugly. See 3717032, this is the change needed to make the "locale" node optional.

Member

ubruhin commented Oct 25, 2017

Don't you initially parse the S-Expression into a generic nested key-value tree?

Yes, I do.

In that case parsing would not make a difference.

"Low-level" parsing does not make a difference, but serializing and deserializing of objects is a bit ugly. See 3717032, this is the change needed to make the "locale" node optional.

@dbrgn

This comment has been minimized.

Show comment
Hide comment
@dbrgn

dbrgn Oct 25, 2017

Contributor

this is the change needed to make the "locale" node optional

That isn't too bad though :)

Contributor

dbrgn commented Oct 25, 2017

this is the change needed to make the "locale" node optional

That isn't too bad though :)

@ubruhin ubruhin changed the title from WIP: Switch from XML to S-Expressions file format to Switch from XML to S-Expressions file format Oct 27, 2017

@ubruhin

This comment has been minimized.

Show comment
Hide comment
@ubruhin

ubruhin Oct 27, 2017

Member

I think this should be ready now!

I have even added the new mime type application/x-librepcb-file and a language specification to enable syntax highlighting of *.lp files in GtkSourceView based editors.

@dbrgn @rnestler Someone interested in providing a syntax highlighting config for Vim? Or does Vim also support the GtkSourceView language specs?

Member

ubruhin commented Oct 27, 2017

I think this should be ready now!

I have even added the new mime type application/x-librepcb-file and a language specification to enable syntax highlighting of *.lp files in GtkSourceView based editors.

@dbrgn @rnestler Someone interested in providing a syntax highlighting config for Vim? Or does Vim also support the GtkSourceView language specs?

ubruhin added some commits Oct 28, 2017

Switch core classes from XML to S-Expressions
All serialization core classes are changed to work with S-Expressions
instead of XML. All other classes will follow soon.
Remove trailing zeros of numbers in all files
Affected classes:
- librepcb::Angle
- librepcb::Length
- librepcb::Ratio
Update submodule "demo-workspace"
Convert all XML files to S-Expressions
Add "gtksourceview-3.0" language spec for *.lpp files
This can be used to enable syntax highlighting for *.lp files in some
text editors (e.g. Gedit on Ubuntu). Just execute the shell script to
install the language specs.

@ubruhin ubruhin merged commit fb70549 into master Oct 30, 2017

4 checks passed

continuous-integration/appveyor/branch AppVeyor build succeeded
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@ubruhin ubruhin deleted the switch_from_xml_to_sexpr branch Oct 30, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment