Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addition of a document type definition for library configurations #457

Conversation

elfring
Copy link
Contributor

@elfring elfring commented Oct 26, 2014

A document type definition can help to work with corresponding files because it describes the involved data structures to some degree. Such a DTD can be added so that further improvements will become easier for the safe data exchange around library configurations for static source code analysis.

A DTD format was standardised as a basis for XML documents. Some technical challenges and limitations were noticed so that another standard like W3C XML Schema evolved. Such a XML schema definition can also be added so that corresponding file handling might become more consistent.

  • Would you like to try this approach out?
  • How are the chances to integrate this update suggestion into your source code repository?

Markus Elfring added 2 commits October 26, 2014 19:20
A document type definition can help to work with corresponding files
because it describes the involved data structures to some degree.
https://en.wikipedia.org/wiki/Document_type_definition

Such a DTD was added so that further improvements will become easier for
the safe data exchange around library configurations for static source
code analysis.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
A document type definition can help to work with corresponding files
because it describes the involved data structures to some degree.
Such a format was standardised as a basis for XML documents. Some technical
challenges and limitations were noticed so that another standard like
W3C XML Schema evolved.
https://en.wikipedia.org/wiki/XML_Schema_%28W3C%29

Such a XML schema definition was added so that further improvements will
become easier for the safe data exchange around library configurations for
static source code analysis.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
@amaigil2
Copy link

I'm in favor of this.
In order to benefit from this in (daily) development one could choose a popular XML validation tool and add a suitable target to the Makefile.

@PKEuS
Copy link
Contributor

PKEuS commented Oct 28, 2014

Suggestions:

  • Use only one format, preferably xsd
  • For the dtd, the scripts need to be changed so that xmllint ignores it

@elfring
Copy link
Contributor Author

elfring commented Oct 28, 2014

I find that both file format descriptions are useful. I am curious if my suggestion fits also to your usage expectations with the CFG files.

@PKEuS
Copy link
Contributor

PKEuS commented Oct 28, 2014

Having two files means we have to maintain two files. As the information is redundant, one would be sufficient.

@elfring
Copy link
Contributor Author

elfring commented Oct 28, 2014

I hope also that a XML schema can be preferred over a document type definition.
Does anybody need a DTD for a DOCTYPE specification (instead of a XSD which will be referenced in XML element attributes) eventually?

@amaigil2
Copy link

XSD is fine for me. So we would like to have Makefile support based on xmllint for validation?

@danmar
Copy link
Owner

danmar commented Nov 1, 2014

xsd sounds good to me too.

@elfring
Copy link
Contributor Author

elfring commented Nov 1, 2014

Do the suggested data structure description files need any more fine-tuning?

@danmar
Copy link
Owner

danmar commented Nov 2, 2014

I fail to use the xsd file you created:

daniel@dator:~/cppcheck/cfg$ xmllint --schema library.xsd --noout std.cfg
library.xsd:5: element key: Schemas parser error : Element '{http://www.w3.org/2001/XMLSchema}schema': The content is not valid. Expected is ((include | import | redefine | annotation)*, (((simpleType | complexType | group | attributeGroup) | element | attribute | notation), annotation*)*).
library.xsd:9: element key: Schemas parser error : Element '{http://www.w3.org/2001/XMLSchema}schema': The content is not valid. Expected is ((include | import | redefine | annotation)*, (((simpleType | complexType | group | attributeGroup) | element | attribute | notation), annotation*)*).
library.xsd:13: element key: Schemas parser error : Element '{http://www.w3.org/2001/XMLSchema}schema': The content is not valid. Expected is ((include | import | redefine | annotation)*, (((simpleType | complexType | group | attributeGroup) | element | attribute | notation), annotation*)*).
library.xsd:17: element key: Schemas parser error : Element '{http://www.w3.org/2001/XMLSchema}schema': The content is not valid. Expected is ((include | import | redefine | annotation)*, (((simpleType | complexType | group | attributeGroup) | element | attribute | notation), annotation*)*).
library.xsd:91: element complexType: Schemas parser error : Element '{http://www.w3.org/2001/XMLSchema}element': The attribute 'type' and the <complexType> child are mutually exclusive.
library.xsd:105: element complexType: Schemas parser error : Element '{http://www.w3.org/2001/XMLSchema}element': The attribute 'type' and the <complexType> child are mutually exclusive.
WXS schema library.xsd failed to compile

is the xsd wrong, or do I use it wrongly somehow?

@elfring
Copy link
Contributor Author

elfring commented Nov 2, 2014

@danmar: You stumbled on an update candidate in my data description suggestion "library.xsd" which I accidentally left over.
Do you see any more places that will need further considerations?

@PKEuS
Copy link
Contributor

PKEuS commented Nov 2, 2014

Can you update your pull request, fixing that issue and removing the dtd? Then we can check again.

@elfring
Copy link
Contributor Author

elfring commented Nov 2, 2014

Should I collect any more clarifications and fixes before I would submit another pull request?

@PKEuS
Copy link
Contributor

PKEuS commented Nov 2, 2014

You don't have to create a new pull-request. Just do a force-push on the branch

@elfring
Copy link
Contributor Author

elfring commented Nov 4, 2014

Do I interpret the implementation of the member function "Library::load" in the right way that the XML element "exporter" should deal with another element and attribute with the name "prefix"?
When will this data structure be explained in the manual besides other update candidates?

@danmar
Copy link
Owner

danmar commented Nov 5, 2014

I don't remember well how the element "exporter" works. Its for functionality similar to what is available in Qt where functions etc can be called "behind your back" from the framework. If you want to know more about it you'll have to investigate it yourself.

@elfring
Copy link
Contributor Author

elfring commented Nov 5, 2014

@danmar: I am surprised by your feedback. Should the element "exporter" (or "exported"?) really belong to the XML structure "format 1" for library configurations then?

@danmar
Copy link
Owner

danmar commented Nov 6, 2014

yes we should keep it.

@elfring
Copy link
Contributor Author

elfring commented Nov 6, 2014

How do you think about to describe your addition of the markup element also in the manual?

@amai2012
Copy link
Collaborator

I suggest we move the discussion about documentation outside the XSD to another place.
Do you intend to update your branch here?

@elfring
Copy link
Contributor Author

elfring commented Nov 11, 2014

@amai2012:

  • How and where do you want to move this discussion?
  • I would dare to publish further patch suggestions if unclear implementation details can be resolved somehow.

@PKEuS
Copy link
Contributor

PKEuS commented Nov 11, 2014

I would recommend to not make an endless discussion out of this, instead you should try to finish this particular pull request. Applying this does not mean that further improvements can't be done later.

@elfring
Copy link
Contributor Author

elfring commented Nov 12, 2014

@PKEuS: I hope also that a constructive discussion will not become endless. But I imagine that the requested data structure clarifications will need more efforts.
How many design details can only be resolved with feedback from @danmar?

@PKEuS
Copy link
Contributor

PKEuS commented Jan 11, 2015

Feel free to reopen if there is any progress.

@PKEuS PKEuS closed this Jan 11, 2015
@elfring
Copy link
Contributor Author

elfring commented Jan 11, 2015

@PKEuS: I would like to reopen this issue. Unfortunately, the corresponding button is not activated for me here.

I improved my data format descriptions a bit a while ago already.

  • How many design details can also be eventually resolved without feedback from @danmar?
  • Would you like to continue a constructive clarification?

@PKEuS
Copy link
Contributor

PKEuS commented Jan 11, 2015

This pull request still contains the same dtd and xsd as it did when you opened it. So, there was no visible progress.

We decided that we would like to have one scheme (xsd), not two. So far, the pull-request contains both. Additionally, both schemes don't fit to the format, as they don't support all tags and attributes that might be used inside cfg files. Yes, there are tags that are not documented, but a scheme that does not work for the cfg files that are provided with cppcheck does not help.

@PKEuS PKEuS reopened this Jan 11, 2015
@elfring
Copy link
Contributor Author

elfring commented Jan 11, 2015

I imagine that a DTD file can also help to clarify open issues here.

@PKEuS: Would you dare to invest more efforts to make the preferred XSD file variant really usable?

@PKEuS
Copy link
Contributor

PKEuS commented Jan 27, 2015

noreturn is not the same as return type void. No return means that the function might behave like exit()

@elfring
Copy link
Contributor Author

elfring commented Jan 27, 2015

But they are both used sometimes if the function is not noreturn.

How should such an use case be safely expressed in a DTD and/or XSD?

I would be able to insert review comments.

I agree in principle that the other style of GitHub code commenting functionality can also be nice. I find that our clarification might work in the other way, too.
How do you think about to try additional telecommunication tools out?

@elfring
Copy link
Contributor Author

elfring commented Jan 27, 2015

No return means that the function might behave like exit()

Do such functions have got a non-void return type?

@PKEuS
Copy link
Contributor

PKEuS commented Jan 27, 2015

why not?

@elfring
Copy link
Contributor Author

elfring commented Jan 27, 2015

why not?

Does a return value become usable from a function implementation which does not return actually?

@PKEuS
Copy link
Contributor

PKEuS commented Jan 27, 2015

a function could return only conditionally and exit in other cases. assert() for example.

@elfring
Copy link
Contributor Author

elfring commented Jan 27, 2015

a function could return only conditionally and exit in other cases.

That is fine, of course. - Would you like to distinguish functions that will always not return from those implementations which will occasionally not return because of situations like error/exception handling?

@danmar
Copy link
Owner

danmar commented Jan 28, 2015

I don't think we should debate noreturn here. concentrate on writing a XSD that works for our CFG files thank you. If you try your code against our cfg files you will see if it works or not, and then we don't have to review and debate a non-working schema.

could you please use the white-list approach to start with.. make everything in our existing cfg files allowed. then if you want to black-list some configurations we can discuss that later. then the latest and greatest work-in-progress schema will be a working schema that we can try.

@danmar
Copy link
Owner

danmar commented Jan 28, 2015

I added a rng file with 7d0f5ad that can validate all our cfg files.

It's not super strict so feel free to investigate if you can make it stricter.

I don't know.. do we want a XSD also/instead?

@elfring
Copy link
Contributor Author

elfring commented Jan 28, 2015

I don't think we should debate noreturn here.

It seems that it is still needed to clarify opinions around the combination of the elements "noreturn" and "use-retval" so that it will be found out how they should really work together.

make everything in our existing cfg files allowed.

I am not going to follow this suggestion. I find that safe data format specifications deal with some limitations and restrictions to prevent data rubbish.

I added a rng file

Interesting addition… - Do you prefer this file format for data structure descriptions?
Which approach will become the leading one for your software?

@danmar
Copy link
Owner

danmar commented Jan 28, 2015

It seems that it is still needed to clarify opinions around the combination of the elements "noreturn" and "use-retval" so that it will be found out how they should really work together.

EDIT: I would like to keep the existing behaviour.

It is true that if use-retval is used then we can guess that the function is not noreturn so it's redundant to use noreturn. but I still like to have these as 2 separate options.

I am not going to follow this suggestion. I find that safe data format specifications deal with some limitations and restrictions to prevent data rubbish.

ok. do it your way but I don't like to debate when I don't get any results.

Interesting addition… - Do you prefer this file format for data structure descriptions?

I am not an xml expert. I only used rng because I know it.

as far as I have heard - but maybe I am wrong - the advantage with xsd is that tool support is better.

@PKEuS
Copy link
Contributor

PKEuS commented Jan 28, 2015

noreturn and use-retval do not interact.

@elfring
Copy link
Contributor Author

elfring commented Jan 28, 2015

I pointed out one example.

I find that the specification of the element "noreturn" is unnecessary (in an use case like the function "cproj") when the element "use-retval" is also specified.

I don't like to debate when I don't get any visible results.

Some open issues should be clarified before further contributions will become visible, shouldn't they?

@danmar
Copy link
Owner

danmar commented Jan 28, 2015

I find that the specification of the element "noreturn" is unnecessary (in an use case like the function "cproj") when the element "use-retval" is also specified.

I understand. I want it to work as it does now.

Some open issues should be clarified before further contributions will become visible, shouldn't they?

I believe we could debate for months and still see no results. I don't want to do that.

if you think the schema I checked in should be made stricter in some way.. feel free to discuss it. I am open to suggestions. I am sure there are much to do.

@danmar
Copy link
Owner

danmar commented Jan 28, 2015

I am sure there are much to do.

@PKEuS Feel free to take an extra look at the rules for container.

@elfring
Copy link
Contributor Author

elfring commented Jan 28, 2015

I believe we could debate for months

I guess that we are still going to discuss various issues in the following years. ;-)

and still see no results.

It might happen that results willl not appear with the speed you would prefer.

  • But I imagine that the reuse of additional telecommunication tools could increase the clarification speed.
  • Your introduction of an alternative data format description will also help to achieve a common understanding a bit easier if you are more familiar with this approach already.

if you think the schema I checked in should be made stricter in some way

I suggest further considerations for corresponding fine-tuning of your addition.
Would you like to convert this RELAX NG file to other data format descriptions automatically?

@elfring
Copy link
Contributor Author

elfring commented Jan 28, 2015

It is true that if use-retval is used then we can guess that the function is not noreturn so it's redundant to use noreturn.

Thanks for your acknowledgement.

but I still like to have these as 2 separate options.

How do you think about to present a warning to the data validation users if a questionable element combination was specified?
Would it be nicer if a data format specification could prevent such difficulties?

@danmar
Copy link
Owner

danmar commented Jan 29, 2015

How do you think about to present a warning to the data validation users if a questionable element combination was specified?

what do you mean with "questionable". If you mean "a function is noreturn and has the use-retval option", then that is a bit questionable. An information message when --check-library is used would be fine. A function that is not noreturn and has the use-retval option, as in the example, is not questionable and I don't think any message should be shown.

@elfring
Copy link
Contributor Author

elfring commented Jan 29, 2015

If you mean "a function is noreturn and has the use-retval option", …

Yes, exactly this special case.

  • How do you think about to avoid such constraints in your data format?
  • Would you like to introduce any settings which specify that a corresponding configuration value should get priority over an other to make this meta-data definition consistent?

@danmar
Copy link
Owner

danmar commented Jan 30, 2015

I want it to work as it does now.

Would you like to introduce any settings which specify that a corresponding configuration value should get priority over an other to make this meta-data definition consistent?

no. these are 2 separate settings I don't want to mix them anywhere.

@danmar
Copy link
Owner

danmar commented Feb 9, 2015

There is nothing to apply right now.. so I close this.

@danmar danmar closed this Feb 9, 2015
@elfring
Copy link
Contributor Author

elfring commented Apr 11, 2018

Do any contributors care to clarify this topic again so that library configurations can be safely completed by corresponding editors?

@danmar
Copy link
Owner

danmar commented Apr 11, 2018

we have cppcheck-cfg.rng in the repo. it is recommended to use that for validation. You can use the GUI to edit the cfg files.

@elfring
Copy link
Contributor Author

elfring commented Apr 11, 2018

Where is the specification “DATA-EXTNAME” explained so that data like the following can be safely validated?

…
  <function name="execv,execvp">
…

Should comments be explicitly assigned to such an XML element?

@amai2012
Copy link
Collaborator

Where is the specification “DATA-EXTNAME” explained so that data like the following can be safely validated?

https://github.com/danmar/cppcheck/blob/master/cfg/cppcheck-cfg.rng#L409 ?!

@amai2012
Copy link
Collaborator

@elfring https://sourceforge.net/p/cppcheck/discussion/general/ is probably the better place for discussions and questions. Here only few people are going to read this.

@elfring
Copy link
Contributor Author

elfring commented Apr 11, 2018

It would be nice if a data structure description can be achieved that is safer than the pattern “[a-zA-Z_][a-zA-Z_0-9:,]*”.
Would you like to express a list of valid function names (or aliases) in any other ways?

@danmar
Copy link
Owner

danmar commented Apr 12, 2018

The manual describes the library format in chapter 10. That can certainly be improved.

@elfring
Copy link
Contributor Author

elfring commented Apr 12, 2018

I am curious on how further improvements can be achieved for affected documentation source files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants