Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preparation for v1.0 – Peer Review #33

Closed
Airblader opened this issue Mar 30, 2016 · 17 comments
Closed

Preparation for v1.0 – Peer Review #33

Airblader opened this issue Mar 30, 2016 · 17 comments
Milestone

Comments

@Airblader
Copy link
Owner

This issue serves as a sync point for a review by anyone who wants to review. Please add any comment here and/or create issues for it. Please also note when you are done with the review.

The idea of this is to have this new library reviewed before a first stable release to avoid breakage.

@Airblader Airblader added this to the v1.0 milestone Mar 30, 2016
@psychon
Copy link
Contributor

psychon commented Mar 30, 2016

Took a quick first look at everything except for all the C code (mostly the build system):

  • Don't ship test-driver, it can just be removed (it's installed by autoreconf) (Edit: "Ship" as in "have in git", make dist will of course still include it into the tarball)

  • configure.ac, is the following necessary (the result of these tests seems to be unused:)
    AC_TYPE_SSIZE_T

    AC_CHECK_HEADERS([endian.h sys/endian.h sys/byteorder.h libkern/OSByteOrder.h], [break])
    AC_CHECK_FUNCS([le32toh])

  • xcb_xrm_intro.in looks weird. Are all of these really authors? The @Date? The Intro?

  • Makefile.am:
    TEST_LIBS = $(shell pkg-config --libs x11 x11-xcb xcb xcb-aux)
    That's... unusual. I don't know if this is portable to non-gnu-make. Something like this should be done in configure.ac with the usual macros (and error messages if one of the libraries is missing)

  • libxcb_xrm_la_LIBADD = $(XCB_LIBS) -lm

    This "-lm" seems to be unneeded.
    Edit: Ok, it's needed for ceil. How about replacing ceil(reply->bytes_after/4.0) with (reply->bytes_after + 3) / 4 /* Add 3 for the needed rounding */

  • "make distcheck" errors out (Edit: Actually, make dist is already broken):

$ make distcheck
make  dist-bzip2 dist-gzip am__post_remove_distdir='@:'
make[1]: Entering directory '/tmp/xcb-util-xrm'
(GIT_DIR=./.git git log > ./.changelog.tmp && mv ./.changelog.tmp ./ChangeLog) || (rm -f ./.changelog.tmp; touch ./ChangeLog; echo 'git directory not found: installing possibly empty changelog.' >&2)
(cp -f /usr/share/util-macros/INSTALL ./.INSTALL.tmp && mv ./.INSTALL.tmp ./INSTALL) || (rm -f ./.INSTALL.tmp; touch ./INSTALL; echo 'util-macros "pkgdatadir" from xorg-macros.pc not found: installing possibly empty INSTALL.' >&2)  
make[1]: *** No rule to make target 'include/xrm.h', needed by 'distdir'.  Schluss.
  • Warnings during build:
    src/database.c: In Funktion »xcb_xrm_database_from_string«:
    src/database.c:141:17: Warnung: ISO-C90 verbietet gemischte Deklarationen und Code [-Wdeclaration-after-statement]
    char *filename = scalloc(1, j - i + 2);
    ^
  • Addendum: Tests use non-public API. The only test file, tests/test.c, includes headers which are not installed. Perhaps it would be nice to split this up into an "internal" test using non-public API and some "external" test which only uses public API?
  • Another Addendum: Accidentally(?) exported symbols. The following functions are exported, but not defined in xcb_xrm.h. I guess this means that they shouldn't actually be exported:
  • xcb_xrm_database_put
  • xcb_xrm_entry_compare
  • xcb_xrm_entry_escape_value
  • xcb_xrm_entry_free (Used by the tests)
  • xcb_xrm_entry_num_components
  • xcb_xrm_entry_parse (Used by the tests)
  • xcb_xrm_entry_to_string
  • xcb_xrm_match

I didn't yet look into fixing most of them. Feel free to tell me what I should propose patches for.

Edit: To fix make distcheck, apply the following patch:

diff --git a/Makefile.am b/Makefile.am
index cb0163a..bd9fe30 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -14,7 +14,9 @@ ChangeLog:

 dist-hook: ChangeLog INSTALL

-EXTRA_DIST = autogen.sh xcb-xrm.pc.in include/xrm.h
+EXTRA_DIST = autogen.sh xcb-xrm.pc.in include/xcb_xrm.h include/database.h
+EXTRA_DIST += include/entry.h include/externals.h include/match.h
+EXTRA_DIST += include/resource.h include/util.h

 lib_LTLIBRARIES = libxcb-xrm.la

You can now produce release tarballs with make dist.

@Airblader
Copy link
Owner Author

Thanks for all the findings :) Some comments:

That's... unusual. I don't know if this is portable to non-gnu-make. Something like this should be done in configure.ac with the usual macros (and error messages if one of the libraries is missing)

If you want to send a patch, I'll be happy to merge it. It's only necessary when running tests, though, not for building the library, so I think it's rather low priority.

Edit: Ok, it's needed for ceil. How about replacing ceil(reply->bytes_after/4.0) with (reply->bytes_after + 3) / 4 /* Add 3 for the needed rounding */

Is there something "bad" about -lm? In general I'd prefer to use standard tools instead of reinventing the wheel, but if this would cause problems on some platform that we otherwise wouldn't have…

Warnings during build

Weird. I always check for this. Must've forgotten recently… :)

Addendum: Tests use non-public API. The only test file, tests/test.c, includes headers which are not installed. Perhaps it would be nice to split this up into an "internal" test using non-public API and some "external" test which only uses public API?

I've tried splitting the test suite into multiple source files before (and one file for the utility functions) and failed miserably. If you can send a PR that at least does the infrastructure change for this, I'd love to split the different tests.

Another Addendum: Accidentally(?) exported symbols. The following functions are exported, but not defined in xcb_xrm.h. I guess this means that they shouldn't actually be exported:

Any application using the header file can't use those, though. They'd have to include the actual source file, no? Those aren't static because a) other files and b) the tests use them. I'm not sure how I'd fix this. I thought making them non-static but not defining them in the library's header file would be the right way.

Edit: To fix make distcheck, apply the following patch: […]

Awesome, thanks! I'll apply that patch. Of course feel free to send PRs if you want the commit to be in your name. :)

@psychon
Copy link
Contributor

psychon commented Mar 30, 2016

Is there something "bad" about -lm?

Nope & ok. I marked the checkbox.

Those aren't static because a) other files and b) the tests use them. I'm not sure how I'd fix this.

Ok. I checked which of those functions is actually used by the tests. That's just two of them. All the other ones you can just rename to something else (they are exported because of -export-symbols-regex '^xcb_xrm_' in Makefile.am).
What remains is xcb_xrm_entry_parse and xcb_xrm_entry_free. Perhaps the test can be rewritten to use xcb_xrm_database_put_resource_line instead? I don't know for sure, but a quick grep says that this might be a useful candidate. If not.... hmmm.

@Airblader
Copy link
Owner Author

Thanks for the information. I'll rename those to _xcb_xrm_… then. I'll also try to take a look at the tests. If push comes to shove, maybe we can just include those in the library header guarded by some #ifdef DEBUG or something?

@psychon
Copy link
Contributor

psychon commented Mar 30, 2016

That'd make them even more part of the API and ABI of the library. ;-)
I only looked at the exported symbols since accidentally exported symbols can be used and are technically still part of the ABI.

Edit: Personally, I'd rather leave this the way with the two remaining functions being exported, but not declared in a header (if no alternative shows up).

@Airblader
Copy link
Owner Author

Personally, I'd rather leave this the way with the two remaining functions being exported, but not declared in a header (if no alternative shows up).

OK. I'll see if I can easily rewrite the tests then and if not we'll leave it for now.

@psychon
Copy link
Contributor

psychon commented Mar 30, 2016

Some comments on the API:

  • externals.h is included in xcb_xrm.h, but that header is not (and I guess: should not) be installed. Replace with stdbool.h and xcb/xcb.h.
  • I miss a function like XGetDefault(). From looking at its source, this function:
    • tries to get a DB from the server
    • if that fails, it loads ~/.Xdefaults
    • For some reason that I don't understand, it also loads the file specified by $XENVIRONMENT (env var, defaults to ~/.Xdefaults- if unset) and merges the two databases (don't ask me which one wins on conflicts)
  • Typos: thie -> this
  • Should the docs state which files handles #include-directives? (all db-loading functions do, so I'm not sure)
  • Why does xcb_xrm_database_combine destroy the source DB? I think it makes more sense to keep it alive and not mess with at all. This also makes it easier to track the lifetime of DBs when looking at some piece of source code (every DB must eventually end up in xcb_xrm_database_free)
  • The API docs refer to "error codes", but never explains these. From looking at the implementation, I guess that all functions return 0 on success and -1 otherwise. Should this perhaps be converted to a boolean?
  • Why does xcb_xrm_resource_value copy the string that it returns? I'd return a const char* that is valid as long as the xcb_xrm_resource_t is alive. Callers who want can duplicate this value themselves. Otherwise, I'd guess that it's easy to have memory leaks due to this.
  • Simple patch:
@@ -202,7 +213,7 @@ void xcb_xrm_database_free(xcb_xrm_database_t *database);
  * components as the resource name.
  * @param resource A pointer to a xcb_xrm_resource_t* which will be modified to
  * contain the matched resource. Note that this resource must be free'd by the
- * caller.
+ * caller via a call to @ref xcb_xrm_resource_free ().
  * @return 0 on success, a negative error code otherwise.
  */
 int xcb_xrm_resource_get(xcb_xrm_database_t *database, const char *res_name, const char *res_class,

Other random comments will follow (likely as pull requests, if I finally find the time). My "notes" are at https://gist.github.com/psychon/73344f301e8b6947f0512374e978c771.

@Airblader
Copy link
Owner Author

Thanks @psychon, I appreciate all the feedback. Keep it coming. :) I've made some of the changes. Some comments / questions:

I miss a function like XGetDefault(). From looking at its source, this function:

The behavior is described here, too: https://tronche.com/gui/x/xlib/appendix/d/XGetDefault.html
I'm not against adding additional APIs, but that's backwards compatible, so we'd be good. I guess it doesn't hurt to take some "heavy lifting" away, but then we should also implement XrmSetDatabase and XrmGetDatabase. How merging works is pretty well-defined, actually. :)

I'll open a separate issue for this and check it off in this list for now.

Should the docs state which files handles #include-directives? (all db-loading functions do, so I'm not sure)

No, I don't think so. We say we handle spec-conform entries and the spec (https://tronche.com/gui/x/xlib/resource-manager/file-syntax.html) contains #include, so we handle it. I don't think repeating this on every method adds any more value than also stating that whitespace is ignored.

Why does xcb_xrm_database_combine destroy the source DB?

For one: because it's easier that way. :) However, the Xlib equivalent XrmCombineDatabase actually does the same. So when migrating from Xlib to xcb-util-xrm, you actually don't need to worry about adding additional frees here. I'm not 100% sure whether it's better to follow Xlib's API here or change it.

The API docs refer to "error codes", but never explains these. From looking at the implementation, I guess that all functions return 0 on success and -1 otherwise. Should this perhaps be converted to a boolean?

I left it to be integers on purpose to have freedom in the future to define error codes (which they currently aren't), so this is forwards thinking. As long as we don't define codes, no one can (or should) rely on them. If in the future we want to return specific codes and define them, we can change the actual return value in a compatible way. I don't know if it'll ever happen, but if it does, I don't want to break the API. Does that make sense?

And some comments to your notes gist:

  • char buffer[4096];
  • char buffer[4096]; /* Let's overflow that! */

I thought I had a comment on there. Anywho. Yes. This and the other place are… not ideal. We should fix this, I guess. I'll open a bug.

  •            // TODO XXX Filename globbing
    

Was this intentional?

  •    xcb_xrm_entry_free(entry);
    
  •    xcb_xrm_entry_free(entry); /\* XXX: This is a double-free, I think */
    

No, it isn't :) The function can in fact allocate the entry and return an error. Or at least used to, maybe it changed. Either way, the FREE macro will prevent double frees anyway (though of course that's not good programming to just ignore it).

  •                /\* This suggests that this should be split into two parsers,
    
  •                 \* one resource_only, one for the value
    
  •                 */
    

I thought about it, but ultimately, those things are both semantically and syntactically so closely related that they'd end up sharing most of the code anyway. Edit: I think I just now realized what you meant. Yeah, I guess that could be done.

             /* Never reached. */
  •            assert(0);
    

This would exit the application again due to a bug in the library, no?

I'll also incorporate some of the changes mentioned in that gist right now.

@Airblader
Copy link
Owner Author

@psychon One more thing, is it actually safe to change the string returned by strtok_r, in particular to put a \0 somewhere in there?

@psychon
Copy link
Contributor

psychon commented Mar 31, 2016

  • TODO XXX Filename globbaing

Was this intentional?

Yes, I don't see Xlib handling any kind of globbing. (Only checked the source code, didn't check all that carefully.

     xcb_xrm_entry_free(entry); /* XXX: This is a double-free, I think */

No, it isn't :) The function can in fact allocate the entry and return an error. Or at least used to, maybe it changed. Either way, the FREE macro will prevent double frees anyway (though of course that's not good programming to just ignore it).

Yes it is.

The first and second return in xcb_xrm_entry_parse() is before the entry argument was modified in any way. In this case you call xcb_xrm_entry_free with an uninitialized value (likely not NULL). Sounds like a good way for a crash.

All following returns are via goto done_error. In this case xcb_xrm_entry_free() is already called and *_entry is set to NULL.

Well, ok. It's not a double free. It's dead code that's not necessary and that can actually cause a crash if xcb_xrm_entry_parse returns due to an allocation failure. Still, this can be removed.

This suggests that this should be split into two parsers,

My thought here was to have some parser working similar to strtol. You have a function that parses one component(?) and somehow returns a pointer up to where it parsed things. If this points to :, another function is used for parsing the actual value. resource_only = true would be removed and instead there is a different function that only parses single components until "done".

I hope that the resulting code would be easier to understand. Right now I mostly just skipped the code in xcb_xrm_entry_parse, assuming that your unit tests would catch any problems.

This would exit the application again due to a bug in the library, no?

I actually don't have any problem with that. "allocation error" (and other runtime errors) is not a bug in the library and shouldn't exit anything. I can life better with "this should never happen"-kind of asserts. Ultimately, it's your call what to do here.

@psychon One more thing, is it actually safe to change the string returned by strtok_r, in particular to put a \0 somewhere in there?

That's a good question. Looking at the man page of strtok, it's pretty clear how this function should be implemented (you have a "current position" pointer and each call scans from there to the next delimiter, updating the current position). With this implementation, modifying the string is safe, because strtok_r doesn't have to look at it again (and I can't think of a reason why any implementation would behave otherwise). However, the man page does not say that it is safe to modify the string "mid-traversal".

@Airblader
Copy link
Owner Author

Yes, I don't see Xlib handling any kind of globbing. (Only checked the source code, didn't check all that carefully.

Maybe so, but I'd argue that this is a nice improvement over Xlib. Technically it leads to different behavior because files might be loaded that weren't loaded before, but… I mean… you know.

Yes it is.

OK, convinced. :) It used to be necessary then, because I remember adding it for a reason. Things have changed since, I guess.

I actually don't have any problem with that. "allocation error" (and other runtime errors) is not a bug in the library and shouldn't exit anything. I can life better with "this should never happen"-kind of asserts.

Yeah, I see your point. It's probably better to provoke seeing such a bug if we introduce it later on.

That's a good question. […]

I've changed it and at least the tests seem happy about it (including valgrind).

@stapelberg
Copy link
Contributor

Nice work overall!

Some thoughts:

  • Does it really make sense to provide xcb_xrm_convert_to_long, which, AFAICT, just wraps strtol? C programmers ought to be familiar with strtol anyway. I’m not entirely sure about xcb_xrm_resource_get_long yet, but I think xcb_xrm_convert_to_long shouldn’t be public.
  • In a similar vein, I think if we wanted to keep xcb_xrm_convert_to_bool (does Xlib provide such a function?), having it return false on errors isn’t the best choice, as you’re effectively masking 50% of possible values. I’d suggest making it return 0/-1 and accepting a *bool as output parameter.
  • You mentioned that valgrind is happy, but have you also used clang-analyze and AddressSanitizer?

@Airblader
Copy link
Owner Author

@stapelberg Thanks!

having it return false on errors isn’t the best choice, as you’re effectively masking 50% of possible values

Rewriting my answer this from scratch for the third time now :-) I think maybe the best way is to remove the converter functions again and change all *_resource_get_* functions to return int and using an output parameter.

That should cover all cases: I can ignore the return value and rely on the provided output argument if I don't want to provide my own fallback, but if I do care, I check the return value and ignore the output parameter.

Summarizing your thoughts it sounds like that's pretty much what you were saying, right?

You mentioned that valgrind is happy, but have you also used clang-analyze and AddressSanitizer?

No, so far I've only run valgrind --leak-check=full. It's a good point that we should do these as well. :)

@psychon
Copy link
Contributor

psychon commented Apr 1, 2016

Still no time to code / send PRs. However, I ran some code coverage analysis and besides error handling (which is mostly out-of-memory-handling and hard to test), xcb_xrm_database_from_file and the #include-directive are the biggest missing pieces (but xcb_xrm_database_from_resource_manager and xcb_xrm_database_from_default are untested as well (and are hard to test... use a shell script and Xvfb for these?)).

For testing file reading, adding AM_TESTS_ENVIRONMENT = HOME=$(srcdir)/tests/ XENVIRONMENT=$(srcdir)/tests/Xenv could be used... However, the only function using this variables also needs an X connection and is thus hard to test (*_from_default()).
Which makes me wonder...

  • Should *_new_from_file() handle relative paths as being relative to $HOME? If no, should #include be handled this way? I actually don't know what Xlib does.

My ToDo-Diff shrunk to https://gist.github.com/psychon/ce61abbab5039b8b99667b24f3fe93fa. Thanks for handling some of it already!

@Airblader
Copy link
Owner Author

Yeah, I haven't written tests for those yet because it's hard to do so. I do test them manually a lot, including valgrind runs, but that's not very satisfying. The hint with AM_TESTS_ENVIRONMENT is useful, I'll see what I can do with that, thanks!

Should *_new_from_file() handle relative paths as being relative to $HOME? If no, should #include be handled this way? I actually don't know what Xlib does.

This is what the specification says: "The file name is interpreted relative to the directory of the file in which the line occurs (for example, if the file name contains no directory or contains a relative directory specification)."

Thanks for handling some of it already!

No, thank you for all the feedback, support and reviewing. :) I've taken care of the j+i mistake and the assert now as well. The parser rewrite is something we could do, but it'd be completely internal, so I think there's no need to block a release with that.

Edit: Now also took care of the double free and provided a PR for not destroying the source database when merging.

@Airblader
Copy link
Owner Author

I have opened tickets for all remaining issues AFAICT, that should be easier to keep an overview. I've put the milestone on those that I think should be fixed for v1.0.

@Airblader
Copy link
Owner Author

Alright. Everything has been taken care of AFAICT. Time for v1.0!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants