Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve API to production quality #18

Closed
artob opened this issue Nov 30, 2015 · 16 comments
Closed

Improve API to production quality #18

artob opened this issue Nov 30, 2015 · 16 comments
Assignees
Milestone

Comments

@artob
Copy link
Contributor

artob commented Nov 30, 2015

After the already-merged enhancements in pull requests #16 and #17, these are the remaining immediate pain points I have encountered so far in attempting to use hdt-lib in a production application:

  1. The library does not throw standard exceptions, i.e., exceptions derived from the C++ standard library's std::exception base class. Instead, the library throws raw C strings directly (e.g., src/hdt/BasicHDT.cpp:106). This is not great practice, and these exceptions will go uncaught by many applications, or else will land in a top-level catch-all handler catch (...) at which point they can no longer be interpreted at all.
  2. The library in multiple locations eats up exceptions and prints them out to the standard output stream (e.g., src/hdt/BasicHDT.cpp:171), instead of letting the application handle them. Unilaterally printing out stuff to the application's standard output or standard error is a definite no-no for a production-quality library. Also, error conditions really ought to propagate up the stack to where they can be properly observed and logged.
  3. The library is overall missing careful const-qualification of methods and parameters, resulting in spurious temporaries and in the inability to hold const references to hdt-lib objects, complicating const-correct application code by necessitating const_cast<> casts that undermine type safety.
  4. Additionally, from a performance standpoint, the library currently overall performs quite a bit of spurious memory allocations (that will slow things down and fragment the heap), in particular by overuse of std:string copies where a const std::string& reference or a char* pointer would suffice.

I'd like to address each of these points in order to make the library suitable for use from production code, but wanted to express my intentions here first, as in particular the first item would necessitate changes to any existing application code that uses hdt-lib and assumes that it will throw raw C strings instead of standard exceptions.

Soliciting feedback from @mielvds and @RubenVerborgh in particular.

@mielvds
Copy link
Member

mielvds commented Nov 30, 2015

Again thanks for taking this forward! As I'm not sufficiently qualified in C++, I can't fully assess this. @MarioAriasGa @joachimvh any thoughts?

Dependent applications at the moment are at least the GUI application hdt-it (also in this repo) and the HDT-Node library. I believe we should design a versioning approach for the code releases, and the indexes they generate (#7), first.

@RubenVerborgh
Copy link
Member

I have encountered problems 1, 2, and 3 myself and I agree that a fix would indeed be good. The exception thing has bitten us several times. And the const thing troubled me in at least one location, where I was not allowed to read something from a const object. I think I indeed also saw a couple of instances of 4 as well.

So thanks in advance for looking at these :-)

artob added a commit to datagraph/hdt-cpp that referenced this issue Dec 1, 2015
@artob
Copy link
Contributor Author

artob commented Dec 1, 2015

Thanks @RubenVerborgh and @mielvds for your feedback.

I've begun tackling these issues, starting with const-correctness since that is somewhat lower hanging than the other actionables, in that while it may affect the library ABI, it does preserve source compatibility—meaning that existing applications only need recompile, at worst.

This work is happening on the const-correctness branch of our fork. I would expect to submit that branch as a pull request sometime next week once further along.

@RubenVerborgh
Copy link
Member

Thanks, @bendiken, much appreciated!

artob added a commit to datagraph/hdt-cpp that referenced this issue Dec 1, 2015
artob added a commit to datagraph/hdt-cpp that referenced this issue Dec 1, 2015
artob added a commit to datagraph/hdt-cpp that referenced this issue Dec 1, 2015
@mielvds
Copy link
Member

mielvds commented Dec 2, 2015

I created version 1.1.1 (https://github.com/rdfhdt/hdt-cpp/releases/tag/1.1.1) at the point where the code moved from google code to github (as named by @MarioAriasGa in d343fa8). The code on the website is 1.0.0. This gives us a point of reference for future releases and incompatibilities.

@RubenVerborgh
Copy link
Member

Thanks a lot, Miel! Will indeed be handy to reference. We can/should probably stay within 1.x as long as the outputted HDT files are compatible.

@MarioAriasGa
Copy link
Member

Thanks bendiken,

I agree with your improvements. I will keep an eye in case I can help.

I will make sure that the HDT-it! app also compiles with your changes.

Thanks again!

Mario.

@artob
Copy link
Contributor Author

artob commented Dec 2, 2015

@mielvds That's good thinking, and means we might perhaps consider these changes to constitute a 1.2.x API.

@MarioAriasGa Thank you for weighing in—I will hence proceed according to the outlined plan, and submit pull requests for each of these action items.

@mielvds
Copy link
Member

mielvds commented Dec 3, 2015

I would propose 1.1.2, since no breaking changes in the HDT file or index are introduced, right? See my proposal on versioning in #7

@artob
Copy link
Contributor Author

artob commented Jun 14, 2016

I've merged the initial const-correctness work in 42a5b07. This improves type safety as well as reduces memory allocations and spurious string copying. The semantic changes to the API are minor and unlikely to affect any users (such as HDT-it!).

@artob artob self-assigned this Jun 14, 2016
@webdata
Copy link
Contributor

webdata commented Jun 14, 2016

Hi, @bendiken I couldn't run the code with the new API because of the new exception in ControlInformation. I had to perform several changes to catch this exception throughout the code, already considered in the last commits 42caf47, 0253fe2

@artob
Copy link
Contributor Author

artob commented Jun 15, 2016

@webdata Right, that type of thing could be a consequence of const-correctness changes.

If the exception in case of nonexistent keys proves too onerous for client code (it certainly complicated the client code in 42caf47 and 0253fe2), we could make the method return a const reference to a static empty string in the nonexistent key case, which would preserve full source compatibility with existing code. It'd be easy enough to do.

What do you think? You did already make the client code changes, but Git also makes them easy to revert.

@webdata
Copy link
Contributor

webdata commented Jun 15, 2016

Yes, if you don't need to distinguish between an existing empty property and a non-existing property, that could be possible.

LaurensRietveld added a commit to Triply-Dev/hdt-cpp that referenced this issue Jul 6, 2016
…lauses, cerr/cout statements accordingly. Related to rdfhdt#18
@LaurensRietveld
Copy link
Member

I took a first stab at modifying exception handling. See this PR: #35

LaurensRietveld added a commit to Triply-Dev/hdt-cpp that referenced this issue Jul 6, 2016
…lauses, cerr/cout statements accordingly. Related to rdfhdt#18
@mielvds mielvds added this to the Late 2017 milestone Jan 12, 2017
@mielvds
Copy link
Member

mielvds commented Jan 12, 2017

Let's try to have these fixes by mid 2017

@RubenVerborgh
Copy link
Member

The majority of these issues seem to be addressed, so I will close this. Feel free to create new issues for subpoints where necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants