Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hashing is unnecessarily limited to null-terminated strings #9

Open
Xeverous opened this issue Jul 22, 2019 · 7 comments
Open

Hashing is unnecessarily limited to null-terminated strings #9

Xeverous opened this issue Jul 22, 2019 · 7 comments

Comments

@Xeverous
Copy link

TBID::Set(const char*) uses this code (while loop assumes null-termination):

uint32_t TBGetHash(const char *str)
{
if (!str || !*str)
return 0;
// FNV hash
uint32_t hash = 2166136261U;
int i = 0;
while (str[i])
{
char c = str[i++];
hash = (16777619U * hash) ^ c;
}
return hash;
}

This is an unnecessary limitation because for example, it can not be used with std::string_view.

@Xeverous
Copy link
Author

Also, the linked hash implementation (and likely code in other files) uses intXX_t types without appropriate includes:

  • to use uint32_t, <stdint.h> (deprecated header) needs to be included
  • to use std::uint32_t, <cstdint> needs to be included

On a conforming compiler, the above file may not compile due to missing definitions.

@tesch1
Copy link
Owner

tesch1 commented Jul 22, 2019

Agreed. How about a change of interface adding an optional parameter int n = 0 where 0 indicates to terminate on null and non-zero indicates a length?

Are you interested in making the whole library compatible with a newer c++ standard? For now I was hoping to keep c++11 compatibility, but if someone wants to do the work, could maybe add an option for c++17.

There are probably a lot of things in TB that will annoy you if you hope to get all the benefits of string_view, it was written for pre-c++11.

@Xeverous
Copy link
Author

How about a change of interface adding an optional parameter int n = 0 where 0 indicates to terminate on null and non-zero indicates a length?

I thought about writing a separate overload (int n = std::strlen(str) is invalid because parameters are not allowed to use for default parameters).

int n = 0 is a bad idea because non-null terminated strings can be of length 0. And there should be no burden on the user to check if it's empty - empty ranges must be supported and are by whole STL.

From cppreference:

std::string_view::data()

Returns a pointer to the underlying character array. The pointer is such that the range [data(); data() + size()) is valid and the values in it correspond to the values of the view.

If size() is 0, the value of the pointer can not be dereferenced, whatever it is (end iterators can not be dereferenced).

So ... there seems to be no other way than providing 2 overloads.

Are you interested in making the whole library compatible with a newer c++ standard? For now I was hoping to keep c++11 compatibility, but if someone wants to do the work, could maybe add an option for c++17.

There are no problems with supporting C++17 if you already support C++11, unless you use stuff which was deprecated or removed in later standatds (eg std::auto_ptr or non-variadic function wrappers).

I don't see large problems in technical compatibility with C++11, more of a style problems:

  • "No dependency on stl, exceptions, RTTI" - this is not a problem, but might be a disadvantage in some scenarios (binary size, support for non-null-terminated strings, support for iterators and (empty) ranges)
  • the library uses CamelCase-style names, which are in conflict with http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#nl10-prefer-underscore_style-names (for which guidelines state somewhere else ???Must we suffer CaMelcAse???) (also, TBID is not a macro which guidelines would scream about)
    • this is obviously unfixable, unless you want to alias every possible name in the library (Qt did it partially because begin, end, const_iterator will simply not work with STL if written in a different style)
    • like the above; STL forces correct naming in some contexts - void TBProgressSpinner::Begin() will not break it but such name is at least misleading when considering that begin and end members are used to implement support for C++11 ranged loops
  • many functions from widget classes in the public interface of the library take pointers
    • if the pointer is owning, this is against modern C++ (no RAII)
    • if the pointer is expected to never be null, this is against modern C++ (reference should be used instead)
  • some things (eg C-enums and their type safety issues) can not be solved without breaking code of users

Solving all of these (and likely other problems, of which I am not aware) would requite to refactor most of the code (and perhaps some breaking changes). So to answer the question "Are you interested in making the whole library compatible with a newer c++ standard?" - I would say yes, but sadly I don't see any other way than writing a completely new library. There aren't many problems to make this library work with C++11, but there are some really big problems to make this library look like a modern C++ library.

@tesch1 tesch1 changed the title Hasing is unnecessarily limited to null-terminated strings Hashing is unnecessarily limited to null-terminated strings Jul 22, 2019
@tesch1
Copy link
Owner

tesch1 commented Jul 22, 2019

Why would you be dereferencing if the size was zero? I'm missing something; what would be the signatures of your two overloads?

Yes, well... let us know when you've finished writing a nice modern isocpp-compliant multi-platform open-source c++17 GUI library from scratch! :DD

It will be immensely popular, I will guarantee you that - riches and fame can be yours. TB/HB is not ideal, I agree, but it works ok, and it works roughly the same across iOS, Android, Linux, MacOS, Windows, and WebGL/Webassembly. And the benefit for developing coping skills is a solution that actually works today.

There are a lot of incremental improvements that can be made here, and I'm more than happy to accept well-formed backward-compatibility-maximizing test-able PRs to make those. It's a common dilemma - do I write everything from scratch while sitting at A, or do I incrementally improve the junker that already gets me from A to B? A bird in the hand, or two in the bush?

ps. you might want to read through the "Note" on your NL.10 link :o)

@Xeverous
Copy link
Author

Why would you be dereferencing if the size was zero? I'm missing something; what would be the signatures of your two overloads?

  • (const char* str) - null-terminated string
  • (const char* str, int n) - iterator + length

Proposed overload (const char* str, int n = 0) (where 0 means null-terminated string) would not work because someone can call this with 0-length string view and it would be errnoneously treated as null-terminated string (leading to dereference when size is 0).


Regarding changes, I agree that consistency is better than never-finished-potentially-ideal. There is nothing better you can do than keeping TB/TH and offering non-breaking improvements.

I was just searching for a GUI library with specific needs and someone linked me TB. Obviously I won't find an ideal library but I think it would be better for my project to make PRs (with implementations of missing features) to libraries which appeal to me (modern-C++ style) than struggling with wrappers around C or old-C++-style libraries.

I have found though that your library has 2 features which many libraries do not have:

  • text box being able to render text in arbitrary colors in arbitrary places
  • filterable combobox

Does TB support OS-related features like clipboard, open file/directory dialog or docking?

@tesch1
Copy link
Owner

tesch1 commented Jul 23, 2019

Oh, right, I didn't really think that through, though n = -1 would work, if you accept limiting strings to 2^31, which is more than reasonable given how TBIDs are used (I'm not a huge fan of these either, but again... it works).

I wouldn't call it "my" library, more like my branch... There is clipboard support. I've been using tinyfiledialogs for native dialogs, and simple home-cooked stuff under emscripten, works pretty well across OSs. No docking. Have you been able to build the demo app?

I'll be curious to know what you finally end up using, I went through the same search about five years ago and ended up here, but would be really curious to know if there's something better in the meantime.

If I were starting from scratch otoh... I'd use SVG (a reasonable subset) as a middle layer for rendering that gives you tons of platforms and tooling for the graphics (including web - natively) and people love skinning. Then build an event & message-passing system under it (conveniently some basics already defined by SVG), and then an android/iOS-like layout constraint-definition layer above it, and then a widget library on top of it all. Wouldn't be much bulkier than TB, (maybe even lighter) but would be far more portable and hackable and accessible(/comfortable for devs coming in from the web-world). Apps would end up similar to electron, but without the html layout & js baggage. (It sounds too perfect, right?)

@Xeverous
Copy link
Author

Have you been able to build the demo app?

Haven't tried. But regarding building, I checked your CMakeLists. It requires version 3.4+, but uses very 2.0-like writing (variables and global changes of internal variables, not targets and their properties). I recommend to read https://cliutils.gitlab.io/modern-cmake/. CMake 3.0 vs 2.0 is just like moder C++ vs old C++.

I'll be curious to know what you finally end up using, I went through the same search about five years ago and ended up here, but would be really curious to know if there's something better in the meantime.

I have seen many libraries with different approaches:

  • declarative
  • imperative
  • config-based (eg loading layout from XML / HTML)
  • immediate-mode

I'm very fine with first 2, little worried about 3rd (string-related errors and performance) and never tried the last one. I have very little knowledge on the backend stuff and how widgets are drawn, so I'm fine as long as it works and has some customizations.

Regarding my search, what has interested me:

  • nana - modern interface but very immature (could contribute though). Lacks a few widgets but they seem "PR-able".
  • Boost.UI (not an official part of boost) - the boost-like API is amazing but the implementation is basically a wxWidgets wrapper. For good, it's very mature. For bad, wxWidgets means it will never get skinning or advanced widgets. I would absolutely use this if I wanted a simple GUI, but unfortunately I need more complex widgets and some theming. Also there is no CMake support.
  • Elements - made by Joel de Guzman, the author of 3 boost libraries (phoenix, fusion, spirit). This is new, but I doubt that something from such author would go wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants