Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change API to use string+len pairs instead of null terminated strings #81

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

Hejsil
Copy link
Contributor

@Hejsil Hejsil commented Mar 11, 2020

Implements this issue.

This pull request doesn't actually implement this. It only contains changes to the header files. This is my proposed new API for strings in the Nuklear library. I would love to hear some feedback on this.

Notes

  • Adding new struct called nk_slice (nk_str and nk_text was taken). This is the string+len pair.
    • Existing code can make an nk_slice from a const char * by calling nk_slice("str").
  • The format string versions of nk_label cannot really be changed without throwing away the NK_PRINTF_VARARG_FUNC attribute. I've therefore chosen to leave these functions as is.
  • There is no longer both a nk_*_label and nk_*_text variant of varies functions. nk_*_label is the naming convention that was kept, but these functions now take an nk_slice.

Benifits

  • This pattern allows for more optimal code. This allows for any buffer to be passed to these functions, as all we require is a pointer and a length.
    • This allows users to avoid allocations in certain cases.
  • Other languages have chosen not to have null terminated string by default, and these languages can now have more efficient bindings.
  • We remove a decent amount of API surface area as we already had a decent amount of function which take a string + len.

Cons

  • This is a major breaking change.
  • Using Nuklear from C is a little more verbose.

src/nuklear_internal.h Outdated Show resolved Hide resolved
@wheybags
Copy link
Contributor

Here's an idea for a potential ergonomic improvement for c++ users (which I imagine is most people?)
When compiling as c++, add a constructor from std::string_view on the nk_slice struct, so then the change is pretty much transparent.

Eg:

struct nk_slice 
{
    const char *ptr;
    nk_size len;
#ifdef __cplusplus
    nk_slice(std::string_view view) : ptr(view.data()), len(view.length) {}
#endif
};

If you're not comfortable using c++17, it could be an std::string& either.

@dumblob
Copy link
Member

dumblob commented Mar 12, 2020

@wheybags if the proposal is so much dependent on C++ version, shouldn't we rather document it instead of providing it by default in the code? Or maybe implement just a tiny subset which really overlaps between all C++ versions?

@wheybags
Copy link
Contributor

That is why I suggested using an std::string reference instead, that would work on all versions.
Alternatively, you could even detect versions:

struct nk_slice 
{
    const char *ptr;
    nk_size len;
#ifdef __cplusplus
#  if __cplusplus >= 201703L // if we have c++17, use string view
    nk_slice(std::string_view view) : ptr(view.data()), len(view.length()) {}
#  else // otherwise fall back to a string ref
    nk_slice(std::string& str) : ptr(str.c_str()), len(str.length()) {}
#  endif
#endif
};

@dumblob
Copy link
Member

dumblob commented Mar 12, 2020

@wheybags hm, I've heard std:: is not recommended e.g. for game development due to its performance impact - do you think it's worth introducing such dependency? Or would we need another #define to disable this dependency?

@wheybags
Copy link
Contributor

I work on a commercial game, and we use the STL heavily.
That said, a lot of studios prefer not to. Even among those which don't, I would be surprised to find them not using std::string at all. If someone doesn't want to use it, they can just not use it, by explicitly initialising the nk_slice using some alternative method. Even if they aren't using the STL, they will still have the headers available, so it should compile just fine.

@Hejsil
Copy link
Contributor Author

Hejsil commented Mar 12, 2020

Here is my personal take. Nuklear is a C library, not a C++ library. I does the bare minimum to ensure that the library can compile under C++ but nothing more and I think it should stay this way.

std is just a library like any others. What makes std::string_view special from llvm::StringRef, absl::string_view, boost::string_view and so on. Currently Nuklear has no dependencies (even libc can be avoided as far as I know).

There are also a none trivial problem introduced by adding a constructor. An initializer list can no longer be used to initialize this struct:

struct nk_slice {
    const char *ptr;
    int len;

    nk_slice(const char *ptr) : ptr(ptr), len(0) {}
};

int main() {
    struct nk_slice s = { "t", 1 };
}
test.cpp:10:34: error: could not convert ‘{"t", 1}’ from ‘<brace-enclosed initializer list>’ to ‘nk_slice’
   10 |     struct nk_slice s = { "t", 1 };
      |                                  ^
      |                                  |
      |                                  <brace-enclosed initializer list>

So now, our valid C code does not compile under a C++ compiler.

@wheybags
Copy link
Contributor

wheybags commented Mar 12, 2020

Well, maybe things like this should live in a separate header. In general, it would be nice to see more ergonomic c++ usage. A supplementary nuklear.hpp could do that.

@Hejsil
Copy link
Contributor Author

Hejsil commented Mar 12, 2020

Also, c++ have it easy compared to other languages when it comes to simple usage. 😉

#include <cstddef>

struct nk_slice {
    const char *ptr;
    size_t len;
};

struct nk_slice operator "" _nk(const char *ptr, size_t len) {
    return {ptr, len};
}

int main() {
    struct nk_slice s = "test"_nk;
}

@Hejsil
Copy link
Contributor Author

Hejsil commented Mar 12, 2020

In general, it would be nice to see more ergonomic c++ usage. A supplementary nuklear.hpp could do that.

Well, this sounds like bindings to me, which people already maintain in various languages

@Hejsil
Copy link
Contributor Author

Hejsil commented Mar 12, 2020

Anyways, I don't think this is to relevant for this specific PR and is something that could be added later anyways. This change would already make the library a little more ergonomic for c++ usage.

@rswinkle
Copy link
Contributor

rswinkle commented Mar 28, 2020

I'm not sure I'm understanding your last two Notes bullet points. In the changes I didn't see any nk_*_label functions changed/removed?

Also it looks like all the functions that changed were internal functions, not something I'd call directly from my application GUI code. I hope that's true because I think it'd be sad if C programmers using a C library couldn't do what comes easily and naturally in C, ie pass string literals directly as const char*. Having to do something like this:

struct nk_slice s = { "My Label",  8 };
nk_label(ctx, s, NK_TEXT_LEFT);

would be bad enough. That would be the most efficient way but having to manually count would be annoying and introduce many off-by-1 errors (probably quickly obvious and fixable but still). Compound literals (if you were willing to use them despite the C/C++ issues) would cut a line but leave the counting problem.

On the other hand, encouraging C programmers to have dozens or hundreds of little heap allocations for constant (and even update per frame type) strings hurts my soul and is longer.

sds s = sdsnew("My Label");
struct nk_slice label = { s, sdslen(s) }
nk_label(ctx, label, NK_TEXT_LEFT);

So, if at all possible, I say keep the interface that let us pass string literals and let the language bindings wrap inner functions that take an nk_slice or equivalent. That's just my 2 cents for what it's worth.

@Hejsil
Copy link
Contributor Author

Hejsil commented Mar 28, 2020

I'm not sure I'm understanding your last two Notes bullet points. In the changes I didn't see any nk_*_label functions changed/removed?

Also it looks like all the functions that changed were internal functions, not something I'd call directly from my application GUI code.

This PR removes the nk_*_text variants and makes the nk_*_label variants of functions take the nk_slice. These changes are not just internal, but changes to the API, as you can see from this diff:

-NK_API void nk_text(struct nk_context*, const char*, int, nk_flags);
-NK_API void nk_text_colored(struct nk_context*, const char*, int, nk_flags, struct nk_color);
-NK_API void nk_text_wrap(struct nk_context*, const char*, int);
-NK_API void nk_text_wrap_colored(struct nk_context*, const char*, int, struct nk_color);
-NK_API void nk_label(struct nk_context*, const char*, nk_flags align);
-NK_API void nk_label_colored(struct nk_context*, const char*, nk_flags align, struct nk_color);
-NK_API void nk_label_wrap(struct nk_context*, const char*);
-NK_API void nk_label_colored_wrap(struct nk_context*, const char*, struct nk_color);
+NK_API void nk_label(struct nk_context*, struct nk_slice, nk_flags align);
+NK_API void nk_label_colored(struct nk_context*, struct nk_slice, nk_flags align, struct nk_color);
+NK_API void nk_label_wrap(struct nk_context*, struct nk_slice);
+NK_API void nk_label_colored_wrap(struct nk_context*, struct nk_slice, struct nk_color);

I hope that's true because I think it'd be sad if C programmers using a C library couldn't do what comes easily and naturally in C, ie pass string literals directly as const char*. Having to do something like this:

struct nk_slice s = { "My Label",  8 };
nk_label(ctx, s, NK_TEXT_LEFT);

would be bad enough. That would be the most efficient way but having to manually count would be annoying and introduce many off-by-1 errors (probably quickly obvious and fixable but still). Compound literals (if you were willing to use them despite the C/C++ issues) would cut a line but leave the counting problem.

On the other hand, encouraging C programmers to have dozens or hundreds of little heap allocations for constant (and even update per frame type) strings hurts my soul and is longer.

sds s = sdsnew("My Label");
struct nk_slice label = { s, sdslen(s) }
nk_label(ctx, label, NK_TEXT_LEFT);

So, if at all possible, I say keep the interface that let us pass string literals and let the language bindings wrap inner functions that take an nk_slice or equivalent. That's just my 2 cents for what it's worth.

Alright, this hits a lot of points. If you are a user of the API, most of the times your code will be updated like this:

-nk_label(ctx, "test", NK_TEXT_LEFT);
+nk_label(ctx, nk_slice("test"), NK_TEXT_LEFT);

This is not really that bad for a C programmer. It looks pretty clean. Now, you mention a point about efficiency:

struct nk_slice s = { "My Label",  8 };
nk_label(ctx, s, NK_TEXT_LEFT);

That would be the most efficient way but having to manually count would be annoying and introduce many off-by-1 errors (probably quickly obvious and fixable but still). Compound literals (if you were willing to use them despite the C/C++ issues) would cut a line but leave the counting problem.

Yes, this would be the most efficient way. However let me just point out that the old API called strlen internally. This means, that nuklear already did the work of calling strlen, so having to call nk_slice is not any less efficient than the old API. The key here is, that with the new API the user can be more efficient. As you yourself show, you can now avoid the strlen entirely if you know the length ahead of time.

Now, I don't recommend people counting the length of the string and manually inserting it in their code. The best thing you can do in C is to have a macro for this (there are some footguns with this approach, but it is efficient).

#define NK_SLICE(str) (struct nk_slice){str, sizeof(str)-1}
nk_label(ctx, NK_SLICE(s), NK_TEXT_LEFT);

On the other hand, encouraging C programmers to have dozens or hundreds of little heap allocations for constant (and even update per frame type) strings hurts my soul and is longer.

sds s = sdsnew("My Label");
struct nk_slice label = { s, sdslen(s) }
nk_label(ctx, label, NK_TEXT_LEFT);

And I agree. We should avoid allocations. But with the new API you can more easily do this. Imagine something like this:

char *user_input = ...;

// We can now slice the string, without writing '\0' to it or duplicating it.
struct nk_slice sliced = { user_input + 3, 3 };
nk_label(ctx, sliced, NK_TEXT_LEFT);

There was no way to do this for some functions with the old API (yes, nk_label had a nk_text variant, but not all functions had this).

So, if at all possible, I say keep the interface that let us pass string literals and let the language bindings wrap inner functions that take an nk_slice or equivalent. That's just my 2 cents for what it's worth.

This change is is trying to streamline the API, making it more efficient and reduce the surface area. I can flip this question around. Imagine Nuklear was written like this in the first place, and someone wanted to add wrappers to Nuklear that allowed users to pass null terminated strings. I would then ask, should we also have wrappers for wchar_t, u32 (utf32 chars) or maybe newline terminated strings? Yes, C programmers have used null terminated strings for ages, but they have a lot of problems. Wouldn't it be better to expose a more efficient, more powerful and less error prone API and let the user wrap this API to their needs?

@rswinkle
Copy link
Contributor

This PR removes the nk_*_text variants and makes the nk_*_label variants of functions take the nk_slice. These changes are not just internal, but changes to the API, as you can see from this diff:

-NK_API void nk_text(struct nk_context*, const char*, int, nk_flags);
-NK_API void nk_text_colored(struct nk_context*, const char*, int, nk_flags, struct nk_color);
-NK_API void nk_text_wrap(struct nk_context*, const char*, int);
-NK_API void nk_text_wrap_colored(struct nk_context*, const char*, int, struct nk_color);
-NK_API void nk_label(struct nk_context*, const char*, nk_flags align);
-NK_API void nk_label_colored(struct nk_context*, const char*, nk_flags align, struct nk_color);
-NK_API void nk_label_wrap(struct nk_context*, const char*);
-NK_API void nk_label_colored_wrap(struct nk_context*, const char*, struct nk_color);
+NK_API void nk_label(struct nk_context*, struct nk_slice, nk_flags align);
+NK_API void nk_label_colored(struct nk_context*, struct nk_slice, nk_flags align, struct nk_color);
+NK_API void nk_label_wrap(struct nk_context*, struct nk_slice);
+NK_API void nk_label_colored_wrap(struct nk_context*, struct nk_slice, struct nk_color);

Ah, I didn't expand the diff on nuklear.h because I was thinking it was the generated nuklear.h so the changes would be the same as the other files. Forgot that there was a src/nuklear.h as well. My bad.

Alright, this hits a lot of points. If you are a user of the API, most of the times your code will be updated like this:

-nk_label(ctx, "test", NK_TEXT_LEFT);
+nk_label(ctx, nk_slice("test"), NK_TEXT_LEFT);

This is not really that bad for a C programmer. It looks pretty clean. Now, you mention a point about efficiency:
Ok that's not too bad; should have looked at that diff and I'd have realized what you were going for. While it works here because nuklear doesn't typedef its structs (one of the few things I disagree with Linus Torvalds about, that and 8 space tabs yikes), in general it's weird to see what looks like a constructor in C.

Yes, this would be the most efficient way. However let me just point out that the old API called strlen internally. This means, that nuklear already did the work of calling strlen, so having to call nk_slice is not any less efficient than the old API. The key here is, that with the new API the user can be more efficient. As you yourself show, you can now avoid the strlen entirely if you know the length ahead of time.

Yeah, I was aware of that. Again this all stems from my not seeing the majority of the changes. Though I don't consider strlen a significant overhead in any case, especially for strings that are usually dozens or at most 100's of bytes. I was comparing to the performance of the added heap operations suggested below which are much slower.

Now, I don't recommend people counting the length of the string and manually inserting it in their code. The best thing you can do in C is to have a macro for this (there are some footguns with this approach, but it is efficient).

#define NK_SLICE(str) (struct nk_slice){str, sizeof(str)-1}
nk_label(ctx, NK_SLICE(s), NK_TEXT_LEFT);

This is probably the best alternative or at least tied with calling nk_slice(), but as you said, there are cases where str isn't an array so sizeof won't work and we have the compound literal I discussed in another paragraph.

And I agree. We should avoid allocations. But with the new API you can more easily do this. Imagine something like this:

char *user_input = ...;

// We can now slice the string, without writing '\0' to it or duplicating it.
struct nk_slice sliced = { user_input + 3, 3 };
nk_label(ctx, sliced, NK_TEXT_LEFT);

There was no way to do this for some functions with the old API (yes, nk_label had a nk_text variant, but not all functions had this).

I feel like this is a rare use case but I can appreciate shrinking the API while gaining functionality

This change is is trying to streamline the API, making it more efficient and reduce the surface area. I can flip this question around. Imagine Nuklear was written like this in the first place, and someone wanted to add wrappers to Nuklear that allowed users to pass null terminated strings. I would then ask, should we also have wrappers for wchar_t, u32 (utf32 chars) or maybe newline terminated strings? Yes, C programmers have used null terminated strings for ages, but they have a lot of problems. Wouldn't it be better to expose a more efficient, more powerful and less error prone API and let the user wrap this API to their needs?

I get your point, though isn't it UTF-8 by default already and it works as long as the nuklear backend (and font) you're using supports it? I feel that's pretty universal, probably the most popular, and when people are using other rarer character encodings, they're usually having to convert one way or another between interfaces/libraries.

Lastly, I will admit that while I'm kind of unique in my programming experience and preferences, I've never really had problems with null-terminated strings. Like everyone else, I've written my own wrapper/dynamic string type when I needed more power, but in and of themselves, I don't see plain old C-strings as particularly error prone.

@dumblob
Copy link
Member

dumblob commented Mar 30, 2020

Didn't have time to take a look at this rather bigger PR, but I want to say thanks for this very good discussion. Feel free to discuss it further as this will be quite a big thing for Nuklear in general.

Judging based on the discussion so far, I really like the implementation approach.

Uneducated suggestion: some of the points raised by @rswinkle could be explained in comments in a more direct way (basically formulated as "answers" to @rswinkle's questions) instead of just in this discussion.

@Hejsil if you feel we should merge the PR any time soon (days/weeks), please raise your hand as I have currently a lot of other high-priority work to do and would need to reschedule things. And of course, feel free to "recruit" new reviewers to speed things up 😉.

@Hejsil
Copy link
Contributor Author

Hejsil commented Apr 1, 2020

Uneducated suggestion: some of the points raised by @rswinkle could be explained in comments in a more direct way (basically formulated as "answers" to @rswinkle's questions) instead of just in this discussion.

So having comments explaining why we have nk_slice, what it is used for and how to use it? We could document the nk_slice struct itself. This is probably the first place people would look when seeing a signature containing it:

/* `nk_slice` is the data structure Nuklear uses to pass around strings (in favor of `const char*`).
 * This struct carries around both the pointer to bytes and the number of bytes pointed to. Using
 * `nk_slice` over a null terminated `const char*` allows Nuklear to avoid calling `strlen`, makes
 * the Nuklear API safer, and allows bindings from other languages to be more efficient when
 * these languages don't use null terminated strings. `nk_slice` also allows the user to slice up
 * a string into multiple parts without allocation, allowing C users to be more efficient as well.
 * To get an `nk_slice` from a null terminated string, use the `nk_slicez` function.
 */
struct nk_slice { const char *ptr; nk_size len; };

if you feel we should merge the PR any time soon (days/weeks), please raise your hand as I have currently a lot of other high-priority work to do and would need to reschedule things. And of course, feel free to "recruit" new reviewers to speed things up wink.

This will probably take a while (weeks maybe idk), a there is a lot that needs to be changed. It's mostly just a straight forward refactor, so it's just about taking time to getting it done.

@Hejsil
Copy link
Contributor Author

Hejsil commented Apr 12, 2020

Just put in the last work required to get the entire library compiling. Haven't tested anything yet, but I feel pretty confident about this PR now that I have the entire picture of how the library looks now.

Next step is to get all demoes and examples compiling!

@Hejsil Hejsil force-pushed the null-termination-be-gone branch 2 times, most recently from e0d96bf to 12d3a57 Compare April 13, 2020 15:24
@Hejsil
Copy link
Contributor Author

Hejsil commented Apr 27, 2020

I now have all examples and demos that I can compile on my linux machine working. The primary code in nuklear.h is fully ready for review if someone feel brave enough to jump in. I do plan on looking over all my changes one more time.

TODO:

  • Get non linux backends working (anyone know how I can easily try to compile the glad backend)
  • Run and test all demos
  • Finish demo/node_editor.c,style.c,calculator.c

@Hejsil
Copy link
Contributor Author

Hejsil commented Apr 27, 2020

Once I feel I'm done, I'll rebase on master, bump the major version number and squash most commits together

@dumblob
Copy link
Member

dumblob commented Apr 27, 2020

Get non linux backends working (anyone know how I can easily try to compile the glad backend)

If it's just about compilation (not "visual" quality&functionality), then GitHub CI (or other CI of your choice - GitHub CI is though extremely fast - we're talking seconds instead of minutes as with e.g. Travis CI) might help. I didn't have time for that, but if you feel it's a good thing, then we might adopt something similar likehttps://github.com/vlang/v/blob/master/.github/workflows/ci.yml .

@Hejsil
Copy link
Contributor Author

Hejsil commented Apr 27, 2020

@dumblob Aah, ofc. Idk why I didn't think of using CI for this. I've had good experience with Github CI, so I'll try that out tomorrow.

@Hejsil Hejsil changed the title WIP: Change API to use string+len pairs instead of null terminated strings Change API to use string+len pairs instead of null terminated strings May 13, 2020
@Hejsil
Copy link
Contributor Author

Hejsil commented May 13, 2020

Done! All demoes have been compiled and tested. Commits have been squashed. Branch has been rebased on master. I'll take one more look over the changes to make sure I did everything right. After that I'll bump the version and this should be ready for merge.

@Hejsil Hejsil force-pushed the null-termination-be-gone branch 2 times, most recently from 0e4b77b to ae431f9 Compare May 18, 2020 18:19
@Hejsil
Copy link
Contributor Author

Hejsil commented May 18, 2020

Done!

@lundril
Copy link
Contributor

lundril commented May 25, 2020

Just my 2 cents: This really is a big API change.
I still like it: There are so many potential problems with the NUL terminated C strings, that using something else really is nice.

@Hejsil Hejsil force-pushed the null-termination-be-gone branch 4 times, most recently from 51df201 to adbe553 Compare June 24, 2021 18:20
@DBJDBJ
Copy link

DBJDBJ commented Sep 19, 2021

something for a char * be gone branch. Perhaps a bit dramatic, but we do use this for several years now.

#define STRNG(S_) struct { char data[S_]; }
#define STRNG_SZE(S_) (sizeof S_.data / sizeof S_.data[0])

int main(void) {
  STRNG(0xF) s1 = {"ABC"} ;
  assert( 0xF == STRNG_SZE(s1));
}

What is also dramatic is programs resilience increase.

Too late here ... 😉

@RobLoach
Copy link
Contributor

I'm not sure about a flip to slices rather than null-terminating strings. The standard in the C-land seems to be null-terminating strings, and it does make the usage a lot more verbose. Alternatively, could we wrap the implementation so that users don't have to worry about using slices everywhere, and it's just an internal thing?

@Hejsil
Copy link
Contributor Author

Hejsil commented Dec 22, 2021

I'm not sure about a flip to slices rather than null-terminating strings. The standard in the C-land seems to be null-terminating strings, and it does make the usage a lot more verbose. Alternatively, could we wrap the implementation so that users don't have to worry about using slices everywhere, and it's just an internal thing?

This was discussed (2 year ago lol). We do wonna expose this new api to the user though, so this cannot just be an internal thing. In theory, we could have matching functions that takes null terminated strings as well, but this would basically double the API surface

@RobLoach
Copy link
Contributor

Hmm, would that end up having slice-enabled functions use _slice() suffixes or something? It does seem like a lot of overhead and cognoscente load for the user to consider.

@Hejsil
Copy link
Contributor Author

Hejsil commented Dec 22, 2021

There are a few functions that we cannot just wrap to provide the same old api (nk_combo_callback comes to minds). For the rest, I guess we could have the old API as wrappers over this new one (geeh lots of work 😅)

@dumblob Thoughts.

@dumblob
Copy link
Member

dumblob commented Dec 22, 2021

My goal here is to make Nuklear safer and easier to use from other-than-C ecosystems.

There are many situations in Nuklear (disregarding strings) where there is a need for (many) different functions with similar functionality (be it due to accepted types - e.g. color picker recently; or due to simply slightly different behavior; or in less cases but still just to offer lower amount of arguments by making some of them a default). And now this fundamental thing with strings.

I don't think there is a good non-bloat solution in C for any of these. The only thing C offers is #ifdefing (conditional compilation).

So I'm actually not that much afraid of this PR. Even if we were "forced" to provide wrappers for nearly every single function, I'd agree with this PR as its merits seem to outweight this seeming cognitive burden.

Note that in case we would provide wrappers for functions accepting null-terminated strings for all such functions, the doc can reflect this visually and can hide by default the irrelevant kind of API if you're interested only in using Nuklear from C. And vice versa - hide null-string API if you're interested only in size-string API.

@RobLoach does that make more sense or do you still think it's not a good idea?

@RobLoach
Copy link
Contributor

RobLoach commented Dec 23, 2021

I understand having a method to handle C strings. Having a safe and performant implementation is important. I post this to give my honest feedback, and not to chew out anyone personally or insult any ideas. This is coming from a space of wanting the best for the library.

Concerns

This is a list of the concerns I have around this approach.

1. Slice Adoption and Integrations

The vast majority of C libraries out there already use const char*, so implementing our own system on top of this will have big impact on libraries and bindings using Nuklear. They'll need to wrap their own nk_slice implementations, both to nk_slice, and from nk_slice.

glfw, SDL, heck even OpenGL, don't take this approach. Majority of C libraries out there use null-terminating const char*. Going against the grain will hinder Nuklear adoption.

2. Cognitive Load

The programming experience has some additional cognitive load, as mentioned above.

nk_label(ctx, "Hello World!", NK_TEXT_LEFT);
nk_label(ctx, nk_slicez("Hello World!"), NK_TEXT_LEFT);

The user has to remember to pass in a slice rather than just passing in a string.

3. Code Bloat

The code does result in a bit of bloat with all the nk_slicesz() calls. You can see this in the property calls in the demo.

bg.r = nk_propertyf(ctx, nk_slicez("#R:"), 0, bg.r, 1.0f, 0.01f,0.005f);
bg.g = nk_propertyf(ctx, nk_slicez("#G:"), 0, bg.g, 1.0f, 0.01f,0.005f);
bg.b = nk_propertyf(ctx, nk_slicez("#B:"), 0, bg.b, 1.0f, 0.01f,0.005f);
bg.a = nk_propertyf(ctx, nk_slicez("#A:"), 0, bg.a, 1.0f, 0.01f,0.005f);

Even when retrieving a string from a Nuklear function, you need to remember to dereference it from its slice.

4. Performance?

If performance and efficiency is one of the benefits out of this, have there been some analysis into the strlen() call saves out of this? I'd wager that the performance gain is extremely minor. When I consider that in comparison to the impact it would have on Nuklear's adoption as a whole, I'd recommend against it.

5. Upgrade Process

The upgrade process from Nuklear 4.* to 5.* will be such a pain to go through, as you've likely experienced already when making the demo updates. Wrapping all the strings in nk_slicez() will result in people opting out of upgrading at all.

Overall

If we're considering something like this, I think we should really look into what problem we're solving. null-terminated const char* is C's standard, everywhere. Hejsll mentioned safety with Zig, dumblob mentioned sds strings, while I understand those benefits, I think it would be worth seeing where we actually run into problems with our string implementation, and consider the root of the problem, rather than implementing a large change such as this.

Lots of people are use to const char*. I don't feel like we have to babysit people who don't understand how to use them effectively. If we do take the approach, however, I think we could wrap the API to avoid making so many backward incompatible changes.

@dumblob
Copy link
Member

dumblob commented Dec 23, 2021

Thanks for the wrap up. Those concerns are real and I agree with them.

  1. The problem (the definition of you're rightly asking for) I'd like to solve is to finally have first class support for modern languages (Go, V, Zig, Rust, D, JS, ... pretty much all modern languages). I want to put stop to both current issues Nuklear has in these languages:
    1. either the nk_slicez() problem you thoroughly describe for C (for auto-generated 1:1 bindings) - using Nuklear feels alien
    2. or through an inefficient wrapper - which has 3 disadvantages:
      1. is not 1:1 to Nuklear (i.e. cognitive load which is non-transferable to other languages nor the original Nuklear on top!)
      2. has to be done manually over and over for each language and this can't be fully automated
      3. is not efficient on certain places (this gets important for real-world bigger apps because these work more with text than with "buttons" and the text tends to get much longer)
  2. There is an easy solution covering all the problems you outlined to an acceptable degree - to wrap all these functions and provide both APIs alongside of each other. And that's totally fine IMHO. I'd go for this option.

What do you think?

@Hejsil
Copy link
Contributor Author

Hejsil commented Dec 23, 2021

Seems like having both API's where possible is the way to go then. I don't think we will be able to do this in all cases. Slices are not a superset of null terminated strings after all. Functions that return slices or which takes callbacks which takes slices will be breaking, but these are few and far between so hopefully, which the rest of the library being the same, the upgrade pains should be less painful.

This API retains backwards compatibility in a lot of cases for Nuklears
high level API, but the low level functions still only have a slice
options. This is because, at a low level, Nuklear no longer uses null
terminated strings at all.
@Hejsil
Copy link
Contributor Author

Hejsil commented Dec 23, 2021

Alright, I just pushed a commit that changes the src/nuklear.h header with a new proposed API. Here, many of the functions in the current API remain the same, but a new *_s function has been added that takes a slice instead of null terminated string. Functions like nk_strtod have been left only taking slices, as these are more low level functions used by Nuklear internally, and internally Nuklear does not use null terminated strings.

This is still a breaking change. I've removed all *_text functions in favor of the *_s convention for consistency, and functions that already took const char *ptr, int len as two parameters have been changed to take the nk_slice.

@@ -145,7 +145,7 @@ int main(void)
nk_input_end(ctx);

/* GUI */
if (nk_begin(ctx, "Demo", nk_rect(50, 50, 200, 200),
if (nk_begin(ctx, nk_slicez("Demo"), nk_rect(50, 50, 200, 200),
Copy link
Member

@dumblob dumblob Dec 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't both demos and examples written in C then use the non-slice API (to not scare people off)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes yes. I only update the header so we can discuss the interface before i go on a thousand line journey.

@@ -2666,7 +2689,8 @@ NK_API void nk_group_set_scroll(struct nk_context*, const char *id, nk_uint x_of
///
/// Returns `true(1)` if visible and fillable with widgets or `false(0)` otherwise
*/
#define nk_tree_push(ctx, type, title, state) nk_tree_push_hashed(ctx, type, title, state, NK_FILE_LINE,nk_strlen(NK_FILE_LINE),__LINE__)
#define nk_tree_push(ctx, type, title, state) nk_tree_push_hashed(ctx, type, title, state, nk_slicez(NK_FILE_LINE),__LINE__)
Copy link
Member

@dumblob dumblob Dec 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just quickly tried to find all uses of this function and couldn't find anything convincing so I'm not sure it's necessary to break backwards compatibility here.

This seems to hold also for some other functions - e.g. nk_tree_image_push_id() .

What was the motivation to change the API in such functions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed functions that took ptr+len already to take slices instead, for consistency. I can revert that as well, if we wonna keep complete backwards compatibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, as much backwards compatibility as we can get that is

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed functions that took ptr+len already to take slices instead, for consistency. I can revert that as well, if we wonna keep complete backwards compatibility.

Actually looking at it after the year and a half since you created this PR , with all the new "knowledge" (especially the trend to "duplicate" the string APIs) I think we could really try the "the more backwards compatible the better" approach. So I think we should keep str+len in existing functions 😉.

But feel free to wait for others (@RobLoach etc.) to express themself before diving into the thousand line journey 😉.

Copy link
Contributor Author

@Hejsil Hejsil Dec 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could really try the "the more backwards compatible the better" approach. So I think we should keep str+len in existing functions wink.

I see. Maybe we should just drop the nk_slice and just take ptr+len as individual parameters. This would be consistent with what Nuklear already does and is what most C libraries in general does. I think we can also come up with better names for these functions than just an *_s prefix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should just drop the nk_slice and just take ptr+len as individual parameters.

Yep, why not. If it serves the intended purpose, then it's good 😉!

@DBJDBJ
Copy link

DBJDBJ commented Dec 27, 2021

Great lib and great thread. Just a bit of "ultimate truth" if we may :)

I think we are very fortunate C was made as it was. But. Null terminated "strings" are most definitely the biggest mistake Denis has ever made.. And hardly anyone disputes that by 2021.

@rswinkle
Copy link
Contributor

I'll just chime in again with my 2 cents.

My initial reaction a couple years ago was similar to @RobLoach but I was convinced pretty quickly. My concern was more for verbosity and to a lesser extent performance, though the idea of a C library not defaulting to plain char* was a bit strange.

I don't think wrapping a string literal in nk_slicez() is that big a deal in verbosity, cognitive load, or the upgrade process. And that's assuming that people actually bother to update the version of Nuklear in their projects which isn't a given. Also as @Hejsil pointed out in my initial discussion back in 2020, strlen was always being called anyway, now it's just called in nk_slicez().

If this change required more than simple inline nk_slicez() calls to update existing programs, or caused a measurable performance drop, I'd still be against it but it doesn't.

So the only remaining reason IMO to go for the dual API approach, which causes far more internal library bloat, is to not break backward compatibility. I'm not sure it's worth the trade off. Nuklear isn't the linux kernel or even something like Qt or GTK where a single breaking API change can cause havoc. It's also rarely (ever?) used as a dynamic library that can be updated without also updating the program that uses it.

That being said, it does seem the only way to keep everyone happy and Nuklear is already a fairly large library at ~30k lines so a few more won't make much of a difference.

@dumblob
Copy link
Member

dumblob commented Dec 28, 2021

So the only remaining reason IMO to go for the dual API approach, which causes far more internal library bloat, is to not break backward compatibility. I'm not sure it's worth the trade off. Nuklear isn't the linux kernel or even something like Qt or GTK where a single breaking API change can cause havoc. It's also rarely (ever?) used as a dynamic library that can be updated without also updating the program that uses it.

That being said, it does seem the only way to keep everyone happy and Nuklear is already a fairly large library at ~30k lines so a few more won't make much of a difference.

Yep, we had to admit Nuklear is not a minuscule lib since we introduced TTF support years ago. I for myself see the "dual API" approach as temporary. It's a bit of PITA for us (devs & maintainers) only now IMHO.

Because it seems a good way to find out how Nuklear users will react and in a year or two we can bump major version due to a major overhaul & cleanup. Which depends on whether there'll be enough devs (currently there are nearly none considering their day jobs etc.).

Any commercial subject wanting to pour some money or work force into Nuklear? Then please start new thread in the issue tracker to discuss the synergies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants