[feature] Add utf8ndup #41

Closed
warmwaffles opened this issue Nov 21, 2017 · 7 comments · Fixed by #42

Comments

@warmwaffles
Contributor

This is my current implementation. I am using it to replace all of my strndup calls.

#include <utf8.h>

void*
utf8ndup(const void* src, size_t n)
{
    const char* s = (const char*)src;
    char* c       = 0;

    // figure out how many bytes (including the terminator) we need to copy first
    size_t bytes = utf8size(src);

    // n bytes plus one for the terminator written at the end
    c = (char*)malloc(n + 1);

    if (0 == c) {
        // out of memory so we bail
        return 0;
    }

    bytes = 0;
    size_t i = 0;

    // copy src byte-by-byte into our new utf8 string
    while ('\0' != s[bytes] && i < n) {
        c[bytes] = s[bytes];
        bytes++;
        i++;
    }

    // append null terminating byte
    c[bytes] = '\0';
    return c;
}

I don't know if this is desirable. I am half tempted to just calloc and memcpy the result.
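For reference, the allocate-then-memcpy idea might look roughly like the sketch below. It is standalone C that relies only on the standard library rather than utf8.h, and the name `utf8ndup_memcpy` is hypothetical, not part of the library. It scans at most n bytes for the terminator, allocates one extra byte, and copies in a single call. Like the loop version, it counts bytes, so it can cut a multi-byte codepoint in half.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical name, not part of utf8.h: duplicate at most n bytes of src. */
void*
utf8ndup_memcpy(const void* src, size_t n)
{
    const char* s = (const char*)src;
    size_t len    = 0;

    // bounded length scan: stop at the terminator or after n bytes
    while (len < n && '\0' != s[len]) {
        len++;
    }

    // one extra byte for the terminator
    char* c = (char*)malloc(len + 1);

    if (!c) {
        // out of memory so we bail
        return 0;
    }

    memcpy(c, s, len);
    c[len] = '\0';
    return c;
}
```

Because the scan is bounded by n, this variant never reads past the first n bytes of src, which is closer to how strndup behaves than calling utf8size on the whole string first.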

@f2404
Contributor

f2404 commented Nov 21, 2017

What's the point in

size_t bytes = utf8size(src);
bytes = 0;

?

@warmwaffles
Contributor Author

I think originally I intended to check to see if the new string will be smaller than the requested size.

But this is literally the utf8dup code with a tacked-on size_t n.

@f2404
Copy link
Contributor

f2404 commented Nov 21, 2017

Also, you don't need two iterators (bytes and i). One would be enough.

@warmwaffles
Contributor Author

warmwaffles commented Nov 21, 2017

void*
utf8ndup(const void* src, size_t n)
{
    const char* s = (const char*)src;
    char* c       = 0;

    // figure out how many bytes (including the terminator) we need to copy first
    size_t bytes = utf8size(src);

    if (n < bytes) {
        c = (char*)malloc(n + 1);
    } else {
        c = (char*)malloc(bytes);
        n = bytes;
    }

    if (!c) {
        // out of memory so we bail
        return 0;
    }

    bytes = 0;

    // copy src byte-by-byte into our new utf8 string
    while ('\0' != s[bytes] && bytes < n) {
        c[bytes] = s[bytes];
        bytes++;
    }

    // append null terminating byte
    c[bytes] = '\0';
    return c;
}

warmwaffles changed the title from "Add utf8ndup" to "[feature] Add utf8ndup" on Nov 21, 2017
@warmwaffles
Contributor Author

Anyway, this could probably be better, and it could probably share the code used in utf8dup when the string is shorter than the requested n.
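One possible refinement, sketched here as an assumption rather than anything proposed in this thread: since n counts bytes, a cut can land inside a multi-byte codepoint. The hypothetical helper `utf8_trim_to_boundary` below (not part of utf8.h) returns the largest length of at most n bytes that ends on a codepoint boundary. It assumes src is NUL-terminated and valid UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper, not part of utf8.h. Assumes src is NUL-terminated,
   valid UTF-8. Returns a byte length <= n that ends on a codepoint boundary. */
size_t
utf8_trim_to_boundary(const void* src, size_t n)
{
    const unsigned char* s = (const unsigned char*)src;
    size_t len             = 0;

    // bounded length scan: stop at the terminator or after n bytes
    while (len < n && 0 != s[len]) {
        len++;
    }

    // continuation bytes look like 10xxxxxx; if the byte right after the cut
    // is one, the cut is inside a codepoint, so back up to its lead byte
    while (len > 0 && 0x80 == (s[len] & 0xC0)) {
        len--;
    }

    return len;
}
```

The returned length could then be used as the copy length in a duplicate routine, so the result never ends with a partial codepoint.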

@sheredom
Owner

Thanks for looking at this!

Two options:

  • Are you willing to do a PR to add this?
  • Otherwise, would you rather I incorporated this?

I'm happy to do the work, but some people would rather their name was on the commit if they did the work!

@warmwaffles
Contributor Author

@sheredom I would be more than happy to submit a PR for this. Just wanted to test the waters here first.
