strings: add find_between_pair #13468

Merged (6 commits) on Feb 15, 2022

Larpon (Contributor) commented Feb 14, 2022

This PR adds a function I've personally missed a lot while working with strings in V. It extracts the content between two marks in a string in a "balanced" way, which makes it possible to extract "block" or "scope" content. That is currently not possible with any of the find_between, all_after*, or all_before* methods defined on the string type.

My main use case has been extracting the contents of enum or struct blocks from C (and V) code.
For example, getting the contents between the opening { and closing } of the SDL_RWops struct below was previously not easy without extensive parsing (or position/format hacks):

typedef struct SDL_RWops
{
...
    int (SDLCALL * close) (struct SDL_RWops * context);
    Uint32 type;
    union
    {
#if defined(__ANDROID__)
        struct
        {
            void *fileNameRef;
            void *inputStreamRef;
            ...
            int fd;
        } androidio;
#elif defined(__WIN32__)
        struct
        {
            SDL_bool append;
            void *h;
            struct
            {
                ...
            } buffer;
        } windowsio;
#endif
        struct
        {
            void *data1;
            void *data2;
        } unknown;
    } hidden;
} SDL_RWops;
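
The "balanced" extraction described above can be sketched with a simple depth counter. This is a minimal illustration and not the code merged in this PR: the name between_pair_sketch and the shortened struct are hypothetical, and the sketch only handles single-byte marks.

```v
// Hypothetical sketch, not the function added by this PR.
// Returns the content between the outermost balanced pair of
// `start`/`end` bytes, or '' if no balanced pair is found.
fn between_pair_sketch(input string, start u8, end u8) string {
	mut depth := 0
	mut start_index := -1
	for i, b in input {
		if b == start {
			if depth == 0 {
				// remember where the outermost block begins
				start_index = i + 1
			}
			depth++
		} else if b == end && depth > 0 {
			depth--
			if depth == 0 {
				// the outermost block just closed
				return input[start_index..i]
			}
		}
	}
	return ''
}

fn main() {
	// Shortened, made-up struct standing in for SDL_RWops.
	c_src := 'typedef struct Pair { int a; struct { int b; } inner; } Pair;'
	body := between_pair_sketch(c_src, `{`, `}`)
	println(body) // prints " int a; struct { int b; } inner; "
}
```

Because the counter only returns when the depth drops back to zero, the nested struct's braces stay inside the result instead of ending the match early.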

It's also good for things like (nested) block comments:

/*
 /* hello */
*/
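
Nested block comments need multi-character marks, so a byte-by-byte comparison is not enough. The following is a rough sketch of how string marks could be handled with the same depth-counting idea. The name between_pair_str_sketch is hypothetical, and the PR's actual implementation may differ.

```v
// Hypothetical sketch, not the function added by this PR.
// Same depth-counting idea, but with string marks such as '/*' and '*/'.
fn between_pair_str_sketch(input string, start string, end string) string {
	if start.len == 0 || end.len == 0 {
		// an empty mark would match everywhere, so give up early
		return ''
	}
	mut depth := 0
	mut start_index := -1
	mut i := 0
	for i < input.len {
		rest := input[i..]
		if rest.starts_with(start) {
			if depth == 0 {
				start_index = i + start.len
			}
			depth++
			i += start.len
		} else if depth > 0 && rest.starts_with(end) {
			depth--
			if depth == 0 {
				return input[start_index..i]
			}
			i += end.len
		} else {
			i++
		}
	}
	return ''
}

fn main() {
	src := 'x /* a /* hello */ b */ y'
	println(between_pair_str_sketch(src, '/*', '*/')) // prints " a /* hello */ b "
}
```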

... and for grabbing various parts of V function signatures. Given a signature like:
fn fn_with_anon_fn(i int, callback fn (num int, s string))
the arguments (including the callback signature) can be grabbed quickly with:
arg_str := strings.find_between_pair(signature, '(', ')') // i int, callback fn (num int, s string)

I decided to use a sumtype for the mark types, since both performance and use cases differ between the three types I've needed:
[image: benchmark results, 10000 iterations]

The benchmark I've used is included here: bmark_find_between_pair.zip

What the result should be when the input or marks are "unbalanced" (missing marks, uneven counts, or identical start and end marks) is open for discussion.
I.e., what should the result be for the following inputs?
strings.find_between_pair('( blank ( or full ( or half of it','(','') == ??
strings.find_between_pair('* this part * or this *','*','*') == ??
This can be decided later. Right now such inputs result in blank or "undefined" (untested) output.
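
One possible answer to the open question above, offered only as a sketch, is to return an option so that callers can tell "unbalanced input" apart from "empty content". The name between_pair_opt and the choice to reject identical marks are assumptions for illustration. They are not behaviour defined by this PR.

```v
// Hypothetical sketch, not the function added by this PR.
// Returns `none` when the marks are identical or no balanced pair exists.
fn between_pair_opt(input string, start u8, end u8) ?string {
	if start == end {
		// identical marks cannot express nesting
		return none
	}
	mut depth := 0
	mut start_index := -1
	for i, b in input {
		if b == start {
			if depth == 0 {
				start_index = i + 1
			}
			depth++
		} else if b == end && depth > 0 {
			depth--
			if depth == 0 {
				return input[start_index..i]
			}
		}
	}
	return none
}

fn main() {
	balanced := between_pair_opt('s {X{Y}} s', `{`, `}`) or { '<unbalanced>' }
	println(balanced) // prints "X{Y}"
	missing_end := between_pair_opt('( blank ( or full ( or half of it', `(`, `)`) or {
		'<unbalanced>'
	}
	println(missing_end) // prints "<unbalanced>"
	same_marks := between_pair_opt('* this part * or this *', `*`, `*`) or { '<unbalanced>' }
	println(same_marks) // prints "<unbalanced>"
}
```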

Larpon (Contributor Author) commented Feb 14, 2022

The Vinix build error is unrelated, I think.

Larpon (Contributor Author) commented Feb 14, 2022

Ditto dynamic library example 🤷

Review comment on vlib/strings/strings.v (outdated, resolved)
spytheman (Member) commented:

> Ditto dynamic library example 🤷

Yes, it was a bug in the v.markused module (the functions needed for a#[] were removed with -skip-unused).
Fixed in V f8bf3db.

spytheman closed this Feb 14, 2022
spytheman reopened this Feb 14, 2022
Larpon force-pushed the strings/add-find_between_pair branch from f0d9bcb to 2310039, February 15, 2022 10:48
Larpon (Contributor Author) commented Feb 15, 2022

Ready when CI is

spytheman merged commit 80444c8 into vlang:master Feb 15, 2022
Larpon deleted the strings/add-find_between_pair branch February 15, 2022 13:26