strings: add find_between_pair #13468

Larpon · 2022-02-14T14:46:17Z

This PR adds a function I've personally missed a lot while working with strings in V. It introduces a way to extract the content between two marks in a string in a "balanced" way, i.e. nested occurrences of the marks are matched up. That makes it possible to extract "block" or "scope" content, which is currently not possible with any of the find_between, all_after* or all_before* methods defined on the string type.

My personal use case has mostly been extracting the contents of enum or struct blocks from C (and V) code.
E.g. getting the contents between the opening { and the closing } of the SDL_RWops struct below was not easy before without extensive parsing (or position/format hacks):

typedef struct SDL_RWops
{
...
    int (SDLCALL * close) (struct SDL_RWops * context);
    Uint32 type;
    union
    {
#if defined(__ANDROID__)
        struct
        {
            void *fileNameRef;
            void *inputStreamRef;
            ...
            int fd;
        } androidio;
#elif defined(__WIN32__)
        struct
        {
            SDL_bool append;
            void *h;
            struct
            {
                ...
            } buffer;
        } windowsio;
#endif
        struct
        {
            void *data1;
            void *data2;
        } unknown;
    } hidden;
} SDL_RWops;

It's also good for things like (nested) block comments:

/*
 /* hello */
*/
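As a rough illustration of the nested comment case, here is a minimal sketch of balanced extraction with multi-character marks. `find_balanced_str` is a hypothetical helper written for this example, not the function added by this PR, and it assumes that an unbalanced input simply yields an empty string.

```v
// find_balanced_str is a hypothetical helper (not part of this PR).
// It returns the text between the first `start_mark` and its matching
// `end_mark`, counting nested pairs. It returns '' if no balanced pair exists.
fn find_balanced_str(input string, start_mark string, end_mark string) string {
	if start_mark.len == 0 || end_mark.len == 0 {
		return ''
	}
	mut depth := 0
	mut first := -1
	mut i := 0
	for i < input.len {
		rest := input[i..]
		if rest.starts_with(start_mark) {
			if depth == 0 {
				// remember where the outermost content begins
				first = i + start_mark.len
			}
			depth++
			i += start_mark.len
		} else if depth > 0 && rest.starts_with(end_mark) {
			depth--
			if depth == 0 {
				return input[first..i]
			}
			i += end_mark.len
		} else {
			i++
		}
	}
	return ''
}

fn main() {
	src := 'code /* outer /* hello */ still outer */ more code'
	// prints the outer comment body, including the nested comment
	println(find_balanced_str(src, '/*', '*/'))
}
```

Slicing `input[i..]` on every step keeps the sketch short; a real implementation would likely compare bytes in place to avoid the extra work.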

It is also handy for grabbing various parts of V function signatures. Given a signature like:
fn fn_with_anon_fn(i int, callback fn (num int, s string))
the arguments (including the callback signature) can be grabbed quickly with:
arg_str := strings.find_between_pair(signature, '(', ')') // i int, callback fn (num int, s string)
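To make the "balanced" matching concrete, the following is a minimal sketch of the idea for single-byte marks: keep a depth counter, and return the slice once the counter drops back to zero. `find_balanced` is a hypothetical stand-in written for this example; the function in this PR may differ in name, signature and edge-case behaviour.

```v
// find_balanced is a hypothetical helper (not the function added by this PR).
// It returns the text between the first `start_mark` byte and its matching
// `end_mark` byte, or '' when no balanced pair is found.
fn find_balanced(input string, start_mark u8, end_mark u8) string {
	mut depth := 0
	mut first := -1
	for i, b in input {
		if b == start_mark {
			if depth == 0 {
				// content starts right after the outermost opening mark
				first = i + 1
			}
			depth++
		} else if b == end_mark && depth > 0 {
			depth--
			if depth == 0 {
				return input[first..i]
			}
		}
	}
	return ''
}

fn main() {
	signature := 'fn fn_with_anon_fn(i int, callback fn (num int, s string))'
	println(find_balanced(signature, `(`, `)`)) // i int, callback fn (num int, s string)
}
```

The same helper would handle the SDL_RWops example above when called with `{` and `}` as marks, since the nested union and struct braces only raise and lower the depth counter.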

I decided to use a sum type for the marks, since both performance and use cases differ between the three mark types I've had use for:
(10000 iterations)

The benchmark I've used is included here: bmark_find_between_pair.zip

It is open for discussion what the result should be when the input or marks are "unbalanced", i.e. when marks are missing, uneven, or identical.
For example, what should the result be for the following inputs?
strings.find_between_pair('( blank ( or full ( or half of it','(','') == ??
strings.find_between_pair('* this part * or this *','*','*') == ??
This can be decided later. Right now such inputs result in blank or "undefined" (untested) return output.
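One possible way to settle the question, sketched here purely as an assumption and not as what this PR implements, is to return an option so that "no balanced pair" can be told apart from "empty content". `find_balanced_opt` is a hypothetical helper, and the sketch uses `)` as the closing mark for the first input.

```v
// find_balanced_opt is a hypothetical variant (not part of this PR).
// It returns `none` instead of '' when no balanced pair can be found.
fn find_balanced_opt(input string, start_mark u8, end_mark u8) ?string {
	if start_mark == end_mark {
		// identical marks cannot nest, so "balanced" is not well defined here
		return none
	}
	mut depth := 0
	mut first := -1
	for i, b in input {
		if b == start_mark {
			if depth == 0 {
				first = i + 1
			}
			depth++
		} else if b == end_mark && depth > 0 {
			depth--
			if depth == 0 {
				return input[first..i]
			}
		}
	}
	// the input ended while at least one mark was still open (or none was found)
	return none
}

fn main() {
	unbalanced := find_balanced_opt('( blank ( or full ( or half of it', `(`, `)`) or {
		'<unbalanced>'
	}
	same_marks := find_balanced_opt('* this part * or this *', `*`, `*`) or { '<ambiguous>' }
	println(unbalanced) // <unbalanced>
	println(same_marks) // <ambiguous>
}
```

With this shape the caller decides what an unbalanced input means, at the cost of an `or` block at every call site.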

Larpon · 2022-02-14T15:11:27Z

Vinix build error is unrelated I think

Larpon · 2022-02-14T15:30:02Z

Ditto dynamic library example 🤷

spytheman · 2022-02-14T17:07:59Z

> Ditto dynamic library example 🤷

Yes, it was a bug in the v.markused module (the functions needed for a#[] were removed with -skip-unused).
Fixed in V f8bf3db.

Larpon · 2022-02-15T11:03:49Z

Ready when CI is

spytheman reviewed Feb 14, 2022

spytheman closed this Feb 14, 2022

spytheman reopened this Feb 14, 2022

Larpon added 6 commits February 15, 2022 11:48

6a186a2 strings: add find_between_pair
61efcec strings: fix comment examples
8eeb0e4 strings: fix find_between_pair test
a0eaf0c strings: fix find_between_pair test
ae47202 strings: use explicit functions for each type, fix tests
2310039 strings: clean up doc strings

Larpon force-pushed the strings/add-find_between_pair branch from f0d9bcb to 2310039 Compare February 15, 2022 10:48

spytheman merged commit 80444c8 into vlang:master Feb 15, 2022

Larpon deleted the strings/add-find_between_pair branch February 15, 2022 13:26
