Add _SAVE & _SAVE_AS for unnamed/named captures #4

Open
4 of 5 tasks
xparq opened this issue Sep 4, 2023 · 1 comment
xparq commented Sep 4, 2023

(Requires #2)

Again, just like the captures of proper regexes... (Nameless ones are simply collected into a vector, named ones into a map. UPDATE: Well, it's not that simple, though; see task 2...)

This could be a fundamental building block of subsequent parsing features.

Note: it should be used "wisely", i.e. capture ops should not be nested carelessly. I mean nesting is fine, but it probably makes little sense in most cases. I could detect it and issue a warning, I guess...

  • _SAVE
  • _SAVE, but actually putting the result into some "reliably indexable" slot! :) Just putting it into some map keyed by the rule's this pointer loses the "proper monotonic ordering for pseudo-indexing by enumeration" property!
  • convenience getter (e.g. operator[](size_t index)) for unnamed results (via Parser)
  • also for named ones then...
  • _SAVE_AS "name"
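
The ordering concern in task 2 can be illustrated with a small standalone sketch. This is not the library's code: `Rule`, `Match`, `by_address` and `by_index` are hypothetical names made up for the illustration. A map keyed by the rule's address enumerates captures in address order, while a running counter keeps them in match order.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a grammar rule; only its address matters here.
struct Rule { int dummy = 0; };

// A (rule, captured text) pair, listed in the order the matches happened.
using Match = std::pair<const Rule*, std::string>;

// Keyed by the rule's address: enumeration follows address order,
// which has nothing to do with the order of matching.
inline std::vector<std::string> by_address(const std::vector<Match>& matches) {
    std::map<const Rule*, std::string> store;
    for (const Match& m : matches) store[m.first] = m.second;
    std::vector<std::string> out;
    for (const auto& entry : store) out.push_back(entry.second);
    return out;
}

// Keyed by a running counter: enumeration follows match order.
inline std::vector<std::string> by_index(const std::vector<Match>& matches) {
    std::map<std::size_t, std::string> store;
    std::size_t next = 0;
    for (const Match& m : matches) store[next++] = m.second;
    std::vector<std::string> out;
    for (const auto& entry : store) out.push_back(entry.second);
    return out;
}
```

If the rule stored at the higher address happens to match first, `by_address` returns the captures in swapped order, whereas `by_index` does not.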

Rather than fiddling with auto-assigning contiguous indexes to multiple matching unnamed _SAVE rules, and having distinct (disjoint) capture-result collections for the two modes, just using the stringified this pointer as the key for the unnamed ones seems better, with a thin iterator wrapper for enumerating them. They could perhaps even go into the very same results map!

  • OK, so not in the same map...
    • First of all, users may want to query the count of named/unnamed captures separately.
    • Second, numeric indexing of unnamed results is a tad more nuanced than the simple ["name"] getters; it gets at least a little simpler with numeric keys in an ordered map, without having to slalom around the explicit name keys.
    • The issue of name clashes between auto-indexes and "given names" disappears with 0 effort this way.
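
A minimal sketch of the "two separate collections" conclusion, under stated assumptions: `CaptureStore`, `save` and `save_as` are hypothetical names, and only the member names `unnamed_captures` and `named_captures` are borrowed from the tests in this thread. The actual Parser may well be organised differently.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical capture store with separate collections for the two modes.
struct CaptureStore {
    std::map<std::size_t, std::string> unnamed_captures; // ordered by match index
    std::map<std::string, std::string> named_captures;

    // What a _SAVE might do: store under the next free numeric index.
    void save(const std::string& text) {
        const std::size_t index = unnamed_captures.size();
        unnamed_captures[index] = text;
    }

    // What a _SAVE_AS "name" might do: store under the given name.
    void save_as(const std::string& name, const std::string& text) {
        named_captures[name] = text;
    }

    // Convenience getters; a missing capture yields an empty string.
    std::string operator[](std::size_t index) const {
        const auto it = unnamed_captures.find(index);
        return it == unnamed_captures.end() ? std::string() : it->second;
    }
    std::string operator[](const std::string& name) const {
        const auto it = named_captures.find(name);
        return it == named_captures.end() ? std::string() : it->second;
    }
};
```

With this layout the two counts can be queried independently, and a capture explicitly named "0" cannot clash with the unnamed capture at index 0.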
xparq changed the title from "Add a _CAPTURE operator (both named and nameless)" to "Add _SAVE & _SAVE_AS for unnamed/named captures" on Sep 4, 2023

xparq commented Sep 4, 2023

Just FTR:

CASE("CAPTURE: named") {
    Parser p(_{
        _{_OPT, "_WHITESPACES"},
        _{_SAVE_AS, "id", "_ID"},
        _{_OPT, "_WHITESPACES"},
        "=",
        _{_OPT, "_WHITESPACES"},
        _{_SAVE_AS, "val", "_DIGITS"}
    }); p.syntax.DUMP();

    CHECK(p.parse("  capture_this = 1  "));
        CHECK(p.unnamed_captures.empty());
        CHECK(p.named_captures.size() == 2);
        CHECK(p["id"] == "capture_this");
        CHECK(p["val"] == "1");
    CHECK(p.parse("  or_this = 22 // Wow, fake comments! ;)"));
        CHECK(p.unnamed_captures.empty());
        CHECK(p.named_captures.size() == 2);
        CHECK(p["id"] == "or_this");
        CHECK(p["val"] == "22");

    CHECK(!p.parse("not this = one"));
}

CASE("CAPTURE: nested") {
    RULE code = _{_MANY, _{_OR, "_ID", "=", "_DIGITS", ";", "_WHITESPACES"} };
    RULE block_in = _{"<", _{_SAVE_AS, "inner", code}, ">"};
    RULE block_out = _{"<", _{_SAVE_AS, "outer",
            _{ _{_OPT, code}, _{_OPT, block_in}, _{_OPT, code} },
        }, ">"};

    Parser p(block_out); p.syntax.DUMP();

    CHECK(p.parse("<outer <inner> block>"));
    CHECK(p["inner"] == "inner");
    CHECK(p["outer"] == "outer <inner> block");

    CHECK(p.parse("<outer < x = 1; y = 2; > block>"));
    CHECK(p["inner"] == " x = 1; y = 2; "); //! Mind the spaces...
    CHECK(p["inner"] !=  "x = 1; y = 2;" ); //! Mind the spaces...
    CHECK(p["outer"] == "outer < x = 1; y = 2; > block");
}
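
For readers unfamiliar with how nested captures can yield overlapping results like the ones checked above, here is a tiny hand-written sketch. It is not how the library does it; `match_block` and `Captures` are hypothetical names, and the grammar is reduced to bare angle-bracket blocks. The idea is only that a capture remembers the offset where its sub-rule starts and stores the substring once that sub-rule has matched, so the outer capture naturally contains the inner one.

```cpp
#include <cstddef>
#include <map>
#include <string>

using Captures = std::map<std::string, std::string>;

// Matches '<' ... '>' starting at `pos`, allowing nested blocks, and saves
// the text between the brackets as "outer" (depth 0) or "inner" (depth 1).
// On success `pos` is advanced past the closing '>'.
// Note: on failure, captures saved by already-matched inner blocks are left
// in `out`; a real implementation would probably want to roll those back.
inline bool match_block(const std::string& in, std::size_t& pos,
                        std::size_t depth, Captures& out) {
    static const char* const names[] = {"outer", "inner"};
    if (pos >= in.size() || in[pos] != '<') return false;

    std::size_t p = pos + 1;
    const std::size_t start = p; // the capture begins right after '<'
    while (p < in.size() && in[p] != '>') {
        if (in[p] == '<') {
            // Nested block: recurse; `p` ends up just past its '>'.
            if (!match_block(in, p, depth + 1, out)) return false;
        } else {
            ++p;
        }
    }
    if (p >= in.size()) return false; // no closing '>'

    if (depth < 2) out[names[depth]] = in.substr(start, p - start);
    pos = p + 1;
    return true;
}
```

As in the test above, whitespace inside the brackets is part of the captured text, because the capture spans everything the wrapped rule consumed.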

xparq added a commit that referenced this issue Sep 5, 2023
Test:
	build-msvc test/OP_SAVE.cpp & OP_SAVE
	build-gcc test/OP_SAVE.cpp & a.exe
xparq mentioned this issue Sep 7, 2023
8 tasks