Simple tokenizer question #58

dhowe · 2018-10-04T05:27:55Z

I'm trying to match C-style identifiers that start with a '#' character, but discard the hash character and keep the rest. In my attempts below, the '#' is always included in the token:

static TextParser<Unit> HashIdent { get; } = Character.EqualTo('#').IgnoreThen(Identifier.CStyle).Value(Unit.Value);

I know I'm missing something simple...

nblumhardt · 2018-10-04T06:10:49Z

Hi!

The tokenizer will consider the whole span matched by the parser to be the token, regardless of the value that's returned. However, you can use .Apply(HashIdent) later on in the parsing stage to get the value, i.e. Token.EqualTo(Tokens.Identifier).Apply(HashIdent).

For this to work you need to drop off the Value(Unit.Value) piece:

static TextParser<Unit> HashIdent { get; } = Character.EqualTo('#').IgnoreThen(Identifier.CStyle);

HTH!

dhowe · 2018-10-04T06:26:19Z

Thanks, that's very helpful (I was just wondering what the Apply was doing in your example code)!

When I do that, I get a compile error... do I need to change the generic type to TextSpan?

dhowe · 2018-10-04T06:33:41Z

Also, how do I handle the opposite case, where I want to ignore the last character?

        static TextParser<Unit> Actor { get; } =
            from name in Character.LetterOrDigit.Many()
            from last in Character.EqualTo(':')
            select Unit.Value;

AndrewSav · 2018-12-18T01:28:48Z

Yes, dropping Value(Unit.Value) means the parser now returns a different value, so its declared type has to change as well. If the value is not needed, it's better to return Unit for performance reasons, but in your case change the type to TextParser<TextSpan> as you suggested.
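A minimal sketch of how the pieces discussed so far might fit together, assuming Superpower's TokenizerBuilder, Token.EqualTo and Apply. The Tokens enum, the HashIdentDemo class and the ParseName helper are hypothetical names used only for illustration. The tokenizer still records the whole "#name" span as the token; Apply(HashIdent) then re-parses that span to recover the part after the hash.

```csharp
using Superpower;
using Superpower.Model;
using Superpower.Parsers;
using Superpower.Tokenizers;

// Hypothetical token kind, for illustration only.
enum Tokens { Identifier }

static class HashIdentDemo
{
    // Yields the identifier span after the '#'; note the TextSpan result type.
    static TextParser<TextSpan> HashIdent { get; } =
        Character.EqualTo('#').IgnoreThen(Identifier.CStyle);

    // The token produced here still covers the full "#name" span.
    static Tokenizer<Tokens> HashTokenizer { get; } =
        new TokenizerBuilder<Tokens>()
            .Ignore(Span.WhiteSpace)
            .Match(HashIdent, Tokens.Identifier)
            .Build();

    // Re-apply HashIdent to the token's span to drop the '#'.
    static TokenListParser<Tokens, string> Name { get; } =
        Token.EqualTo(Tokens.Identifier)
            .Apply(HashIdent)
            .Select(span => span.ToStringValue());

    // Hypothetical helper: tokenize, then parse the first identifier.
    public static string ParseName(string input) =>
        Name.Parse(HashTokenizer.Tokenize(input));
}
```

Under these assumptions, HashIdentDemo.ParseName("#foo") should give "foo", while the token itself keeps the "#foo" span.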

As for your second question, something like this would probably work:

static TextParser<string> Actor { get; } =
    from name in Character.LetterOrDigit.Many()
    from last in Character.EqualTo(':')
    select new string(name);
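If a TextSpan is preferred over a string, a variation along these lines may also work. It is only a sketch: it assumes Span.MatchedBy is available in the Superpower version in use, it swaps Many() for AtLeastOnce() so that an empty name is rejected (a change from the snippet above), and the ActorDemo class and ParseActor helper are hypothetical names.

```csharp
using Superpower;
using Superpower.Model;
using Superpower.Parsers;

static class ActorDemo
{
    // Capture the name as a span, then consume the trailing ':' without returning it.
    static TextParser<TextSpan> Actor { get; } =
        from name in Span.MatchedBy(Character.LetterOrDigit.AtLeastOnce())
        from colon in Character.EqualTo(':')
        select name;

    // Hypothetical helper: parse the input and return the name without the ':'.
    public static string ParseActor(string input) =>
        Actor.Parse(input).ToStringValue();
}
```

Returning the span avoids building an intermediate char[] and string until the text is actually needed, for example via ToStringValue().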

nblumhardt · 2019-05-31T10:36:10Z

I guess this water is long under the bridge - hope you found a good solution, @dhowe - closing as stale, but keep us posted :-)

nblumhardt added the question label Oct 4, 2018

nblumhardt closed this as completed May 31, 2019
