Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pattern matching library #192

Merged
merged 12 commits into from Mar 5, 2018
Merged

Add pattern matching library #192

merged 12 commits into from Mar 5, 2018

Conversation

hellerve
Copy link
Member

@hellerve hellerve commented Feb 27, 2018

This PR introduces an initial API for regexes in Carp. It is an adaptation of the Lua regex system (I should probably think about proper attribution before we merge this; thoughts?).

Tests are included.

The following functions were added:

  • find : (Fn [&String &String] Int): finds the index of a regex inside another string
  • match : (Fn [&String &String] (Array String)): returns the match groups in a match
  • global-match : (Fn [&String &String] (Array (Array String))): returns the match groups of all matches
  • substitute : (Fn [&String &String &String Int] String): substitutes a pattern n times. Passing -1 will result in substitution of every occurrence of the pattern.
  • matches? : (Fn [&String &String] Bool): will check whether there are any valid matches of the regex in the string
  • match-str : (Fn [&String &String] String): will return the matched string

All in a few hours worth of work! 🎉 (this probably means that it’s buggy)

Cheers

@RyanSquared
Copy link

please note that if you're porting mostly from the Lua version, while this one is extremely lighter than what is a fully working regex implementation, this is a pattern matching system, not regex.

if you'd instead like something more powerful, I'd suggest porting over LPeg - a commonly used library in Lua that has more advanced features and is, IMO, much more powerful and better than regex (while also having a re mode that lets you use regex-like syntax).

@hellerve
Copy link
Member Author

hellerve commented Mar 2, 2018

@RyanSquared thanks for your input! Ideally I would like to have a PCRE-compatible library in the long run, but I believe that this is quite good for now. It is small and maintainable, and this is more important than completeness at the moment, since we’re a very small team who maintain and develop Carp nights and weekends.

So, while I appreciate that Lua’s pattern system is more simplistic than what you would like to have in the long run, this is by design.

@RyanSquared
Copy link

So, while I appreciate that Lua’s pattern system is more simplistic than what you would like to have in the long run, this is by design.

And I'm perfectly fine with that, but what you're saying is that this is a regex library, and the Lua team has made very strong efforts in the past to point out that this is not regex. It's pattern matching. It's faster. It's more useful. But most importantly, even if it looks like regex, it's not.

@hellerve
Copy link
Member Author

hellerve commented Mar 2, 2018

You’re right, I should probably rename them to patterns at the moment. Thanks again for your input!

hellerve added a commit to hellerve/Carp that referenced this pull request Mar 2, 2018
@eriksvedang
Copy link
Collaborator

eriksvedang commented Mar 3, 2018

Very exciting! Here's some feedback:

  • Should there be delete & copy functions for Pattern? Perhaps even init and str?
  • Naming of the internal functions in carp_pattern.h might clash with user defined functions (i.e. "match"). Either we need to put them into a .c file and make them static, or (easier) prefix them with Pattern_, they still don't need to be registered in Carp.
  • I noticed at least one use of malloc, make sure all calls use CARP_MALLOC / CARP_STRDUP / CARP_FREE (unless there's a very good reason not to, of course?)
  • How should the function isManaged in Lookup.hs handle the PatternTy?
  • Also, canBeUsedAsMemberType should probably be modified. Another solution might be to not introduce PatternTy but use (StructTy "Pattern" []) everywhere instead.

@hellerve hellerve force-pushed the regex branch 2 times, most recently from 69a5eb1 to dac3bad Compare March 3, 2018 18:19
@hellerve
Copy link
Member Author

hellerve commented Mar 3, 2018

Thanks for the review! I think all thos points should now be fixed :)

@eriksvedang
Copy link
Collaborator

Epic!

@eriksvedang eriksvedang merged commit ae8b764 into carp-lang:master Mar 5, 2018
@hellerve hellerve deleted the regex branch March 5, 2018 09:12
@hellerve hellerve changed the title Add regex library Add pattern matching library Mar 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants