Add Regular expressions #256

Closed
gfwilliams opened this issue Mar 18, 2014 · 15 comments

Member

@gfwilliams gfwilliams commented Mar 18, 2014

This looks promising: https://github.com/cesanta/slre

Is GPLv2 compatible with the MPLv2 of Espruino though?

Boasts ~5kB compiled, which is probably as small as we could hope for.

It could do with some modification to use JsvStringIterator if it's going to be any use for long strings (which, let's face it, it'll need to handle well).

Member Author

@gfwilliams gfwilliams commented Mar 18, 2014

http://sourceforge.net/projects/tiny-rex looks good too. As a bonus, it doesn't use functions called foo, baz and bar.


@gadicc gadicc commented Apr 7, 2014

Haha, it's awesome, every time I think of something that would be great to have, but probably not worth opening an issue for (yet), you've already opened it yourself.


@EthraZa EthraZa commented Mar 4, 2017

Has there been no progress on this?
I tried to load json2html on Espruino and was sad to find out how much regex it uses. After that I took a look at ewsjs and was sad again.
So I noticed that it's hard to imagine any general-purpose web service or templating without regex.
Is regex a possibility within Espruino, or am I better off finding another way to do that stuff?

Member Author

@gfwilliams gfwilliams commented Mar 6, 2017

It's possible, yes. It may happen - but not soon. I just haven't had time to work on it.

For now the most annoying thing is sorting the lexer out (the grammar for it is nasty since '/' could be a divide, or the start of a regex)

Espruino does have ES6 template literals though, which are pretty much perfect for templating in a web service.

Honestly, if you're trying to pull some framework in to Espruino you'll be disappointed anyway - it'll use up all your available memory before you've even had a chance to write any code!


@dave-irvine dave-irvine commented Jun 9, 2017

@gfwilliams Any movement on this? Finding myself needing RegEx the more things I do!

Member Author

@gfwilliams gfwilliams commented Jun 9, 2017

Hi - not yet - the first step would be to make the Lexer distinguish regular expressions from other JS. I'd had a go at that, but it's not quite as easy as you'd hope because the slash is also used for divide. It's hard to find info on it but it looks like I can decide based on the previous token.
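
To illustrate the "decide based on the previous token" idea, here is a minimal sketch. The token shape (`type`, `text`) and the function name `slashStartsRegex` are hypothetical and are not taken from Espruino's lexer. It is a heuristic only: cases such as `if (x) /re/.test(y)` or a `/` after `}` remain ambiguous without more context.

```javascript
// Hypothetical token kinds for illustration; not Espruino's real lexer IDs.
// Returns true if a '/' seen right after `prev` should start a regex literal,
// false if it should be treated as the divide operator.
function slashStartsRegex(prev) {
  if (prev === null) return true; // start of input: nothing to divide
  switch (prev.type) {
    case "number":
    case "string":
    case "identifier":
      return false; // a value precedes '/', so it is a divide
    case "punct":
      // ')' and ']' usually end an expression, so '/' after them is a divide
      return prev.text !== ")" && prev.text !== "]";
    case "keyword":
      // keywords that are values behave like identifiers;
      // others ('return', 'typeof', ...) expect an expression next
      return ["this", "null", "true", "false"].indexOf(prev.text) < 0;
  }
  return true; // unknown token kind: assume an expression may start here
}
```

For example, after the identifier in `a / b` the slash is a divide, while after `(` or `return` it would be read as the start of a regex.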

What sort of things were you trying to do that needed regex? I think realistically even when it goes in, it might not be a full regex parser.


@dave-irvine dave-irvine commented Jun 9, 2017

You could add basic support via the new RegExp() constructor rather than parsing / if that is where you are stuck for now.

Come to think of it, how are you parsing // comments?

I think http://duktape.org/ has a built in parser so maybe you could check their source for ideas.

Mostly I'd like to be able to do token replacement in a string (ideally doing this on a Stream as it is being piped from SD card to WiFi)

gfwilliams added a commit that referenced this issue Jun 9, 2017
Member Author

@gfwilliams gfwilliams commented Jun 9, 2017

Ok, I just added Regex parsing - it's nasty, but that's as good as it'll be by the look of it. No actual Regex implementation yet though, but at least (finally) it'll warn people when regex is used.

> Mostly I'd like to be able to do token replacement in a string (ideally doing this on a Stream as it is being piped from SD card to WiFi)

So basically that's just a repeated String.replace? You could still do it with indexOf and substr if needed.

Even using RegEx isn't going to work as nicely as you expect though. For example assume (for the sake of not having a huge post) that pipe is sending chunks of 8 bytes.

// Sending
HelloThereThisisMyFile$myvar$WithStuffAfter
// Gets split to
HelloThe reThisis MyFile$m yvar$Wit hStuffAf ter

If you just regex on each block you're going to miss the $myvar$ bit because it's split over 2 blocks. You're going to have to keep the previous block in memory all the time, and make sure you send it all when the pipe is closed.

Once you're doing all that, just using indexOf instead of a regex probably isn't all that painful.
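
A minimal sketch of that approach, using only `indexOf` and `substring`. The names `createReplacer`, `write`, `end` and `emit` are made up for this example and are not an Espruino API. Instead of keeping the whole previous block, it holds back only the last `token.length - 1` characters, which is the longest piece of a token that can be cut off at a chunk boundary.

```javascript
// Hypothetical helper: replaces `token` with `value` across chunk boundaries.
// `emit` is called with each piece of output that is safe to send on.
function createReplacer(token, value, emit) {
  var pending = ""; // tail held back in case a token straddles two chunks

  return {
    write: function (chunk) {
      var s = pending + chunk;
      var out = "", from = 0, i;
      // replace every complete occurrence in the buffered text
      while ((i = s.indexOf(token, from)) >= 0) {
        out += s.substring(from, i) + value;
        from = i + token.length;
      }
      var rest = s.substring(from);
      // keep back just enough characters to complete a split token later
      var keep = Math.min(token.length - 1, rest.length);
      out += rest.substring(0, rest.length - keep);
      pending = rest.substring(rest.length - keep);
      if (out.length) emit(out);
    },
    end: function () {
      // flush whatever is left when the pipe is closed
      if (pending.length) emit(pending);
      pending = "";
    }
  };
}

// Usage with the 8-byte chunks from the example above
var result = "";
var replacer = createReplacer("$myvar$", "42", function (s) { result += s; });
["HelloThe", "reThisis", "MyFile$m", "yvar$Wit", "hStuffAf", "ter"].forEach(function (c) {
  replacer.write(c);
});
replacer.end();
// result is now "HelloThereThisisMyFile42WithStuffAfter"
```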


@MrTimcakes MrTimcakes commented Jun 19, 2017

I was about to make a forum post about this but I thought I should just check GitHub first, glad I did.

My use case is removing multiple spaces from a string, e.g.

" 0  0      0 29105844  93308 2019184    0    0     0     0  241  435  0  0 100  0  0".split(/ +/)

@MrTimcakes MrTimcakes commented Jun 19, 2017

This is what I'm using in its place for now:

var e = " 0  0      0 29105844  93308 2019184    0    0     0     0  241  435  0  0 100  0  0"
while(e.indexOf('  ')!=-1)e=e.replace('  ',' ');
var values = e.trim().split(" ");
Member Author

@gfwilliams gfwilliams commented Jun 19, 2017

Thanks - it's actually kind of annoying that there's no 'global replace' function. It seems that for most things, even just a barebones regexp function that allowed stuff like / /g would make life a lot easier.
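
Until something like that exists, a global replace of a fixed string can be sketched with `split` and `join`. The helper names `replaceAll` and `splitOnSpaces` below are made up for this example, and it assumes `Array.prototype.filter` is available on the target.

```javascript
// Hypothetical helper: replace every occurrence of a fixed string, no regex.
function replaceAll(str, search, replacement) {
  return str.split(search).join(replacement);
}

// Hypothetical helper: split on runs of spaces by dropping the empty pieces
// that split(" ") produces between consecutive spaces.
function splitOnSpaces(str) {
  return str.split(" ").filter(function (s) {
    return s.length > 0;
  });
}

var replaced = replaceAll("a-b-c", "-", "+"); // "a+b+c"
var values = splitOnSpaces(" 0  0      0 29105844  93308"); // ["0","0","0","29105844","93308"]
```

Unlike a loop of single `replace` calls, `split`/`join` does not rescan its own output, so it also terminates when the replacement contains the search string.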

@gfwilliams gfwilliams closed this in 552f0e1 Oct 6, 2017
Member Author

@gfwilliams gfwilliams commented Oct 6, 2017

Leaving open, as we still have at minimum String.split to implement with RegEx.

@gfwilliams gfwilliams reopened this Oct 6, 2017
Member Author

@gfwilliams gfwilliams commented Dec 14, 2017

String.split is done now

@gfwilliams gfwilliams closed this Dec 14, 2017

@icanhazpython icanhazpython commented Dec 30, 2019

I may have found an issue with the regex implementation: String.split(/[_-]/) yields an exception: "Unfinished character set in RegEx", whereas String.split(/[-_]/) works fine. This is in v2.04.

Member Author

@gfwilliams gfwilliams commented Jan 6, 2020

Ahh, thanks! I just filed #1736

Looks like it's expecting something like [A-Z]. It feels like /[_-]/ is bad form because if you added one extra character it could totally change how this works. For example "a-b-c".split(/[_-]/) vs "a-b-c".split(/[_-z]/) - but even so it's supported elsewhere so we should do it.
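
To make the difference concrete, this is how the two patterns behave in a standard JavaScript engine such as Node.js (Espruino v2.04 threw on the first one, as reported above). In `[_-]` the trailing `-` is a literal hyphen, while in `[_-z]` it forms a range from `_` to `z`, which covers the lowercase letters but not `-` itself.

```javascript
// Trailing '-' is a literal: the set matches '_' or '-'
var literal = "a-b-c".split(/[_-]/);
// literal is ["a", "b", "c"]

// '_-z' is a range (character codes 0x5F to 0x7A): it matches a, b and c,
// but not '-' (0x2D), so the letters become the separators instead
var range = "a-b-c".split(/[_-z]/);
// range is ["", "-", "-", ""]
```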

6 participants