Skip to content
This repository was archived by the owner on Dec 15, 2022. It is now read-only.

Conversation

mkhl
Copy link
Contributor

@mkhl mkhl commented Mar 8, 2016

The regexp engine and the shell grammar don't agree on what comprises a “word boundary”. This leads to problems where a command or path ending in a shell keyword would be interpreted as that keyword.

This change replaces \b word boundary matchers with lookahead/-behind matchers on whitespace, line breaks, and command separators (; and &).

It also eliminates ad-hoc custom word boundaries, where subsets of [-=/] were pre- or appended to some word boundaries, which fixed some similar problems.

This is basically a cherry-pick of textmate/shellscript.tmbundle@cb6e72e, modified for atom.

@winstliu
Copy link
Contributor

winstliu commented Mar 8, 2016

Can you please add some specs to this to ensure that nothing regressed or will regress?

The regexp engine and the shell grammar don't agree on what comprises
a “word boundary”. This leads to problems where a command or path
ending in a shell keyword would be interpreted as that keyword.

This change replaces `\b` word boundary matchers with lookahead/-behind
matchers on whitespace, line breaks, and command separators (";" and "&").

It also eliminates ad-hoc custom word boundaries, where subsets of [-=/]
were pre- or appended to some word boundaries,
which fixed some similar problems.
@mkhl
Copy link
Contributor Author

mkhl commented Mar 15, 2016

I've added a bunch of specs for cases that weren't handled before.

for…in loops and case blocks seem covered well enough already (except at least one spec is wrong, but that's the subject of #36), which leaves while and until loops, and function declarations.

Do you want me to add specs for their "happy paths" as well?


expect(tokens[0]).toEqual value: 'iffy', scopes: ['source.shell']
for string of strings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the long delay...but shouldn't this be for string in strings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Um, yes, I think it should be. Not sure what I got mixed up there 😅

@winstliu
Copy link
Contributor

Do you want me to add specs for their "happy paths" as well?

Can you elaborate on this?

@mkhl
Copy link
Contributor Author

mkhl commented Sep 13, 2016

Do you want me to add specs for their "happy paths" as well?

Can you elaborate on this?

I was talking about while and until loops, and function declarations. Those don’t have proper tests yet, and I only added tests that ensure that many strange characters are not considered to be word boundaries by the shell.

What’s missing are tests that tokenize a whole block, like the one for for…in loops.

(Tests for for…in loops without the in and for C-style for loops are also missing.)

@winstliu
Copy link
Contributor

Ok, since this is only testing for word boundaries, I think it's fine that we don't add new tests for the other loops.

@winstliu winstliu closed this in #65 Jan 9, 2017
winstliu pushed a commit that referenced this pull request Aug 22, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants