Core: Add MultiStringParser to match a collection of strings #3510
Changes from 5 commits
13a3832
f129f07
733c3f2
db03a79
88404f5
0bb6468
96f63a4
ed04460
414c713
e9c66ea
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,7 +4,7 @@ | |
""" | ||
|
||
import regex | ||
from typing import Type, Optional, List, Tuple, Union | ||
from typing import Collection, Type, Optional, List, Tuple, Union | ||
|
||
from sqlfluff.core.parser.context import ParseContext | ||
from sqlfluff.core.parser.matchable import Matchable | ||
|
@@ -100,6 +100,47 @@ def match( | |
return MatchResult.from_unmatched(segments) | ||
|
||
|
||
class MultiStringParser(StringParser): | ||
"""An object which matches and returns raw segments on a collection of strings.""" | ||
|
||
def __init__( | ||
self, | ||
templates: Collection[str], | ||
raw_class: Type[RawSegment], | ||
name: Optional[str] = None, | ||
type: Optional[str] = None, | ||
optional: bool = False, | ||
**segment_kwargs, | ||
): | ||
self.templates = {template.upper() for template in templates} | ||
super().__init__( | ||
template="", | ||
raw_class=raw_class, | ||
name=name, | ||
type=type, | ||
optional=optional, | ||
**segment_kwargs, | ||
) | ||
# Delete attribute which is replaced by `self.templates` for this `Parser` | ||
del self.template | ||
|
||
def simple(self, parse_context: "ParseContext") -> Optional[List[str]]: | ||
"""Return simple options for this matcher. | ||
|
||
Because string matchers are not case sensitive we can | ||
just return the templates here. | ||
""" | ||
return list(self.templates) | ||
Is this function called frequently? I notice that we're creating and returning a new list each time, which could be slow.

The compromise here was whether to cache the

I've implemented a
||
|
||
def _is_first_match(self, segment: BaseSegment): | ||
"""Does the segment provided match according to the current rules.""" | ||
# Is the target a match and IS IT CODE. | ||
# The latter stops us accidentally matching comments. | ||
if segment.is_code and segment.raw.upper() in self.templates: | ||
judahrand marked this conversation as resolved.
|
||
return True | ||
return False | ||
|
||
|
||
class NamedParser(StringParser): | ||
"""An object which matches and returns raw segments based on names.""" | ||
|
||
|
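The review thread on `simple()` above asks whether building a new list on every call could be slow. One possible way to avoid that is sketched below, under the assumption that the set of templates never changes after construction. `CachedSimpleSketch` is a hypothetical toy class, not the sqlfluff parser, and this is not necessarily the approach the PR settled on.

```python
from typing import Collection, FrozenSet, List


class CachedSimpleSketch:
    """Toy stand-in for a parser (hypothetical, not the sqlfluff class)."""

    def __init__(self, templates: Collection[str]):
        # Normalise once so that later lookups are case-insensitive.
        self.templates: FrozenSet[str] = frozenset(t.upper() for t in templates)
        # Build the list a single time instead of on every `simple()` call.
        self._simple: List[str] = sorted(self.templates)

    def simple(self) -> List[str]:
        # The same list object is returned each time; callers must not mutate it.
        return self._simple
```

The trade-off is a little extra memory per parser in exchange for no allocation per call. Sorting is optional and only makes the result deterministic.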
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
"""The Test file for Parsers (Matchable Classes).""" | ||
|
||
from sqlfluff.core.parser import ( | ||
KeywordSegment, | ||
MultiStringParser, | ||
) | ||
from sqlfluff.core.parser.context import RootParseContext | ||
|
||
|
||
def test__parser__multistringparser__match(generate_test_segments): | ||
"""Test the MultiStringParser matchable.""" | ||
parser = MultiStringParser(["foo", "bar"], KeywordSegment) | ||
with RootParseContext(dialect=None) as ctx: | ||
# Check directly | ||
seg_list = generate_test_segments(["foo", "fo"]) | ||
# Matches when it should | ||
assert parser.match(seg_list[:1], parse_context=ctx).matched_segments == ( | ||
KeywordSegment("foo", seg_list[0].pos_marker), | ||
) | ||
# Doesn't match when it shouldn't | ||
assert parser.match(seg_list[1:], parse_context=ctx).matched_segments == tuple() | ||
|
||
|
||
def test__parser__multistringparser__simple(): | ||
"""Test the MultiStringParser matchable.""" | ||
parser = MultiStringParser(["foo", "bar"], KeywordSegment) | ||
with RootParseContext(dialect=None) as ctx: | ||
assert parser.simple(ctx) |
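The `simple` test above only asserts that the result is truthy. A stricter check might compare the contents, keeping in mind that a list built from a set has no guaranteed order. The sketch below uses a hypothetical helper, `simple_options`, that mimics the upper-casing and de-duplication in the diff on plain strings; it does not import or exercise the real parser.

```python
from typing import Collection, List


def simple_options(templates: Collection[str]) -> List[str]:
    """Mimic the diff's `simple()`: upper-case, de-duplicate, return a list."""
    return list({t.upper() for t in templates})


options = simple_options(["foo", "bar"])
# Compare order-insensitively, because set iteration order is not guaranteed.
assert set(options) == {"FOO", "BAR"}
assert len(options) == 2
```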
Is there some benefit to inheriting from `StringParser`? I notice that: `template` )
Basically, I'm wondering if this class doesn't need to inherit from `StringParser`.

There are other methods which are used from `StringParser`, for example `match`. Though, you are right that the inheritance is odd. I reckon there should perhaps be a `BaseParser` which inherits from `Matchable` and implements the reusable bits of `StringParser`, given that it looks to me like all other Parsers inherit from `StringParser`?

Had a go at this and keen for feedback.
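
The `BaseParser` idea floated in this thread could look roughly like the following. This is a toy sketch under stated assumptions, not the code from this PR: `Token` is a hypothetical stand-in for a raw segment, and the classes only borrow the names `BaseParser`, `StringParser` and `MultiStringParser`. The real sqlfluff classes take more constructor arguments and return `MatchResult` objects.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Collection, Sequence, Tuple


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for a raw segment."""

    raw: str
    is_code: bool = True


class BaseParser(ABC):
    """Hypothetical base class holding the logic shared by every parser."""

    @abstractmethod
    def _is_first_match(self, token: Token) -> bool:
        """Return True if `token` satisfies this parser's rule."""

    def match(self, tokens: Sequence[Token]) -> Tuple[Token, ...]:
        # Shared behaviour: only the first token is considered.
        if tokens and self._is_first_match(tokens[0]):
            return (tokens[0],)
        return ()


class StringParser(BaseParser):
    """Matches a single string, case-insensitively."""

    def __init__(self, template: str):
        self.template = template.upper()

    def _is_first_match(self, token: Token) -> bool:
        return token.is_code and token.raw.upper() == self.template


class MultiStringParser(BaseParser):
    """Matches any string from a collection, case-insensitively."""

    def __init__(self, templates: Collection[str]):
        self.templates = {t.upper() for t in templates}

    def _is_first_match(self, token: Token) -> bool:
        return token.is_code and token.raw.upper() in self.templates
```

With a shared base, `MultiStringParser` becomes a sibling of `StringParser` and never has to create and then delete a `template` attribute.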