Skip to content

Support Snowflake MATCH_RECOGNIZE syntax#1222

Merged
alamb merged 4 commits intoapache:mainfrom
jmhain:match_recognize
Apr 22, 2024
Merged

Support Snowflake MATCH_RECOGNIZE syntax#1222
alamb merged 4 commits intoapache:mainfrom
jmhain:match_recognize

Conversation

@jmhain
Copy link
Copy Markdown
Contributor

@jmhain jmhain commented Apr 15, 2024

No description provided.

@coveralls
Copy link
Copy Markdown

coveralls commented Apr 15, 2024

Pull Request Test Coverage Report for Build 8773731659

Details

  • 354 of 391 (90.54%) changed or added relevant lines in 7 files are covered.
  • 651 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+1.1%) to 89.202%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/dialect/generic.rs 1 2 50.0%
src/dialect/mod.rs 1 2 50.0%
src/dialect/snowflake.rs 1 2 50.0%
src/ast/query.rs 64 74 86.49%
tests/sqlparser_common.rs 136 146 93.15%
src/parser/mod.rs 143 157 91.08%
Files with Coverage Reduction New Missed Lines %
src/parser/mod.rs 147 92.66%
tests/sqlparser_common.rs 504 89.59%
Totals Coverage Status
Change from base Build 8772950293: 1.1%
Covered Lines: 23163
Relevant Lines: 25967

💛 - Coveralls

@jmhain jmhain force-pushed the match_recognize branch 3 times, most recently from 2e139b6 to ab21e20 Compare April 15, 2024 15:10
Copy link
Copy Markdown
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @jmhain! I think this looks good, left a couple minor comments/questions

Comment thread src/ast/query.rs Outdated
Comment thread src/parser/mod.rs
}
}

if self.dialect.supports_match_recognize()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this use the dialect_of check instead or is the trait method required for this feature?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to use the trait method here because it can be reused in the test with all_dialects_where(|d| d.supports_match_recognize(). This way if it's added for another dialect in the future (e.g. Trino), those tests will run without having to remember to add the new dialect to them.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, not sure if we have a guideline for this but feels to me like the dialect trait could get bloated quickly otherwise. Not sure though, deferring to @alamb if you have thoughts (and the PR looks good to me for a review too!)

Copy link
Copy Markdown
Contributor Author

@jmhain jmhain Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's a good point. We could do something like this:

struct FeatureSupport {
    match_recognize: bool,
    // other features
}

trait Dialect {
    fn features() -> &'static FeatureSupport;

    // methods requiring custom code
}

If FeatureSupport gets too big, we can make it hierarchical.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a great story for this question -- it seems to me that the crate is currently inconsistent where sometimes it uses a trait method and sometimes uses the hard coded IsDialect ..

I think the FeatureSupport idea is neat, and if we didn't already have 2 patterns it would probably be the way to go. However, given we already have 2 methods, I think adding a third would be less than ideal

Ideally we would have a single way and I think the best thing for users of the crate is methods on Dialect (mostly so they can define their own custom dialects, which I have seen some of them do). However, I fully acknowledge that the trait will get very large for sure.

However, a wholesale refactoring of the crate is probably not feasible at this time.

If we can get some consensus between the three of us maybe I can add something to the documentation on the preferred method (dialectof or trait method)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FeatureSupport proposal seems like it balances the tradeoffs closely but yeah, agree that its likely not ideal to introduce another mechanism for this. At least the Dialect trait methods route sounds very reasonable to me as a preferred method in that case since as mentioned it helps users with custom dialect setup and it makes it easier to add test coverage for dialects.

Comment thread tests/sqlparser_common.rs
@alamb alamb changed the title Snowflake: add support for MATCH_RECOGNIZE Support Snowflake MATCH_RECOGNIZE syntax Apr 21, 2024
Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jmhain and @iffyio

Comment thread src/parser/mod.rs
}
}

if self.dialect.supports_match_recognize()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a great story for this question -- it seems to me that the crate is currently inconsistent where sometimes it uses a trait method and sometimes uses the hard coded IsDialect ..

I think the FeatureSupport idea is neat, and if we didn't already have 2 patterns it would probably be the way to go. However, given we already have 2 methods, I think adding a third would be less than ideal

Ideally we would have a single way and I think the best thing for users of the crate is methods on Dialect (mostly so they can define their own custom dialects, which I have seen some of them do). However, I fully acknowledge that the trait will get very large for sure.

However, a wholesale refactoring of the crate is probably not feasible at this time.

If we can get some consensus between the three of us maybe I can add something to the documentation on the preferred method (dialectof or trait method)

Comment thread tests/sqlparser_common.rs
),
);

// the grand finale (example taken from snowflake docs)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤯

@alamb
Copy link
Copy Markdown
Contributor

alamb commented Apr 21, 2024

@jmhain -- this PR looks like it has some conflicts -- would it be possible to resolve them?

@jmhain
Copy link
Copy Markdown
Contributor Author

jmhain commented Apr 21, 2024

@alamb Yep! Done

@alamb alamb merged commit 39980e8 into apache:main Apr 22, 2024
@alamb
Copy link
Copy Markdown
Contributor

alamb commented Apr 22, 2024

Thanks @jmhain

@jmhain jmhain deleted the match_recognize branch April 22, 2024 20:29
JichaoS pushed a commit to luabase/sqlparser-rs that referenced this pull request May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants