Skip to content

Update query parser to infer strings as ascii#277

Merged
eddelbuettel merged 7 commits intomasterfrom
aw/tweak-query-parser-ascii-detection
Jul 30, 2021
Merged

Update query parser to infer strings as ascii#277
eddelbuettel merged 7 commits intomasterfrom
aw/tweak-query-parser-ascii-detection

Conversation

@aaronwolen
Copy link
Copy Markdown
Member

Drops the anchors from .isAscii's regex pattern to look for alpha characters anywhere in the value. Also adds a new test section where we can start checking for more type inference edge cases.

Comment thread R/QueryCondition.R
parse_query_condition <- function(expr, debug=FALSE) {
.isComparisonOperator <- function(x) as.character(x) %in% c(">", ">=", "<", "<=", "==", "!=")
.isBooleanOperator <- function(x) as.character(x) %in% c("&&", "||", "!")
.isAscii <- function(x) grepl("^[a-zA-Z_]*$", x)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to keep the front anchor? Column names cannot start with digits...

Maybe "^[a-zA-Z_0-9]*" ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is checking column values, right? So we don't want to miss something like "1a".

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gosh, you're right. I was so focussed on schema names.

In which case it is 'much worse' :) I.e. possibly payload is much more. Should we use [[:alpha:]] or a totally generic reg exp (meta) pattern.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I like [[:alpha:]]. It should improve detection for non-a-z characters too! I'll update.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per ?regexp maybe [[:alnum:]] ?

Copy link
Copy Markdown
Contributor

@eddelbuettel eddelbuettel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the inlined comment -- I think we may at least want to keep the front-anchor.

@aaronwolen aaronwolen changed the title Update query parser to infer strings containing digits as ascii Update query parser to infer strings as ascii Jul 29, 2021
@aaronwolen aaronwolen requested a review from eddelbuettel July 29, 2021 20:14
Copy link
Copy Markdown
Contributor

@eddelbuettel eddelbuettel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update based on what I riffed on over slack to you.

This had removed .isAscii which I just restored, if that doesn't pose a problem for you 🤠 I would be happy and delighted to merge

@eddelbuettel eddelbuettel merged commit daf0378 into master Jul 30, 2021
@eddelbuettel eddelbuettel deleted the aw/tweak-query-parser-ascii-detection branch August 3, 2021 17:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants