[postgres] Add support for custom binary operators #548

Merged
merged 5 commits into from
Aug 5, 2022
Changes from 2 commits
2 changes: 1 addition & 1 deletion src/ast/mod.rs
@@ -33,7 +33,7 @@ pub use self::ddl::{
AlterColumnOperation, AlterTableOperation, ColumnDef, ColumnOption, ColumnOptionDef,
ReferentialAction, TableConstraint,
};
- pub use self::operator::{BinaryOperator, UnaryOperator};
+ pub use self::operator::{BinaryOperator, PGCustomOperator, UnaryOperator};
pub use self::query::{
Cte, Fetch, Join, JoinConstraint, JoinOperator, LateralView, LockType, Offset, OffsetRows,
OrderByExpr, Query, Select, SelectInto, SelectItem, SetExpr, SetOperator, TableAlias,
84 changes: 52 additions & 32 deletions src/ast/operator.rs
@@ -12,6 +12,8 @@

use core::fmt;

#[cfg(not(feature = "std"))]
use alloc::string::String;
#[cfg(feature = "serde")]
use serde::{Deserialize, Serialize};

@@ -86,41 +88,59 @@ pub enum BinaryOperator {
PGRegexIMatch,
PGRegexNotMatch,
PGRegexNotIMatch,
PGCustomBinaryOperator(PGCustomOperator),
}

/// PostgreSQL-specific custom operator.
///
/// See [CREATE OPERATOR](https://www.postgresql.org/docs/current/sql-createoperator.html) for more information.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
pub struct PGCustomOperator {
pub schema: Option<String>,
pub name: String,
Collaborator:

What would you think about using the pre-existing identifier types here (ObjectName supports various qualified names)? For example:

Suggested change
- pub struct PGCustomOperator {
- pub schema: Option<String>,
- pub name: String,
+ pub struct PGCustomOperator {
+ pub ident: ObjectName

Then you could use parse_object_name instead of custom parsing logic.

sqlparser-rs/src/parser.rs

Lines 3114 to 3125 in 076b587

/// Parse a possibly qualified, possibly quoted identifier, e.g.
/// `foo` or `myschema."table"
pub fn parse_object_name(&mut self) -> Result<ObjectName, ParserError> {
let mut idents = vec![];
loop {
idents.push(self.parse_identifier()?);
if !self.consume_token(&Token::Period) {
break;
}
}
Ok(ObjectName(idents))
}

This would likely require less code and would also handle more cases, such as quoted names like OPERATOR("~").

Contributor Author:

I tried doing this, but it looks like Identifier doesn't allow tokens that are unquoted operators; e.g. ~ is not a valid identifier. I am not sure that extending parse_identifier to allow unquoted operators is the right call here.

IMO we could start with the existing mechanism of a schema-qualified operator name (though I would have liked to use the parse_object_name logic) and extend it as needed later. Let me know.
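
To make the limitation described in this comment concrete, here is a minimal, self-contained sketch. The `Token` enum below is a simplified, hypothetical stand-in (not sqlparser's real token type), and the function only mirrors the shape of the `parse_object_name` loop quoted earlier in this thread. It is meant as an illustration under those assumptions, not as the library's implementation.

```rust
// Simplified stand-in for the parser's token type (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Word(String),
    Period,
    Tilde,
    RParen,
}

/// Mirrors the shape of the quoted loop: identifiers separated by periods.
/// Returns the parsed parts, or an error if a part is not an identifier.
fn parse_object_name(tokens: &[Token]) -> Result<Vec<String>, String> {
    let mut idents = Vec::new();
    let mut pos = 0;
    loop {
        match tokens.get(pos) {
            Some(Token::Word(w)) => idents.push(w.clone()),
            other => return Err(format!("expected identifier, found {:?}", other)),
        }
        pos += 1;
        // Stop unless the next token is a period introducing another part.
        if tokens.get(pos) != Some(&Token::Period) {
            break;
        }
        pos += 1;
    }
    Ok(idents)
}

fn main() {
    // A dotted name made only of words parses fine.
    let qualified = [
        Token::Word("myschema".into()),
        Token::Period,
        Token::Word("table".into()),
    ];
    println!("{:?}", parse_object_name(&qualified));

    // The contents of OPERATOR(pg_catalog.~): the operator token is rejected.
    let operator = [
        Token::Word("pg_catalog".into()),
        Token::Period,
        Token::Tilde,
        Token::RParen,
    ];
    println!("{:?}", parse_object_name(&operator));
}
```

In this sketch the second call fails because the last part is an operator token rather than a word, which is the case the schema-plus-operator approach in the PR handles separately.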

Collaborator:

Thank you for trying. Something about hard-coding "schema" doesn't seem right to me here, at the very least because identifiers can also include a database (https://www.postgresql.org/docs/current/ddl-schemas.html), e.g. database.schema.table, so maybe it should be a Vec<String>.
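
As a rough illustration of the Vec<String> idea, the sketch below stores every dotted part in one list and joins them when printing. The type name `CustomOperator` is hypothetical and is not taken from this PR; the `OPERATOR(...)` output format follows the Display code in the diff.

```rust
use std::fmt;

/// Hypothetical alternative to a separate `schema` field: all parts of the
/// (possibly qualified) operator name live in one list, so `~` and
/// `pg_catalog.~` share the same shape.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct CustomOperator(Vec<String>);

impl fmt::Display for CustomOperator {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Join the parts with periods, as in a qualified name.
        write!(f, "OPERATOR({})", self.0.join("."))
    }
}

fn main() {
    let qualified = CustomOperator(vec!["pg_catalog".into(), "~".into()]);
    let bare = CustomOperator(vec!["~".into()]);
    println!("{}", qualified); // OPERATOR(pg_catalog.~)
    println!("{}", bare); // OPERATOR(~)
}
```

With this shape, no level of qualification is special-cased in the type; the parser decides how many parts it accepts.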

I think that, given the special set of allowed characters described in https://www.postgresql.org/docs/current/sql-createoperator.html ("The operator name is a sequence of up to NAMEDATALEN-1 (63 by default) characters from the following list:"), some extra code would be needed to parse this limitation properly.
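
A minimal sketch of the kind of extra check this would involve, assuming the character list given on the CREATE OPERATOR documentation page and the default 63-character limit. The function name is hypothetical, and the check is deliberately partial: the documentation lists further restrictions (for example on `--`, `/*`, and names ending in `+` or `-`) that this sketch does not enforce.

```rust
/// Characters listed as allowed in operator names by the CREATE OPERATOR documentation.
const OPERATOR_CHARS: &str = "+-*/<>=~!@#%^&|`?";

/// NAMEDATALEN defaults to 64, so a name may hold up to 63 characters.
const MAX_OPERATOR_LEN: usize = 63;

/// Partial check (hypothetical helper): character set and length only.
fn is_plausible_operator_name(name: &str) -> bool {
    !name.is_empty()
        && name.chars().count() <= MAX_OPERATOR_LEN
        && name.chars().all(|c| OPERATOR_CHARS.contains(c))
}

fn main() {
    for candidate in ["~", "!~*", "<=>", "abc", ""] {
        println!("{:?}: {}", candidate, is_plausible_operator_name(candidate));
    }
}
```

A parser could run a check like this on the token text inside `OPERATOR(...)` before accepting it as an operator name.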

Let me give it a shot.

}

impl fmt::Display for BinaryOperator {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
- f.write_str(match self {
- BinaryOperator::Plus => "+",
- BinaryOperator::Minus => "-",
- BinaryOperator::Multiply => "*",
- BinaryOperator::Divide => "/",
- BinaryOperator::Modulo => "%",
- BinaryOperator::StringConcat => "||",
- BinaryOperator::Gt => ">",
- BinaryOperator::Lt => "<",
- BinaryOperator::GtEq => ">=",
- BinaryOperator::LtEq => "<=",
- BinaryOperator::Spaceship => "<=>",
- BinaryOperator::Eq => "=",
- BinaryOperator::NotEq => "<>",
- BinaryOperator::And => "AND",
- BinaryOperator::Or => "OR",
- BinaryOperator::Xor => "XOR",
- BinaryOperator::Like => "LIKE",
- BinaryOperator::NotLike => "NOT LIKE",
- BinaryOperator::ILike => "ILIKE",
- BinaryOperator::NotILike => "NOT ILIKE",
- BinaryOperator::BitwiseOr => "|",
- BinaryOperator::BitwiseAnd => "&",
- BinaryOperator::BitwiseXor => "^",
- BinaryOperator::PGBitwiseXor => "#",
- BinaryOperator::PGBitwiseShiftLeft => "<<",
- BinaryOperator::PGBitwiseShiftRight => ">>",
- BinaryOperator::PGRegexMatch => "~",
- BinaryOperator::PGRegexIMatch => "~*",
- BinaryOperator::PGRegexNotMatch => "!~",
- BinaryOperator::PGRegexNotIMatch => "!~*",
- })
+ match self {
+ BinaryOperator::Plus => f.write_str("+"),
+ BinaryOperator::Minus => f.write_str("-"),
+ BinaryOperator::Multiply => f.write_str("*"),
+ BinaryOperator::Divide => f.write_str("/"),
+ BinaryOperator::Modulo => f.write_str("%"),
+ BinaryOperator::StringConcat => f.write_str("||"),
+ BinaryOperator::Gt => f.write_str(">"),
+ BinaryOperator::Lt => f.write_str("<"),
+ BinaryOperator::GtEq => f.write_str(">="),
+ BinaryOperator::LtEq => f.write_str("<="),
+ BinaryOperator::Spaceship => f.write_str("<=>"),
+ BinaryOperator::Eq => f.write_str("="),
+ BinaryOperator::NotEq => f.write_str("<>"),
+ BinaryOperator::And => f.write_str("AND"),
+ BinaryOperator::Or => f.write_str("OR"),
+ BinaryOperator::Xor => f.write_str("XOR"),
+ BinaryOperator::Like => f.write_str("LIKE"),
+ BinaryOperator::NotLike => f.write_str("NOT LIKE"),
+ BinaryOperator::ILike => f.write_str("ILIKE"),
+ BinaryOperator::NotILike => f.write_str("NOT ILIKE"),
+ BinaryOperator::BitwiseOr => f.write_str("|"),
+ BinaryOperator::BitwiseAnd => f.write_str("&"),
+ BinaryOperator::BitwiseXor => f.write_str("^"),
+ BinaryOperator::PGBitwiseXor => f.write_str("#"),
+ BinaryOperator::PGBitwiseShiftLeft => f.write_str("<<"),
+ BinaryOperator::PGBitwiseShiftRight => f.write_str(">>"),
+ BinaryOperator::PGRegexMatch => f.write_str("~"),
+ BinaryOperator::PGRegexIMatch => f.write_str("~*"),
+ BinaryOperator::PGRegexNotMatch => f.write_str("!~"),
+ BinaryOperator::PGRegexNotIMatch => f.write_str("!~*"),
+ BinaryOperator::PGCustomBinaryOperator(ref custom_operator) => {
+ write!(f, "OPERATOR(")?;
+ if let Some(ref schema) = custom_operator.schema {
+ write!(f, "{}.", schema)?;
+ }
+ write!(f, "{})", custom_operator.name)
+ }
+ }
}
}
1 change: 1 addition & 0 deletions src/keywords.rs
@@ -358,6 +358,7 @@ define_keywords!(
ON,
ONLY,
OPEN,
OPERATOR,
OPTION,
OR,
ORC,
27 changes: 27 additions & 0 deletions src/parser.rs
@@ -1159,6 +1159,32 @@ impl<'a> Parser<'a> {
}
}
Keyword::XOR => Some(BinaryOperator::Xor),
Keyword::OPERATOR if dialect_of!(self is PostgreSqlDialect) => {
Collaborator:

Given @ovr's comment here, perhaps we should also add support when parsing with GenericDialect? apache/datafusion#3037 (comment)

Contributor Author:

Done! I kept the name as PGCustomOperator, but I'm happy to change it as well.

self.expect_token(&Token::LParen)?;
let token_1 = self.peek_nth_token(1);

let custom_operator = match token_1 {
Token::Period => {
let schema = self.parse_identifier()?;
self.expect_token(&Token::Period)?;
let operator = self.next_token();
PGCustomOperator {
schema: Some(schema.value),
name: operator.to_string(),
}
}
_ => {
let operator = self.next_token();
PGCustomOperator {
schema: None,
name: operator.to_string(),
}
}
};

self.expect_token(&Token::RParen)?;
Some(BinaryOperator::PGCustomBinaryOperator(custom_operator))
}
_ => None,
},
_ => None,
@@ -1423,6 +1449,7 @@ impl<'a> Parser<'a> {
Token::Word(w) if w.keyword == Keyword::BETWEEN => Ok(Self::BETWEEN_PREC),
Token::Word(w) if w.keyword == Keyword::LIKE => Ok(Self::BETWEEN_PREC),
Token::Word(w) if w.keyword == Keyword::ILIKE => Ok(Self::BETWEEN_PREC),
Token::Word(w) if w.keyword == Keyword::OPERATOR => Ok(Self::BETWEEN_PREC),
Token::Eq
| Token::Lt
| Token::LtEq
39 changes: 39 additions & 0 deletions tests/sqlparser_postgres.rs
@@ -1564,3 +1564,42 @@ fn parse_fetch() {
pg_and_generic()
.verified_stmt("FETCH BACKWARD ALL IN \"SQL_CUR0x7fa44801bc00\" INTO \"new_table\"");
}

#[test]
fn parse_custom_operator() {
// operator with a schema
let sql = r#"SELECT * FROM events WHERE relname OPERATOR(pg_catalog.~) '^(table)$'"#;
let select = pg().verified_only_select(sql);
assert_eq!(
select.selection,
Some(Expr::BinaryOp {
left: Box::new(Expr::Identifier(Ident {
value: "relname".into(),
quote_style: None,
})),
op: BinaryOperator::PGCustomBinaryOperator(PGCustomOperator {
schema: Some("pg_catalog".into()),
name: "~".into(),
}),
right: Box::new(Expr::Value(Value::SingleQuotedString("^(table)$".into())))
})
);

// custom operator without a schema
let sql = r#"SELECT * FROM events WHERE relname OPERATOR(~) '^(table)$'"#;
let select = pg().verified_only_select(sql);
assert_eq!(
select.selection,
Some(Expr::BinaryOp {
left: Box::new(Expr::Identifier(Ident {
value: "relname".into(),
quote_style: None,
})),
op: BinaryOperator::PGCustomBinaryOperator(PGCustomOperator {
schema: None,
name: "~".into(),
}),
right: Box::new(Expr::Value(Value::SingleQuotedString("^(table)$".into())))
})
);
}