3 changes: 2 additions & 1 deletion src/dialect/postgresql.rs
@@ -20,7 +20,7 @@ impl Dialect for PostgreSqlDialect {
// See https://www.postgresql.org/docs/11/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
// We don't yet support identifiers beginning with "letters with
// diacritical marks and non-Latin letters"
('a'..='z').contains(&ch) || ('A'..='Z').contains(&ch) || ch == '_'
('a'..='z').contains(&ch) || ('A'..='Z').contains(&ch) || ch == '_' || ch == '#'
Contributor:

Yeah I am torn on this one - I worry about causing trouble for someone who is relying on # not being a valid pg identifier. I don't really have a great understanding of what people are doing / using sqlparser for, and I would say it is not particularly consistent about disallowing constructs that are not valid in other dialects of SQL 🤔

What do you think about adding a RedshiftSqlDialect to avoid unintended consequences? I think it could be pretty straightforward, something like:

impl Dialect for RedshiftSqlDialect {
    fn is_identifier_start(&self, ch: char) -> bool {
        PostgreSqlDialect::is_identifier_start(ch) || ch == '#'
    }
}
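The suggested delegation can be sketched as a compilable standalone program. In the real crate `is_identifier_start` is a `&self` trait method, so `postgres_is_identifier_start` below is a hypothetical free function standing in for `PostgreSqlDialect`'s rule, introduced only for this sketch:

```rust
// Minimal sketch of the dialect-delegation idea from the comment above.
// `postgres_is_identifier_start` is a stand-in, not a real sqlparser API.

trait Dialect {
    fn is_identifier_start(&self, ch: char) -> bool;
}

// Stand-in for PostgreSqlDialect's rule: ASCII letters and '_'.
fn postgres_is_identifier_start(ch: char) -> bool {
    ch.is_ascii_alphabetic() || ch == '_'
}

struct RedshiftSqlDialect;

impl Dialect for RedshiftSqlDialect {
    // Redshift accepts everything Postgres does, plus '#'.
    fn is_identifier_start(&self, ch: char) -> bool {
        postgres_is_identifier_start(ch) || ch == '#'
    }
}

fn main() {
    assert!(RedshiftSqlDialect.is_identifier_start('#'));
    assert!(RedshiftSqlDialect.is_identifier_start('a'));
    assert!(!RedshiftSqlDialect.is_identifier_start('1'));
}
```

Keeping the extra character in a separate dialect means existing PostgreSqlDialect users see no behavior change.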

}

fn is_identifier_part(&self, ch: char) -> bool {
@@ -29,5 +29,6 @@ impl Dialect for PostgreSqlDialect {
|| ('0'..='9').contains(&ch)
|| ch == '$'
|| ch == '_'
|| ch == '#'
}
}
39 changes: 25 additions & 14 deletions src/tokenizer.rs
@@ -361,6 +361,25 @@ impl<'a> Tokenizer<'a> {
Ok(tokens)
}

fn consume_sharp(
&self,
chars: &mut Peekable<Chars<'_>>,
) -> Result<Option<Token>, TokenizerError> {
match chars.peek() {
Some('>') => {
chars.next();
match chars.peek() {
Some('>') => {
chars.next();
Ok(Some(Token::HashLongArrow))
}
_ => Ok(Some(Token::HashArrow)),
}
}
_ => Ok(Some(Token::Sharp)),
}
}

/// Get the next token or return None
fn next_token(&self, chars: &mut Peekable<Chars<'_>>) -> Result<Option<Token>, TokenizerError> {
//println!("next_token: {:?}", chars.peek());
@@ -422,7 +441,11 @@ impl<'a> Tokenizer<'a> {
s += s2.as_str();
return Ok(Some(Token::Number(s, false)));
}
Ok(Some(Token::make_word(&s, None)))
if s == "#" {
Contributor Author:

@alamb What do you think about this change?
I'm worried about a clash with line 610; I didn't find a better way to solve it:

'#' if dialect_of!(self is SnowflakeDialect) => {
    chars.next(); // consume the '#', starting a snowflake single-line comment
    let comment = self.tokenize_single_line_comment(chars);
    Ok(Some(Token::Whitespace(Whitespace::SingleLineComment {
        prefix: "#".to_owned(),
        comment,
    })))
}

Contributor Author:

Maybe we should check if the dialect is RedshiftSqlDialect?

Contributor:

The change doesn't feel right to me: because it effectively skips #s, wouldn't it mean that things like #SELECT will parse as SELECT (not just identifiers)?

self.consume_sharp(chars)
} else {
Ok(Some(Token::make_word(&s, None)))
}
}
// string
'\'' => {
@@ -624,19 +647,7 @@ impl<'a> Tokenizer<'a> {
}
'#' => {
chars.next();
match chars.peek() {
Some('>') => {
chars.next();
match chars.peek() {
Some('>') => {
chars.next();
Ok(Some(Token::HashLongArrow))
}
_ => Ok(Some(Token::HashArrow)),
}
}
_ => Ok(Some(Token::Sharp)),
}
self.consume_sharp(chars)
}
'@' => self.consume_and_return(chars, Token::AtSign),
'?' => self.consume_and_return(chars, Token::Placeholder(String::from("?"))),
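Stripped of the crate's `Result`/error plumbing, the `consume_sharp` helper factored out in this diff can be exercised standalone. The `Token` enum below is a pared-down stand-in for sqlparser's real one, and the caller is assumed to have already consumed the leading '#':

```rust
use std::iter::Peekable;
use std::str::Chars;

// Pared-down stand-in for sqlparser's Token enum.
#[derive(Debug, PartialEq)]
enum Token {
    Sharp,         // bare #
    HashArrow,     // #>
    HashLongArrow, // #>>
}

// After a leading '#' has been consumed, peek ahead to distinguish the
// Postgres JSON path operators `#>` and `#>>` from a bare `#`.
fn consume_sharp(chars: &mut Peekable<Chars<'_>>) -> Token {
    match chars.peek() {
        Some('>') => {
            chars.next();
            match chars.peek() {
                Some('>') => {
                    chars.next();
                    Token::HashLongArrow
                }
                _ => Token::HashArrow,
            }
        }
        _ => Token::Sharp,
    }
}

fn main() {
    assert_eq!(consume_sharp(&mut ">> 'b'".chars().peekable()), Token::HashLongArrow);
    assert_eq!(consume_sharp(&mut "> 'b'".chars().peekable()), Token::HashArrow);
    assert_eq!(consume_sharp(&mut " x".chars().peekable()), Token::Sharp);
}
```

Factoring the helper out lets both call sites (the bare-`#` word branch and the `'#'` match arm) share one operator-disambiguation path.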
10 changes: 10 additions & 0 deletions tests/sqlparser_postgres.rs
@@ -1373,3 +1373,13 @@ fn pg_and_generic() -> TestedDialects {
dialects: vec![Box::new(PostgreSqlDialect {}), Box::new(GenericDialect {})],
}
}

#[test]
fn test_sharp() {
let sql = "SELECT #_of_values";
let select = pg().verified_only_select(sql);
assert_eq!(
SelectItem::UnnamedExpr(Expr::Identifier(Ident::new("#_of_values"))),
select.projection[0]
);
}
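The interplay the `if s == "#"` branch is meant to preserve can be illustrated with a minimal self-contained scanner: `#_of_values` should scan as one identifier, while a bare `#` followed by `>` falls through to operator handling so `#>` and `#>>` keep working. `Tok` and `scan_from_sharp` are hypothetical names for this sketch, not crate APIs:

```rust
#[derive(Debug, PartialEq)]
enum Tok {
    Ident(String),
    Sharp,
    HashArrow,     // #>
    HashLongArrow, // #>>
}

// Mirrors the diff's is_identifier_part, including the new '#'.
fn is_identifier_part(ch: char) -> bool {
    ch.is_ascii_alphanumeric() || ch == '$' || ch == '_' || ch == '#'
}

// Scan a token whose first character is '#'.
fn scan_from_sharp(input: &str) -> Tok {
    let word: String = input.chars().take_while(|c| is_identifier_part(*c)).collect();
    if word == "#" {
        // Bare '#': defer to operator handling, as in the diff's
        // `if s == "#" { self.consume_sharp(chars) }` branch.
        let rest = &input[1..];
        if rest.starts_with(">>") {
            Tok::HashLongArrow
        } else if rest.starts_with('>') {
            Tok::HashArrow
        } else {
            Tok::Sharp
        }
    } else {
        Tok::Ident(word)
    }
}

fn main() {
    assert_eq!(scan_from_sharp("#_of_values"), Tok::Ident("#_of_values".into()));
    assert_eq!(scan_from_sharp("#> 'b'"), Tok::HashArrow);
    assert_eq!(scan_from_sharp("#>> 'b'"), Tok::HashLongArrow);
}
```

Note this sketch also shows the reviewer's concern: with '#' as an identifier character, `#SELECT` would scan as the single identifier `#SELECT` rather than an error, which is the behavior change the dedicated RedshiftSqlDialect is meant to contain.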