Humbly Checking for Status Updates #10
Thanks for your kind words ❤️ This project eats at me. I really, really want to implement it. But time and again, whatever approach I take, it becomes clear that I need to build a new PostgreSQL parser to solve it. None of the existing ones will do: they're either not sufficiently accurate, or they're based on Postgres' own parser, which (unintuitively) has a wide range of issues that make it unsuitable for this task. It's very frustrating.

I pick at it in my hobby hours from time to time. I'm pulling on a few threads in private, and I've got some interesting ideas that might turn out great or might not turn into anything (except maybe some pull requests to Postgres' docs...). Mostly I need to invest some time in learning parsing/lexing properly, rather than the dabbling I've done in the past.

I've been considering crowd-funding (e.g. Kickstarter) some work on it, but I'm not 100% sure I can deliver, and I'm not sure there's sufficient demand for the fundraise to succeed either; so right now it remains a hobby project that gently eats at my soul.

FWIW the best formatter I'm aware of is pgFormatter.
@benjie it pains me to think of this unfinished task eating away at you. I'm watching this project because I'm keen to see a good solution. But please don't lose sleep on it until people (like myself!) are paying you to do it. :)
Would any of the newer parsing libraries like sql-surveyor help here?
Does it do deparsing too? If not, someone has to write code converting the AST back to SQL for every single possible AST representation, which is a lot of work, as I'm sure you can imagine. Also, if it's not PostgreSQL-specific then it's unlikely to have full coverage of the PostgreSQL features that many of us use; but I'd certainly like to see the results of running it against some decent schemas. A good starting point might be Graphile Starter's schema: https://github.com/graphile/starter/blob/main/data/schema.sql
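To illustrate why deparsing is so much work, here's a hedged toy sketch of my own (not any real library's API): a deparser needs a printing rule for every AST node kind the grammar can produce, and PostgreSQL's grammar has hundreds.

```typescript
// Hypothetical minimal AST — real PostgreSQL ASTs have vastly more node kinds.
type SqlNode =
  | { kind: "set"; name: string; value: SqlNode }
  | { kind: "literal"; text: string }
  | { kind: "identifier"; name: string };

function deparse(node: SqlNode): string {
  switch (node.kind) {
    case "set":
      return `SET ${node.name} = ${deparse(node.value)}`;
    case "literal":
      return node.text;
    case "identifier":
      return node.name;
    // A real deparser would need a case for every statement, clause,
    // and expression kind the PostgreSQL grammar allows.
  }
}

console.log(
  deparse({
    kind: "set",
    name: "statement_timeout",
    value: { kind: "literal", text: "0" },
  })
); // → SET statement_timeout = 0
```

Even this three-node toy needs one printer per node kind; scaling that to the full grammar is the "lot of work" mentioned above.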
Deparsing (I guess aka printing? or maybe I'm misunderstanding...) is not one of its goals, so that would need to be built. (Unless I'm misunderstanding, I guess the pretty-printing would usually make up the majority of the code of a Prettier plugin.) But maybe @mtriff or @modeldba (maintainers at https://github.com/modeldba/sql-surveyor) could shed more light here on how this could be done and how difficult it would be. Here's an excerpt of the output when that Graphile schema is parsed:

ParsedSql {
parsedQueries: {
'163': ParsedQuery {
outputColumns: [],
referencedColumns: [],
referencedTables: {},
tokens: {
'163': Token {
value: 'SET',
location: TokenLocation {
lineStart: 8,
lineEnd: 8,
startIndex: 163,
stopIndex: 165
}
},
'167': Token {
value: 'statement_timeout',
location: TokenLocation {
lineStart: 8,
lineEnd: 8,
startIndex: 167,
stopIndex: 183
}
},
'185': Token {
value: '=',
location: TokenLocation {
lineStart: 8,
lineEnd: 8,
startIndex: 185,
stopIndex: 185
}
},
'187': Token {
value: '0',
location: TokenLocation {
lineStart: 8,
lineEnd: 8,
startIndex: 187,
stopIndex: 187
}
},
'188': Token {
value: ';',
location: TokenLocation {
lineStart: 8,
lineEnd: 8,
startIndex: 188,
stopIndex: 188
}
}
},
query: 'SET statement_timeout = 0',
queryType: 'DDL',
queryLocation: TokenLocation {
lineStart: 8,
lineEnd: 8,
startIndex: 163,
stopIndex: 187
},
queryErrors: [],
subqueries: {},
commonTableExpressions: {}
},
'190': ParsedQuery {
outputColumns: [],
referencedColumns: [],
referencedTables: {},
tokens: {
'190': Token {
value: 'SET',
location: TokenLocation {
lineStart: 9,
lineEnd: 9,
startIndex: 190,
stopIndex: 192
}
},
'194': Token {
value: 'lock_timeout',
location: TokenLocation {
lineStart: 9,
lineEnd: 9,
startIndex: 194,
stopIndex: 205
}
},
'207': Token {
value: '=',
location: TokenLocation {
lineStart: 9,
lineEnd: 9,
startIndex: 207,
stopIndex: 207
}
},
'209': Token {
value: '0',
location: TokenLocation {
lineStart: 9,
lineEnd: 9,
startIndex: 209,
stopIndex: 209
}
},
'210': Token {
value: ';',
location: TokenLocation {
lineStart: 9,
lineEnd: 9,
startIndex: 210,
stopIndex: 210
}
}
},
query: 'SET lock_timeout = 0',
queryType: 'DDL',
queryLocation: TokenLocation {
lineStart: 9,
lineEnd: 9,
startIndex: 190,
stopIndex: 209
},
queryErrors: [],
subqueries: {},
commonTableExpressions: {}
},
// ...
}
}
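To make the token stream above concrete, here's a hedged sketch of my own (the `Token`/`TokenLocation` shapes are copied from the printed output, but the reassembly logic is mine, not part of sql-surveyor): tokens keyed by start index can be joined back into source text by using `startIndex`/`stopIndex` to restore the gaps between them.

```typescript
// Shapes mirroring the printed output above (illustrative, not the library's types).
interface TokenLocation { lineStart: number; lineEnd: number; startIndex: number; stopIndex: number; }
interface Token { value: string; location: TokenLocation; }

function reconstruct(tokens: Record<string, Token>): string {
  const ordered = Object.values(tokens).sort(
    (a, b) => a.location.startIndex - b.location.startIndex
  );
  let out = "";
  let cursor = ordered[0]?.location.startIndex ?? 0;
  for (const t of ordered) {
    out += " ".repeat(t.location.startIndex - cursor); // restore inter-token gaps
    out += t.value;
    cursor = t.location.stopIndex + 1;
  }
  return out;
}

// The first query's tokens, taken verbatim from the output above:
const tokens: Record<string, Token> = {
  "163": { value: "SET", location: { lineStart: 8, lineEnd: 8, startIndex: 163, stopIndex: 165 } },
  "167": { value: "statement_timeout", location: { lineStart: 8, lineEnd: 8, startIndex: 167, stopIndex: 183 } },
  "185": { value: "=", location: { lineStart: 8, lineEnd: 8, startIndex: 185, stopIndex: 185 } },
  "187": { value: "0", location: { lineStart: 8, lineEnd: 8, startIndex: 187, stopIndex: 187 } },
  "188": { value: ";", location: { lineStart: 8, lineEnd: 8, startIndex: 188, stopIndex: 188 } },
};
console.log(reconstruct(tokens)); // → SET statement_timeout = 0;
```

Note this only restores horizontal spacing; a real round-trip would also have to use `lineStart`/`lineEnd` to restore newlines, and would still lose comments unless the lexer preserves them.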
Or if by deparsing you mean getting the original SQL representation back as part of the parsed data structure, I guess that is happening (both in the `query` property and in the token values).
Jumping into this conversation a bit blind, so I'm not totally familiar with this project and its goals. Under the hood, I'm assuming that the deparsing requirement is in support of this from the README:
Deparsing like this isn't really a goal of sql-surveyor.
Thanks @karlhorky and @mtriff. It also seems like we'd need to use the token stream, as the digests aren't generally sufficient for our needs (for example, in the parsed output above).
Alas I don't think
The comments are available in the AST, but they're loaded onto a separate channel. If you go this route, you'll probably want to implement a PL/pgSQL parser directly.
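For readers unfamiliar with the "separate channel" idea (how ANTLR-style lexers typically handle comments), here's a hedged, self-contained toy sketch of my own: the lexer routes `--` line comments to a side channel so the parser sees only significant tokens, which is exactly why a formatter then has to merge comments back in by position.

```typescript
// Illustrative only — a naive whitespace tokenizer, not a real SQL lexer.
interface ChannelToken {
  text: string;
  start: number;
  channel: "default" | "comments";
}

function lex(sql: string): ChannelToken[] {
  const tokens: ChannelToken[] = [];
  // Match a `--` line comment, or otherwise a run of non-whitespace.
  const re = /--[^\n]*|[^\s]+/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(sql)) !== null) {
    tokens.push({
      text: m[0],
      start: m.index,
      // Comments go to a side channel, hidden from the parser.
      channel: m[0].startsWith("--") ? "comments" : "default",
    });
  }
  return tokens;
}

const toks = lex("SET lock_timeout = 0; -- safety first");
const parserSees = toks.filter((t) => t.channel === "default").map((t) => t.text);
const comments = toks.filter((t) => t.channel === "comments").map((t) => t.text);
console.log(parserSees); // → [ 'SET', 'lock_timeout', '=', '0;' ]
console.log(comments); // → [ '-- safety first' ]
```

The parser channel never sees `-- safety first`, so a pretty-printer that works only from the parse tree would silently drop it; keeping the `start` offsets on the comment channel is what makes re-attachment possible.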
Awesome, thanks @mtriff. At the moment this project is deep on the back burner, as I don't have sufficient time available to work on it (when I tried before, I figured it'd take all my OSS hours for a few months to get a decent solution out; I am a beginner at parsers/lexers, after all); but I'm keeping my eyes open for better solutions, and your pointers help 👍
Just stumbled on https://github.com/mtxr/vscode-sqltools/tree/b8938cafa59dd378d22a7ef24ecb710de51b400e/packages/formatter/src/languages which could be interesting.
@benjie it seems the issue you raised has been resolved? Are there more major blockers?
Time ;) See the open PR
Wrote about some alternatives over here too, in case anyone wants to try some things in the meantime:
First off, thanks for all the incredible things you do for the open-source community, @benjie. You're definitely an inspiration on multiple levels. Hopefully this issue doesn't come across as a nuisance, so please feel free to close it if this just isn't on your radar right now.

I went down a crazy rabbit hole today trying to find some kind of tool I could use inside our repo to apply consistent formatting to PostgreSQL migrations and a bunch of more complex PostgreSQL functions that we manage. I'm spoiled into assuming that these kinds of tools are readily accessible in the Node ecosystem, but there don't seem to be many good options out there. This project definitely seems the most promising, both because of its integration with Prettier and because of the architectural approach you're taking.

I was just wondering if you plan to revisit and continue work on this, or if it's on the back burner because of your other, higher-profile projects (totally understandable)? Anyway, thanks again, and I just wanted to give a huge thumbs up: if this ever got released, it would definitely help out both the org I work for and me personally!