PostgreSQL parser hook, Extensibility of the PostgreSQL wire protocol 还远吗? - 用户可以自己开发插件, mysql, oracle, sql server, sybase wire protocol可以兼容, 语法也可以兼容
digoal
2021-02-24
PostgreSQL , hook , 兼容性 , 协议 , parser
On December 1st, 2020, Amazon AWS announced Babelfish. Babelfish “adds an endpoint to PostgreSQL that understands the SQL Server wire protocol Tabular Data Stream (TDS), as well as commonly used T-SQL commands used by SQL Server. Support for T-SQL includes elements such as the SQL dialect, cursors, catalog views, data types, triggers, stored procedures, and functions”. Wow. SQL Server wire and application compatibility for PostgreSQL!
Agreed on adding substantial hooks if they're not likely to be used. While
I haven't yet seen AWS' implementation or concrete proposal, given the
people involved, I assume it's fairly similar to how I implemented it.
Assuming that's correct and it doesn't require substantial redevelopment,
I'd certainly open-source my MySQL-compatible protocol and parser
implementation. From my perspective, it would be awesome if these could be
done as extensions.
While I'm not planning to open source it as of yet, for my
Oracle-compatible stuff, I don't think I'd be able to do anything other
than the protocol as an extension given the core-related changes similar to
what EDB has to do. I don't think there's any easy way to get around that.
But, for the protocol and any type of simple translation to Postgres'
dialect, I think that could easily be hook-based.
--
Jonah H. Harris
听起来很好, 但是社区好像还是很强硬的反对, 例如tom lane, 为什么要支持parser hook, 为什么要支持wire protocol extension
Yeah, and as I pointed out somewhere upthread, trying to replace the
whole parser will just end in a maintenance nightmare. The constructs
that the parser has to emit are complex, Postgres-specific, and
constantly evolving. We are NOT going to promise any sort of cross
version compatibility for parse trees.
regards, tom lane
Absolutely agreed. We cannot promise that the parsetree generated in one version will work with the planner, optimizer and executor of the next. These types of projects will need to pay close attention and more importantly, develop their own regression test suites that detect when something has changed in core. That said, discussion about the parser hook should happen in the other thread.
I don't even expect that we can guarantee that the functions I am trying to allow to be redirected for the wire protocol will be stable forever. libpq V4 may need to change some of the call signatures, which has happened before. For example, the function to send the command completion message to the frontend (tcop/dest.c EndCommand()) changed from 12 to 13.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
也有提出建议中间层来实现, 例如heikki
I can see value in supporting different protocols. I don't like the
approach discussed in this thread, however.
For example, there has been discussion elsewhere about integrating
connection pooling into the server itself. For that, you want to have a
custom process that listens for incoming connections, and launches
backends independently of the incoming connections. These hooks would
not help with that.
Similarly, if you want to integrate a web server into the database
server, you probably also want some kind of connection pooling. A
one-to-one relationship between HTTP connections and backend processes
doesn't seem nice.
With the hooks that exist today, would it possible to write a background
worker that listens on a port, instead of postmaster? Can you launch
backends from a background worker? And communicate the backend processes
using a shared memory message queue (see pqmq.c).
I would recommend this approach: write a separate program that sits
between the client and PostgreSQL, speaking custom protocol to the
client, and libpq to the backend. And then move that program into a
background worker process.
> In a recent case, letting the browser talk directly to the database
> allowed me to get rid of a ~100k-sloc .net backend and all the
> complexity and infrastructure that goes with
> coding/testing/deploying/maintaining it, while keeping all the
> positives: per-query compression/data conversion, querying multiple
> databases over a single connection, session cookies, etc. Deployment
> is trivial compared to what was before. Latency is down 2x-4x across
> the board.
Querying multiple databases over a single connection is not possible
with the approach taken here. Not sure about the others things you listed.
- Heikki
https://postgresql.fund/blog/babelfish-the-elephant-in-the-room/
https://www.mail-archive.com/pgsql-hackers@lists.postgresql.org/msg81639.html
https://www.mail-archive.com/pgsql-hackers@lists.postgresql.org/msg81649.html
您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.