Skip to content

Latest commit

 

History

History
151 lines (87 loc) · 6.4 KB

20210224_05.md

File metadata and controls

151 lines (87 loc) · 6.4 KB

PostgreSQL parser hook, Extensibility of the PostgreSQL wire protocol 还远吗? - 用户可以自己开发插件, mysql, oracle, sql server, sybase wire protocol可以兼容, 语法也可以兼容

作者

digoal

日期

2021-02-24

标签

PostgreSQL , hook , 兼容性 , 协议 , parser


背景

On December 1st, 2020, Amazon AWS announced Babelfish. Babelfish “adds an endpoint to PostgreSQL that understands the SQL Server wire protocol Tabular Data Stream (TDS), as well as commonly used T-SQL commands used by SQL Server. Support for T-SQL includes elements such as the SQL dialect, cursors, catalog views, data types, triggers, stored procedures, and functions”. Wow. SQL Server wire and application compatibility for PostgreSQL!

Agreed on adding substantial hooks if they're not likely to be used. While
I haven't yet seen AWS' implementation or concrete proposal, given the
people involved, I assume it's fairly similar to how I implemented it.
Assuming that's correct and it doesn't require substantial redevelopment,
I'd certainly open-source my MySQL-compatible protocol and parser
implementation. From my perspective, it would be awesome if these could be
done as extensions.

While I'm not planning to open source it as of yet, for my
Oracle-compatible stuff, I don't think I'd be able to do anything other
than the protocol as an extension given the core-related changes similar to
what EDB has to do. I don't think there's any easy way to get around that.
But, for the protocol and any type of simple translation to Postgres'
dialect, I think that could easily be hook-based.

--
Jonah H. Harris

听起来很好, 但是社区好像还是很强硬的反对, 例如tom lane, 为什么要支持parser hook, 为什么要支持wire protocol extension

Yeah, and as I pointed out somewhere upthread, trying to replace the  
whole parser will just end in a maintenance nightmare.  The constructs  
that the parser has to emit are complex, Postgres-specific, and  
constantly evolving.  We are NOT going to promise any sort of cross  
version compatibility for parse trees.  
  
  
  
			regards, tom lane  
		  
Absolutely agreed. We cannot promise that the parsetree generated in one version will work with the planner, optimizer and executor of the next. These types of projects will need to pay close attention and more importantly, develop their own regression test suites that detect when something has changed in core. That said, discussion about the parser hook should happen in the other thread.

I don't even expect that we can guarantee that the functions I am trying to allow to be redirected for the wire protocol will be stable forever. libpq V4 may need to change some of the call signatures, which has happened before. For example, the function to send the command completion message to the frontend (tcop/dest.c EndCommand()) changed from 12 to 13.


Regards, Jan

--
Jan Wieck
Principle Database Engineer
Amazon Web Services

也有提出建议中间层来实现, 例如heikki

I can see value in supporting different protocols. I don't like the   
approach discussed in this thread, however.  
  
  
  
For example, there has been discussion elsewhere about integrating   
connection pooling into the server itself. For that, you want to have a   
custom process that listens for incoming connections, and launches   
backends independently of the incoming connections. These hooks would   
not help with that.  
  
  
  
Similarly, if you want to integrate a web server into the database   
server, you probably also want some kind of connection pooling. A   
one-to-one relationship between HTTP connections and backend processes   
doesn't seem nice.  
  
  
  
With the hooks that exist today, would it possible to write a background   
worker that listens on a port, instead of postmaster? Can you launch   
backends from a background worker? And communicate the backend processes   
using a shared memory message queue (see pqmq.c).  
  
  
  
I would recommend this approach: write a separate program that sits   
between the client and PostgreSQL, speaking custom protocol to the   
client, and libpq to the backend. And then move that program into a   
background worker process.  
  
  
  
> In a recent case, letting the browser talk directly to the database  
> allowed me to get rid of a ~100k-sloc .net backend and all the  
> complexity and infrastructure that goes with  
> coding/testing/deploying/maintaining it, while keeping all the  
> positives: per-query compression/data conversion, querying multiple  
> databases over a single connection, session cookies, etc. Deployment  
> is trivial compared to what was before. Latency is down 2x-4x across  
> the board.  
  
  
  
Querying multiple databases over a single connection is not possible   
with the approach taken here. Not sure about the others things you listed.  
  
  
  
- Heikki  

参考

https://postgresql.fund/blog/babelfish-the-elephant-in-the-room/

https://www.postgresql.org/message-id/flat/CAGBW59d5SjLyJLt-jwNv%2BoP6esbD8SCB%3D%3D%3D11WVe5%3DdOHLQ5wQ%40mail.gmail.com

https://www.mail-archive.com/pgsql-hackers@lists.postgresql.org/msg81639.html

https://www.mail-archive.com/pgsql-hackers@lists.postgresql.org/msg81649.html

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

digoal's wechat