Skip to content

Latest commit

 

History

History
103 lines (67 loc) · 3.59 KB

20240503_05.md

File metadata and controls

103 lines (67 loc) · 3.59 KB

DuckDB semi join , anti join 语法糖

作者

digoal

日期

2024-05-03

标签

PostgreSQL , PolarDB , DuckDB , join , semi join = in / exists , anti join = not in / not exists


背景

DuckDB的join支持非常丰富, 除了常用的inner join , outer join , 笛卡尔积等, 还支持 semi join , anti join , asof join , lateral join , positional join等.

参考文档:
https://duckdb.org/docs/sql/query_syntax/from

其中 asof join 通常用于解决业务无法满足等值精确JOIN匹配的数据关联问题、或者2张表的数据存在gap的问题, 例如IoT场景或日志场景.

《DuckDB ASOF JOIN 用法介绍》

《DuckDB 0.8.0 发布, 支持pivot语法, ASOF JOIN, 并行导入导出性能提升, 递归通配符解析文件, arrow 连接器等》

lateral join 则方便在join subquery里面取前面JOIN表的内容进行匹配或过滤的需求.

《PostgreSQL SELECT 的高级用法(CTE, LATERAL, ORDINALITY, WINDOW, SKIP LOCKED, DISTINCT, GROUPING SETS, ...) - 珍藏级》

《PostgreSQL 9.3 add LATERAL support - LATERAL的语法和用法介绍》

本文主要分享semi join 和 anti join的用法

1、SEMI JOIN

-- return a list of cars that have a valid region.  
SELECT cars.name, cars.manufacturer  
FROM cars  
SEMI JOIN region  
       ON cars.region = region.id;  

相当于

SELECT cars.name, cars.manufacturer  
FROM cars  
where cars.region in (select region.id from region)  

SELECT cars.name, cars.manufacturer  
FROM cars  
where exists (select 1 from region where cars.region = region.id)  

2、ANTI JOIN

-- return a list of cars with no recorded safety data.  
SELECT cars.name, cars.manufacturer  
FROM cars  
ANTI JOIN safety_data  
       ON cars.safety_report_id = safety_data.report_id;  

相当于

SELECT cars.name, cars.manufacturer  
FROM cars  
where cars.safety_report_id not in (select safety_data.report_id from safety_data)  

SELECT cars.name, cars.manufacturer  
FROM cars   
where not exists (select 1 from safety_data where cars.safety_report_id = safety_data.report_id)  

digoal's wechat