After hash join, the size of the generated table is three times that of vanilla Spark

### Backend

VL (Velox)

### Bug description

Test SQL:

Vanilla Spark:
drop table tmpxx purge;
create table tmpxx using orc as select * from store_sales s1 left outer join store_returns s2  on sr_item_sk = ss_item_sk where s1.ss_sold_date_sk>2452630 and s2.sr_returned_date_sk>2452790;

Gluten:
drop table tmpyy purge;
create table tmpyy using orc as select * from store_sales s1 left outer join store_returns s2  on sr_item_sk = ss_item_sk where s1.ss_sold_date_sk>2452630 and s2.sr_returned_date_sk>2452790;

Result (table size, size including replication, path):
91.3 G  273.8 G  hdfs://tpc/warehouse/tablespace/managed/hive/tpcds_10000_orc.db/tmpxx
575.2 G  1.7 T  hdfs://tpc/warehouse/tablespace/managed/hive/tpcds_10000_orc.db/tmpyy

Testing shows that enabling or disabling native write does not affect the result. When no hash join occurs and the data is inserted directly, the data volume shows little difference between vanilla Spark and Gluten.
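
A possible next step, not part of the original test: comparing the row counts of the two tables would show whether the join under Gluten produces additional rows, or whether the same rows are simply stored less compactly. The sketch below assumes both `tmpxx` and `tmpyy` from the test above still exist. The `EXPLAIN` statement would need to be run once in each environment to see which join operator is actually chosen.

```sql
-- Compare row counts of the table written by vanilla Spark (tmpxx)
-- and the table written by Gluten (tmpyy).
SELECT 'tmpxx' AS table_name, count(*) AS row_count FROM tmpxx
UNION ALL
SELECT 'tmpyy' AS table_name, count(*) AS row_count FROM tmpyy;

-- Show the physical plan of the query used in the test,
-- so the join operator picked by each engine can be compared.
EXPLAIN FORMATTED
SELECT *
FROM store_sales s1
LEFT OUTER JOIN store_returns s2
  ON sr_item_sk = ss_item_sk
WHERE s1.ss_sold_date_sk > 2452630
  AND s2.sr_returned_date_sk > 2452790;
```

If the counts match, the size difference would more likely come from how the rows are ordered or encoded when written than from the join result itself. This is an assumption to verify, not a confirmed cause.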

### Gluten version

Gluten-1.2

### Spark version

Spark-3.4.x

### Spark configurations

_No response_

### System information

_No response_

### Relevant logs

```bash

```

After hash join, the size of the generated table is three times that of vanilla Spark #10693