Full join on dataframe with only index yields dropped rows #1305

**Describe the bug**
If I do a full join between a dataframe with content and one consisting only of an index column, the rows from the index-only dataframe get dropped.

**To Reproduce**
See the commented-out additional `empty` column. When that column is included, the expected rows appear in the final dataframe.
```python
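import datafusion as dfn  # assumed import; the original snippet only shows the `dfn` alias

ctx = dfn.SessionContext()

# Right side of the join: a key column plus one data column.
key_frame = ctx.from_pydict(
    {
        "log_time": [1, 3, 5, 7, 9, 11, 13, 15, 17, 19],
        "key_frame": [True, True, True, True, True, True, True, True, True, True],
    }
)

# Left side of the join: only the key column.
query_times = ctx.from_pydict(
    {
        "log_time": [2, 4, 6, 8, 10],
        # Uncommenting this extra column makes the rows show up in the result.
        # "empty": [0, 0, 0, 0, 0],
    }
)

print(key_frame)
print(query_times)
merged = query_times.join(key_frame, left_on="log_time", right_on="log_time", how="full")
print(merged)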
```

```console
DataFrame()
+----------+-----------+
| log_time | key_frame |
+----------+-----------+
| 1        | true      |
| 3        | true      |
| 5        | true      |
| 7        | true      |
| 9        | true      |
| 11       | true      |
| 13       | true      |
| 15       | true      |
| 17       | true      |
| 19       | true      |
+----------+-----------+
DataFrame()
+----------+
| log_time |
+----------+
| 2        |
| 4        |
| 6        |
| 8        |
| 10       |
+----------+
DataFrame()
+----------+----------+-----------+
| log_time | log_time | key_frame |
+----------+----------+-----------+
|          | 1        | true      |
|          | 3        | true      |
|          | 5        | true      |
|          | 7        | true      |
|          | 9        | true      |
|          | 11       | true      |
|          | 13       | true      |
|          | 15       | true      |
|          | 17       | true      |
|          | 19       | true      |
+----------+----------+-----------+
```

**Expected behavior**
When doing a full join, I expect to get back all rows from both dataframes, effectively merging them.
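
To make the expected semantics concrete, here is a minimal pure-Python sketch of a full outer join on a single key. It is only a reference model, not how DataFusion implements joins. The helper name `full_outer_join` is hypothetical, rows are modeled as plain dicts, the key is merged into one column (as pandas does), null keys are not handled, and the two sides are assumed to share no column names other than the key.

```python
def full_outer_join(left, right, key):
    """Reference full outer join of two lists of row dicts on one key column.

    Hypothetical helper for illustration only. Assumes non-null keys and
    no overlapping column names other than `key`.
    """
    left_cols = sorted({c for row in left for c in row} - {key})
    right_cols = sorted({c for row in right for c in row} - {key})

    # Index the right side by key so each left row can find its matches.
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    result = []
    matched_keys = set()
    for lrow in left:
        matches = right_by_key.get(lrow[key])
        if matches:
            matched_keys.add(lrow[key])
            for rrow in matches:
                result.append({**lrow, **rrow})
        else:
            # Left-only row: keep it and fill the right columns with None.
            result.append({**lrow, **{c: None for c in right_cols}})

    # Right-only rows: keep them and fill the left columns with None.
    for rrow in right:
        if rrow[key] not in matched_keys:
            result.append({**{c: None for c in left_cols}, **rrow})

    return sorted(result, key=lambda r: r[key])


query_times = [{"log_time": t} for t in [2, 4, 6, 8, 10]]
key_frame = [{"log_time": t, "key_frame": True} for t in range(1, 20, 2)]

merged = full_outer_join(query_times, key_frame, "log_time")
for row in merged:
    print(row)
```

With the data from this report, the sketch keeps all 15 rows: the 5 left-only rows have `key_frame` set to `None`, and the 10 right-only rows are unchanged. The point is that a left side with only a key column should still contribute its rows.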

Here is a roughly equivalent example in pandas:
```python
import pandas as pd

key_frame_df = pd.DataFrame({
    "log_time": [1, 3, 5, 7, 9, 11, 13, 15, 17, 19],
    "key_frame": [True, True, True, True, True, True, True, True, True, True],
})

query_times_df = pd.DataFrame({
    "log_time": [2, 4, 6, 8, 10],
    # "empty": [0, 0, 0, 0, 0]  # commented out like in original
})

# Perform full outer join (equivalent to DataFusion's "full" join)
merged_df = pd.merge(query_times_df, key_frame_df, on="log_time", how="outer")
print("\nMerged DataFrame (full outer join):")
print(merged_df)
```

```console
Merged DataFrame (full outer join):
    log_time key_frame
0          1      True
1          2       NaN
2          3      True
3          4       NaN
4          5      True
5          6       NaN
6          7      True
7          8       NaN
8          9      True
9         10       NaN
10        11      True
11        13      True
12        15      True
13        17      True
14        19      True
```

Actually, the behavior in pyarrow may be a more direct comparison:

```python
import pyarrow as pa

# Reuses ctx, key_frame and query_times from the reproduction above.
key_table = pa.table(key_frame)
query_table = pa.table(query_times)
merged_table = query_table.join(key_table, keys="log_time", join_type="full outer")
print(ctx.from_arrow(merged_table))
```
