
Iceberg BatchScan & SparkDistributedDataScan to support limit pushdown #13383

Open
@GPX99

Description

Feature Request / Improvement

Request to add limit pushdown to improve the performance of reading a large table by skipping the full batch scan (implemented in BatchScan and SparkDistributedDataScan).
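For context, Spark's DataSource V2 API already defines a hook for this: the SupportsPushDownLimit interface (Spark 3.3+). Below is a minimal sketch of how a scan builder could accept a pushed limit; the class name, the pushedLimit field, and the stub Scan are illustrative assumptions, not Iceberg's actual SparkScanBuilder code.

```java
// A minimal sketch, assuming Spark 3.3+ on the classpath. The class name,
// the pushedLimit field, and the stub Scan are hypothetical; Iceberg's real
// SparkScanBuilder would thread the limit into its task planning instead.
import org.apache.spark.sql.connector.read.Scan;
import org.apache.spark.sql.connector.read.ScanBuilder;
import org.apache.spark.sql.connector.read.SupportsPushDownLimit;
import org.apache.spark.sql.types.StructType;

class LimitAwareScanBuilder implements ScanBuilder, SupportsPushDownLimit {
  private int pushedLimit = -1; // -1 means no LIMIT was pushed down

  @Override
  public boolean pushLimit(int limit) {
    this.pushedLimit = limit;
    // Returning true accepts the limit; by default isPartiallyPushed()
    // is true, so Spark still applies LIMIT on top for correctness.
    return true;
  }

  @Override
  public Scan build() {
    int limit = pushedLimit;
    return new Scan() { // stub: real code would build the Iceberg scan
      @Override
      public StructType readSchema() {
        return new StructType(); // real code returns the table schema
      }

      @Override
      public String description() {
        return limit < 0 ? "scan" : "scan, pushed limit=" + limit;
      }
    };
  }
}
```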

How is this observed?
When running SELECT * FROM table_name LIMIT 1, Spark actually scans all the data in the table; the bigger the table, the longer the query takes.
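A minimal way to reproduce the observation, assuming a SparkSession already configured with the Iceberg Glue catalog (the table name is the example used in this report):

```java
// Minimal reproduction, assuming a SparkSession configured with the
// Iceberg Glue catalog; the table name is the example from this report.
import org.apache.spark.sql.SparkSession;

public class LimitScanRepro {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("limit-pushdown-repro")
        .getOrCreate();

    // The formatted plan shows the BatchScan node with no limit in its
    // pushed-down info; the Spark UI then shows the full table as input.
    spark.sql("SELECT * FROM glue_catalog.lakehouse_bronze.table_name LIMIT 1")
        .explain("formatted");
  }
}
```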

For example,

(1) BatchScan glue_catalog.lakehouse_bronze.table_name
Output [51]: [ISTEST#69, LEADUUID#70, UPDATEDAT#71, ...etc]
glue_catalog.lakehouse_bronze.table_name (branch=null) [filters=, groupedBy=] <-- no limit pushdown is applied

Hence, the input size is large:
[Screenshot: query input size metrics showing the full table scanned]
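One possible shape for the improvement, sketched against Iceberg's public Table API: stop planning file scan tasks once the record counts of the files already planned cover the limit. planTasksForLimit is a hypothetical helper, not proposed code; note that recordCount() is a file's total row count, so delete files could make the live-row count lower than what is planned here.

```java
// A hedged sketch of the idea, not proposed Iceberg code: stop planning
// file scan tasks once the files already planned cover the limit.
// planTasksForLimit is a hypothetical helper.
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.iceberg.FileScanTask;
import org.apache.iceberg.Table;
import org.apache.iceberg.io.CloseableIterable;

public class LimitPlanningSketch {
  static List<FileScanTask> planTasksForLimit(Table table, long limit) {
    List<FileScanTask> tasks = new ArrayList<>();
    long plannedRows = 0;
    try (CloseableIterable<FileScanTask> files = table.newScan().planFiles()) {
      for (FileScanTask task : files) {
        tasks.add(task);
        // recordCount() is the file's total rows; deletes may reduce
        // the live rows actually returned, so this is an upper bound.
        plannedRows += task.file().recordCount();
        if (plannedRows >= limit) {
          break; // enough rows planned; skip the rest of the table
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return tasks;
  }
}
```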

Query engine

Spark

Willingness to contribute

  • I can contribute this improvement/feature independently
  • I would be willing to contribute this improvement/feature with guidance from the Iceberg community
  • I cannot contribute this improvement/feature at this time

Labels: improvement (PR that improves existing functionality)