# Partitioning

Iceberg (and other lakehouses) don't provide the indexes you may be used to from a more traditional data warehouse, but they do provide partitioning, which serves a similar purpose.

Partitioning means organizing the files on disk so that rows with the same or nearby values are stored together. When a query filters on those values, the query engine can skip every file that cannot contain a match, so it often reads only a few files instead of all of them.
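To make the idea concrete, here is a minimal sketch in plain Python that mimics partition pruning. It is not how Iceberg is implemented: the rows, the choice of "month" as the partition value, and the helper names `write_partitioned` and `scan` are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (event_date, payload). The data is illustrative only.
rows = [
    ("2024-01-01", "a"),
    ("2024-01-15", "b"),
    ("2024-02-01", "c"),
    ("2024-02-20", "d"),
    ("2024-03-05", "e"),
]


def write_partitioned(rows):
    """Group rows into one 'file' per month, keyed by the partition value."""
    files = defaultdict(list)
    for event_date, payload in rows:
        month = event_date[:7]  # e.g. "2024-01"
        files[month].append((event_date, payload))
    return dict(files)


def scan(files, month):
    """Read only the files whose partition value matches the filter."""
    files_read = [key for key in files if key == month]
    result = [row for key in files_read for row in files[key]]
    return result, len(files_read)


files = write_partitioned(rows)
result, files_read = scan(files, "2024-02")
print(f"read {files_read} of {len(files)} files, got {len(result)} rows")
# read 1 of 3 files, got 2 rows
```

Because the rows were grouped by month when they were written, the filter on February only touches one of the three files.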

If you haven't noticed the theme yet, it's all about eliminating as much disk I/O as possible. The fewer files we have to scan, the more performant our query is!

Iceberg implements what it calls *Hidden Partitioning*. To understand what that means, let's digress a little into the past.

Hive implemented *Explicit Partitioning*, where the user needs to be aware of the partitioning and explicitly use it when reading and writing.

```{figure} images/hive_partitioning.png
:alt: Hive-style partitioning
:align: center
:figwidth: image

Hive-style partitioning
```
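As a rough sketch of what "explicit" means on disk: Hive-style tables store each partition in a directory named `column=value`. The snippet below builds and parses such a path in plain Python. The partition columns `year`, `month`, and `day`, the table root `warehouse/events`, and both helper functions are assumptions for illustration, not taken from the figure or from any Hive API.

```python
from datetime import date


def hive_partition_path(table_root, d):
    """Build a Hive-style path: one `column=value` directory per partition column."""
    # Assumed partition columns: year, month, day.
    return f"{table_root}/year={d.year}/month={d.month}/day={d.day}"


def parse_partition_values(path):
    """Recover the partition column values from the directory names."""
    parts = [p for p in path.split("/") if "=" in p]
    return {key: int(value) for key, value in (p.split("=", 1) for p in parts)}


path = hive_partition_path("warehouse/events", date(2024, 3, 1))
print(path)
# warehouse/events/year=2024/month=3/day=1
print(parse_partition_values(path))
# {'year': 2024, 'month': 3, 'day': 1}
```

The partition values live in the directory names, so they behave like extra columns that writers must populate and readers must filter on by name.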

The main issue with Hive-style partitioning is that it is explicit.

If I only want to see the 1st day of each month in this example data, I might write the following SQL:

```sql
SELECT * FROM table WHERE day=1
```

This query would not use the index, as 