The main reason we need to do this is that the columns will not always be on disk (right now the new scan node fails in this case). Skipping the load of these columns is also a performance enhancement. The solution will, I suspect, also lay the groundwork for adding support for the augmented columns (filename, batch index, file index).
Component(s)
C++
…stead of fragment (#15129)
If a fragment has a guarantee like `x == 5` then we don't need to load the column `x` from disk and can instead just use the scalar `5`. This is not just a performance improvement: in many cases, users create partitioned datasets without storing the partition value as a separate column (e.g. the file `my_dataset/x=5/foo.parquet` will not have a column named `x`), so the guarantee is the only source for that value.
* Closes: #15059
Authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
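
The substitution described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual Arrow implementation: `ScanPlan`, `PlanScan`, and `MaterializeColumn` are hypothetical names, and guarantees are assumed to be simple equality constraints on integer columns.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration only; these are not Arrow's API.
struct ScanPlan {
  // Columns that must actually be read from the file.
  std::vector<std::string> columns_to_load;
  // Columns whose value is already fixed by a guarantee such as x == 5.
  std::map<std::string, int64_t> columns_from_guarantee;
};

// Split the requested columns into those read from disk and those that
// can be filled in from an equality guarantee without touching the file.
ScanPlan PlanScan(const std::vector<std::string>& requested,
                  const std::map<std::string, int64_t>& guarantees) {
  ScanPlan plan;
  for (const auto& name : requested) {
    auto it = guarantees.find(name);
    if (it != guarantees.end()) {
      plan.columns_from_guarantee[name] = it->second;
    } else {
      plan.columns_to_load.push_back(name);
    }
  }
  return plan;
}

// Materialize a guaranteed column by repeating the scalar once per row.
std::vector<int64_t> MaterializeColumn(int64_t value, std::size_t num_rows) {
  return std::vector<int64_t>(num_rows, value);
}
```

Under these assumptions, a scan of `my_dataset/x=5/foo.parquet` that requests `x` and `y` would read only `y` from the file and fill `x` with the scalar `5`, so it works even though the file has no column named `x`.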