base repository: apache/iceberg-python
base: main
head repository: apache/iceberg-python
compare: fd-infer-types
- 9 commits
- 6 files changed
- 1 contributor
Commits on Feb 16, 2025
-
Arrow: Infer the types when reading
When reading a Parquet file using PyArrow, metadata stored in the Parquet file determines whether a column is read as a large type (e.g. `large_string`) or a normal type (`string`). The difference is that the large types use 64-bit offsets to encode their arrays. This is not always needed; we could first check the types with which the data is stored and let PyArrow decide here: https://github.com/apache/iceberg-python/blob/300b8405a0fe7d0111321e5644d704026af9266b/pyiceberg/io/pyarrow.py#L1579
In PyArrow today we just bump everything to a large type, which might lead to additional memory consumption because it allocates an int64 array for the offsets instead of an int32 array.
I thought we would be good to go with this now that the lower bound of PyArrow has been raised to 17, but it looks like we still have to wait for Arrow 18 to fix the issue with the `date` types: apache/arrow#43183
Fixes: #1049
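To make the offset-width point concrete, here is a minimal standard-library sketch of the offsets buffer that an Arrow-style variable-length string array carries: one offset per value plus one, stored as int32 for normal types and int64 for large types. The helper `offsets_buffer` and the sample values are hypothetical and for illustration only; this is not code from the PR and does not call PyArrow.

```python
import struct


def offsets_buffer(values, wide):
    """Build an Arrow-style offsets buffer for variable-length UTF-8 values.

    Hypothetical helper: `wide=True` mimics large types (int64 offsets),
    `wide=False` mimics normal types (int32 offsets).
    """
    fmt = "<q" if wide else "<i"  # little-endian int64 vs int32
    offsets = [0]
    for value in values:
        # Each offset marks where the next value ends in the data buffer.
        offsets.append(offsets[-1] + len(value.encode("utf-8")))
    return b"".join(struct.pack(fmt, offset) for offset in offsets)


values = ["iceberg", "arrow", "parquet"]
small = offsets_buffer(values, wide=False)
large = offsets_buffer(values, wide=True)

# n values need n + 1 offsets, so the int64 buffer is twice the size.
print(len(small), len(large))
```

The character data is the same size in both cases; only the offsets buffer doubles, which matters most for columns with many short values.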
Commit: fa9b3ca
Commits on Feb 18, 2025
- 0384b4e
- 6dd9308
- 2817c61
Commits on Mar 4, 2025
- d6fbca9
- fff7414
- 0d19987
- 7382112
Commits on Mar 26, 2025
- 6526cc2
The comparison could not be rendered here. To see it on your machine, run this command locally:
git diff main...fd-infer-types