Reimplement parquet (de)serialization #232

@hombit

Description

Feature request

read_parquet

  • Automatically cast struct-list columns to nested. Introduce reject_nesting: bool | list[str] = False, which would exclude columns from being cast. Provide a nice error message if a struct-list column is not "nested", something like "ooh-ooh, please use npd.read_parquet(reject_nesting=["failed_column"]) instead".
  • Allow only engine="pyarrow"
  • Allow only dtype_backend="pyarrow"
  • Pack partially loaded struct-list columns to nested, e.g. loaded with columns=["lc.t", "lc.flux"].

For the last one, there is an important edge case (it exists in Rubin DP1): columns=["flux", "lc.flux"] fails with the current stable pandas. I think we should use pyarrow directly:

import pandas as pd
import pyarrow.parquet as pq

from nested_pandas import NestedDtype, NestedFrame

fname = ...
table = pq.read_pandas(fname, columns=[...], ...)
schema = pq.read_schema(fname)
# Figure out how to pack sub-columns back with schema and table
table = ...
nested_columns = [...]
nf = NestedFrame(table.to_pandas(types_mapper=lambda ty: NestedDtype(ty) if ty in nested_columns else pd.ArrowDtype(ty)))

to_parquet

  • use_nested_dtype: bool = False would cast NestedDtype columns to the corresponding Arrow-backed pandas type (pd.ArrowDtype) before saving.

Before submitting
Please check the following:

  • I have described the purpose of the suggested change, specifying what I need the enhancement to accomplish, i.e. what problem it solves.
  • I have included any relevant links, screenshots, environment information, and data relevant to implementing the requested feature, as well as pseudocode for how I want to access the new functionality.
  • If I have ideas for how the new feature could be implemented, I have provided explanations and/or pseudocode and/or task lists for the steps.

Labels: enhancement (New feature or request)