Skip to content

Base implementation of Parquet file writing#2583

Merged
VeckoTheGecko merged 35 commits intoParcels-code:mainfrom
VeckoTheGecko:push-zlxyoyvlpoqm
Apr 24, 2026
Merged

Base implementation of Parquet file writing#2583
VeckoTheGecko merged 35 commits intoParcels-code:mainfrom
VeckoTheGecko:push-zlxyoyvlpoqm

Conversation

@VeckoTheGecko
Copy link
Copy Markdown
Contributor

@VeckoTheGecko VeckoTheGecko commented Apr 20, 2026

Description

This PR introduces Parquet file writing to Parcels.

I still need to work on:

  • How to work with cftime output in the Parquet (how does this work with our internal model of time in Parcels? How should it work?)
  • Reviewing the test_particlefile.py file - are there tests that are no longer needed? What would be the best testing approach here?
    • Will leave for a future PR
  • Update documentation
    • Will leave for a future PR

Posting as draft for initial feedback

Checklist

AI Disclosure

  • This PR contains AI-generated content.
    • I have tested any AI-generated content in my PR.
    • I take responsibility for any AI-generated content in my PR.
    • Describe how you used it (e.g., by pasting your prompt): Just to help with learning the PyArrow API (i.e., used it to create an example script - which I then used as an entry to exploring the docs on pyarrow)

@VeckoTheGecko VeckoTheGecko changed the title Add parquet file writing Base implementation of Parquet file writing Apr 20, 2026
@VeckoTheGecko
Copy link
Copy Markdown
Contributor Author

@erikvansebille let's table some of these questions for our meeting tomorrow (mainly around datetime serialization in the Parquet file)

@VeckoTheGecko VeckoTheGecko marked this pull request as ready for review April 23, 2026 12:48
@VeckoTheGecko
Copy link
Copy Markdown
Contributor Author

@erikvansebille I moved read_particlefile to the global scope (i.e.,parcels.read_particlefile()) as well as added a docstring.

@VeckoTheGecko
Copy link
Copy Markdown
Contributor Author

(we'll likely get doc failures on this PR - which is to be expected since they haven't been updated)

Copy link
Copy Markdown
Member

@erikvansebille erikvansebille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work. And as I'm going through updating all the documentation, it seems to work smoothly too!

A few small comments below

Comment thread src/parcels/_core/particlefile.py
Comment thread src/parcels/_core/particlefile.py
Comment thread src/parcels/_core/particlefile.py
Comment thread src/parcels/_core/particlefile.py Outdated
Comment thread src/parcels/_core/particlefile.py Outdated
Comment thread src/parcels/_core/particlefile.py Outdated
Comment thread src/parcels/_core/particlefile.py Outdated
Comment thread tests-v3/test_advection.py
Comment thread tests/test_particlefile.py Outdated
Comment thread tests/test_particlefile.py Outdated
@@ -187,7 +154,7 @@ def test_variable_written_once():
@pytest.mark.skip(reason="Pending ParticleFile refactor; see issue #2386")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this test work again? If not out of the box, perhaps we should discuss what we actually expect from output of a looped pset.execute...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was introduced in 4b28ddb . I'm ok cutting tests that are no longer valuable (/tested specific implementation bugs from the old codebase). I don't really understand this test - do you think its still valuable?

perhaps we should discuss what we actually expect from output of a looped pset.execute...

Agreed. Should we open a tracking issue? Whatever we decide on that front we can add separate tests

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it is a valuable test (as it tests what happens in the particle file when pset.execute() is run in a for-loop). But perhaps better for a later PR

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And perhaps rename from test_pset_repeated_release_delayed_adding_deleting - since I find that quite confusing

Comment thread src/parcels/_core/particle.py
Comment thread src/parcels/_core/particlefile.py
@VeckoTheGecko VeckoTheGecko merged commit 439ea29 into Parcels-code:main Apr 24, 2026
14 of 15 checks passed
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Parcels development Apr 24, 2026
@VeckoTheGecko VeckoTheGecko deleted the push-zlxyoyvlpoqm branch April 24, 2026 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Consistent time handling

2 participants