[python] Implement tag CRUD on FileSystemCatalog #7751
Merged
JingsongLi merged 2 commits into apache:master on May 1, 2026
PR apache#7746 added abstract tag stubs to ``Catalog`` and a complete ``RESTCatalog`` implementation, but left ``FileSystemCatalog`` inheriting NotImplementedError. The merge note said implementing filesystem tag CRUD required porting a Python TagManager. As it turns out, ``pypaimon/tag/tag_manager.py::TagManager`` and ``FileStoreTable.create_tag/delete_tag/list_tags/tag_manager()`` already exist on master, so this PR is a thin catalog-layer wrapper that delegates to those helpers and translates ``ValueError`` / return-value-False into the typed catalog exceptions used by the rest of the API.

Implementations:

- create_tag: delegates to ``table.create_tag``; ``ValueError`` carrying "already exists" is translated to ``TagAlreadyExistException`` (``ignore_if_exists`` is honored inside ``table.create_tag`` itself, so any conflict that bubbles up means the caller asked us to fail).
- delete_tag: delegates to ``table.delete_tag(tag_name) -> bool``; ``False`` (tag missing) is translated to ``TagNotExistException``.
- get_tag: delegates to ``table.tag_manager().get(tag_name)``; ``None`` is translated to ``TagNotExistException``. The result is wrapped in ``GetTagResponse`` with the tag's snapshot via ``tag.trim_to_snapshot()``.
- list_tags_paged: client-side prefix filter + lexicographic pagination (Python TagManager has no built-in paged listing).

``time_retained`` is intentionally rejected with NotImplementedError: the Python ``Tag`` dataclass currently inherits only ``Snapshot`` fields and has no place to store ``tag_create_time`` / ``tag_time_retained``. Silently dropping the option would lead callers to believe their TTL took effect; raising surfaces the gap explicitly. Extending ``Tag`` + ``TagManager.create_tag`` to carry these fields is tracked as a follow-up (and would need a coordinated schema change in the on-disk tag JSON format).

For the same reason, ``GetTagResponse.tag_create_time`` and ``tag_time_retained`` are returned as ``None`` from this catalog: neither value is tracked filesystem-side today.

Tests:

- pypaimon/tests/filesystem_catalog_tag_test.py (12 cases) mirrors the REST tag CRUD test matrix on the local filesystem path: create+get, snapshot_id selection, already-exists raises / ignore, table-not-exists, get/delete-not-exists, list_tags_paged with paging + prefix + empty table, plus ``time_retained`` raises NotImplementedError.
- pypaimon/tests/rest/rest_tag_test.py: removed the now-stale ``FilesystemCatalogTagInheritsNotImplementedTest`` class. Its assertions ("FileSystemCatalog raises NotImplementedError on tag methods") no longer hold; the new behavior is covered by the new filesystem test file. Cleaned up unused imports left behind.
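The delegation-and-translation pattern described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the merged code: `FakeTable` and `FakeTagManager` are hypothetical in-memory stand-ins for `FileStoreTable` and `TagManager`, the two exception classes are local stand-ins for the typed catalog exceptions, and `get_tag` returns the stored snapshot id directly instead of wrapping it in `GetTagResponse`.

```python
from typing import Dict, Optional


class TagAlreadyExistException(Exception):
    """Local stand-in for the typed catalog exception named in the PR."""


class TagNotExistException(Exception):
    """Local stand-in for the typed catalog exception named in the PR."""


class FakeTagManager:
    """Hypothetical stand-in: get() returns None when the tag is missing."""

    def __init__(self, tags: Dict[str, int]) -> None:
        self._tags = tags

    def get(self, tag_name: str) -> Optional[int]:
        return self._tags.get(tag_name)


class FakeTable:
    """Hypothetical in-memory stand-in for FileStoreTable (not pypaimon code)."""

    def __init__(self) -> None:
        self._tags: Dict[str, int] = {}

    def create_tag(self, tag_name, snapshot_id=None, ignore_if_exists=False):
        if tag_name in self._tags:
            if ignore_if_exists:
                return
            raise ValueError(f"Tag '{tag_name}' already exists.")
        # Pretend snapshot 1 is the latest when no snapshot_id is given.
        self._tags[tag_name] = snapshot_id if snapshot_id is not None else 1

    def delete_tag(self, tag_name) -> bool:
        if tag_name in self._tags:
            del self._tags[tag_name]
            return True
        return False

    def tag_manager(self) -> FakeTagManager:
        return FakeTagManager(self._tags)


def create_tag(table, tag_name, snapshot_id=None,
               ignore_if_exists=False, time_retained=None):
    # Reject TTL loudly instead of silently dropping it.
    if time_retained is not None:
        raise NotImplementedError("time_retained is not supported in this sketch")
    try:
        table.create_tag(tag_name, snapshot_id, ignore_if_exists)
    except ValueError as e:
        # Only the "already exists" conflict is translated; anything else propagates.
        if "already exists" in str(e):
            raise TagAlreadyExistException(tag_name) from e
        raise


def delete_tag(table, tag_name):
    # A False return value means the tag was missing.
    if not table.delete_tag(tag_name):
        raise TagNotExistException(tag_name)


def get_tag(table, tag_name):
    tag = table.tag_manager().get(tag_name)
    if tag is None:
        raise TagNotExistException(tag_name)
    return tag
```

Checking the `ValueError` message keeps unrelated validation errors from being reported as tag conflicts, at the cost of coupling the wrapper to the helper's wording.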
Self-review caught a residual `unittest.main()` block whose import was removed alongside the `FilesystemCatalogTagInheritsNotImplementedTest` class in the previous commit. Running the file directly with `python pypaimon/tests/rest/rest_tag_test.py` raised `NameError`. pytest didn't catch this (it doesn't go through `__main__`), but reviewers commonly run test files directly. Re-add the import.
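The failure mode above can be reproduced in miniature. The sketch below is illustrative only: `MODULE_SOURCE` is a hypothetical stand-in for a test file that lost its `import unittest`, and `run_module` is a helper invented here to execute that source under a chosen `__name__`.

```python
from typing import Optional

# Hypothetical miniature of a test module whose `import unittest` was removed
# while the `unittest.main()` guard stayed behind.
MODULE_SOURCE = '''
def test_placeholder():
    assert True

if __name__ == "__main__":
    unittest.main()
'''


def run_module(source: str, module_name: str) -> Optional[str]:
    """Execute `source` as if it were a module named `module_name`.

    Returns the exception class name if execution fails, else None.
    """
    namespace = {"__name__": module_name}
    try:
        exec(compile(source, "<sketch>", "exec"), namespace)
    except Exception as exc:
        return type(exc).__name__
    return None
```

With an import-style name such as `"rest_tag_test"` the guard is skipped and `run_module` returns `None`, which is why a pytest run stays green; with `"__main__"` the guard executes and the missing import surfaces as `"NameError"`.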
Contributor
+1
Purpose
PR #7746 added abstract tag stubs to `Catalog` and a complete `RESTCatalog` implementation, but left `FileSystemCatalog` inheriting `NotImplementedError`. The merge note said implementing filesystem tag CRUD required porting a Python `TagManager`. As it turns out, `pypaimon/tag/tag_manager.py::TagManager` and `FileStoreTable.create_tag/delete_tag/list_tags/tag_manager()` already exist on master, so this PR is a thin catalog-layer wrapper that delegates to those helpers and translates `ValueError` / return-value-`False` into the typed catalog exceptions used by the rest of the API.
Linked issue
Follow-up to #7746 — closes the FileSystemCatalog gap noted in that PR's "Out of scope" section.
Tests
- `pypaimon/tests/filesystem_catalog_tag_test.py` (12 new cases): mirrors `RESTCatalogTagCRUDTest` on the local filesystem path:
  - `test_create_and_get_tag`, `test_create_tag_with_snapshot_id`
  - `test_create_tag_already_exists_raises`, `test_create_tag_already_exists_ignore`
  - `test_create_tag_with_time_retained_raises_not_implemented` (filesystem-only)
  - `test_create_tag_table_not_exists`
  - `test_get_tag_not_exists`
  - `test_delete_tag_happy`, `test_delete_tag_not_exists`
  - `test_list_tags_paged_basic` (paging via `max_results` + `next_page_token`)
  - `test_list_tags_paged_with_prefix`, `test_list_tags_paged_empty_table`
- `pypaimon/tests/rest/rest_tag_test.py`: removed the now-stale `FilesystemCatalogTagInheritsNotImplementedTest` class (its assertions no longer hold; new behavior is covered by the new file). Cleaned up unused imports left behind.
- `pypaimon/tests/api/test_tag_dto_serde.py` (8 cases) and the REST tag test matrix (11 cases) all still pass.
- `flake8 --config=dev/cfg.ini`.
API and format
No new public Catalog method or wire format. This PR only overrides the four existing abstract stubs in `FileSystemCatalog`:
- `create_tag`: `table.create_tag(tag_name, snapshot_id, ignore_if_exists)`. `ValueError("...already exists.")` → `TagAlreadyExistException` (only when `not ignore_if_exists`).
- `delete_tag`: `table.delete_tag(tag_name) -> bool`. `False` → `TagNotExistException`.
- `get_tag`: `table.tag_manager().get(tag_name) -> Optional[Tag]`. `None` → `TagNotExistException`. Result wrapped in `GetTagResponse` with `snapshot=tag.trim_to_snapshot()`.
- `list_tags_paged`: `table.list_tags() -> List[str]`. Client-side lexicographic sort + prefix filter + `page_token` continuation cursor (no Java equivalent of paged listing exists in `TagManager` itself).
Documentation
n/a — this is a parity fix for an existing public API.
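For readers who want a concrete picture of the client-side paging that `list_tags_paged` is described as doing, a rough sketch follows. It is not the merged implementation: the function name `paginate_tags` and the `prefix` parameter name are hypothetical, and the token semantics (the token is the last tag name of the previous page, and the next page starts strictly after it) are an assumption made for illustration.

```python
from bisect import bisect_right
from typing import Iterable, List, Optional, Tuple


def paginate_tags(
    tag_names: Iterable[str],
    max_results: Optional[int] = None,
    page_token: Optional[str] = None,
    prefix: Optional[str] = None,
) -> Tuple[List[str], Optional[str]]:
    """Return (page, next_page_token) over a lexicographically sorted listing."""
    # Prefix filter first, then lexicographic sort.
    names = sorted(n for n in tag_names if prefix is None or n.startswith(prefix))

    # Assumed cursor semantics: resume strictly after the token value.
    start = bisect_right(names, page_token) if page_token is not None else 0

    # No positive limit means "return everything that is left".
    if max_results is None or max_results <= 0:
        return names[start:], None

    page = names[start:start + max_results]
    has_more = start + max_results < len(names)
    next_token = page[-1] if has_more and page else None
    return page, next_token
```

Using the last returned name as the cursor keeps paging stable if tags are added or removed between calls, which an integer offset would not.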
Out of scope (separate PRs)
- `time_retained` real support: Python's `Tag` dataclass currently inherits only `Snapshot` fields and has no place to store `tag_create_time` / `tag_time_retained`. This PR rejects `time_retained != None` with `NotImplementedError` rather than silently dropping it (which would mislead callers about TTL effects). Extending `Tag` + `TagManager.create_tag` to carry these fields requires a coordinated change to the on-disk tag JSON format and is tracked as a follow-up.
- `GetTagResponse.tag_create_time`: for the same reason, this catalog returns `None` for `tag_create_time` and `tag_time_retained`. Cross-`FileIO` mtime extraction is not reliable (different `FileIO` implementations expose different metadata).
- `rename_branch` stub and `BranchAlreadyExistException` handling introduced by [python] Add branch CRUD REST implementation to RESTCatalog #7747 (still under review). The sister PR for FileSystemCatalog branch CRUD will follow once #7747 lands.
Verification
Generative AI disclosure
Implementation, tests, and PR description were drafted with the assistance of Claude (Anthropic). Claude was supplied with the existing pypaimon helpers (`TagManager`, `FileStoreTable.create_tag/delete_tag/list_tags`, `RESTCatalog.create_tag`'s exception mapping pattern from #7746) and the REST tag test matrix; the design plan and code were reviewed and integrated by the author.