Skip to content

Conversation

@Sbargaoui
Copy link
Contributor

Fix AsyncFileSystem._rm_file() to support sync-style implementations

Summary

Fixes architectural inconsistency between sync and async file deletion APIs that prevents rm_file() from working on async filesystems that implement _rm() instead of _rm_file().

Problem

The async and sync APIs have inverted delegation patterns:

Sync (AbstractFileSystem):

rm_file(path) → _rm(path)  # subclasses implement _rm()

Async (AsyncFileSystem):

_rm(path) → _rm_file(path)  # but _rm_file() raises NotImplementedError

This breaks external async filesystems (like adlfs.AzureBlobFileSystem where this issue was encoutered) that follow the sync pattern and implement _rm() but not _rm_file():

  • fs.rm() works (calls the overridden _rm())
  • fs.rm_file() fails with NotImplementedError (calls _rm_file() which isn't implemented)

Root Cause

# fsspec/asyn.py
async def _rm_file(self, path, **kwargs):
    raise NotImplementedError  # Wrong extension point

async def _rm(self, path, recursive=False, batch_size=None, **kwargs):
    path = await self._expand_path(path, recursive=recursive)
    return await _run_coros_in_chunks(
        [self._rm_file(p, **kwargs) for p in reversed(path)],  # Calls unimplemented _rm_file
        batch_size=batch_size,
        nofiles=True,
    )

Proposal

Make _rm_file() detect if subclass implements _rm() and delegate to it:

# fsspec/asyn.py
async def _rm_file(self, path, **kwargs):
    if type(self)._rm is not AsyncFileSystem._rm:
        return await self._rm(path, recursive=False, batch_size=1, **kwargs)
    raise NotImplementedError

Changes

Modified Files

  1. fsspec/asyn.py

    • Added delegation logic to _rm_file()
  2. fsspec/tests/test_async.py

    • Added test_rm_file_with_rm_implementation(): Tests sync-style pattern (implements _rm())
    • Added test_rm_file_with_rm_file_implementation(): Tests async-style pattern (implements _rm_file())
    • Added test_rm_file_without_implementation(): Tests error handling

Open to all feedback and happy to iterate !

@martindurant
Copy link
Member

I see what you mean. Before checking what you have done, I will comment that the real difference between the two _rm definitions, is that the one in AsyncFileSystem is a normal method, but the one in AsyncFileSystem is a coroutine; so the test could be inspect.iscoroutinefunction(self._rm). That might be preferable for cases where a filesystem depends not on AsyncFileSystem but some descendent of it.

@Sbargaoui
Copy link
Contributor Author

Good point! I can update the check to use inspect.iscoroutinefunction(self._rm) combined with the identity check.

That way it'll cover subclasses with custom async def _rm() are detected (even through inheritance) and make sure that base implementation doesn't recurse into itself.

@Sbargaoui Sbargaoui force-pushed the fix/asyncfs-to-support-sync-style branch from a28b212 to 281d1aa Compare November 21, 2025 15:07
@martindurant martindurant merged commit 97c0a0f into fsspec:master Nov 25, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants