Converting from kerchunk to CFA: fix or document behaviour #945

@sadielbartholomew

Description

Regarding our support for read/write formats: yesterday we realised that going from kerchunk (newly supported in v3.20!) to CFA, i.e. reading kerchunk and attempting to write it out as CFA, may not work in principle. Indeed, when testing this I get a TypeError: expected string or bytes-like object, got 'FSMap' error. This can be reproduced, for example, by adding a few lines to a relevant test to request a CFA-format write of the read-in kerchunk, namely:

diff --git a/cf/test/test_kerchunk.py b/cf/test/test_kerchunk.py
index b37d09718d..98a9b89968 100644
--- a/cf/test/test_kerchunk.py
+++ b/cf/test/test_kerchunk.py
@@ -75,7 +75,10 @@ class read_writeTest(unittest.TestCase):
 
         fs = fsspec.filesystem("reference", fo=d)
         kerchunk = fs.get_mapper()
-        self.assertEqual(len(cf.read(kerchunk)), 1)
+        k = cf.read(kerchunk)
+        self.assertEqual(len(k), 1)
+        print("Attempting kerchunk -> CFA")
+        cf.write(k, "kerchunk_to_cfa.cfa", cfa="field")
 
     def test_read_bytes(self):
         """Test cf.read with a Kerchunk raw bytes representation."""

I see:

======================================================================
ERROR: test_read_dict (__main__.read_writeTest.test_read_dict)
Test cf.read with an Kerchunk dictionary.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/slb93/git-repos/cf-python/cf/test/test_kerchunk.py", line 81, in test_read_dict
    cf.write(k, "kerchunk_to_cfa.cfa", cfa="field")
  File "/home/slb93/git-repos/cfdm/cfdm/read_write/write.py", line 936, in __new__
    netcdf.write(
  File "/home/slb93/git-repos/cfdm/cfdm/decorators.py", line 171, in verbose_override_wrapper
    return method_with_verbose_kwarg(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slb93/git-repos/cfdm/cfdm/read_write/netcdf/netcdfwrite.py", line 5952, in write
    self._file_io_iteration(
  File "/home/slb93/git-repos/cfdm/cfdm/read_write/netcdf/netcdfwrite.py", line 6230, in _file_io_iteration
    self._write_field_or_domain(f)
  File "/home/slb93/git-repos/cfdm/cfdm/read_write/netcdf/netcdfwrite.py", line 4746, in _write_field_or_domain
    self._write_netcdf_variable(
  File "/home/slb93/git-repos/cfdm/cfdm/read_write/netcdf/netcdfwrite.py", line 3310, in _write_netcdf_variable
    cfa = self._cfa_fragment_array_variables(data, cfvar)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slb93/git-repos/cfdm/cfdm/read_write/netcdf/netcdfwrite.py", line 7014, in _cfa_fragment_array_variables
    uri = urisplit(dataset_name)
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slb93/miniconda3/envs/cf-env-312-numpy2/lib/python3.12/site-packages/uritools/__init__.py", line 545, in urisplit
    return result(*result.RE.match(uristring).groups())
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or bytes-like object, got 'FSMap'

So there's at least an immediate logical issue to work through relating to type expectations for this case: urisplit is passed the FSMap mapper as the dataset name, where it expects a string. Beyond that, we should work out whether the case can be supported and, if so, how much further work/logic is needed. If it is not feasible to support soon, we should document that this case doesn't (yet) work.
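
To illustrate the type mismatch in isolation, here is a minimal sketch using only the standard library. It is not the cfdm code: FakeFSMap is a stand-in for the fsspec mapper, URI_RE is a simplified pattern rather than the one uritools uses, and split_scheme is a hypothetical helper showing how a guard could turn the low-level TypeError into a clearer message if we decide to document the case as unsupported.

```python
import re


class FakeFSMap(dict):
    """Stand-in (assumption) for the mapper returned by fs.get_mapper()."""


# Simplified URI pattern for illustration only; uritools has its own regex.
URI_RE = re.compile(r"^(?:([^:/?#]+):)?(.*)$")


def split_scheme(dataset_name):
    """Hypothetical guard: split a dataset name into (scheme, rest).

    Raises a descriptive ValueError when the dataset is not identified
    by a string, instead of letting the regex raise a bare TypeError.
    """
    if not isinstance(dataset_name, str):
        raise ValueError(
            "Can't create CFA fragment locations for a dataset that is "
            "not identified by a file name or URI string: got "
            f"{type(dataset_name).__name__}"
        )
    return URI_RE.match(dataset_name).groups()


# Without a guard, matching a regex against a mapper raises TypeError,
# which is the same kind of failure as in the traceback above.
try:
    URI_RE.match(FakeFSMap())
except TypeError as error:
    print(f"Unguarded: TypeError: {error}")

# With the guard, the failure explains what is unsupported.
try:
    split_scheme(FakeFSMap())
except ValueError as error:
    print(f"Guarded: ValueError: {error}")
```

Whether a guard like this belongs in the writer, or whether the original reference file locations could be recovered from the kerchunk mapper so the write can succeed, is the open question of this issue.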

Metadata

Assignees

No one assigned

    Labels

    aggregation: Relating to metadata-based field and domain aggregation
    kerchunk: Relating to Kerchunk datasets

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests