[ntuple] Schema evolution of record fields #19254

Open
@hahnjo

Description

We currently have no RRecordField::BeforeConnectPageSource that either checks record fields or correctly schema-evolves them.

Reproducer

#include <ROOT/RNTupleModel.hxx>
#include <ROOT/RNTupleReader.hxx>
#include <ROOT/RNTupleWriter.hxx>

#include <cstdint>
#include <iostream> // std::cout is used below
#include <string>
#include <tuple>
#include <utility> // std::move

void ntuple_tuple() {
  {
    auto model = ROOT::RNTupleModel::Create();
    using TupType = std::tuple<std::int32_t, std::string>;
    auto tup = model->MakeField<TupType>("tup");
    auto writer = ROOT::RNTupleWriter::Recreate(std::move(model), "ntpl", "ntuple_tuple.root");

    *tup = {42, "abc"};
    writer->Fill();
  }

  { // 1
    auto model = ROOT::RNTupleModel::Create();
    using TupType = std::tuple<std::int32_t, std::string, float>;
    auto tup = model->MakeField<TupType>("tup");
    try {
      auto reader = ROOT::RNTupleReader::Open(std::move(model), "ntpl", "ntuple_tuple.root");
    } catch (const ROOT::RException &e) {
      std::cout << e.GetError().GetReport() << "\n";
    }
  }

  { // 2
    auto model = ROOT::RNTupleModel::Create();
    using TupType = std::tuple<std::string>;
    auto tup = model->MakeField<TupType>("tup");
    try {
      auto reader = ROOT::RNTupleReader::Open(std::move(model), "ntpl", "ntuple_tuple.root");
      reader->LoadEntry(0);
      std::cout << "0: string = " << std::get<0>(*tup) << "\n";
    } catch (const ROOT::RException &e) {
      std::cout << e.GetError().GetReport() << "\n";
    }
  }
}

This macro writes a std::tuple<std::int32_t, std::string> and then demonstrates two cases:

  1. When trying to read back as a std::tuple<std::int32_t, std::string, float>, the additional float leaf field complains:

     No on-disk field information for `tup._2`
     At:
       const ROOT::RFieldBase::ColumnRepresentation_t &ROOT::RFieldBase::EnsureCompatibleColumnTypes(const ROOT::RNTupleDescriptor &, std::uint16_t) const

  2. With an imposed field type of std::tuple<std::string>, the user could expect that the string is read back. Instead, the in-memory tup._0 is connected to the on-disk std::int32_t subfield and it (correctly) throws:

     On-disk column types {`SplitInt32`} for field `tup._0` cannot be matched to its in-memory type `std::string` (representation index: 0)
     At:
       const ROOT::RFieldBase::ColumnRepresentation_t &ROOT::RFieldBase::EnsureCompatibleColumnTypes(const ROOT::RNTupleDescriptor &, std::uint16_t) const
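
Both failures are consistent with tuple subfields being looked up by their positional names (`_0`, `_1`, `_2`, as they appear in the error messages). The following is only a toy model of that lookup, not ROOT code: `OnDiskRecord`, `InMemoryRecord`, `FirstMismatch` and `CheckReproducerCases` are hypothetical names, and type names are compared as plain strings.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the on-disk schema of one record: subfield name -> type name.
using OnDiskRecord = std::map<std::string, std::string>;
// Hypothetical stand-in for the in-memory record field: (subfield name, type name) in order.
using InMemoryRecord = std::vector<std::pair<std::string, std::string>>;

// Returns a description of the first subfield that cannot be connected,
// or an empty string if every in-memory subfield has a matching on-disk one.
std::string FirstMismatch(const OnDiskRecord &onDisk, const InMemoryRecord &inMemory)
{
   for (const auto &sub : inMemory) {
      const auto it = onDisk.find(sub.first);
      if (it == onDisk.end())
         return "no on-disk field information for " + sub.first;
      if (it->second != sub.second)
         return "on-disk type " + it->second + " of " + sub.first + " does not match " + sub.second;
   }
   return "";
}

// The two cases from the reproducer, replayed against the toy model.
void CheckReproducerCases()
{
   const OnDiskRecord onDisk{{"_0", "std::int32_t"}, {"_1", "std::string"}};

   // Case 1: std::tuple<std::int32_t, std::string, float> has an extra subfield _2.
   const InMemoryRecord wider{{"_0", "std::int32_t"}, {"_1", "std::string"}, {"_2", "float"}};
   assert(FirstMismatch(onDisk, wider) == "no on-disk field information for _2");

   // Case 2: std::tuple<std::string> places the string at _0, where the file has an int32.
   InMemoryRecord narrower;
   narrower.emplace_back("_0", "std::string");
   assert(!FirstMismatch(onDisk, narrower).empty());
}
```

In this model, a name-based lookup never pairs the in-memory string at `_0` with the on-disk string at `_1`, which is why case 2 fails on a type mismatch rather than on a missing field.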

Additional context

We discussed a conservative approach: checking for exact identity of the type name.
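
As a rough illustration of what such a conservative check could look like, here is a standalone sketch. It is not a proposed ROOT implementation: `EnsureIdenticalTypeName`, `IsRejected` and `CheckConservativePolicy` are hypothetical helpers, and the sketch assumes that both type names are already available as strings in the same normalized spelling.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of ROOT: rejects a record field whose in-memory
// type name is not exactly the name recorded on disk.
void EnsureIdenticalTypeName(const std::string &fieldName, const std::string &onDiskType,
                             const std::string &inMemoryType)
{
   if (onDiskType != inMemoryType) {
      throw std::runtime_error("type of field `" + fieldName + "` is `" + inMemoryType +
                               "` in memory but `" + onDiskType + "` on disk");
   }
}

// Returns true if the check rejects the given pair of type names.
bool IsRejected(const std::string &onDiskType, const std::string &inMemoryType)
{
   try {
      EnsureIdenticalTypeName("tup", onDiskType, inMemoryType);
   } catch (const std::runtime_error &) {
      return true;
   }
   return false;
}

// Replays the reproducer's type names; the exact spelling is an assumption.
void CheckConservativePolicy()
{
   const std::string onDisk = "std::tuple<std::int32_t,std::string>";
   assert(!IsRejected(onDisk, onDisk));
   assert(IsRejected(onDisk, "std::tuple<std::int32_t,std::string,float>"));
   assert(IsRejected(onDisk, "std::tuple<std::string>"));
}
```

Under this policy both cases from the reproducer would be rejected up front with a single record-level error, instead of surfacing as column-type or missing-field errors on individual subfields.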
