[ntuple] Schema evolution of std::variant #19253

Open
@hahnjo

Description

We currently have no RVariantField::BeforeConnectPageSource that either checks or correctly schema evolves a std::variant field.

Reproducer

#include <ROOT/RNTupleModel.hxx>
#include <ROOT/RNTupleReader.hxx>
#include <ROOT/RNTupleWriter.hxx>

#include <cstdint>
#include <iostream> // std::cout is used below
#include <string>
#include <utility> // std::move
#include <variant>

void ntuple_variant() {
  {
    auto model = ROOT::RNTupleModel::Create();
    using VarType = std::variant<std::int32_t, std::string>;
    auto var = model->MakeField<VarType>("var");
    auto writer = ROOT::RNTupleWriter::Recreate(std::move(model), "ntpl", "ntuple_variant.root");

    *var = 42;
    writer->Fill();
    *var = "abc";
    writer->Fill();
  }

  { // 1
    auto model = ROOT::RNTupleModel::Create();
    using VarType = std::variant<std::int32_t, std::string, float>;
    auto var = model->MakeField<VarType>("var");
    try {
      auto reader = ROOT::RNTupleReader::Open(std::move(model), "ntpl", "ntuple_variant.root");
    } catch (const ROOT::RException &e) {
      std::cout << e.GetError().GetReport() << "\n";
    }
  }

  { // 2
    auto model = ROOT::RNTupleModel::Create();
    using VarType = std::variant<std::int32_t>;
    auto var = model->MakeField<VarType>("var");
    auto reader = ROOT::RNTupleReader::Open(std::move(model), "ntpl", "ntuple_variant.root");
    reader->LoadEntry(0);
    std::cout << "0: index = " << var->index() << "\n";
    reader->LoadEntry(1);
    std::cout << "1: index = " << var->index() << "\n";
  }
}

This macro writes a std::variant<std::int32_t, std::string> and then demonstrates two problems:

  1. When trying to read back as a std::variant<std::int32_t, std::string, float>, the additional float leaf field complains. I think it would be good to automatically support this case.
     No on-disk field information for `var._2`
     At:
       const ROOT::RFieldBase::ColumnRepresentation_t &ROOT::RFieldBase::EnsureCompatibleColumnTypes(const ROOT::RNTupleDescriptor &, std::uint16_t) const
  2. The second case is more severe: Here the user tries to read back into a std::variant<std::int32_t> and this crashes once the on-disk switch tag indicates a value in the second item field:
     if (R__likely(tag > 0)) {
        void *varPtr = reinterpret_cast<unsigned char *>(to) + fVariantOffset;
        CallConstructValueOn(*fSubfields[tag - 1], varPtr);
        CallReadOn(*fSubfields[tag - 1], variantIndex, varPtr);
     }
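
To illustrate why the second case crashes, here is a minimal standalone sketch of the bounds check that is missing before `fSubfields[tag - 1]` is accessed. It does not use ROOT; `IsValidTag` is a hypothetical helper, and the sketch assumes (as the `tag > 0` test suggests) that tag 0 means "no value" and tags 1..N select one of the N in-memory alternatives.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of ROOT: returns true if an on-disk switch
// tag can be mapped onto an in-memory variant with nAlternatives item fields.
// Assumption: tag 0 means "no value"; tags 1..nAlternatives are 1-based
// indices into the list of item fields (hence the `tag - 1` above).
inline bool IsValidTag(std::uint32_t tag, std::size_t nAlternatives)
{
   return tag <= nAlternatives;
}
```

In the reproducer, the file was written with two alternatives, so the second entry carries tag 2. Reading it into a std::variant<std::int32_t> has only one item field, so `IsValidTag(2, 1)` is false and `fSubfields[tag - 1]` would index past the end.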

Additional context

We discussed a conservative approach: require the on-disk type name to be exactly identical to the in-memory type name.
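
As a rough sketch of what such a check could distinguish, the standalone code below compares the list of alternative type names on disk with the in-memory list. It is not ROOT code: `EVariantMatch` and `ClassifyVariant` are hypothetical names, and representing the alternatives as vectors of strings is an assumption made for illustration. A conservative check would accept only `kIdentical`; case 1 of the reproducer would fall under `kExtended`, and case 2 under `kIncompatible`.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical classification, not part of ROOT.
enum class EVariantMatch { kIdentical, kExtended, kIncompatible };

// Compares the alternative type names stored on disk with those of the
// in-memory variant (both given in declaration order).
inline EVariantMatch ClassifyVariant(const std::vector<std::string> &onDisk,
                                     const std::vector<std::string> &inMemory)
{
   if (onDisk == inMemory)
      return EVariantMatch::kIdentical;
   // The in-memory variant only appends alternatives, so every on-disk tag
   // still refers to the same alternative at the same position.
   if (onDisk.size() < inMemory.size() &&
       std::equal(onDisk.begin(), onDisk.end(), inMemory.begin()))
      return EVariantMatch::kExtended;
   // Anything else (removed, reordered, or changed alternatives) is rejected.
   return EVariantMatch::kIncompatible;
}
```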
