
Export framework #220

Merged
Krzmbrzl merged 165 commits into ValeevGroup:master from
Krzmbrzl:export-framework
Aug 18, 2025

Conversation

@Krzmbrzl Krzmbrzl (Collaborator) commented Jul 4, 2024

This implements (more or less) general code-generation capabilities in SeQuant. The idea is to abstract away the bulk of the work required during code generation so that new target languages can be added as easily as possible.


TODO

  • Verify index order of intermediates is still uniquely determined (sorted)
  • Make test cases work
  • Get rid of insert_blank_lines function of Generator as it breaks the model of only generating semantics
  • Allow specifying name of a computed result
  • Support multiple expressions in a single export
  • Support ResultExpr objects
  • Migrate ITF exporter to new framework
  • Test migrated ITF exporter

@Krzmbrzl Krzmbrzl force-pushed the export-framework branch 2 times, most recently from 1313e9b to bf68d73 on July 10, 2024 13:36
The entire logic that determines whether a given tensor is
created or loaded, and when it is set to zero and when not,
is now also applied to variables.
This prevents a bug in which result variables would be
loaded but not set to zero before their intended value was added to them.
This way, when inheriting from a given generator, it is still possible
to pass a custom context class to the top-level Generator<> interface,
which is required for the pieces to compose cleanly.
Supported tensor contraction frameworks:
- TensorOperations.jl
- TensorKit.jl
- ITensor.jl
@Krzmbrzl Krzmbrzl (Collaborator, Author) commented Jul 9, 2025

It would be helpful to eventually have a basic tutorial on how to use and develop the export framework. This isn’t urgent, but something to consider for the future.

Absolutely! I plan to write one once it is clear that the workflow has stabilized. That will be once I have written a couple more backends and confirmed that the concepts transfer.

It would be great to adopt and enforce a consistent style across the project. There is already some code in master that doesn't follow this style.

I agree. Given that this doesn't only affect this PR, I would suggest doing the code-style unification in a separate PR once this one is merged.

@Krzmbrzl Krzmbrzl requested review from ajay-mk and bimalgaudel on July 9, 2025 10:35
@Krzmbrzl Krzmbrzl (Collaborator, Author) commented Jul 9, 2025

@CodiumAI-Agent /improve

@QodoAI-Agent QodoAI-Agent commented Jul 9, 2025

PR Code Suggestions ✨

Latest suggestions up to d1321aa

Category | Suggestion | Impact
Possible issue
Fix dangling parent pointer

Returning the address of node.parent() yields a pointer to a temporary if node is an
rvalue, causing dangling pointers during traversal. Preserve the node's value
category by accepting an lvalue reference and avoid forwarding in this helper.

SeQuant/core/binary_node.hpp [119-123]

 template <typename Node>
-std::remove_reference_t<Node>* get_parent_ptr(Node&& node) {
+std::remove_reference_t<Node>* get_parent_ptr(Node& node) {
   return node.root() ? nullptr : &node.parent();
 }
Suggestion importance[1-10]: 8


Why: Correctly constraining get_parent_ptr to take an lvalue (Node&) prevents returning pointers to subobjects of temporaries, which would dangle during traversal; this is a correctness fix with meaningful impact.

Medium
Prevent dangling traversal pointers

The traversal accepts Node&& and stores pointers to node; when called with
temporaries, current_ptr and previous_ptr dangle. Restrict to Node& and ensure
previous_ptr is non-const to match comparisons with
&current.left()/right()/parent().

SeQuant/core/binary_node.hpp [138-209]

-template <TreeTraversal order, typename Node, typename Visitor,
-          typename NodeType>
-void visit(Node&& node, Visitor&& f, NodeType) {
+template <TreeTraversal order, typename Node, typename Visitor, typename NodeType>
+void visit(Node& node, Visitor&& f, NodeType) {
   static_assert(std::is_same_v<NodeType, VisitLeaf> ||
                     std::is_same_v<NodeType, VisitInternal> ||
                     std::is_same_v<NodeType, VisitAll>,
                 "Not sure which nodes to visit");
 
   std::remove_reference_t<Node>* current_ptr = &node;
-  const std::remove_cvref_t<Node>* previous_ptr = &node;
+  std::remove_reference_t<Node>* previous_ptr = &node;
   ...
   if (!continue_with_subtree) {
-    // Overwrite to make next target the parent (if any)
-    current_ptr = get_parent_ptr(current);
+    current_ptr = get_parent_ptr(*current_ptr);
   }
 }
Suggestion importance[1-10]: 8


Why: The traversal stores pointers into node; restricting to Node& and aligning pointer types eliminates potential dangling when called with temporaries and fixes a subtle lifetime bug in the iterative traversal.

Medium
Ensure visitors get stable lvalues

Forwarding node preserves rvalue-ness and may invoke visitors with a dangling
reference when the caller passes a temporary. Constrain node to be an lvalue
reference and remove forwarding to ensure stable references during traversal.

SeQuant/core/binary_node.hpp [93-117]

 template <typename Visitor, typename Node>
-bool invoke_tree_visitor(Visitor&& f, Node&& node, TreeTraversal context) {
-  if constexpr (std::is_invocable_v<Visitor, Node, TreeTraversal>) {
-    using result_type = std::invoke_result_t<Visitor, Node, TreeTraversal>;
+bool invoke_tree_visitor(Visitor&& f, Node& node, TreeTraversal context) {
+  if constexpr (std::is_invocable_v<Visitor, Node&, TreeTraversal>) {
+    using result_type = std::invoke_result_t<Visitor, Node&, TreeTraversal>;
     if constexpr (std::is_same_v<result_type, void>) {
-      std::forward<Visitor>(f)(std::forward<Node>(node), context);
+      std::forward<Visitor>(f)(node, context);
       return true;
     } else {
-      return static_cast<bool>(
-          std::forward<Visitor>(f)(std::forward<Node>(node), context));
+      return static_cast<bool>(std::forward<Visitor>(f)(node, context));
     }
   } else {
-    static_assert(std::is_invocable_v<Visitor, Node>,
-                  "Visitor must be a callable that takes a FullBinaryNode<T> "
-                  "and optionally a TreeTraversal argument");
-    using result_type = std::invoke_result_t<Visitor, Node>;
+    static_assert(std::is_invocable_v<Visitor, Node&>,
+                  "Visitor must be callable with FullBinaryNode<T>& and optionally TreeTraversal");
+    using result_type = std::invoke_result_t<Visitor, Node&>;
     if constexpr (std::is_same_v<result_type, void>) {
-      std::forward<Visitor>(f)(std::forward<Node>(node));
+      std::forward<Visitor>(f)(node);
       return true;
     } else {
-      return static_cast<bool>(
-          std::forward<Visitor>(f)(std::forward<Node>(node)));
+      return static_cast<bool>(std::forward<Visitor>(f)(node));
     }
   }
 };
Suggestion importance[1-10]: 7


Why: Changing invoke_tree_visitor to accept Node& avoids forwarding rvalues into visitors that may store references; it's a reasonable safety improvement, though existing usage likely passes lvalues, making impact moderate.

Medium
Harden filesystem iteration

std::filesystem::directory_iterator can throw on permission issues and during
iteration. Wrap the iteration in a try/catch and skip problematic entries to avoid
test crashes on environments with unexpected files/permissions. Also, prefer pushing
current.path() to avoid storing a possibly non-stable iterator entry object.

tests/unit/test_export.cpp [73-78]

-for (std::filesystem::path current :
-     std::filesystem::directory_iterator(base_dir)) {
-  if (current.extension() == ".export_test") {
-    files.push_back(std::move(current));
+try {
+  for (const auto& entry : std::filesystem::directory_iterator(base_dir)) {
+    const auto& path = entry.path();
+    if (path.extension() == ".export_test") {
+      files.push_back(path);
+    }
   }
+} catch (const std::filesystem::filesystem_error& e) {
+  throw std::runtime_error(std::string("Failed to enumerate export tests: ") + e.what());
 }
Suggestion importance[1-10]: 6


Why: Wrapping filesystem iteration guards tests against environment-specific filesystem errors and using directory_entry.path() is safer; this is a reasonable robustness improvement with accurate mapping to the new hunk.

Low
Make index label math safe

Using plain char for letter arithmetic is locale/implementation dependent and can
overflow on non-ASCII. Use unsigned char or better int for arithmetic, and validate
that computed code points stay within 'a'..'z' or 'A'..'Z' ranges before casting
back, falling back to numeric suffix when out-of-range.

SeQuant/core/export/itf.cpp [22-31]

-char limit = [&]() -> char {
+int limit = [&]() -> int {
   auto it = m_index_label_limits.find(space);
   if (it == m_index_label_limits.end()) {
-    // Note that in between capital and lowercase letters there are symbols
-    // [,\,],^,_ and `. Hence, this default is not quite safe but should work
-    // in most cases anyway
-    return 'z';
+    return static_cast<int>('z');
   }
+  return static_cast<int>(it->second);
+}();
+if (base_key.empty() || base_key.size() > 1) {
+  return base_key + std::to_string(ordinal);
+}
+int base = static_cast<unsigned char>(base_key[0]);
+int target = base + static_cast<int>(ordinal);
+if (target > limit || !std::isalpha(static_cast<unsigned char>(base))) {
+  return base_key + std::to_string(ordinal);
+}
+return std::string(1, static_cast<char>(target));
 
-  return it->second;
-}();
-
Suggestion importance[1-10]: 5


Why: Using int for arithmetic and validating ranges reduces potential UB/overflow with char math; it's a minor but correct safety enhancement aligned with the shown code.

Low
General
Ensure strict ordering checks

is_ordered and space comparisons must be consistent and independent of operand order
to avoid misclassification. Use a single comparison routine (e.g.,
needs_swap/is_ordered) that is antisymmetric and handle equal spaces explicitly;
also guard against undefined behavior if is_ordered assumes a strict weak ordering.

SeQuant/core/export/itf.cpp [203-221]

 bool ItfContext::is_exceptional_J(std::span<Index> bra,
                                   std::span<Index> ket) const {
   assert(bra.size() == 2);
   assert(ket.size() == 2);
+  const auto& b0 = bra[0].space();
+  const auto& b1 = bra[1].space();
+  const auto& k0 = ket[0].space();
+  const auto& k1 = ket[1].space();
 
-  // integrals with 3 external (virtual) indices ought to be converted to
-  // J-integrals
-  // Here, we generalize this to all integrals for which the bra indices and
-  // the first ket index are of the same space, the ket indices are of different
-  // spaces and the additional ket space compares less than the space of the
-  // bras.
-  bool bras_are_same = bra[0].space() == bra[1].space();
-  bool kets_are_different = ket[0].space() != ket[1].space();
-  bool kets_are_ordered = is_ordered(ket[0].space(), ket[1].space());
-  bool first_ket_same_as_bra = ket[0].space() == bra[0].space();
+  const bool bras_are_same = (b0 == b1);
+  const bool kets_are_different = (k0 != k1);
+  // Ensure strict ordering predicate; if equal, not ordered.
+  const bool kets_are_ordered = (k0 != k1) && is_ordered(k0, k1);
+  const bool first_ket_same_as_bra = (k0 == b0);
 
-  return bras_are_same && kets_are_different && kets_are_ordered &&
-         first_ket_same_as_bra;
+  return bras_are_same && kets_are_different && kets_are_ordered && first_ket_same_as_bra;
 }
Suggestion importance[1-10]: 4


Why: The change clarifies strict ordering and equality handling; however, the current code already checks these conditions, so impact is modest though correct.

Low

Previous suggestions

Suggestions up to commit 42478f9
Category | Suggestion | Impact
Possible issue
Validate test file section structure

Guard against malformed files by validating balanced section separators and ensuring
current_format is set before using it. If EOF is reached while inside_meta remains
true or set_format is false after encountering separators, fail explicitly to avoid
undefined behavior.

tests/unit/test_export.cpp [286-326]

 for (std::string line; std::getline(in, line);) {
   if (line.starts_with("=====")) {
     finished_expr = true;
     set_format = false;
     inside_meta = !inside_meta;
-  } else if (!finished_expr) {
+    continue;
+  }
+  if (!finished_expr) {
     expression_spec += line;
     expression_spec += "\n";
-  } else if (inside_meta) {
+    continue;
+  }
+  if (inside_meta) {
     if (line.starts_with("#")) {
-      // Comment
       continue;
     }
     if (!set_format) {
       current_format = boost::trim_copy(line);
+      if (current_format.empty()) {
+        FAIL("Missing format after separator");
+      }
       if (known_formats.find(current_format) == known_formats.end()) {
         FAIL("Unknown format '" + current_format + "'");
       }
       set_format = true;
     } else if (current_format == generator.get_format_name()) {
-      // Context definition
       add_to_context(context, line);
     }
-  } else if (current_format == generator.get_format_name()) {
+    continue;
+  }
+  if (current_format.empty()) {
+    FAIL("Format not specified before expected output");
+  }
+  if (current_format == generator.get_format_name()) {
     if (expected_output.has_value()) {
       expected_output.value() += "\n";
       expected_output.value() += line;
     } else {
       expected_output = line;
     }
   }
 }
+if (inside_meta) {
+  FAIL("Unbalanced section separators in export test file");
+}
+if (finished_expr && !set_format && current_format.empty()) {
+  FAIL("No format specified in meta section");
+}
Suggestion importance[1-10]: 7


Why: Adds defensive checks for malformed test files and ensures format selection before use; aligns with the surrounding parsing logic. Impact is moderate as it improves robustness without altering core behavior.

Medium
Fix underflow on empty queue

The loop uses m_queue.size() - 1 in the condition, which underflows when m_queue is
empty, causing out-of-bounds access. Guard early for sizes < 2 before the loop. This
prevents UB on empty or single-element queues.

SeQuant/core/export/generation_optimizer.hpp [518-567]

 void process_operation_queue() {
-  // If two subsequent elements cancel each other, they can be erased without
-  // having to worry about potentially violating the stack-like ordering of
-  // memory operations (it will be preserved).
-  for (std::size_t i = 0; i < m_queue.size() - 1; ++i) {
+  if (m_queue.size() < 2) {
+    // Nothing to cancel/pair; just move to cache
+    m_cache.insert(m_cache.end(),
+                   std::make_move_iterator(m_queue.begin()),
+                   std::make_move_iterator(m_queue.end()));
+    m_queue.clear();
+    return;
+  }
+
+  for (std::size_t i = 0; i + 1 < m_queue.size(); ++i) {
     if (m_queue[i]->cancels(m_queue.at(i + 1))) {
       m_queue.erase(m_queue.begin() + i + 1);
       m_queue.erase(m_queue.begin() + i);
-      // Re-visit the previous value of i in the next iteration
-      // (keep in mind that i gets incremented at the end of the loop)
-      static_assert(static_cast<decltype(i)>(-1) + 1 == 0);
-      i = std::max(i - 2, static_cast<decltype(i)>(-1));
+      i = (i >= 2) ? i - 2 : static_cast<std::size_t>(-1);
     }
   }
-  ...
+
+  // ... rest unchanged ...
+  m_cache.insert(m_cache.end(), std::make_move_iterator(m_queue.begin()),
+                 std::make_move_iterator(m_queue.end()));
+  m_queue.clear();
 }
Suggestion importance[1-10]: 7


Why: The loop condition uses size()-1 which can underflow when the queue has fewer than two elements; adding an early guard (or using i+1 < size) prevents potential UB. This is a correct and practical safety improvement.

Medium
Preserve umbrella header API

Ensure this umbrella header preserves the previous public API surface. If external
code included SeQuant/core/expr.hpp expecting downstream headers (like
algorithm/operator headers) to be available, re-export them here to prevent build
breaks. Consider keeping include order stable to avoid ODR or ADL changes.

SeQuant/core/expr.hpp [8-20]

 #include <SeQuant/core/expressions/abstract_tensor.hpp>
 #include <SeQuant/core/expressions/constant.hpp>
 #include <SeQuant/core/expressions/expr.hpp>
-#include <SeQuant/core/expressions/expr_algorithms.hpp>
-#include <SeQuant/core/expressions/expr_operators.hpp>
 #include <SeQuant/core/expressions/expr_ptr.hpp>
 #include <SeQuant/core/expressions/labeled.hpp>
 #include <SeQuant/core/expressions/product.hpp>
 #include <SeQuant/core/expressions/result_expr.hpp>
 #include <SeQuant/core/expressions/sum.hpp>
 #include <SeQuant/core/expressions/tensor.hpp>
 #include <SeQuant/core/expressions/traits.hpp>
 #include <SeQuant/core/expressions/variable.hpp>
+// Re-export commonly used algorithms/operators for backward compatibility
+#include <SeQuant/core/expressions/expr_algorithms.hpp>
+#include <SeQuant/core/expressions/expr_operators.hpp>
Suggestion importance[1-10]: 6


Why: The suggestion correctly targets the new umbrella includes and raises a valid backward-compatibility concern about re-exporting algorithms/operators, which were previously transitively available. It's advisory (ensure/consider) rather than a concrete required fix, but relevant to avoid build breaks.

Low
Safeguard index label generation

Prevent undefined behavior from character arithmetic overflow and non-alpha ranges.
Clamp the computed target to limit and fall back to alphanumeric suffix when
exceeding range to avoid producing invalid labels.

SeQuant/core/export/itf.cpp [22-46]

 char limit = [&]() -> char {
   auto it = m_index_label_limits.find(space);
   if (it == m_index_label_limits.end()) {
-    // Note that in between capital and lowercase letters there are symbols
-    // [,\,],^,_ and `. Hence, this default is not quite safe but should work
-    // in most cases anyway
     return 'z';
   }
-
   return it->second;
 }();
 
-if (base_key.empty() || base_key.size() > 1 ||
-    base_key[0] + ordinal > limit) {
+if (base_key.empty() || base_key.size() > 1) {
   return base_key + std::to_string(ordinal);
 }
 
-// Merge base key and ordinal into a single letter. That is
-// i1 -> i
-// i2 -> j
-// i3 -> k
-// etc.
-char target = base_key[0] + ordinal;
+int base = static_cast<unsigned char>(base_key[0]);
+int tgt = base + static_cast<int>(ordinal);
+if (tgt > static_cast<int>(static_cast<unsigned char>(limit))) {
+  return base_key + std::to_string(ordinal);
+}
 
+char target = static_cast<char>(tgt);
+if (!std::isalpha(static_cast<unsigned char>(target))) {
+  return base_key + std::to_string(ordinal);
+}
 return std::string(1, target);
Suggestion importance[1-10]: 6


Why: Prevents potential overflow and non-alpha outputs when computing index labels; reasonable safety improvement though original code already guards via limit check.

Low
Stabilize traversal under mutation

The loop uses current.leaf() multiple times after invoking the visitor, which may
mutate the node and invalidate traversal logic. Cache leaf state at entry and avoid
re-checks after visitor calls, or gate rechecks strictly when the visitor returned
true. This prevents skipping children or mis-ordering when visitors modify
structure.

SeQuant/core/binary_node.hpp [138-209]

-template <TreeTraversal order, typename Node, typename Visitor,
-          typename NodeType>
+template <TreeTraversal order, typename Node, typename Visitor, typename NodeType>
 void visit(Node&& node, Visitor&& f, NodeType) {
   static_assert(std::is_same_v<NodeType, VisitLeaf> ||
                     std::is_same_v<NodeType, VisitInternal> ||
                     std::is_same_v<NodeType, VisitAll>,
                 "Not sure which nodes to visit");
 
-  std::remove_reference_t<Node>* current_ptr = &node;
-  const std::remove_cvref_t<Node>* previous_ptr = &node;
+  using NodeRef = std::remove_reference_t<Node>;
+  NodeRef* current_ptr = &node;
+  const NodeRef* previous_ptr = &node;
 
-  // Implementation note: we use a loop rather than recursive function calls as
-  // the latter implicitly imposes a maximum depth of a tree we can visit
-  // without triggering a stack overflow that will crash the program.
   while (current_ptr) {
-    const Node& current = *current_ptr;
+    NodeRef& current = *current_ptr;
+    const bool was_leaf = current.leaf();
 
     bool continue_with_subtree = true;
 
-    // Note: Invoking the visitor function may change the node object!
-    // Hence, the need to repeatedly check for whether current might be a leaf
-
-    if (current.leaf()) {
-      // Arrived at a tree's leaf
+    if (was_leaf) {
       if constexpr (!std::is_same_v<NodeType, VisitInternal>) {
-        continue_with_subtree =
-            invoke_tree_visitor(f, current, TreeTraversal::Any);
+        continue_with_subtree = invoke_tree_visitor(f, current, TreeTraversal::Any);
       }
-
-      // Move back up to parent
       current_ptr = get_parent_ptr(current);
     } else if (previous_ptr == &current.left()) {
-      if constexpr ((order & TreeTraversal::InOrder) ==
-                        TreeTraversal::InOrder &&
+      if constexpr ((order & TreeTraversal::InOrder) == TreeTraversal::InOrder &&
                     !std::is_same_v<NodeType, VisitLeaf>) {
-        continue_with_subtree =
-            invoke_tree_visitor(f, current, TreeTraversal::InOrder);
+        continue_with_subtree = invoke_tree_visitor(f, current, TreeTraversal::InOrder);
       }
-
-      // Finished visiting left, now visit right
-      current_ptr = current.leaf() ? get_parent_ptr(current) : &current.right();
+      current_ptr = &current.right();
     } else if (previous_ptr == &current.right()) {
-      if constexpr ((order & TreeTraversal::PostOrder) ==
-                        TreeTraversal::PostOrder &&
+      if constexpr ((order & TreeTraversal::PostOrder) == TreeTraversal::PostOrder &&
                     !std::is_same_v<NodeType, VisitLeaf>) {
-        continue_with_subtree =
-            invoke_tree_visitor(f, current, TreeTraversal::PostOrder);
+        continue_with_subtree = invoke_tree_visitor(f, current, TreeTraversal::PostOrder);
       }
-
-      // Finished visiting right, now move back up to parent (if any)
       current_ptr = get_parent_ptr(current);
     } else {
       assert(current.root() || previous_ptr == &current.parent());
-      if constexpr ((order & TreeTraversal::PreOrder) ==
-                        TreeTraversal::PreOrder &&
+      if constexpr ((order & TreeTraversal::PreOrder) == TreeTraversal::PreOrder &&
                     !std::is_same_v<NodeType, VisitLeaf>) {
-        continue_with_subtree =
-            invoke_tree_visitor(f, current, TreeTraversal::PreOrder);
+        continue_with_subtree = invoke_tree_visitor(f, current, TreeTraversal::PreOrder);
       }
-
-      // Coming from parent (or started at root), start by visiting left
-      current_ptr = current.leaf() ? get_parent_ptr(current) : &current.left();
+      current_ptr = &current.left();
     }
 
     previous_ptr = &current;
-
     if (!continue_with_subtree) {
-      // Overwrite to make next target the parent (if any)
       current_ptr = get_parent_ptr(current);
     }
   }
 }
Suggestion importance[1-10]: 4


Why: Caching the leaf state could avoid inconsistencies if visitors mutate nodes, but the current code already rechecks leaf intentionally as documented. The change is a minor robustness tweak with questionable necessity.

Low
Remove duplicate mutable visit_leaf

The non-const visit_leaf overload has the same signature as the const one, differing
only in constness. This can cause ambiguity and unexpected binding when passing
temporary or non-const nodes. Remove the duplicate or make the non-const variant
actually operate on non-const nodes by forwarding a non-const this to allow
mutation if intended.

SeQuant/core/binary_node.hpp [525-535]

 template <typename F>
 void visit_leaf(F&& visitor,
                 TreeTraversal order = TreeTraversal::PreOrder) const {
   TRAVERSAL_TO_TEMPLATE_ARG(order, sequant::visit,
                             (*this, std::forward<F>(visitor), VisitLeaf{}));
 }
 
-template <typename F>
-void visit_leaf(F&& visitor, TreeTraversal order = TreeTraversal::PreOrder) {
-  TRAVERSAL_TO_TEMPLATE_ARG(order, sequant::visit,
-                            (*this, std::forward<F>(visitor), VisitLeaf{}));
-}
-
Suggestion importance[1-10]: 2


Why: Having both const and non-const overloads is intentional to support mutable traversals; removing the non-const overload would reduce functionality. The suggestion overlooks this design and offers no concrete correctness benefit.

Low
General
Use consistent ordering predicate

Make the ordering predicate robust by using a strict-weak-order consistent
comparator for spaces; if is_ordered is context-dependent, prefer
needs_swap/comparator used elsewhere to avoid contradictory decisions between this
check and the swap logic.

SeQuant/core/export/itf.cpp [201-219]

 bool ItfContext::is_exceptional_J(std::span<Index> bra,
                                   std::span<Index> ket) const {
   assert(bra.size() == 2);
   assert(ket.size() == 2);
 
-  // integrals with 3 external (virtual) indices ought to be converted to
-  // J-integrals
-  // Here, we generalize this to all integrals for which the bra indices and
-  // the first ket index are of the same space, the ket indices are of different
-  // spaces and the additional ket space compares less than the space of the
-  // bras.
   bool bras_are_same = bra[0].space() == bra[1].space();
   bool kets_are_different = ket[0].space() != ket[1].space();
-  bool kets_are_ordered = is_ordered(ket[0].space(), ket[1].space());
+  bool kets_are_ordered = is_ordered(ket[0].space(), ket[1].space()); // fallback
+  // Prefer consistent comparator with rewrite:
+  if (!kets_are_ordered) {
+    kets_are_ordered = !needs_swap(ket[0].space(), ket[1].space());
+  }
   bool first_ket_same_as_bra = ket[0].space() == bra[0].space();
 
   return bras_are_same && kets_are_different && kets_are_ordered &&
          first_ket_same_as_bra;
 }
Suggestion importance[1-10]: 5


Why: Tries to harmonize ordering logic between is_exceptional_J and rewrite by leveraging needs_swap; correctness seems plausible but context dependency of is_ordered vs needs_swap isn't fully evident from the diff.

Low
Suggestions up to commit f54ba2b
Category | Suggestion | Impact
Possible issue
Preserve auxiliary indices

The reconstruction of the tensor in the two-electron hack drops any auxiliary
indices, which can lead to lost dimensions. Capture and reapply the original
auxiliary indices when building the new Tensor object.

SeQuant/core/export/itf.cpp [188-190]

-tensor = Tensor(std::move(label), sequant::bra(std::move(bra)),
-                sequant::ket(std::move(ket)), aux(), tensor.symmetry(),
-                tensor.braket_symmetry(), tensor.particle_symmetry());
+auto original_aux = tensor.aux();
+tensor = Tensor(std::move(label),
+                sequant::bra(std::move(bra)),
+                sequant::ket(std::move(ket)),
+                sequant::aux(std::move(original_aux)),
+                tensor.symmetry(),
+                tensor.braket_symmetry(),
+                tensor.particle_symmetry());
Suggestion importance[1-10]: 9


Why: The new Tensor construction drops the original auxiliary indices, which loses key dimensions so preserving them fixes a critical bug.

High
Replace assert with runtime check

Using assert to check ordinal >= m_idx_offset can be disabled in release builds,
leading to an underflow. Replace the assert with an explicit runtime check and
exception.

SeQuant/core/export/itf.cpp [16-17]

 std::string ItfContext::index_name(const IndexSpace &space,
                                    std::size_t ordinal) const {
-  assert(ordinal >= m_idx_offset);
+  if (ordinal < m_idx_offset) {
+    throw std::out_of_range("Index ordinal is below the configured offset");
+  }
   ordinal -= m_idx_offset;
Suggestion importance[1-10]: 7


Why: Using an assert can be disabled in release builds, risking underflow, so throwing an exception ensures correct runtime validation.

Medium
Mask invalid bits in negation

Mask out bits beyond the defined flags when negating ComputeSelection, ensuring the result stays within the valid bitmask.

SeQuant/core/export/compute_selection.cpp [31-36]

 ComputeSelection operator~(ComputeSelection selection) {
   using underlying_type = std::underlying_type_t<ComputeSelection>;
+  constexpr underlying_type mask =
+    static_cast<underlying_type>(ComputeSelection::Left) |
+    static_cast<underlying_type>(ComputeSelection::Right);
   return static_cast<ComputeSelection>(
-      ~static_cast<underlying_type>(selection));
+      (~static_cast<underlying_type>(selection)) & mask);
 }
Suggestion importance[1-10]: 6


Why: Applying a mask after ~static_cast ensures that the negated ComputeSelection does not introduce undefined bits beyond Left and Right, preventing misuse of invalid flags.

Low
Close complex literal parenthesis

Complex constants open a parenthesis but never close it, producing malformed output.
Append a closing parenthesis when an imaginary part is printed.

SeQuant/core/export/itf.hpp [185-191]

 std::string represent(const Constant &constant,
                       const Context &ctx) const override {
   std::stringstream sstream;
   if (constant.value().imag() != 0) {
     sstream << "(" << constant.value().real();
     if (constant.value().imag() < 0) {
       sstream << " - j" << (-1 * constant.value().imag());
     } else {
       sstream << " + j" << constant.value().imag();
     }
+    sstream << ")";
   } else {
     sstream << constant.value().real();
   }
   return sstream.str();
 }
Suggestion importance[1-10]: 6


Why: The represent function opens a parenthesis for complex constants but never closes it, producing malformed output; adding a closing ) corrects this.

Low
General
Use underlying bits in enum assignment

Use explicit underlying-type bitwise operations when defining combined TreeTraversal values, as the user-defined operator| is not available inside the enum definition.

SeQuant/core/binary_node.hpp [30-41]

 enum class TreeTraversal {
   None = 0,
   PreOrder = 0b001,
   PostOrder = 0b010,
   InOrder = 0b100,
-  PreAndPostOrder = PreOrder | PostOrder,
-  PreAndInOrder = PreOrder | InOrder,
-  PostAndInOrder = PostOrder | InOrder,
-  Any = PreOrder | PostOrder | InOrder,
+  PreAndPostOrder = static_cast<TreeTraversal>(
+    static_cast<std::underlying_type_t<TreeTraversal>>(PreOrder) |
+    static_cast<std::underlying_type_t<TreeTraversal>>(PostOrder)),
+  PreAndInOrder = static_cast<TreeTraversal>(
+    static_cast<std::underlying_type_t<TreeTraversal>>(PreOrder) |
+    static_cast<std::underlying_type_t<TreeTraversal>>(InOrder)),
+  PostAndInOrder = static_cast<TreeTraversal>(
+    static_cast<std::underlying_type_t<TreeTraversal>>(PostOrder) |
+    static_cast<std::underlying_type_t<TreeTraversal>>(InOrder)),
+  Any = static_cast<TreeTraversal>(
+    static_cast<std::underlying_type_t<TreeTraversal>>(PreOrder) |
+    static_cast<std::underlying_type_t<TreeTraversal>>(PostOrder) |
+    static_cast<std::underlying_type_t<TreeTraversal>>(InOrder)),
 };
Suggestion importance[1-10]: 8


Why: The combined flags like PreAndPostOrder = PreOrder | PostOrder rely on operator| which isn’t available inside the enum, so explicit casts to the underlying type are needed to avoid compilation errors.

Medium
Use perfect forwarding in invocation

Perfectly forward the callable and node parameters into invocations of f to preserve their value category and avoid unnecessary copies.

SeQuant/core/binary_node.hpp [93-117]

 template <typename Visitor, typename Node>
 bool invoke_tree_visitor(Visitor&& f, Node&& node, TreeTraversal context) {
-  if constexpr (std::is_invocable_v<decltype(f), decltype(node),
-                                    decltype(context)>) {
-    using result_type =
-        std::invoke_result_t<decltype(f), decltype(node), decltype(context)>;
-    if constexpr (std::is_same_v<result_type, void>) {
-      f(node, context);
+  if constexpr (std::is_invocable_v<Visitor, Node, TreeTraversal>) {
+    using result_type = std::invoke_result_t<Visitor, Node, TreeTraversal>;
+    if constexpr (std::is_void_v<result_type>) {
+      std::forward<Visitor>(f)(std::forward<Node>(node), context);
       return true;
     } else {
-      return static_cast<bool>(f(node, context));
+      return static_cast<bool>(
+        std::forward<Visitor>(f)(std::forward<Node>(node), context));
     }
   } else {
-    static_assert(std::is_invocable_v<decltype(f), decltype(node)>,
-                  "Visitor must be a callable that takes a FullBinaryNode<T> "
-                  "and optionally a TreeTraversal argument");
-    using result_type = std::invoke_result_t<decltype(f), decltype(node)>;
-    if constexpr (std::is_same_v<result_type, void>) {
-      f(node);
+    static_assert(std::is_invocable_v<Visitor, Node>,
+                  "Visitor must be invocable with node and optional context");
+    using result_type = std::invoke_result_t<Visitor, Node>;
+    if constexpr (std::is_void_v<result_type>) {
+      std::forward<Visitor>(f)(std::forward<Node>(node));
       return true;
     } else {
-      return f(node);
+      return std::forward<Visitor>(f)(std::forward<Node>(node));
     }
   }
-};
+}
Suggestion importance[1-10]: 5


Why: Forwarding f and node with std::forward preserves their value categories and avoids unnecessary copies or binding to lvalue references, improving performance and correctness.

Low
Add default case in switch

Add a default case to the switch in TRAVERSAL_TO_TEMPLATE_ARG to assert on unsupported TreeTraversal values and prevent silent no-ops.

SeQuant/core/binary_node.hpp [55-80]

 #define TRAVERSAL_TO_TEMPLATE_ARG(order, functionName, functionArgs) \
   switch (order) {                                                   \
     case TreeTraversal::None:                                        \
       break;                                                         \
     case TreeTraversal::PreOrder:                                    \
       functionName<TreeTraversal::PreOrder> functionArgs;            \
       break;                                                         \
     case TreeTraversal::PostOrder:                                   \
       functionName<TreeTraversal::PostOrder> functionArgs;           \
       break;                                                         \
     case TreeTraversal::InOrder:                                     \
       functionName<TreeTraversal::InOrder> functionArgs;             \
       break;                                                         \
     case TreeTraversal::PreAndPostOrder:                             \
       functionName<TreeTraversal::PreAndPostOrder> functionArgs;     \
       break;                                                         \
     case TreeTraversal::PreAndInOrder:                               \
       functionName<TreeTraversal::PreAndInOrder> functionArgs;       \
       break;                                                         \
     case TreeTraversal::PostAndInOrder:                              \
       functionName<TreeTraversal::PostAndInOrder> functionArgs;      \
       break;                                                         \
     case TreeTraversal::Any:                                         \
       functionName<TreeTraversal::Any> functionArgs;                 \
       break;                                                         \
+    default:                                                         \
+      assert(false && "Unsupported TreeTraversal value");            \
+      break;                                                         \
   }
Suggestion importance[1-10]: 5


Why: Adding a default with an assert in the switch macro ensures that unsupported TreeTraversal values are caught early rather than silently falling through, enhancing reliability.

Low
Guard name() with assert

Add an assertion to ensure name() is only called when the group is named. This
catches unintended usage early.

SeQuant/core/export/expression_group.hpp [51]

-[[nodiscard]] const std::string &name() const { return m_name.value(); }
+[[nodiscard]] const std::string &name() const { assert(m_name.has_value()); return m_name.value(); }
Suggestion importance[1-10]: 4


Why: Adding an assert(m_name.has_value()) helps catch misuse of name() early without affecting functionality.

Low

Krzmbrzl added 3 commits July 9, 2025 13:13
At this point in the code path, they are guaranteed to be empty, so functionally everything stays the same. However, it makes folks reading the code less itchy about whether some indices might be forgotten here.
Member

@bimalgaudel bimalgaudel left a comment


I approve these changes, with one remark: I have doubts about whether an all-encompassing export framework, aimed at generalizing wildly different flavors of numerical tensor frameworks (e.g. TiledArray, Julia, Python, ITF), will remain scalable. My doubt concerns having a fixed set of data structures and extending their methods to support an ever-growing number of export scenarios, e.g. non-binary tensor contraction expressions.

@evaleev @Krzmbrzl

@evaleev evaleev added this to the 2.1 milestone Aug 8, 2025
@Krzmbrzl Krzmbrzl merged commit 2d9eb74 into ValeevGroup:master Aug 18, 2025
8 checks passed