Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deletion_fn() is called too late, resulting in SEGFAULT #80

Closed
dnezam opened this issue Nov 22, 2022 · 4 comments
Closed

deletion_fn() is called too late, resulting in SEGFAULT #80

dnezam opened this issue Nov 22, 2022 · 4 comments

Comments

@dnezam
Copy link
Contributor

dnezam commented Nov 22, 2022

Hello,

after compiling my code with fsanitizer=address to debug a SEGFAULT, I found out that the code (which mainly consists of library calls to parlay and gbbs) tries to access a freed location. While investigating, it seems that the pointer is freed during the move assignment:

  symmetric_graph& operator=(symmetric_graph&& other) noexcept {
    n = other.n;
    m = other.m;
    v_data = other.v_data;
    e0 = other.e0;
    vertex_weights = other.vertex_weights;
    deletion_fn();  <--
    deletion_fn = std::move(other.deletion_fn);
    other.v_data = nullptr;
    other.e0 = nullptr;
    other.vertex_weights = nullptr;
    other.deletion_fn = []() {};
    return *this;
  }

To my understanding, deletion_fn() (marked with <--) frees the pointer of other, and not this. At the end of the move, the original data of this is leaked + the current data of this is invalid, which probably leads to the SEGFAULT in my code. A possible fix would be to move the call to the beginning, i.e.:

  symmetric_graph& operator=(symmetric_graph&& other) noexcept {
    deletion_fn();  <--
    n = other.n;
    m = other.m;
    v_data = other.v_data;
    e0 = other.e0;
    vertex_weights = other.vertex_weights;
    deletion_fn = std::move(other.deletion_fn);
    other.v_data = nullptr;
    other.e0 = nullptr;
    other.vertex_weights = nullptr;
    other.deletion_fn = []() {};
    return *this;
  }

With this, my code doesn't SEGFAULT.

In particular, the graph that is being moved in and out is generated by symmetric_graph<symmetric_vertex, Wgh> sym_graph_from_edges in graph.h:

static inline symmetric_graph<symmetric_vertex, Wgh> sym_graph_from_edges(
  ...
  return symmetric_graph<symmetric_vertex, Wgh>(v_data, n, m,
                                                [=]() {
                                                  gbbs::free_array(v_data, n);
                                                  gbbs::free_array(edges, m);
                                                },
                                                (edge_type*)edges);
}

Best,
Daniel

@tomtseng
Copy link
Member

Moving deletion_fn() to the beginning of the function looks fine (and probably better) to me.

I think I'm confused about why this issue occurs when using the deletion_fn from sym_graph_from_edges() though.

static inline symmetric_graph<symmetric_vertex, Wgh> sym_graph_from_edges(
  ...
  return symmetric_graph<symmetric_vertex, Wgh>(v_data, n, m,
                                                [=]() {
                                                  gbbs::free_array(v_data, n);
                                                  gbbs::free_array(edges, m);
                                                },
                                                (edge_type*)edges);
}

Shouldn't the deletion function there be capturing v_data and edges as defined in sym_graph_from_edges? So even if this.v_data and this.e0 get reassigned (like in the move assignment operator), the deletion function should still be freeing the original values of v_data and e0

@dnezam
Copy link
Contributor Author

dnezam commented Nov 23, 2022

You make an excellent point - it really shouldn't matter. The problem happens in the following code (please ignore that the implementation of the algorithm isn't entirely correct yet):

#pragma once

#include "gbbs/gbbs.h"
#include "gbbs/contract.h"
#include "benchmarks/LowDiameterDecomposition/MPX13/LowDiameterDecomposition.h"
#include "benchmarks/Spanner/MPXV15/Spanner.h"
#include <iostream>

// LSPES - Low-Stretch Parallel Exponential Shift
namespace gbbs {
    namespace LowAvgStretch {
        using edge = gbbs::spanner::edge;

        template <
                template <class W> class vertex, class W,
                // This means W(weight) is not defined, thus unweighted graph
                typename std::enable_if<std::is_same<W, gbbs::empty>::value, int>::type = 0>
        inline parlay::sequence<edge> LSPES(symmetric_graph<vertex, W>& G) {
            using cluster_and_parent = gbbs::spanner::cluster_and_parent;

            auto edge_list = parlay::sequence<edge>();

            while (G.n > 0) {
                // Determine clusters + implicit trees within clusters (taken from Spanner.h)
                auto cluster_and_parents = gbbs::spanner::LDD_parents(G, 0.2, true);

                // Determine edges within clusters (taken from Spanner.h)
                auto tree_edges_with_loops = parlay::delayed_seq<edge>(G.n, [&](size_t i) {
                    return std::make_pair(i, cluster_and_parents[i].parent);
                });
                auto tree_edges = parlay::filter(tree_edges_with_loops, [&](const edge& e) {
                    return e.first != e.second;
                });
                edge_list.append(parlay::make_slice(tree_edges));

                // Contract the graph
                auto clusters = parlay::map(cluster_and_parents, [](const cluster_and_parent& cp) {
                    return cp.cluster;
                });
                size_t num_clusters = contract::RelabelIds(clusters);
                
                auto c_out = contract::contract(G, clusters, num_clusters);

                G = std::move(std::get<0>(c_out));
            }
            return edge_list;
        }
    }
}

The SEGFAULT happens in auto c_out = contract::contract(G, clusters, num_clusters); (namely fetch_intercluster_small) on the second iteration. When I compile with fsanitize=address, the problematic access happens auto cluster_and_parents = gbbs::spanner::LDD_parents(G, 0.2, true); (heap-use-after-free).

Graph G is generated using gbbs::gbbs_io::read_unweighted_symmetric_graph_custom(input);

Definition of move/copy constructor/assignment:

// Move constructor
  symmetric_graph(symmetric_graph&& other) noexcept {
    n = other.n;
    m = other.m;
    v_data = other.v_data;
    e0 = other.e0;
    vertex_weights = other.vertex_weights;
    deletion_fn = std::move(other.deletion_fn);
    other.v_data = nullptr;
    other.e0 = nullptr;
    other.vertex_weights = nullptr;
    other.deletion_fn = []() {};
  }

  // Move assignment
  symmetric_graph& operator=(symmetric_graph&& other) noexcept {
    n = other.n;
    m = other.m;
    v_data = other.v_data;
    e0 = other.e0;
    vertex_weights = other.vertex_weights;
    deletion_fn();
    deletion_fn = std::move(other.deletion_fn);
    other.v_data = nullptr;
    other.e0 = nullptr;
    other.vertex_weights = nullptr;
    other.deletion_fn = []() {};
    return *this;
  }

  // Copy constructor
  symmetric_graph(const symmetric_graph& other) {
    debug(std::cout << "Copying symmetric graph." << std::endl;);
    n = other.n;
    m = other.m;
    v_data = gbbs::new_array_no_init<vertex_data>(n);
    e0 = gbbs::new_array_no_init<edge_type>(m);
    parallel_for(0, n, [&](size_t i) { v_data[i] = other.v_data[i]; });
    parallel_for(0, m, [&](size_t i) { e0[i] = other.e0[i]; });
    deletion_fn = [=]() {
      gbbs::free_array(v_data, n);
      gbbs::free_array(e0, m);
      if (vertex_weights != nullptr) {
        gbbs::free_array(vertex_weights, n);
      }
    };
    vertex_weights = nullptr;
    if (other.vertex_weights != nullptr) {
      vertex_weights = gbbs::new_array_no_init<vertex_weight_type>(n);
      parallel_for(
          0, n, [&](size_t i) { vertex_weights[i] = other.vertex_weights[i]; });
    }
  }

  ~symmetric_graph() { deletion_fn(); }

If I pull the deletion_fn() to the beginning, I don't get a SEGFAULT. I have also tried to copy the graph before I pass it to a function, but in that case I got a double free error.

@tomtseng
Copy link
Member

Hmm I'm not sure what's going on either. I tried running your code snippet on my machine with fsanitize=address and didn't see an error. If we think deletion_fn() is the issue, maybe you could put print statements inside the deletion_fn to check what it's deleting and print out the old and new addresses of v_data and e0 in the move assignment operator to see if they match up correctly with what deletion_fn is deleting.

If you can't figure out what's happening or if you just don't want to spend more time debugging this issue, you can open a PR moving deletion_fn() to the beginning and I'll approve it

@dnezam
Copy link
Contributor Author

dnezam commented Jan 18, 2023

Sorry for the late response! I couldn't really figure out what the issue was, so I opened a PR as you said. As mentioned in the PR, I haven't tested the fix with anything besides my custom implementation of a low-stretch spanning tree algorithm, but it shouldn't break any other implementations/algorithms/benchmarks. (But then again, as we discussed above, this shouldn't change anything in the first place, yet it fixes my issue.)

tomtseng pushed a commit that referenced this issue Jan 18, 2023
PR by: dnezam

Fixes issue described in #80 by calling `deletion_fn()` at the very beginning.

Note that I only tested this fix in the context of my own algorithm implementation. I did not test this for any other benchmark/algorithm provided by gbbs.
@dnezam dnezam closed this as completed Jan 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants