Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow energy calculation #1025

Closed
randomir opened this issue Oct 12, 2021 · 3 comments · Fixed by #1043
Closed

Slow energy calculation #1025

randomir opened this issue Oct 12, 2021 · 3 comments · Fixed by #1043
Labels

Comments

@randomir
Copy link
Member

randomir commented Oct 12, 2021

In dimod 0.10.7:

In [1]: import numpy 
   ...: import dimod 
   ...:  
   ...: bqm = dimod.generators.anti_crossing_clique(150) 
   ...:  
   ...: arr = numpy.zeros((50000, len(bqm.variables)), dtype=numpy.int8) 
   ...: var = list(bqm.variables) 
   ...:  
   ...: samples_like = dimod.as_samples((arr, var)) 
   ...:  
   ...: %timeit energies = bqm.energies(samples_like) 
   ...:  
9.46 s ± 106 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In dimod 0.9.16:

In [1]: import numpy 
   ...: import dimod 
   ...:  
   ...: bqm = dimod.generators.anti_crossing_clique(150) 
   ...:  
   ...: arr = numpy.zeros((50000, len(bqm.variables)), dtype=numpy.int8) 
   ...: var = list(bqm.variables) 
   ...:  
   ...: samples_like = dimod.as_samples((arr, var)) 
   ...:  
   ...: %timeit energies = bqm.energies(samples_like) 
   ...:  
1.15 s ± 17.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

For perspective how slow this is, in dimod 0.9/0.10, using AdjVectorBQM:

In [1]: import numpy 
   ...: import dimod 
   ...:  
   ...: bqm = dimod.AdjVectorBQM(dimod.generators.anti_crossing_clique(150))
   ...:  
   ...: arr = numpy.zeros((50000, len(bqm.variables)), dtype=numpy.int8) 
   ...: var = list(bqm.variables) 
   ...:  
   ...: samples_like = dimod.as_samples((arr, var)) 
   ...:  
   ...: %timeit energies = bqm.energies(samples_like) 
   ...:  
368 ms ± 11.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
@arcondello
Copy link
Member

Ok - I did some digging. The performance regression comes from the custom iterators that were implemented in #818 and #827. It's not too surprising that they are slower than std::vector<std::pair<int, float>>::iterator but the performance hit is pretty rough. Measured on my system with g++

#include <cassert>
#include <ctime>
#include <iostream>

#include "dimod/adjvectorbqm.h"
#include "dimod/quadratic_model.h"

int num_trials = 10;
int num_variables = 150;

int main() {
    float Q[num_variables * num_variables];
    std::fill_n(Q, num_variables * num_variables, 1);

    auto bqm = dimod::BinaryQuadraticModel<double>(Q, num_variables,
                                                   dimod::Vartype::BINARY);

    auto adj = dimod::AdjVectorBQM<int, double>(Q, num_variables);

    clock_t begin_time = clock();
    for (auto ti = 0; ti < num_trials; ++ti) {
        for (auto vi = 0; vi < num_variables; ++vi) {
            auto span = bqm.neighborhood(vi);
            while (span.first != span.second) {
                ++span.first;
            }
        }
    }
    std::cout << "bqm: ";
    std::cout << float(clock() - begin_time) / CLOCKS_PER_SEC << "\n";

    begin_time = clock();
    for (auto ti = 0; ti < num_trials; ++ti) {
        for (auto vi = 0; vi < num_variables; ++vi) {
            auto span = adj.neighborhood(vi);
            while (span.first != span.second) {
                ++span.first;
            }
        }
    }
    std::cout << "adj: ";
    std::cout << float(clock() - begin_time) / CLOCKS_PER_SEC << "\n";
}

Gives

$ g++ scratch.cpp -o scratch -I dimod/include/ && ./scratch
bqm: 0.00958
adj: 0.001355

I think there is some low-hanging performance fruit to pick

@randomir
Copy link
Member Author

I guess this partially explains dwavesystems/dwave-cloud-client#487 as well.

@arcondello
Copy link
Member

Indeed, though in that case the python object creation has more of a (constant) effect. E.g.

import time

import dimod
import numpy as np

num_trials = 5

bqm = dimod.generators.anti_crossing_clique(150)

adj = dimod.AdjVectorBQM(bqm)

t = time.perf_counter()
for _ in range(num_trials):
    for _ in adj.iter_quadratic():
        pass
print('adjvector', time.perf_counter() - t)

t = time.perf_counter()
for _ in range(num_trials):
    for _ in bqm.iter_quadratic():
        pass
print('new bqm  ', time.perf_counter() - t)

the difference is

adjvector 0.0030581199999915043
new bqm   0.003743052999993779

not insignificant, but not quite as bad as in energy calculation which is done entirely at the cython level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants