Weird results on A100 when using faiss-gpu==1.6.3. All distances are 2.0. #2076

Closed
zhangxinyu-xyz opened this issue Oct 9, 2021 · 3 comments

Comments

@zhangxinyu-xyz

Summary

I get unexpected results on an A100: every distance returned by faiss.bruteForceKnn is 2.0. The faiss-gpu version is 1.6.3.

On a P40 or K40, the results are normal.
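
A constant 2.0 is a recognizable value for squared L2 distance: for unit-norm vectors, ||x - y||^2 = 2 - 2 x·y, so 2.0 is what results when the inner-product term is zero. The thread does not say whether target_features is L2-normalized or what actually went wrong on the A100, so the NumPy sketch below only illustrates that identity. The normalization, the zeroed product, and the helper name squared_l2_via_gemm are all assumptions made for illustration.

```python
import numpy as np

def squared_l2_via_gemm(xq, xb, zero_inner_product=False):
    """Squared L2 distances expanded as ||q||^2 + ||b||^2 - 2 q.b."""
    q_norms = (xq ** 2).sum(axis=1, keepdims=True)   # shape (nq, 1)
    b_norms = (xb ** 2).sum(axis=1)[np.newaxis, :]   # shape (1, nb)
    inner = xq @ xb.T                                # shape (nq, nb)
    if zero_inner_product:
        # Hypothetical failure mode: the matrix multiply yields only zeros.
        inner = np.zeros_like(inner)
    return q_norms + b_norms - 2.0 * inner

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16)).astype(np.float32)
x /= np.linalg.norm(x, axis=1, keepdims=True)        # assumption: unit-norm features

healthy = squared_l2_via_gemm(x, x)
broken = squared_l2_via_gemm(x, x, zero_inner_product=True)

print("healthy self-distances:", np.round(np.diag(healthy), 4))
print("distinct values when the product is zeroed:", np.unique(np.round(broken, 4)))
```

If the real features are not normalized, a zeroed product would give ||q||^2 + ||b||^2 instead, so this pattern is a hint to check rather than a diagnosis.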

The differences between the A100 and P40 setups are:
PyTorch version: A100: 1.9.0 (CUDA 11.1); P40: 1.4.0 (CUDA 10.0)
System CUDA version: A100: 11.2; P40: 10.1
Ubuntu version: A100: 18.04.5; P40: 16.04.6
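
When two machines differ in this many ways, it can help to capture the versions programmatically on each one and compare the output. The following is a minimal sketch; collect_env is a hypothetical helper name, and it reports None for any package that is not installed rather than failing.

```python
import importlib
import platform

def collect_env():
    """Gather version info relevant to a GPU bug report; missing packages map to None."""
    info = {"python": platform.python_version(), "os": platform.platform()}
    for name in ("torch", "faiss"):
        try:
            module = importlib.import_module(name)
            info[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            info[name] = None
    if info["torch"] is not None:
        import torch
        info["torch_cuda"] = torch.version.cuda      # CUDA version PyTorch was built against
        if torch.cuda.is_available():
            info["gpu"] = torch.cuda.get_device_name(0)
            info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info

if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

The compute capability is worth including because a binary built for an older CUDA toolkit may not contain kernels for a newer GPU architecture.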

Platform

OS: Ubuntu

Faiss version: 1.6.3 (pip install faiss-gpu==1.6.3)

Running on: GPU

Interface: Python

@mdouze
Contributor

mdouze commented Oct 10, 2021

Could you post some reproduction code?

@zhangxinyu-xyz
Author

zhangxinyu-xyz commented Oct 11, 2021

> Could you post some reproduction code?

import faiss
import torch

# Note: target_features, swig_ptr_from_FloatTensor and swig_ptr_from_LongTensor
# are not defined in this snippet.

def search_raw_array_pytorch(res, xb, xq, k, D=None, I=None, metric=faiss.METRIC_L2):
    # Database (xb) and queries (xq) must live on the same device.
    assert xb.device == xq.device

    # Determine the memory layout of the query matrix.
    nq, d = xq.size()
    if xq.is_contiguous():
        xq_row_major = True
    elif xq.t().is_contiguous():
        xq = xq.t()    # I initially wrote xq:t(), Lua is still haunting me :-)
        xq_row_major = False
    else:
        raise TypeError('matrix should be row or column-major')

    xq_ptr = swig_ptr_from_FloatTensor(xq)

    # Same layout check for the database matrix.
    nb, d2 = xb.size()
    assert d2 == d
    if xb.is_contiguous():
        xb_row_major = True
    elif xb.t().is_contiguous():
        xb = xb.t()
        xb_row_major = False
    else:
        raise TypeError('matrix should be row or column-major')
    xb_ptr = swig_ptr_from_FloatTensor(xb)

    # Allocate the output distances if the caller did not provide them.
    if D is None:
        D = torch.empty(nq, k, device=xb.device, dtype=torch.float32)
    else:
        assert D.shape == (nq, k)
        assert D.device == xb.device

    # Allocate the output indices if the caller did not provide them.
    if I is None:
        I = torch.empty(nq, k, device=xb.device, dtype=torch.int64)
    else:
        assert I.shape == (nq, k)
        assert I.device == xb.device

    D_ptr = swig_ptr_from_FloatTensor(D)
    I_ptr = swig_ptr_from_LongTensor(I)

    faiss.bruteForceKnn(res, metric,
                xb_ptr, xb_row_major, nb,
                xq_ptr, xq_row_major, nq,
                d, k, D_ptr, I_ptr)

    return D, I

# Call site: search the 30 nearest neighbours of each feature among all features.
res = faiss.StandardGpuResources()
res.setDefaultNullStreamAllDevices()
_, initial_rank = search_raw_array_pytorch(res, target_features, target_features, 30)
initial_rank = initial_rank.cpu().numpy()

Thanks for your reply!

mdouze added the GPU label Oct 19, 2021
@mdouze
Contributor

mdouze commented Jan 19, 2022

The code is not self-contained.
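
For reference, a self-contained reproduction for this kind of report might look like the sketch below. It is not the author's code. It replaces target_features with synthetic random data, computes an exact CPU reference in NumPy, and runs the GPU search through faiss.GpuIndexFlatL2 instead of faiss.bruteForceKnn with PyTorch tensors, so it may not exercise exactly the same code path. The helper names reference_knn and gpu_knn_or_none are hypothetical.

```python
import numpy as np

def reference_knn(xb, xq, k):
    """Exact k-nearest neighbours by squared L2 distance, computed on the CPU."""
    diff = xq[:, np.newaxis, :] - xb[np.newaxis, :, :]    # shape (nq, nb, d)
    dist = (diff ** 2).sum(axis=2)                         # shape (nq, nb)
    idx = np.argsort(dist, axis=1, kind="stable")[:, :k]
    return np.take_along_axis(dist, idx, axis=1), idx

def gpu_knn_or_none(xb, xq, k):
    """Run the same search with Faiss on the GPU, or return None if that is not possible."""
    try:
        import faiss
    except ImportError:
        return None
    if not hasattr(faiss, "StandardGpuResources") or faiss.get_num_gpus() == 0:
        return None
    res = faiss.StandardGpuResources()
    index = faiss.GpuIndexFlatL2(res, xb.shape[1])         # brute-force L2 index on the GPU
    index.add(xb)
    return index.search(xq, k)

rng = np.random.default_rng(0)
xb = rng.standard_normal((200, 32)).astype(np.float32)    # synthetic stand-in for target_features
k = 30

D_ref, I_ref = reference_knn(xb, xb, k)
result = gpu_knn_or_none(xb, xb, k)

if result is None:
    print("Faiss GPU not available; only the CPU reference was computed.")
else:
    D_gpu, I_gpu = result
    print("max |D_gpu - D_ref| =", float(np.abs(D_gpu - D_ref).max()))
    print("all GPU distances equal 2.0:", bool(np.all(D_gpu == 2.0)))
```

Running the same script on both the A100 and the P40 machine and posting the two outputs would show whether the problem depends on the data or only on the environment.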

mdouze closed this as completed Jan 19, 2022