Problems with new distance method #15

Closed
vsmaier opened this issue Feb 24, 2016 · 7 comments

@vsmaier

vsmaier commented Feb 24, 2016

When using 4 locations for both parameters I would again expect a symmetrical matrix as a result.

print(gpulocations.4)
Source: gpuR Matrix [4 x 3]

          [,1]      [,2]     [,3]
[1,] -305.0518 -3396.980 2010.261
[2,] -239.3692 -3421.902 1976.606
[3,]  742.4894 -2864.292 2630.251
[4,]  248.7933 -3382.905 2041.504

and calling

print( distance(gpulocations.4,gpulocations.4))
Source: gpuR Matrix [4 x 4]

           [,1]       [,2]      [,3] [,4]
[1,]    0.00000   77.89696 1328.7159    0
[2,]   77.89696    0.00000 1304.6937    0
[3,] 1328.71592 1304.69372       NaN    0
[4,]  554.90411  493.99902  926.9942    0

There is some oddity with the last column: it is all 0s.
The NaN at [3,3] is also off. Every entry on the diagonal should have been 0.

Cross checking with

print( distance(gpulocations.4,gpulocations.4, method="sqEuclidean"))
Source: gpuR Matrix [4 x 4]

            [,1]        [,2]          [,3] [,4]
[1,]       0.000    6067.936  1.765486e+06    0
[2,]    6067.936       0.000  1.702226e+06    0
[3,] 1765486.005 1702225.705 -3.725290e-09    0
[4,]  307918.566  244035.032  8.593182e+05    0

This again shows 0s in the last column. [3,3] is now a small negative number. All other matrix elements seem to be the proper positive squares.

I ran some additional distance computations with sets of 3 and 4 locations. They showed a similar pattern of issues. There also seems to be a difference in the allocated size of the result: the distance between 3 and 4 locations has a different result size from the distance between 4 and 3. There seems to be some implicit ordering assumption.

print( distance(gpulocations.4,gpulocations.3))
Source: gpuR Matrix [4 x 4]

          [,1]     [,2]      [,3] [,4]
[1,]  430.9922 820.3234 1384.9828    0
[2,]  479.5628 749.3972 1365.7792    0
[3,] 1133.9736 978.7315  104.8199    0
[4,]  694.6057 307.7565 1010.0145    0

print( distance(gpulocations.3,gpulocations.4))
Source: gpuR Matrix [3 x 3]

          [,1]      [,2]      [,3]
[1,]  430.9922  479.5628 1133.9736
[2,]  820.3234  749.3972  978.7315
[3,] 1384.9828 1365.7792  104.8199

print( distance(gpulocations.3,gpulocations.3))
Source: gpuR Matrix [3 x 3]

         [,1]         [,2]     [,3]
[1,]    0.000 1.000599e+03 1161.673
[2,] 1000.599 6.103516e-05 1076.708
[3,] 1161.673 1.076708e+03      NaN

print( distance(gpulocations.3,gpulocations.3, method="sqEuclidean"))
Source: gpuR Matrix [3 x 3]

        [,1]         [,2]          [,3]
[1,]       0 1.001199e+06  1.349484e+06
[2,] 1001199 3.725290e-09  1.159300e+06
[3,] 1349484 1.159300e+06 -3.725290e-09

Hope this helps.
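
The NaN entries reported above line up with the tiny negative squared distances: in R, sqrt() of a negative number is NaN, and sqrt(3.725290e-09) is about 6.1e-05, which matches the non-zero [2,2] entry in the 3-location output. The following is a minimal base-R sketch of how such values can arise and how clamping at zero guards against them. It assumes the squared distances come from the expanded form |x|^2 + |y|^2 - 2 x.y, which is a common approach but is not confirmed for gpuR in this thread. `sq_dist_expanded` and `safe_euclidean` are hypothetical helper names.

```r
# Points copied from the gpulocations.4 printout above.
x <- matrix(c(-305.0518, -3396.980, 2010.261,
              -239.3692, -3421.902, 1976.606,
               742.4894, -2864.292, 2630.251,
               248.7933, -3382.905, 2041.504),
            ncol = 3, byrow = TRUE)

# Hypothetical helper: squared Euclidean distances via the expanded form.
# Rounding can leave tiny negative values where the true distance is 0.
sq_dist_expanded <- function(x, y) {
  outer(rowSums(x^2), rowSums(y^2), "+") - 2 * tcrossprod(x, y)
}

# Hypothetical helper: clamp at zero before taking the square root so that
# rounding noise cannot turn into NaN.
safe_euclidean <- function(x, y) {
  sqrt(pmax(sq_dist_expanded(x, y), 0))
}

# The square root of a small negative number is NaN.
print(suppressWarnings(sqrt(-3.725290e-09)))

d <- safe_euclidean(x, x)
print(d)
```

With the clamp in place the diagonal can still carry rounding noise on the order of 1e-05, but it can no longer be NaN. The off-diagonal values can be compared against as.matrix(dist(x)).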

@cdeterman
Owner

@vsmaier thanks for reporting this, not sure how that slipped through. The fixes have been pushed. Note that I have added a warning when using distance and the matrices are identical. In this case it would be better to use dist as it will use less device memory.

Also, FYI, the vclMatrix class is also present. Operations will be much quicker on this class as the object stays on the device.

@vsmaier
Author

vsmaier commented Feb 25, 2016

Hi Charles,

I am now getting the warning when using identical sets of vectors (I have
used those for quick debugging because of the easily predictable results and
the possibility to compare to dist() ):

> print( distance(gpu4tuple,gpu4tuple, method="sqEuclidean"))
Source: gpuR Matrix [4 x 4]
        [,1]      [,2]    [,3]      [,4]
[1,]       0 4696270.5 1231907 6106199.9
[2,] 4696271       0.0 1285522  135733.7
[3,] 1231907 1285522.2       0 2194613.1
[4,] 6106200  135733.7 2194613       0.0
Warning message:
In distance(gpu4tuple, gpu4tuple, method = "sqEuclidean") :
  x is the same as y, did you mean to use 'dist' instead?

So I think I am on the newest version. Maybe there is a way to include
version information or the commit #? I'll check with a colleague who has done
something similar for Mercurial. Maybe it applies to git as well.
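
One way to check which build is installed, sketched in base R. This assumes the package was installed from GitHub by a tool that records `RemoteRef` and `RemoteSha` fields in the installed DESCRIPTION file. If the installer did not write those fields, they come back as NA. `installed_build` is a hypothetical helper name, not part of gpuR.

```r
# Hypothetical helper: report the installed version of a package and, when
# available, the branch and commit it was built from. The RemoteRef and
# RemoteSha fields are only present if the installer recorded them.
installed_build <- function(pkg) {
  desc <- utils::packageDescription(pkg)
  pick <- function(field) {
    value <- desc[[field]]
    if (is.null(value)) NA_character_ else as.character(value)
  }
  c(version = as.character(utils::packageVersion(pkg)),
    ref     = pick("RemoteRef"),
    sha     = pick("RemoteSha"))
}

# A base package has a version but no recorded commit.
print(installed_build("stats"))
```

Calling installed_build("gpuR") after a GitHub install should then show the branch and commit hash, provided the fields were recorded.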

The issue with the [3,3] elements in the results is also gone it seems. So
that aspect is fixed.

The sizing of the result however is still a little off.

The size of the result seems to be determined by the first parameter of
distance.

When using matrices representing 3 and 4 points the result is a 3x3
matrix, not 3x4.

> print( distance(gpu3tuple,gpu4tuple, method="sqEuclidean"))
Source: gpuR Matrix [3 x 3]
        [,1]     [,2]    [,3]
[1,] 4194724 382302.3 1513996
[2,] 4800105 593112.3 2029144
[3,] 6435393 682864.8 2939459

When using matrices representing 4 and 3 points the result is a 4x4
matrix, not 4x3.

> print( distance(gpu4tuple,gpu3tuple, method="sqEuclidean"))
Source: gpuR Matrix [4 x 4]
          [,1]      [,2]      [,3] [,4]
[1,] 4194724.2 4800104.6 6435392.8    0
[2,]  382302.3  593112.3  682864.8    0
[3,] 1513996.0 2029143.9 2939459.2    0
[4,]  408982.0  485557.4  337527.8    0
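
For cross-checking the expected shapes on the CPU, here is a small base-R sketch. `ref_sq_dist` is a hypothetical reference helper, not part of gpuR, and the coordinates are made up rather than taken from this thread. It treats rows as points, so n points against m points should give an n x m result, and swapping the arguments should give the transpose.

```r
# Hypothetical reference implementation for cross-checking on the CPU.
# Rows are points; the result has one row per point in x and one column
# per point in y.
ref_sq_dist <- function(x, y) {
  stopifnot(ncol(x) == ncol(y))
  out <- matrix(0, nrow = nrow(x), ncol = nrow(y))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(nrow(y))) {
      out[i, j] <- sum((x[i, ] - y[j, ])^2)
    }
  }
  out
}

set.seed(1)                         # made-up coordinates for illustration
pts3 <- matrix(rnorm(9),  ncol = 3) # 3 points in 3 dimensions
pts4 <- matrix(rnorm(12), ncol = 3) # 4 points in 3 dimensions

d34 <- ref_sq_dist(pts3, pts4)
d43 <- ref_sq_dist(pts4, pts3)
print(dim(d34))                     # 3 4
print(dim(d43))                     # 4 3
print(all.equal(d43, t(d34)))       # swapping arguments transposes the result
```

Read against this, the first three columns of the 4 x 4 output above appear to be the expected 4 x 3 result with an extra column of 0s, while the 3 x 3 output appears to be missing its fourth column.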

I have not yet looked at the vclMatrix - that may have to wait a little bit.

I am also now running into frequent R crashes. While it may be my platform
(I am using a Mac Pro), I suspect that it has to do with memory
allocation/deallocation related to the sizing of the results matrix and/or
temporary internal storage.

Unfortunately, I have been mostly a Windows guy in the past, so I'm still
having issues with the development environment. Once I catch up I hope to
be able to locate the appropriate code sections.


@cdeterman
Owner

@vsmaier Bah, that's what I get for recycling some existing code. I believe I now have it fully functional. I have added several more unit tests to be certain of different scenarios. I will leave this issue open until I get confirmation from you that the functions are behaving appropriately.

Regarding versions, sorry if it wasn't clear. I keep my develop branch distinct from master in case something breaks when adding new features, and to maintain some stability in master. I should make this clear in the README for interested users. I plan to merge back to master once I solve the 'switching GPU problem' in issue #9. As such, requested features will always be pushed to the develop branch first, which can be installed with:

devtools::install_github('cdeterman/gpuR', ref = 'develop')

Regarding the R crashes, is this happening within a single R session or after it restarts? Note that an R session restarts after a package is re-installed if it is loaded. If the former is true, then there is a bigger problem. If the latter, then it isn't a major problem. When an R session is restarted, either by package installation or by opening and closing Rstudio for example, the pointers of the gpuMatrix/vclMatrix objects will no longer be valid. The objects will have been erased. This is something I need to make clear in the documentation and vignettes as well. I currently don't have a way to save these objects between sessions. I would like to have a means of validating the pointers to return an error instead of crashing but that is another project.

@vsmaier
Author

vsmaier commented Feb 26, 2016

I did some tests and the results do match what I was expecting.

I have not encountered the same type of R crashes/terminating unexpectedly that I got with prior versions.

I did encounter another issue. I am currently able to compute 600,000 x 10 distances. Using the vclMatrix I am getting absolutely stunning performance. Longer term I am looking to compute 10,000,000 to 20,000 distances. When breaking the problem into small subsets I found that after repeated calls to allocate sample sets like this

gpublocks <- gpuR::vclMatrix( as.matrix( mblocks[sample(1:nrow(blocks), 600000,replace=FALSE),] ))

and using them in computations, the machine slows down to near unresponsiveness. I would describe this as similar to swapping out memory. Yet from what I can tell there is still ample memory and little CPU load.

From this I realized that I do not know how to free the memory space associated with a vclMatrix. I think my repeated calls are effectively causing a memory leak on the GPU.
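
In R, memory held behind a handle is typically released by a finalizer, which runs only once the last reference is gone and a garbage collection happens. The base-R sketch below illustrates that mechanism with a stand-in buffer. It is not gpuR code, and whether gpuR frees device memory in this way is an assumption. `alloc_buffer`, `release`, and `state` are hypothetical names.

```r
# Stand-in for a device buffer: an environment with a finalizer that
# records when the "allocation" is released. This is NOT gpuR code.
state <- new.env()
state$live <- 0L                # number of buffers not yet released

release <- function(buf) {
  state$live <- state$live - 1L
}

alloc_buffer <- function(n) {
  buf <- new.env()
  buf$data <- numeric(n)
  state$live <- state$live + 1L
  reg.finalizer(buf, release)   # run release() when buf is collected
  buf
}

for (chunk in 1:5) {
  buf <- alloc_buffer(1000)
  # ... use buf in a computation ...
  rm(buf)                       # drop the last reference to this chunk
  invisible(gc())               # collect now so the finalizer can run
}

print(state$live)               # buffers still outstanding after the loop
```

If the same pattern applies to vclMatrix objects, calling rm() on each block followed by gc() before allocating the next one would be the thing to try.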

@cdeterman
Owner

@vsmaier interesting, can you provide your mblocks and blocks objects so I can try and replicate this problem?

@vsmaier
Author

vsmaier commented Feb 26, 2016

blocks is available at https://dl.dropboxusercontent.com/u/1419660/BLOCKS.csv.zip
mblocks is the same data set but with the longrec column dropped. Realize I got a little bit careless there.
You are looking at 2010 US Census block group coordinates transformed into Cartesian space.

@cdeterman
Owner

@vsmaier given that there is no longer an issue directly related to dist or distance, I am closing this issue and opening a separate one, #16, to address your recent concern.
