This repository has been archived by the owner on Aug 11, 2020. It is now read-only.

'sum_rows' error, when the dimension of tensor is (4,1). #8

Closed
reyoung opened this issue May 28, 2014 · 5 comments

@reyoung
Contributor

reyoung commented May 28, 2014

This error occurs when the assignment operator executes, and it appears to originate at this line. CUDA reports no error, and the POSIX error string is 'File exists'.

I tested this simple program with CUDA 5.5 and CUDA 6, and it fails on both.

inline void onExitPrintError(){
    cudaError_t err = cudaGetLastError();
    if(err != cudaSuccess)
    {
        // print the CUDA error message and exit
        printf("CUDA error: %s\n", cudaGetErrorString(err));
    }
    printf("Posix errno %s\n",strerror(errno));

}

int main(){
    InitTensorEngine(1);
    atexit(onExitPrintError);
    TensorContainer<cpu, 2> a;
    a.Resize(Shape2(4,1));

    a[0][0] = 0.0f;
    a[1][0] = 1.0f;
    a[2][0] = 1.0f;
    a[3][0] = 0.0f;

    TensorContainer<gpu, 2> gpu_a;
    gpu_a.Resize(Shape2(4,1));
    Copy(gpu_a,a);


    TensorContainer<gpu, 1> b;
    b.Resize(Shape1(1));

    b = sum_rows(gpu_a);

    TensorContainer<cpu, 1> c;
    c.Resize(b.shape);
    Copy(c,b);
    for(int i=0;i<c.shape[0];++i){
        cout<< c[i]<<endl;
    }

    ShutdownTensorEngine();
    return 0;
}
@tqchen
Member

tqchen commented May 28, 2014

Thanks for reporting the issue. At this point, I am not sure what is happening. Does the program successfully reach the end?

@reyoung
Contributor Author

reyoung commented May 28, 2014

No, tensor c is never printed. The sum_rows line makes the program exit suddenly. However, if the result is not assigned (that is, if the 'b = ' part is removed from that line), the program does not exit.

@tqchen
Member

tqchen commented May 28, 2014

Hmm, then this could be a bug we overlooked. Thanks!

@antinucleon
Contributor

I believe it is NVIDIA's black magic and we are innocent.
Try this code:

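#include <cstdio>
#include <cstring>          // strerror
#include <cerrno>           // errno
#include <cuda_runtime.h>   // cudaMalloc, cudaFree

int main() {
    float *fp = NULL;
    // Compare errno before and after a single CUDA runtime call.
    printf("Before: Posix errno %s\n", strerror(errno));
    cudaMalloc((void**)&fp, 16 * sizeof(float));
    printf("After: Posix errno %s\n", strerror(errno));
    cudaFree(fp);
    return 0;
}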

@SkidanovAlex

I just hit the same issue, where sum_rows suddenly exits. In case someone else encounters it: in my case it turned out that __CUDA_ARCH__ was not defined (mshadow actually gives a warning about it). Setting __CUDA_ARCH__ to 100 solved the problem.
