Add a vector constructor which allows a customized part. #12


byzhang
Contributor

@byzhang byzhang commented Aug 10, 2012

Summary:
This constructor is important for correctly partitioning variants of sparse matrices.
For example, consider the following matrix A (n*m):
0 0 0
0 1 2
0 0 0
0 0 0
3 0 4
0 0 0
As there are many empty rows, an efficient storage format for computation is:
row = [1 4] // row numbers
idx = [0 2 4] // pointers to col and val
col = [1 2 0 2] // col numbers
val = [1 2 3 4] // values
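
As an illustration of the storage format above, here is a minimal sketch that builds the four arrays from a dense matrix. The `CompressedRows` struct and the `compress` function are hypothetical names used only for this example; they are not part of VexCL.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for the storage scheme described above.
struct CompressedRows {
    std::vector<std::size_t> row; // indices of the nonempty rows
    std::vector<std::size_t> idx; // idx[i]..idx[i+1] delimit row[i]'s entries in col/val
    std::vector<std::size_t> col; // column index of each nonzero
    std::vector<double>      val; // value of each nonzero
};

// Build the structure from a dense row-major matrix, skipping empty rows.
CompressedRows compress(const std::vector<std::vector<double>> &A) {
    CompressedRows m;
    m.idx.push_back(0);
    for (std::size_t i = 0; i < A.size(); ++i) {
        bool nonempty = false;
        for (std::size_t j = 0; j < A[i].size(); ++j) {
            if (A[i][j] != 0.0) {
                m.col.push_back(j);
                m.val.push_back(A[i][j]);
                nonempty = true;
            }
        }
        if (nonempty) {
            // Record the row number and close its range in col/val.
            m.row.push_back(i);
            m.idx.push_back(m.col.size());
        }
    }
    return m;
}
```

Applied to the 6x3 matrix above, this should reproduce the `row`, `idx`, `col`, and `val` arrays listed.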

Suppose there are two dense vectors y and x, where y = A * x.
Assuming n >> m, then to efficiently use multiple devices for y=Ax,
we need to partition y and A across multiple devices, and copy x to multiple devices.
However, we need to build A's partition scheme to align with y's.
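For example, if y is evenly partitioned across 3 devices, then the arrays are split as follows (row and idx hold device-local values, and part lists each array's per-device offsets):
row = [1 0] (with part = {0, 1, 1, 2})
idx = [0 2 0 2] (with part = {0, 2, 2, 4})
col = [1 2 0 2] (with part = {0, 2, 2, 4})
val = [1 2 3 4] (with part = {0, 2, 2, 4})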
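
The split above can be sketched on the host as follows. This is only an illustration under stated assumptions: `DevicePart` and `partition_rows` are hypothetical names, the global `row` array is assumed to be sorted in ascending order, and `y_part` is assumed to hold the partition boundaries of y (here {0, 2, 4, 6}).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-device slice of the matrix.
struct DevicePart {
    std::vector<std::size_t> row; // nonempty rows, relative to the device's slice of y
    std::vector<std::size_t> idx; // local pointers into col/val (empty if the device has no rows)
    std::vector<std::size_t> col; // column indices (x is copied whole, so these stay global)
    std::vector<double>      val; // values
};

// Split the global arrays so that device d receives the nonempty rows
// lying in [y_part[d], y_part[d+1]).
std::vector<DevicePart> partition_rows(
        const std::vector<std::size_t> &row,
        const std::vector<std::size_t> &idx,
        const std::vector<std::size_t> &col,
        const std::vector<double>      &val,
        const std::vector<std::size_t> &y_part)
{
    if (y_part.size() < 2) return {};
    std::vector<DevicePart> parts(y_part.size() - 1);
    std::size_t i = 0; // position in the global row array
    for (std::size_t d = 0; d + 1 < y_part.size(); ++d) {
        DevicePart &p = parts[d];
        for (; i < row.size() && row[i] < y_part[d + 1]; ++i) {
            if (p.idx.empty()) p.idx.push_back(0);
            p.row.push_back(row[i] - y_part[d]); // shift to a device-local row index
            for (std::size_t k = idx[i]; k < idx[i + 1]; ++k) {
                p.col.push_back(col[k]);
                p.val.push_back(val[k]);
            }
            p.idx.push_back(p.col.size());
        }
    }
    return parts;
}
```

Concatenating the per-device arrays in device order gives the combined arrays shown above, and the cumulative sizes of the per-device arrays give the corresponding `part` offsets.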

Denis, if my understanding is right, another potential approach is to implement a new subclass of sparse_matrix in spmat.hpp that contains row, idx, col, and val. However, mul() doesn't support a local copy of x, and things become more complex when we want to use custom kernels. So it seems this sparse matrix variant doesn't fit well into the current SpMat class. Correct me if I'm wrong.

Summary:
This constructor is important for correctly partitioning variants of sparse matrices.
For example, consider the following matrix A (n*m):
0 0 0
0 1 2
0 0 0
0 0 0
0 0 0
3 0 4
As there are many empty rows, an efficient storage format for computation is:
nr  = 2         // number of nonempty rows (nr << n)
row = [1 5]     // row numbers
col = [1 2 0 2] // col numbers
val = [1 2 3 4] // values
idx = [0 2 4]   // pointers to col and val

Suppose there are two dense vectors y and x, where y = A * x.
Assuming n >> m, then to efficiently use multiple devices for y=Ax,
we need to partition y and A across multiple devices, and copy x to multiple devices.
However, we need to build A's partition scheme to align with y's.
@ddemidov
Owner

Custom partitioning of a single vector is not possible in VexCL. All vectors are consistently partitioned in order to allow multi-device processing (each device processes corresponding parts of all vectors participating in an expression).

You can define your own partitioning scheme, but it would be applied to all vex::vectors. I think a custom kernel is well suited to your situation. What I would do is:

  1. Allocate vector y(ctx.queue(), n).
  2. Allocate vectors x on each device (std::vector<vex::vector<T>>)
  3. Partition matrix according to y's partitioning. d-th partition of the matrix gets nonzero rows between y.part_start(d) and y.part_start(d+1).
  4. Launch kernel on each device d, submitting the following parameters:
    1. mtx[d] components (nr, idx, row, col, val)
    2. d-th partition of y (y(d))
    3. copy of x on d-th device: xd
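
The data flow of these steps can be emulated on the host with plain `std::vector`s. The sketch below is not VexCL code: `LocalMatrix`, `spmv_kernel`, and `multiply` are hypothetical names, `part_start` stands in for the values returned by `y.part_start(d)`, and the "kernel" is an ordinary function rather than an OpenCL kernel.

```cpp
#include <cstddef>
#include <vector>

// Matrix slice owned by one device (step 3); row indices are local to
// that device's part of y.
struct LocalMatrix {
    std::size_t nr = 0;                     // number of nonempty rows on this device
    std::vector<std::size_t> idx, row, col; // same meaning as in the storage scheme above
    std::vector<double> val;
};

// Host stand-in for the kernel of step 4. On a real device the outer loop
// would be distributed over work-items, one nonempty row each.
void spmv_kernel(const LocalMatrix &m, const std::vector<double> &xd, double *yd) {
    for (std::size_t i = 0; i < m.nr; ++i) {
        double sum = 0.0;
        for (std::size_t k = m.idx[i]; k < m.idx[i + 1]; ++k)
            sum += m.val[k] * xd[m.col[k]];
        yd[m.row[i]] = sum;
    }
}

// Steps 1, 2 and 4: allocate y, give every "device" its own copy of x,
// then run the kernel once per device on that device's part of y.
// part_start must hold one more entry than there are devices.
std::vector<double> multiply(const std::vector<LocalMatrix> &mtx,
                             const std::vector<std::size_t> &part_start,
                             const std::vector<double> &x) {
    std::vector<double> y(part_start.back(), 0.0);      // step 1
    std::vector<std::vector<double>> xd(mtx.size(), x); // step 2
    for (std::size_t d = 0; d < mtx.size(); ++d)        // step 4
        spmv_kernel(mtx[d], xd[d], y.data() + part_start[d]);
    return y;
}
```

Rows with no nonzeros are never touched by the kernel, so y has to be zero-initialized (as done here) or cleared before the launch.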

@byzhang
Contributor Author

byzhang commented Aug 10, 2012

Thank you for the detailed suggestion!
Thanks,
-B


@ddemidov ddemidov closed this Aug 10, 2012
@ddemidov
Owner

Forgot to mention: mtx[d] components are independent vex::vectors allocated on d-th device similarly to x.
