-
Notifications
You must be signed in to change notification settings - Fork 529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Corrected field example (#3369) #3375
base: master
Are you sure you want to change the base?
Conversation
src/api/c/vector_field.cpp
Outdated
// initialize the full out_ptns & out_dirs arrays, because join of 2D points | ||
// will only partially fill those arrays | ||
Array<T> out_pnts = createValueArray<T>(odims, 0.); | ||
Array<T> out_dirs = createValueArray<T>(odims, 0.); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since dimensions are similar, I don't see the need for createValueArray
, it adds an unnecessary mem set call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the same thought when looking into the problem, although will debugging I discovered the unexpected usage.
We have 4 entry scenario's (and I use the quantities of field example)
- transpose=true
a) pnts.dims()=[1156,2,1,1], dirs.dims()=[1156,2,1,1] --> odims.T()=[3,1156,1,1] --> field example
b) pnts.dims()=[1156,3,1,1], dirs.dims()=[1156,3,1,1] --> odims.T()=[3,1156,1,1] - transpose=false
a) pnts.dims()=[2,1156,1,1], dirs.dims()=[2,1156,1,1] --> odims=[3,1156,1,1]
b) pnts.dims()=[3,1156,1,1], dirs.dims()=[3,1156,1,1] --> odims=[3,1156,1,1]
PS: For multiple array points, the output array will by [3,sum of all points of all input arrays, 1,1]
For 2D inputs + transponse, the join will copy X & Y and leaving the Z uninitialized.
input array storage (X11,X12,...,X1n, Y11,Y12,...,Y1n) & (X21,X22,...,X2, Y21,Y22,...,Y2n) & ...
--> odims (X11,X12,...X1n,X21,X22,..X2n,... , Y11,Y12,..Y1n,Y21,Y22,..Y2n,..., ?,?,?,?...) !!Z-axis uninitialized!!
--> odims.T() (X11,Y11,?, X12,Y12,?, ...)
For 3D inputs the full result array will be written.
As alternative to pre-initializing the full output array, we can also add the constant array of 0. I did not do so, because I wanted to remain as close to possible to the original code.
PS: The original calculation of odims took the number of arrays iso the total number of points. This could generate a memory overflow on the GPU buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your previous statement, did make me doubt so I returned to a very old version of arrayfire 3.7.2, where I saw that the conversion to 3D wasn't done neither although It displayed a better result.
Finally, only the dimensions were wrong, resulting in a possible buffer overrun on the GPU.
So I willy push a new version, with lesser changes although with better results.
Array<T> out_pnts = createEmptyArray<T>(odims); | ||
Array<T> out_dirs = createEmptyArray<T>(odims); | ||
detail::join(out_pnts, 1, pnts); | ||
detail::join(out_dirs, 1, dirs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uninitialized buffer used to collect partial join (2D -> 3D)
It has been a while I looked at this section. Can you please elaborate since you have been looking at this recently. The dimensions of the arrays are identical here and join writes to output arrays, overwriting uninitialized values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late response, I missed this question.
Depending on the request for transpose or not, the dimensions are different.
With transpose == true --> [N,1,1] + [N,1,1] + [N,1,1] --> [N,3,1]
With transpose == false --> [1,N,1] + [1,N,1] + [1,N,1] --> [3,N,1]
Remark: the order of the elements in the input data buffer is the same for both options. The dimensions are however different.
In the old implementation of the join methods, the dimensions were not verified with the risk of "out of buffer" writes, with unpredictable results as a consequence. In many cases the reserve of an oversized buffer is used (unnoticed) or the next array in memory is overwritten (unpredictable effect).
Corrected buffer overflow in vector_field Join now accepts any buffer as out array, as long as it is large enough.
Uninitialized buffer used to collect partial join (2D -> 3D)
Join now accepts any buffer as out array, as long as it is large enough.
Description
Fixes: #3369
Changes to Users
Non
Checklist