@oitel oitel commented Feb 25, 2025

No description provided.

@oitel oitel requested a review from Grantim February 25, 2025 11:23
Comment on lines 121 to 134
size_t CudaAccessor::fromGridMemory( const Mesh& mesh, const Vector3i& )
{
return fastWindingNumberMeshMemory( mesh ) + size_t( dims.x ) * dims.y * dims.z * sizeof( float );
return fastWindingNumberMeshMemory( mesh );
}

size_t CudaAccessor::fromVectorMemory( const Mesh& mesh, size_t inputSize )
size_t CudaAccessor::fromVectorMemory( const Mesh& mesh, size_t )
{
return fastWindingNumberMeshMemory( mesh ) + inputSize * ( sizeof( float ) + sizeof( Vector3f ) );
return fastWindingNumberMeshMemory( mesh );
}

size_t CudaAccessor::selfIntersectionsMemory( const Mesh& mesh )
{
return fastWindingNumberMeshMemory( mesh ) + mesh.topology.faceSize() * sizeof( float );
return fastWindingNumberMeshMemory( mesh );
}

Should we add some minimum fixed amount here?

Comment on lines 14 to 15
/// chunk index
size_t index;

Do we need to store the index in the chunk?

Contributor Author

It is useful for logging and progress callbacks.
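
To illustrate that use, here is a minimal sketch. Only the `index` field comes from the code under review; the other `Chunk` fields and the names `splitByChunks` and `chunkProgress` are assumptions made for this example, and overlap between chunks is ignored for brevity.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical chunk layout; only `index` is taken from the reviewed code.
struct Chunk
{
    std::size_t offset; // first element covered by the chunk
    std::size_t size;   // number of elements in the chunk
    std::size_t index;  // chunk index
};

// Split [0, totalSize) into consecutive chunks of at most chunkSize elements.
inline std::vector<Chunk> splitByChunks( std::size_t totalSize, std::size_t chunkSize )
{
    std::vector<Chunk> chunks;
    if ( chunkSize == 0 )
        return chunks;
    for ( std::size_t offset = 0; offset < totalSize; offset += chunkSize )
        chunks.push_back( { offset, std::min( chunkSize, totalSize - offset ), chunks.size() } );
    return chunks;
}

// Fraction of the work done once the given chunk has been processed.
inline float chunkProgress( const Chunk& chunk, std::size_t chunkCount )
{
    return float( chunk.index + 1 ) / float( chunkCount );
}
```

With the index stored in the chunk, a loop body can report progress or log which chunk failed without tracking a separate counter.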


const auto size = totalSize - overlap; // otherwise the last chunk's size may be smaller or equal to the overlap i.e. fully in the previous chunk
const auto step = chunkSize - overlap;
return ( size / step ) + !!( size % step ); // integer variant of `std::ceil( a / b )`

return (size + step - 1) / step;
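
For reference, a small sketch comparing the two forms of integer ceiling division. The function names `ceilDivModulo` and `ceilDivAddend` are hypothetical and exist only for this comparison; both assume `step > 0`.

```cpp
#include <cstddef>

// Form currently in the PR: quotient, plus one if there is a remainder.
inline std::size_t ceilDivModulo( std::size_t size, std::size_t step )
{
    return ( size / step ) + !!( size % step );
}

// Form suggested in this comment: bias the numerator before dividing.
inline std::size_t ceilDivAddend( std::size_t size, std::size_t step )
{
    return ( size + step - 1 ) / step;
}
```

The two agree whenever `size + step - 1` does not wrap around. The modulo form has no such limit, which only matters for sizes close to the maximum value of `size_t`.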

Comment on lines 39 to 42
const Dipole* __restrict__ dipoles;
const Node3* __restrict__ nodes;
const float3* __restrict__ meshPoints;
const FaceToThreeVerts* __restrict__ faces;

Should we initialize these with nullptrs?
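
A minimal sketch of what that could look like with default member initializers. The struct name `MeshPointers` and the `valid()` helper are hypothetical, the element types are forward-declared placeholders (in CUDA code `float3` is a real type), and `__restrict__` is left out so the sketch compiles with any C++ compiler.

```cpp
// Placeholder declarations standing in for the real types.
struct Dipole;
struct Node3;
struct float3;
struct FaceToThreeVerts;

// Hypothetical holder for the four device pointers.
struct MeshPointers
{
    const Dipole* dipoles = nullptr;
    const Node3* nodes = nullptr;
    const float3* meshPoints = nullptr;
    const FaceToThreeVerts* faces = nullptr;

    // True only when every pointer has been set.
    bool valid() const { return dipoles && nodes && meshPoints && faces; }
};
```

A default-constructed instance then holds well-defined null pointers rather than indeterminate values, so a forgotten assignment can be detected instead of being dereferenced.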


const auto q = ( meshPoints[face.verts[0]] + meshPoints[face.verts[1]] + meshPoints[face.verts[2]] ) / 3.0f;
processPoint( q, resVec[index], dipoles, nodes, meshPoints, faces, beta, index );
processPoint( q, resVec[index], dipoles, nodes, meshPoints, faces, beta );

Should we pass faceIndex instead of index here? (Please compare with the CPU version.)

res.resize( size );
CUDA_LOGE_RETURN_UNEXPECTED( data_->cudaPoints.fromVector( points ) );
// TODO: allow user to set the upper limit
const auto maxBufferBytes = getCudaAvailableMemory();

Let's use ~70-80% of the max available memory, just in case.
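
One way to apply such a margin, as a sketch. The helper name `limitBufferBytes` and the 75% default are assumptions for illustration, not something already in the code base.

```cpp
#include <cstddef>

// Hypothetical helper: keep only `percent` percent of the reported free memory.
inline std::size_t limitBufferBytes( std::size_t availableBytes, std::size_t percent = 75 )
{
    if ( percent > 100 )
        percent = 100;
    // Divide first so the multiplication cannot overflow;
    // the result is rounded down by less than 100 bytes.
    return availableBytes / 100 * percent;
}
```

The call site would then read roughly `const auto maxBufferBytes = limitBufferBytes( getCudaAvailableMemory() );`, assuming `getCudaAvailableMemory()` returns a byte count.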

Comment on lines +93 to +94
DynamicArray<float3> cudaPoints;
CUDA_LOGE_RETURN_UNEXPECTED( cudaPoints.resize( bufferSize ) );

In this case each element is a float3 (3 floats), while bufferSize was computed for one float per element.
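
A sketch of sizing the buffer by element type instead. `maxElementCount` is a hypothetical helper, and `Float3` is a stand-in for CUDA's `float3` so that the example compiles without the CUDA headers.

```cpp
#include <cstddef>

// Stand-in for CUDA's float3 (three packed floats).
struct Float3
{
    float x, y, z;
};

// Hypothetical helper: how many elements of type T fit into maxBytes.
template <typename T>
constexpr std::size_t maxElementCount( std::size_t maxBytes )
{
    return maxBytes / sizeof( T );
}
```

For the same byte budget, `maxElementCount<Float3>` yields a third of `maxElementCount<float>`, which is the discrepancy pointed out above.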

@oitel oitel merged commit 937c0df into master Feb 25, 2025
32 checks passed
@oitel oitel deleted the feature/cuda_buffer_slice branch February 25, 2025 16:10