Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DynamicView with host size grow #1206

Closed
bathmatt opened this issue Oct 31, 2017 · 3 comments
Closed

DynamicView with host size grow #1206

bathmatt opened this issue Oct 31, 2017 · 3 comments
Assignees
Labels
Feature Request Create new capability; will potentially require voting
Milestone

Comments

@bathmatt
Copy link

@hcedwar and I talked yesterday about this. He agreed that Kokkos will add a dynamic view like the one that is there but growable on the host. This won't use the memory pool and should be able to deal with fragmentation. It will also not require that the scalar type be a 2^n size. So, I can have

class my_class { int a, b,c;}
DynamicView<my_class*> my_data( initial_size, chunk_size=1024);

and I can
my_data.resize(new_size);

@ibaned ibaned added the Feature Request Create new capability; will potentially require voting label Nov 1, 2017
@ibaned ibaned added this to the 2017 December milestone Nov 1, 2017
@hcedwar hcedwar modified the milestones: 2017 December, 2018 February Nov 29, 2017
@hcedwar
Copy link
Contributor

hcedwar commented Dec 18, 2017

@crtrott @ndellingwood @bathmatt
This is a must-do for Matt by the February 2018 milestone. I suggest Nathan.

@dsunder dsunder added the Blocks Promotion Overview issue for release-blocking bugs label Jan 24, 2018
ndellingwood added a commit to ndellingwood/kokkos that referenced this issue Jan 26, 2018
Initial commit to modify DynamicView such that:
1. remove need for memory pool by...
2. resize only functions from the host outside of parallel_* regions
     reason - remove need for memory pool

Due to removing the memory pool, the ptr-array-to-chunks, i.e. m_chunks,
is now allocated in CudaUVM space when CudaSpace is enabled; chunks are
still allocated in CudaSpace.

Current impl API asks for (label, max_extent, min_chunk_size)
Max extent is asked for to avoid resizing the m_chunks array, which
could lead to possible issues if multiple DynamicViews are copies of
each other, sharing the allocation.

Compiled with Serial backend so far...

TODO
Cleanup - comments, if-guarded previous code
Unit tests need to be added
Testing
ndellingwood added a commit to ndellingwood/kokkos that referenced this issue Jan 26, 2018
ndellingwood added a commit to ndellingwood/kokkos that referenced this issue Jan 30, 2018
ndellingwood added a commit to ndellingwood/kokkos that referenced this issue Jan 30, 2018
ndellingwood added a commit to ndellingwood/kokkos that referenced this issue Jan 30, 2018
Ccommit to modify DynamicView such that:
1. remove need for memory pool by...
2. resize only functions from the host outside of parallel_* regions
     reason - remove need for memory pool

Due to removing the memory pool, the ptr-array-to-chunks, i.e. m_chunks,
is now allocated in CudaUVM space when CudaSpace is enabled; chunks are
still allocated in CudaSpace.

Current impl API asks for (label, min_chunk_size, max_extent)
Max extent is asked for to avoid resizing the m_chunks array, which
could lead to possible issues if multiple DynamicViews are copies of
each other, sharing the allocation, and one of them requires resize of
the m_chunks array (to pointers of chunks)
ndellingwood added a commit to ndellingwood/kokkos that referenced this issue Jan 30, 2018
ndellingwood added a commit to ndellingwood/kokkos that referenced this issue Jan 30, 2018
@ndellingwood
Copy link
Contributor

@bathmatt I started working on this feature request and have an initial WIP PR #1378. I'll summarize some of the changes (API and behavior), could you give some feedback on the changes and whether this direction meets your use cases or needs changes?

  1. resize can now only be called on the host outside of parallel_* calls.
  2. resize_serial member function is used for resizing; the name was not changed nor was the API
  3. Use of memory pool was removed, resize_parallel member function removed
  4. DynamicView API (current PR) would require the 3 following arguments: (1) label, (2) chunk_size, (3) extent_upperbound, so

DynamicView< Scalar*, ExecSpace > dv( "dv", chunk_size, extent_upperbound) ;

I can add a fourth argument initial_size as well, and if I make it optional I can default it to 0 but it would need to be the final argument to the constructor.

The extent of the DynamicView (provided via resize) does not need to be 2^n, though internally the chunk_size will rounded up to a nearest power-of-two for faster bit operations during indexing and resizing.

The extent_upperbound is an upperbound on number of entries the DynamicView will receive. This was added to avoid having the inner array of pointers to chunks needing to be resized, which could be problematic in the case where there are multiple copies of a DynamicView and resize is called (where the array of pointers to must be resized) because only the DynamicView on which the resize is called will have the new array. Working around this to allow resize of the array of pointers would likely come at a performance hit to enforce some type of synchronization of the array of pointers between the views.

ndellingwood added a commit to ndellingwood/kokkos that referenced this issue Feb 1, 2018
size() and extent() return the extent requested by the user via the
resize_serial() method.
The actual extent available ( number of chunks * chunk size ) is
returned by the allocation_extent() method.
This commit contributes toward issue kokkos#1206
ndellingwood added a commit to ndellingwood/kokkos that referenced this issue Feb 1, 2018
…e-1206

* 'issue-1206' of github.com:ndellingwood/kokkos:
  Issue kokkos#1206 - fix order of args to DynamicView in test_sort
  Issue kokkos#1206: Fix DynamicView API in test_sort in algorithms
  DynamicView: Address issue kokkos#1206
  Attempt to get rid of warning
  Fix issue in deep_copy changes
  Fix an issue with the benchmark suite after changes in macros
  Fix warning with CUDA for OpenMP nthreads unused variable
  Fix issue kokkos#1269
  Fix deep_copy between empty views issue kokkos#1369
  Adding OpenMP InterOp test issue kokkos#1305
  Fix CUDA interoperability and add unit test
  Fix issue kokkos#1363 : Deepcopy between rank-1 views with LayoutLeft/Right
  Adding ChunkSize constructor overload to RangePolicy.
  Error out when -arch not detected
crtrott added a commit to ndellingwood/kokkos that referenced this issue Feb 5, 2018
crtrott added a commit that referenced this issue Feb 5, 2018
@crtrott crtrott added InDevelop and removed Blocks Promotion Overview issue for release-blocking bugs labels Feb 5, 2018
@crtrott
Copy link
Member

crtrott commented Feb 5, 2018

Just pushed this into develop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature Request Create new capability; will potentially require voting
Projects
None yet
Development

No branches or pull requests

6 participants