@ProGTX ProGTX commented Dec 12, 2018

  • Use handler::copy instead of copying in kernel
  • Initialize output data to zero
  • Only create single temporary buffer object
  • Use on-chip memory for temporary buffer objects
  • Made compute_index a single function
  • Added existing sample from master
    • As tiled-convolution-standard
  • input_data_info
  • Changed OpenCL parameters
  • Addressed many PR comments
  • constexpr matrix_size_t
  • Typo fixes and formatting
  • Validate returns bool
  • No need to specify global buffer target
  • CMake cleanup
  • Display profiling
  • Synced with master

DuncanMcBain and others added 13 commits October 30, 2018 17:02
This option has existed for some time, but was never explicitly written
in the main CMake file. While adding it, some old unused lines were
removed from the CMake file (the directory has not been set by
FindComputeCpp.cmake for a long time, so the lines had no effect).
The current behaviour of `get_node` in the virtual pointer mapper is to
use `lower_bound` to find the first buffer whose address is not less than
the input pointer. If the pointer mapper has only allocated one buffer, and
we are looking up a pointer offset into that buffer, then the call to
`lower_bound` will return an iterator pointing past the end of the map.
This means we cannot dereference this iterator, as it does not point
into the map, but we still want to decrement it to get the buffer that
the virtual pointer is referencing.

By adding a check for whether the iterator is the end iterator we can
short-circuit the iterator dereference, and so avoid this problem without
otherwise changing the behaviour of the pointer mapper.
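The lookup described above can be sketched as follows. This is a minimal illustration, not the pointer mapper's actual code: `node_t`, `pointer_map_t` and `get_node_sketch` are hypothetical names, and the map is assumed to be keyed by each buffer's base address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>

// Hypothetical stand-ins for the pointer mapper's types; not the real API.
struct node_t {
  std::size_t size;  // size of the allocation in bytes
};
using pointer_map_t = std::map<std::uintptr_t, node_t>;

// Returns the map entry whose address range contains `ptr`.
pointer_map_t::const_iterator get_node_sketch(const pointer_map_t& map,
                                              std::uintptr_t ptr) {
  if (map.empty()) {
    throw std::out_of_range("no buffers allocated");
  }
  // First entry whose base address is not less than `ptr`; may be end().
  auto it = map.lower_bound(ptr);
  // Test for end() first so the dereference is short-circuited.
  if (it != map.end() && it->first == ptr) {
    return it;  // `ptr` is exactly the base address of a buffer
  }
  if (it == map.begin()) {
    throw std::out_of_range("pointer precedes every buffer");
  }
  // Step back to the buffer that starts below `ptr`. Decrementing end()
  // is valid here because the map is not empty.
  --it;
  if (ptr - it->first >= it->second.size) {
    throw std::out_of_range("pointer is past the end of the buffer");
  }
  return it;
}
```

With a single buffer registered at address 1000, looking up 1100 makes `lower_bound` return `end()`; the guard skips the dereference and the decrement then lands on the buffer at 1000.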
…_vptr

This commit removes the unnecessary usage of the buffer allocator in …
The type buffer_t is defined in the class PointerMapper and redefined
to a different type in get_buffer which results in shadowing warnings.
SYCLmalloc doesn't have this issue but the alias should probably be
changed for consistency.
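The kind of shadowing described here can be illustrated with a small sketch. `MapperSketch` and `element_is_byte` are hypothetical names chosen for this example and do not reflect the real `PointerMapper` interface; the point is only that a local alias with a distinct name does not hide the class-scope one.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the mapper class; names are illustrative only.
struct MapperSketch {
  using buffer_t = std::uint8_t;  // class-scope alias

  // Redeclaring `buffer_t` inside the function would hide the class-scope
  // alias and can trigger a shadowing warning:
  //   using buffer_t = T;
  // A distinct local name avoids the clash.
  template <typename T>
  static constexpr bool element_is_byte() {
    using typed_buffer_t = T;
    return std::is_same<typed_buffer_t, buffer_t>::value;
  }
};
```

Inside `element_is_byte`, `buffer_t` still refers to the class-scope alias, so both types stay visible under unambiguous names.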
Property lists are useful, and we still want to pass them to buffers
through the SYCL malloc interface. In addition, a related test was
failing because of an interface change, and has been removed.
Contains changes designed to improve the reliability of
FindComputeCpp.cmake, brought on by the MSVC policy of supporting
only two Clang versions at any given time.

* Added way to check MSVC STL/Clang compatibility
* Fix and simplify force-include for MSVC
* Ensure device compiler arguments are always stored as CMake list
Small sample showing how to use built-in kernels in ComputeCpp.
Fixes issue codeplaysoftware#126 but allows for optional ComputeCpp use.
Specifically this shows a device selector that fails to find a device.
Similar effect could be achieved with the fallback queue.
This should fix codeplaysoftware#105.
Moving these compiler checks to a separate module keeps the main
FindComputeCpp.cmake clean and makes the checks opt-in, so a project
that does not need them can leave them out.
* Use `handler::copy` instead of copying in kernel
* Initialize output data to zero
* Only create single temporary buffer object
* Use on-chip memory for temporary buffer objects
* Made `compute_index` a single function
* Added existing sample from master
  * As `tiled-convolution-standard`
* `input_data_info`
* Changed OpenCL parameters
* Addressed many PR comments
* `constexpr matrix_size_t`
* Typo fixes and formatting
* Validate returns `bool`
* No need to specify global buffer target
* CMake cleanup
* Display profiling
@ProGTX ProGTX changed the base branch from master to Tiled-CONV December 12, 2018 16:25
@mehdi-goli mehdi-goli merged commit b0a8426 into mehdi-goli:Tiled-CONV Dec 12, 2018