consistent size and index types #191
Comments
FWIW, the SYCL spec landed on using The fact that the GPU BLAS APIs typically use int indexes makes things fun here. Bottom line, we don't want silent failures if overflow happens, even in release builds.
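One way to avoid a silent failure at the boundary with an int-based BLAS-style API is a checked narrowing conversion. This is only a sketch under assumptions: `checked_int` is a hypothetical helper, not something gtensor provides, and throwing is just one possible failure policy.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not part of gtensor): convert a size_t extent to the
// int expected by a BLAS-style API, throwing instead of silently wrapping.
// Unlike assert(), this check stays active when NDEBUG is defined, i.e. in
// release builds.
inline int checked_int(std::size_t n)
{
  if (n > static_cast<std::size_t>(std::numeric_limits<int>::max())) {
    throw std::overflow_error("extent does not fit in int");
  }
  return static_cast<int>(n);
}
```

A caller would wrap each extent it hands to the int-based API, e.g. `checked_int(a.size())`, so an oversized array fails loudly at the call site.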
I guess I'll agree that while there was some intention behind the different size for multi-d index vs 1-d index, it's really not possible to do this in a way that doesn't cause potential subtle overflow bugs. So probably what we should do is to use Maybe actually
Yes, this is subtle. As a start, I am using size_t for index calculations inside kernels and for SYCL, and whenever iterating over container.size() or calc_size(shape). This involves no change to the external interface.
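A rough sketch of what "size_t for index calculations" can look like while the shape stays int-based. The names `calc_size_sketch` and `linear_index_2d` are illustrative stand-ins, not the gtensor implementations, and the sketch assumes non-negative extents, a column-major layout, and a 64-bit size_t.

```cpp
#include <cstddef>
#include <vector>

// Illustrative stand-ins, not the gtensor definitions.
using size_type = std::size_t;
using shape_type = std::vector<int>;

// Multiply the int extents in a size_type accumulator, so the product is not
// truncated to int even when the total exceeds INT_MAX.
// Assumes every extent is non-negative.
inline size_type calc_size_sketch(const shape_type& shape)
{
  size_type total = 1;
  for (int extent : shape) {
    total *= static_cast<size_type>(extent);
  }
  return total;
}

// Linear offset of element (i, j) in a 2-d array, assuming column-major
// layout. Each operand is widened before the multiply, so the arithmetic
// happens in size_type rather than int.
inline size_type linear_index_2d(const shape_type& shape, int i, int j)
{
  return static_cast<size_type>(i) +
         static_cast<size_type>(j) * static_cast<size_type>(shape[0]);
}
```

With a 65536 x 65536 shape, each extent fits comfortably in int, but both the total size and the offset of the last column exceed INT_MAX, which is the case an int-only calculation would get wrong.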
There are potentially 3 types at play here:
I'm inclined to make gt::index_type a standard that should be used e.g. for
Having both size_type and index_type is a bit confusing here. One is 1-d, the other is N-d, but of course 1-d is just a special case of N-d, so it's weird.
Right, I think I just wrote something along those same lines in another thread.
I think an interesting experiment here would be to change all indexes to size_type, change the default size_type to uint32_t, and see if (a) tests, benchmarks, etc. pass, (b) GENE passes its tests, and (c) performance of benchmarks and/or GENE improves significantly. If the answer is yes, then I think it's worth doing. Other than some reverse-iteration loops in bandsolver, which can be rewritten, I don't see negative indexes being used anywhere. It would also be interesting to see if using uint32_t consistently works around the ROCm compiler bug with -O2.
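For the reverse-direction loops, one common rewrite that works with an unsigned index is to test before decrementing. This is a generic sketch, not the bandsolver code; `reverse_indices` is a hypothetical function that just records the order in which indices are visited.

```cpp
#include <cstdint>
#include <vector>

// Visit indices n-1, ..., 0 with an unsigned index. The naive form
// `for (i = n - 1; i >= 0; i--)` never terminates for unsigned i, because
// `i >= 0` is always true. Comparing first and decrementing afterwards
// (`i-- > 0`) stops cleanly once i reaches 0, and also handles n == 0.
inline std::vector<std::uint32_t> reverse_indices(std::uint32_t n)
{
  std::vector<std::uint32_t> visited;
  for (std::uint32_t i = n; i-- > 0;) {
    visited.push_back(i);
  }
  return visited;
}
```

The loop body sees the same index values as the signed version would, so the rewrite should be mechanical as long as nothing inside the loop relies on the index going negative.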
Currently gtensor uses an int-based shape_type, while low-level indexing, like the gtensor_storage index operator, uses gt::size_type, which is an alias for std::size_t (typically unsigned long). Furthermore, the calc_size(shape) helper returns size_type, but there are places in the code which index over a total size with int or cast it to int.

To make matters more interesting, the dim3 type in CUDA and HIP is uint32_t (actually unsigned int for CUDA, but that is generally uint32_t). It seems like we should at least be consistent with dim3 types?