AMD compatibility issue and workaround (MISALIGNED_SUB_BUFFER_OFFSET) #10

Closed
sjstreicher opened this issue Dec 13, 2017 · 0 comments

sjstreicher commented Dec 13, 2017

First, thanks for a great package!

Although the latest release does not officially support AMD OpenCL devices, I really wanted to get this working as I have numerous AMD GPUs available. As far as I could tell, the only real compatibility issue with AMD cards is that sub-buffer offsets have to be multiples of 256 bytes. I spent some time researching the proper way of adjusting the offsets without success, but I suspect this would be rather trivial for someone skilled in OpenCL.

In the meantime I found a simple workaround that might be of benefit to others trying to run analyses on AMD cards:

In my experience, the first call to fail is usually this one:

d_var2 = d_pointset.get_sub_region(
    self.sizeof_float * signallength * var1dim,
    self.sizeof_float * signallength * var2dim,
    cl.mem_flags.READ_ONLY)

Since self.sizeof_float is 4 bytes, the offset is a multiple of 256 bytes for every value of var1dim as long as signallength is a multiple of 64 (256 / 4).
In turn, signallength is determined by (number of samples - max lag sources) * number of replications. If the number of samples is much larger than 64, it is easy to adjust it slightly to satisfy this constraint without any real impact on the analysis, however inelegant that might seem.

For example, the following demo runs successfully on my AMD RX580:

# Import classes
from idtxl.multivariate_te import MultivariateTE
from idtxl.data import Data

# a) Generate test data
data = Data()
data.generate_mute_data(n_samples=2053, n_replications=5)

# b) Initialise analysis object and define settings
network_analysis = MultivariateTE()
settings = {'cmi_estimator': 'OpenCLKraskovCMI',
            'max_lag_sources': 5,
            'min_lag_sources': 1,
            'debug': False}

# (n_samples - max_lag_sources) * n_replications must be a multiple of
# 256 / 4 = 64 for compatibility with AMD cards.
# Here: (2053 - 5) * 5 = 10240 = 160 * 64.

# c) Run analysis
results = network_analysis.analyse_network(settings=settings, data=data)
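
The constraint in the comment above can be checked, or satisfied automatically, before running the analysis. The following is only a minimal sketch in plain Python and is not part of IDTxl: the helper names (signal_length, is_amd_compatible, largest_compatible_n_samples) are hypothetical, and it assumes the 256-byte alignment and the signallength formula described in this issue.

```python
from math import gcd

# Assumptions taken from this issue, not from the IDTxl source:
ALIGNMENT_BYTES = 256  # required sub-buffer offset alignment on the AMD card
SIZEOF_FLOAT = 4       # bytes per float32 value


def signal_length(n_samples, max_lag_sources, n_replications):
    """Signal length as described above (hypothetical helper)."""
    return (n_samples - max_lag_sources) * n_replications


def is_amd_compatible(n_samples, max_lag_sources, n_replications):
    """True if the signal length is a multiple of 256 / 4 = 64 floats."""
    floats_per_block = ALIGNMENT_BYTES // SIZEOF_FLOAT
    length = signal_length(n_samples, max_lag_sources, n_replications)
    return length % floats_per_block == 0


def largest_compatible_n_samples(n_samples, max_lag_sources, n_replications):
    """Largest sample count <= n_samples that satisfies the constraint."""
    floats_per_block = ALIGNMENT_BYTES // SIZEOF_FLOAT
    # (n - max_lag_sources) must be a multiple of this step so that
    # (n - max_lag_sources) * n_replications is a multiple of 64.
    step = floats_per_block // gcd(n_replications, floats_per_block)
    usable = ((n_samples - max_lag_sources) // step) * step
    if usable <= 0:
        raise ValueError("too few samples to satisfy the alignment")
    return max_lag_sources + usable


print(is_amd_compatible(2053, 5, 5))             # True
print(largest_compatible_n_samples(2000, 5, 5))  # 1989
```

With these helpers, a data set of 2000 samples would be trimmed to 1989 samples before being passed to the estimator, since (1989 - 5) * 5 = 9920 = 155 * 64. Trimming is only one option; padding the data instead would avoid discarding samples.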
pwollstadt added a commit that referenced this issue May 21, 2018
Add Pedro's version of the data padding for AMD cards, Change unit
tests accordingly.

Fixes #10.