Open
Description
With the code below, the output on an Intel Arc B60 is:
memalloc_time=18.989502819720656, calculation_time=25.28791649499908
On a B70:
memalloc_time=18.39343382231891, calculation_time=40.41595908906311
The B70 should be faster than the B60, yet its calculation_time is about 60% longer. Could you help me pinpoint the problem?
Are there any tools I can use to analyze this?
Thanks.
import time

import dpnp as np

# Build a float32 array of 4 * 4096 * 4096 elements
# (the 4-element list is repeated 4096 * 4096 times on the host first).
start_time = time.perf_counter()
a = np.array([0.1, 0.01, 0.001, 0.0001] * 4096*4096, dtype=np.float32)
end_time = time.perf_counter()
memalloc_time = end_time - start_time

# Time 10000 back-to-back cumsum() calls on the same array.
i = 0
start_time = time.perf_counter()
while i < 10000:
    b = a.cumsum()
    i += 1
end_time = time.perf_counter()
calculation_time = end_time - start_time

print(
    f"memalloc_time={memalloc_time}, calculation_time={calculation_time}"
)

I also tried other functions such as cumprod(), linalg.pinv(), and linalg.solve(); all of them show a similar symptom.
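
One way to narrow this down might be to time each call separately instead of the whole loop. The loop above gives a single total, which cannot tell a slow first call (for example one-time kernel compilation) apart from uniformly slow calls. The helper below is only a sketch: `time_op` and `summarize` are hypothetical names rather than part of dpnp, and the workload under `__main__` is a pure-Python stand-in so the sketch runs without a GPU.

```python
import statistics
import time


def time_op(fn, sync=None, warmup=3, repeats=20):
    """Return per-call wall-clock times in seconds for fn().

    sync, if given, is called after each fn() so that any work submitted
    asynchronously to a device has finished before the clock is read.
    """
    def run_once():
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        return time.perf_counter() - start

    for _ in range(warmup):
        run_once()  # discarded: may include one-time setup costs
    return [run_once() for _ in range(repeats)]


def summarize(samples):
    """Summarize a list of timings as min, median, and max."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Stand-in workload (assumption: replace with the real operation).
    data = list(range(100_000))
    print(summarize(time_op(lambda: sum(data))))
```

For the script above, `fn` would be `lambda: a.cumsum()`. If the array exposes its SYCL queue as `a.sycl_queue`, passing `a.sycl_queue.wait` as `sync` should make each sample cover the finished computation and not only the submission. Comparing the per-call median on the B60 and the B70 would show whether the gap is in every call or concentrated in a few outliers.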