Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
Updated May 17, 2024 - Python
Implementation of a Transformer, but completely in Triton
Fast deterministic all-Python Lennard-Jones particle simulator that utilizes Numba for GPU-accelerated computation.
Python implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mining frequent item sets in a set of transactions.
Boilerplate for GPU-accelerated TensorFlow and PyTorch code on an M1 MacBook
PyCUDA implementation of forward propagation for Convolutional Neural Networks
🌟 Vertex-centric approach for building GNNs/TGNNs
Fundamentals of heterogeneous parallel programming with CUDA C/C++ at the beginner level.
Companion code for the bilibili video "Introduction to CUDA 12.1 Parallel Programming (Python Edition)"
VGG16 inference implementation using TensorFlow, NumPy, and PyCUDA
A helper package to easily time Numba CUDA GPU events ⌛
GPU programming using CUDA & Python
A Bifrost plug-in for the Tensor-Core Correlator.
CUDA accelerated raytracer using PyCUDA in Python
Efficient and scalable physics-informed deep learning and scientific machine learning on top of TensorFlow for multi-worker distributed computing
Simple ray tracer implemented in Python, capable of rendering 3D scenes with basic shapes, materials, and lighting.
A Taichi component for automatically compiling and launching compute graphs.
Introduction to PyCUDA GPU programming.
Scripts to manage rocprof tracing of multi-process, multi-node program runs.
Object Tracking of grayscale objects using CUDA