Production-ready multilingual customer support system using LLaMA-3, RLHF, DeepSpeed, and AWS Trainium.
Updated May 3, 2025 - Python
A high-performance Python SDK for running Lattice QCD simulations on AWS Trainium and Inferentia instances.
Tensor contractions for AWS Trainium via NKI (cuTENSOR-equivalent) — einsum with contraction planning, CP/PARAFAC and Tucker decompositions, density-fitted post-Hartree-Fock patterns.
Sparse matrix operations for AWS Trainium via NKI (cuSPARSE-equivalent) — CSR/COO formats, SpMV and SpMM via gather-matmul-scatter, Schwarz integral screening for quantum chemistry.
FFT and complex-valued tensor operations for AWS Trainium via NKI (cuFFT-equivalent) — Cooley-Tukey, Bluestein, STFT, ComplexTensor and complex NN layers.
BLAS Levels 1–3 for AWS Trainium via NKI (cuBLAS-equivalent) — GEMM with stationary-tile reuse, batched GEMM, TRSM, validated DF-MP2 for quantum chemistry.
Linear solvers and eigendecomposition for AWS Trainium via NKI (cuSOLVER-equivalent) — Jacobi eigh, Cholesky/LU/QR factorizations, CG/GMRES iterative solvers, Newton-Schulz inverse square root.
Scientific computing library suite for AWS Trainium via NKI — the cuFFT/cuBLAS/cuRAND/cuSOLVER/cuSPARSE/cuTENSOR equivalents for Neuron. Python-first, PyTorch fallback everywhere, Apache-2.0.