Actions: fattorib/ZeRO-transformer

Tests

153 workflow runs
drop jaxformer shard
Tests #1023: Commit bc53cbe pushed by fattorib
May 24, 2023 10:35 4m 12s pjit-tensor-parallel
add profiler
Tests #1022: Commit 8a16a4b pushed by fattorib
May 24, 2023 09:54 4m 16s pjit-tensor-parallel
correct v2-8 max flops
Tests #1021: Commit c36421d pushed by fattorib
May 24, 2023 09:46 3m 55s pjit-tensor-parallel
remove comments
Tests #1020: Commit 22cda84 pushed by fattorib
May 24, 2023 09:38 4m 8s pjit-tensor-parallel
reference xmap implementation for comparison
Tests #1019: Commit 0f7a783 pushed by fattorib
May 24, 2023 09:33 4m 6s pjit-tensor-parallel
add mask to layer init
Tests #1018: Commit ae4f210 pushed by fattorib
May 23, 2023 17:55 3m 40s pjit-tensor-parallel
reintroduce mp grad shard constraint
Tests #1017: Commit ce19c79 pushed by fattorib
May 23, 2023 17:53 3m 36s pjit-tensor-parallel
dp repeat axis is a variable
Tests #1016: Commit 69221ca pushed by fattorib
May 23, 2023 17:30 3m 34s pjit-tensor-parallel
dp model shard
Tests #1015: Commit 79e3cec pushed by fattorib
May 23, 2023 17:26 3m 45s pjit-tensor-parallel
reset partitions
Tests #1014: Commit d39cf20 pushed by fattorib
May 23, 2023 17:19 3m 40s pjit-tensor-parallel
drop print + profiler
Tests #1013: Commit a72773a pushed by fattorib
May 23, 2023 17:00 3m 40s pjit-tensor-parallel
replace indexing with naive reshape + scan
Tests #1011: Commit 1aba43e pushed by fattorib
May 23, 2023 12:14 3m 33s pjit-tensor-parallel
drop gradient replication - no changes
Tests #1010: Commit 3151a32 pushed by fattorib
May 23, 2023 12:09 3m 45s pjit-tensor-parallel
add pspec to grad initialization before scan
Tests #1009: Commit 0fa1dac pushed by fattorib
May 23, 2023 11:59 3m 38s pjit-tensor-parallel
add back base model for benchmarking
Tests #1007: Commit ff04a5c pushed by fattorib
May 23, 2023 11:15 4m 15s pjit-tensor-parallel
track shapes for mesh
Tests #1006: Commit 633db9f pushed by fattorib
May 23, 2023 11:14 3m 31s pjit-tensor-parallel
unpin old flax/wandb/optax versions
Tests #1005: Commit 898c7ce pushed by fattorib
May 23, 2023 11:12 3m 56s pjit-tensor-parallel
pin higher jax version
Tests #1004: Commit 8842b21 pushed by fattorib
May 23, 2023 11:06 4m 13s pjit-tensor-parallel
commit TP branch
Tests #1003: Commit 18a2607 pushed by fattorib
May 23, 2023 11:06 4m 14s pjit-tensor-parallel
make test model smaller
Tests #1002: Commit efd9ff9 pushed by fattorib
May 22, 2023 12:16 4m 38s main
unfreeze einops version
Tests #1001: Commit 4de1494 pushed by fattorib
May 15, 2023 15:03 3m 44s main
use einops in jax impl + fix confusing axis annotations
Tests #1000: Commit 2fee16d pushed by fattorib
May 15, 2023 15:03 4m 24s main
switch to pytorch scaled_dot_product_attn, confirmed correctness
Tests #999: Commit be88aae pushed by fattorib
May 14, 2023 10:18 4m 39s main