softmax1

Popular repositories

  1. Flash-Attention-Softmax-N Public

    CUDA and Triton implementations of Flash Attention with SoftmaxN.

    Python, 66 stars, 5 forks

  2. quietGPT Public

    A scaled down empirical study of "Attention is Off by One" on nanoGPT

    Python, 3 stars

  3. EsperBERTo Public

    A test of the Attention Is Off By One hypothesis

    Python

  4. nanoGPT_softmax1 Public

    An experiment using nanoGPT vs nanoGPT (softmax1) to see how it affects perplexity score

    Python

  5. nanoGPT_softmax1_reddit Public

    Forked from karpathy/nanoGPT

    The simplest, fastest repository for training/finetuning medium-sized GPTs.

    Python

  6. MosaicBERT-Softmax1 Public

    Python

Repositories

Showing 7 of 7 repositories
  • Flash-Attention-Softmax-N Public

    CUDA and Triton implementations of Flash Attention with SoftmaxN.

    Python, GPL-3.0, 66 stars, 5 forks, 1 issue, 1 pull request. Updated May 26, 2024
  • llama2.c-tinystories Public Forked from karpathy/llama2.c

    Inference Llama 2 in one file of pure C

    Jupyter Notebook, MIT, 0 stars, 2,009 forks, 0 issues, 0 pull requests. Updated Dec 20, 2023
  • MosaicBERT-Softmax1 Public

    Python, GPL-3.0, 0 stars, 0 forks, 0 issues, 0 pull requests. Updated Sep 23, 2023
  • EsperBERTo Public

    A test of the Attention Is Off By One hypothesis

    Python, 0 stars, 0 forks, 0 issues, 0 pull requests. Updated Sep 16, 2023
  • nanoGPT_softmax1 Public

    An experiment using nanoGPT vs nanoGPT (softmax1) to see how it affects perplexity score

    Python, 0 stars, 0 forks, 1 issue, 0 pull requests. Updated Aug 19, 2023
  • nanoGPT_softmax1_reddit Public Forked from karpathy/nanoGPT

    The simplest, fastest repository for training/finetuning medium-sized GPTs.

    Python, MIT, 0 stars, 5,324 forks, 0 issues, 0 pull requests. Updated Aug 19, 2023
  • quietGPT Public

    A scaled down empirical study of "Attention is Off by One" on nanoGPT

    Python, 3 stars, 0 forks, 0 issues, 0 pull requests. Updated Aug 9, 2023
