GeidiPrime

This is an extremely experimental Transformer architecture using a Mixture of Attentions, with Macaron-style FFNs sandwiched around local attention and other modules. Perhaps we can add the visual expert from Zeta and make it multi-modal!
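For context, a Macaron-style block sandwiches the attention sublayer between two half-weighted feedforwards. Below is a minimal sketch of that idea in PyTorch; it is illustrative only, and the class name, FFN width, and use of full (rather than local) self-attention are assumptions, not GeidiPrime's actual implementation:

import torch
from torch import nn

class MacaronBlock(nn.Module):
    # Illustrative Macaron-style block: FFN -> attention -> FFN,
    # each FFN residual scaled by 0.5 as in Macaron Net.
    # NOT the actual GeidiPrime block.
    def __init__(self, dim, heads):
        super().__init__()
        self.ffn1 = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, dim * 4),
            nn.GELU(),
            nn.Linear(dim * 4, dim),
        )
        self.attn_norm = nn.LayerNorm(dim)
        # Full self-attention stands in here for the local attention
        # the README describes.
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn2 = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, dim * 4),
            nn.GELU(),
            nn.Linear(dim * 4, dim),
        )

    def forward(self, x):
        x = x + 0.5 * self.ffn1(x)        # first half-step FFN
        h = self.attn_norm(x)
        attn_out, _ = self.attn(h, h, h)  # self-attention sublayer
        x = x + attn_out
        x = x + 0.5 * self.ffn2(x)        # second half-step FFN
        return x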

Install
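The README doesn't state the install command; assuming the package is published to PyPI under the module's name (not confirmed here), installation would presumably be:

pip install geidi-prime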

Usage

import torch
from geidi_prime.model import GeidiPrimeTransformer

model = GeidiPrimeTransformer(
    dim=4096,          # model/embedding dimension
    depth=6,           # number of transformer blocks
    heads=8,           # attention heads per block
    num_tokens=20000,  # vocabulary size
)

# Random token IDs: batch size 1, sequence length 4096
x = torch.randint(0, 20000, (1, 4096))

out = model(x)  # forward pass
print(out.shape)
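The forward pass produces one logit vector per input position; assuming an output shape of (batch, seq_len, num_tokens), a next-token language-modeling loss could be wired up like this (a sketch building on the snippet above, not part of the repo):

import torch.nn.functional as F

# Hypothetical next-token prediction step, assuming `out` holds
# logits of shape (batch, seq_len, num_tokens).
targets = x[:, 1:]       # shift token IDs left by one position
logits = out[:, :-1, :]  # drop the final position's logits
loss = F.cross_entropy(
    logits.reshape(-1, logits.size(-1)),  # (batch * (seq_len - 1), num_tokens)
    targets.reshape(-1),                  # (batch * (seq_len - 1),)
)
loss.backward()  # standard autograd backward pass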

License

MIT
