AST

An implementation of AST from the paper "AST: Audio Spectrogram Transformer" in PyTorch and Zeta. The model takes a 2D input tensor representing audio, splits it into patches, applies a linear projection to each patch, adds position embeddings, and then passes the result through a stack of attention and feedforward layers. Please join Agora and tag me if this could be improved in any capacity.
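The pipeline described above can be sketched in plain PyTorch as follows. This is only an illustrative stand-in, not the code in ast_torch: the class name TinySpectrogramSketch, its default sizes, and the use of nn.TransformerEncoderLayer for the attention and feedforward blocks are all assumptions made for the example.

```python
import torch
from torch import nn


class TinySpectrogramSketch(nn.Module):
    """Hypothetical sketch of the described pipeline; not the ast_torch implementation."""

    def __init__(self, seqlen=16, patch_size=4, dim=8, heads=2, depth=2):
        super().__init__()
        assert seqlen % patch_size == 0, "seqlen must be divisible by patch_size"
        self.patch_size = patch_size
        num_patches = seqlen // patch_size
        # Linear projection of each patch to the model dimension
        self.proj = nn.Linear(patch_size, dim)
        # Learned position embedding, one vector per patch
        self.pos_emb = nn.Parameter(torch.zeros(1, num_patches, dim))
        # Attention + feedforward blocks, repeated `depth` times
        self.layers = nn.ModuleList(
            [
                nn.TransformerEncoderLayer(
                    d_model=dim,
                    nhead=heads,
                    dim_feedforward=4 * dim,
                    batch_first=True,
                )
                for _ in range(depth)
            ]
        )

    def forward(self, x):
        # Patchify: (batch, seqlen) -> (batch, num_patches, patch_size)
        batch, seqlen = x.shape
        patches = x.reshape(batch, seqlen // self.patch_size, self.patch_size)
        # Project patches and add position information
        tokens = self.proj(patches) + self.pos_emb
        # Loop over the attention + feedforward layers
        for layer in self.layers:
            tokens = layer(tokens)
        return tokens


x = torch.randn(2, 16)
out = TinySpectrogramSketch()(x)
print(out.shape)  # torch.Size([2, 4, 8])
```

With a sequence length of 16 and a patch size of 4, the input is split into 4 patches, so the sketch returns one dim-sized vector per patch. The actual ASTransformer may shape its output differently.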

Install

pip3 install ast-torch

Usage

import torch
from ast_torch.model import ASTransformer

# Create dummy data
x = torch.randn(2, 16)

# Initialize model
model = ASTransformer(
    dim=4, seqlen=16, dim_head=4, heads=4, depth=2, patch_size=4
)

# Run model and print output shape
print(model(x).shape)

Citation

@misc{gong2021ast,
    title={AST: Audio Spectrogram Transformer}, 
    author={Yuan Gong and Yu-An Chung and James Glass},
    year={2021},
    eprint={2104.01778},
    archivePrefix={arXiv},
    primaryClass={cs.SD}
}

License

MIT
