Is your feature request related to a problem? Please describe.
For example, I cannot get the Hugging Face BERT model working, and I don't know in which cases I can use your project. This is what I tried:
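import bminf
import torch
from transformers import BertModel, BertTokenizer

# tokenizer and input text (the sentence is only a placeholder)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
text = "Replace me by any text you'd like."
encoded_input_cpu = tokenizer(text, return_tensors='pt').to('cpu')
model = BertModel.from_pretrained("bert-base-uncased").to('cpu')
# apply wrapper
with torch.cuda.device(0):
    model = bminf.wrapper(model.to('cpu'))
# print_time_delta is a timing helper that is not defined in this snippet
with print_time_delta('generate'):
    output = model(**encoded_input_cpu)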
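The `print_time_delta` helper used above is not part of `bminf` or `torch`. To make the snippet reproducible, here is a minimal sketch of what such a helper could look like, assuming it is a plain context manager that prints the elapsed wall-clock time. The name and the output format are assumptions for illustration only.

```python
import time
from contextlib import contextmanager


@contextmanager
def print_time_delta(label):
    """Hypothetical timing helper: prints how long the wrapped block took."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the wrapped block raises, so a failing call is still timed.
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.3f} s")
```

With this in place, `with print_time_delta('generate'): ...` prints a line such as `generate: <seconds> s` once the block finishes.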
Describe the solution you'd like
Can you provide complete examples with some well-known Hugging Face models in a Colab notebook?
Describe alternatives you've considered