When only a single PDF is processed, sample is a list, but inference expects an image tensor, so the call below fails. Passing sample[0], the tensor stored at index 0 of the list, works instead.
model_output = model.inference(image_tensors=sample)
This is a function to which I pass a single PDF file and make a prediction for each page:
from functools import partial

import torch
from tqdm import tqdm
from nougat import NougatModel
from nougat.postprocessing import markdown_compatible
from nougat.utils.dataset import LazyDataset


def predict():
    # Load the pretrained Nougat weights in bfloat16.
    model = NougatModel.from_pretrained("C:/Users/sshamsu/Documents/New folder/nougat weights").to(torch.bfloat16)
    if torch.cuda.is_available():
        model.to("cuda")
    # LazyDataset rasterizes the PDF and prepares one page at a time.
    dataset = LazyDataset(
        "C:/Users/sshamsu/Downloads/research paper for Nought.pdf",  # file path of the PDF
        partial(model.encoder.prepare_input, random_padding=False),
    )
    dataloader = torch.utils.data.DataLoader(
        dataset,
        batch_size=1,
        shuffle=False,
        collate_fn=LazyDataset.ignore_none_collate,
    )
    prediction = []
    for page_num, page_as_tensor in tqdm(enumerate(dataloader)):
        # Each batch is a list; the image tensor is at index 0.
        model_output = model.inference(image_tensors=page_as_tensor[0])
        output = markdown_compatible(model_output["predictions"][0])
        prediction.append(output)
    # Join the per-page outputs once every page has been processed.
    final_mmd = "".join(prediction).strip()
    return final_mmd
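One possible explanation for the list, sketched with plain PyTorch rather than Nougat itself: when a dataset returns a pair per item, the default collate function of DataLoader returns a sequence with one entry per field, not a bare tensor. ToyPages below is a hypothetical stand-in, and the assumption that the real page dataset yields (image, name) pairs is mine, not something confirmed in this thread.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyPages(Dataset):
    """Hypothetical stand-in: each item is an (image, name) pair."""

    def __len__(self):
        return 3

    def __getitem__(self, idx):
        # A fake 3x4x4 "page image" plus a string label.
        return torch.zeros(3, 4, 4), f"page-{idx}"


loader = DataLoader(ToyPages(), batch_size=1, shuffle=False)
batch = next(iter(loader))

print(type(batch).__name__)  # the collated batch is a sequence, not a tensor
print(batch[0].shape)        # the stacked image tensor sits at index 0

# Unpacking is equivalent to indexing and names both parts explicitly.
images, names = batch
```

If that assumption holds, unpacking in the loop header (for example `for images, names in dataloader:`) and indexing with [0] are two ways of reaching the same tensor.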
for page_num, page_as_tensor in tqdm(enumerate(dataloader)):
    model_output = model.inference(image_tensors=page_as_tensor[0])
If I don't index page_as_tensor with 0, an error is raised because page_as_tensor is a list. Maybe that is because I am running it on just one paper, but the predict.py and app.py files don't use the index. So is this also an issue when using multiple PDFs?
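Until that is settled, a small defensive helper can make the loop work whether the dataloader hands back a bare tensor or a list. This is only a sketch: first_element_if_sequence is a hypothetical name, not part of Nougat, and it assumes the image tensor is always the first entry when the batch is a list or tuple.

```python
def first_element_if_sequence(batch):
    """Return batch[0] when batch is a list or tuple, otherwise batch itself."""
    if isinstance(batch, (list, tuple)):
        if not batch:
            raise ValueError("received an empty batch")
        return batch[0]
    return batch


# Intended use inside the loop (assumed, mirrors the snippet above):
#   model_output = model.inference(
#       image_tensors=first_element_if_sequence(page_as_tensor)
#   )
```

The helper does not explain why the two scripts differ; it only keeps the call from failing on either input shape.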