model.fit results in nan #1804

Open
Abigail-gs opened this issue Jan 2, 2023 · 1 comment

Comments

@Abigail-gs

Hi,

I want to fine-tune SBERT starting from the pre-trained weights of 'bert-base-uncased'.
I am following this tutorial: https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/nli/training_nli_v2.py
and using the MultipleNegativesRankingLoss loss function.

When I call model.fit, the results are 'nan' everywhere.
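For readers unfamiliar with the loss named above, here is a minimal pure-Python sketch of the idea behind MultipleNegativesRankingLoss as I understand it: each anchor should score highest against its own positive, and the other positives in the batch act as negatives. The helper names (`cosine`, `mnr_loss`) and the toy vectors are made up for illustration, and the scale of 20 is an assumption that mirrors what I believe is the library default. This is not the library's implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch serves as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        scores = [scale * cosine(anchor, p) for p in positives]
        # log-sum-exp with the maximum subtracted for numerical stability
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_z - scores[i]
    return total / len(anchors)

# Hypothetical 2-d "embeddings" for a batch of two pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive is close to its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive is close to the wrong anchor

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

The aligned batch should give a much smaller loss than the swapped one. Note that this loss only needs the pairs themselves; it does not use a per-example label.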

Here is my code:
`import logging
import math
import os
from datetime import datetime

from sentence_transformers import SentenceTransformer, datasets, losses, models
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator
from transformers import AutoModel, BertTokenizer

# train_data, val_data, MODEL_NAME and ref_saved_models_path are not shown in this snippet.

# Save the pre-trained weights and tokenizer locally so models.Transformer can load them.
root_model = AutoModel.from_pretrained('bert-base-uncased')
output_dir = "/root/Automated_Assessment_(ETS)/Model/DRAFT/DRAFT_Bert_base_uncased"
root_model.save_pretrained(output_dir)  # returns None, so the result is not assigned
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') #('onlplab/alephbert-base')
tokenizer.save_pretrained(output_dir)

learning_rate, batch_size, epochs = 2e-5, 8, 1

# Build the SentenceTransformer: BERT encoder followed by mean pooling.
train_dataloader = datasets.NoDuplicatesDataLoader(train_data, batch_size=batch_size)
word_embedding_model = models.Transformer(output_dir, max_seq_length=512)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), pooling_mode='mean')
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

train_loss = losses.MultipleNegativesRankingLoss(model)
val_evaluator = EmbeddingSimilarityEvaluator.from_input_examples(val_data, batch_size=batch_size)

warmup_steps = math.ceil(len(train_dataloader) * epochs * 0.1)  # 10% of the training steps for warm-up
logging.info("Warmup-steps: {}".format(warmup_steps))

output_file = 'output/sentence_similarity'+MODEL_NAME.replace("/", "-")+'-'+datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
sb_output_path = os.path.join(ref_saved_models_path, output_file)

model.fit(train_objectives=[(train_dataloader, train_loss)],
          evaluator=val_evaluator,
          epochs=epochs,
          evaluation_steps=int(len(train_dataloader)*0.1),
          warmup_steps=warmup_steps,
          output_path=sb_output_path,
          use_amp=False  # Set to True if your GPU supports FP16 operations
          )
`
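As a side note on the step arithmetic in the snippet, the sketch below works through the two expressions with made-up dataloader lengths. `schedule_steps` is a hypothetical helper, and the batch counts are assumptions, since the thread does not say how large `train_data` is. The point is only that `math.ceil` rounds the warm-up up while `int` truncates the evaluation interval, so a very short dataloader yields an interval of 0.

```python
import math

def schedule_steps(batches_per_epoch, epochs, fraction=0.1):
    """Reproduce the warm-up and evaluation-interval arithmetic from the snippet."""
    total_steps = batches_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * fraction)     # rounds up
    evaluation_steps = int(batches_per_epoch * fraction)  # truncates toward zero
    return warmup_steps, evaluation_steps

# Hypothetical sizes: a 1,000-batch epoch and a tiny 5-batch epoch.
warmup, eval_every = schedule_steps(1000, 1)
small_warmup, small_eval = schedule_steps(5, 1)
print(warmup, eval_every, small_warmup, small_eval)
```

If the truncated interval comes out as 0, it may be worth printing `len(train_dataloader)` to confirm the dataloader has the size you expect.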

Here is a screenshot of the log:
[image: Capture]

I don't understand what I am doing wrong. Could you please help me?
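One thing that may be worth checking, offered as an assumption because the thread does not show how `val_data` is built: EmbeddingSimilarityEvaluator reports Pearson and Spearman correlations between the predicted similarities and the gold scores, and a correlation is undefined when the gold scores are all identical (for instance if every validation example carries the same label). The pure-Python sketch below, with a hypothetical `pearson` helper and made-up numbers, shows how that situation yields 'nan' even when the predictions themselves are fine.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns nan when either input has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(var_x * var_y)
    if denom == 0.0:
        return float("nan")  # 0/0: the correlation is undefined
    return cov / denom

# Made-up predicted similarities and two hypothetical sets of gold scores.
predicted = [0.91, 0.35, 0.62, 0.10]
constant_gold = [0.0, 0.0, 0.0, 0.0]   # every example has the same label
varied_gold = [1.0, 0.4, 0.6, 0.0]     # graded similarity scores

r_constant = pearson(predicted, constant_gold)
r_varied = pearson(predicted, varied_gold)
print(r_constant, r_varied)
```

If the labels in `val_data` do vary, the next thing to inspect would be whether the embeddings returned by `model.encode` already contain NaN values.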

@linkedlist771

I have the same issue. Try adding some noise.
