BertScore giving different results each time #165

Open
p-H-7 opened this issue Jul 7, 2023 · 0 comments

p-H-7 commented Jul 7, 2023

Hello @Tiiiger

Initially, I wrote my code so that it called the score function once per row of a dataframe, which took a very long time to run (shown below).

from bert_score import score

def calculate_bertscore(row):
    source_text = row['Verbatim Translated']
    generated_summary = row['summary']

    # score() expects lists of strings, so wrap each string in a list.
    summary_list = [generated_summary]
    source_list = [source_text]

    # score() returns a tuple of (P, R, F1) tensors.
    bertscore = score(summary_list, source_list, lang="en", model_type="bert-base-uncased", num_layers=4, device="cuda:0")

    # Take the first element of the returned tuple for this single pair.
    f1_score = bertscore[0].item()

    return f1_score
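
For context, I applied this helper row by row along these lines (approximate; the exact call may have differed slightly):

# Per-row application: one score() call per dataframe row, which is slow.
df_test_2['F1-score'] = df_test_2.apply(calculate_bertscore, axis=1)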

Upon realizing my error after reading your reply, I modified my code to pass the columns directly as lists (shown below).

summary_list = df_test_2['Verbatim Translated'].tolist()
source_list = df_test_2['summary'].tolist()

# One batched score() call over the whole columns at once.
P, R, F1 = score(summary_list, source_list, lang="en", model_type="bert-base-uncased", num_layers=4, device="cuda:0")

df_test_2['F1-score'] = F1

df_test_2

The running time improved significantly, but the results are noticeably worse: the F1 scores for all the rows dropped by approximately 0.1. Which of the two sets of results is correct, and why is there such a drop in scores?
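
For reference, my understanding is that score() returns three tensors, (P, R, F1). A minimal check on placeholder strings (toy inputs, not my real data) would look like this:

from bert_score import score

# Toy inputs (placeholders, not the real dataframe columns).
cands = ["the generated summary text"]
refs = ["the original source text"]

# score() returns (P, R, F1) tensors, one value per candidate/reference pair.
P, R, F1 = score(cands, refs, lang="en", model_type="bert-base-uncased", num_layers=4, device="cuda:0")

# F1 is the third returned tensor; indexing the returned tuple at 0 gives precision.
print(P.item(), R.item(), F1.item())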

Thank you!
