Create holdout set #145

lfoppiano · 2022-10-24T05:15:28Z

This PR selects some papers to form a holdout set.
At the moment, because the dataset is small, we will use all the documents to create the final models; however, we will keep a fixed holdout set for a stricter and more precise evaluation. The exception is Units, where the evaluation set was borrowed from a different source.

The holdout set was created using an automatic script and then re-balanced based on the distribution of entities between the training and holdout sets.

The Python script to reproduce the holdout dataset is contained under scripts.
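
For readers who want a feel for the idea without opening the repository, here is a minimal sketch of a randomized split that keeps the entity-label distribution of the holdout set close to that of the training set. It is not the script shipped in this PR: the function names (`label_distribution`, `distribution_gap`, `split_holdout`), the document layout (a dict with a `labels` mapping of label to count), and the parameters `holdout_ratio`, `trials`, and `seed` are all assumptions made for illustration.

```python
import random
from collections import Counter


def label_distribution(docs):
    """Return the relative frequency of each entity label across docs."""
    counts = Counter()
    for doc in docs:
        # doc["labels"] is assumed to map label name -> number of entities
        counts.update(doc["labels"])
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


def distribution_gap(train, holdout):
    """Sum of absolute differences between the two label distributions."""
    p = label_distribution(train)
    q = label_distribution(holdout)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def split_holdout(docs, holdout_ratio=0.2, trials=200, seed=42):
    """Try several random splits and keep the one with the smallest gap."""
    rng = random.Random(seed)  # fixed seed so the holdout set is reproducible
    size = max(1, round(len(docs) * holdout_ratio))
    best = None
    for _ in range(trials):
        shuffled = docs[:]
        rng.shuffle(shuffled)
        holdout, train = shuffled[:size], shuffled[size:]
        gap = distribution_gap(train, holdout)
        if best is None or gap < best[0]:
            best = (gap, train, holdout)
    return best[1], best[2]
```

A call such as `train, holdout = split_holdout(docs)` would return two lists of documents. Repeating the random draw and keeping the closest match is only one simple way to re-balance; swapping individual documents between the two sets until the gap stops shrinking is another.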

The statistics about the training/holdout set can be found in:

coveralls · 2022-10-24T05:22:57Z

Coverage remained the same at 27.67% when pulling 06c7e11 on feature/holdout-set into 0957bc6 on master.

lfoppiano · 2022-11-07T01:42:56Z

I think this is ready to merge

lfoppiano added 4 commits October 20, 2022 18:38

add randomized stratified holdout set

a36a8d0

add automatic holdout set and statistics

94e92c4

rebalance holdout set to have a better distribution of labels

fc5955a

add python script to generate automatically the holdout set

f682de6

lfoppiano linked an issue Oct 24, 2022 that may be closed by this pull request

Create a holdout dataset #143

Closed

lfoppiano added the enhancement label Oct 24, 2022

lfoppiano added 9 commits October 24, 2022 14:30

add generated files for delft

036d14b

add generated files for delft for other models

88f3840

revise units and values

7c9cb7e

update holdout set

a20385f

add holdout evaluation raw

a4e279f

cleanup

3648bff

update results with the holdout dataset

f7a4420

add evaluation raw data for DL models

79a3b49

update results including only available models

2273010

lfoppiano added 4 commits November 7, 2022 10:55

revise documentation

b9724bc

update documentation version

e2077a8

add more clarification on holdout dataset

7ea49dd

add link to descritpion

06c7e11

lfoppiano merged commit 8da45fe into master Nov 9, 2022

lfoppiano deleted the feature/holdout-set branch November 9, 2022 02:21
