Con silver #1348

AngledLuffa · 2024-02-22T01:57:13Z

Add some mechanisms for building and manipulating a silver dataset for the constituency parser. Filtering the trees by the number of parsers that agree on them seems to produce a better silver dataset, whereas filtering by variance does not. Will continue experimenting.

… new silver dataset producing script which uses two ensembles at once to directly find the matching trees, along with counting the number of parsers which agree on the best tree. Add a script which extracts the trees we want at a certain match level. Sample command line for the wiki tokenization script.
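The agreement counting and match-level extraction described above can be sketched roughly as follows. This is a simplified illustration, not the script from this PR: it assumes each parser's output for a sentence is available as a bracketed tree string, it treats all parsers as one flat list rather than two ensembles, and the names `best_tree_with_votes` and `filter_by_agreement` are hypothetical.

```python
from collections import Counter

def best_tree_with_votes(predictions):
    """Return the most frequently predicted tree and how many parsers produced it.

    predictions: one bracketed tree string per parser, all for the same sentence.
    (Hypothetical helper; the real script may compare trees differently.)
    """
    counts = Counter(predictions)
    tree, votes = counts.most_common(1)[0]
    return tree, votes

def filter_by_agreement(per_sentence_predictions, min_votes):
    """Keep only the sentences whose best tree was produced by at least min_votes parsers."""
    kept = []
    for predictions in per_sentence_predictions:
        tree, votes = best_tree_with_votes(predictions)
        if votes >= min_votes:
            kept.append((tree, votes))
    return kept
```

Raising `min_votes` trades dataset size for agreement: fewer trees survive, but each surviving tree was chosen by more parsers.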

AngledLuffa added 7 commits February 21, 2024 17:55

Remove unnecessary repetition of existing save_each argument

7a76bf6

Add flags to control how often the save_each models get saved. Intent is to make it so that we can skip 10 models at a time and use a series of these to measure the variance in a silver dataset's scores

1df35e0

Add an option to turn off saving the optimizer when saving each constituency model

628b902

Update constituency evaluation to accommodate the per-tree f1 (this will be sent back via the proto in the next version of CoreNLP after 4.5.6)

c2a07f6

Only do args['num_generate'] once... possibly fixing a bug where num_generate was already expanded

4d3180f

Add a script which filters silver trees by the variance of their scores over a sequence of trained models. Works either by taking the least or the most variance. Hopefully this script will work to filter a large silver dataset to a more manageable silver dataset

d4ba9ce
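A minimal sketch of the variance filter described in this commit, under stated assumptions: each silver tree has already been scored by every model in the sequence (for example with a per-tree F1), and the scores are held in plain Python lists. The function name `filter_by_variance` and its parameters are hypothetical, not the script's actual interface.

```python
from statistics import pvariance

def filter_by_variance(trees, scores, keep, most=False):
    """Keep the `keep` trees whose scores vary the least (or the most) across models.

    trees:  list of silver trees (any representation)
    scores: scores[i] is the list of scores tree i received, one per trained model
    most:   if True, keep the highest-variance trees instead of the lowest
    """
    variances = [pvariance(tree_scores) for tree_scores in scores]
    # Sort tree indices by variance, ascending by default
    order = sorted(range(len(trees)), key=lambda i: variances[i], reverse=most)
    return [trees[i] for i in order[:keep]]
```

Either direction shrinks a large silver dataset to `keep` trees; which direction is more useful, if any, is an empirical question.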

AngledLuffa merged commit fe45f11 into dev Feb 24, 2024
1 check passed

AngledLuffa deleted the con_silver branch February 24, 2024 07:41
