Just FYI in case anyone's interested...
I trained the "no recategorization" branch of the model with LDC2020T02 (AMR 3.0) and got a 76.7 SMATCH score. I didn't spend much time trying to optimize hyperparameters, and I'm using the AMR 2.0 utils directory, so there may be additional optimizations to be had, or the AMR 3.0 corpus may simply be more complex with the new "multi-sentence" annotations, etc.
I'm also using this model in amrlib with all of the sheng-z/stog code removed. In my version of the model code there's no pre/post-processing at all. In addition, I've switched to spaCy for annotations. I'm getting about the same ~77 SMATCH under these conditions.
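For reference, this is roughly the kind of spaCy annotation I'm talking about; a minimal sketch, with the example sentence and the exact set of fields the model consumes being my own illustration rather than anything pinned down here:

```python
import spacy

# Assumes a small English pipeline is installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("The boy wants to go to New York.")

# Token-level annotations (tokens, lemmas, POS tags, NER) of the sort the
# original stog preprocessing produced with its own NLP stack.
tokens = [t.text for t in doc]
lemmas = [t.lemma_ for t in doc]
pos_tags = [t.tag_ for t in doc]
ner_tags = [t.ent_type_ or "O" for t in doc]

print(tokens, lemmas, pos_tags, ner_tags, sep="\n")
```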
amrlib is intended as a user library for parsing and generation. I've simplified some of the parsing routines for the end user, updated the code to the latest versions of penman and PyTorch, sped up SMATCH scoring, etc. Feel free to pull portions of the revised code if you have any interest. I'd be happy to see a little more optimization of the model in that setting, though I'm not planning to focus on it myself.
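For anyone curious what the end-user side looks like, parsing is a few lines; a sketch, assuming the sentence-to-graph model data has been downloaded into amrlib's data directory (worth checking the docs for the exact loader arguments and return format):

```python
import amrlib

# Load the default sentence-to-graph (parsing) model.
stog = amrlib.load_stog_model()

# parse_sents() takes a list of sentences and returns PENMAN graph strings.
graphs = stog.parse_sents(["The boy wants to go to New York."])
for g in graphs:
    print(g)
```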
The library also includes a Hugging Face T5 model retrained for graph-to-sentence generation that gets a 43 BLEU on LDC2020T02. It's a lot easier, coding-wise, than jcyk/gtos and amazingly effective.
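Generation is similarly short. A hedged sketch, assuming the graph-to-sentence model is installed and that generate() returns the sentences plus a per-graph clipped/truncation flag, which is my recollection of the API:

```python
import amrlib

# Load the T5-based graph-to-sentence (generation) model.
gtos = amrlib.load_gtos_model()

# generate() takes a list of AMR graph strings in PENMAN notation.
graph = """
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
            :ARG0 b
            :ARG4 (c / city :name (n / name :op1 "New" :op2 "York"))))
"""
sents, clipped = gtos.generate([graph])
print(sents[0])
```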