Poor performance of theorem predictor #10

Open
ICanFlyGFC opened this issue Sep 19, 2022 · 9 comments

@ICanFlyGFC

Hello, Pan. Thank you for open-sourcing this work.

I downloaded the checkpoint model from https://acl2021-intergps.s3.us-west-1.amazonaws.com/tp_model_best.pt
But the evaluation results are empty. How can I get it back to normal? Thanks.

image

@lupantech
Owner

Hi, thank you for your interest in our work!

This evaluation result is not normal. Would you mind sharing the script you were running and the log it printed? It could help me narrow down the reasons.

Thanks!

Best,
Pan

@ICanFlyGFC
Author

ICanFlyGFC commented Sep 20, 2022

Thank you for your reply!

The script is the same as yours; I only changed the output file name.

#!/usr/bin/env python
# coding: utf-8

import json
import ast
from tqdm import tqdm

import torch
from transformers import BartForConditionalGeneration, BartTokenizerFast


def evaluate(diagram_logic_file, text_logic_file, tokenizer_name, model_name, check_point, seq_num):

    test_lst = range(2401, 3002)

    ## read logic form files
    with open(diagram_logic_file) as f:
        diagram_logic_forms = json.load(f)
    with open(text_logic_file) as f:
        text_logic_forms = json.load(f)

    combined_logic_forms = {}
    for pid in test_lst:
        combined_logic_forms[pid] = diagram_logic_forms[str(pid)]['diagram_logic_forms'] + \
                                    text_logic_forms[str(pid)]['text_logic_forms']

    ## build tokenizer and model
    tokenizer = BartTokenizerFast.from_pretrained(tokenizer_name) # 'facebook/bart-base'
    model = BartForConditionalGeneration.from_pretrained(model_name).to(device) # 'facebook/bart-base'
    model.load_state_dict(torch.load(check_point))

    final = dict()
    for pid in tqdm(test_lst):
        input = str(combined_logic_forms[pid])
        tmp = tokenizer.encode(input)
        if len(tmp) > 1024:
            tmp = tmp[:1024]
        input = torch.LongTensor(tmp).unsqueeze(0).to(device)

        output = model.generate(input, bos_token_id=0, eos_token_id=2,
                             max_length=20, num_beams=10, num_return_sequences=seq_num)
        # print(out.size())

        ## refine output sequence
        seq = []
        for j in range(seq_num):
            res = tokenizer.decode(output[j].tolist())
            res = res.replace("</s>", "").replace("<s>", "").replace("<pad>", "")
            # print(res)
            try:
                res = ast.literal_eval(res) # string class to list class
            except Exception as e:
                res = []
            seq.append(res)

        final[str(pid)] = {"id": str(pid), "num_seqs": seq_num, "seq": seq}

    return final


if __name__ == '__main__':

    diagram_logic_file = '../data/geometry3k/logic_forms/diagram_logic_forms_annot.json'
    text_logic_file = '../data/geometry3k/logic_forms/text_logic_forms_annot_dissolved.json'

    check_point = 'models/tp_model_best.pt'
    output_file = 'results/test/pred_seqs_test_debugging.json'

    tokenizer_name = 'facebook/bart-base'
    model_name = 'facebook/bart-base'

    SEQ_NUM = 5

    device = torch.device('cuda:0')

    result = evaluate(diagram_logic_file, text_logic_file, tokenizer_name, model_name, check_point, SEQ_NUM)

    with open(output_file, 'w') as f:
        json.dump(result, f)
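
In the script above, any decoded string that `ast.literal_eval` cannot parse is silently replaced by `[]`, so an all-empty result file can hide what the model actually generated. Nothing in this thread confirms that this is the cause; the sketch below is only one way to check. `parse_predicted_sequence` is a hypothetical helper, not part of InterGPS, and it assumes the decoded text is expected to be a Python list literal, as the script's `literal_eval` call suggests.

```python
import ast


def parse_predicted_sequence(decoded):
    """Return (parsed_list, error); error is None when parsing succeeds."""
    # Strip the BART special tokens the same way the evaluation script does.
    cleaned = decoded.replace("</s>", "").replace("<s>", "").replace("<pad>", "").strip()
    try:
        parsed = ast.literal_eval(cleaned)
    except (ValueError, SyntaxError, TypeError) as exc:
        # Keep the raw text so a failed parse is visible instead of hidden.
        return [], f"{type(exc).__name__}: {cleaned!r}"
    if not isinstance(parsed, list):
        return [], f"not a list: {cleaned!r}"
    return parsed, None
```

Printing the returned error for the first few problems would show whether the model emits malformed text or whether the empty lists come from somewhere else.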

The log:

D:\Anaconda\envs\intergps\python.exe D:/WorkSpace/InterGPS-main/theorem_predict/eval_transformer.py
0%| | 0/601 [00:00<?, ?it/s]D:\Anaconda\envs\intergps\lib\site-packages\transformers\generation_utils.py:1839: UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
next_indices = next_tokens // vocab_size
22%|██▏ | 135/601 [00:23<01:25, 5.43it/s]Token indices sequence length is longer than the specified maximum sequence length for this model (1569 > 1024). Running this sequence through the model will result in indexing errors
100%|██████████| 601/601 [01:42<00:00, 5.88it/s]

Process finished with exit code 0
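
The length warning in the log (1569 > 1024) appears to be raised while encoding, before the script clips the ids with `tmp[:1024]`. That slice also cuts off the closing `</s>` token. Whether this matters for the released checkpoint is not established in this thread. As a minimal sketch, assuming the EOS id is 2 (the value the script passes to `generate`), a hypothetical helper could clip the input while keeping the terminator:

```python
def truncate_token_ids(token_ids, max_len=1024, eos_token_id=2):
    """Clip a list of token ids to max_len while keeping a trailing EOS id."""
    if len(token_ids) <= max_len:
        return list(token_ids)
    # Keep the first max_len - 1 ids and re-append EOS so the clipped
    # sequence is still terminated.
    return list(token_ids[:max_len - 1]) + [eos_token_id]
```

This differs from the original script's behavior, so it is an experiment to try rather than a drop-in fix.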

Thanks!

@lupantech
Owner

Hi,

Below is my script:

cd symbolic_solver
python test.py --label final --strategy final

And the running log is here: https://github.com/lupantech/InterGPS/blob/main/symbolic_solver/logs/final/log-1612098244-predict_low-first_1.log.

The executed result is here: https://github.com/lupantech/InterGPS/blob/main/symbolic_solver/pred_results/final/logic_1612098244-predict_low-first_1.json.

@ICanFlyGFC
Author

Thank you Pan!

I can run your script and get the corresponding results, but I am focused on the theorem predictor.
I wonder how to generate ../theorem_predict/results/pred_seq_result_bart_epoch19_seq5.json.
I also found that many geometry problems can be solved by rules based on the formal language, without theorems.
Does that mean theorem prediction is not so important in this paper?

Thanks!

Best,
Fucheng

@lupantech
Owner

Hi Fucheng,

For the theorem predictor, you can follow the instructions at https://github.com/lupantech/InterGPS#theorem-predictor.

For the second question, yes. As we discussed in the paper, one of the main functions of the theorem predictor is to improve the search efficiency and thus improve the final accuracy, which is verified in Table 7 and Figure 5.
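
To illustrate the search-efficiency point, here is a toy sketch of how a predicted theorem sequence could bias the order in which a solver tries theorems. It is not InterGPS's actual search strategy: `order_theorems` and the integer theorem ids are hypothetical, and the only assumption is that predictions are tried first and all remaining theorems afterwards.

```python
def order_theorems(predicted, all_theorems):
    """Return all theorems, with valid predicted ones first and no duplicates."""
    valid = set(all_theorems)
    seen = set()
    ordered = []
    # Predicted theorems come first; the full list follows as a fallback.
    for theorem in list(predicted) + list(all_theorems):
        if theorem in valid and theorem not in seen:
            seen.add(theorem)
            ordered.append(theorem)
    return ordered
```

With an empty prediction the function returns the default order, which matches the observation that the solver can still work without the predictor, only with a less informed search.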

Best,
Pan

@ICanFlyGFC
Author

Thanks, Pan!

I followed the instructions at https://github.com/lupantech/InterGPS#theorem-predictor.
I downloaded the pre-trained model in step 4, but the evaluation results are empty in step 5.
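
To make "empty" measurable, the output file can be summarized by counting how many problems have no valid sequence at all. The sketch below assumes the result layout written by the evaluation script earlier in this thread (`{"id": ..., "num_seqs": ..., "seq": [...]}` per problem); `count_empty_predictions` is a hypothetical helper.

```python
def count_empty_predictions(results):
    """Return (problems_without_any_valid_sequence, total_problems)."""
    # A problem counts as empty when every entry in its "seq" list is empty.
    empty = sum(1 for entry in results.values() if not any(entry["seq"]))
    return empty, len(results)


# Usage with the file written by the script (path taken from the script above):
#   import json
#   with open("results/test/pred_seqs_test_debugging.json") as f:
#       print(count_empty_predictions(json.load(f)))
```

If every problem is reported as empty, the failure is systematic (for example in loading or decoding) rather than tied to a few long inputs.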

Thanks!

Best,
Fucheng

@lupantech
Owner

Hi Fucheng,

I see. Would you mind if I checked your issue a few days later? I am working on some urgent deadlines and need more time to figure out your problem. For now, I think it is not a big problem to ignore the theorem predictor if you just want to reproduce our results.

I appreciate your understanding!

Best,
Pan

@ICanFlyGFC
Author

Thanks, Pan.

Sure.
Thank you for your work; I look forward to your new achievements. Your paper and code have inspired me a lot.

Best,
Fucheng

@lupantech
Owner

Hi Fucheng,

Thanks! I am happy to help with your project as well!

Yours sincerely,
Pan
