Extractive Prediction Instead of Abstractive Prediction #21

Closed

kandarpkakkad opened this issue Jun 12, 2020 · 38 comments

Comments

@kandarpkakkad

Hi!
I tried running the pre-trained model on my dataset, which consists of paragraphs as inputs and one-sentence summaries as targets. The problem is that the prediction is extracted verbatim from the input instead of being generated abstractively as expected.

kandarp@kandarp:~/Downloads/pegasus$ python3 pegasus/bin/evaluate.py --params=new_params --param_overrides=vocab_filename=ckpt/pegasus_ckpt/c4.unigram.newline.10pct.96000.model,batch_size=1,beam_size=5,beam_alpha=0.6 --model_dir=ckpt/pegasus_ckpt

Here new_params refers to the newly registered .tfrecords dataset for testing.

I am getting the following output:

I0612 16:45:32.333093 140653653737856 text_eval.py:126] INPUTS: [0]:
Live in the country and last three years longer than my city friends? Good news indeed, more backing for a lifestyle choice made half a lifetime ago when it seemed a good idea to exchange an Edinburgh terrace for a farm cottage. I knew it was a good idea because I had been there before. Born and reared on a farm I had been seduced for a few years by the idea of being a big shot who lived and worked in a city rather than only going for the day to wave at the buses. True, I was familiar with some of the minor disadvantages of country living such as an iffy private water supply sometimes infiltrated by a range of flora and fauna (including, on one memorable occasion, a dead lamb), the absence of central heating in farmhouses and cottages, and a single track farm road easily blocked by snow, broken-down machinery or escaped livestock. But there were many advantages as I told Liz back in the mid-Seventies. Town born and bred, eight months pregnant and exchanging a warm, substantial Corstorphine terrace for a windswept farm cottage on a much lower income, persuading her that country had it over town might have been difficult.
I0612 16:45:32.334013 140653653737856 text_eval.py:126] TARGETS: Although there are many advantages of country living, it is still difficult to persuade a town- born and bred person to live in the country due to disadvantages and inconvenience of country living life.
I0612 16:45:32.335105 140653653737856 text_eval.py:126] PREDICTIONS: Good news indeed, more backing for a lifestyle choice made half a lifetime ago when it seemed a good idea to exchange an Edinburgh terrace for a farm cottage.

The prediction is just the second sentence of the input.

Is this a mistake on my part, or is it a limitation of the model?

@spookypineapple

spookypineapple commented Jun 12, 2020

@kandarpkakkad would you be able to share how you added your own dataset? I've tried to do the same (following the official TFDS tutorial) but wasn't able to do so successfully.
Much appreciated!

@kandarpkakkad
Author

Make a Python file, keep it in the pegasus folder, and run it so that the generated TFRecord is stored in the testdata folder.

import pandas as pd
import tensorflow as tf

name_dict = dict(
    inputs=[
        # Your Inputs
    ],
    targets=[
        # Your Targets For Inputs Respectively.
    ]
)

df = pd.DataFrame(name_dict)

print(df)

header = ["inputs", "targets"]

df.to_csv('output.csv', columns=header, index=False)

csv = pd.read_csv("output.csv").values
with tf.io.TFRecordWriter("pegasus/data/testdata/test_pattern.tfrecords") as writer:
    for row in csv:
        inputs, targets = row[:-1], row[-1]
        example = tf.train.Example(
            features=tf.train.Features(
                feature={
                    "inputs": tf.train.Feature(bytes_list=tf.train.BytesList(value=[inputs[0].encode('utf-8')])),
                    "targets": tf.train.Feature(bytes_list=tf.train.BytesList(value=[targets.encode('utf-8')])),
                }
            )
        )
        writer.write(example.SerializeToString())

I had one input/target pair per example.
But if you have N inputs per target pair, you have to make inputs an ndarray.
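
For the N-inputs case, here is a minimal sketch of one possible approach (mine, not tested against the PEGASUS pipeline): join the N input columns of each row into a single text field before encoding, since the "inputs" feature above stores a single bytes value per example.

import pandas as pd
import tensorflow as tf

# Hypothetical layout: every column except the last holds an input, the last column holds the target.
csv = pd.read_csv("output.csv").values
with tf.io.TFRecordWriter("pegasus/data/testdata/test_pattern.tfrecords") as writer:
    for row in csv:
        inputs, target = row[:-1], row[-1]
        joined = "\n".join(str(x) for x in inputs)  # assumption: concatenating the inputs is acceptable
        example = tf.train.Example(
            features=tf.train.Features(
                feature={
                    "inputs": tf.train.Feature(bytes_list=tf.train.BytesList(value=[joined.encode("utf-8")])),
                    "targets": tf.train.Feature(bytes_list=tf.train.BytesList(value=[str(target).encode("utf-8")])),
                }
            )
        )
        writer.write(example.SerializeToString())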

@spookypineapple

Thanks for sharing!

@JingqingZ
Collaborator

JingqingZ commented Jun 12, 2020

This makes sense to some extent because you're using the pre-trained model without any fine-tuning (i.e. zero-shot summarization). I also experienced similar extractive behaviour on some datasets like AESLC and XSum. During the pre-training, PEGASUS encourages some extractive behaviour to fit the more extractive downstream tasks better (this is also described in our paper, section 6.2). PEGASUS has good performance with little supervision (100-1000 examples) on some datasets but the performance with zero-shot is still limited, especially on abstractive datasets. If your dataset is very abstractive, some fine-tuning should be helpful.

@kandarpkakkad
Author

Ok! Thank you very much. I got my answer and so I am closing this issue.

@ManuMahadevaswamy

@kandarpkakkad, did you implement this on Google Colab? I am facing a disk space shortage issue; any suggestions?

@ManuMahadevaswamy

I would love to hear from you as well, @JingqingZ.

@kandarpkakkad
Author

@ManuMahadevaswamy No, I have not implemented it on Google Colab. I ran this on a local computer, but there was a message that it was using 10% extra RAM, so I guess it is heavy. I am sorry, I don't know how to optimise it.

@ManuMahadevaswamy

I tried running it on a local computer too; it is taking ages to download the pretrained model.

@kandarpkakkad
Author

Sorry, I can't help with that.

@ManuMahadevaswamy

No problem. Thanks!

@JingqingZ
Collaborator

The vocab and all model checkpoints (pre-trained + all fine-tuned) take ~29 GB of space, so please make sure Google Colab has sufficient space (i.e. that Google Drive has sufficient space available if Google Drive is mounted).
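
As a quick sanity check (my suggestion, not part of the reply above), you can print the free disk space from a Colab cell before downloading the checkpoints:

# Minimal sketch: report free disk space on the Colab VM before downloading ~29 GB of checkpoints.
import shutil

total, used, free = shutil.disk_usage("/content")  # /content is Colab's default working directory
print(f"free disk space: {free / 2**30:.1f} GiB")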

@ManuMahadevaswamy

Okay, thanks. I haven't mounted Google Drive; I am using Colab disk space alone.

@ManuMahadevaswamy

Is 'model.ckpt-1500000.data-00000-of-00001' the hybrid model trained on the 'c4+HugeNews' datasets?
Is 'c4.unigram.newline.10pct.96000.model' the model trained on only c4 data?

Can anyone explain this, please?

@ManuMahadevaswamy

Also, I'm getting this warning while fine-tuning on the aeslc dataset; what does it mean?

'2020-06-21 12:19:04.185622: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 393637888 exceeds 10% of system memory.'

@JingqingZ
Collaborator

Is 'model.ckpt-1500000.data-00000-of-00001' the hybrid model trained on the 'c4+HugeNews' datasets?

Yes.

Is 'c4.unigram.newline.10pct.96000.model' the model trained on only c4 data?

No, this is the vocab model for SentencePiece.

W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 393637888 exceeds 10% of system memory.

This means your CPU memory is running low, but it should be fine as the model can still run as normal. You may try to free up some CPU memory.

@ManuMahadevaswamy

-- I am implementing this on Colab and it has enough space in RAM as well as on disk.
-- How can I use just 1000 examples from a particular dataset, say wikihow, to fine-tune the model?
-- How can I view the wikihow data?

@ManuMahadevaswamy

@kandarpkakkad, I tried creating a test set with one input and one target and placed it in the testdata folder.
Next, I registered 'new_params' by appending the code below to the public_params.py file.

@registry.register("new_params")
def new_params(param_overrides):
  return transformer_params(
      {
          "train_pattern": "tfds:aeslc-train",
          "dev_pattern": "tfds:aeslc-validation",
          "test_pattern": "tfrecord:test_pattern.tfrecord",
          "max_input_len": 512,
          "max_output_len": 32,
          "train_steps": 32000,
          "learning_rate": 0.0001,
          "batch_size": 8,
      }, param_overrides)

and finally tried testing the model:

!python3 pegasus/bin/evaluate.py --params=new_params \
--param_overrides=vocab_filename=ckpt/pegasus_ckpt/c4.unigram.newline.10pct.96000.model,batch_size=1,beam_size=5,beam_alpha=0.6 \
--model_dir=ckpt/pegasus_ckpt

but the model isn't predicting results for the test set that I provided; it is giving a prediction for a different example:

I0622 22:42:07.407472 139655975044992 text_eval.py:126] INPUTS: [0]:
To ensure a smooth flow of bank resolutions to the necessary signatories, I am requesting that Enron Treasury first route the bank resolutions to Angela Davis (EWS Legal) to be initialed before being routed to John Lavorato or Louise Kitchen.
If you have any questions please call me at 3-6544.
Thank you for your attention to this matter.

I0622 22:42:07.407813 139655975044992 text_eval.py:126] TARGETS: Treasury Bank Resolutions

I0622 22:42:07.407969 139655975044992 text_eval.py:126] PREDICTIONS: Thank you for your attention to this matter.

@JingqingZ, can somebody please explain what the issue could be?

@JingqingZ
Collaborator

-- How can I use just 1000 examples from a particular dataset, say wikihow, to fine-tune the model?

Take advantage of the flag --param_overrides=train_pattern=tfds:wikihow/all-train-take_1000. For example:

python3 pegasus/bin/train.py --params=wikihow_all_transformer \
--param_overrides=vocab_filename=ckpt/pegasus_ckpt/c4.unigram.newline.10pct.96000.model,train_pattern=tfds:wikihow/all-train-take_1000 \
--train_init_checkpoint=ckpt/pegasus_ckpt/model.ckpt-1500000 \
--model_dir=ckpt/pegasus_ckpt/wikihow

Reference in the code to load datasets

def build(self, input_pattern, shuffle_files):

-- How can I view the wikihow data?

Hope this may help: https://www.tensorflow.org/datasets/catalog/wikihow

@JingqingZ
Collaborator

--model_dir=ckpt/pegasus_ckpt

It seems you're using the pre-trained model checkpoint instead of a fine-tuned model checkpoint. Using the pre-trained checkpoint directly for zero-shot summarization may produce such extractive output (i.e. the prediction is extracted from the input). An explanation is given earlier in this issue.

@ManuMahadevaswamy

Thank you, it was very helpful.

Also, can we create a new dataset to fine-tune the pretrained model in TFRecord format, or does it have to be in TFDS format?

@JingqingZ
Collaborator

TFrecords should be workable

@ManuMahadevaswamy

Thank you, @JingqingZ.

@ManuMahadevaswamy

@JingqingZ, I have a CSV file of {input, target} pairs; the inputs and targets are of variable length, and the code shared by @kandarpkakkad is not working in this case. Could you please suggest a method to convert this into TFRecord?

@JingqingZ
Collaborator

This sounds a little "abstractive". I think the idea of the shared code is right, but if your data comes in a different form, you will definitely need to modify the code.
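
As an illustration only (a sketch of mine, not code from this repo), one way to adapt the earlier script for a CSV whose text columns have variable length is to read it with pandas and skip incomplete rows; the file name data.csv and the output path are placeholders:

import pandas as pd
import tensorflow as tf

def _bytes_feature(text):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[text.encode("utf-8")]))

df = pd.read_csv("data.csv")  # hypothetical CSV with "inputs" and "targets" text columns
with tf.io.TFRecordWriter("pegasus/data/testdata/test_pattern_1.tfrecord") as writer:
    for _, row in df.iterrows():
        if pd.isna(row["inputs"]) or pd.isna(row["targets"]):
            continue  # skip incomplete pairs; variable-length text is otherwise fine
        example = tf.train.Example(features=tf.train.Features(feature={
            "inputs": _bytes_feature(str(row["inputs"])),
            "targets": _bytes_feature(str(row["targets"])),
        }))
        writer.write(example.SerializeToString())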

@ManuMahadevaswamy

@JingqingZ, thank you.

@bicrxm

bicrxm commented Jun 28, 2020

File "/content/Abstractive-Text-Summarization-Pegasus/datasets.py", line 30, in get_dataset
dataset_name)
ValueError: Dataset name /content/Abstractive-Text-Summarization-Pegasus/test_pattern_1.tfrecords is not found in registered datasets.

"""
_DATASETS = {}

def get_dataset(dataset_name):
if dataset_name not in _DATASETS:
raise ValueError("Dataset name %s is not found in registered datasets." %
dataset_name)
return _DATASETSdataset_name
"""

@ManuMahadevaswamy

File "/content/Abstractive-Text-Summarization-Pegasus/datasets.py", line 30, in get_dataset
dataset_name)
ValueError: Dataset name /content/Abstractive-Text-Summarization-Pegasus/test_pattern_1.tfrecords is not found in registered datasets.

"""
_DATASETS = {}

def get_dataset(dataset_name):
if dataset_name not in _DATASETS:
raise ValueError("Dataset name %s is not found in registered datasets." %
dataset_name)
return _DATASETSdataset_name
"""

The file extension should be .tfrecord, not .tfrecords.
Also, you need to register the dataset in the public_params.py file.

@rjbanner

rjbanner commented Jul 1, 2020

Hi @ManuMahadevaswamy, was there any resolution to this? I only want to generate a summary for my input.

With the following dataset registration in public_params.py, it seems to iterate through all the tests instead of the single record in test_pattern.tfrecord:

@registry.register("new_params")
def new_params(param_overrides):
  return transformer_params(
      {
          "train_pattern": "tfds:aeslc-train",
          "dev_pattern": "tfds:aeslc-validation",
          "test_pattern": "tfrecord:test_pattern.tfrecord",
          "max_input_len": 512,
          "max_output_len": 32,
          "train_steps": 32000,
          "learning_rate": 0.0001,
          "batch_size": 8,
      }, param_overrides)

@kandarpkakkad, what did your dataset registration in public_params.py look like?

@JingqingZ it appears that for test_pattern, the test_pattern.tfrecord file gets picked up from the data/testdata folder. Can we supply the same format for train_pattern and dev_pattern, i.e.:

"train_pattern": "tfrecord:test_pattern.tfrecord",
"dev_pattern": "tfrecord:test_pattern.tfrecord",
"test_pattern": "tfrecord:test_pattern.tfrecord"

@qiong-sportsbet

@rjbanner @darienacosta
It seems we need to provide the full path to test_pattern.tfrecord, as follows:
@registry.register("new_params")
def new_params(param_overrides):
  return transformer_params(
      {
          "train_pattern": "tfds:aeslc-train",
          "dev_pattern": "tfds:aeslc-validation",
          "test_pattern": "tfrecord:/home/pegasus/Desktop/pegasus/pegasus/data/test_data/test_pattern.tfrecord",
          "max_input_len": 512,
          "max_output_len": 32,
          "train_steps": 32000,
          "learning_rate": 0.0001,
          "batch_size": 8,
      }, param_overrides)

@gserb-datascientist

It seems that just changing "test_pattern" is not enough to prevent endless iteration through all tests instead of the single record in test_pattern.tfrecord.

Finally, I ended up changing train_pattern and dev_pattern in addition to test_pattern to make it produce a single iteration as expected:

@registry.register("new_params")
def new_params(param_overrides):
  return transformer_params(
      {
          "train_pattern": "tfrecord:/home/pegasus/Desktop/pegasus/pegasus/data/test_data/test_pattern.tfrecord",
          "dev_pattern": "tfrecord:/home/pegasus/Desktop/pegasus/pegasus/data/test_data/test_pattern.tfrecord",
          "test_pattern": "tfrecord:/home/pegasus/Desktop/pegasus/pegasus/data/test_data/test_pattern.tfrecord",
          "max_input_len": 512,
          "max_output_len": 32,
          "train_steps": 32000,
          "learning_rate": 0.0001,
          "batch_size": 8,
      }, param_overrides)

@ManuMahadevaswamy

I am trying to train the PEGASUS model on a GPU with a new fine-tuning dataset that has just 600 records. The execution gets interrupted halfway through; I tried keeping the batch size low by setting batch_size=1:

INFO:tensorflow:global_step/sec: 0.908995
I0708 19:54:06.618062 140370869380992 tpu_estimator.py:2307] global_step/sec: 0.908995
INFO:tensorflow:examples/sec: 0.908995
I0708 19:54:06.618438 140370869380992 tpu_estimator.py:2308] examples/sec: 0.908995
^C

With batch_size=8, I am getting the error below:

2020-07-08 18:59:17.896051: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at softmax_op_gpu.cu.cc:157 : Resource exhausted: OOM when allocating tensor with shape[8,16,1024,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
ERROR:tensorflow:Error recorded from training_loop: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[8,16,1024,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node encoder/layer_7/self_attention/Softmax (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

@JingqingZ @rjbanner @bickrombishsass, is it because of the RAM size? I am using 12 GB of RAM with a GPU.
Please help if anyone has successfully fine-tuned the model with a new dataset.

@JingqingZ
Collaborator

JingqingZ commented Jul 8, 2020

It seems to be out of memory. 12 GB should be enough to run the model with batch_size=1 (or 4), but may not be enough for batch_size=8.

You can also check the input sequence length and output sequence length. If the input/output length is too long, e.g. >= 1024, the self-attention module will take a lot of memory.
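
For a rough sense of scale (my own back-of-the-envelope arithmetic, not from the reply above), the OOM tensor shape [8, 16, 1024, 1024] reported earlier is a single float32 attention-score map of batch x heads x seq x seq:

batch, heads, seq, bytes_per_float = 8, 16, 1024, 4
print(batch * heads * seq * seq * bytes_per_float / 2**30, "GiB")  # 0.5 GiB
# One such map exists per attention layer, and activations plus gradients are kept for
# backprop, so a 12 GB GPU fills up quickly at batch_size=8 with 1024-token sequences.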

@ManuMahadevaswamy

Yes, it could be because of that. I have inputs longer than 1024.
Is there a way to optimize this for inputs longer than 1024?
@JingqingZ, thanks for your quick response.

@JingqingZ
Collaborator

Unfortunately, the model cannot handle inputs longer than the max length you set (like 512 or 1024). For example, on the arXiv and PubMed datasets, PEGASUS only takes the first 1024 tokens into the encoder and discards the remaining tokens. You can refer to other papers that work on multi-document summarization or summarization with very long inputs.
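
If it helps, here is a small sketch (mine, assuming the inputs are tokenised with the same SentencePiece vocab file used in the commands above) to pre-truncate overly long inputs before writing the TFRecord, since tokens beyond max_input_len are discarded by the encoder anyway:

import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.Load("ckpt/pegasus_ckpt/c4.unigram.newline.10pct.96000.model")

def truncate(text, max_tokens=1024):
    # Keep roughly the first max_tokens SentencePiece tokens of the input text.
    ids = sp.EncodeAsIds(text)
    return sp.DecodeIds(ids[:max_tokens]) if len(ids) > max_tokens else text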

@ManuMahadevaswamy

Thank you @JingqingZ, I will check them.

@IanMack-uk

IanMack-uk commented Sep 11, 2020

Hi everyone,
I am trying to get PEGASUS to summarise a test article (just to get it working). I have followed all the advice and no errors are shown, but the summary files are still not being created. Below are (1) the data input code, (2) the section added to public_params.py, and (3) the evaluate command. I'm a complete newbie, so any help is appreciated. Best, Ian

import pandas as pd
import tensorflow as tf

save_path = "/content/pegasus/pegasus/data/testdata/test_pattern_1.tfrecord"

input_dict = dict(
    inputs=[
        "Prime Minister Narendra Modi and President Ashraf Ghani on Monday discussed the evolving security situation in the region against the backdrop of a spike in terrorist violence in Afghanistan. The two leaders had a phone conversation and exchanged greetings on the occasion of Eid-ul-Azha. Ghani thanked Modi for timely supply of food and medical assistance to meet Afghanistan’s requirements, according to an official statement. “The two leaders also exchanged views on the evolving security situation in the region and other areas of mutual bilateral interest,” the statement said, adding Modi had reiterated India’s commitment to the Afghan people “in their quest for a peaceful, prosperous and inclusive Afghanistan”. The phone conversation coincided with a meeting in Kabul between Afghan National Security Adviser Hamdullah Mohib and Indian ambassador Vinay Kumar. Kumar spoke of India’s continued help for Afghanistan to counter the Covid-19 pandemic, and Mohib thanked India for its support and affirmed Afghanistan’s “strong commitment to the bilateral partnership”, the Afghan National Security Council said. The contacts between the two sides came a week after Chinese foreign minister Wang Yi chaired a virtual meeting with his counterparts from Afghanistan, Nepal and Pakistan and called for “four-party cooperation”, including on the China-Pakistan Economic Corridor (CPEC). Wang also said Kathmandu and Kabul should learn from China-Pakistan cooperation and the four countries should work together to extend CPEC to Afghanistan. The meeting, held against the backdrop of the China-India border standoff, was viewed warily in New Delhi. India also has serious concerns about the spike in terrorist violence in Afghanistan and the role being played by Pakistan in the Afghan peace process. Days after a report by a UN expert team monitoring the implementation of sanctions against banned terror groups and individuals said almost 6,500 Pakistani terrorists were present in Afghanistan, acting Afghan interior minister Masoud Andarabi said on Monday that the new leader of the Khorasan unit of Islamic State was a member of the Pakistan-based Haqqani Network.“Shahab Almahajir, the newly appointed leader of Islamic State of Khorasan Province-ISKP is a Haqani member. Haqani & the Taliban carry out their terrorism on a daily basis across Afg & when their terrorist activities [do] not suit them politically they rebrand it under ISKP,” Andarabi tweeted."
    ],
    targets=[
        ""  # Targets are supposed to represent ground truth - so, for predictions, leave blank
    ]
)

data = pd.DataFrame(input_dict)

with tf.io.TFRecordWriter(save_path) as writer:
    for row in data.values:
        inputs, targets = row[:-1], row[-1]
        example = tf.train.Example(
            features=tf.train.Features(
                feature={
                    "inputs": tf.train.Feature(bytes_list=tf.train.BytesList(value=[inputs[0].encode('utf-8')])),
                    "targets": tf.train.Feature(bytes_list=tf.train.BytesList(value=[targets.encode('utf-8')])),
                }
            )
        )
        writer.write(example.SerializeToString())


save_path = "/content/pegasus/pegasus/data/testdata/test_pattern_1.tfrecord"

@registry.register("test_transformer")
def test_transformer(param_overrides):
  return transformer_params(
      {
          "train_pattern": "/content/pegasus/pegasus/data/testdata/test_pattern_1.tfrecord",
          "dev_pattern": "/content/pegasus/pegasus/data/testdata/test_pattern_1.tfrecord",
          "test_pattern": "/content/pegasus/pegasus/data/testdata/test_pattern_1.tfrecord",
          "max_input_len": 1024,
          "max_output_len": 256,
          "train_steps": 180000,
          "learning_rate": 0.0001,
          "batch_size": 8,
      }, param_overrides)


!python3 /content/pegasus/pegasus/bin/evaluate.py --params=test_transformer \
--param_overrides=vocab_filename=ckpt/pegasus_ckpt/c4.unigram.newline.10pct.96000.model,batch_size=1,beam_size=5,beam_alpha=0.6 \
--model_dir=ckpt/pegasus_ckpt/


When I run these last couple of lines, the notice below appears. Does this have anything to do with the issue? Thanks.

WARNING:tensorflow:From /content/pegasus/pegasus/bin/evaluate.py:85: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
W0911 18:33:41.139357 139982413326208 deprecation.py:323] From /content/pegasus/pegasus/bin/evaluate.py:85: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Traceback (most recent call last):
File "/content/pegasus/pegasus/bin/evaluate.py", line 153, in
tf.compat.v1.app.run(main)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/content/pegasus/pegasus/bin/evaluate.py", line 85, in main
if not FLAGS.wait and not tf.train.checkpoint_exists(FLAGS.model_dir):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/checkpoint_management.py", line 393, in checkpoint_exists
return checkpoint_exists_internal(checkpoint_prefix)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/checkpoint_management.py", line 366, in checkpoint_exists_internal
if file_io.get_matching_files(pathname):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/lib/io/file_io.py", line 363, in get_matching_files
return get_matching_files_v2(filename)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/lib/io/file_io.py", line 384, in get_matching_files_v2
compat.as_bytes(pattern))
tensorflow.python.framework.errors_impl.NotFoundError: ckpt/pegasus_ckpt; No such file or directory

@JingqingZ
Collaborator

ckpt/pegasus_ckpt; No such file or directory

It seems the folder of checkpoints hasn't been created.
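
A quick check (my suggestion, not from the reply above) to confirm the checkpoint directory exists and actually contains a checkpoint before running evaluate.py:

import os
import tensorflow as tf

model_dir = "ckpt/pegasus_ckpt/"
print("directory exists:", os.path.isdir(model_dir))
print("latest checkpoint:", tf.train.latest_checkpoint(model_dir))  # None if nothing has been downloaded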
