Extractive Prediction Instead of Abstractive Prediction #21
@kandarpkakkad would you be able to share how you added your own dataset? I've tried to do the same (following the official TFDS tutorial) but wasn't able to successfully do so.
Make a Python file, keep it in the pegasus folder, and run it so that the record is stored in the testdata folder:

```python
import pandas as pd
import tensorflow as tf

name_dict = dict(
    inputs=[
        # Your inputs
    ],
    targets=[
        # Your targets for the inputs, respectively
    ],
)

df = pd.DataFrame(name_dict)
print(df)

header = ["inputs", "targets"]
df.to_csv("output.csv", columns=header, index=False)

csv = pd.read_csv("output.csv").values
with tf.io.TFRecordWriter("pegasus/data/testdata/test_pattern.tfrecords") as writer:
    for row in csv:
        inputs, targets = row[:-1], row[-1]
        example = tf.train.Example(
            features=tf.train.Features(
                feature={
                    "inputs": tf.train.Feature(
                        bytes_list=tf.train.BytesList(value=[inputs[0].encode("utf-8")])
                    ),
                    "targets": tf.train.Feature(
                        bytes_list=tf.train.BytesList(value=[targets.encode("utf-8")])
                    ),
                }
            )
        )
        writer.write(example.SerializeToString())
```

I had one input/target pair.
Thanks for sharing!
This makes sense to some extent because you're using the pre-trained model without any fine-tuning (i.e. zero-shot summarization). I also experienced similar extractive behaviour on some datasets like AESLC and XSum. During the pre-training, PEGASUS encourages some extractive behaviour to fit the more extractive downstream tasks better (this is also described in our paper, section 6.2). PEGASUS has good performance with little supervision (100-1000 examples) on some datasets but the performance with zero-shot is still limited, especially on abstractive datasets. If your dataset is very abstractive, some fine-tuning should be helpful.
Ok! Thank you very much. I got my answer and so I am closing this issue.
@kandarpkakkad, did you implement this on Google Colab? I am facing a disk space shortage issue.
would love to hear from you as well @JingqingZ
No, I have not implemented it in Google Colab. I ran this on a local computer, but there was a message that it was using 10% extra RAM, so I guess it is heavy. I am sorry, I don't know how to optimise it.
I tried running it on a local computer too; it is taking ages to download the pretrained model.
Sorry, can't help.
No problem. Thanks.
The vocab and all model checkpoints (pre-trained + all fine-tuned) take ~29 GB of space, so please make sure Google Colab has sufficient space (i.e. that Google Drive has sufficient space available, if Google Drive is mounted).
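As a quick sanity check before downloading, free disk space can be inspected from Python. A minimal sketch; the `/content` path is Colab's local disk and is an assumption, so substitute whatever directory you plan to download into:

```python
import os
import shutil

def has_room(path, needed_gb=29):
    """Return True if `path` has at least `needed_gb` GiB free."""
    free_gib = shutil.disk_usage(path).free / 1024**3
    return free_gib >= needed_gb

# "/content" is Colab's local disk; fall back to the current directory elsewhere.
target = "/content" if os.path.isdir("/content") else "."
print(has_room(target))
```

The 29 GB figure comes from the comment above and covers the vocab plus all checkpoints; a smaller value is fine if you only download one fine-tuned model.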
Okay, thanks. I haven't mounted Google Drive; I am using Colab disk space alone.
'model.ckpt-1500000.data-00000-of-00001': is this the hybrid model trained on the 'C4+HugeNews' datasets? Can anyone explain this, please?
Also, I'm getting this warning while fine-tuning with the AESLC dataset. What does it mean? '2020-06-21 12:19:04.185622: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 393637888 exceeds 10% of system memory.'
Yes
No, this is the vocab model for SentencePiece.
This means your CPU is running low on memory, but it should be fine as the model can still run as normal. You may try to free up some CPU memory.
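For intuition, the number in that warning allows some rough back-of-envelope arithmetic (assuming the 10% threshold is measured against total system memory):

```python
# The warning reports a single allocation of 393637888 bytes exceeding 10%
# of system memory, so total RAM must be below alloc / 0.10.
alloc_bytes = 393_637_888

alloc_mib = alloc_bytes / 1024**2               # size of the one allocation
implied_ram_gib = alloc_bytes / 0.10 / 1024**3  # rough upper bound on total RAM

print(round(alloc_mib))           # 375
print(round(implied_ram_gib, 2))  # 3.67
```

So a single ~375 MiB buffer triggering the warning suggests the machine has less than roughly 3.7 GiB of RAM, which matches the "heavy" experience reported earlier in the thread.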
I am implementing this on Colab and it has enough space in RAM as well as on disk.
@kandarpkakkad, I tried creating a test set with 1 input and 1 target, placed it in the testdata folder, registered it with @registry.register("new_params"), and finally tried testing the model. But the model isn't predicting results for the test set that I gave; it is giving a prediction for a different example:

```
I0622 22:42:07.407472 139655975044992 text_eval.py:126] INPUTS: [0]:
I0622 22:42:07.407813 139655975044992 text_eval.py:126] TARGETS: Treasury Bank Resolutions
I0622 22:42:07.407969 139655975044992 text_eval.py:126] PREDICTIONS: Thank you for your attention to this matter.
```

@JingqingZ can somebody please explain what could be the issue?
Take advantage of the flag
Reference in the code to load datasets: pegasus/pegasus/data/datasets.py, line 177 at commit addaf5a.
Hope this may help https://www.tensorflow.org/datasets/catalog/wikihow
It seems you're using the pre-trained model checkpoint instead of fine-tuned model checkpoints. Using the pre-trained model checkpoint directly for zero-shot summarization may produce such extractive output (i.e. the prediction is extracted from the input). An explanation is provided above in our previous communication in this issue.
Thank you, it was very helpful. Also, can we create a new dataset to fine-tune the pretrained model in the TFRecord format, or should it be in TFDS format only?
TFRecords should be workable.
Thank you JingqingZ
@JingqingZ, I have a CSV file of {input, target} pairs where the inputs and targets are of variable length. The code shared by @kandarpkakkad is not working in this case. Could you please suggest a method to convert this into TFRecord?
This sounds a little "abstractive". I think the idea of the shared code is right, but if you have pairs of data in a different form, you definitely need to modify some code.
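One common pitfall with variable-length text in CSV files is comma and quote handling. A minimal, TensorFlow-free sketch of reading (input, target) pairs robustly with the stdlib csv module; the column names `inputs`/`targets` match the earlier snippet and are assumptions for your file:

```python
import csv
import io

# Stand-in for your CSV file; note the commas inside the quoted input text.
sample = io.StringIO(
    'inputs,targets\n'
    '"A long, multi-sentence document, with commas.","Short summary."\n'
    '"Another document of a different length.","Another summary."\n'
)

# csv.DictReader handles quoting, so fields of any length parse correctly.
pairs = [(row["inputs"], row["targets"]) for row in csv.DictReader(sample)]

print(len(pairs))   # 2
print(pairs[0][1])  # Short summary.
```

Each pair can then be serialized into a tf.train.Example exactly as in the snippet shared earlier in this thread.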
@JingqingZ, thank you.
```
File "/content/Abstractive-Text-Summarization-Pegasus/datasets.py", line 30, in get_dataset
    """
def get_dataset(dataset_name):
```
The file extension should be .tfrecord, not .tfrecords.
Hi @ManuMahadevaswamy, was there any resolution to this? I only want to generate a summary for my input. With the following dataset registration in public_params.py, it seems to iterate through all tests instead of the single record in test_pattern.tfrecord:

@registry.register("new_params")

@kandarpkakkad what did your dataset registration in public_params.py look like?

@JingqingZ it appears that for test_pattern the test_pattern.tfrecord file would get picked up from the data/testdata folder. Can we supply the same format for train_pattern and dev_pattern, i.e.:

"train_pattern": "tfrecord:test_pattern.tfrecord",
@rjbanner @darienacosta
It seems just changing "test_pattern" is not enough to prevent endless iterations through all tests instead of the single record in test_pattern.tfrecord. I finally ended up changing train_pattern and dev_pattern in addition to test_pattern to make it produce a single iteration as expected:

@registry.register("new_params")
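The fix above boils down to pointing all three patterns at the same file. A sketch of just the pattern values; the "tfrecord:" prefix and key names come from this thread, while the exact path is a hypothetical example:

```python
# Hypothetical path; substitute your own TFRecord file. The "tfrecord:"
# prefix is the format used in this thread for raw TFRecord inputs.
record = "tfrecord:pegasus/data/testdata/test_pattern.tfrecord"

overrides = {
    "train_pattern": record,
    "dev_pattern": record,
    "test_pattern": record,
}

print(len(set(overrides.values())))  # 1: all three patterns point at one file
```

The actual registration in public_params.py wraps values like these inside the @registry.register("new_params") function, so the shape there may differ slightly.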
I am trying to train the PEGASUS model with a new fine-tuning dataset which has just 600 records, on a GPU. The execution is getting interrupted halfway through. I tried keeping the batch size low; with batch_size=1 I get

```
INFO:tensorflow:global_step/sec: 0.908995
```

and with batch_size=8 I get the error below:

```
2020-07-08 18:59:17.896051: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at softmax_op_gpu.cu.cc:157 : Resource exhausted: OOM when allocating tensor with shape[8,16,1024,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
```

@JingqingZ @rjbanner @bickrombishsass, is it because of the RAM size? I am using 12 GB RAM with a GPU.
It seems out-of-memory. 12 GB should be enough to run the model with batch_size=1 (or 4), but may not be enough for batch_size=8. You can also check the input and output sequence lengths. If the input/output length is too long, e.g. >= 1024, the self-attention module will take a lot of memory.
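The failing tensor shape in the error above makes this concrete; a quick sketch of its size:

```python
# shape [8, 16, 1024, 1024] from the OOM message:
# (batch, attention heads, seq_len, seq_len) of float32.
batch, heads, seq_len = 8, 16, 1024

elements = batch * heads * seq_len * seq_len
mib = elements * 4 / 1024**2  # float32 = 4 bytes per element

print(elements)  # 134217728
print(mib)       # 512.0 MiB for a single attention-score tensor
```

Attention memory grows quadratically with sequence length and linearly with batch size, which is why dropping batch_size from 8 to 1 helps so much.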
Yes, it could be because of that. I have inputs with length greater than 1024.
Unfortunately, the model cannot handle inputs longer than the max length you set (like 512 or 1024). For example, on the arXiv and PubMed datasets, PEGASUS only takes the first 1024 tokens into the encoder and discards the other tokens. You can refer to other papers which work on multi-document summarization or summarization with very long inputs.
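The truncation described above is simple to sketch; whitespace splitting here is only a stand-in for the model's actual SentencePiece tokenizer:

```python
def truncate_tokens(text, max_len=1024):
    """Keep only the first `max_len` tokens, discarding the rest."""
    return text.split()[:max_len]

doc = "token " * 2000               # a 2000-token input
print(len(truncate_tokens(doc)))    # 1024: everything past position 1024 is lost
```

Any content past the cutoff never reaches the encoder, so summaries of very long documents can only reflect their opening sections.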
Thank you @JingqingZ, I will check them.
Hi everyone,

```python
import pandas as pd
import tensorflow as tf

save_path = "/content/pegasus/pegasus/data/testdata/test_pattern_1.tfrecord"

input_dict = dict(
    # inputs and targets elided in the original comment
)
data = pd.DataFrame(input_dict)

with tf.io.TFRecordWriter(save_path) as writer:
    ...  # serialization loop elided in the original comment
```

```
save_path = "/content/pegasus/pegasus/data/testdata/test_pattern_1.tfrecord"
!python3 /content/pegasus/pegasus/bin/evaluate.py --params=test_transformer
```

When I run this last couple of lines, this notice appears. Is this anything to do with the issue? Thx

```
WARNING:tensorflow:From /content/pegasus/pegasus/bin/evaluate.py:85: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
```
It seems the folder of checkpoints hasn't been created.
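If the checkpoint directory is missing, creating it ahead of time avoids this. A sketch; the path here is hypothetical and depends on where your model_dir flag points:

```shell
# Hypothetical checkpoint layout; substitute whatever your model_dir points at,
# e.g. the fine-tuning dataset name used in this thread (aeslc).
mkdir -p ckpt/pegasus_ckpt/aeslc
ls ckpt/pegasus_ckpt
```

The downloaded pre-trained checkpoint files then go inside that directory before running train or evaluate.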
Hi!
I have tried running the pre-trained model to test it on my dataset, which consists of paragraphs as inputs and one-line sentences as targets. The problem is that the prediction was extracted from the input instead of being generated as expected.
The new_params is the new .tfrecords dataset for testing.
In the output, I am getting the following:
The prediction is the second line of the input.
Is this a mistake on my part, or is it a problem with the model?