## Imports

In [None]:
from pprint import pprint
from transformers import pipeline

## Create Summarization Pipeline with Facebook's `bart-large-cnn` pre-trained model
The model accepts at most 1024 tokens as input.

In [None]:
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
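
To check whether a text stays within the 1024-token limit before calling the pipeline, the tokens can be counted with the model's own tokenizer. The sketch below is an illustration and not part of the original notebook: `count_tokens` and `fits_model` are hypothetical helper names, and the tokenizer is passed in as a callable, assuming it returns a mapping with an `input_ids` entry as Hugging Face tokenizers do.

```python
def count_tokens(text, tokenizer):
    """Return how many token ids `tokenizer` produces for `text`.

    `tokenizer` is assumed to be a callable returning a mapping with an
    "input_ids" entry (for example `summarizer.tokenizer`).
    """
    return len(tokenizer(text)["input_ids"])


def fits_model(text, tokenizer, limit=1024):
    """Return True when `text` tokenizes to at most `limit` tokens."""
    return count_tokens(text, tokenizer) <= limit
```

With the pipeline created above, `fits_model(text, summarizer.tokenizer)` would indicate whether `text` can be summarized in a single call. The count includes any special tokens the tokenizer adds.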

We test the model on an article made up of several paragraphs.

In [None]:
ARTICLE = """ 
Insulation medium is an indispensable part of a power transformer, and the interesting thing about it is that it can drop dead at any time. Most of the time, this doesn’t happen, 
but… This article presents the transformer oil features by which one must abide. These features are examined according to standards to ensure the oil’s integrity. 
Transformer-impregnated paper is the determining factor in transformer age, so it is discussed extensively.

The importance of insulation was increased over the years due to the increase in the voltage rating of transformers. Within the last decades, although research on transformer
insulation and diagnosis methods have been improved so much, the insulation of HV transformers remained more or less unchanged, and for EHV and UHV transformers,
the oil–paper insulation is dominant.

Transformer oil has elements indicative of the transformer’s overall health. For example, a DGA transformer oil test shows levels of gases like acetylene and ethylene
rising above a safe limit. This indicates the aging of the transformer.

Other crucial transformer oil tests are the Power/Dissipation factor and Oil breakdown voltage which indicate the dielectric strength and power factor of the transformer oil. 
Any deviation from ideal values is an alarming sign that the transformer needs attention to mitigate the risk of failure.

Transformer oil can be categorized into three types: Mineral oil, Silicone-based oil, and Bio-based oil. Owing to the excellent cooling and insulating property of mineral oils,
these have been the most used transformer oil for many years. However, as years and research progressed, the shortcoming of mineral oils started to gain attention. Poor biodegradability, potential flammability, and low moisture tolerance are the most important concerns.

However, it is a topic for another day to discuss the suitability of these oil types.
"""
output_dict = summarizer(ARTICLE, max_length=142, min_length=30, do_sample=False)[0]
print("Model output:")
pprint(output_dict)

Model output:
{'summary_text': 'Insulation medium is an indispensable part of a power '
                 'transformer, and the interesting thing about it is that it '
                 'can drop dead at any time. This article presents the '
                 'transformer oil features by which one must abide. These '
                 'features are examined according to standards to ensure the '
                 'oil’s integrity.'}


## Break the text into chunks/paragraphs and process repeatedly

To keep each input short and let the model see all of the text, we break it into chunks. Here we separate the blocks at the paragraph breaks.

In [None]:
paragraph_break = "\n\n"

# Summarize each paragraph on its own and collect the results in order.
summary_text_list = []
for i, paragraph in enumerate(ARTICLE.split(paragraph_break), 1):
    result = summarizer(paragraph.strip(), max_length=50, min_length=30, do_sample=False)
    summary_text_list.append(result[0]['summary_text'])
    print(f"Summary of paragraph {i}:")
    pprint(summary_text_list[-1])

Summary of paragraph 1:
('Insulation medium is an indispensable part of a power transformer. This '
 'article presents the transformer oil features by which one must abide. These '
 'features are examined according to standards to ensure the oil’s integrity.')
Summary of paragraph 2:
('The importance of insulation was increased over the years due to the '
 'increase in the voltage rating of transformers. For EHV and UHV '
 'transformers, the oil–paper insulation is dominant.')
Summary of paragraph 3:
('Transformer oil has elements indicative of the transformer’s overall health. '
 'For example, a DGA transformer oil test shows levels of gases like acetylene '
 'and ethylenerising above a safe limit.')
Summary of paragraph 4:
('Transformer oil tests are the Power/Dissipation factor and Oil breakdown '
 'voltage which indicate the dielectric strength and power factor of the '
 'transformer oil. Any deviation from ideal values is an alarming sign that '
 'the transformer needs attenti

Your max_length is set to 50, but you input_length is only 21. You might consider decreasing max_length manually, e.g. summarizer('...', max_length=10)


Summary of paragraph 5:
(' mineral oils have been the most used transformer oil for many years. '
 'However, as years and research progressed, the shortcoming of mineral oils '
 'started to gain attention. Poor biodegradability, potential flammability, '
 'and low moisture tolerance')
Summary of paragraph 6:
('It is a topic for another day to discuss the suitability of these oil types. '
 'However, it is an important topic for the future of the country.')
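
The length warning in the output above is raised for the last paragraph, whose 21 input tokens are fewer than `min_length=30`; the second sentence of its summary does not appear in the source text. One possible way to avoid this, sketched here as an assumption and not as the notebook's method, is to pass very short paragraphs through unchanged and to cap `max_length` at the input length for the rest. `summarize_or_keep` is a hypothetical helper; `summarizer` and `count_tokens` are passed in as callables.

```python
def summarize_or_keep(paragraph, summarizer, count_tokens,
                      min_length=30, max_length=50):
    """Summarize `paragraph`, or return it unchanged when it is too short.

    `summarizer` is assumed to behave like the summarization pipeline
    (returns a list of dicts with a "summary_text" key).
    `count_tokens` is assumed to map a string to its token count.
    """
    text = paragraph.strip()
    n_tokens = count_tokens(text)
    if n_tokens <= min_length:
        # Forcing at least `min_length` output tokens would make the
        # model produce more text than it was given, so keep the input.
        return text
    result = summarizer(
        text,
        max_length=min(max_length, n_tokens),  # never ask for more than the input
        min_length=min_length,
        do_sample=False,
    )
    return result[0]["summary_text"]
```

With the real pipeline, the token counter could be `lambda t: len(summarizer.tokenizer(t)["input_ids"])`.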


## Assimilate all individual summaries into a final large summary

Join the summaries as separate paragraphs and pass the result into the model.

In [None]:
joined_summary = '\n'.join(summary_text_list)

output_dict = summarizer(joined_summary, max_length=142, min_length=30, do_sample=False)[0]

print("Final output:")
pprint(output_dict)

Final output:
{'summary_text': 'Insulation medium is an indispensable part of a power '
                 'transformer. This article presents the transformer oil '
                 'features by which one must abide. These features are '
                 "examined according to standards to ensure the oil's "
                 'integrity.'}
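
The two stages used in this notebook (summarize each paragraph, then summarize the joined summaries) can be wrapped in one function. This is a minimal sketch under stated assumptions: `summarize_by_paragraph` is a hypothetical name, and the two summarization steps are passed in as plain string-to-string callables so the function does not depend on a particular model. Unlike the loop above, it also skips empty paragraphs.

```python
def summarize_by_paragraph(article, summarize_chunk, summarize_final,
                           paragraph_break="\n\n"):
    """Summarize each paragraph, then summarize the joined summaries.

    `summarize_chunk` and `summarize_final` are assumed to be callables
    that take a string and return a summary string.
    Returns (final_summary, list_of_paragraph_summaries).
    """
    # Split on the paragraph break and drop blocks that are only whitespace.
    paragraphs = [p.strip() for p in article.split(paragraph_break) if p.strip()]
    # First stage: one summary per paragraph, in document order.
    partial_summaries = [summarize_chunk(p) for p in paragraphs]
    # Second stage: one summary over all paragraph summaries, one per line.
    final_summary = summarize_final("\n".join(partial_summaries))
    return final_summary, partial_summaries
```

With the pipeline from this notebook, the callables could be written as `lambda t: summarizer(t, max_length=50, min_length=30, do_sample=False)[0]["summary_text"]` for the chunks and the same call with `max_length=142` for the final pass.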
