If you're opening this notebook on Colab, you will probably need to install 🤗 Transformers, 🤗 Datasets and a few other dependencies. Uncomment the following cell and run it.

In [None]:
#! pip install datasets evaluate transformers rouge-score nltk

If you're opening this notebook locally, make sure your environment has the latest version of those libraries installed.

To be able to share your model with the community and generate results like the one shown in the picture below via the inference API, there are a few more steps to follow.

First, store your authentication token from the Hugging Face website (sign up [here](https://huggingface.co/join) if you haven't already!), then execute the following cell and paste your access token when prompted:

In [2]:
from huggingface_hub import notebook_login

notebook_login()


Then you need to install Git-LFS. Uncomment the following instructions:

In [None]:
# !apt install git-lfs

Make sure your version of Transformers is at least 4.11.0 since the functionality was introduced in that version:

In [3]:
import transformers

print(transformers.__version__)

4.46.3
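Reading the printed version by eye works, but comparing version strings directly is error-prone: as plain strings, `"4.9.2"` sorts after `"4.11.0"`. The sketch below shows one way to compare numerically, using only the standard library. The helper names `version_tuple` and `meets_minimum` are made up for this illustration and are not part of Transformers.

```python
import re


def version_tuple(version_string):
    """Return the first three numeric components of a version string as ints."""
    return tuple(int(part) for part in re.findall(r"\d+", version_string)[:3])


def meets_minimum(installed, minimum="4.11.0"):
    """Compare versions numerically rather than as strings."""
    return version_tuple(installed) >= version_tuple(minimum)


print(meets_minimum("4.46.3"))  # the version printed above
print(meets_minimum("4.9.2"))   # older than 4.11.0, even though "4.9.2" > "4.11.0" as strings
```

In the notebook itself you could pass `transformers.__version__` as the first argument.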


You can find a script version of this notebook to fine-tune your model in a distributed fashion using multiple GPUs or TPUs [here](https://github.com/huggingface/transformers/tree/master/examples/seq2seq).

We also quickly upload some telemetry - this tells us which examples and software versions are getting used so we know where to prioritize our maintenance efforts. We don't collect (or care about) any personally identifiable information, but if you'd prefer not to be counted, feel free to skip this step or delete this cell entirely.

In [4]:
from transformers.utils import send_example_telemetry

send_example_telemetry("summarization_notebook", framework="pytorch")

# Fine-tuning a model on a summarization task

In this notebook, we will see how to fine-tune one of the [🤗 Transformers](https://github.com/huggingface/transformers) models for a summarization task. We will use the [XSum dataset](https://arxiv.org/pdf/1808.08745.pdf) (for extreme summarization), which contains BBC articles accompanied by single-sentence summaries.

![Widget inference on a summarization task](images/summarization.png)

We will see how to easily load the dataset for this task using 🤗 Datasets and how to fine-tune a model on it using the `Trainer` API.

In [2]:
model_checkpoint = "t5-small"

This notebook is built to run with any model checkpoint from the [Model Hub](https://huggingface.co/models) as long as that model has a sequence-to-sequence version in the Transformers library. Here we picked the [`t5-small`](https://huggingface.co/t5-small) checkpoint.

## Loading the dataset

We will use the [🤗 Datasets](https://github.com/huggingface/datasets) library to download the data and the 🤗 Evaluate library to get the metric we need for evaluation (to compare our model to the benchmark). This can be easily done with the functions `load_dataset` (from `datasets`) and `load` (from `evaluate`).

In [8]:
from datasets import load_dataset
from evaluate import load

raw_datasets = load_dataset("xsum")
metric = load("rouge")


The `raw_datasets` object itself is a [`DatasetDict`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasetdict), which contains one key each for the training, validation and test sets:

In [9]:
raw_datasets

DatasetDict({
    train: Dataset({
        features: ['document', 'summary', 'id'],
        num_rows: 204045
    })
    validation: Dataset({
        features: ['document', 'summary', 'id'],
        num_rows: 11332
    })
    test: Dataset({
        features: ['document', 'summary', 'id'],
        num_rows: 11334
    })
})

To access an actual element, you need to select a split first, then give an index:

In [10]:
raw_datasets["train"][0]

 'summary': 'Clean-up operations are continuing across the Scottish Borders and Dumfries and Galloway after flooding caused by Storm Frank.',
 'id': '35232142'}

To get a sense of what the data looks like, the following function will show some examples picked randomly in the dataset.

In [11]:
import datasets
import random
import pandas as pd
from IPython.display import display, HTML

def show_random_elements(dataset, num_examples=5):
    assert num_examples <= len(dataset), "Can't pick more elements than there are in the dataset."
    picks = []
    for _ in range(num_examples):
        pick = random.randint(0, len(dataset)-1)
        while pick in picks:
            pick = random.randint(0, len(dataset)-1)
        picks.append(pick)
    
    df = pd.DataFrame(dataset[picks])
    for column, typ in dataset.features.items():
        if isinstance(typ, datasets.ClassLabel):
            df[column] = df[column].transform(lambda i: typ.names[i])
    display(HTML(df.to_html()))
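The retry loop above draws indices until it has enough distinct ones. As a side note, the same index selection can be sketched with `random.sample`, which picks without replacement in one call. The function name `pick_unique_indices` and its `seed` parameter are assumptions added for this illustration; they are not used elsewhere in the notebook.

```python
import random


def pick_unique_indices(dataset_length, num_examples=5, seed=None):
    """Pick `num_examples` distinct row indices from range(dataset_length)."""
    if num_examples > dataset_length:
        raise ValueError("Can't pick more elements than there are in the dataset.")
    rng = random.Random(seed)  # a fixed seed makes the picks reproducible
    return rng.sample(range(dataset_length), num_examples)


# 204045 is the size of the XSum training split shown earlier.
picks = pick_unique_indices(204045, num_examples=5, seed=0)
print(picks)
```

The resulting list can be used the same way as `picks` in `show_random_elements`, for example `dataset[picks]`.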

In [12]:
show_random_elements(raw_datasets["train"])

Unnamed: 0,document,summary,id
0,"The theatre is undergoing an £16m refurbishment and was originally planned to open by Easter next year.\nNow Hull City Council has revealed the venue will not be ready until August at the earliest.\nSteven Bayes, the Labour councillor responsible for tourism, said the delay was due to uncertainty over funding.\n""Clearly everyone would have liked to have seen the project finished as soon as possible in 2017, but we feel this is the best method of getting as much done for as long a period of time.""\nLast week, the Chancellor George Osborne announced a £13m grant towards Hull hosting the year long City of Culture arts festival.","The reopening of Hull New Theatre will be delayed until the summer of 2017, missing most of the City of Culture arts festival.",35879272
1,"The Stop Online Piracy Act (Sopa) aims to slash the amount of pirated content on the internet.\nBut signatories including Google co-founder Sergey Brin claim it amounts to China-style censorship.\nThe bill has the backing of Hollywood and the music industry.\nSopa was introduced by Judiciary Committee chairman Lamar Smith, a Texas Republican, who said the legislation was designed to ""stop the flow of revenue to rogue websites... that profit from selling pirated goods without any legal consequences"".\nIt would give content owners and the US government the power to request court orders to shut down websites associated with piracy.\nSopa aims to stop online ad networks and payment processors from doing business with foreign websites accused of enabling or facilitating copyright infringement.\nIt could stop search engines from linking to the allegedly infringing sites. Domain name registrars could be forced to take down the websites, and internet service providers could be forced to block access to the sites accused of infringing.\nA similar law, the Protect IP Act, is making its way through the US Senate.\nCritics argue that the proposals are too broad and could lead to the closure of a range of sites.\nThe latest letter, published in several US newspapers including the Wall Street Journal, The Washington Post and the New York Times, reads: ""We've all had the good fortune to found internet companies and non-profits in a regulatory climate that promotes entrepreneurship, innovation, the creation of content and free expression online.\n""However we're worried that the Protect IP Act and the Stop Online Piracy Act - which started out as well-meaning efforts to control piracy online - will undermine that framework.""\nThe letter said that the legislation would require web services to monitor what users link to or upload.\nThe bill would also ""deny website owners the right to due process"" and ""give the US government the power to censor the web using techniques 
similar to those used by China, Malaysia and Iran"", the letter goes on.\n""We urge Congress to think hard before changing the regulation that underpins the internet... Let's not deny the next generation of entrepreneurs and founders the same opportunities that we all had.""\nThe letter was signed by Twitter co-founders Jack Dorsey, Biz Stone and Evan Williams; Flickr co-founder Caterina Fake; Yahoo! co-founders David Filo and Jerry Yang; LinkedIn co-founder Reid Hoffman; YouTube co-founder Chad Hurley; PayPal co-founder Elon Musk; Craigslist founder Craig Newmark; eBay founder Pierre Omidyar and Wikipedia founder Jimmy Wales.\nAnother appeal, signed by 83 key internet engineersincluding father of the internet Vint Cerf, has also been sent to Congress.\n""We cannot have a free and open internet unless its naming and routing systems sit above the political concerns and objectives of any one government or industry,"" it reads.\n""Censorship of internet infrastructure will inevitably cause network errors and security problems. This is true in China, Iran and other countries that censor the network today; it will be just as true of American censorship.""\nA group of US politicians is proposing an alternative to Sopa that would see funding cut off to foreign websites accused of copyright infringements in a similar way to how the US ended Wikileaks' commercial operation.\nThey argue that the International Trade Commission (ITC) should take charge of combating piracy, instead of judges. 
The ITC would be tasked with reviewing claims of online infringement against foreign website owners, ordering them cut off from funding if the claims prove true.\nWhile the US moves to tighten its copyright laws, the UK is aiming to relax its own.\nThe Intellectual Property Office has launched a consultation exercise intended, among other things, to allow the ripping of CDs to digital music players.\nIt follows recommendations from Professor Ian Hargreaves inhis review of intellectual property.\nOther plans include allowing data mining of scientific research for non-commercial use and a licensing scheme to make it easier for digital services to gain access to copyrighted works. It also proposes relaxing copyright rules around ""parody"" videos which are increasingly popular on YouTube.\nThe move was welcomed by the British Library and watchdog Consumer Focus, but The Publishers' Association said it was concerned that the relaxation could make intellectual property theft easier.","The founders of Google, Twitter and eBay have signed a strongly worded letter criticising controversial US legislation ahead of a debate in Congress.",16195344
2,"Lt Gen James Terry, who is co-ordinating efforts against IS, said the soldiers would be in addition to 3,100 US soldiers already promised.\nHe did not say which coalition nations would provide the extra troops or what role they would play.\nThe US has agreed to send troops to Iraq in an advisory role.\nCoalition members discussed the Islamic State issue and made the troop pledge at a security conference in the region last week, Gen Terry said.\nGen Terry also told the conference that air strikes against IS were taking a toll on the militants' campaign in Iraq and Syria.\nThe US state department says nearly 60 countries belong to the coalition, although most play no direct role in the air strikes.\nIt is hoped the deployment of additional soldiers will increase the effectiveness of the Iraqi army, much of which proved ineffective under an IS onslaught last summer.\n""While [the Iraqi security forces] have a long way to go I think they're becoming more capable every day,"" Lt Terry said.\n""When you start now to balance the different capabilities out across the coalition, I think we're doing pretty well in terms of boots on the ground.""\nMeanwhile, the Combined Joint Task Force announced that US-led coalition forces carried out 15 air strikes in Syria and 31 in Iraq between 3 and 8 December.\nIS controls large areas of Syria and Iraq, imposing a rigid version of Sunni Islam and persecuting or killing non-believers.\nThe group has also executed several western hostages and has promised to kill more.\nEarlier on Monday US Defence Secretary Chuck Hagel said that the US would not be reviewing its policy of not paying ransoms in hostage situations, in spite of several failed rescue bids in the last few months.\nIn the latest incident, British-born US journalist Luke Somers and South African teacher Pierre Korkie were both killed by al-Qaeda gunmen in Yemen during an attempted rescue operation on Saturday.","The US-led coalition fighting Islamic State (IS) militants 
has pledged to send an additional 1,500 troops to Iraq, a top US commander has said.",30388718
3,"The survey suggested that staffing levels and a lack of consultation continued to be the biggest concerns.\nBut the vast majority said they were happy to ""go the extra mile"" and trusted their direct line manager.\nThe survey was completed by 60,681 of NHS Scotland's 160,635 staff - the highest response rate in its history.\nOverall, health workers responded less positively to 17 of the 29 questions and more positively to four questions. The remaining eight questions saw no percentage change.\nAmong the key findings were:\nThe 33% of all employees who thought their department had enough staff fell to 26% for nurses and midwives and 12% for ambulance personnel.\nThe report said there had been ""statistically significant"" improvements in one of the 29 questions, and significant deteriorations in four questions.\nBut all eight statements relating to the overall experience of working for the NHS received a positive response from 53% or more of all respondents.\nNorman Provan, Royal College of Nursing Scotland associate director, said: ""Demand on NHS services resources continues to outstrip supply, putting staff under enormous pressure.\n""It is worrying that nurses and midwives, like other staff groups, say that they are not consulted about changes and that they are also unclear about how changes will work out in practice.\n""The experience of staff on the frontline is key to building sustainable services and better, safer patient care. It is crucial that they are listened to.""\nHealth Secretary Shona Robison welcomed the increase in responses to the survey, and said the feedback was ""extremely important in letting us know what is going right and where we can make improvements"".\nShe added: ""It is welcome that staff remain committed to their roles, with almost nine out of ten willing to go the 'extra mile' at work. 
It is also promising that so many staff members have confidence and trust in their direct line manager, and get help and support from colleagues when needed.\n""However, we know there are challenges to be addressed, in particular making staff feel engaged and involved in the decisions being made within their health board.\n""It is vital that we learn from these findings, and I will be expecting all health boards to use their individual survey results to work with staff to bring in changes which will further improve staff experience.""\nScottish Conservative health spokesman Jackson Carlaw said the survey ""highlights again just how short we are of nurses, and the SNP has to take full responsibility for that"".\nScottish Labour's Dr Richard Simpson said NHS staff do ""incredible work every day"" but were ""under valued and overworked"" by the Scottish government.","NHS workers in Scotland are slightly less positive about their jobs than they were last year, according to the annual health service staff survey.",35070925
4,"Media playback is not supported on this device\nJamie Vardy scored twice but missed out on a hat-trick by blazing a second penalty over the bar.\nCity's longest-serving player Andy King was also on the score-sheet with Kevin Mirallas netting Everton's consolation.\nBut this was almost a sideshow to the festivities for the Foxes' first top-flight title in their 132-year history.\nMedia playback is not supported on this device\nThe celebrations included a pre-match performance from Italian tenor Andrea Bocelli in front of a sold-out stadium bedecked in banners declaring the club's new status as champions of England.\nAnd then, once the game had finished, came the moment all connected with the club had been waiting for - the lifting of the Premier League trophy.\nIt was held aloft by captain Wes Morgan and manager Claudio Ranieri to a rapturous response from those inside the King Power Stadium and thousands more who had gathered outside the ground.\nSaturday afternoon was billed as a title party, but for most inside the stadium the celebrations have been ongoing since Tottenham's draw at Chelsea confirmed the Foxes as the most remarkable of Premier League winners.\nThe City players themselves only returned to training for this fixture on Thursday after being given an extra day off to revel in their stunning triumph and presumably to fully recover from a rowdy Monday night at Jamie Vardy's house.\nFrom the moment they emerged for their pre-match warm-up, they were cheered to the rafters by supporters who had begun to mass in the stadium hours earlier to enjoy a beer on the club and revel in its greatest moment.\nSo keen were the home fans to show their appreciation to Ranieri and his side that the Italian was forced to quieten them for the performance of Nessun Dorma by compatriot Bocelli, who received a huge ovation when he removed his tracksuit top to reveal a Leicester shirt beneath bearing his name.\nBut despite the carnival atmosphere and a guard of honour from 
their opponents there was no complacency from the Foxes, who provided a fast-paced and committed finale befitting the side with the best home record in the division.\nIf the rumoured Hollywood film of Vardy's life does come to pass it now has a superb ending, although it came within a few yards of the perfect denouement.\nHaving missed the last two games through suspension, the 29-year-old England striker returned to the starting XI and made an instant impact, scoring with what was his first meaningful touch to steer in from King's chipped delivery.\nA constant menace to the Everton defence, his second came from the spot after he had been tripped by Matthew Pennington.\nBut with a dream scenario beckoning after Darron Gibson had clattered into Jeffrey Schlupp in the box, Vardy blasted his hat-trick chance high over the Everton bar from the spot.\nWith 24 goals for the season, Vardy is now one behind Tottenham's Harry Kane in the race to be Premier League top-scorer, although he has now had a hand in more top-flight goals this season than any other player (30).\nMidfielder King is a one-club man, the longest serving member of the Leicester squad and sole survivor of the side that won promotion from League One in 2009.\nHis 334th career appearance only came about because of Danny Drinkwater's suspension, but the 27-year-old illustrated the reasons for his Foxes longevity with a dynamic and influential display.\nAfter setting up the opener he was perfectly placed to latch on to a loose ball, following Riyad Mahrez's mazy run down the right, and fire his side into a commanding lead.\nThe goal was significant for another reason, as it took Leicester's all-time league goal difference to +1 - an apt time to go into the black.\nGiven the context of the day Everton were always in danger of being totally overshadowed, especially as they have little left to play for this season.\nBut while the Toffees may not have needed Saturday's points, their under-pressure manager Roberto 
Martinez was in dire need of a proud and professional performance from his side.\nHowever, they were far too accommodating of their opponents in the first half, essentially extending their pre-match guard of honour to allow City a match-winning hold on the game before the break.\nMedia playback is not supported on this device\nThey rallied slightly early in the second half, with Oumar Niasse and Romelu Lukaku drawing good saves from Kaspar Schmeichel.\nBut when Mirallas scored a superb solo goal, which saw him run from the halfway line before beating Schmeichel in the 87th minute, many of the visiting side already seemed to be eyeing a swift exit to allow the Leicester party to get under way.\nLeicester end their season with a trip to the team they have just deposed as Premier League champions, Chelsea.\nEverton have two Premier League games remaining - away at Sunderland on Wednesday before a final-day home game against Norwich next Sunday.\nMatch ends, Leicester City 3, Everton 1.\nSecond Half ends, Leicester City 3, Everton 1.\nCorner, Leicester City. Conceded by Joel Robles.\nAttempt saved. Demarai Gray (Leicester City) right footed shot from the right side of the box is saved in the bottom right corner. Assisted by N'Golo Kanté.\nAttempt missed. Leonardo Ulloa (Leicester City) header from the centre of the box is high and wide to the right. Assisted by Demarai Gray with a cross following a corner.\nCorner, Leicester City. Conceded by James McCarthy.\nSubstitution, Leicester City. Demarai Gray replaces Riyad Mahrez.\nAttempt blocked. Christian Fuchs (Leicester City) right footed shot from the left side of the box is blocked. Assisted by Riyad Mahrez with a cross.\nAttempt blocked. Andy King (Leicester City) left footed shot from the centre of the box is blocked.\nCorner, Leicester City. Conceded by John Stones.\nGoal! Leicester City 3, Everton 1. Kevin Mirallas (Everton) right footed shot from the centre of the box to the bottom right corner. 
Assisted by Darron Gibson.\nAttempt missed. Andy King (Leicester City) header from the left side of the six yard box is close, but misses to the left. Assisted by Riyad Mahrez with a cross following a corner.\nCorner, Leicester City. Conceded by Joel Robles.\nAttempt missed. Leonardo Ulloa (Leicester City) header from the centre of the box is just a bit too high. Assisted by Christian Fuchs with a cross.\nAttempt blocked. Riyad Mahrez (Leicester City) left footed shot from outside the box is blocked. Assisted by Andy King.\nOffside, Leicester City. Kasper Schmeichel tries a through ball, but Jamie Vardy is caught offside.\nAttempt missed. Andy King (Leicester City) header from very close range misses to the right. Assisted by Leonardo Ulloa following a corner.\nAttempt missed. Leonardo Ulloa (Leicester City) header from the centre of the box is close, but misses to the left. Assisted by Christian Fuchs with a cross following a corner.\nCorner, Leicester City. Conceded by John Stones.\nFoul by Kevin Mirallas (Everton).\nChristian Fuchs (Leicester City) wins a free kick in the defensive half.\nCorner, Everton. Conceded by Christian Fuchs.\nSubstitution, Everton. Leon Osman replaces Ross Barkley.\nAttempt missed. Leonardo Ulloa (Leicester City) header from the centre of the box is close, but misses to the right. Assisted by Riyad Mahrez with a cross.\nAttempt blocked. N'Golo Kanté (Leicester City) right footed shot from outside the box is blocked.\nAttempt blocked. Andy King (Leicester City) right footed shot from the left side of the box is blocked. Assisted by Leonardo Ulloa.\nCorner, Leicester City. Conceded by Matthew Pennington.\nAttempt blocked. Jamie Vardy (Leicester City) left footed shot from the right side of the box is blocked.\nAttempt missed. Leonardo Ulloa (Leicester City) header from the right side of the six yard box is just a bit too high. Assisted by Riyad Mahrez with a cross following a corner.\nCorner, Leicester City. 
Conceded by Matthew Pennington.\nAttempt blocked. Jamie Vardy (Leicester City) right footed shot from outside the box is blocked. Assisted by Andy King.\nAttempt saved. Bryan Oviedo (Everton) left footed shot from outside the box is saved in the bottom left corner. Assisted by Kevin Mirallas.\nPenalty missed! Bad penalty by Jamie Vardy (Leicester City) right footed shot is too high. Jamie Vardy should be disappointed.\nDarron Gibson (Everton) is shown the yellow card for a bad foul.\nPenalty conceded by Darron Gibson (Everton) after a foul in the penalty area.\nPenalty Leicester City. Jeffrey Schlupp draws a foul in the penalty area.\nAttempt blocked. Wes Morgan (Leicester City) header from the centre of the box is blocked. Assisted by Christian Fuchs with a cross.\nCorner, Leicester City. Conceded by Bryan Oviedo.\nCorner, Leicester City. Conceded by Joel Robles.\nAttempt saved. Jamie Vardy (Leicester City) left footed shot from the left side of the box is saved in the bottom right corner. Assisted by Riyad Mahrez.",Champions Leicester comfortably beat Everton before being presented with the Premier League trophy during a day of celebrations at the King Power Stadium.,36176623


The metric is an instance of `evaluate.EvaluationModule`, the 🤗 Evaluate successor to [`datasets.Metric`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Metric):

In [13]:
metric

EvaluationModule(name: "rouge", module_type: "metric", features: [{'predictions': Value(dtype='string', id='sequence'), 'references': Sequence(feature=Value(dtype='string', id='sequence'), length=-1, id=None)}, {'predictions': Value(dtype='string', id='sequence'), 'references': Value(dtype='string', id='sequence')}], usage: """
Calculates average rouge scores for a list of hypotheses and references
Args:
    predictions: list of predictions to score. Each prediction
        should be a string with tokens separated by spaces.
    references: list of reference for each prediction. Each
        reference should be a string with tokens separated by spaces.
    rouge_types: A list of rouge types to calculate.
        Valid names:
        `"rouge{n}"` (e.g. `"rouge1"`, `"rouge2"`) where: {n} is the n-gram based scoring,
        `"rougeL"`: Longest common subsequence based scoring.
        `"rougeLsum"`: rougeLsum splits text using `"\n"`.
        See details in https://github.com/huggingface/

You can call its `compute` method with your predictions and labels, which need to be lists of decoded strings:

In [18]:
fake_preds = ["hello there", "general kenobi"]
fake_labels = ["hello there", "general kenobi"]
metric.compute(predictions=fake_preds, references=fake_labels)

{'rouge1': 1.0, 'rouge2': 1.0, 'rougeL': 1.0, 'rougeLsum': 1.0}
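To get a feel for what the `rouge1` number means, here is a simplified sketch of a unigram-overlap F1 score. It is an illustration only, not the implementation behind `metric.compute`: it assumes lowercased whitespace tokenization and does no stemming or other normalization, so its values can differ from the real metric. The function name `rouge1_f1` is made up for this example.

```python
from collections import Counter


def rouge1_f1(prediction, reference):
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


print(rouge1_f1("hello there", "hello there"))  # identical strings score 1.0
print(rouge1_f1("the cat sat", "the cat"))      # partial overlap scores below 1.0
```

Identical predictions and references give a perfect score, which matches the `1.0` values in the output above.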

## Preprocessing the data

Before we can feed those texts to our model, we need to preprocess them. This is done by a 🤗 Transformers `Tokenizer`, which will (as the name indicates) tokenize the inputs (including converting the tokens to their corresponding IDs in the pretrained vocabulary) and put them in a format the model expects, as well as generate the other inputs that the model requires.

To do all of this, we instantiate our tokenizer with the `AutoTokenizer.from_pretrained` method, which will ensure:

- we get a tokenizer that corresponds to the model architecture we want to use,
- we download the vocabulary used when pretraining this specific checkpoint.

That vocabulary will be cached, so it's not downloaded again the next time we run the cell.

In [19]:
from transformers import AutoTokenizer
    
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

By default, the call above will use one of the fast tokenizers (backed by Rust) from the 🤗 Tokenizers library.

You can directly call this tokenizer on one sentence or a pair of sentences:

In [20]:
tokenizer("Hello, this one sentence!")

{'input_ids': [8774, 6, 48, 80, 7142, 55, 1], 'attention_mask': [1, 1, 1, 1, 1, 1, 1]}

Depending on the model you selected, you will see different keys in the dictionary returned by the cell above. They don't matter much for what we're doing here (just know they are required by the model we will instantiate later). You can learn more about them in [this tutorial](https://huggingface.co/transformers/preprocessing.html) if you're interested.

Instead of one sentence, we can pass along a list of sentences:

In [5]:
tokenizer(["Hello, this one sentence!", "This is another sentence."])

{'input_ids': [[8774, 6, 48, 80, 7142, 55, 1], [100, 19, 430, 7142, 5, 1]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1]]}

To prepare the targets for our model, we need to tokenize them using the `text_target` parameter. This will make sure the tokenizer uses the special tokens corresponding to the targets:

In [6]:
print(tokenizer(text_target=["Hello, this one sentence!", "This is another sentence."]))

{'input_ids': [[8774, 6, 48, 80, 7142, 55, 1], [100, 19, 430, 7142, 5, 1]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1]]}


If you are using one of the five T5 checkpoints, you have to prefix the inputs with "summarize:" (the model can also translate, and it needs the prefix to know which task it has to perform).

In [None]:
if model_checkpoint in ["t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b"]:
    prefix = "summarize: "
else:
    prefix = ""
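The check above matches the checkpoint name exactly, so a namespaced Hub ID would fall through to the empty prefix. The sketch below shows one possible way to make the check tolerant of a namespace by looking only at the last path component. The helper `summarization_prefix` and the ID `google-t5/t5-small` are assumptions used for illustration; whether a given namespaced repository really is one of the original T5 checkpoints is something you would need to verify yourself.

```python
T5_CHECKPOINTS = {"t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b"}


def summarization_prefix(checkpoint):
    """Return the task prefix for T5-style checkpoints, else an empty string."""
    # Keep only the part after the last "/", so "org/t5-small" matches "t5-small".
    name = checkpoint.split("/")[-1]
    return "summarize: " if name in T5_CHECKPOINTS else ""


print(repr(summarization_prefix("t5-small")))
print(repr(summarization_prefix("google-t5/t5-small")))  # assumed namespaced ID
print(repr(summarization_prefix("facebook/bart-base")))
```

Note the trade-off: this also matches a fork that merely reuses one of these names, so the exact-match version is the more conservative choice.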

We can then write the function that will preprocess our samples. We just feed them to the `tokenizer` with the argument `truncation=True`. This ensures that an input longer than the `max_length` we pass (which should not exceed what the selected model can handle) is truncated to that length. The padding will be dealt with later on (in a data collator), so we pad examples to the longest length in the batch and not the whole dataset.

In [25]:
max_input_length = 1024
max_target_length = 128

def preprocess_function(examples):
    inputs = [prefix + doc for doc in examples["document"]]
    model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True)

    # Setup the tokenizer for targets
    labels = tokenizer(text_target=examples["summary"], max_length=max_target_length, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

This function works with one or several examples. In the case of several examples, the tokenizer will return a list of lists for each key:

In [26]:
preprocess_function(raw_datasets['train'][:2])

{'input_ids': [[21603, 10, 37, 423, 583, 13, 1783, 16, 20126, 16496, 6, 80, 13, 8, 844, 6025, 4161, 6, 19, 341, 271, 14841, 5, 7057, 161, 19, 4912, 16, 1626, 5981, 11, 186, 7540, 16, 1276, 15, 2296, 7, 5718, 2367, 14621, 4161, 57, 4125, 387, 5, 15059, 7, 30, 8, 4653, 4939, 711, 747, 522, 17879, 788, 12, 1783, 44, 8, 15763, 6029, 1813, 9, 7472, 5, 1404, 1623, 11, 5699, 277, 130, 4161, 57, 18368, 16, 20126, 16496, 227, 8, 2473, 5895, 15, 147, 89, 22411, 139, 8, 1511, 5, 1485, 3271, 3, 21926, 9, 472, 19623, 5251, 8, 616, 12, 15614, 8, 1783, 5, 37, 13818, 10564, 15, 26, 3, 9, 3, 19513, 1481, 6, 18368, 186, 1328, 2605, 30, 7488, 1887, 3, 18, 8, 711, 2309, 9517, 89, 355, 5, 3966, 1954, 9233, 15, 6, 113, 293, 7, 8, 16548, 13363, 106, 14022, 84, 47, 14621, 4161, 6, 243, 255, 228, 59, 7828, 8, 1249, 18, 545, 11298, 1773, 728, 8, 8347, 1560, 5, 611, 6, 255, 243, 72, 1709, 1528, 161, 228, 43, 118, 4006, 91, 12, 766, 8, 3, 19513, 1481, 410, 59, 5124, 5, 96, 196, 17, 19, 1256, 68, 27, 103, 317, 132
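Notice that the token lists returned above have different lengths, because padding is deferred to the data collator. The sketch below illustrates what padding to the longest example in a batch looks like, in plain Python. It is not the `DataCollatorForSeq2Seq` implementation: the function `pad_batch` is hypothetical, and the pad values (`0` for inputs, `-100` for labels so that padded positions are ignored by the loss) are assumptions stated here for the illustration.

```python
def pad_batch(features, pad_token_id=0, label_pad_token_id=-100):
    """Pad every example to the longest input and longest label in this batch."""
    max_input = max(len(f["input_ids"]) for f in features)
    max_label = max(len(f["labels"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        input_pad = max_input - len(f["input_ids"])
        label_pad = max_label - len(f["labels"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * input_pad)
        # 1 marks real tokens, 0 marks padding the model should not attend to.
        batch["attention_mask"].append([1] * len(f["input_ids"]) + [0] * input_pad)
        batch["labels"].append(f["labels"] + [label_pad_token_id] * label_pad)
    return batch


# Toy features reusing the token IDs from the tokenizer examples above.
features = [
    {"input_ids": [8774, 6, 48, 80, 7142, 55, 1], "labels": [8774, 1]},
    {"input_ids": [100, 19, 430, 7142, 5, 1], "labels": [100, 19, 1]},
]
batch = pad_batch(features)
print(batch["input_ids"])
print(batch["labels"])
```

Padding per batch rather than to a global maximum keeps short batches short, which is the reason the padding is not done during preprocessing.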

To apply `preprocess_function` to all the document/summary pairs in our dataset, we just use the `map` method of the `raw_datasets` object we created earlier. This will apply the function to all the elements of all the splits in `raw_datasets`, so our training, validation and testing data will be preprocessed in one single command.

In [27]:
tokenized_datasets = raw_datasets.map(preprocess_function, batched=True)



Even better, the results are automatically cached by the 🤗 Datasets library to avoid spending time on this step the next time you run your notebook. The 🤗 Datasets library is normally smart enough to detect when the function you pass to `map` has changed (and the cached data should therefore not be reused). For instance, it will properly detect if you change the model checkpoint at the top of the notebook and rerun it. 🤗 Datasets warns you when it uses cached files; you can pass `load_from_cache_file=False` in the call to `map` to ignore the cached files and force the preprocessing to be applied again.

Note that we passed `batched=True` to encode the texts in batches. This leverages the full benefit of the fast tokenizer we loaded earlier, which uses multi-threading to process the texts in a batch concurrently.
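To make the `batched=True` calling convention concrete, here is a minimal pure-Python sketch. It is not how 🤗 Datasets implements `map`: `toy_map` and `toy_preprocess` are hypothetical names, and whitespace splitting stands in for the tokenizer. The only point is that, in batched mode, the function receives a dict of lists (one list per column) and returns a dict of lists.

```python
def toy_preprocess(batch):
    # In batched mode every column arrives as a list of values.
    return {"n_words": [len(text.split()) for text in batch["document"]]}


def toy_map(columns, function, batch_size=2):
    """Hypothetical stand-in for a batched map over a dict of columns."""
    n_rows = len(next(iter(columns.values())))
    new_columns = {}
    for start in range(0, n_rows, batch_size):
        # Slice every column to build one batch (a dict of lists).
        batch = {name: values[start:start + batch_size] for name, values in columns.items()}
        for name, values in function(batch).items():
            new_columns.setdefault(name, []).extend(values)
    # Original columns are kept and the new ones are added alongside them.
    return {**columns, **new_columns}


toy_data = {"document": ["a b c", "d e", "f"], "summary": ["x", "y", "z"]}
result = toy_map(toy_data, toy_preprocess)
print(result["n_words"])
```

Because the function sees several examples at once, a tokenizer called inside it can process the whole batch in one call instead of one text at a time.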

## Fine-tuning the model

Now that our data is ready, we can download the pretrained model and fine-tune it. Since our task is of the sequence-to-sequence kind, we use the `AutoModelForSeq2SeqLM` class. As with the tokenizer, the `from_pretrained` method will download and cache the model for us.

In [28]:
from transformers import AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq, Seq2SeqTrainingArguments, Seq2SeqTrainer

model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)

Note that we don't get a warning like in our classification example. This means we used all the weights of the pretrained model and there is no randomly initialized head in this case.

To instantiate a `Seq2SeqTrainer`, we will need to define three more things. The most important is the [`Seq2SeqTrainingArguments`](https://huggingface.co/transformers/main_classes/trainer.html#transformers.Seq2SeqTrainingArguments), which is a class that contains all the attributes to customize the training. It requires one folder name, which will be used to save the checkpoints of the model, and all other arguments are optional:

In [30]:
batch_size = 16
model_name = model_checkpoint.split("/")[-1]
args = Seq2SeqTrainingArguments(
    f"{model_name}-finetuned-xsum",
    evaluation_strategy="epoch",  # named `eval_strategy` in recent Transformers releases
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=1,
    predict_with_generate=True,
    fp16=True,
    push_to_hub=True,
)

Here we set evaluation to run at the end of each epoch, tweak the learning rate, use the `batch_size` defined at the top of the cell and customize the weight decay. Since the `Seq2SeqTrainer` saves the model regularly and our dataset is quite large, we tell it to keep at most three checkpoints. Lastly, we use the `predict_with_generate` option (to properly generate summaries) and activate mixed precision training (to go a bit faster).

The last argument sets everything up so we can push the model to the [Hub](https://huggingface.co/models) regularly during training. Remove it if you didn't follow the installation steps at the top of the notebook. If you want to save your model locally under a name that is different from the name of the repository it will be pushed to, or if you want to push your model under an organization rather than your own namespace, use the `hub_model_id` argument to set the repo name (it needs to be the full name, including your namespace: for instance `"sgugger/t5-finetuned-xsum"` or `"huggingface/t5-finetuned-xsum"`).
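As a small illustration of how the local folder name and the full repository name relate, the sketch below builds both strings. The checkpoint name `"t5-small"` and the namespace `"my-username"` are hypothetical placeholders, not values taken from this notebook.

```python
model_checkpoint = "t5-small"   # hypothetical checkpoint name
namespace = "my-username"       # hypothetical user or organization

# Keep only the last path component, as the training-arguments cell does.
model_name = model_checkpoint.split("/")[-1]
output_dir = f"{model_name}-finetuned-xsum"

# The full repository name must include the namespace.
hub_model_id = f"{namespace}/{output_dir}"
print(hub_model_id)
```

The resulting string is what you would pass as `hub_model_id` to `Seq2SeqTrainingArguments`, while `output_dir` stays the local folder name.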

Then, we need a special kind of data collator, which will not only pad the inputs to the maximum length in the batch, but also the labels:

In [31]:
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
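To illustrate what padding the labels means, the sketch below pads a toy batch by hand. It is not the collator's actual code: `pad_batch` is a hypothetical helper, and it assumes right padding with a pad token id of 0 for the inputs and -100 for the labels (the value that `compute_metrics` later has to replace before decoding).

```python
def pad_batch(features, pad_token_id=0, label_pad_token_id=-100):
    """Hypothetical helper: right-pad inputs and labels to the batch maximum."""
    max_input = max(len(f["input_ids"]) for f in features)
    max_label = max(len(f["labels"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        n_pad = max_input - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * n_pad)
        # 1 marks real tokens, 0 marks padding.
        batch["attention_mask"].append([1] * len(f["input_ids"]) + [0] * n_pad)
        # Labels are padded with -100 so the loss can ignore those positions.
        n_label_pad = max_label - len(f["labels"])
        batch["labels"].append(f["labels"] + [label_pad_token_id] * n_label_pad)
    return batch


toy_features = [
    {"input_ids": [21603, 10, 37, 1], "labels": [71, 1]},
    {"input_ids": [21603, 10, 1], "labels": [71, 32, 5, 1]},
]
padded = pad_batch(toy_features)
print(padded["labels"])
```

Padding only to the longest example in each batch, rather than to a fixed global length, keeps batches of short articles small.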

The last thing to define for our `Seq2SeqTrainer` is how to compute the metrics from the predictions. We need to define a function for this, which will just use the `metric` we loaded earlier, and we have to do a bit of pre-processing to decode the predictions into texts:

In [33]:
import nltk
import numpy as np

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace -100 in the labels as we can't decode them.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    
    # Rouge expects a newline after each sentence
    decoded_preds = ["\n".join(nltk.sent_tokenize(pred.strip())) for pred in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(label.strip())) for label in decoded_labels]
    
    # Note that other metrics may not have a `use_aggregator` parameter
    # and thus will return a list, computing a metric for each sentence.
    result = metric.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True, use_aggregator=True)
    # Convert the scores to percentages
    result = {key: value * 100 for key, value in result.items()}
    
    # Add mean generated length
    prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    result["gen_len"] = np.mean(prediction_lens)
    
    return {k: round(v, 4) for k, v in result.items()}
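The two array manipulations in `compute_metrics` are easier to follow on a toy example. The sketch below uses made-up token ids and assumes a pad token id of 0; it only reproduces the `-100` replacement and the generated-length computation, not the decoding or the ROUGE scoring.

```python
import numpy as np

pad_token_id = 0  # assumed pad id for this toy example

# Made-up label and prediction ids for two examples.
labels = np.array([[71, 32, 1, -100], [71, 1, -100, -100]])
predictions = np.array([[0, 71, 32, 1], [0, 71, 1, 0]])

# Positions equal to -100 are swapped for the pad id so they can be decoded.
decodable_labels = np.where(labels != -100, labels, pad_token_id)

# Generated length = number of non-pad tokens in each prediction.
prediction_lens = [int(np.count_nonzero(pred != pad_token_id)) for pred in predictions]
gen_len = float(np.mean(prediction_lens))

print(decodable_labels.tolist())
print(prediction_lens, gen_len)
```

Since pad tokens are excluded from the count, `gen_len` reflects how many tokens the model actually generated rather than the padded width of the prediction array.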

Then we just need to pass all of this along with our datasets to the `Seq2SeqTrainer`:

In [None]:
trainer = Seq2SeqTrainer(
    model,
    args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)

We can now fine-tune our model by just calling the `train` method:

In [None]:
trainer.train()

Epoch,Training Loss,Validation Loss,Rouge1,Rouge2,Rougel,Rougelsum,Gen Len,Runtime,Samples Per Second
1,2.7211,2.479327,28.3009,7.7211,22.243,22.2496,18.8225,326.3338,34.725


TrainOutput(global_step=12753, training_loss=2.7692033505520146, metrics={'train_runtime': 4909.3835, 'train_samples_per_second': 2.598, 'total_flos': 7.774481450954342e+16, 'epoch': 1.0, 'init_mem_cpu_alloc_delta': 335248, 'init_mem_gpu_alloc_delta': 242026496, 'init_mem_cpu_peaked_delta': 18306, 'init_mem_gpu_peaked_delta': 0, 'train_mem_cpu_alloc_delta': 2637782, 'train_mem_gpu_alloc_delta': 728138240, 'train_mem_cpu_peaked_delta': 138226182, 'train_mem_gpu_peaked_delta': 14677017088})
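As a sanity check on these numbers, `global_step` should equal the number of batches in one epoch. The sketch below assumes the standard XSum split sizes (204,045 training and 11,332 validation examples, which are not stated in this notebook), training on a single device without gradient accumulation, and that the `Runtime` and `Samples Per Second` columns refer to the evaluation pass.

```python
import math

train_examples = 204_045   # assumed size of the XSum training split
batch_size = 16
num_train_epochs = 1

# One optimizer step per batch; the last batch may be smaller.
steps = math.ceil(train_examples / batch_size) * num_train_epochs
print(steps)

# Runtime (s) times throughput (samples/s) approximates the number
# of examples processed during evaluation.
eval_examples = round(326.3338 * 34.725)
print(eval_examples)
```

Under these assumptions the computed step count agrees with `global_step=12753`, and the evaluation product lands on the assumed validation split size.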

You can now upload the result of the training to the Hub by executing this instruction:

In [None]:
trainer.push_to_hub()

You can now share this model with all your friends, family and favorite pets: they can all load it with the identifier `"your-username/the-name-you-picked"`, for instance:

```python
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("sgugger/my-awesome-model")
```

In [2]:
import pandas as pd
from transformers import pipeline

# Load the test articles to summarize.
testing_file = '/home/ubuntu/llama/source/Describer/article_metadata.test.csv'
testing_df = pd.read_csv(testing_file)

# Summarization pipeline backed by the pretrained T5 checkpoint, on the first GPU.
summarizer = pipeline("summarization", model="google-t5/t5-base", device=0)
updated = []

for index, row in testing_df.iterrows():
    # T5 task prefixes are followed by a space.
    text = "summarize: " + row['Article Text']
    # The pipeline returns a list with one dict per input; keep only the text.
    summary = summarizer(text)[0]['summary_text']
    obj = {'title': row['Title'], 'content': text, 'summary': summary}
    print(str(index) + ':' + row['Title'])
    updated.append(obj)

# Write the titles, inputs and generated summaries to a CSV file.
updated_df = pd.DataFrame(updated)
updated_df.to_csv('/home/ubuntu/llama/source/Describer/article.updated_2023.csv', index=False)

Your max_length is set to 200, but your input_length is only 68. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=34)


0:Purdue President Mung Chiang’s end-of-year wrap-up on university successes
1:CHL program focuses on taking control of Type 2 diabetes, prediabetes; register by Jan. 10


Your max_length is set to 200, but your input_length is only 191. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=95)


2:Stay safe and injury-free this winter: Top tips for preventing falls and walking safely
3:Employees can now connect with their Purdue Retirement Program on Fidelity NetBenefits
4:Virtual HealthKick program focuses on physical activity, nutrition, more; register by Jan. 3


Your max_length is set to 200, but your input_length is only 65. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=32)


5:Purdue reputation in space brings better understanding of the stars, planets and everything in between
6:ADP W-2 location change in SuccessFactors
7:Purdue receives $25 million grant from Lilly Endowment
8:Purdue University’s top media stories of 2023
9:Racing executives cite critical workforce need during visit to Purdue University in Indianapolis


Your max_length is set to 200, but your input_length is only 197. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=98)


10:Appointments, honors and activities
11:One Hour with HR sessions continue with Communication and Coaching
12:Medical ID cards to arrive in early January for those who elected new health plans, changed coverages, etc.
13:Purdue United Way hosts victory celebration for successful 2023 campaign
14:Parking reminders for upcoming home basketball games
15:Office of Future Engineers’ Lindsay Elias receives 2023 Martin Award
16:This week’s ‘Thumbs Up’ recipients


Your max_length is set to 200, but your input_length is only 170. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=85)


17:2024 Annual Distinguished Purdue Alumni Scholars call for nominations
18:Most parking garage top levels to be closed Dec. 18-Jan. 7
19:New Multidisciplinary and Professional Studies course leverages AI, other digital tools to advance people’s writing skills
20:Ken Foster announced as the 2023 Hovde Award recipient
21:How sustainable tape can solve sticky recycling issues — new video posted to AP Newsroom
22:Momentous small steps to giant leaps: Purdue celebrates 3,500 winter 2023 graduates
23:Plant metabolism proves more complicated than previously understood
24:Campus building access and operational adjustments through Jan. 1
25:Provost event honors Purdue faculty for their many years of service
26:Winter break vehicle storage available for faculty, staff
27:Fueling advanced manufacturing growth across Indiana’s Hard-Tech Corridor: Reimagine IN-MaC through Purdue in West Lafayette and Indianapolis
28:Graduating Purdue senior channels entrepreneurial mindset to propel growth, succes

Your max_length is set to 200, but your input_length is only 128. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=64)


31:Today’s top 5 from Purdue University
32:Purdue invites high school students to virtual Women in Engineering presentation
33:Staff Excellence: Purdue University Police Department
34:Preeti Sivasankar: ‘Can You Protect Your Voice? Physiological Investigations From Rats to Humans’
35:Advanced practice registered nurses adapt to expanded telehealth use during and after pandemic


Your max_length is set to 200, but your input_length is only 139. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=69)


36:Purdue Global honors faculty, staff with Distinction Awards
37:Are electric VTOL aircraft the future of urban mobility? It all depends on the batteries
38:USDA determines Insignum AgTech corn plants can be sold and grown without restriction
39:National policy aimed at reducing U.S. greenhouse gases also would improve water quality


Your max_length is set to 200, but your input_length is only 109. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=54)


40:National Academy of Inventors names three Purdue faculty as 2023 fellows
41:Save the date for Health Equity Summit on Feb. 29


Your max_length is set to 200, but your input_length is only 147. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=73)


42:Daniels’ Purdue legacy honored at bust unveiling ceremony
43:President Chiang to hold virtual Faculty and Staff Year-End Town Hall
44:Westwood Lecture Series spring lineup released
45:Star of wonder: Dazzling new image of supernova Cassiopeia A released by First Lady Jill Biden and Purdue astronomer


Your max_length is set to 200, but your input_length is only 53. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=26)


46:This week’s ‘Thumbs Up’ recipients
47:Purdue Today announces semester break schedule
48:Employees should update address information ahead of tax form distribution
49:Purdue Global Law School launches AI course
50:European technology leader imec opens innovation hub at Purdue
51:Purdue trustees approve plans for Mitchell E. Daniels, Jr. School of Business building, endorse 13th consecutive tuition freeze, ratify appointment of next chancellor of Purdue University Northwest


Your max_length is set to 200, but your input_length is only 74. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=37)


52:University raises stipend minimum again for PhD students


Your max_length is set to 200, but your input_length is only 74. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=37)


53:Board of Trustees meeting set for Friday; special issue of Purdue Today to follow


Your max_length is set to 200, but your input_length is only 148. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=74)


54:Board of Trustees meeting set for Friday; special issue of Purdue Today to follow
55:Travel ADA accommodation process change to take effect Jan. 1
56:Purdue trustees approve Daniels School building construction
57:Purdue University names Chris Holford as next chancellor of Purdue University Northwest
58:Purdue trustees ratify faculty, staff positions; award posthumous degree; approve new degree programs, resolutions of appreciation and namings
59:Purdue trustees endorse 13th consecutive tuition freeze; approve updated housing, dining plans
60:Mercy Medical Center Cedar Rapids joins educational alliance with Purdue Global
61:Commercial air service returns to Purdue University Airport
62:Provost announces leadership appointments, updates on searches
63:Today’s top 5 from Purdue University
64:Staff Excellence: Payroll and Tax Services
65:New faculty activity reporting system, Elements, launching in January
66:Research team explores genomic options to enhance honeybee resilience
67:Reim

Your max_length is set to 200, but your input_length is only 139. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=69)


83:Purdue University Police Department earns reaccreditation for upholding best practices, professional standards
84:Nominations sought for Purdue Dreamer Award
85:Reminder: 2023 winter recess
86:In Print: ‘Sparse Graphical Modeling for High Dimensional Data’
87:University announces additional details on employee recognition plan
88:Today’s top 5 from Purdue University
89:Appointments, honors and activities
90:Follow Purdue police on X/Twitter for valuable updates, resources
91:Purdue establishes an international footprint in chip technology and workforce innovation
92:Registration open for Dec. 13 Westwood Lecture on voice disorders and voice health
93:Researchers look to the human eye to boost computer vision efficiency
94:Purdue Journal of Service-Learning and International Engagement now accepting student-authored submissions
95:Purdue University to offer Google Career Certificates for learning in-demand tech skills
96:Purdue IoT software platform uses gaming to motivate energy-eff

Your max_length is set to 200, but your input_length is only 75. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=37)


112:Women in Engineering Program to host free Introduce a Girl to Engineering Day
113:University Senate meeting set for Monday
114:Graduate students in Daniels Business School’s business analytics program learn data-driven decision-making
115:Purdue students’ international education opportunities abound abroad
116:Purdue part of new international video project exploring climate change as a health crisis
117:Arizona company acquires patented, soy-based concrete protectant developed in Indiana
118:Edward Delp: ‘Deepfakes and Other Types of Generated and Manipulated Media: It Is Real and Coming for Our Society!’
119:Today’s top 5 from Purdue University
120:Purdue reminds agricultural employers of their responsibilities when hiring youth workers
121:The mind’s eye of a neural network system
122:Fulbright Program opens doors internationally for Purdue scholars, students
123:New faculty reporting tool, Elements, implementation update
124:Manning Regional Healthcare Center joins educational a

Your max_length is set to 200, but your input_length is only 78. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=39)


126:2023 Outstanding Leadership in Globalization Award winners announced; nominations sought for 2024
127:Purdue WL community: Adverse winter weather procedures
128:’Tis the season for holiday budgeting stress, anxiety, more; resources available to help
129:AI knows the score — and it could help instrumentalists make beautiful music
130:Distinguished and Named Professorship Ceremony honors faculty, administrators
131:Purdue-led national summit issues call to action for resilient U.S. supply chains
132:Report ranks Purdue among top 10 universities for international student enrollment


Your max_length is set to 200, but your input_length is only 48. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=24)


133:Purdue Memorial Union to celebrate holiday season
134:Purdue Today schedule for Thanksgiving break
135:Purdue students win monetary prizes for innovative solutions to global problems during Moonshot Pitch Challenge
136:Purdue Global law degree allows insurance executive to take next career steps
137:Purdue pharmaceutical compound sounds the alarm on cancer cells and unleashes T cells
138:International Education Week to feature International Research Speakers Series
139:Red Arrow Flight Academy partners with Purdue Global to address projected demand for aviation professionals
140:Thumbs Up
141:Groundbreaking launches a giant leap in nursing and pharmacy education
142:Purdue helps put Indiana on the map in national security
143:New seed administrator appointed to Indiana State Chemist office
144:Inaugural Entrepreneurial Alumni Reunion at Purdue University kicks off


Your max_length is set to 200, but your input_length is only 177. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=88)


145:Head of psychological sciences named associate vice provost for Indianapolis
146:Purdue to hold events as part of International Education Week on Nov. 13-17
147:Veterans Day: Purdue Global faculty, student reflect on their military service
148:She’s on it: Josefine Eskildsen is rapidly rising up the racing ranks
149:Purdue Global unveils new simulation center and growing partnership with CHI Health
150:Chipshub: An online platform for everything semiconductors


Your max_length is set to 200, but your input_length is only 146. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=73)


151:Today’s top 5 from Purdue University


Your max_length is set to 200, but your input_length is only 133. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=66)


152:Nominations sought for Purdue University Dreamer Award
153:Semiconductor Week 2023 brings top industry officials to campus
154:Concord Law School officially renamed Purdue Global Law School
155:Judges sought for Fall Undergraduate Research Expo; presentation schedule available
156:Judges sought for Fall Undergraduate Research Expo; presentation schedule available
157:Butler Center accepting applications for new faculty mentoring program
158:Applications sought for Mortar Board’s Class of 2025
159:In Print: ‘Intermittent Convex Integration for the 3D Euler Equations’
160:Purdue United Way campaign still needs pledges to reach goal
161:Purdue sensors measure uric acid levels better than other noninvasive methods
162:Purdue Global honors 22 students with First-Generation College Student Scholarships


Your max_length is set to 200, but your input_length is only 166. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=83)


163:Deadline approaching: Nominations for Purdue’s Distinguished Research Awards
164:CILMAR accepting applications for seed grant program
165:Provost launches advisory committee; first meeting convenes today
166:Provost launches advisory committee; first meeting convenes today
167:Former World Bank president to join Purdue University and its Daniels School of Business
168:Today is the last day to enroll for 2024 benefits; deadline is 6 p.m. ET
169:Purdue marketing leaders to share insights on Purdue Global’s brand building at Ad Age’s ‘Business of Brands’ event in New York City
170:Purdue and leading companies chart a taxonomy of 6G technologies
171:Save time, stay on message: Use Purdue-branded communication resources
172:Community invited to join ‘Host-a-Boiler’ during Thanksgiving break
173:Open enrollment for 2024 benefits ends tomorrow (Nov. 7) at 6 p.m. ET
174:Thumbs Up
175:Experiment shows biological interactions of microplastics in watery environment
176:Daylight saving time en

Your max_length is set to 200, but your input_length is only 95. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=47)


188:Policy updates for November
189:Institute for Physical Artificial Intelligence town hall held
190:Indy leader Evan Hawkins named senior director for administrative operations for Purdue University in Indianapolis
191:Science enabling heat and air conditioning for long-term space habitats is almost fully available


Your max_length is set to 200, but your input_length is only 196. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=98)


192:Open enrollment for 2024 benefits ends Tuesday; enroll by 6 p.m. ET Nov. 7
193:One Hour with HR to focus on leaves, related leave processes in upcoming sessions


Your max_length is set to 200, but your input_length is only 103. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=51)


194:Mental health, diabetes can impact each other; resources available to help
195:Indy leader Evan Hawkins named senior director for administrative operations for Purdue University in Indianapolis
196:Purdue engineer works to improve formulation of RNA-based pharmaceuticals


Your max_length is set to 200, but your input_length is only 57. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=28)


197:CERIAS’ Shawn Huddy receives 2023 Community Spirit Award


Your max_length is set to 200, but your input_length is only 169. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=84)


198:Application deadline approaching: Lilly Scholars at Purdue program
199:Purdue University College of Engineering to offer guaranteed internship to students in Indianapolis
200:Purdue Global cuts ribbon on first military base extension
201:Season of Sharing accepting gifts for local children, families through Dec. 8
202:‘Purdue Pursuits’: Becoming an A.H. Ismail Center member
203:Appointments, honors and activities
204:Thumbs Up
205:An astronaut and a communication advocate return to their alma mater as Presidential Ambassadors
206:Dickey named inaugural associate dean of students at Purdue University in Indianapolis
207:Purdue University Fire Department welcomes new firefighters with swearing-in ceremony


Your max_length is set to 200, but your input_length is only 60. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=30)


208:Open enrollment for 2024 benefits underway; enroll by 6 p.m. Nov. 7


Your max_length is set to 200, but your input_length is only 134. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=67)


209:Discounted football tickets available for faculty, staff for Nov. 11 game against Minnesota
210:Concur site changes to improve screen accessibility
211:Today’s top 5 from Purdue University
212:MaPSAC’s Lovell Leadership Series to present Provost Patrick Wolfe
213:Pioneering commercial astronaut and Purdue alumna Beth Moses to join President Chiang for Purdue Presidential Lecture Series
214:Dry-surface foodborne pathogens under scrutiny at Purdue
215:InnovatED graduate research magazine seeking submissions
216:In Print: ‘An Introduction to Optimization: With Applications to Machine Learning, 5th Edition’
217:Register now for Purdue annual safety meeting and fair on Nov. 8
218:Ready, set, enroll: Open enrollment for 2024 benefits begins today
219:Gebisa Ejeta awarded National Medal of Science
220:Summit to forge a national coalition around building resiliency in U.S. manufacturing and operations
221:Newly updated edition of Purdue Forage Field Guide provides essential information on 

Your max_length is set to 200, but your input_length is only 81. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=40)


222:Purdue to offer vote-ready IDs for Mobile First students eligible to vote in general election this fall


Your max_length is set to 200, but your input_length is only 134. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=67)


223:Research faculty, staff and university leaders participate in Life and Health Sciences Summit
224:Deadline nears for Eudoxia Girard Martin Memorial Staff Recognition Award nominations
225:Purdue strengths in biotech manufacturing part of new federally designated Regional Technology and Innovation Hub won by the state of Indiana
226:Thumbs Up
227:Keynote speaker to Purdue Global grads: Keep building your dreams
228:Campus community invited to attend sustainability-related events during annual Green Week celebration
229:Institute for Physical AI to host town hall Oct. 30
230:Toastmasters at Purdue to host open house for faculty, staff, alumni
231:Additional response on Oct. 20 to inquiries about some on-campus activities this week
232:Discovery Park District at Purdue announces DUIRI projects for spring 2024; student applications now being accepted
233:Campus community encouraged to learn about earthquake preparedness today
234:Today’s top 5 from Purdue University
235:Purdue to host 

Your max_length is set to 200, but your input_length is only 85. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=42)


240:New endowed scholarship to help international students facing substantial personal risks and educational barriers
241:Purdue Day of Service scheduled for Oct. 27


Your max_length is set to 200, but your input_length is only 133. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=66)


242:Exclusive screening of documentary on Indiana’s youth mental health crisis available
243:Preschool, pre-k, and before- and after-school care available on Purdue’s West Lafayette campus
244:Inaugural Spring Family Weekend set for April 12-14, 2024
245:Purdue Silicon Summit further accelerates momentum with announcements and brings national, global semiconductor leaders to campus
246:Purdue Silicon Summit further accelerates momentum with announcements and brings national, global semiconductor leaders to campus
247:Purdue team examines bio-impact of toxic chemical cocktails in the environment
248:Purdue Global selected by Gardner Institute to join inaugural cohort of Transforming the Foundational Postsecondary Experience


Your max_length is set to 200, but your input_length is only 160. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=80)


249:Life and health sciences faculty summit to chart next major investment at Purdue
250:No-cost vaccine clinic scheduled for Oct.19
251:Strength of Purdue’s pharmacy graduate program felt in improved lives, purposed service, rewarding careers in health care
252:Purdue United Way campaign climbs toward $700,000 goal
253:Purdue, IU to collaborate on analysis of Indiana’s $500 million economic development efforts


Your max_length is set to 200, but your input_length is only 168. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=84)


254:Signatures of the Space Age: Spacecraft metals left in the wake of humanity’s path to the stars
255:Purdue names new pediatric cancer research center for the late Tyler Trent


Your max_length is set to 200, but your input_length is only 63. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=31)


256:Registration open for Oct. 19 Ambassador Distinguished Lecture at Purdue and Krach Freedom Lecture with Nathaniel C. Fick, U.S. ambassador-at-large for cyberspace and digital policy
257:University Senate meeting set for Monday
258:Emerging collegiate scholar is making her comeback in cybersecurity with Purdue Global
259:President Chiang keynotes regional luncheon to explore economic growth sectors in northwest Indiana
260:Participants of the Purdue Ukraine Scholars Initiative to be featured in panel discussion
261:Sustainability and energy topics bring together Purdue scientists and engineers from West Lafayette and Indianapolis


Your max_length is set to 200, but your input_length is only 190. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=95)


262:Purdue Conferences launches Event Design Lab, an engagement and event design center


Your max_length is set to 200, but your input_length is only 98. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=49)


263:Participants of the Purdue Ukraine Scholars Initiative to be featured in panel discussion
264:Purdue Alumni Medical Network launches
265:ZS Instruments receives $1M grant to develop advanced lithography tech for precision optical encoders
266:Today’s top 5 from Purdue University
267:2023 Special Boilermaker Award recipients announced
268:Rx Savings Solutions offers Healthy Boiler workshop on smart, simple way to save on prescriptions
269:Purdue researchers, IBM perform perturbation theory method on quantum computer
270:SupportLinc to feature self-care essentials in October; resources, tools available
271:Aerovy, an advanced air mobility software provider, completes $800,000 pre-seed funding round
272:Purdue community and No. 6 graduate program attracts aerospace engineers across continents


Your max_length is set to 200, but your input_length is only 137. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=68)


273:Purdue survey delves into brand-name food and beverage preferences of consumers
274:Purdue recognized by U.S. Department of Education for ongoing sustainability efforts
275:Virtual presentation on Medicare, Social Security, health savings accounts available
276:President Chiang continues 92-county Indiana tour with visit to northern part of state
277:Purdue launches broadband team, effort to increase high-speed internet access, adoption and use throughout Indiana
278:Purdue Applied Research Institute joins Abt Associates-led team in $49 million USAID climate initiative
279:Predicting prostate cancer recurrence 15 months faster
280:Purdue trustees approve land transfers to support student organizations, future campus development
281:Purdue trustees approve contract extension for Bobinski as vice president and director of intercollegiate athletics
282:Purdue trustees ratify faculty and staff positions, award posthumous degrees, approve department name changes and resolutions of appre

Your max_length is set to 200, but your input_length is only 131. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=65)


287:DOE funds 3D printing of wind blade tooling to make U.S. clean energy sector more competitive
288:Registration open for Purdue Global Village conference
289:Ceremonial groundbreaking marks beginning of University Hall renovation


Your max_length is set to 200, but your input_length is only 74. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=37)


290:PhD student from China takes chance on America, finds ideal Purdue Polytechnic program and community
291:Board of Trustees meeting set for Friday; special issue of Purdue Today to follow


Your max_length is set to 200, but your input_length is only 123. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=61)


292:New leadership appointments announced for Purdue University in Indianapolis; committee for tenure working group established
293:New faculty, lecturers introduced with brief profiles
294:National Depression Screening Day takes place Oct. 5 to raise awareness, promote resources
295:Today’s top 5 from Purdue University
296:Registration available for Oct. 18 Westwood Lecture on ‘A Digital Revolution in Forestry’
297:CAPS, Purdue University Online partner to offer new well-being resource for students
298:Bowman appointed interim dean of Purdue’s College of Health and Human Sciences
299:Purdue Policy Research Institute accepting Diplomacy Lab project bids in collaboration with U.S. Department of State
300:New interdisciplinary labs bolster U.S. manufacturing
301:Purdue professor foresees AI as catalyst for transformation in manufacturing and workforce
302:Purdue center addresses pressing challenge of securing semiconductor chips
303:$1.1 million grant to fund research on molecular respon

Your max_length is set to 200, but your input_length is only 136. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=68)


310:Thumbs Up
311:Nominations being accepted for Eudoxia Girard Martin Memorial Staff Recognition Award
312:Purdue Fall Undergraduate Research Expo accepting research talk and poster abstract submissions
313:Purdue Global’s new emergency management degree program prepares students to rebuild communities after disasters
314:Purdue-Ireland relationship fosters research and study abroad opportunities in pharmaceutical manufacturing
315:EPICS program to expand at Purdue University in Indianapolis, partner with more Indy organizations, businesses to demonstrate power of experiential learning
316:Pedestrian safety message
317:Purdue Global names new dean of School of Health Sciences
318:Healthy Boiler Fair takes place Oct. 4 with vendors, door prizes
319:Reminder: Campuswide PurdueALERT test scheduled for today
320:Purdue’s inaugural Lilly Scholars train to become pharmaceutical manufacturing workforce talent of the future
321:Healthy Boiler virtual workshop presented by SupportLinc to focus

Your max_length is set to 200, but your input_length is only 151. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=75)


336:Exploring AI together across Purdue
337:Domestic shipments transition to eShipGlobal starting Oct. 2


Your max_length is set to 200, but your input_length is only 198. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=99)


338:Human Resources to continue performing registry checks for student hires past Oct. 1
339:Homecoming weekend kicks off Friday
340:Purdue’s semiconductor innovation ecosystem grows with CHIPS-funded, Indiana-led semiconductor hub and with upcoming summit
341:Bringing home asteroids: Purdue scientist is among the first to examine asteroid pieces from NASA’s OSIRIS-REx mission
342:Celebrated actor, author to be featured speaker as Purdue Global celebrates Hispanic Heritage Month
343:Purdue’s online Dual Engineering + MBA and Management and Leadership master’s degree program proves to be a career booster
344:Appointments, honors and activities


Your max_length is set to 200, but your input_length is only 163. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=81)


345:SECURE 2.0 retirement changes to impact Purdue retirement programs


Your max_length is set to 200, but your input_length is only 178. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=89)


346:Purdue launches rural education center to help address teacher shortage


Your max_length is set to 200, but your input_length is only 150. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=75)


347:$3M grant renews funding for Purdue program expanding access to the veterinary profession
348:Office of Engagement seeking Jefferson Awards nominations


Your max_length is set to 200, but your input_length is only 158. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=79)


349:Purdue United Way campaign kicks off
350:Applications sought for vice provost for enrollment management
351:Deborah Knapp: ‘A New Era in Comparative Medical Research and Opportunities to Position Purdue as a World Leader’
352:Purdue University to launch Trimble Technology Lab with focus on construction management technology
353:Purdue center addresses pressing challenge of securing semiconductor chips
354:Verbal de-escalation training available in October
355:Purdue Global grad: ‘Dreams really do come true. This was my chance to make it happen’
356:Treatment for mental health conditions, substance use disorders covered by health insurance policies same as physical illness
357:Mars region offers NASA rover environment to search for evidence of ancient microbial life


Your max_length is set to 200, but your input_length is only 161. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=80)


358:Purdue developing field test to detect SARS-CoV-2 virus in dozens of host species
359:Inkjet-printed tumors: Custom cancer drug testbeds in less than a day
360:Planning a campus event? An emergency preparedness checklist is available for event planners
361:Purdue Global to host FEMA official for ‘Equity in Emergency Management’ webinar


Your max_length is set to 200, but your input_length is only 139. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=69)


362:Purdue University named No. 6 Best Value among public universities in the U.S., No. 15 overall by SmartAsset
363:Summer 2023 personnel activity reports open in SEEMLESS for faculty certification


Your max_length is set to 200, but your input_length is only 106. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=53)


364:Accreditation organization to visit Speech, Language, and Hearing Sciences; public input sought
365:Purdue InterCultural Learning Community of Practice schedule of events for fall 2023
366:Nominations sought for Purdue’s top research awards
367:Purdue United Way campaign to kick off Wednesday


Your max_length is set to 200, but your input_length is only 179. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=89)


368:Purdue University undergraduate national ranking jumps 8 spots, with 13 undergraduate programs in top 10 in the US
369:Policy updates for September
370:Trask Innovation Fund to award researchers up to $50,000 to enhance Purdue intellectual property
371:NSF awards $2M to Purdue’s College of Education, Downtown Boxing Gym for STEM-based research
372:Purdue researcher awarded $1.3 million for malaria drug trials in Southeast Asia and Africa


Your max_length is set to 200, but your input_length is only 153. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=76)


373:Purdue welcomes its most selective incoming class
374:No-cost vaccine clinic scheduled for Sept. 21
375:New staff, outreach efforts expand accessibility of CAPS services
376:Purdue president partners with students to create official university ice cream
377:Allow extra processing time for driver authorization requests
378:Center for Healthy Living to offer drive-thru, walk-in, on-campus flu shots this flu season
379:2023 System-Wide Virtual Forum to focus on artificial intelligence in higher education
380:Upcoming Boilermaker Half-marathon & 5K supports behavioral health; discounted fees available
381:Cracking the science of collagen in bones
382:Solving stickiness sustainably
383:Purdue panels to address US semiconductor needs, ‘Next Big Things in Tech’ at Fast Company Innovation Festival
384:Purdue Global to showcase ‘This Is My Comeback’ campaign during Adweek’s Brandweek in Miami
385:Study improves accuracy of planted forest locations in East Asia
386:Purdue Entomology to host 

Your max_length is set to 200, but your input_length is only 61. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=30)


394:The world’s smallest drum
395:University Senate meeting set for Monday


Your max_length is set to 200, but your input_length is only 155. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=77)


396:Purdue Police, Fire departments to offer ‘Prepared at Purdue’ training
397:Equal Access and Equal Opportunity Policies
398:Important building block to brand success includes workforce buy-in
399:Women in Engineering Program invites high school juniors and seniors to campus
400:Proposals being accepted for spring 2024 Discovery Undergraduate Interdisciplinary Research Internship program
401:Registration available for Westwood Lecture on Sept. 21
402:New bio-based glues from Purdue form adhesive bonds that grow stronger in water
403:Purdue efforts drive future workforce development for semiconductor industry
404:Registration open for Worldview Workshops offered by CILMAR
405:Campuswide PurdueALERT test scheduled for Sept. 28
406:Purdue delegation embarks on USS Nimitz, gains insights into naval operations
407:Purdue establishes permanent presence next to NSWC Crane for future of national defense and semiconductors
408:Life and health sciences faculty summit to chart next major invest

Your max_length is set to 200, but your input_length is only 180. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=90)


417:Purdue Global’s ‘This Is My Comeback’ campaign to make splash at Adweek’s Brandweek in Miami
418:New faculty members welcomed, introduced with brief profiles


Your max_length is set to 200, but your input_length is only 154. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=77)


419:SuccessFactors mobile app, Talent Profile now available
420:Submit Xerox print services orders by Sept. 15; new vendor available now
421:September recognized as National Suicide Prevention Month; resources available to help
422:IHT Group to manufacture, sell hog-cooling technology developed at Purdue
423:6-week wellness program focuses on heart knowledge, hypertension; register by Sept. 5
424:University wide Study Abroad Fair set for Thursday
425:Second funding round delivers $19 million to Purdue-led microelectronics workforce development program
426:Purdue streak camera innovation could capture actions that last femtoseconds or less
427:Purdue Digital Forestry team advances to $10 million XPRIZE Rainforest competition finals
428:A ‘mini-brain’ traces the link between concussion and Alzheimer’s disease
429:Purdue police provide travel safety tips as new academic year gets underway on West Lafayette campus
430:Purdue Global: Don’t fear generative AI tools in the classroom
431:Work 

Your max_length is set to 200, but your input_length is only 165. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=82)


437:Purdue-Colombia undergrad research program expands; similar student engagement efforts planned for Mexico, Brazil, Ecuador, Peru
438:Human Resources to offer new monthly training options
439:Civil Engineering’s Crawford receives esteemed IEEE Mildred Dresselhaus Medal
440:Research mentors, programs can recruit students in virtual Undergraduate Research Roundtable
441:PhD student’s materials engineering research offers glimpse into challenges of electronic device miniaturization
442:Nominations sought for Leadership in Action Award
443:‘Monkey King: Journey to the West’ to be featured Big Read book
444:Find out what’s happening on campus, in the community with the Purdue University Events Calendar
445:Wellness Council of Indiana renews Purdue’s AchieveWELL 5 Star designation
446:Purdue’s microwave technology could lead to more stable vaccine supply chain
447:Nominations sought for Violet Haas Award
448:Available space podcast: A look into NASA’s James Webb Space Telescope with astro

Your max_length is set to 200, but your input_length is only 177. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=88)


453:Nominations sought for research project grants for assistant and associate professors
454:Purdue Food Co. offers Retail Dining Memberships to campus community
455:Purdue University, U.S. Naval Test Pilot School to partner on joint graduate degree program
456:Prestigious civil engineering programs continue global mission in building, advancing legacy of excellence, storied tradition
457:Quantum Research Sciences selected as finalist for Rally IN-Prize pitch competition
458:Healthy Boiler workshop focuses on tips, exercises to target major muscle groups
459:Orientation programs welcome more than 8,000 students to campus
460:Student loan repayments to restart soon
461:MaPSAC accepting applications for professional development grants
462:Next session of Healthy Boiler lifestyle program One to One Fit begins Sept. 6


Your max_length is set to 200, but your input_length is only 197. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=98)


463:Purdue announces new Dean of the College of Science and other leadership updates


Your max_length is set to 200, but your input_length is only 168. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=84)


464:Purdue Farmers Market continues through October
465:Boiler Gold Rush kicks off with move-in, opening ceremonies


Your max_length is set to 200, but your input_length is only 149. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=74)


466:New effective date announced for student background check process change
467:Free hearing screenings available through M.D. Steer Audiology Clinic
468:Westwood Lecture Series fall lineup announced
469:CBF Forensics launches VR crime scene training programs and THC quantification system
470:Purdue Global to offer new Military Physician Assistant Preparation concentration


Your max_length is set to 200, but your input_length is only 66. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=33)


471:Q&A with Internet founders Vint Cerf and Robert Kahn ahead of upcoming Presidential Lecture Series
472:Memo: Religious, ethnic and civic observances
473:Staff & Student Bowling League to organize today
474:Purdue all-hazards outdoor sirens test and evacuation drill scheduled for incoming students during Boiler Gold Rush


Your max_length is set to 200, but your input_length is only 159. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=79)


475:Exposing the hidden genetic diversity of an ecologically harmful microbe
476:Nominations sought for Community Spirit Award
477:PARI’s Global Development and Innovation begins $1.6M engineers program in Kenya
478:Go ‘Beyond the Surface’ to build resiliency, self-care to benefit behavioral health
479:Several Purdue firefighters promoted during advancement ceremony


Your max_length is set to 200, but your input_length is only 137. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=68)


480:Doctoral student plays key role in new genetic testing for helping dog breeders eliminate specific diseases


Your max_length is set to 200, but your input_length is only 190. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=95)


481:Greg Pence joins Purdue leadership, faculty to discuss innovation in energy research
482:Departments must conduct sex, violent offender checks for student hires
483:Purdue Mobile ID, Mobile First effort kicks into high gear this week with Boiler Gold Rush, Boiler Gold Rush International orientation
484:Purdue University in Indianapolis joins Stewart-Haas Racing for Verizon 200 at the Brickyard NASCAR Cup Series race
485:Indiana farmland prices continue to rise in 2023
486:Arequipa Nexus Institute in Peru receives $4 million for phase 3 funding


Your max_length is set to 200, but your input_length is only 92. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=46)


487:Reminder: Reportable Outside Activities disclosure required for 2023-24


Your max_length is set to 200, but your input_length is only 181. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=90)


488:Amazon at the Purdue Memorial Union now closed; other campus location remains available
489:Academic integrity to be focus of presentations for faculty, staff


Your max_length is set to 200, but your input_length is only 128. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=64)


490:Experiments identify important new role of chemical compounds in plant development


Your max_length is set to 200, but your input_length is only 104. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=52)


491:Atlas Family Marketplace to be dining host for BGR students, limited options available
492:Materials Management and Distribution advises resuming normal shipping and receiving operations
493:Purdue Global trustees approve renaming law school
494:Purdue trustees ratify faculty and staff positions
495:New university residence to the south of Hillenbrand Hall to further increase availability of on-campus housing
496:Purdue trustees approve new residence hall, 2024 health plans among actions
497:Purdue trustees approve new Nursing and Pharmacy Education Building, series of repair and rehabilitation projects
498:No employee premium increase for 5th straight year as Purdue trustees approve 2024 health plans


Your max_length is set to 200, but your input_length is only 110. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=55)


499:Trustees approve new Purdue University Airport terminal as exploration of commercial air service continues


Your max_length is set to 200, but your input_length is only 143. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=71)


500:Reminder for instructors with sponsored class projects or capstone projects
501:Ismail Center free trial period extended through Aug. 31
502:Office of Engagement earns national top-10 recognition at Social Innovation Summit


Your max_length is set to 200, but your input_length is only 74. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=37)


503:Connection between cyberbullying, back-to-school important to recognize
504:Board of Trustees meeting set for Friday; special issue of Purdue Today to follow
505:Purdue animal sciences faculty members receive USDA grants for animal welfare research
506:LIFT Academy, Purdue Global partner to increase education access for next generation of pilots
507:Gourmet or imitation? New technique ferrets out food fraud
508:Anu awarded $200,000 grant to mass manufacture its aeroponic seed pods that grow produce in controlled environments
509:Decades of research have left knowledge gaps about cells that regulate the immune system: Purdue and NIH
510:Purdue Global to offer educational opportunities with tuition incentives to Iowa’s Montgomery County Memorial Hospital + Clinics
511:Purdue thermal imaging innovation allows AI to see through pitch darkness like broad daylight
512:Southwest Airlines offers opportunity for Purdue employees to earn A-List status


Your max_length is set to 200, but your input_length is only 161. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=80)


513:Registration open for upcoming Conference for Assistant Professors
514:Update to policy on docking pay
515:Purdue ties its Fulbright Scholar Awards record, eight faculty honored
516:Purdue research awards and philanthropic fundraising both new records in fiscal year 2023
517:Purdue Global School of Nursing honors four students with DAISY Awards
518:IUPUI faculty transition fact sheet
519:Upcoming food demo to focus on fiber-forward food; register by Aug. 7
520:Purdue University Police Department seeks public feedback during reaccreditation process
521:Billions hear of Purdue’s record for the world’s whitest paint
522:Purdue Extension State Fair exhibits to showcase mental health awareness, veterinary care and becoming a Master Gardener
523:Pro bono work helps students gain experience, keep faculty on top of legal changes
524:Purdue response to Inside Higher Ed regarding Indianapolis faculty
525:SupportLinc offers virtual Healthy Boiler ‘Navigating Back-to-School’ workshop
526:India

Your max_length is set to 200, but your input_length is only 122. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=61)


535:4 Purdue police officers promoted at pinning ceremony, Chief’s Awards given
536:Purdue for Life Foundation thanks campus instructors for participation in sold-out Grandparents University


Your max_length is set to 200, but your input_length is only 165. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=82)


537:CHL program to focus on taking control of type 2 diabetes, prediabetes
538:Materials Management and Distribution provides shipping guidance ahead of potential UPS service disruptions
539:Summer Undergraduate Research Symposium starts July 27
540:Reducing speed limits, increasing safety on Purdue’s West Lafayette campus
541:Purdue Global expands opportunities through Community College of the Air Force program
542:Roth in-plan conversion option now available within voluntary retirement plan
543:Roth in-plan conversion option now available within voluntary retirement plan
544:International Self-Care Day takes place July 24; self-care resources available to help
545:Purdue Innovates Startup Foundry awards $200,000 in equity investment to Aerovy Mobility and Uniform Sierra Aerospace
546:Reducing speed limits, increasing safety on Purdue’s West Lafayette campus
547:Diesel engine research leading to better efficiency, emissions standards on the roads
548:High-rise structure efficiency, co

Your max_length is set to 200, but your input_length is only 159. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=79)


551:Express Air Coach no longer campus vendor; reservations to be transferred
552:Employee participation requested in brand survey
553:Digging deeper into the Grounds department’s planting process
554:Purdue welcomes fifth cohort of Mandela Washington Fellows to campus for six-week leadership institute
555:Purdue Animal Sciences selects Hattie Duncan as livestock judging coordinator
556:Soybean industry to benefit from growing demand of cell-cultured meat
557:Pilotsmith partners with Purdue Global to address projected demand for aviation professionals over next 20 years
558:Emotional Freedom Technique: Research supports benefits of tapping for mental health
559:Jim Bullard, president of St. Louis Federal Reserve Bank, appointed inaugural dean at Purdue’s Daniels School of Business
560:‘Asthma Care for Adults’ lifestyle program begins Aug. 15; register by July 26
561:Purdue researchers receive $118,000 to develop freeze-drying, meat validation and thermal imaging innovations
562:Amplifi

Your max_length is set to 200, but your input_length is only 53. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=26)


564:‘This is a comeback that’s really mine’
565:Employee compensation statements available in SuccessFactors
566:Reminder: New salary tier for medical benefits in effect; no change on 2023 benefit premiums for current employees
567:HSA Bank investment update: TD Ameritrade investors will move to Charles Schwab platform over Labor Day weekend


Your max_length is set to 200, but your input_length is only 94. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=47)


568:Purdue Global Village virtual conference accepting proposals
569:Campus events professionals invited to join Purdue University Special Events Council
570:Purdue University ascends to the top 10 of the Global University Visibility rankings
571:Purdue Police, Fire departments to offer ‘Prepared at Purdue’ training
572:Purdue researchers fabricate sensors with potential health-monitoring applications onto ready-made wearables
573:Purdue Convocations announces its 121st season


Your max_length is set to 200, but your input_length is only 144. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=72)


574:‘Baseball was my dream, and Purdue Global moved me forward’
575:Indiana State Medical Association, Purdue announce landmark effort to address social drivers of health
576:$1.3B investments in Purdue University facilities for students and faculty
577:Healthy Boiler workshop focuses on getting family involved in the kitchen
578:‘The Attitude of Gratitude’ is SupportLinc’s featured topic for July
579:$1.3B investments in Purdue University facilities for students and faculty
580:All new summer, fall 2023 undergraduate Purdue students to be issued mobile digital ID cards for greater campus convenience
581:Purdue Global faculty earn national credential in teaching excellence
582:Biotechnology offers holistic approach to restoration of at-risk forest tree species
583:Purdue Global faculty earn national credential in teaching excellence
584:Kayli Peterson is driven to succeed
585:Fact sheet: Student debt at Purdue University
586:Sociogenomics: The intricate science of how genetics influenc

Your max_length is set to 200, but your input_length is only 154. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=77)


593:Expert: How wildfires contaminate drinking water
594:Topp named to lead institute for advanced pharmaceutical manufacturing at Purdue
595:Purdue-launched solid rocket motor-maker Adranos flies off with Anduril


Your max_length is set to 200, but your input_length is only 177. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=88)


596:Purdue and SEMI convene semiconductor partnership meeting in Washington, D.C., with top Indian government officials and industry leaders
597:Introducing high school students to careers in aviation and space
598:Purdue Global and Northern Light Inland Hospital celebrate opening of state-of-the-art simulation center
599:Purdue signs tech-focused MOUs with Taiwan universities; Krach Institute for Tech Diplomacy hosts Taiwan delegation and Ambassador Bi-khim Hsiao
600:Purdue University, High Alpha partner to house programs in downtown Indianapolis
601:Purdue, TSMC extend partnership on semiconductor research and workforce development
602:It’s never too late to come back. ‘Purdue Global was built for working adults like us.’
603:Are cars lasting longer than they used to? – new video uploaded to AP Video Hub
604:Purdue Global’s organizational management program equips employees for new leadership, managerial roles
605:Governor appoints new student trustee for Purdue; three trustees reapp

Your max_length is set to 200, but your input_length is only 165. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=82)


645:Purdue signs landmark U.S.-Japan agreement in semiconductors at G7 summit
646:Annual free summer concert series returning to Purdue Memorial Union June 2
647:Practicing what it teaches, environmentally friendly Purdue earns sustainability honor from U.S. Department of Education
648:New Purdue Global doctoral program to expand access for next generation of leaders
649:From micro to macro: Cooling data centers from the inside out
650:Approaching artificial intelligence: How Purdue is leading the research and advancement of AI technologies
651:Purdue Ventures invests in wearable communication chip company Ixana
652:Purdue University Fire Department recognizes local business for supporting annual Shop with a Firefighter event for more than a decade
653:Purdue President Chiang to grads: Let Boilermakers lead in ‘sharpening the ability to doubt, debate and dissent’ in world of AI
654:Purdue mourns the death of alum and trustee William ‘Bill’ Oesterle after 5-year battle with ALS
655:Purd

Your max_length is set to 200, but your input_length is only 191. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=95)


724:Purdue offering new online Hypersonics Graduate Certificate
725:Purdue sophomore named Frederick Douglass Fellow
726:New Webb telescope image reveals wonders, beauty, secrets of star structure and building blocks of life
727:13 Purdue Researchers Earn NSF Early Career Recognition
728:Former U.S. ambassador to Ukraine to speak in ethics series


Your max_length is set to 200, but your input_length is only 190. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=95)


729:Purdue University and Duke Energy to release interim report of nuclear power feasibility study
730:Stories spark the imagination of Purdue’s first Beinecke Scholar
731:Air Force initiative at Purdue progresses as it enters second year
732:Caterpillar reaffirms recent $1 million commitment to Purdue by establishing office at Convergence Center
733:‘This Is My Comeback’: Purdue Global launches new brand platform and marketing campaign


Your max_length is set to 200, but your input_length is only 134. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=67)


734:Purdue Extension needs your help growing food for science


Your max_length is set to 200, but your input_length is only 189. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=94)


735:Appointments, honors and activities
736:Purdue economic education program to celebrate entrepreneurial efforts of Indiana students
737:Today’s top 5 from Purdue University
738:GeniPhys secures $6 million in Series A funding
739:NutraMaize receives $650,000 USDA grant to scale research on orange corn that improves poultry health and egg yolk pigmentation
740:Collaborative Research With Purdue Polytechnic High Schools: Documenting Student Impact
741:$500,000 grant targets lack of air-quality data in swine production
742:Purdue launches oneAPI Center of Excellence to advance AI and HPC teaching in the US
743:Sheep checkoff calls for nominations, approves projects
744:Today’s top 5 from Purdue University
745:Purdue receives $20 million commitment from alum Sassola for new pharmacy leadership academy
746:Connected vehicles the latest tool to give engineers real-time insight into highway traffic congestion issues
747:Colorful sidewalks move tourists, Purdue HTM researchers discover
748:B

Your max_length is set to 200, but your input_length is only 170. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=85)


757:Purdue’s ‘world’s whitest paint’ wins 2023 SXSW Innovation Award
758:Purdue Road School expected to draw more than 3,000 participants
759:Purdue, Varcity partner for new alumni-focused residential development in Discovery Park District
760:‘Talking’ concrete could help prevent traffic jams and cut carbon emissions
761:PurdueALERT test, campuswide tornado drill on Tuesday (March 14)
762:Purdue engineers create safer solid-state lithium-ion batteries from new composite materials
763:Today’s top 5 from Purdue University
764:Americans planning frugal uses for their 2023 tax refunds
765:Purdue Women’s Conference 2023 to feature more than 20 empowering speakers
766:Fighter pilot Heather Penney on Purdue journey and 9/11 mission
767:Purdue engineer, IU cardiologist collaborate to offer innovative tool and fresh hope for babies with heart defects
768:Purdue Global’s Dooley selected to ACE board of directors
769:Gardeners asked to be vigilant this spring for invasive jumping worms
770:Wheel

Your max_length is set to 200, but your input_length is only 192. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=96)


786:Ukraine Scholars Initiative expands and receives Heritage Group gift
787:U.S. cybersecurity leaders to examine lessons learned and set strategies for future risk
788:Purdue police investigating reports of fraud related to purchasing basketball tickets
789:Purdue Global School of Nursing to support Story County Medical Center’s participation in DAISY Award program
790:Purdue Global partnering with Northern Light Inland Hospital on innovative learning model featuring state-of-the-art simulation center
791:Dust explosion incidents increased last year, no fatalities
792:Purdue Global announces Anaheim commencement weekend schedule — Feb. 24-25


Your max_length is set to 200, but your input_length is only 197. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=98)


793:Today’s top 5 from Purdue University
794:Hands-on ‘guitar lab’ one of Purdue’s most popular courses
795:Purdue Ventures invests $250,000 in assistive educational technology company
796:Purdue-connected digital health startup wins phase 1 of NIH competition for maternal health
797:Purdue University receives $21 million art donation of Degas sculptures from engineering alum Avrum Gray
798:Purdue Global, UMGC Team with GetSet Learning to Boost Student Persistence and Retention
799:You’ve got to have heart: Computer scientist works to help AI comprehend human emotions
800:White Family Foundation commits $50 million to new Daniels School of Business at Purdue University
801:$10 million USDA grant to fuel economic resilience and sustainability in Eastern U.S. forests
802:Purdue secures university-issued/subsidized devices through TikTok removal


Your max_length is set to 200, but your input_length is only 179. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=89)


803:Appointments, honors and activities
804:Leading researcher on health and social inequities to speak at John Martinson Honors College
805:Indiana AgrAbility farmer to speak at 2023 AgrAbility National Training Workshop
806:Today’s top 5 from Purdue University
807:Purdue Global Chancellor Dooley to appear on ‘The Balancing Act’
808:Health Care Navigation Program now available
809:Food survey queries consumers about New Year’s resolutions, risk tolerance
810:New success coaches to help Purdue Global students with life challenges
811:Speaking stones: Analyzing Antarctica’s rocks to explore Earth’s past and possible futures.
812:Arvind Raman selected as the next dean of Purdue’s College of Engineering, the largest top-ranked program in the nation
813:Purdue Engineering to play key role in two new SRC JUMP 2.0 centers
814:Abbott appointed interim Purdue associate dean and Extension director
815:Commitment to service prompts restaurant exec to choose Purdue Global
816:Purdue trustees rati

Your max_length is set to 200, but your input_length is only 188. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=94)


817:Purdue trustees endorse 12th consecutive tuition freeze
818:Purdue trustees discuss student housing, identify ways to increase capacity


Your max_length is set to 200, but your input_length is only 133. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=66)


819:Purdue trustees approve plans for state-of-the-art Mitchell E. Daniels, Jr. School of Business, endorse 12th consecutive tuition freeze
820:Trustees approve winter recess for 2023
821:Purdue’s next big move: The Mitchell E. Daniels, Jr. School of Business
822:Chancellor Dooley updates trustees on Purdue Global Moves
823:Purdue researchers receive over $143,000 to strengthen marketplace interest in their IP
824:Today’s top 5 from Purdue University
825:Purdue Ag-Celerator fund invests $100,000 in pathogen detection company
826:Purdue a factor as Lafayette No. 1 nationally in latest WSJ/Realtor.com housing index of most affordable markets for fourth quarter
827:The moon is too hot and too cold; now it could be just right for humans, thanks to newly available science
828:Digital revolution inspires new research direction in ecosystem structural diversity
829:Purdue’s popular aseptic processing and packaging workshop now offered online


Your max_length is set to 200, but your input_length is only 150. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=75)


830:Purdue faculty chosen as fellows of the American Association for the Advancement of Sciences
831:Purdue opens Black History Month by honoring Parker sisters
832:Anu, previously gropod, awarded nearly $1 million competitive grant from the National Science Foundation
833:Forage enthusiasts to gather for annual meeting and seminar
834:VR avatars rival online interactions in creating closeness
835:Today’s top 5 from Purdue University
836:Resume normal campus parking; regular enforcement begins at 7 a.m. Friday
837:Researchers tailor thickness of conducting nitrides and oxides to enhance their photonic applications
838:Registration opens for the Ag Women Engage Conference
839:Purdue launches new AI-based global forest mapping project
840:Study: How to apply lessons from Colorado’s costliest wildfire to drinking water systems
841:Transistors repurposed as microchip ‘clock’ address supply chain weakness
842:Purdue’s online engineering graduate programs gain overall in U.S. News rankings o

Your max_length is set to 200, but your input_length is only 153. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=76)


851:Purdue Bands & Orchestras hosting 33rd annual Purdue Jazz Festival
852:Using cancer cells as logic gates to determine what makes them move
853:Purdue engineers improve solar cell efficiency, stability
854:Expo registration open for Indiana green industry professionals
855:Raytheon Technologies commits $4 million to Purdue for named chair position in new School of Business


Your max_length is set to 200, but your input_length is only 164. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=82)


856:OmniVis recognized with prestigious Most Fundable Companies designation
857:President Chiang, Provost Wolfe to hold undergraduate listening session
858:National searches for Purdue’s named deanships in Agriculture and in Science; nominations sought
859:Finalists for dean of College of Engineering selected, will make on-campus presentations
860:Expansive agricultural dataset now available from Purdue University
861:Today’s top 5 from Purdue University
862:Purdue scientists and engineers push the boundaries of space knowledge, studying the stars, the solar system and beyond


Your max_length is set to 200, but your input_length is only 190. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=95)


863:December Consumer Food Insights Report reveals steady food behaviors through economic change
864:How the College of Education is addressing the teacher shortage


Your max_length is set to 200, but your input_length is only 190. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=95)


865:Plan ahead for on-street snow removal: Purdue, city of West Lafayette snow routes and process
866:Purdue to celebrate Martin Luther King Jr. with concert by Morgan State University Choir
867:Purdue Global Concord Law School to discuss how technology can promote access to justice in next Distinguished Speaker Series
868:Purdue announces investments and policies for scholarly impact and research excellence
869:New high-tech startup developing smart contact lenses for glaucoma diagnosis and management


Your max_length is set to 200, but your input_length is only 130. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=65)


870:Welcome back letter from President Chiang, Provost Wolfe
871:Over 500 to attend Purdue’s Indiana STEM Education Conference
872:Purdue Global’s Diego Britto selected for second term on UPCEA finance committee
873:Today’s top 5 from Purdue University
874:Purdue appoints senior vice president, vice provosts and interim agriculture dean
