# Text Summary & Scoring Project
##### Michael Creegan, Yungfeng Dai, Hong Gyu Ji, Ziling Zeng
##### Python for Data Analysis
##### Columbia University

# Abstract

Summarization is a common problem in the 21st century as the world has become increasingly driven by data. A summary makes it possible to quickly determine whether something is relevant or worth reading. Another use case could be to store summaries of articles in the backend and run downstream tasks on them. Summaries could also be useful for assessing semantic integrity as an indicator of quality.

To explore this topic, we will leverage the extreme summarization dataset (XSUM), which consists of BBC articles accompanied by single-sentence summaries. Each article is prefaced with an introductory sentence (which serves as the summary) that is professionally written, typically by the author of the article.

To summarize articles, we will use an encoder-decoder (sequence-to-sequence) transformer, which combines an encoder and a decoder because the task involves both reading input and generating output: taking in text and then producing a summary. We selected this type of transformer because the encoder accepts the input text and computes a high-level representation of it, which is then passed to the decoder to generate the predicted output (the summary). This has advantages over a standalone encoder such as BERT, ALBERT, ELECTRA, RoBERTa, or DistilBERT, to name a few: encoders are typically pre-trained by filling in randomly masked words using context from both sides, so they are better suited to understanding input text than to generating output. A standalone decoder such as GPT-2 would also not be optimal: decoders are trained to predict the next word in a sequence from the context on one side only (the words to the left), so they are well suited to generating text but less so to building a full representation of the input, because the context on the other side is hidden.
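The difference in visible context can be illustrated with attention masks. The following is a minimal sketch in plain Python, not the BART implementation: the function names `bidirectional_mask` and `causal_mask` and the toy token list are assumptions made for illustration. A `1` at row `i`, column `j` means position `i` may attend to position `j`.

```python
def bidirectional_mask(n):
    """Encoder-style mask: every position may attend to every position."""
    return [[1] * n for _ in range(n)]


def causal_mask(n):
    """Decoder-style mask: position i may attend only to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


# Toy input (hypothetical tokens, not real tokenizer output).
tokens = ["the", "cat", "sat", "down"]

encoder_mask = bidirectional_mask(len(tokens))
decoder_mask = causal_mask(len(tokens))

# Show which tokens each position can see under the causal mask.
for token, row in zip(tokens, decoder_mask):
    visible = [t for t, keep in zip(tokens, row) if keep]
    print(f"{token!r} can attend to {visible}")
```

In an encoder-decoder model, the decoder applies a causal mask over the summary it is generating, while also attending to the full encoder output, which is how it gets access to the whole article.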

Our scoring will compare the output of the BART encoder-decoder model to the professionally written summaries in the XSUM dataset to see how similar a machine-generated summary is to a professional one. Our scoring methodology focuses on semantic textual similarity, computed as the cosine similarity between the professional human-written summary and the machine-generated one.
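Cosine similarity between two vectors u and v is their dot product divided by the product of their lengths, so it ranges from -1 to 1 and is close to 1 when the vectors point in nearly the same direction. The sketch below computes it in plain Python; the three-dimensional "embeddings" are made-up values for illustration, whereas in practice the vectors would come from a sentence-embedding model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical embeddings (illustrative values, not model output).
reference_embedding = [0.2, 0.7, 0.1]      # human-written summary
generated_embedding = [0.25, 0.65, 0.05]   # machine summary, similar meaning
unrelated_embedding = [0.9, -0.1, 0.4]     # unrelated sentence

score_close = cosine_similarity(reference_embedding, generated_embedding)
score_far = cosine_similarity(reference_embedding, unrelated_embedding)
print(f"similar pair:   {score_close:.3f}")
print(f"unrelated pair: {score_far:.3f}")
```

Because the score depends only on direction and not on vector length, it is a common choice for comparing sentence embeddings of texts with different lengths.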

# Importing Transformers & Dependencies

In [37]:
import pandas as pd
import numpy as np
from transformers import BartTokenizer, BartForConditionalGeneration, BartConfig
from datasets import load_dataset, load_metric
from sentence_transformers import SentenceTransformer, util
import random
from IPython.display import display, HTML

# Load XSUM Dataset

In [38]:
xsum = load_dataset('xsum')

Using custom data configuration default
Reusing dataset xsum (C:\Users\creeg\.cache\huggingface\datasets\xsum\default\1.2.0\32c23220eadddb1149b16ed2e9430a05293768cfffbdfd151058697d4c11f934)
100%|██████████| 3/3 [00:00<00:00, 52.32it/s]


### We can see that the dataset is a "DatasetDict" where the keys are strings that correspond to the splits and the values are the dataset objects. In the XSUM dataset, the keys are "train", "validation", and "test", and each split has the columns "document", "summary", and "id"

In [39]:
xsum

DatasetDict({
    train: Dataset({
        features: ['document', 'summary', 'id'],
        num_rows: 204045
    })
    validation: Dataset({
        features: ['document', 'summary', 'id'],
        num_rows: 11332
    })
    test: Dataset({
        features: ['document', 'summary', 'id'],
        num_rows: 11334
    })
})

### View a record of the underlying data

In [40]:
xsum['test'][0]

{'document': 'Prison Link Cymru had 1,099 referrals in 2015-16 and said some ex-offenders were living rough for up to a year before finding suitable accommodation.\nWorkers at the charity claim investment in housing would be cheaper than jailing homeless repeat offenders.\nThe Welsh Government said more people than ever were getting help to address housing problems.\nChanges to the Housing Act in Wales, introduced in 2015, removed the right for prison leavers to be given priority for accommodation.\nPrison Link Cymru, which helps people find accommodation after their release, said things were generally good for women because issues such as children or domestic violence were now considered.\nHowever, the same could not be said for men, the charity said, because issues which often affect them, such as post traumatic stress disorder or drug dependency, were often viewed as less of a priority.\nAndrew Stevens, who works in Welsh prisons trying to secure housing for prison leavers, said the

### We can use a function to view a random selection of articles and summaries from a given split, displayed as a table, to get a more representative picture of what the data looks like

In [41]:
def display_function(dataset, num_examples=5):
    # cannot sample more records than the split contains
    assert num_examples <= len(dataset)

    # draw num_examples distinct row indices at random
    selections = random.sample(range(len(dataset)), num_examples)

    # indexing a Dataset with a list of indices returns a dict of columns
    df = pd.DataFrame(dataset[selections])

    # render the table once (looping over the features would repeat it per column)
    display(HTML(df.to_html()))

### Our end goal is to create accurate summaries using this model, so we need to remove characters that do not provide any contextual value. We can also see that there are characters in the document that are not present in the summary, which could cause discrepancies between our machine-generated summary and the professional human-written one. We need to remove the newline characters that are present in the document column but not in the summary column

In [42]:
display_function(xsum["test"])

Unnamed: 0,document,summary,id
0,"In a speech marking the end of the Islamic holy month of Ramadan, he said Iran still had sharp differences with the US, above all over the Middle East.\nIran would continue to back Syria, Iraq, the Palestinians and ""oppressed people"" in Yemen and Bahrain, he said.\nThe deal on Iran's nuclear programme came after years of negotiations.\nThe agreement limits Iran's nuclear activities for at least 10 years in exchange for the gradual lifting of sanctions which have hampered the country's economy.\n""Whether the [nuclear] deal is approved or disapproved, we will never stop supporting our friends in the region and the people of Palestine, Yemen, Syria, Iraq, Bahrain and Lebanon,"" Ayatollah Khamenei said.\n""Even after this deal our policy towards the arrogant US will not change.""\nThe supreme leader also denied that Iran was intending to create a nuclear bomb.\n""The Americans say they stopped Iran from acquiring a nuclear weapon,"" he said in his speech at the Mosala mosque in Tehran.\n""They know it's not true. We had a fatwa [religious ruling] declaring nuclear weapons to be religiously forbidden under Islamic law. It had nothing to do with the nuclear talks.\n""We have repeatedly said we don't negotiate with the US on regional or international affairs; not even on bilateral issues.\n""There are some exceptions like the nuclear programme that we negotiated with Americans to serve our interests... 
US policies in the region are diametrically opposed with Iran's policies.""\nThe supreme leader said it was now necessary for Iranian politicians to scrutinise the nuclear agreement and make sure that Iran's national interests were being preserved.\nCorrespondents say that Ayatollah Khamenei's views are in contrast to the acclaim that the accord has received from President Hassan Rouhani and Foreign Minister Mohammad Javad Zarif.","Iran's stance towards the ""arrogant"" US will not change despite the nuclear deal reached earlier this week, supreme leader Ayatollah Ali Khamenei has said.",33578942
1,"Mr Roache, who is secretary of the union's Yorkshire region, won 56.7% of the vote, while the only other candidate, Paul McCarthy, from the North West region, had 43.3%.\nThe union is the third largest in the UK with more than 600,000 members.\nCurrent general secretary Paul Kenny announced in the summer he was standing down after almost a decade in the job.\nMr Roache has 35 years experience at the GMB and led the Leeds City Council 13-week refuse and street cleaning strike in 2009 - the longest in the union's history.\nHe said he was ""proud and humbled"" to have been elected.\n""I will repay GMB members' faith in me by leading a 21st Century union that fights for our members, their families and communities, every hour of every day.""\nMr Roache also paid tribute to Mr Kenny for his work ""that has made GMB the envy of the union movement"".\nDetails of the handover date have yet to be agreed.\nThe GMB is one of the three largest affiliates to the Labour Party and is a significant financial contributor to the party locally and nationally.",Regional official Tim Roache has been elected to become the new general secretary of the GMB union.,34801085
2,"Dunkley, 25, made 52 appearances in all competitions for Oxford last season but rejected a new contract and will join Wigan on 30 June on a free transfer.\nThe former Crewe and Hednesford Town centre-back joined the U's from Kidderminster Harriers on loan before signing permanently in 2015.\nHe is new manager Paul Cook's first signing since arriving on 31 May.\nFind all the latest football transfers on our dedicated page.",Wigan Athletic have signed defender Chey Dunkley on a three-year deal from League One rivals Oxford United.,40159991
3,"The man, who is in his 50s, also sustained a suspected broken leg in the first attack on Castle Street at about 18:45 BST on Saturday.\nPolice said he was taken to hospital after a row on a garage forecourt.\nA short time later, another man and a woman were assaulted in licensed premises on Castle Street.\nDetectives have appealed for anyone who witnessed the attacks to contact police.","A man's eye socket has been broken and two other people have received minor injuries during two assaults in Ballycastle, County Antrim.",34566103
4,"Len Richards has been named as the new chief executive of Cardiff and Vale University Health Board (CVUHB).\nMr Richards, who moved from the UK to Australia in November 2013, is expected to start his new post in June.\nCVUHB chair Maria Battle said he brought ""broad international"" experience to the role.\nMr Richards is currently the deputy chief executive for South Australia Health, a government department responsible for public health in Adelaide.\nHe said there was a lot of hard work to do on the board's financial situation, which is forecasting a £31m deficit for the 2016-17 financial year.",A new boss for health services in south Wales has been appointed all the way from Australia.,39634820




### We can address the problem mentioned above by defining a cleaning function that replaces newline characters with a space.

In [43]:
def clean(row):
    row['document'] = row['document'].replace('\n', ' ')
    return row
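As a quick check of what the cleaning function does, the sketch below applies it to a single hand-written record. The `sample_row` dictionary is a hypothetical stand-in shaped like an XSUM row (its values are made up), and the function is repeated so the snippet runs on its own.

```python
def clean(row):
    """Replace newline characters in the 'document' field with spaces."""
    row['document'] = row['document'].replace('\n', ' ')
    return row


# Hypothetical record with the same columns as an XSUM row.
sample_row = {
    'document': 'First sentence.\nSecond sentence.\nThird sentence.',
    'summary': 'A short summary.',
    'id': '0',
}

# Pass a copy so the original record is left unchanged for comparison.
cleaned = clean(dict(sample_row))

print(repr(sample_row['document']))
print(repr(cleaned['document']))
```

Only the `document` field is modified; `summary` and `id` pass through untouched.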

### We can now apply the cleaning function we created by mapping it onto our data (it is applied to the train, validation, and test splits)

In [44]:
xsum = xsum.map(clean)

Loading cached processed dataset at C:\Users\creeg\.cache\huggingface\datasets\xsum\default\1.2.0\32c23220eadddb1149b16ed2e9430a05293768cfffbdfd151058697d4c11f934\cache-ec5b3ab440c9df82.arrow
Loading cached processed dataset at C:\Users\creeg\.cache\huggingface\datasets\xsum\default\1.2.0\32c23220eadddb1149b16ed2e9430a05293768cfffbdfd151058697d4c11f934\cache-a176a692461cda61.arrow
Loading cached processed dataset at C:\Users\creeg\.cache\huggingface\datasets\xsum\default\1.2.0\32c23220eadddb1149b16ed2e9430a05293768cfffbdfd151058697d4c11f934\cache-bc530be4c3ab51ba.arrow


### Voila!

In [45]:
display_function(xsum["test"])

Unnamed: 0,document,summary,id
0,"Toby Ricketts and Marianna Fenn tied the ""noodle knot"" in the New Zealand South Island town of Akaroa. The happy couple say that guidelines of the Pastafarian religion stipulate that wedding celebrants must be pirates. Members of the church profess the belief that the world was created by an airborne spaghetti and meatballs-based being and humans evolved from pirates. New Zealand officials last month designated the religion as an officially-recognised faith, allowing Wellington-based Pastafarian Karen Martyn the legal right to conduct marriages. She carried out her inaugural wedding as an ordained ""ministeroni"" on Saturday. More weddings are planned, she said, including same-sex unions that were legalised in New Zealand in 2013. ""I've had people from Russia, from Germany, from Denmark, from all over contacting me and wanting me to marry them in the church because of our non-discriminatory philosophy,"" she said. ""We will marry any consenting legal adults who meet the legal requirement.""",The light-hearted Church of the Flying Spaghetti Monster has staged its first legally recognised wedding.,36062126
1,"The picture above would certainly make you think so. Unfortunately, the reality is quite different: what looks like snow is actually harmful snow-white froth that floats up from the city's largest lake and spills over into neighbouring areas. Over the years, the 9,000-acre Bellandur lake in India's technology capital has been polluted by chemicals and sewage. IT professional Debasish Ghosh has been taking pictures of the lake of ""harmful snowy froth"" for months now. Here is a selection of his pictures.",Is it snowing in India's tropical southern city of Bangalore?,34376988
2,"Judge Thokozile Masipa did the same for the lawyers on Thursday, urging them to make good use of the upcoming fortnight break for the Easter holidays. In that spirit, here are a few questions that have been niggling me in recent days. Tweet your thoughts and suggestions to @BBCAndrewH. I will be taking a week off and then focusing on South Africa's general election before returning to the hard benches of Courtroom GD on 5 May.","Both defence lawyer Barry Roux and prosecutor Gerrie Nel have made a habit of setting ""homework"" for the witnesses they are cross-examining at the murder trail of South African athlete Oscar Pistorius.",27070096
3,"Media playback is not supported on this device The new deal does not extend Conte's commitment to the club, as he signed a three-year contract on his arrival in west London in the summer of 2016. ""I am very happy to have signed a new contract,"" said the Italian, 47. ""We worked extremely hard in our first year to achieve something amazing, which I am very proud of. Now we must work even harder to stay at the top."" The decision to sign a new contract without extending the terms runs counter to previous comments by Conte, who had indicated his willingness to commit to a longer deal. Speaking in May, he said he wanted to stay with Chelsea ""for many years"", adding: ""If the club give me the possibility to stay and extend my contract, for sure I'm available to."" Conte lifted the Premier League title at the first attempt in the 2016-17 season, winning 30 games, which included a club record 13 consecutive victories. He also guided the Blues to the FA Cup final, though they were beaten by Arsenal. The former Juventus and Italy manager was credited for transforming the Stamford Bridge club's fortunes after they could only finish 10th the previous season. A brutal training regime was part of the transformation, as was his decision to switch to a three-man central-defensive set-up - his preferred tactic at both Juve and Italy. ""The Chelsea fans have given me so much support since I arrived here one year ago and it is important we continue to succeed together,"" added Conte, whose team has flown out to China and Singapore for pre-season games against the Gunners, Bayern Munich and Inter Milan. Chelsea director Marina Granovskaia added: ""Antonio achieved incredible success last season, adapting to English football very quickly and leading us to the Premier League title. 
""This new contract reflects our belief that he can continue to deliver results both domestically and as we return to European competition in the Champions League."" Conte has presided over at busy summer so far at Stamford Bridge. Blues have signed midfielder Tiemoue Bakayoko from French champions Monaco in a reported £40m deal, and also brought in defender Antonio Rudiger from Roma for an initial £29m. The Chelsea boss was thwarted in the chase to take striker Romelu Lukaku from Everton, with the Belgium international instead joining Manchester United in a £75m deal. Conte has also been working hard to get players out the door at Stamford Bridge, with striker Diego Costa and midfielder Nemanja Matic both absent from the Far East tour before expected moves.",Chelsea manager Antonio Conte has signed an improved two-year deal with the Premier League champions.,40650899
4,"Although 14 candidates are contesting the election, these two men are the frontrunners with most eyes on them. The elections in Africa's biggest oil producer come at a politically sensitive time, with the rise of Islamist group Boko Haram in the north-east meaning security is at the centre of the campaigns for votes. Nigerians discuss their experience of the election campaign with the BBC and say whom they plan to support. I am voting APC this time, though I voted People's Democractic Party (PDP) in the last presidential election. This is because PDP has been in power for over 15 years and we haven't really progressed. Our leaders need to understand Nigerians decide who leads them. I have no sympathy for any political party. I simply want the best for my country. We have tried PDP and they have failed. Now is time for change. Nigeria has abundant resources (human and natural) to be amongst the world's greatest nations. We need a compassionate, visionary, strong-willed leader to lead us to our rightful place. Kill corruption and Nigeria will not only live but prosper. Buhari is certainly not the ""Messiah"" but he surely can be the forerunner. He can help lay the right foundations for a new corrupt-free Nigeria. I'm supporting President Goodluck Ebel Jonathan because he is a true democrat. He is building institutions in Nigeria and giving them the free hand to tell between good and evil. He is a true Nigerian - patriotic and loyal. No past leader compares to GEJ in democracy, performance and transformation. I am not voting because of the candidates. They don't meet my standards. Corruption is the biggest problem in the country. Goodluck Jonathan is weak because there is a lot of corruption in the land and he hasn't confronted it. Muhammadu Buhari could not deal with it in my opinion - he doesn't have enough brains. I would prefer a balance of 50% ability to fight corruption and 50% ability to handle the economy. 
The other guys aren't popular, the more well-known candidates over-shadow them. People are eager and are waiting to see what happens. This election is divided. Now people don't see themselves as Nigerian but rather by their ethnic group.","Nigeria's presidential elections, taking place this Saturday, will see a showdown between incumbent President Goodluck Jonathan and former military ruler Muhammadu Buhari, of the opposition All Progressives Congress (APC) party.",32079597


4,"Although 14 candidates are contesting the election, these two men are the frontrunners with most eyes on them. The elections in Africa's biggest oil producer come at a politically sensitive time, with the rise of Islamist group Boko Haram in the north-east meaning security is at the centre of the campaigns for votes. Nigerians discuss their experience of the election campaign with the BBC and say whom they plan to support. I am voting APC this time, though I voted People's Democractic Party (PDP) in the last presidential election. This is because PDP has been in power for over 15 years and we haven't really progressed. Our leaders need to understand Nigerians decide who leads them. I have no sympathy for any political party. I simply want the best for my country. We have tried PDP and they have failed. Now is time for change. Nigeria has abundant resources (human and natural) to be amongst the world's greatest nations. We need a compassionate, visionary, strong-willed leader to lead us to our rightful place. Kill corruption and Nigeria will not only live but prosper. Buhari is certainly not the ""Messiah"" but he surely can be the forerunner. He can help lay the right foundations for a new corrupt-free Nigeria. I'm supporting President Goodluck Ebel Jonathan because he is a true democrat. He is building institutions in Nigeria and giving them the free hand to tell between good and evil. He is a true Nigerian - patriotic and loyal. No past leader compares to GEJ in democracy, performance and transformation. I am not voting because of the candidates. They don't meet my standards. Corruption is the biggest problem in the country. Goodluck Jonathan is weak because there is a lot of corruption in the land and he hasn't confronted it. Muhammadu Buhari could not deal with it in my opinion - he doesn't have enough brains. I would prefer a balance of 50% ability to fight corruption and 50% ability to handle the economy. 
The other guys aren't popular, the more well-known candidates over-shadow them. People are eager and are waiting to see what happens. This election is divided. Now people don't see themselves as Nigerian but rather by their ethnic group.","Nigeria's presidential elections, taking place this Saturday, will see a showdown between incumbent President Goodluck Jonathan and former military ruler Muhammadu Buhari, of the opposition All Progressives Congress (APC) party.",32079597


### We can view the column names and data types within our dataset using .features

In [46]:
xsum['test'].features

{'document': Value(dtype='string', id=None),
 'summary': Value(dtype='string', id=None),
 'id': Value(dtype='string', id=None)}

In [47]:
print(xsum['test'].info)

DatasetInfo(description='\nExtreme Summarization (XSum) Dataset.\n\nThere are three features:\n  - document: Input news article.\n  - summary: One sentence summary of the article.\n  - id: BBC ID of the article.\n\n', citation="\n@article{Narayan2018DontGM,\n  title={Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization},\n  author={Shashi Narayan and Shay B. Cohen and Mirella Lapata},\n  journal={ArXiv},\n  year={2018},\n  volume={abs/1808.08745}\n}\n", homepage='https://github.com/EdinburghNLP/XSum/tree/master/XSum-Dataset', license='', features={'document': Value(dtype='string', id=None), 'summary': Value(dtype='string', id=None), 'id': Value(dtype='string', id=None)}, post_processed=None, supervised_keys=SupervisedKeysData(input='document', output='summary'), task_templates=None, builder_name='xsum', config_name='default', version=1.2.0, splits={'train': SplitInfo(name='train', num_bytes=479206615, num_examples=204045, data

# Preparing XSUM Data
Before we can pass text to a model, we need to convert it into a format that the transformer can understand. Encoders and decoders only operate on numerical values, so we need to tokenize the text and then convert the tokens into numbers. The tokenizer splits the text into tokens and adds any special tokens the model expects from pretraining. It then matches each token to a unique ID in its vocabulary, and the model maps each ID to a vector of numerical values. After self-attention, these vectors are contextualized: the representation of the word "to" isn't just "to", it also takes into account the words around it, which are called the left and right context. For example, in the sentence "Welcome to NYC", the left context of "to" is "Welcome" and the right context is "NYC". The output for "to" depends on both contexts, which is how the self-attention mechanism produces a contextualized vector. We could do all of this with the AutoTokenizer.from_pretrained method, which returns a tokenizer that corresponds to the model architecture we want to use (facebook/bart-large-cnn); however, we explicitly use BartTokenizer and BartForConditionalGeneration with the same checkpoint, so that the tokenizer and the model were trained using the same methodology and we avoid unexpected summaries.
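To make the token-to-ID step concrete, here is a minimal sketch in plain Python. It is only an illustration: the vocabulary, the IDs, and the helper names `toy_tokenize` and `toy_encode` are invented for this example, and the whitespace splitting is a simplification (BART's real tokenizer works on subword units, not whole words).

```python
# Toy vocabulary: tokens and IDs are invented for illustration only
# and are not BART's real vocabulary.
TOY_VOCAB = {"<s>": 0, "<pad>": 1, "</s>": 2, "<unk>": 3,
             "welcome": 4, "to": 5, "nyc": 6}


def toy_tokenize(text):
    """Lowercase, split on whitespace, and wrap in start/end special tokens."""
    words = text.lower().split()
    return ["<s>"] + words + ["</s>"]


def toy_encode(text):
    """Map each token to its ID, falling back to <unk> for unknown words."""
    unk_id = TOY_VOCAB["<unk>"]
    return [TOY_VOCAB.get(token, unk_id) for token in toy_tokenize(text)]


print(toy_tokenize("Welcome to NYC"))  # ['<s>', 'welcome', 'to', 'nyc', '</s>']
print(toy_encode("Welcome to NYC"))    # [0, 4, 5, 6, 2]
```

A word that is missing from the toy vocabulary is mapped to the `<unk>` ID, which is one reason real tokenizers use subword units instead of whole words.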

In [63]:
checkpoint = "facebook/bart-large-cnn"
tokenizer = BartTokenizer.from_pretrained(checkpoint)
model = BartForConditionalGeneration.from_pretrained(checkpoint)

### We now write a function that preprocesses the test data by passing it to the tokenizer. We need to use the argument truncation=True to ensure that any input longer than the model can handle is truncated to the maximum length allowed. We can view this information in the model config: BART has a maximum length of 1024, which we can see in max_position_embeddings

In [49]:
model.config

BartConfig {
  "_name_or_path": "facebook/bart-large-cnn",
  "_num_labels": 3,
  "activation_dropout": 0.0,
  "activation_function": "gelu",
  "add_final_layer_norm": false,
  "architectures": [
    "BartForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "classif_dropout": 0.0,
  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 12,
  "decoder_start_token_id": 2,
  "dropout": 0.1,
  "early_stopping": true,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 2,
  "force_bos_token_to_be_generated": true,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "gradient_checkpointing": false,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2"
  },
  "init_std": 0.02,
  "is_encoder_decoder": true,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "L

### We can now create the function with the maximum input length allowed as per the config and an arbitrary maximum target length.

In [50]:
max_input_length = 1024
max_target_length = 100


def preperation_function(examples):
    inputs = [doc for doc in examples["document"]]
    model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True, padding=True)

    
    # Tokenize the reference summaries; as_target_tokenizer() switches the tokenizer into target (label) mode
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(
            examples["summary"], max_length=max_target_length, truncation=True
        )

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
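As a rough illustration of what truncation to a maximum length means, the sketch below cuts a list of token IDs down to `max_length` while keeping an end-of-sequence ID in the last position. The helper `truncate_ids` and the sample IDs are hypothetical, and this is a simplification of the idea rather than the tokenizer's actual implementation.

```python
def truncate_ids(token_ids, max_length, eos_id=2):
    """Keep at most max_length IDs; a truncated sequence still ends with eos_id.

    Assumes max_length >= 1. The default eos_id=2 mirrors eos_token_id in the
    config shown above.
    """
    if len(token_ids) <= max_length:
        return list(token_ids)
    return list(token_ids[: max_length - 1]) + [eos_id]


short = [0, 10, 11, 2]
long = [0, 10, 11, 12, 13, 2]

print(truncate_ids(short, max_length=4))  # [0, 10, 11, 2] (unchanged)
print(truncate_ids(long, max_length=4))   # [0, 10, 11, 2] (tail dropped)
```

Anything beyond the limit is simply discarded, so for very long articles the model never sees the end of the document.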

### We can apply this function to our dataset using map

In [51]:
tokenized_xsum = xsum.map(preperation_function, batched=True)

Loading cached processed dataset at C:\Users\creeg\.cache\huggingface\datasets\xsum\default\1.2.0\32c23220eadddb1149b16ed2e9430a05293768cfffbdfd151058697d4c11f934\cache-2b651f21d6ec073a.arrow
Loading cached processed dataset at C:\Users\creeg\.cache\huggingface\datasets\xsum\default\1.2.0\32c23220eadddb1149b16ed2e9430a05293768cfffbdfd151058697d4c11f934\cache-35f38c35a797b587.arrow
Loading cached processed dataset at C:\Users\creeg\.cache\huggingface\datasets\xsum\default\1.2.0\32c23220eadddb1149b16ed2e9430a05293768cfffbdfd151058697d4c11f934\cache-c6fb5876cc0b65d3.arrow
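
With batched=True, the mapped function receives a dictionary of columns, where each value is a list covering several examples, and the columns it returns are added alongside the existing ones. The plain-Python sketch below imitates that behaviour on a list of row dictionaries; `batched_map` and `add_word_count` are hypothetical helpers written for this illustration and are not part of the datasets library.

```python
def batched_map(rows, function, batch_size=2):
    """Apply `function` to column-wise batches and return the updated rows."""
    output_rows = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # Turn a list of row dicts into a dict of column lists.
        batch = {key: [row[key] for row in chunk] for key in chunk[0]}
        result = function(batch)
        # Keep the original columns and add (or overwrite with) the new ones.
        merged = {**batch, **result}
        for i in range(len(chunk)):
            output_rows.append({key: values[i] for key, values in merged.items()})
    return output_rows


def add_word_count(batch):
    """Example batch function: one new value per document in the batch."""
    return {"n_words": [len(doc.split()) for doc in batch["document"]]}


rows = [{"document": "a b c"}, {"document": "d e"}, {"document": "f"}]
mapped = batched_map(rows, add_word_count)
print(mapped[0])  # {'document': 'a b c', 'n_words': 3}
```

This is why the tokenized dataset keeps its original columns and gains the new tokenizer outputs next to them.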


In [52]:
tokenized_xsum

DatasetDict({
    train: Dataset({
        features: ['attention_mask', 'document', 'id', 'input_ids', 'labels', 'summary'],
        num_rows: 204045
    })
    validation: Dataset({
        features: ['attention_mask', 'document', 'id', 'input_ids', 'labels', 'summary'],
        num_rows: 11332
    })
    test: Dataset({
        features: ['attention_mask', 'document', 'id', 'input_ids', 'labels', 'summary'],
        num_rows: 11334
    })
})

In [53]:
tokenized_xsum['test'].features

{'attention_mask': Sequence(feature=Value(dtype='int8', id=None), length=-1, id=None),
 'document': Value(dtype='string', id=None),
 'id': Value(dtype='string', id=None),
 'input_ids': Sequence(feature=Value(dtype='int32', id=None), length=-1, id=None),
 'labels': Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None),
 'summary': Value(dtype='string', id=None)}

### The attention mask tells the model what to pay attention to: a value of 1 marks a token to consider and a value of 0 marks a token to ignore, such as padding. The input ids are the numerical mapping of tokens to BART's vocabulary; each token in BART's vocabulary is assigned a numerical ID.
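
The relationship between padding and the attention mask can be sketched in a few lines of plain Python. The helper `pad_batch` is hypothetical, the sample IDs are invented, and the padding ID of 1 is an assumption made for this example rather than a value read from the config output above.

```python
def pad_batch(sequences, pad_id=1):
    """Pad every sequence to the longest one and build matching attention masks.

    pad_id=1 is an assumed padding ID used only for this illustration.
    """
    longest = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = longest - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # 1 = real token the model should attend to, 0 = padding to ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = pad_batch([[0, 4, 5, 6, 2], [0, 4, 2]])
print(batch["input_ids"])       # [[0, 4, 5, 6, 2], [0, 4, 2, 1, 1]]
print(batch["attention_mask"])  # [[1, 1, 1, 1, 1], [1, 1, 1, 0, 0]]
```

Every sequence in a batch ends up with the same length, and the mask records which positions hold real tokens.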

In [54]:
display_function(tokenized_xsum['test'])

Unnamed: 0,attention_mask,document,id,input_ids,labels,summary
0,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","Media playback is not supported on this device How did their players rate in the biggest match in the history of Welsh football? Coped well with Portugal's early flurry of crosses but was powerless to deny Cristiano Ronaldo for Portugal's opening goal. The Crystal Palace player was unfortunate to be wrong-footed by Nani's deflection for the second goal. Watchful against the threat of Ronaldo cutting inside from Portugal's left, the Reading defender was kept busy by the likes of Renato Sanches and Nani and struggled to influence the game in attack. Switched to the left side of Wales' three centre-backs in Ben Davies' absence, the West Brom man timed his advances well to make interceptions. Beaten by Ronaldo for Portugal's opening goal but won a team-high eight aerial duels. Made some strong early challenges, particularly on Cristiano Ronaldo, to assert his authority on the game. The Swansea City skipper led by example, winning 100% of his 50-50 contests with Portugal players. Making his first international start since March 2015, he competed well in the air as Portugal sought to make the most of Ronaldo's aerial prowess. Wales might have missed Ben Davies' distribution but his replacement was solid defensively. Like Gunter, kept on the back foot by Portugal's attacking players. Pushed forward but, when he got into promising positions, struggled to provide quality crosses. An early booking for a foul on Nani made his job of protecting Wales' defence difficult, but still the Liverpool midfielder buzzed around with intent. Typically sound in possession but not as influential as he has been earlier in the tournament. 
Showed imagination with a low corner which led to a chance for Gareth Bale but had only limited influence in open play before being replaced by Sam Vokes shortly after Portugal's second goal. Made some characteristic runs into the Portugal penalty area but could not make the crucial connections. Forced deeper as Portugal's midfield gained control in the second half, the Leicester Premier League winner had to curb his attacking instincts. Trademark runs from deep and at a startling pace had Portugal's defenders backtracking in the first half but his influence waned in the second period. The Real Madrid forward's audacious long-range shot was Wales' last effort. Brimming with confidence following his stunning goal in the quarter-final win over Belgium, the free agent stretched Portugal's defence with his powerful running. He was starved of the ball in the second half, however, as the match wore on. Brought on shortly after Wales fell 2-0 behind, the Burnley striker failed to connect meaningfully with any of the crosses which came his way. Did not see much of the ball and, when he did, was not in a position to cause Portugal any problems. 
Tried making his usual probing runs between the opponents' midfield and defence but found himself crowded out.",36730443,"[0, 18801, 20083, 16, 45, 2800, 15, 42, 2187, 1336, 222, 49, 472, 731, 11, 5, 934, 914, 11, 5, 750, 9, 12093, 1037, 116, 9351, 196, 157, 19, 8062, 18, 419, 18996, 9, 20238, 53, 21, 33128, 7, 7631, 8767, 5472, 7991, 13, 8062, 18, 1273, 724, 4, 20, 9793, 5928, 869, 21, 9327, 7, 28, 1593, 12, 26620, 30, 234, 1543, 18, 3816, 20576, 13, 5, 200, 724, 4, 3075, 2650, 136, 5, 1856, 9, 7991, 3931, 1025, 31, 8062, 18, 314, 6, 5, 4913, 5142, 21, 1682, 3610, 30, 5, 3829, 9, 6340, 3938, 208, 23833, 8, ...]","[0, 771, 4575, 108, 24666, 5122, 336, 422, 376, 7, 41, 253, 65, 177, 137, 5, 507, 25, 51, 685, 132, 12, 288, 7, 8062, 11, 5, 94, 237, 4, 2]",Wales' heroic Euro 2016 run came to an end one game before the final as they lost 2-0 to Portugal in the last four.
1,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","The Â£42m MV Loch Seaforth made its first passenger sailing last month but is still in a ""test period"" and not fully in service. Thursday's fault took five hours to fix and the ferry was cleared again for sailings. Another ferry, the Isle of Lewis, took the passengers involved. Bad weather has led the cancellations of Friday sailings on the Stornoway-Ullapool route and other services on Scotland's west coast. Ferry operator Caledonian MacBrayne said withdrawing the Loch Seaforth had been an operational decision and the fault would not have prevented the ship from sailing. A spokesman said: ""Yesterday evening an issue arose with an engine room ventilation fan which required attention and an operational decision was taken to remove her from the route while it was fixed. 
""While passengers were delayed, and we regret any inconvenience to them, no-one was stranded."" A spokesman added: ""This was not a major issue but it required around five hours of work as the fan was in a difficult to reach location.""",31759248,"[0, 133, 1437, 2537, 29254, 3714, 119, 28830, 26384, 27411, 22494, 156, 63, 78, 4408, 17664, 94, 353, 53, 16, 202, 11, 10, 22, 21959, 675, 113, 8, 45, 1950, 11, 544, 4, 296, 18, 7684, 362, 292, 722, 7, 4190, 8, 5, 15169, 21, 6049, 456, 13, 18840, 1033, 4, 2044, 15169, 6, 5, 18930, 9, 3577, 6, 362, 5, 3670, 963, 4, 5654, 1650, 34, 669, 5, 24068, 1635, 9, 273, 18840, 1033, 15, 5, 312, 4244, 20574, 12, 791, 890, 1115, 8110, 3420, 8, 97, 518, 15, 3430, 18, 3072, 3673, 4, 25820, 5364, 2912, 196, 26399, ...]","[0, 250, 92, 15169, 1490, 13, 5, 312, 4244, 20574, 7, 121, 890, 1115, 8110, 3420, 21, 8059, 13375, 31, 4408, 5941, 142, 9, 10, 23624, 30911, 2378, 4, 2]",A new ferry built for the Stornoway to Ullapool route was temporarily withdrawn from passenger duties because of a faulty ventilation fan.
2,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","Broadband speeds in Kingsmere, on the edge of Bicester, rarely exceed 2Mbps, and some homes cannot get a landline. Residents of the 400 homes have put up posters warning potential newcomers of the issue. Developer Countryside Properties said it had provided ducting for the cables but it was up to BT what went in them. BT said it proposed sharing the Â£45,000 cost of providing fibre-optic broadband. The development at Kingsmere is part of the 13,000 homes planned by the government to turn Bicester into one of a new generation of garden cities, announced earlier this month. Resident Matt Maunder said: ""I'm a home worker, and I need good broadband to do my job. We've actually got residents who moved here in August who still don't have a phone line - that's just unacceptable. ""I can't carry out my job effectively, I can't take advantage of services like Skype, my family live abroad so I can't get in touch with them as easily as I would like. ""Unfortunately we have got people now saying they wish they hadn't moved here because of the way the service is and that's a real shame, particularly because it's been lauded as the latest and greatest housing development in the country."" BT said it had reached an agreement for 726 additional homes yet to be built and a proposal for the existing houses would be ready by 10 January. Countryside Properties said the ducting installed at Kingsmere was based on a design agreed with BT in 2010, based on a copper network. 
A spokesman said: ""It is then BT/Openreach's decision as to whether they would run copper or fibre through the ducting."" Communications Minister Ed Vaizey said: ""You wouldn't move into a brand new house in 2014/2015 and not expect to get superfast broadband. It is unacceptable.""",30585271,"[0, 28806, 9484, 9706, 11, 5414, 25416, 6, 15, 5, 3543, 9, 163, 40755, 6, 7154, 11514, 132, 36030, 6, 8, 103, 1611, 1395, 120, 10, 1212, 1902, 4, 10073, 9, 5, 3675, 1611, 33, 342, 62, 15736, 2892, 801, 19298, 9, 5, 696, 4, 31285, 5093, 3730, 13094, 26, 24, 56, 1286, 29259, 154, 13, 5, 19185, 53, 24, 21, 62, 7, 12482, 99, 439, 11, 106, 4, 12482, 26, 24, 1850, 3565, 5, 1437, 2537, 29254, 1898, 6, 151, 701, 9, 1976, 21060, 12, 19693, 636, 11451, 4, 20, 709, 23, 5414, 25416, 16, 233, 9, 5, 508, ...]","[0, 35129, 11, 65, 9, 5, 168, 18, 92, 5671, 1947, 32, 2892, 801, 184, 4859, 89, 16, 117, 1769, 11451, 15, 49, 709, 4, 2]",Residents in one of the government's new garden cities are warning potential home buyers there is no fast broadband on their development.
3,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","Mr Trump said Mr Obama had learned well before the 8 November poll about the accusations and ""did nothing"". His comments followed an article in the Washington Post which said that Mr Obama learned last August of President Vladimir Putin's ""direct involvement"". The alleged meddling is the subject of high-level investigations in the US. President Putin has repeatedly denied any Russian interference into the presidential election. The Washington Post article says Mr Obama was told early last August by sources deep within the Russian government that Mr Putin was directly involved in a cyber campaign to disrupt the election, injure Hillary Clinton and aid a Trump victory. The Post said Mr Obama secretly debated dozens of options to punish Russia but in the end settled on what it called symbolic measures - the expulsion of 35 diplomats and closure of two Russian compounds. They came in late December, well after the election. The Post reported that Mr Obama was concerned he might himself be seen as trying to manipulate the election. The paper quoted a former administration official as saying: ""From national security people there was a sense of immediate introspection, of, 'Wow, did we mishandle this'."" Measures Mr Obama had considered but which were not put into action included planting cyber weapons in the Russian infrastructure and releasing information personally damaging to Mr Putin. Imagine, for a moment, that you're Barack Obama in August 2016. You've just been informed by the CIA that Vladimir Putin has ordered a wide-ranging effort to disrupt the US presidential election. What do you do? Mr Obama responded in typical fashion - cautiously. 
He alerted state officials, warned Russia and attempted (unsuccessfully) to fashion a bipartisan response with Republicans in Congress. Now the second-guessing has begun. Some Democrats are saying the Obama team should have gone public with such a startling discovery before election day. The president feared such a move would prompt the Republican nominee to accuse him of meddling and undermine faith in the electoral process. He believed Mrs Clinton was going to win anyway, so it was best not to rock the boat. Mr Trump himself is now questioning why Mr Obama didn't do more - a curious position given that he recently described the Russia hacking story as a Democratic ""hoax"". These latest revelations add yet another wrinkle to a 2016 campaign that will be hashed and rehashed for the foreseeable future. The most pressing question now, however, is not what Mr Obama did. It's what the US government does next. Mr Trump tweeted on Friday: ""The Obama Administration knew far in advance of November 8th about election meddling by Russia. Did nothing about it. WHY?"" He followed that up with two more tweets on Saturday, the second saying: ""Obama Administration official said they ""choked"" when it came to acting on Russian meddling of election. They didn't want to hurt Hillary?"" He repeats the argument in an interview with Fox News, which will air on Sunday. ""If he had the information, why didn't he do something about it? He should have done something about it. But you don't read that. It's quite sad."" Allegations of collusion between the Trump team and Russian officials during the election have dogged the president's first five months in office. He has repeatedly denied the allegations, calling the investigations a ""witch hunt"". US investigators are looking into whether Russian cyber hackers targeted US electoral systems to help Mr Trump win. 
US media say special counsel Robert Mueller is also investigating Mr Trump for possible obstruction of justice over the Russia inquiries. They involve the president's firing of FBI chief James Comey, who led one of the inquiries, and Mr Trump's alleged attempt to end a probe into sacked national security adviser Michael Flynn.",40395433,"[0, 10980, 140, 26, 427, 1284, 56, 2435, 157, 137, 5, 290, 759, 2902, 59, 5, 6124, 8, 22, 24001, 1085, 845, 832, 1450, 1432, 41, 1566, 11, 5, 663, 1869, 61, 26, 14, 427, 1284, 2435, 94, 830, 9, 270, 6546, 3176, 18, 22, 27555, 5292, 845, 20, 1697, 13683, 16, 5, 2087, 9, 239, 12, 4483, 4941, 11, 5, 382, 4, 270, 3176, 34, 3987, 2296, 143, 1083, 8149, 88, 5, 1939, 729, 4, 20, 663, 1869, 1566, 161, 427, 1284, 21, 174, 419, 94, 830, 30, 1715, 1844, 624, 5, 1083, 168, 14, 427, 3176, 21, 2024, ...]","[0, 6517, 807, 140, 34, 1238, 39, 9933, 4282, 1284, 9, 30365, 81, 1697, 1083, 8149, 11, 5, 382, 729, 11, 336, 4, 2]",President Donald Trump has accused his predecessor Barack Obama of inaction over alleged Russian interference in the US election in 2016.
4,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","James, 29, who presents the drive-time show, studied drama and had a number of shows on the students' union radio station before rising to fame. The DJ, who is covering Radio 1's Big Weekend in Norwich this bank holiday, said he was ""completely honoured"". Father Ted co-writer Graham Linehan and broadcaster Dame Jenni Murray are also receiving degrees. James said: ""My affair with UEA continues - it really is wonderful and I'm completely honoured to be given this. ""I spent a very happy three years there and it's the place where I fell in love with drama and radio, so it's very special to me. ""As soon as the ceremony is over, I'll set about officially changing my name on all credit cards, mortgage documents and, most importantly, the Radio 1 website."" James and Linehan will receive an Honorary Doctorate of Letters at a ceremony in Norwich in July, with BBC Radio 4's Woman's Hour presenter Murray being given an Honorary Doctorate of Civil Law. 
Other names being recognised for their distinguished careers include comedian Arthur Smith, author Erica Wagner, actor Samuel West and the Bishop of Norwich, the Rt Rev Graham James.",32837393,"[0, 18031, 6, 1132, 6, 54, 6822, 5, 1305, 12, 958, 311, 6, 8069, 4149, 8, 56, 10, 346, 9, 924, 15, 5, 521, 108, 2918, 3188, 1992, 137, 2227, 7, 9444, 4, 20, 7766, 6, 54, 16, 4631, 4611, 112, 18, 1776, 16520, 11, 18749, 42, 827, 2317, 6, 26, 37, 21, 22, 28655, 16962, 845, 9510, 8115, 1029, 12, 9408, 4572, 5562, 4134, 8, 10901, 9038, 23710, 118, 4479, 32, 67, 2806, 4176, 4, 957, 26, 35, 22, 2387, 7226, 19, 121, 14684, 1388, 111, 24, 269, 16, 4613, 8, 38, 437, 2198, 16962, 7, 28, 576, 42, ...]","[0, 28713, 4611, 112, 7766, 4275, 957, 16, 7, 1325, 41, 23536, 3299, 877, 31, 5, 589, 9, 953, 7413, 14190, 36, 9162, 250, 322, 2]",BBC Radio 1 DJ Greg James is to receive an honorary doctorate from the University of East Anglia (UEA).


Unnamed: 0,attention_mask,document,id,input_ids,labels,summary
0,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","Media playback is not supported on this device How did their players rate in the biggest match in the history of Welsh football? Coped well with Portugal's early flurry of crosses but was powerless to deny Cristiano Ronaldo for Portugal's opening goal. The Crystal Palace player was unfortunate to be wrong-footed by Nani's deflection for the second goal. Watchful against the threat of Ronaldo cutting inside from Portugal's left, the Reading defender was kept busy by the likes of Renato Sanches and Nani and struggled to influence the game in attack. Switched to the left side of Wales' three centre-backs in Ben Davies' absence, the West Brom man timed his advances well to make interceptions. Beaten by Ronaldo for Portugal's opening goal but won a team-high eight aerial duels. Made some strong early challenges, particularly on Cristiano Ronaldo, to assert his authority on the game. The Swansea City skipper led by example, winning 100% of his 50-50 contests with Portugal players. Making his first international start since March 2015, he competed well in the air as Portugal sought to make the most of Ronaldo's aerial prowess. Wales might have missed Ben Davies' distribution but his replacement was solid defensively. Like Gunter, kept on the back foot by Portugal's attacking players. Pushed forward but, when he got into promising positions, struggled to provide quality crosses. An early booking for a foul on Nani made his job of protecting Wales' defence difficult, but still the Liverpool midfielder buzzed around with intent. Typically sound in possession but not as influential as he has been earlier in the tournament. 
Showed imagination with a low corner which led to a chance for Gareth Bale but had only limited influence in open play before being replaced by Sam Vokes shortly after Portugal's second goal. Made some characteristic runs into the Portugal penalty area but could not make the crucial connections. Forced deeper as Portugal's midfield gained control in the second half, the Leicester Premier League winner had to curb his attacking instincts. Trademark runs from deep and at a startling pace had Portugal's defenders backtracking in the first half but his influence waned in the second period. The Real Madrid forward's audacious long-range shot was Wales' last effort. Brimming with confidence following his stunning goal in the quarter-final win over Belgium, the free agent stretched Portugal's defence with his powerful running. He was starved of the ball in the second half, however, as the match wore on. Brought on shortly after Wales fell 2-0 behind, the Burnley striker failed to connect meaningfully with any of the crosses which came his way. Did not see much of the ball and, when he did, was not in a position to cause Portugal any problems. 
Tried making his usual probing runs between the opponents' midfield and defence but found himself crowded out.",36730443,"[0, 18801, 20083, 16, 45, 2800, 15, 42, 2187, 1336, 222, 49, 472, 731, 11, 5, 934, 914, 11, 5, 750, 9, 12093, 1037, 116, 9351, 196, 157, 19, 8062, 18, 419, 18996, 9, 20238, 53, 21, 33128, 7, 7631, 8767, 5472, 7991, 13, 8062, 18, 1273, 724, 4, 20, 9793, 5928, 869, 21, 9327, 7, 28, 1593, 12, 26620, 30, 234, 1543, 18, 3816, 20576, 13, 5, 200, 724, 4, 3075, 2650, 136, 5, 1856, 9, 7991, 3931, 1025, 31, 8062, 18, 314, 6, 5, 4913, 5142, 21, 1682, 3610, 30, 5, 3829, 9, 6340, 3938, 208, 23833, 8, ...]","[0, 771, 4575, 108, 24666, 5122, 336, 422, 376, 7, 41, 253, 65, 177, 137, 5, 507, 25, 51, 685, 132, 12, 288, 7, 8062, 11, 5, 94, 237, 4, 2]",Wales' heroic Euro 2016 run came to an end one game before the final as they lost 2-0 to Portugal in the last four.
1,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","The Â£42m MV Loch Seaforth made its first passenger sailing last month but is still in a ""test period"" and not fully in service. Thursday's fault took five hours to fix and the ferry was cleared again for sailings. Another ferry, the Isle of Lewis, took the passengers involved. Bad weather has led the cancellations of Friday sailings on the Stornoway-Ullapool route and other services on Scotland's west coast. Ferry operator Caledonian MacBrayne said withdrawing the Loch Seaforth had been an operational decision and the fault would not have prevented the ship from sailing. A spokesman said: ""Yesterday evening an issue arose with an engine room ventilation fan which required attention and an operational decision was taken to remove her from the route while it was fixed. 
""While passengers were delayed, and we regret any inconvenience to them, no-one was stranded."" A spokesman added: ""This was not a major issue but it required around five hours of work as the fan was in a difficult to reach location.""",31759248,"[0, 133, 1437, 2537, 29254, 3714, 119, 28830, 26384, 27411, 22494, 156, 63, 78, 4408, 17664, 94, 353, 53, 16, 202, 11, 10, 22, 21959, 675, 113, 8, 45, 1950, 11, 544, 4, 296, 18, 7684, 362, 292, 722, 7, 4190, 8, 5, 15169, 21, 6049, 456, 13, 18840, 1033, 4, 2044, 15169, 6, 5, 18930, 9, 3577, 6, 362, 5, 3670, 963, 4, 5654, 1650, 34, 669, 5, 24068, 1635, 9, 273, 18840, 1033, 15, 5, 312, 4244, 20574, 12, 791, 890, 1115, 8110, 3420, 8, 97, 518, 15, 3430, 18, 3072, 3673, 4, 25820, 5364, 2912, 196, 26399, ...]","[0, 250, 92, 15169, 1490, 13, 5, 312, 4244, 20574, 7, 121, 890, 1115, 8110, 3420, 21, 8059, 13375, 31, 4408, 5941, 142, 9, 10, 23624, 30911, 2378, 4, 2]",A new ferry built for the Stornoway to Ullapool route was temporarily withdrawn from passenger duties because of a faulty ventilation fan.
2,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","Broadband speeds in Kingsmere, on the edge of Bicester, rarely exceed 2Mbps, and some homes cannot get a landline. Residents of the 400 homes have put up posters warning potential newcomers of the issue. Developer Countryside Properties said it had provided ducting for the cables but it was up to BT what went in them. BT said it proposed sharing the Â£45,000 cost of providing fibre-optic broadband. The development at Kingsmere is part of the 13,000 homes planned by the government to turn Bicester into one of a new generation of garden cities, announced earlier this month. Resident Matt Maunder said: ""I'm a home worker, and I need good broadband to do my job. We've actually got residents who moved here in August who still don't have a phone line - that's just unacceptable. ""I can't carry out my job effectively, I can't take advantage of services like Skype, my family live abroad so I can't get in touch with them as easily as I would like. ""Unfortunately we have got people now saying they wish they hadn't moved here because of the way the service is and that's a real shame, particularly because it's been lauded as the latest and greatest housing development in the country."" BT said it had reached an agreement for 726 additional homes yet to be built and a proposal for the existing houses would be ready by 10 January. Countryside Properties said the ducting installed at Kingsmere was based on a design agreed with BT in 2010, based on a copper network. 
A spokesman said: ""It is then BT/Openreach's decision as to whether they would run copper or fibre through the ducting."" Communications Minister Ed Vaizey said: ""You wouldn't move into a brand new house in 2014/2015 and not expect to get superfast broadband. It is unacceptable.""",30585271,"[0, 28806, 9484, 9706, 11, 5414, 25416, 6, 15, 5, 3543, 9, 163, 40755, 6, 7154, 11514, 132, 36030, 6, 8, 103, 1611, 1395, 120, 10, 1212, 1902, 4, 10073, 9, 5, 3675, 1611, 33, 342, 62, 15736, 2892, 801, 19298, 9, 5, 696, 4, 31285, 5093, 3730, 13094, 26, 24, 56, 1286, 29259, 154, 13, 5, 19185, 53, 24, 21, 62, 7, 12482, 99, 439, 11, 106, 4, 12482, 26, 24, 1850, 3565, 5, 1437, 2537, 29254, 1898, 6, 151, 701, 9, 1976, 21060, 12, 19693, 636, 11451, 4, 20, 709, 23, 5414, 25416, 16, 233, 9, 5, 508, ...]","[0, 35129, 11, 65, 9, 5, 168, 18, 92, 5671, 1947, 32, 2892, 801, 184, 4859, 89, 16, 117, 1769, 11451, 15, 49, 709, 4, 2]",Residents in one of the government's new garden cities are warning potential home buyers there is no fast broadband on their development.
3,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","Mr Trump said Mr Obama had learned well before the 8 November poll about the accusations and ""did nothing"". His comments followed an article in the Washington Post which said that Mr Obama learned last August of President Vladimir Putin's ""direct involvement"". The alleged meddling is the subject of high-level investigations in the US. President Putin has repeatedly denied any Russian interference into the presidential election. The Washington Post article says Mr Obama was told early last August by sources deep within the Russian government that Mr Putin was directly involved in a cyber campaign to disrupt the election, injure Hillary Clinton and aid a Trump victory. The Post said Mr Obama secretly debated dozens of options to punish Russia but in the end settled on what it called symbolic measures - the expulsion of 35 diplomats and closure of two Russian compounds. They came in late December, well after the election. The Post reported that Mr Obama was concerned he might himself be seen as trying to manipulate the election. The paper quoted a former administration official as saying: ""From national security people there was a sense of immediate introspection, of, 'Wow, did we mishandle this'."" Measures Mr Obama had considered but which were not put into action included planting cyber weapons in the Russian infrastructure and releasing information personally damaging to Mr Putin. Imagine, for a moment, that you're Barack Obama in August 2016. You've just been informed by the CIA that Vladimir Putin has ordered a wide-ranging effort to disrupt the US presidential election. What do you do? Mr Obama responded in typical fashion - cautiously. 
He alerted state officials, warned Russia and attempted (unsuccessfully) to fashion a bipartisan response with Republicans in Congress. Now the second-guessing has begun. Some Democrats are saying the Obama team should have gone public with such a startling discovery before election day. The president feared such a move would prompt the Republican nominee to accuse him of meddling and undermine faith in the electoral process. He believed Mrs Clinton was going to win anyway, so it was best not to rock the boat. Mr Trump himself is now questioning why Mr Obama didn't do more - a curious position given that he recently described the Russia hacking story as a Democratic ""hoax"". These latest revelations add yet another wrinkle to a 2016 campaign that will be hashed and rehashed for the foreseeable future. The most pressing question now, however, is not what Mr Obama did. It's what the US government does next. Mr Trump tweeted on Friday: ""The Obama Administration knew far in advance of November 8th about election meddling by Russia. Did nothing about it. WHY?"" He followed that up with two more tweets on Saturday, the second saying: ""Obama Administration official said they ""choked"" when it came to acting on Russian meddling of election. They didn't want to hurt Hillary?"" He repeats the argument in an interview with Fox News, which will air on Sunday. ""If he had the information, why didn't he do something about it? He should have done something about it. But you don't read that. It's quite sad."" Allegations of collusion between the Trump team and Russian officials during the election have dogged the president's first five months in office. He has repeatedly denied the allegations, calling the investigations a ""witch hunt"". US investigators are looking into whether Russian cyber hackers targeted US electoral systems to help Mr Trump win. 
US media say special counsel Robert Mueller is also investigating Mr Trump for possible obstruction of justice over the Russia inquiries. They involve the president's firing of FBI chief James Comey, who led one of the inquiries, and Mr Trump's alleged attempt to end a probe into sacked national security adviser Michael Flynn.",40395433,"[0, 10980, 140, 26, 427, 1284, 56, 2435, 157, 137, 5, 290, 759, 2902, 59, 5, 6124, 8, 22, 24001, 1085, 845, 832, 1450, 1432, 41, 1566, 11, 5, 663, 1869, 61, 26, 14, 427, 1284, 2435, 94, 830, 9, 270, 6546, 3176, 18, 22, 27555, 5292, 845, 20, 1697, 13683, 16, 5, 2087, 9, 239, 12, 4483, 4941, 11, 5, 382, 4, 270, 3176, 34, 3987, 2296, 143, 1083, 8149, 88, 5, 1939, 729, 4, 20, 663, 1869, 1566, 161, 427, 1284, 21, 174, 419, 94, 830, 30, 1715, 1844, 624, 5, 1083, 168, 14, 427, 3176, 21, 2024, ...]","[0, 6517, 807, 140, 34, 1238, 39, 9933, 4282, 1284, 9, 30365, 81, 1697, 1083, 8149, 11, 5, 382, 729, 11, 336, 4, 2]",President Donald Trump has accused his predecessor Barack Obama of inaction over alleged Russian interference in the US election in 2016.
4,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","James, 29, who presents the drive-time show, studied drama and had a number of shows on the students' union radio station before rising to fame. The DJ, who is covering Radio 1's Big Weekend in Norwich this bank holiday, said he was ""completely honoured"". Father Ted co-writer Graham Linehan and broadcaster Dame Jenni Murray are also receiving degrees. James said: ""My affair with UEA continues - it really is wonderful and I'm completely honoured to be given this. ""I spent a very happy three years there and it's the place where I fell in love with drama and radio, so it's very special to me. ""As soon as the ceremony is over, I'll set about officially changing my name on all credit cards, mortgage documents and, most importantly, the Radio 1 website."" James and Linehan will receive an Honorary Doctorate of Letters at a ceremony in Norwich in July, with BBC Radio 4's Woman's Hour presenter Murray being given an Honorary Doctorate of Civil Law. 
Other names being recognised for their distinguished careers include comedian Arthur Smith, author Erica Wagner, actor Samuel West and the Bishop of Norwich, the Rt Rev Graham James.",32837393,"[0, 18031, 6, 1132, 6, 54, 6822, 5, 1305, 12, 958, 311, 6, 8069, 4149, 8, 56, 10, 346, 9, 924, 15, 5, 521, 108, 2918, 3188, 1992, 137, 2227, 7, 9444, 4, 20, 7766, 6, 54, 16, 4631, 4611, 112, 18, 1776, 16520, 11, 18749, 42, 827, 2317, 6, 26, 37, 21, 22, 28655, 16962, 845, 9510, 8115, 1029, 12, 9408, 4572, 5562, 4134, 8, 10901, 9038, 23710, 118, 4479, 32, 67, 2806, 4176, 4, 957, 26, 35, 22, 2387, 7226, 19, 121, 14684, 1388, 111, 24, 269, 16, 4613, 8, 38, 437, 2198, 16962, 7, 28, 576, 42, ...]","[0, 28713, 4611, 112, 7766, 4275, 957, 16, 7, 1325, 41, 23536, 3299, 877, 31, 5, 589, 9, 953, 7413, 14190, 36, 9162, 250, 322, 2]",BBC Radio 1 DJ Greg James is to receive an honorary doctorate from the University of East Anglia (UEA).


Unnamed: 0,attention_mask,document,id,input_ids,labels,summary
0,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","Media playback is not supported on this device How did their players rate in the biggest match in the history of Welsh football? Coped well with Portugal's early flurry of crosses but was powerless to deny Cristiano Ronaldo for Portugal's opening goal. The Crystal Palace player was unfortunate to be wrong-footed by Nani's deflection for the second goal. Watchful against the threat of Ronaldo cutting inside from Portugal's left, the Reading defender was kept busy by the likes of Renato Sanches and Nani and struggled to influence the game in attack. Switched to the left side of Wales' three centre-backs in Ben Davies' absence, the West Brom man timed his advances well to make interceptions. Beaten by Ronaldo for Portugal's opening goal but won a team-high eight aerial duels. Made some strong early challenges, particularly on Cristiano Ronaldo, to assert his authority on the game. The Swansea City skipper led by example, winning 100% of his 50-50 contests with Portugal players. Making his first international start since March 2015, he competed well in the air as Portugal sought to make the most of Ronaldo's aerial prowess. Wales might have missed Ben Davies' distribution but his replacement was solid defensively. Like Gunter, kept on the back foot by Portugal's attacking players. Pushed forward but, when he got into promising positions, struggled to provide quality crosses. An early booking for a foul on Nani made his job of protecting Wales' defence difficult, but still the Liverpool midfielder buzzed around with intent. Typically sound in possession but not as influential as he has been earlier in the tournament. 
Showed imagination with a low corner which led to a chance for Gareth Bale but had only limited influence in open play before being replaced by Sam Vokes shortly after Portugal's second goal. Made some characteristic runs into the Portugal penalty area but could not make the crucial connections. Forced deeper as Portugal's midfield gained control in the second half, the Leicester Premier League winner had to curb his attacking instincts. Trademark runs from deep and at a startling pace had Portugal's defenders backtracking in the first half but his influence waned in the second period. The Real Madrid forward's audacious long-range shot was Wales' last effort. Brimming with confidence following his stunning goal in the quarter-final win over Belgium, the free agent stretched Portugal's defence with his powerful running. He was starved of the ball in the second half, however, as the match wore on. Brought on shortly after Wales fell 2-0 behind, the Burnley striker failed to connect meaningfully with any of the crosses which came his way. Did not see much of the ball and, when he did, was not in a position to cause Portugal any problems. 
Tried making his usual probing runs between the opponents' midfield and defence but found himself crowded out.",36730443,"[0, 18801, 20083, 16, 45, 2800, 15, 42, 2187, 1336, 222, 49, 472, 731, 11, 5, 934, 914, 11, 5, 750, 9, 12093, 1037, 116, 9351, 196, 157, 19, 8062, 18, 419, 18996, 9, 20238, 53, 21, 33128, 7, 7631, 8767, 5472, 7991, 13, 8062, 18, 1273, 724, 4, 20, 9793, 5928, 869, 21, 9327, 7, 28, 1593, 12, 26620, 30, 234, 1543, 18, 3816, 20576, 13, 5, 200, 724, 4, 3075, 2650, 136, 5, 1856, 9, 7991, 3931, 1025, 31, 8062, 18, 314, 6, 5, 4913, 5142, 21, 1682, 3610, 30, 5, 3829, 9, 6340, 3938, 208, 23833, 8, ...]","[0, 771, 4575, 108, 24666, 5122, 336, 422, 376, 7, 41, 253, 65, 177, 137, 5, 507, 25, 51, 685, 132, 12, 288, 7, 8062, 11, 5, 94, 237, 4, 2]",Wales' heroic Euro 2016 run came to an end one game before the final as they lost 2-0 to Portugal in the last four.
1,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","The Â£42m MV Loch Seaforth made its first passenger sailing last month but is still in a ""test period"" and not fully in service. Thursday's fault took five hours to fix and the ferry was cleared again for sailings. Another ferry, the Isle of Lewis, took the passengers involved. Bad weather has led the cancellations of Friday sailings on the Stornoway-Ullapool route and other services on Scotland's west coast. Ferry operator Caledonian MacBrayne said withdrawing the Loch Seaforth had been an operational decision and the fault would not have prevented the ship from sailing. A spokesman said: ""Yesterday evening an issue arose with an engine room ventilation fan which required attention and an operational decision was taken to remove her from the route while it was fixed. 
""While passengers were delayed, and we regret any inconvenience to them, no-one was stranded."" A spokesman added: ""This was not a major issue but it required around five hours of work as the fan was in a difficult to reach location.""",31759248,"[0, 133, 1437, 2537, 29254, 3714, 119, 28830, 26384, 27411, 22494, 156, 63, 78, 4408, 17664, 94, 353, 53, 16, 202, 11, 10, 22, 21959, 675, 113, 8, 45, 1950, 11, 544, 4, 296, 18, 7684, 362, 292, 722, 7, 4190, 8, 5, 15169, 21, 6049, 456, 13, 18840, 1033, 4, 2044, 15169, 6, 5, 18930, 9, 3577, 6, 362, 5, 3670, 963, 4, 5654, 1650, 34, 669, 5, 24068, 1635, 9, 273, 18840, 1033, 15, 5, 312, 4244, 20574, 12, 791, 890, 1115, 8110, 3420, 8, 97, 518, 15, 3430, 18, 3072, 3673, 4, 25820, 5364, 2912, 196, 26399, ...]","[0, 250, 92, 15169, 1490, 13, 5, 312, 4244, 20574, 7, 121, 890, 1115, 8110, 3420, 21, 8059, 13375, 31, 4408, 5941, 142, 9, 10, 23624, 30911, 2378, 4, 2]",A new ferry built for the Stornoway to Ullapool route was temporarily withdrawn from passenger duties because of a faulty ventilation fan.
2,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","Broadband speeds in Kingsmere, on the edge of Bicester, rarely exceed 2Mbps, and some homes cannot get a landline. Residents of the 400 homes have put up posters warning potential newcomers of the issue. Developer Countryside Properties said it had provided ducting for the cables but it was up to BT what went in them. BT said it proposed sharing the Â£45,000 cost of providing fibre-optic broadband. The development at Kingsmere is part of the 13,000 homes planned by the government to turn Bicester into one of a new generation of garden cities, announced earlier this month. Resident Matt Maunder said: ""I'm a home worker, and I need good broadband to do my job. We've actually got residents who moved here in August who still don't have a phone line - that's just unacceptable. ""I can't carry out my job effectively, I can't take advantage of services like Skype, my family live abroad so I can't get in touch with them as easily as I would like. ""Unfortunately we have got people now saying they wish they hadn't moved here because of the way the service is and that's a real shame, particularly because it's been lauded as the latest and greatest housing development in the country."" BT said it had reached an agreement for 726 additional homes yet to be built and a proposal for the existing houses would be ready by 10 January. Countryside Properties said the ducting installed at Kingsmere was based on a design agreed with BT in 2010, based on a copper network. 
A spokesman said: ""It is then BT/Openreach's decision as to whether they would run copper or fibre through the ducting."" Communications Minister Ed Vaizey said: ""You wouldn't move into a brand new house in 2014/2015 and not expect to get superfast broadband. It is unacceptable.""",30585271,"[0, 28806, 9484, 9706, 11, 5414, 25416, 6, 15, 5, 3543, 9, 163, 40755, 6, 7154, 11514, 132, 36030, 6, 8, 103, 1611, 1395, 120, 10, 1212, 1902, 4, 10073, 9, 5, 3675, 1611, 33, 342, 62, 15736, 2892, 801, 19298, 9, 5, 696, 4, 31285, 5093, 3730, 13094, 26, 24, 56, 1286, 29259, 154, 13, 5, 19185, 53, 24, 21, 62, 7, 12482, 99, 439, 11, 106, 4, 12482, 26, 24, 1850, 3565, 5, 1437, 2537, 29254, 1898, 6, 151, 701, 9, 1976, 21060, 12, 19693, 636, 11451, 4, 20, 709, 23, 5414, 25416, 16, 233, 9, 5, 508, ...]","[0, 35129, 11, 65, 9, 5, 168, 18, 92, 5671, 1947, 32, 2892, 801, 184, 4859, 89, 16, 117, 1769, 11451, 15, 49, 709, 4, 2]",Residents in one of the government's new garden cities are warning potential home buyers there is no fast broadband on their development.
3,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","Mr Trump said Mr Obama had learned well before the 8 November poll about the accusations and ""did nothing"". His comments followed an article in the Washington Post which said that Mr Obama learned last August of President Vladimir Putin's ""direct involvement"". The alleged meddling is the subject of high-level investigations in the US. President Putin has repeatedly denied any Russian interference into the presidential election. The Washington Post article says Mr Obama was told early last August by sources deep within the Russian government that Mr Putin was directly involved in a cyber campaign to disrupt the election, injure Hillary Clinton and aid a Trump victory. The Post said Mr Obama secretly debated dozens of options to punish Russia but in the end settled on what it called symbolic measures - the expulsion of 35 diplomats and closure of two Russian compounds. They came in late December, well after the election. The Post reported that Mr Obama was concerned he might himself be seen as trying to manipulate the election. The paper quoted a former administration official as saying: ""From national security people there was a sense of immediate introspection, of, 'Wow, did we mishandle this'."" Measures Mr Obama had considered but which were not put into action included planting cyber weapons in the Russian infrastructure and releasing information personally damaging to Mr Putin. Imagine, for a moment, that you're Barack Obama in August 2016. You've just been informed by the CIA that Vladimir Putin has ordered a wide-ranging effort to disrupt the US presidential election. What do you do? Mr Obama responded in typical fashion - cautiously. 
He alerted state officials, warned Russia and attempted (unsuccessfully) to fashion a bipartisan response with Republicans in Congress. Now the second-guessing has begun. Some Democrats are saying the Obama team should have gone public with such a startling discovery before election day. The president feared such a move would prompt the Republican nominee to accuse him of meddling and undermine faith in the electoral process. He believed Mrs Clinton was going to win anyway, so it was best not to rock the boat. Mr Trump himself is now questioning why Mr Obama didn't do more - a curious position given that he recently described the Russia hacking story as a Democratic ""hoax"". These latest revelations add yet another wrinkle to a 2016 campaign that will be hashed and rehashed for the foreseeable future. The most pressing question now, however, is not what Mr Obama did. It's what the US government does next. Mr Trump tweeted on Friday: ""The Obama Administration knew far in advance of November 8th about election meddling by Russia. Did nothing about it. WHY?"" He followed that up with two more tweets on Saturday, the second saying: ""Obama Administration official said they ""choked"" when it came to acting on Russian meddling of election. They didn't want to hurt Hillary?"" He repeats the argument in an interview with Fox News, which will air on Sunday. ""If he had the information, why didn't he do something about it? He should have done something about it. But you don't read that. It's quite sad."" Allegations of collusion between the Trump team and Russian officials during the election have dogged the president's first five months in office. He has repeatedly denied the allegations, calling the investigations a ""witch hunt"". US investigators are looking into whether Russian cyber hackers targeted US electoral systems to help Mr Trump win. 
US media say special counsel Robert Mueller is also investigating Mr Trump for possible obstruction of justice over the Russia inquiries. They involve the president's firing of FBI chief James Comey, who led one of the inquiries, and Mr Trump's alleged attempt to end a probe into sacked national security adviser Michael Flynn.",40395433,"[0, 10980, 140, 26, 427, 1284, 56, 2435, 157, 137, 5, 290, 759, 2902, 59, 5, 6124, 8, 22, 24001, 1085, 845, 832, 1450, 1432, 41, 1566, 11, 5, 663, 1869, 61, 26, 14, 427, 1284, 2435, 94, 830, 9, 270, 6546, 3176, 18, 22, 27555, 5292, 845, 20, 1697, 13683, 16, 5, 2087, 9, 239, 12, 4483, 4941, 11, 5, 382, 4, 270, 3176, 34, 3987, 2296, 143, 1083, 8149, 88, 5, 1939, 729, 4, 20, 663, 1869, 1566, 161, 427, 1284, 21, 174, 419, 94, 830, 30, 1715, 1844, 624, 5, 1083, 168, 14, 427, 3176, 21, 2024, ...]","[0, 6517, 807, 140, 34, 1238, 39, 9933, 4282, 1284, 9, 30365, 81, 1697, 1083, 8149, 11, 5, 382, 729, 11, 336, 4, 2]",President Donald Trump has accused his predecessor Barack Obama of inaction over alleged Russian interference in the US election in 2016.
4,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]","James, 29, who presents the drive-time show, studied drama and had a number of shows on the students' union radio station before rising to fame. The DJ, who is covering Radio 1's Big Weekend in Norwich this bank holiday, said he was ""completely honoured"". Father Ted co-writer Graham Linehan and broadcaster Dame Jenni Murray are also receiving degrees. James said: ""My affair with UEA continues - it really is wonderful and I'm completely honoured to be given this. ""I spent a very happy three years there and it's the place where I fell in love with drama and radio, so it's very special to me. ""As soon as the ceremony is over, I'll set about officially changing my name on all credit cards, mortgage documents and, most importantly, the Radio 1 website."" James and Linehan will receive an Honorary Doctorate of Letters at a ceremony in Norwich in July, with BBC Radio 4's Woman's Hour presenter Murray being given an Honorary Doctorate of Civil Law. 
Other names being recognised for their distinguished careers include comedian Arthur Smith, author Erica Wagner, actor Samuel West and the Bishop of Norwich, the Rt Rev Graham James.",32837393,"[0, 18031, 6, 1132, 6, 54, 6822, 5, 1305, 12, 958, 311, 6, 8069, 4149, 8, 56, 10, 346, 9, 924, 15, 5, 521, 108, 2918, 3188, 1992, 137, 2227, 7, 9444, 4, 20, 7766, 6, 54, 16, 4631, 4611, 112, 18, 1776, 16520, 11, 18749, 42, 827, 2317, 6, 26, 37, 21, 22, 28655, 16962, 845, 9510, 8115, 1029, 12, 9408, 4572, 5562, 4134, 8, 10901, 9038, 23710, 118, 4479, 32, 67, 2806, 4176, 4, 957, 26, 35, 22, 2387, 7226, 19, 121, 14684, 1388, 111, 24, 269, 16, 4613, 8, 38, 437, 2198, 16962, 7, 28, 576, 42, ...]","[0, 28713, 4611, 112, 7766, 4275, 957, 16, 7, 1325, 41, 23536, 3299, 877, 31, 5, 589, 9, 953, 7413, 14190, 36, 9162, 250, 322, 2]",BBC Radio 1 DJ Greg James is to receive an honorary doctorate from the University of East Anglia (UEA).





In [55]:
tokenized_xsum['test'].features

{'attention_mask': Sequence(feature=Value(dtype='int8', id=None), length=-1, id=None),
 'document': Value(dtype='string', id=None),
 'id': Value(dtype='string', id=None),
 'input_ids': Sequence(feature=Value(dtype='int32', id=None), length=-1, id=None),
 'labels': Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None),
 'summary': Value(dtype='string', id=None)}

## Compare Machine Summaries to Professional Human Written Summaries
To score our machine-generated summaries against the professional human-written ones, we compute the cosine similarity between sentence embeddings, which measures the semantic similarity between two texts. We make three comparisons: human summary to machine summary, human summary to original document, and machine summary to original document.
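
As a quick illustration of the metric, the sketch below computes cosine similarity for two plain Python vectors. The helper name `cosine_similarity` and the toy vectors are our own illustrative choices and are not part of the notebook; the scores reported later in this section come from `util.pytorch_cos_sim` applied to sentence embeddings.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Toy vectors standing in for embeddings (made-up numbers).
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

The score depends only on the direction of the two vectors, not on their magnitude: vectors pointing the same way score close to 1, and orthogonal vectors score 0.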

### We focus on 10 articles and run the summarizer on each one (Models 1-10) to inspect each pair individually

In [56]:
def listToString(s):
    # Concatenate the elements of s (a list of strings, or a single string) into one string.
    return "".join(s)

In [57]:
article1 = tokenized_xsum['test']['document'][0]
article2 = tokenized_xsum['test']['document'][1]
article3 = tokenized_xsum['test']['document'][2]
article4 = tokenized_xsum['test']['document'][3]
article5 = tokenized_xsum['test']['document'][4]
article6 = tokenized_xsum['test']['document'][5]
# Note: test index 6 is not used, so articles 7-10 come from indices 7-10.
article7 = tokenized_xsum['test']['document'][7]
article8 = tokenized_xsum['test']['document'][8]
article9 = tokenized_xsum['test']['document'][9]
article10 = tokenized_xsum['test']['document'][10]

summary1 = tokenized_xsum['test']['summary'][0]
summary2 = tokenized_xsum['test']['summary'][1]
summary3 = tokenized_xsum['test']['summary'][2]
summary4 = tokenized_xsum['test']['summary'][3]
summary5 = tokenized_xsum['test']['summary'][4]
summary6 = tokenized_xsum['test']['summary'][5]
summary7 = tokenized_xsum['test']['summary'][7]
summary8 = tokenized_xsum['test']['summary'][8]
summary9 = tokenized_xsum['test']['summary'][9]
summary10 = tokenized_xsum['test']['summary'][10]
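
The twenty assignments above can also be written as a loop. The sketch below is one possible alternative, not the notebook's code: `select_pairs` is a hypothetical helper, and `toy_split` is a small dict of lists standing in for `tokenized_xsum['test']` so the example runs without downloading the dataset. The index list mirrors the cell above, which uses indices 0-5 and 7-10.

```python
def select_pairs(split, indices):
    """Return (document, summary) tuples for the given row indices.

    `split` is assumed to support split['document'][i] and split['summary'][i],
    as tokenized_xsum['test'] does in the notebook.
    """
    documents = split['document']  # read each column once
    summaries = split['summary']
    return [(documents[i], summaries[i]) for i in indices]

# Stand-in data (made up) so the sketch is runnable on its own.
toy_split = {
    'document': [f"article text {i}" for i in range(11)],
    'summary': [f"summary text {i}" for i in range(11)],
}

# Same indices as the cell above: 0-5 and 7-10 (6 is not used).
INDICES = [0, 1, 2, 3, 4, 5, 7, 8, 9, 10]
pairs = select_pairs(toy_split, INDICES)
print(len(pairs), pairs[6])
```

Keeping the indices in one list makes the selection explicit and avoids numbering mismatches between the `article` and `summary` variables.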


## Model 1

In [58]:
input1 = tokenizer(article1, return_tensors='pt', truncation=True)
summary_ids1 = model.generate(input1['input_ids'], max_length=500, early_stopping=False)
machineSummary1 = ([tokenizer.decode(g, skip_special_tokens=True) for g in summary_ids1])

In [65]:
machineSummary1 = listToString(machineSummary1)
summary1 = listToString(summary1)
original1 = listToString(article1)

comparison1 = [summary1, machineSummary1, original1]
token_model = SentenceTransformer('distilbert-base-nli-mean-tokens')
comparison_embeddings1 = token_model.encode(comparison1)
print(util.pytorch_cos_sim(comparison_embeddings1[0], comparison_embeddings1[1])) # human summary to machine summary similarity
print(util.pytorch_cos_sim(comparison_embeddings1[0], comparison_embeddings1[2])) # human summary to original article
print(util.pytorch_cos_sim(comparison_embeddings1[1], comparison_embeddings1[2])) # machine summary to original article

tensor([[0.7415]])
tensor([[0.7645]])
tensor([[0.9807]])
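
Each Model cell repeats the same three comparisons. One way to factor that out is sketched below; it is an illustration rather than the notebook's code. `score_triplet` is a hypothetical helper that accepts any `embed` callable returning one vector per text. In the notebook that role would be played by `token_model.encode`; here a toy word-count embedding (`toy_embed`, with a made-up vocabulary and made-up texts) is used so the example runs without downloading a model, and its scores are not comparable to the ones printed above.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def score_triplet(embed, human, machine, original):
    """Return the three pairwise similarities used in this section."""
    h, m, o = embed([human, machine, original])
    return {
        'human_vs_machine': cosine(h, m),
        'human_vs_original': cosine(h, o),
        'machine_vs_original': cosine(m, o),
    }

# Toy embedding (assumption for illustration): count occurrences of a few words.
VOCAB = ['police', 'firearms', 'court', 'edinburgh', 'man']

def toy_embed(texts):
    return [[text.lower().split().count(word) for word in VOCAB] for text in texts]

scores = score_triplet(
    toy_embed,
    human="man in court after police seize firearms in edinburgh",
    machine="police recovered firearms and a man appeared in court",
    original="police recovered firearms and a man appeared in court in edinburgh on thursday",
)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Passing the embedding function as an argument keeps the scoring logic independent of the model, so a different embedding model can be swapped in without touching the comparison code.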


In [60]:
comparison1

['There is a "chronic" need for more housing for prison leavers in Wales, according to a charity.',
 'Prison Link Cymru had 1,099 referrals in 2015-16 and said some ex-offenders were living rough for up to a year before finding suitable accommodation. Workers at the charity claim investment in housing would be cheaper than jailing homeless repeat offenders. Welsh Government said more people than ever were getting help to address housing problems. Changes to the Housing Act in Wales, introduced in 2015, removed the right for prison leavers to be given priority for accommodation.',
 'Prison Link Cymru had 1,099 referrals in 2015-16 and said some ex-offenders were living rough for up to a year before finding suitable accommodation. Workers at the charity claim investment in housing would be cheaper than jailing homeless repeat offenders. The Welsh Government said more people than ever were getting help to address housing problems. Changes to the Housing Act in Wales, introduced in 2015, r

## Model 2

In [66]:
input2 = tokenizer(article2, return_tensors='pt', truncation=True)
summary_ids2 = model.generate(input2['input_ids'], max_length=500, early_stopping=False)
machineSummary2 = ([tokenizer.decode(g, skip_special_tokens=True) for g in summary_ids2])

In [67]:
machineSummary2 = listToString(machineSummary2)
summary2 = listToString(summary2)
original2 = listToString(article2)

comparison2 = [summary2, machineSummary2, original2]
token_model = SentenceTransformer('distilbert-base-nli-mean-tokens')
comparison_embeddings2 = token_model.encode(comparison2)
print(util.pytorch_cos_sim(comparison_embeddings2[0], comparison_embeddings2[1])) # human summary to machine summary similarity
print(util.pytorch_cos_sim(comparison_embeddings2[0], comparison_embeddings2[2])) # human summary to original article
print(util.pytorch_cos_sim(comparison_embeddings2[1], comparison_embeddings2[2])) # machine summary to original article

tensor([[0.7994]])
tensor([[0.7776]])
tensor([[0.9891]])


In [68]:
comparison2

['A man has appeared in court after firearms, ammunition and cash were seized by police in Edinburgh.',
 'Officers searched properties in the Waterfront Park and Colonsay View areas of the city. Detectives said three firearms, ammunition and a five-figure sum of money were recovered. A 26-year-old man who was arrested and charged appeared at Edinburgh Sheriff Court on Thursday.',
 'Officers searched properties in the Waterfront Park and Colonsay View areas of the city on Wednesday. Detectives said three firearms, ammunition and a five-figure sum of money were recovered. A 26-year-old man who was arrested and charged appeared at Edinburgh Sheriff Court on Thursday.']

## Model 3

In [70]:
input3 = tokenizer(article3, return_tensors='pt', truncation=True)
summary_ids3 = model.generate(input3['input_ids'], max_length=500, early_stopping=False)
machineSummary3 = ([tokenizer.decode(g, skip_special_tokens=True) for g in summary_ids3])

In [71]:
machineSummary3 = listToString(machineSummary3)
summary3 = listToString(summary3)
original3 = listToString(article3)

comparison3 = [summary3, machineSummary3, original3]
token_model = SentenceTransformer('distilbert-base-nli-mean-tokens')
comparison_embeddings3 = token_model.encode(comparison3)
print(util.pytorch_cos_sim(comparison_embeddings3[0], comparison_embeddings3[1])) # human summary to machine summary similarity
print(util.pytorch_cos_sim(comparison_embeddings3[0], comparison_embeddings3[2])) # human summary to original article
print(util.pytorch_cos_sim(comparison_embeddings3[1], comparison_embeddings3[2])) # machine summary to original article

tensor([[0.7647]])
tensor([[0.8125]])
tensor([[0.9745]])


In [75]:
comparison3

['Four people accused of kidnapping and torturing a mentally disabled man in a "racially motivated" attack streamed on Facebook have been denied bail.',
 'Jordan Hill, Brittany Covington and Tesfaye Cooper, all 18, appeared in a Chicago court on Friday. The four have been charged with hate crimes and aggravated kidnapping and battery, among other things. An online fundraiser for their victim has collected $51,000 (£42,500) so far. Judge Maria Kuriakos Ciesil asked: "Where was your sense of decency?"',
 'Jordan Hill, Brittany Covington and Tesfaye Cooper, all 18, and Tanishia Covington, 24, appeared in a Chicago court on Friday. The four have been charged with hate crimes and aggravated kidnapping and battery, among other things. An online fundraiser for their victim has collected $51,000 (£42,500) so far. Denying the four suspects bail, Judge Maria Kuriakos Ciesil asked: "Where was your sense of decency?" Prosecutors told the court the beating started in a van and continued at a hous

## Model 4

In [73]:
input4 = tokenizer(article4, return_tensors='pt', truncation=True)
summary_ids4 = model.generate(input4['input_ids'], max_length=500, early_stopping=False)
machineSummary4 = ([tokenizer.decode(g, skip_special_tokens=True) for g in summary_ids4])

In [76]:
machineSummary4 = listToString(machineSummary4)
summary4 = listToString(summary4)
original4 = listToString(article4)

comparison4 = [summary4, machineSummary4, original4]
token_model = SentenceTransformer('distilbert-base-nli-mean-tokens')
comparison_embeddings4 = token_model.encode(comparison4)
print(util.pytorch_cos_sim(comparison_embeddings4[0], comparison_embeddings4[1])) # human summary to machine summary similarity
print(util.pytorch_cos_sim(comparison_embeddings4[0], comparison_embeddings4[2])) # human summary to original article
print(util.pytorch_cos_sim(comparison_embeddings4[1], comparison_embeddings4[2])) # machine summary to original article

tensor([[0.5789]])
tensor([[0.5764]])
tensor([[0.9999]])


In [77]:
comparison4

['West Brom have appointed Nicky Hammond as technical director, ending his 20-year association with Reading.',
 'The 48-year-old former Arsenal goalkeeper played for the Royals for four years. He was appointed youth academy director in 2000 and has been director of football since 2003. A West Brom statement said: "He played a key role in the Championship club twice winning promotion to the Premier League in 2006 and 2012"',
 'The 48-year-old former Arsenal goalkeeper played for the Royals for four years. He was appointed youth academy director in 2000 and has been director of football since 2003. A West Brom statement said: "He played a key role in the Championship club twice winning promotion to the Premier League in 2006 and 2012."']

## Model 5

In [78]:
input5 = tokenizer(article5, return_tensors='pt', truncation=True)
summary_ids5 = model.generate(input5['input_ids'], max_length=500, early_stopping=False)
machineSummary5 = ([tokenizer.decode(g, skip_special_tokens=True) for g in summary_ids5])

In [80]:
machineSummary5 = listToString(machineSummary5)
summary5 = listToString(summary5)
original5 = listToString(article5)

comparison5 = [summary5, machineSummary5, original5]
token_model = SentenceTransformer('distilbert-base-nli-mean-tokens')
comparison_embeddings5 = token_model.encode(comparison5)
print(util.pytorch_cos_sim(comparison_embeddings5[0], comparison_embeddings5[1])) # human summary to machine summary similarity
print(util.pytorch_cos_sim(comparison_embeddings5[0], comparison_embeddings5[2])) # human summary to original article
print(util.pytorch_cos_sim(comparison_embeddings5[1], comparison_embeddings5[2])) # machine summary to original article

tensor([[0.7724]])
tensor([[0.7405]])
tensor([[0.8663]])


In [81]:
comparison5

['The pancreas can be triggered to regenerate itself through a type of fasting diet, say US researchers.',
 'Experiments on mice put on modified form of the "fasting-mimicking diet" Diet regenerated a special type of cell in the pancreas called a beta cell. These are the cells that detect sugar in the blood and release insulin if it gets too high. There were benefits in both type 1 and type 2 diabetes in the mouse experiments. Separate trials of the diet in people have been shown to improve blood sugar levels.',
 'Restoring the function of the organ - which helps control blood sugar levels - reversed symptoms of diabetes in animal experiments. The study, published in the journal Cell, says the diet reboots the body. Experts said the findings were "potentially very exciting" as they could become a new treatment for the disease. The experiments were on mice put on a modified form of the "fasting-mimicking diet". When people go on it they spend five days on a low calorie, low protein, low

## Model 6

In [86]:
input6 = tokenizer(article6, return_tensors='pt', truncation=True)
summary_ids6 = model.generate(input6['input_ids'], max_length=500, early_stopping=False)
machineSummary6 = ([tokenizer.decode(g, skip_special_tokens=True) for g in summary_ids6])

In [87]:
machineSummary6 = listToString(machineSummary6)
summary6 = listToString(summary6)
original6 = listToString(article6)

comparison6 = [summary6, machineSummary6, original6]
token_model = SentenceTransformer('distilbert-base-nli-mean-tokens')
comparison_embeddings6 = token_model.encode(comparison6)
print(util.pytorch_cos_sim(comparison_embeddings6[0], comparison_embeddings6[1])) # human summary to machine summary similarity
print(util.pytorch_cos_sim(comparison_embeddings6[0], comparison_embeddings6[2])) # human summary to original article
print(util.pytorch_cos_sim(comparison_embeddings6[1], comparison_embeddings6[2])) # machine summary to original article

tensor([[0.4697]])
tensor([[0.4632]])
tensor([[0.9261]])
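Each printed value is a cosine similarity between two sentence embeddings. As a minimal sketch of what that score measures (not the `sentence-transformers` implementation), the function below computes the cosine of the angle between two vectors using only the standard library. The name `cosine_similarity` and the toy vectors are illustrative assumptions, not part of the notebook.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy "embeddings" (hypothetical values, not model output)
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])   # same direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])           # unrelated directions
print(parallel, orthogonal)
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, which is why a higher value is read as a closer semantic match between two texts.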


In [88]:
comparison6

['Since their impending merger was announced in January, there has been remarkably little comment about the huge proposed deal to combine Essilor and Luxottica.',
 "Two of the biggest firms in the lucrative international business of making spectacles. France's Essilor is the world's number one manufacturer of lenses and contact lenses. Italy's Luxottica is the leading frame manufacturer. If the deal goes through later this year the new company will become a behemoth of the industry.",
 'But there certainly should be. These are two of the biggest firms in the lucrative international business of making spectacles. France\'s Essilor is the world\'s number one manufacturer of lenses and contact lenses, while Italy\'s Luxottica is the leading frame manufacturer. It is not obvious that the merger is in the public interest, though the two firms certainly think it is. "The parties\' activities are highly complementary and the deal would generate significant synergies and innovation and would b

# Model 7

In [89]:
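# Tokenize the article, truncating it to the tokenizer's maximum input length
input7 = tokenizer(article7, return_tensors='pt', truncation=True)
# Generate summary token IDs (at most 500 tokens)
summary_ids7 = model.generate(input7['input_ids'], max_length=500, early_stopping=False)
# Decode each generated sequence back to text, dropping special tokens
machineSummary7 = [tokenizer.decode(g, skip_special_tokens=True) for g in summary_ids7]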

In [90]:
machineSummary7 = listToString(machineSummary7)
summary7 = listToString(summary7)
original7 = listToString(article7)

comparison7 = [summary7, machineSummary7, original7]
token_model = SentenceTransformer('distilbert-base-nli-mean-tokens')
comparison_embeddings7 = token_model.encode(comparison7)
print(util.pytorch_cos_sim(comparison_embeddings7[0], comparison_embeddings7[1])) # human summary to machine summary similarity
print(util.pytorch_cos_sim(comparison_embeddings7[0], comparison_embeddings7[2])) # human summary to original article
print(util.pytorch_cos_sim(comparison_embeddings7[1], comparison_embeddings7[2])) # machine summary to original article

tensor([[0.8185]])
tensor([[0.7972]])
tensor([[0.8852]])


In [91]:
comparison7

['Have you heard the one about the computer programmer who bought a failing comedy club in Texas and turned it into a million dollar a year business?',
 'Kareem Badr and two friends bought Austin comedy club for $20,000 in 2009. Three years ago he was able to quit his day job and draw a salary from the club. Ibis World expects total US annual comedy club revenue to grow by 1.8% over the next five years to $344.6m in 2020. Top-tier performers make much more. According to Forbes, Canadian comedian Russell Peters grossed $19m with 64 shows in 2013. Jerry Seinfeld is the highest paid comedian in the US, set to earn $36m this year.',
 'It\'s no joke. But Kareem Badr says people did laugh in 2009 when he and two friends paid $20,000 (£13,000) for the Hideout in Austin, when it wasn\'t making money and the previous owner decided not to renew the lease. "We took over a sinking ship and each brought a bucket to bail it out," says Mr Badr. "None of us had any experience of running a business. B

# Model 8

In [92]:
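# Tokenize the article, truncating it to the tokenizer's maximum input length
input8 = tokenizer(article8, return_tensors='pt', truncation=True)
# Generate summary token IDs (at most 500 tokens)
summary_ids8 = model.generate(input8['input_ids'], max_length=500, early_stopping=False)
# Decode each generated sequence back to text, dropping special tokens
machineSummary8 = [tokenizer.decode(g, skip_special_tokens=True) for g in summary_ids8]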

In [93]:
machineSummary8 = listToString(machineSummary8)
summary8 = listToString(summary8)
original8 = listToString(article8)

comparison8 = [summary8, machineSummary8, original8]
token_model = SentenceTransformer('distilbert-base-nli-mean-tokens')
comparison_embeddings8 = token_model.encode(comparison8)
print(util.pytorch_cos_sim(comparison_embeddings8[0], comparison_embeddings8[1])) # human summary to machine summary similarity
print(util.pytorch_cos_sim(comparison_embeddings8[0], comparison_embeddings8[2])) # human summary to original article
print(util.pytorch_cos_sim(comparison_embeddings8[1], comparison_embeddings8[2])) # machine summary to original article

tensor([[0.5554]])
tensor([[0.6455]])
tensor([[0.7296]])
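The Model 6 to Model 10 cells repeat the same steps for each article: summarize, embed the three texts, and compare them pairwise. A hedged sketch of how those steps could be folded into one function follows. The names `score_article`, `summarize`, `embed`, and `similarity` are hypothetical; in this notebook, `summarize` would wrap `tokenizer` and `model.generate`, `embed` would wrap `token_model.encode`, and `similarity` would wrap `util.pytorch_cos_sim`. Stand-in callables are used here so the sketch runs without loading any model.

```python
def score_article(article, human_summary, summarize, embed, similarity):
    """Return the machine summary and the three pairwise similarity scores."""
    machine_summary = summarize(article)
    # Same ordering as the comparison lists: human summary, machine summary, article
    human_vec, machine_vec, article_vec = embed(
        [human_summary, machine_summary, article]
    )
    return {
        "machine_summary": machine_summary,
        "human_vs_machine": similarity(human_vec, machine_vec),
        "human_vs_article": similarity(human_vec, article_vec),
        "machine_vs_article": similarity(machine_vec, article_vec),
    }


# Stand-ins for illustration only; they are NOT the BART or DistilBERT models.
def fake_summarize(text):
    # Use the first sentence as the "summary"
    return text.split(".")[0] + "."

def fake_embed(texts):
    # Toy 2-dimensional "embedding": character count and space count
    return [[len(t), t.count(" ")] for t in texts]

def fake_similarity(a, b):
    # Plain cosine similarity on Python lists
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

result = score_article(
    "First sentence. Second sentence.",
    "A short summary.",
    fake_summarize,
    fake_embed,
    fake_similarity,
)
print(result["machine_summary"])  # prints "First sentence."
```

Passing the models in as callables also means the sentence-embedding model only needs to be loaded once rather than in every cell.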


In [94]:
comparison8

["The reaction from BT's investors told us much about media regulator Ofcom's ruling on the fate of Openreach, the BT subsidiary that provides much of the UK's broadband infrastructure.",
 "Ofcom chief Sharon White said there were 'practical obstacles' to a break-up. Pension scheme probably most influenced Ofcom's thinking. It has assets of about £40bn and a deficit, on some measures, of about\xa0£10bn. Separating the pension as part of a break up would be a costly headache.",
 'Relieved that the giant telecoms company would not be broken up, they piled into the shares, sending them up 3% in early trading. BT dodged a bullet - and, as the chief executive of Ofcom, Sharon White, admitted, it was for prosaic reasons. She said complications with land deals and BT\'s giant pension scheme meant there were "practical obstacles" to a break-up that would delay the process several years. It\'s the pension scheme that probably most influenced Ofcom\'s thinking. BT\'s retirement scheme, inherit

# Model 9

In [95]:
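# Tokenize the article, truncating it to the tokenizer's maximum input length
input9 = tokenizer(article9, return_tensors='pt', truncation=True)
# Generate summary token IDs (at most 500 tokens)
summary_ids9 = model.generate(input9['input_ids'], max_length=500, early_stopping=False)
# Decode each generated sequence back to text, dropping special tokens
machineSummary9 = [tokenizer.decode(g, skip_special_tokens=True) for g in summary_ids9]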

In [96]:
machineSummary9 = listToString(machineSummary9)
summary9 = listToString(summary9)
original9 = listToString(article9)

comparison9 = [summary9, machineSummary9, original9]
token_model = SentenceTransformer('distilbert-base-nli-mean-tokens')
comparison_embeddings9 = token_model.encode(comparison9)
print(util.pytorch_cos_sim(comparison_embeddings9[0], comparison_embeddings9[1])) # human summary to machine summary similarity
print(util.pytorch_cos_sim(comparison_embeddings9[0], comparison_embeddings9[2])) # human summary to original article
print(util.pytorch_cos_sim(comparison_embeddings9[1], comparison_embeddings9[2])) # machine summary to original article

tensor([[0.8072]])
tensor([[0.8070]])
tensor([[0.7622]])


In [97]:
comparison9

["Manager Brendan Rodgers is sure Celtic can exploit the wide open spaces of Hampden when they meet Rangers in Sunday's League Cup semi-final.",
 "Celtic face Rangers in the Scottish Cup semi-final at Hampden Park. Brendan Rodgers' side won 5-1 at Celtic Park in the league last month. Rodgers lost two semi-finals in his time at Liverpool and is aiming to make it third time lucky at the club he joined in the summer. The Northern Irishman would not be drawn on whether this was a step on the way to a potential domestic treble.",
 '"I\'m really looking forward to it - the home of Scottish football," said Rodgers ahead of his maiden visit. "I hear the pitch is good, a nice big pitch suits the speed in our team and our intensity. "The technical area goes right out to the end of the pitch, but you might need a taxi to get back to your staff." This will be Rodgers\' second taste of the Old Firm derby and his experience of the fixture got off to a great start with a 5-1 league victory at Celtic

# Model 10

In [98]:
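# Tokenize the article, truncating it to the tokenizer's maximum input length
input10 = tokenizer(article10, return_tensors='pt', truncation=True)
# Generate summary token IDs (at most 500 tokens)
summary_ids10 = model.generate(input10['input_ids'], max_length=500, early_stopping=False)
# Decode each generated sequence back to text, dropping special tokens
machineSummary10 = [tokenizer.decode(g, skip_special_tokens=True) for g in summary_ids10]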

In [99]:
machineSummary10 = listToString(machineSummary10)
summary10 = listToString(summary10)
original10 = listToString(article10)

comparison10 = [summary10, machineSummary10, original10]
token_model = SentenceTransformer('distilbert-base-nli-mean-tokens')
comparison_embeddings10 = token_model.encode(comparison10)
print(util.pytorch_cos_sim(comparison_embeddings10[0], comparison_embeddings10[1])) # human summary to machine summary similarity
print(util.pytorch_cos_sim(comparison_embeddings10[0], comparison_embeddings10[2])) # human summary to original article
print(util.pytorch_cos_sim(comparison_embeddings10[1], comparison_embeddings10[2])) # machine summary to original article

tensor([[0.8606]])
tensor([[0.8641]])
tensor([[0.9768]])
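To read the five human-versus-machine scores for Models 6 to 10 together, a small sketch can average them. The values are copied from the first `tensor` line of each model's output in this section; the variable names are illustrative, and five articles are too few to support a general conclusion.

```python
from statistics import mean

# Human-summary vs machine-summary cosine similarities, copied from the
# printed outputs of Models 6-10 in this section
human_vs_machine = {
    "Model 6": 0.4697,
    "Model 7": 0.8185,
    "Model 8": 0.5554,
    "Model 9": 0.8072,
    "Model 10": 0.8606,
}

average = mean(list(human_vs_machine.values()))
lowest = min(human_vs_machine, key=human_vs_machine.get)
highest = max(human_vs_machine, key=human_vs_machine.get)

print(f"mean similarity: {average:.4f}")          # mean similarity: 0.7023
print(f"lowest: {lowest}, highest: {highest}")    # lowest: Model 6, highest: Model 10
```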


In [100]:
comparison10

["Queen's University Belfast is cutting 236 jobs and 290 student places due to a funding reduction.",
 "Queen's University to cut undergraduate places by 1,010 over the next three years. Move is in response to an £8m cut in the subsidy received from the Department of Employment and Learning. Job losses will be among both academic and non-academic staff and Queen's says no compulsory redundancies should be required. There are currently around 17,000 full-time undergraduate and postgraduate students at the university, and around 3,800 staff.",
 'The move is in response to an £8m cut in the subsidy received from the Department of Employment and Learning (DEL). The cut in undergraduate places will come into effect from September 2015. Job losses will be among both academic and non-academic staff and Queen\'s says no compulsory redundancies should be required. There are currently around 17,000 full-time undergraduate and postgraduate students at the university, and around 3,800 staff. Queen