
# Benchmark Evaluation


In this demo, **we will focus on evaluating large language models using a benchmark dataset specific to the task at hand.**

**Learning Objectives:**

*By the end of this demo, you will be able to:*

* Obtain a reference/benchmark data set for task-specific LLM evaluation
* Evaluate an LLM's performance on a specific task using task-specific metrics
* Compare relative performance of two LLMs using a benchmark set

In [0]:
%pip install -U -qq databricks-sdk rouge_score evaluate textstat mlflow tiktoken
dbutils.library.restartPython()

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


Before starting the demo, run the provided classroom setup script. This script will define configuration variables necessary for the demo. Execute the following cell:

In [0]:
%run ../Includes/Classroom-Setup-03


The examples and models presented in this course are intended solely for demonstration and educational purposes. Please note that the models and prompt examples may sometimes contain offensive, inaccurate, biased, or harmful content.


**Other Conventions:**

Throughout this demo, we'll refer to the object `DA`. This object, provided by Databricks Academy, contains variables such as your username, catalog name, schema name, working directory, and dataset locations. Run the code block below to view these details:

In [0]:
print(f"Username:          {DA.username}")
print(f"Catalog Name:      {DA.catalog_name}")
print(f"Schema Name:       {DA.schema_name}")
print(f"Working Directory: {DA.paths.working_dir}")
print(f"Dataset Location:  {DA.paths.datasets}")

Username:          labuser11003544_1753435669@vocareum.com
Catalog Name:      dbacademy
Schema Name:       labuser11003544_1753435669
Working Directory: /Volumes/dbacademy/ops/labuser11003544_1753435669@vocareum_com
Dataset Location:  NestedNamespace (news='/Volumes/dbacademy_news/v01', arxiv='/Volumes/dbacademy_arxiv/v01')


## Demo Overview

In this demonstration, we will be evaluating the performance of an AI system designed to summarize text.

The text documents that we will be summarizing are a collection of fictional product reviews for grocery products.

The AI system works as follows:

1. Accepts a text document as input
2. Constructs an LLM prompt using few-shot learning to summarize the text
3. Submits the prompt to an LLM for summarization
4. Returns summarized text
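
Step 2 of the list above mentions few-shot learning. As a rough sketch only, a few-shot prompt can be assembled by placing example review/summary pairs ahead of the real input. The example pairs, the shortened system prompt, and the helper name `build_few_shot_messages` are all assumptions made for illustration; they are not taken from the course material.

```python
from typing import Dict, List, Tuple

# Shortened, illustrative system prompt (an assumption, not the course prompt).
SYSTEM_PROMPT = "You summarize grocery product reviews in one concise sentence."

# Hypothetical few-shot examples: (review, one-sentence summary).
FEW_SHOT_EXAMPLES: List[Tuple[str, str]] = [
    ("The apples were crisp and sweet, but two of them arrived bruised.",
     "The apples were crisp and sweet, though two arrived bruised."),
    ("I did not like this cereal at all. It went soggy within a minute.",
     "I disliked this cereal because it went soggy almost immediately."),
]


def build_few_shot_messages(
    document: str,
    examples: List[Tuple[str, str]] = FEW_SHOT_EXAMPLES,
) -> List[Dict[str, str]]:
    """Return chat messages: system prompt, example turns, then the real input."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for review, summary in examples:
        # Each example is shown as a user turn followed by the desired reply.
        messages.append({"role": "user", "content": review})
        messages.append({"role": "assistant", "content": summary})
    # The document to summarize always comes last.
    messages.append({"role": "user", "content": document})
    return messages


messages = build_few_shot_messages("This yogurt is creamy and not too sweet.")
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list has the same role/content shape as the message dictionaries used when querying a chat model, so it could be passed along in the same way.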

See below for an example of the system.

## Step 1: Set Up Models to Use

Next, we will set up the models that will be used for evaluation.

We will use **Databricks Claude 3.7 Sonnet** and **Llama 3.3 70B Instruct** for evaluation.

In [0]:
from databricks.sdk.service.serving import ChatMessage
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Define the first model for summarization
def query_summary_system(input: str) -> str:
    messages = [
        {
            "role": "system",
            "content": "You are an assistant that summarizes text. Given a text input, you need to provide a one-sentence summary. You specialize in summarizing reviews of grocery products. Please keep the reviews in first-person perspective if they're originally written in first person. Do not change the sentiment. Do not create a run-on sentence – be concise."
        },
        { 
            "role": "user", 
            "content": input 
        }
    ]
    messages = [ChatMessage.from_dict(message) for message in messages]
    chat_response = w.serving_endpoints.query(
        name="databricks-claude-3-7-sonnet",
        messages=messages,
        temperature=0.1,
        max_tokens=128
    )

    return chat_response.as_dict()["choices"][0]["message"]["content"]

# Define the second model for summarization
def challenger_query_summary_system(input: str) -> str:
    messages = [
        {
            "role": "system",
            "content": "You are an assistant that summarizes text. Given a text input, you need to provide a one-sentence summary. You specialize in summarizing reviews of grocery products. Please keep the reviews in first-person perspective if they're originally written in first person. Do not change the sentiment. Do not create a run-on sentence – be concise."
        },
        { 
            "role": "user", 
            "content": input 
        }
    ]
    messages = [ChatMessage.from_dict(message) for message in messages]
    chat_response = w.serving_endpoints.query(
        name="databricks-meta-llama-3-3-70b-instruct",
        messages=messages,
        temperature=0.1,
        max_tokens=128
    )

    return chat_response.as_dict()["choices"][0]["message"]["content"]
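
The two functions above differ only in the endpoint name. One possible way to reduce the duplication is a small factory that takes the endpoint name and returns a summarizer. The sketch below is an assumption about how this could be structured, not part of the course code: `make_summarizer`, `send`, and `fake_send` are hypothetical names, and the system prompt is shortened for brevity. The transport is injected so the sketch runs without a workspace.

```python
from typing import Callable, Dict, List

# Shortened stand-in for the system prompt used in the cell above.
SYSTEM_PROMPT = "You are an assistant that summarizes text in one sentence."

# Hypothetical transport signature: (endpoint_name, messages) -> reply text.
Send = Callable[[str, List[Dict[str, str]]], str]


def make_summarizer(endpoint_name: str, send: Send) -> Callable[[str], str]:
    """Build a summarizer bound to one serving endpoint."""

    def summarize(text: str) -> str:
        messages = [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ]
        return send(endpoint_name, messages)

    return summarize


def fake_send(endpoint_name: str, messages: List[Dict[str, str]]) -> str:
    """Offline stand-in that only reports what it would have sent."""
    return f"{endpoint_name}: {len(messages)} messages"


champion = make_summarizer("databricks-claude-3-7-sonnet", fake_send)
challenger = make_summarizer("databricks-meta-llama-3-3-70b-instruct", fake_send)

print(champion("Great pizza, would buy again."))
print(challenger("Great pizza, would buy again."))
```

In the notebook, `send` could wrap the same `w.serving_endpoints.query(...)` call used above, with the `temperature` and `max_tokens` settings kept in one place.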

Let's test the models!

In [0]:
query_summary_system(
    "This is the best frozen pizza I've ever had! Sure, it's not the healthiest, but it tasted just like it was delivered from our favorite pizzeria down the street. The cheese browned nicely and fresh tomatoes are a nice touch, too! I would buy it again despite it's high price. If I could change one thing, I'd made it a little healthier – could we get a gluten-free crust option? My son would love that."
)

"I loved this frozen pizza for its authentic pizzeria taste, nicely browned cheese and fresh tomatoes, though I wish they'd offer a gluten-free option despite the high price."

In [0]:
challenger_query_summary_system(
    "This is the best frozen pizza I've ever had! Sure, it's not the healthiest, but it tasted just like it was delivered from our favorite pizzeria down the street. The cheese browned nicely and fresh tomatoes are a nice touch, too! I would buy it again despite it's high price. If I could change one thing, I'd made it a little healthier – could we get a gluten-free crust option? My son would love that."
)

"I think this is the best frozen pizza I've ever had, with a delicious taste similar to a pizzeria's, and I would buy it again despite its high price."

To complete this workflow, we'll focus on the following steps:

1. Obtain a benchmark set for evaluating summarization
2. Compute summarization-specific evaluation metrics using the benchmark set
3. Compare performance with another LLM using the benchmark set and evaluation metrics
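
The three steps above can be sketched as a small comparison loop: run each model over every benchmark row, score each output against the reference, and average per model. Everything in this sketch is a stand-in chosen so it runs offline. The two "models" are trivial string functions, the two benchmark rows are made up, and `jaccard_overlap` is a placeholder metric rather than ROUGE. The keys `inputs` and `writer_summary` mirror the column names of the news benchmark loaded later in this demo.

```python
from statistics import mean
from typing import Callable, Dict, List


def jaccard_overlap(candidate: str, reference: str) -> float:
    """Placeholder metric: shared unique words divided by all unique words."""
    a = set(candidate.lower().split())
    b = set(reference.lower().split())
    return len(a & b) / len(a | b) if (a | b) else 0.0


def compare_models(
    benchmark: List[Dict[str, str]],
    models: Dict[str, Callable[[str], str]],
    metric: Callable[[str, str], float],
) -> Dict[str, float]:
    """Return the mean metric score of each model over the benchmark."""
    return {
        name: mean(
            metric(summarize(row["inputs"]), row["writer_summary"])
            for row in benchmark
        )
        for name, summarize in models.items()
    }


def lead_sentence(text: str) -> str:
    """Stand-in 'model': return the first sentence of the input."""
    first = text.split(". ")[0]
    return first if first.endswith(".") else first + "."


def first_three_words(text: str) -> str:
    """Stand-in 'model': return only the first three words."""
    return " ".join(text.split()[:3])


# Made-up benchmark rows for illustration only.
benchmark = [
    {"inputs": "The mayor fired the police chief. Crime had surged in the city.",
     "writer_summary": "The mayor fired the police chief."},
    {"inputs": "Morocco rejoined the African Union. Western Sahara welcomed the move.",
     "writer_summary": "Morocco rejoined the African Union."},
]

scores = compare_models(
    benchmark,
    {"lead_sentence": lead_sentence, "first_three_words": first_three_words},
    jaccard_overlap,
)
print(scores)
```

Because the metric and the models are passed in as plain callables, the same loop could be reused with real summarization functions and a summarization-specific metric.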

## Step 2: Benchmark and Reference Sets

As a reminder, our task-specific evaluation metrics (including ROUGE for summarization) require a benchmark set to compute scores.
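
To make the dependence on a reference concrete, here is a minimal, simplified sketch of ROUGE-1 (unigram overlap) in plain Python. It assumes lowercase whitespace tokenization; library implementations such as `rouge_score` apply their own tokenization and optional stemming, so their numbers may differ. The function name `rouge1` and the example sentences are illustrative.

```python
from collections import Counter
from typing import Dict


def rouge1(candidate: str, reference: str) -> Dict[str, float]:
    """Simplified ROUGE-1: unigram precision, recall, and F1."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


reference = "the mayor fired the police chief"
candidate = "the mayor replaced the police chief"
scores = rouge1(candidate, reference)
print(scores)
```

Here five of the six words match in each direction, so precision, recall, and F1 all come out to 5/6. Without the reference summary, neither value could be computed, which is why a benchmark set is needed.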

There are two types of reference/benchmark sets that we can use:

1. Large, generic benchmark sets commonly used across use cases
2. Domain-specific benchmark sets specific to your use case

For this demo, we'll focus on the former.

### Generic Benchmark Set

First, we'll import a generic benchmark set used for evaluating text summarization.

We'll use the data set used in [Benchmarking Large Language Models for News Summarization](https://arxiv.org/abs/2301.13848) to evaluate how well our LLM solution summarizes general text.

This data set:

* is relatively large in scale at 599 records
* is related to news articles
* contains original text and *author-written* summaries of the original text

**Question:** What is the advantage of using ground-truth summaries that are written by the original author?

In [0]:
import pandas as pd

# Read and display the dataset
eval_data = pd.read_csv(f"{DA.paths.datasets.news}/csv/news-summaries.csv")
display(eval_data)

inputs,writer_summary
"Baltimore's mayor has sacked the US city's police chief, saying his leadership had become a distraction from fighting a ""crime surge"". Mayor Stephanie Rawlings-Blake said she was replacing Police Commissioner Anthony Batts with his deputy, Kevin Davis, for an interim period. The city was rocked by riots in April when a black man died after suffering injuries in police custody. Six officers were charged over the death of the 25-year-old, Freddie Gray. Speaking at a news conference on Wednesday, Mayor Rawlings-Blake said Mr Batts had ""served this city with distinction"" since becoming police chief in October 2012. But referring to the city's high homicide rate, she said ""too many continue to die"". ""The focus has been too much on the leadership of the department and not enough on the crime fighting,"" she told reporters, adding: ""We need to get the crime surge under control."" The city has seen a sharp increase in violence since Freddie Gray's death on 19 April, with 155 homicides this year, a 48% increase over the same period last year. On Tuesday, the police department announced that an outside organisation will review its response to the civil unrest that followed Mr Gray's death. The US justice department is also conducting a civil rights review of the Baltimore force and Mr Batts has been criticised by the city's police union. Earlier on Wednesday, the union released its report into the police handling of the rioting. It said officers had complained ""that they lacked basic riot equipment, training, and, as events unfolded, direction from leadership"". The report also said ""officers repeatedly expressed concern that the passive response to the civil unrest had allowed the disorder to grow into full scale rioting"". Recent events had ""placed attention on police leadership"", Ms Rawlings-Blake said, but denied her decision was influenced by the union report. 
Mr Davis, who is taking over immediately as interim police chief, praised his ""friend"" Mr Batts and said he was a ""true reform commissioner"". Mayor Rawlings-Blake said Mr Davis would ""bring accountability to police, hold officers who act out of line accountable for their actions"".","The mayor of Baltimore fired the police chief and replaced him with his deputy. According to the mayor, crime in the city was unacceptable. Riots in the city after a man died in police custody and a surge in homicide rates were cited as reasons for the firing."
"Western Sahara has welcomed Morocco's readmission to the African Union, 32 years after members refused to withdraw support for the territory's independence. It was a ""good opportunity"" and ""a chance to work together,"" a top Western Sahara official told the BBC. Morocco controls two-thirds of Western Sahara and sees it as part of its historic territory. However some, including the UN, see Western Sahara as Africa's last colony. Africa Live: More on this and other stories Find out more about Western Sahara A referendum was promised in 1991 but never carried out due to wrangling over who was eligible to vote. Thousands of Sahrawi refugees still live in refugee camps in Algeria - some have been there for 40 years. It is not clear what happens next but Western Sahara is hopeful that a committee set up by the AU will address the issues that both sides have raised. Some AU delegates said that it would be easier to resolve the issue with Morocco inside the AU. Sidi Mohammed, a Western Sahara official, told the BBC that Morocco's return to the AU means that it would now be expected to put ""in practice decisions taken by the AU with regard to a referendum in Western Sahara"". Mr Mohammed dismissed the suggestion that Morocco would now seek to get the AU to change its position, saying that the no country could unilaterally change the AU fundamental agreement, saying it opposed colonisation. In his speech at the AU summit, King Mohammed VI of Morocco said the readmission was not meant to divide the continental body. No. Algeria has always been a big supporter of Western Sahara's Polisario Front and it had wanted Morocco to accept independence of the territory as a condition for readmission. Zimbabwe and South Africa were also supportive of this stance but they were outnumbered by those who wanted Morocco back in the fold. There is no specific provision in the AU charter that bars any country from joining it. 
Morocco simply applied and the request was accepted by more than two-thirds of the 53 members. Morocco has been involved in intense lobbying and applied in July last year to rejoin the continental body. King Mohammed toured various African countries seeking support for the bid. No. While culturally the country's identity aligns with Arab states, its economic interests increasingly lie in Africa. This is a strategic move to continue exploring its interests in mining, construction, medical, insurance and banking sectors on the continent. Moroccan troops went into Western Sahara after Spain withdrew in 1975. Kitesurfing in a danger zone Inside world's most remote film festival Profile: African Union",Morocco joined the African Union after a referendum was promised over 30 years ago. The readmission was done in an effort to unite the continental body.
"With the new Avengers: Age of Ultron movie released this week, James Haskell showed off his inner Iron Man in a serious-looking Instagram post. The highly-anticipated movie premiered at Westfield London shopping centre on Tuesday evening with fans queuing up to see the A-list cast which includes Robert Downey Jr., Chris Hemsworth and Scarlett Johansson. And the London Wasps captain joined in on the hype as he posted the photo dressed as Downey Jr.'s character Iron Man. England flanker James Haskell dressed in Iron Man costume and posted it on his Instagram page. The London Wasps captain (middle) returned to the club where he started his career for the 2012 season. Haskell posted the image on Thursday along with the message: 'Avengers movie is out so thought i would release the inner Iron Man. @UnderArmourUK #TransformYourself #IWILL #AvengersAgeOfUltron.' The flanker returned to Wasps for the 2012 season after spells with Stade Francais, Ricoh Black Rams and Highlanders in New Zealand. Windsor-born Haskell first joined Wasps in 2002, playing eight seasons for the club and winning his first England cap five years later. But in 2009, he moved to Stade Francais in France and spent two seasons in the French capital before he made the move to Tokyo with the Ricoh Black Rams following the unsuccessful 2011 World Cup. Four months in Japan with the Rams and Haskell was on the move again when he switched to New Zealand to join the Highlanders. However, he made only 12 appearances and returned to England in 2012. Since returning to Wasps, Haskell has surpassed the 100 appearances mark for the club and has become a big part of the England squad with 57 caps to-date. Haskell has played his rugby in France, Japan and New Zealand after leaving the Wasps in 2009. Since his return to England, Haskell has enjoyed his rugby and surpassed the 100 appearance mark for Wasps.","James Haskell is a rugby player for the London Wasps. 
He has an extensive career with playing for other teams in France, Japan, and New Zealand. To celebrate the release of the new Avengers: Age of Ultron movie, Haskell posts an Instagram picture of him dress up as Iron Man."
"UK manufacturing activity contracted in April for the first time in three years, a survey has indicated, adding to fears over the economy's strength. The Markit/CIPS manufacturing Purchasing Managers' Index fell to 49.2 from 50.7 in March. A reading below 50 indicates falling output. It is the first time that activity in the sector has fallen since March 2013. Firms blamed soft domestic demand, a fall in new business from overseas and uncertainty ahead of the EU referendum. A slowdown in the oil and gas industry, a major customer for UK companies, is also hitting production. The index for new orders fell to 50.4 in April, from 51.9 the month before, matching February's three-year low. Rob Dobson, senior economist at Markit, said: ""On this evidence manufacturing production is now falling at a quarterly pace of around 1%, and will likely act as a drag on the economy again during the second quarter and putting greater pressure on the service sector to sustain GDP growth. ""The manufacturing labour market is also being impacted, with the data signalling close to 20,000 job losses over the past three months."" Last week, official figures showed UK economic growth slowed to 0.4% in the first quarter of the year from 0.6% in late 2015, propped up by the services sector. David Noble, group chief executive at the Chartered Institute of Procurement and Supply (CIPS), said: ""Recent fears over a stall in the UK's manufacturing sector have now become a reality. ""An atmosphere of deep unease is building throughout the manufacturing supply chain, eating away at new orders, reducing British exports and putting more jobs at risk. ""A sense of apprehension across the sector is being caused by enduring volatility in the oil and gas industry, falling retailer confidence and the uncertainty created by the EU referendum."" The Markit/CIPS survey found new export orders contracted for the fourth straight month in April as the global economy continued to slow. 
A measure of employment in the manufacturing sector was also below the 50 mark for its fourth straight month. Lee Hopley, chief economist at the manufacturers' organisation, EEF, said: ""The sharp drop to a three-year low and another month of reported job cuts could be the clearest sign yet that referendum uncertainty is starting to weigh on the real economy. ""However, this is just another straw on the back of a sector already grappling with the struggling oil and gas sector, softening domestic demand and weak order outlook from other parts of the world, all of which are failing to provide any counterbalance to the political uncertainty at home.""","Concerns over UK manufacturing activity have become more tangible, with activity falling in April for the first time since March of 2013. The culprits of this economic woe are noted to be soft domestic demand, a fall in new international business, and also uncertainty ahead of the EU referendum."
"An obese mother who enjoyed takeaways and boozy nights out has lost more than seven stone after a child on a bus pointed at her and asked whether she was pregnant. Lizzi Crawford, 32, tipped the scales at 20 stone when she overheard the young bus passenger ask his mum: 'Has she got a baby in her belly?' The embarrassing remark left the mother-of-six, from Stoke-on-Trent, mortified but inspired her to ditch her unhealthy lifestyle and shed the pounds, slimming down to a healthier 12.5st. Lizzi Crawford dropped over 7st after a stranger mistook her for being pregnant. Lizzi had reached a size 24 dress after living on a diet of burgers, pizzas and kebabs. But she also devoured liquid calories in the form of wine and spirits. However Lizzi never realised how big she had got until she heard the pregnancy remark on the bus. She said: 'It started when I was taking my kids to school and we were sitting on the bus. A kid then looked at me and said: ""Has she got a baby in her belly?""' Lizzi now admits that she was living an extremely unhealthy lifestyle. She continued: 'It was terrible. I was being a slob to be honest. Lizzi piled on the pounds thanks to a boozy lifestyle and diet of takeaways, pizzas and kebabs. As well as her battle to lose weight Lizzi also won her battle against cervical cancer. 'I was eating burgers, takeaways, pizzas, kebabs and drinking - mainly wine and spirits mixed with Dr Pepper. 'I knew I had to do something about my weight for the sake of my children.' In a serious bid to slim down Lizzi began cooking healthier meals and joining fitness and self-defense classes which saw her lose over 7st. Incredibly, she achieved her goal despite suffering the set-back of being diagnosed with cervical cancer in October 2012. As well as winning her fight against the disease following a hysterectomy and cancer treatment she has now won her battle against the bulge. 
Lizzi, who works a cleaner, kept fit by attending self-defense classes at T6 Fight Club in Burslem, Stoke-on-Trent. She also discovered Hourglass training - a fitness programme designed to keep a woman's curves while she gets healthy. Lizzi says that having support from other women at her gym helped her to achieve her goal. Lizzi joined the gym and began hourglass training and says that she is now addicted to fitness. 'I began to build relationships with the people at the gym. The girls were egging me on to eat well - they all cheer each other on. 'I've got some of the best friends I've ever made there. They don't look down their noses at you and you're always made to feel welcome.' The slimmer says she is now 'addicted' to her fitness classes and goes five times every week. She added that losing the weight has helped her mental being as well as her physical being. 'I can do a lot more things now. I can walk more places and do more with the kids - I can even do simple tasks like getting up and down the stairs easier now. 'It's helped me mentally because it was depressing when I was heavy, but since I started Hourglass, that has just gone. 'The weight loss has helped me in the workplace too. I find I can get around much quicker and finish earlier - it used to take me ages. 'Now I can spend more time with the kids.' Lizzi's mum, Mary Crawford, 57, said her daughter's slimming efforts had been 'amazing'. She added: 'She's found out about cooking the right way, she's been going to the gym and riding bikes. 'She's stuck at it and I think it's amazing what she's done. I'm really pleased, and I believe she will keep it off as she's found a routine that suits her. 'She's like the old Lizzi I used to know as a little girl.'","Lizzi, an obese mother of six, wouldn't have realized how big she got until a kid on the bus mistook her for being pregnant. She lived on a junk food booze diet, but after hearing that comment, Lizzi was inspired to lose weight. 
Since then, both her physical and mental health has significantly improved."
"A young Syrian boy has revealed how he saw depraved Islamic State militants playing football with a severed head inside the besieged Yarmouk refugee camp. Amjad Yaaqub, 16, said he stumbled on the barbaric scene shortly after the terrorists beat him unconscious when they burst into his family home at the camp in the Syrian capital Damascus. The schoolboy said the ISIS fighters were looking for his brother, who is a member of the Palestinian rebel group who ran and defended the camp for several years before ISIS carried out a bloody assault that has left more than 200 people dead in just seven days. His story was revealed as refugees in Yarmouk spoke of the daily atrocities they have witnessed since ISIS seized control of 90 per cent of the camp, including innocent children being slaughtered in front of their anguished parents. Scroll down for video. Scene of death: A destroyed graveyard is photographed in Yarmouk camp following the intense fighting. Innocent: Palestinians, who fled the Yarmouk refugee camp, sit on mattresses inside a school in Damascus. After enduring two years of famine and fighting, Ibrahim Abdel Fatah said he saw heads cut off by ISIS in the Palestinian camp of Yarmouk. That was it. He fled and hasn't looked back. Unshaven, pale and gaunt, he has found refuge with his wife and seven children at the Zeinab al-Haliyeh school in Tadamun, a southeastern district of the Syrian capital held by the army. 'I saw severed heads. They killed children in front of their parents. We were terrorised,' he said. 'We had heard of their cruelty from the television, but when we saw it ourselves... I can tell you, their reputation is well-deserved,' the 55-year-old said. The school is currently home to 98 displaced people, among them 40 children, who have been put up in three classrooms. The usual occupants, schoolchildren, have been evacuated temporarily from rooms where mattresses and bedding now blanket the floor. 'I left my house which was the only thing I had. 
My family lived on rations supplied by UNRWA,' the United Nations agency that looks after Palestinian refugees, the former caretaker said. Destroyed: In late December 2012, Yarmouk - just four miles from central Damascus - became a battlefield between pro- and anti-government forces before a merciless siege began. A man stands on a staircase inside a demolished building inside the Yarmouk Palestinian refugee camp. Anwar Abdel Hadi, a Palestine Liberation Organisation official in Damascus, said 500 families, or about 2,500 people, fled Yarmouk before IS fighters attacked the camp last Wednesday. Before the assault, there were around 18,000 people in Yarmouk in a southern neighbourhood of the Syrian capital. Yarmouk was once a thriving district housing 160,000 Palestinian refugees and Syrians. But that was before it too was caught up in the widespread civil unrest which erupted in 2011. In late December 2012, Yarmouk - just four miles from central Damascus - became a battlefield between pro- and anti-government forces before a merciless siege began. The camp has been encircled for more than a year, but is now reported to be almost completely under the control of ISIS and Al Qaeda's local affiliate, Jabhat al-Nusra. Residents who fled the advancing jihadists last week have been put up in regime-held areas nearby. According to Britain-based monitor the Syrian Observatory for Human Rights, nearly 200 people had died in Yarmouk from malnutrition and lack of medicines before last Wednesday's assault. Carnage: Before the Islamic State assault, there were around 18,000 people in Yarmouk in a southern neighbourhood of the Syrian capital Damascus. Keeping the faith: A Palestinian man who fled the Yarmouk refugee camp prays inside a school in Damascus. Speaking of the moment he stumbled on the ISIS militants, 16-year-old Amjad said: 'In Palestine Street, I saw two members of Daesh playing with a severed head as if it was a football. 
Wearing a baseball cap sideways, rapper-style, the youth has a swollen eye and chin. 'Daesh came to my home looking for my brother who's in the Palestinian Popular Committees. They beat me until I passed out and left me for dead,' he added, referring to the group by an Arabic acronym. At the entrance to the school, Umm Usama chatted with fellow refugees who had got out. 'I left the camp despite myself,' said the 40-year-old woman who had lived in Yarmouk for 17 years. 'I'd stayed on despite the bombings and famine. It was terrible, we ate grass, but at least I was at home. 'Daesh's arrival meant destruction and massacre. Their behaviour's not human and their religion is not ours,' added the thin woman with sunken eyes. Rubble: Destruction in Yarmouk Palestinian refugee camp in the Syrian capital Damascus earlier this week. Palestinians demanding the protection of refugees in Yarmouk stage a demonstration in Gaza city on Monday. Men lie around on mattresses as women gather in small groups, smoking cigarettes and drinking fruit juice as children run around the room. 'Everything changed when IS arrived. Before that we didn't fear death, because if there was fighting, the rebels made sure the civilians got to shelters,' said Abir, a 47-year-old woman who was born and raised in Yarmouk. There are no suitcases to be seen in the classrooms -- the families had to leave so quickly there was no time to pack anything. 'I left without bringing any belongings. My husband wasn't able to join me. I walked out hugging the walls so snipers couldn't see me,' said 19-year-old Nadia, nursing her two-month-old baby. Yesterday ISIS launched English-language radio news bulletins on its al-Bayan radio network. The militant group's English bulletin, promoted via Twitter, accompanies Arabic and Russian bulletins already airing on the network. 
The first bulletin, which provided an overview of their activities in Iraq, Syria and Libya, discussed a range of topics including the alleged death of an ISIS commander in Yarmouk, a suicide bombing in the Iraqi city of Kirkuk and mortar attacks on militias in Sirte, Libya. ISIS holds territory in a third of Iraq and Syria and is becoming increasingly active in Libya. The group already publishes a monthly online English-language magazine, Dabiq, with religious lessons, plus news about its activities.","A refugee camp in Syria has been almost completely taken over by ISIS and its local affiliate. People in the camp report atrocities such as children being killed in front of their parents, and football being played with severed heads. Many in the camps have fled to other locations."
"A call to deselect a UKIP member of the Welsh assembly has been rejected by the party's ruling body. A letter sent by party activists in north Wales claimed Michelle Brown has been ""abrasive and discourteous"" to them. It was sent to UKIP's national executive committee (NEC) before a row over racial slurs about a Labour MP, for which Ms Brown apologised. But UKIP chairman Paul Oakden said the letter did not follow proper process. A UKIP assembly group spokesman said the letter was written by a group with a ""long-standing grudge"" against the AM. The ruling NEC body discussed the issue at a meeting on Friday, where they also decided to allow a controversial anti-Islam campaigner to run for the UKIP leadership. Mr Oakden said: ""A member of the NEC had contacted the person that is putting this forward and said to them they need to follow the proper process of completing the necessary forms and submitting them to the NEC. ""Members simply emailing the NEC saying we want you to do this is not the correct disciplinary process for the party, by any stretch of the imagination. ""A member of the NEC has gone back and given advice on what they need to do."" Shaun Owen, secretary of UKIP's Delyn branch, wrote to the NEC saying: ""For some time we have been appalled by the abrasive and discourteous manner of Ms Brown towards UKIP locally. ""Her lack of effort in pursuing the aims of the party both locally and nationally is of concern to members across the region."" Mr Owen added he believed members would stop supporting UKIP if Ms Brown remained in the role. However, a spokesman for the party's assembly group dismissed the letter as written by a ""tiny and insignificant group"". In February, Ms Brown denied an allegation she had smoked recreational drugs in a hotel room. 
Later that month, she said she had acted ""with propriety"" after it was revealed she had discussed how an advert for a job in her assembly office could be changed to help her brother get an interview for the post. Meanwhile, UKIP's NEC confirmed that 11 hopefuls in the contest to succeed Paul Nuttall as leader will be able to run as candidates. The list includes Anne Marie Waters, the founder of the Sharia Watch pressure group, who has described Islam as evil. UKIP AM David Rowlands had said Ms Waters is probably ""too extreme"" to be allowed to stand but she claimed the party was trying to ""ostracise"" her. Other candidates who have also cleared the NEC's vetting process and are going forward to a vote of the membership include Welsh activist John Rees-Evans, London Assembly member Peter Whittle and Scottish MEP David Coburn. Mr Nuttall resigned after the general election in June when the party failed to win any seats and saw its vote plummet.","UKIP Welsh assembly member, Michelle Brown, is under scrutiny. While the call to have her deselected has been shut down, it does highlight relevant concerns about Ms. Brown's ability to fulfill her post. Accusations of recreational drug use, nepotism, and the use of racist language are the most relevant concerns brought forward thus far."
"A breakthrough has been made in the development of clean hydrogen power, scientists claim. At the moment, while hydrogen fuel is appealing, the production of hydrogen is incredible difficult - requiring huge amounts of energy. But the researchers say they have made a new material that can generate hydrogen from water, meaning it is less reliant on fossil fuels. Hydrogen-fuel is appealing for use in cars like the Vauxhall Zafira minivan pictured, but producing hydrogen requires huge amounts of energy. With the new breakthrough, it could be possible to make it more easily. Researchers at the University of Bath and Yale University created the invention. It uses a newly designed molecular catalyst to split water in an electrolyser and create clean and storable hydrogen fuel. Lead research Dr Ulrich Hintermair told MailOnline that the main problem with the production of hydrogen through a process known as water electrolysis was the waste oxygen it produces. Water splitting is an electro-chemical process in which two electrodes generate oxygen and hydrogen from water, respectively. The energy required to drive this process gets locked up in the hydrogen as the fuel with oxygen as a by-product. A fuel cell can then harness the energy again elsewhere by recombining the two. The new patented catalyst is more efficient at performing the crucial oxidation half of the reaction than any other existing material, minimising energy losses in the electricity-to-hydrogen conversion process. It can be directly applied to various electrode surfaces in a straightforward and highly economical manner. The process splits water into hydrogen and oxygen but, while the first part can be done quite efficiently, the latter was more difficult and lots of energy is lost. With this in mind the team designed a catalyst - a substance that alters the speed of the chemical reaction - to improve the efficiency. ‘Oxygen is the most difficult bit,’ Dr Hintermair explained. 
Their catalyst, placed on an electrode used in the production of hydrogen, is much more efficient - and although Dr Hintermair didn’t have an exact figure, he said the energy loss using it is ‘almost non-existent’. The major benefit from this breakthrough is that hydrogen could now be used more easily as a way to store energy from renewable sources like wind and solar. ‘We can make electricity out of sunlight and wind, low carbon renewable sources, but we can’t store it very well,’ Dr Hintermair continued. ‘We can put it in a battery but you can’t, for example, fly an airplane on a battery yet. ‘So we need to convert it into a chemical fuel, and for that water electrolysis is a key technology because we can take any renewable technology, connect it to an electrolyster and store it in hydrogen, which is a fantastic fuel.’ This, for example, would make hydrogen fuel cells for cars much more economical. On this right in this image is the catalyst being used in the water electrolysis process. The large bubbles are oxygen, while the smaller bubbles on the left are hydrogen. The team are in discussions with a number of energy companies about utilising this technology on a large scale and hope the breakthrough marks the start of contributing to providing the world with more sustainable fuels. ‘In theory it could be used on all systems, but it depends on cost and scale,’ said Dr Hintermair. As regulations tighten on the use of fossil fuels and their emissions, there is a growing focus on the need for cost effective and efficient ways of creating energy carriers from renewable sources. Solar power is thought to be able to provide up to four per cent of the UK's electricity by the end of the decade. However, while the price of photovoltaic technology has dramatically decreased in recent years as demand has risen, solar energy is problematic as it is intermittent, meaning electricity is only created when it is light. 
One use of the newly developed catalyst could be to store the energy produced by solar power by using the electricity to produce hydrogen which can then be used on demand, regardless of the time of day. Solar power is thought to be able to provide up to four per cent of the UK's electricity by the end of the decade (Wymeswold Solar Farm in Leicestershire, UK shown), but storing it is difficult. This new technology could store energy as hydrogen, which can then be used on demand. Dr Hintermair is a Whorrod research fellow at the Centre for Sustainable Chemical Technologies at the University of Bath. 'Hydrogen is a fantastically versatile and environmentally friendly fuel, however, hydrogen-powered applications are only as ""green"" as the hydrogen on which they run,' he said. 'Currently, over 90 per cent is derived from fossil fuels. If we want to bring about a clean hydrogen economy we must first generate clean hydrogen. 'This new molecular catalyst will hopefully play a large role in helping create hydrogen from renewable energy sources such as solar power. 'We are also interested in applying this technology to other forms of renewable energy such as tidal, wind and wave power.' Professor Matthew Davidson, head of the department of chemistry, added: 'Splitting water into its constituent parts is deceptively simple chemistry, but doing it in a sustainable way is one of the holy grails of chemistry because it is the key step in the goal of artificial photosynthesis. '[Dr Hintermair's] results are extremely exciting because of their potential for practical application.'","The University of Bath and Yale University have developed a new way to create hydrogen from water that leaves very little waste, which they claim has huge potential to provide green energy. This method would allow eco-friendly energy sources like wind to be stored and used as a chemical energy source."
"A young Syrian boy has revealed how he saw depraved Islamic State militants playing football with a severed head inside the besieged Yarmouk refugee camp. Amjad Yaaqub, 16, said he stumbled on the barbaric scene shortly after the terrorists beat him unconscious when they burst into his family home at the camp in the Syrian capital Damascus. The schoolboy said the ISIS fighters were looking for his brother, who is a member of the Palestinian rebel group who ran and defended the camp for several years before ISIS carried out a bloody assault that has left more than 200 people dead in just seven days. His story was revealed as refugees in Yarmouk spoke of the daily atrocities they have witnessed since ISIS seized control of 90 per cent of the camp, including innocent children being slaughtered in front of their anguished parents. Scroll down for video. Scene of death: A destroyed graveyard is photographed in Yarmouk camp following the intense fighting. Innocent: Palestinians, who fled the Yarmouk refugee camp, sit on mattresses inside a school in Damascus. After enduring two years of famine and fighting, Ibrahim Abdel Fatah said he saw heads cut off by ISIS in the Palestinian camp of Yarmouk. That was it. He fled and hasn't looked back. Unshaven, pale and gaunt, he has found refuge with his wife and seven children at the Zeinab al-Haliyeh school in Tadamun, a southeastern district of the Syrian capital held by the army. 'I saw severed heads. They killed children in front of their parents. We were terrorised,' he said. 'We had heard of their cruelty from the television, but when we saw it ourselves... I can tell you, their reputation is well-deserved,' the 55-year-old said. The school is currently home to 98 displaced people, among them 40 children, who have been put up in three classrooms. The usual occupants, schoolchildren, have been evacuated temporarily from rooms where mattresses and bedding now blanket the floor. 'I left my house which was the only thing I had. 
My family lived on rations supplied by UNRWA,' the United Nations agency that looks after Palestinian refugees, the former caretaker said. Destroyed: In late December 2012, Yarmouk - just four miles from central Damascus - became a battlefield between pro- and anti-government forces before a merciless siege began. A man stands on a staircase inside a demolished building inside the Yarmouk Palestinian refugee camp. Anwar Abdel Hadi, a Palestine Liberation Organisation official in Damascus, said 500 families, or about 2,500 people, fled Yarmouk before IS fighters attacked the camp last Wednesday. Before the assault, there were around 18,000 people in Yarmouk in a southern neighbourhood of the Syrian capital. Yarmouk was once a thriving district housing 160,000 Palestinian refugees and Syrians. But that was before it too was caught up in the widespread civil unrest which erupted in 2011. In late December 2012, Yarmouk - just four miles from central Damascus - became a battlefield between pro- and anti-government forces before a merciless siege began. The camp has been encircled for more than a year, but is now reported to be almost completely under the control of ISIS and Al Qaeda's local affiliate, Jabhat al-Nusra. Residents who fled the advancing jihadists last week have been put up in regime-held areas nearby. According to Britain-based monitor the Syrian Observatory for Human Rights, nearly 200 people had died in Yarmouk from malnutrition and lack of medicines before last Wednesday's assault. Carnage: Before the Islamic State assault, there were around 18,000 people in Yarmouk in a southern neighbourhood of the Syrian capital Damascus. Keeping the faith: A Palestinian man who fled the Yarmouk refugee camp prays inside a school in Damascus. Speaking of the moment he stumbled on the ISIS militants, 16-year-old Amjad said: 'In Palestine Street, I saw two members of Daesh playing with a severed head as if it was a football. 
Wearing a baseball cap sideways, rapper-style, the youth has a swollen eye and chin. 'Daesh came to my home looking for my brother who's in the Palestinian Popular Committees. They beat me until I passed out and left me for dead,' he added, referring to the group by an Arabic acronym. At the entrance to the school, Umm Usama chatted with fellow refugees who had got out. 'I left the camp despite myself,' said the 40-year-old woman who had lived in Yarmouk for 17 years. 'I'd stayed on despite the bombings and famine. It was terrible, we ate grass, but at least I was at home. 'Daesh's arrival meant destruction and massacre. Their behaviour's not human and their religion is not ours,' added the thin woman with sunken eyes. Rubble: Destruction in Yarmouk Palestinian refugee camp in the Syrian capital Damascus earlier this week. Palestinians demanding the protection of refugees in Yarmouk stage a demonstration in Gaza city on Monday. Men lie around on mattresses as women gather in small groups, smoking cigarettes and drinking fruit juice as children run around the room. 'Everything changed when IS arrived. Before that we didn't fear death, because if there was fighting, the rebels made sure the civilians got to shelters,' said Abir, a 47-year-old woman who was born and raised in Yarmouk. There are no suitcases to be seen in the classrooms -- the families had to leave so quickly there was no time to pack anything. 'I left without bringing any belongings. My husband wasn't able to join me. I walked out hugging the walls so snipers couldn't see me,' said 19-year-old Nadia, nursing her two-month-old baby. Yesterday ISIS launched English-language radio news bulletins on its al-Bayan radio network. The militant group's English bulletin, promoted via Twitter, accompanies Arabic and Russian bulletins already airing on the network. 
The first bulletin, which provided an overview of their activities in Iraq, Syria and Libya, discussed a range of topics including the alleged death of an ISIS commander in Yarmouk, a suicide bombing in the Iraqi city of Kirkuk and mortar attacks on militias in Sirte, Libya. ISIS holds territory in a third of Iraq and Syria and is becoming increasingly active in Libya. The group already publishes a monthly online English-language magazine, Dabiq, with religious lessons, plus news about its activities.","A refugee camp in Syria has been almost completely taken over by ISIS and its local affiliate. People in the camp report atrocities such as children being killed in front of their parents, and football being played with severed heads. Many in the camps have fled to other locations."
"The trial of a group of cult members in China who beat a woman to death at a McDonald's restaurant has opened in the city of Yantai in Shandong province. The woman, 37-year-old Wu Shuoyan, is alleged to have been killed last May simply for refusing to hand over her phone number to cult members. The murder, filmed on CCTV and on mobile phones, sparked outrage. The Church of the Almighty God cult is banned in China but claims to have millions of members. Following the brutal killing in May, Chinese authorities said that they detained hundreds of members of the cult, reports the BBC's Martin Patience in Beijing. Interviewed in prison later, one of the defendants, Zhang Lidong showed no remorse. He said: ""I beat her with all my might and stamped on her too. She was a demon. We had to destroy her."" The group had entered a small McDonalds branch in Zhaoyuan in Shandong province last May soliciting phone numbers and hoping to recruit members to their cult. Ms Wu was waiting in the restaurant with her seven-year-old son and when she refused to give her number, an act which prompted the beating while they screamed at other diners to keep away or they would face the same fate. The public face of the Church of the Almighty God is a website full of uplifting hymns and homilies. But its core belief is that God has returned to earth as a Chinese woman to wreak the apocalypse. The only person who claims direct contact with this god is a former physics teacher, Zhao Weishan, who founded the cult 25 years ago and has since fled to the United States, says BBC China Editor Carrie Gracie. No-one knows exactly where he is, but much of the website's message of outright hostility to the Chinese government is delivered in English as well as Chinese. The cult complains that religious faith has suffered from persecution by the Communist Party. 
Since the McDonald's murder, public outrage has forced the authorities to increase pressure on the Church of the Almighty God with almost daily arrests and raids.","Members of a banned cult called Church of the Almighty God have gone to trial for beating a woman to death. The cult has existed for 25 years, and claims God is reborn as a Chinese woman to start the apocalypse. Police have increased raids and arrests of cult members since the murder."


## Step 3: Compute the ROUGE Evaluation Metric

Next, we compute the ROUGE-N metric to understand how well our system summarizes generic text, using the benchmark dataset as the reference.

We can compute the ROUGE metric (among others) using MLflow's new LLM evaluation capabilities. MLflow LLM evaluation includes default collections of metrics for pre-selected tasks, e.g., "question-answering" or "text-summarization" (our case). Depending on the LLM use case that you are evaluating, these pre-defined collections can greatly simplify the process of running evaluations.

The `mlflow.evaluate` function accepts the following parameters for this use case:

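* A model: here, a Python function that takes a DataFrame of inputs and returns one output string per row
* Reference data for evaluation (our benchmark set)
* The name of the column with ground truth data (`targets`)
* The model/task type (e.g. `"text-summarization"`)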

**Note:** The `text-summarization` type will automatically compute ROUGE-related metrics. For some metrics, additional library installs will be needed – you can see the requirements in the printed output.

In [0]:
# A custom function to iterate through our eval DF
def query_iteration(inputs):
    answers = []

    for index, row in inputs.iterrows():
        completion = query_summary_system(row["inputs"])
        answers.append(completion)

    return answers

# Test query_iteration function – it needs to return a list of output strings
query_iteration(eval_data.head())

["Baltimore's mayor fired the city's police chief, replacing him with his deputy, citing that his leadership had become a distraction from addressing the 48% increase in homicides amid ongoing tensions following Freddie Gray's death in police custody.",
 "Western Sahara officials welcomed Morocco's readmission to the African Union after 32 years, viewing it as an opportunity to work together despite ongoing disputes over Western Sahara's independence and Morocco's control of two-thirds of the territory.",
 'England flanker James Haskell shared an Instagram photo of himself dressed as Iron Man to celebrate the release of Avengers: Age of Ultron.',
 'UK manufacturing activity contracted in April for the first time in three years due to soft domestic demand, falling overseas business, and uncertainty ahead of the EU referendum, potentially acting as a drag on the economy.',
 'A mother-of-six lost over seven stone after being mortified when a child on a bus asked if she was pregnant, inspi

In [0]:
import mlflow

# MLflow's `evaluate` with a custom function
results = mlflow.evaluate(
    query_iteration,                      # iterative function from above
    eval_data.head(50),                   # limiting for speed
    targets="writer_summary",             # column with expected or "good" output
    model_type="text-summarization"       # type of model or task
)


 - For traditional ML or deep learning models: Use `mlflow.models.evaluate`, which maintains full compatibility with the original `mlflow.evaluate` API.

 - For LLMs or GenAI applications: Use the new `mlflow.genai.evaluate` API, which offers enhanced features specifically designed for evaluating LLMs and GenAI applications.

2025/07/25 09:44:00 INFO mlflow.models.evaluation.evaluators.default: Computing model predictions.
2025/07/25 09:45:40 INFO mlflow.models.evaluation.default_evaluator: Testing metrics on first row...


We can view the results for individual records by subsetting the handy `.tables` object.

Notice all of the different versions of the ROUGE metric. These are calculated using the Hugging Face `evaluate` library, and the metrics are detailed [here](https://huggingface.co/spaces/evaluate-metric/rouge).

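In summary, each metric is described below:

* "rouge1": unigram (1-gram) based scoring
* "rouge2": bigram (2-gram) based scoring
* "rougeL": longest common subsequence based scoring
* "rougeLsum": longest common subsequence based scoring computed at the summary level, after splitting the text on "\n" (for single-line summaries it equals "rougeL", as in the table below)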
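To make the difference between these variants concrete, here is a minimal pure-Python sketch of ROUGE-1, ROUGE-2, and ROUGE-L computed as F-measures. It is an illustration only, not the Hugging Face implementation: it assumes lowercased whitespace tokenization with no stemming or punctuation handling, so its numbers will generally not match the library's scores exactly. The helper names (`rouge_n`, `rouge_l`) and the example sentences are hypothetical.

```python
from collections import Counter


def _tokens(text):
    # Assumption: simple lowercased whitespace tokenization (no stemming).
    return text.lower().split()


def _f_measure(matches, n_pred, n_ref):
    # Harmonic mean of precision (matches / n_pred) and recall (matches / n_ref).
    if matches == 0 or n_pred == 0 or n_ref == 0:
        return 0.0
    precision = matches / n_pred
    recall = matches / n_ref
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction, reference, n=1):
    """N-gram overlap F-measure (n=1 mirrors rouge1, n=2 mirrors rouge2)."""
    pred, ref = _tokens(prediction), _tokens(reference)
    pred_ngrams = Counter(tuple(pred[i:i + n]) for i in range(len(pred) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((pred_ngrams & ref_ngrams).values())
    return _f_measure(overlap, sum(pred_ngrams.values()), sum(ref_ngrams.values()))


def rouge_l(prediction, reference):
    """Longest-common-subsequence F-measure (mirrors rougeL)."""
    pred, ref = _tokens(prediction), _tokens(reference)
    # Dynamic programming over LCS lengths, keeping only the previous row.
    prev = [0] * (len(ref) + 1)
    for p_tok in pred:
        curr = [0]
        for j, r_tok in enumerate(ref, start=1):
            if p_tok == r_tok:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return _f_measure(prev[-1], len(pred), len(ref))


# Hypothetical example pair, not taken from the benchmark set.
prediction = "the cat sat on the mat"
reference = "the cat lay on the mat"

print(round(rouge_n(prediction, reference, n=1), 3))  # 5 of 6 unigrams match
print(round(rouge_n(prediction, reference, n=2), 3))  # 3 of 5 bigrams match
print(round(rouge_l(prediction, reference), 3))       # LCS has 5 tokens
```

Because a single changed word breaks two bigrams but only one unigram, the bigram score drops faster than the unigram score, which is the same pattern visible in the `rouge1` and `rouge2` columns of the results table.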

In [0]:
display(results.tables["eval_results_table"].head(10))


inputs,writer_summary,outputs,token_count,flesch_kincaid_grade_level/v1/score,ari_grade_level/v1/score,rouge1/v1/score,rouge2/v1/score,rougeL/v1/score,rougeLsum/v1/score
"Baltimore's mayor has sacked the US city's police chief, saying his leadership had become a distraction from fighting a ""crime surge"". Mayor Stephanie Rawlings-Blake said she was replacing Police Commissioner Anthony Batts with his deputy, Kevin Davis, for an interim period. The city was rocked by riots in April when a black man died after suffering injuries in police custody. Six officers were charged over the death of the 25-year-old, Freddie Gray. Speaking at a news conference on Wednesday, Mayor Rawlings-Blake said Mr Batts had ""served this city with distinction"" since becoming police chief in October 2012. But referring to the city's high homicide rate, she said ""too many continue to die"". ""The focus has been too much on the leadership of the department and not enough on the crime fighting,"" she told reporters, adding: ""We need to get the crime surge under control."" The city has seen a sharp increase in violence since Freddie Gray's death on 19 April, with 155 homicides this year, a 48% increase over the same period last year. On Tuesday, the police department announced that an outside organisation will review its response to the civil unrest that followed Mr Gray's death. The US justice department is also conducting a civil rights review of the Baltimore force and Mr Batts has been criticised by the city's police union. Earlier on Wednesday, the union released its report into the police handling of the rioting. It said officers had complained ""that they lacked basic riot equipment, training, and, as events unfolded, direction from leadership"". The report also said ""officers repeatedly expressed concern that the passive response to the civil unrest had allowed the disorder to grow into full scale rioting"". Recent events had ""placed attention on police leadership"", Ms Rawlings-Blake said, but denied her decision was influenced by the union report. 
Mr Davis, who is taking over immediately as interim police chief, praised his ""friend"" Mr Batts and said he was a ""true reform commissioner"". Mayor Rawlings-Blake said Mr Davis would ""bring accountability to police, hold officers who act out of line accountable for their actions"".","The mayor of Baltimore fired the police chief and replaced him with his deputy. According to the mayor, crime in the city was unacceptable. Riots in the city after a man died in police custody and a surge in homicide rates were cited as reasons for the firing.","Baltimore's mayor fired the city's police chief, citing his leadership as a distraction from addressing the 48% increase in homicides, and appointed his deputy as interim commissioner to focus on crime fighting rather than departmental leadership.",45,21.0666666667,24.3066666667,0.3720930233,0.0952380952,0.2325581395,0.2325581395
"Western Sahara has welcomed Morocco's readmission to the African Union, 32 years after members refused to withdraw support for the territory's independence. It was a ""good opportunity"" and ""a chance to work together,"" a top Western Sahara official told the BBC. Morocco controls two-thirds of Western Sahara and sees it as part of its historic territory. However some, including the UN, see Western Sahara as Africa's last colony. Africa Live: More on this and other stories Find out more about Western Sahara A referendum was promised in 1991 but never carried out due to wrangling over who was eligible to vote. Thousands of Sahrawi refugees still live in refugee camps in Algeria - some have been there for 40 years. It is not clear what happens next but Western Sahara is hopeful that a committee set up by the AU will address the issues that both sides have raised. Some AU delegates said that it would be easier to resolve the issue with Morocco inside the AU. Sidi Mohammed, a Western Sahara official, told the BBC that Morocco's return to the AU means that it would now be expected to put ""in practice decisions taken by the AU with regard to a referendum in Western Sahara"". Mr Mohammed dismissed the suggestion that Morocco would now seek to get the AU to change its position, saying that the no country could unilaterally change the AU fundamental agreement, saying it opposed colonisation. In his speech at the AU summit, King Mohammed VI of Morocco said the readmission was not meant to divide the continental body. No. Algeria has always been a big supporter of Western Sahara's Polisario Front and it had wanted Morocco to accept independence of the territory as a condition for readmission. Zimbabwe and South Africa were also supportive of this stance but they were outnumbered by those who wanted Morocco back in the fold. There is no specific provision in the AU charter that bars any country from joining it. 
Morocco simply applied and the request was accepted by more than two-thirds of the 53 members. Morocco has been involved in intense lobbying and applied in July last year to rejoin the continental body. King Mohammed toured various African countries seeking support for the bid. No. While culturally the country's identity aligns with Arab states, its economic interests increasingly lie in Africa. This is a strategic move to continue exploring its interests in mining, construction, medical, insurance and banking sectors on the continent. Moroccan troops went into Western Sahara after Spain withdrew in 1975. Kitesurfing in a danger zone Inside world's most remote film festival Profile: African Union",Morocco joined the African Union after a referendum was promised over 30 years ago. The readmission was done in an effort to unite the continental body.,"Western Sahara officials welcomed Morocco's readmission to the African Union after 32 years, viewing it as an opportunity to work together despite ongoing disputes over Western Sahara's independence and Morocco's control of two-thirds of the territory.",44,23.0333333333,24.9608333333,0.3333333333,0.09375,0.2727272727,0.2727272727
"With the new Avengers: Age of Ultron movie released this week, James Haskell showed off his inner Iron Man in a serious-looking Instagram post. The highly-anticipated movie premiered at Westfield London shopping centre on Tuesday evening with fans queuing up to see the A-list cast which includes Robert Downey Jr., Chris Hemsworth and Scarlett Johansson. And the London Wasps captain joined in on the hype as he posted the photo dressed as Downey Jr.'s character Iron Man. England flanker James Haskell dressed in Iron Man costume and posted it on his Instagram page. The London Wasps captain (middle) returned to the club where he started his career for the 2012 season. Haskell posted the image on Thursday along with the message: 'Avengers movie is out so thought i would release the inner Iron Man. @UnderArmourUK #TransformYourself #IWILL #AvengersAgeOfUltron.' The flanker returned to Wasps for the 2012 season after spells with Stade Francais, Ricoh Black Rams and Highlanders in New Zealand. Windsor-born Haskell first joined Wasps in 2002, playing eight seasons for the club and winning his first England cap five years later. But in 2009, he moved to Stade Francais in France and spent two seasons in the French capital before he made the move to Tokyo with the Ricoh Black Rams following the unsuccessful 2011 World Cup. Four months in Japan with the Rams and Haskell was on the move again when he switched to New Zealand to join the Highlanders. However, he made only 12 appearances and returned to England in 2012. Since returning to Wasps, Haskell has surpassed the 100 appearances mark for the club and has become a big part of the England squad with 57 caps to-date. Haskell has played his rugby in France, Japan and New Zealand after leaving the Wasps in 2009. Since his return to England, Haskell has enjoyed his rugby and surpassed the 100 appearance mark for Wasps.","James Haskell is a rugby player for the London Wasps. 
He has an extensive career with playing for other teams in France, Japan, and New Zealand. To celebrate the release of the new Avengers: Age of Ultron movie, Haskell posts an Instagram picture of him dress up as Iron Man.",England flanker James Haskell embraced his inner Iron Man by posting a photo of himself in costume on Instagram to celebrate the release of Avengers: Age of Ultron.,32,13.4514285714,15.6153571429,0.4358974359,0.2368421053,0.3333333333,0.3333333333
"UK manufacturing activity contracted in April for the first time in three years, a survey has indicated, adding to fears over the economy's strength. The Markit/CIPS manufacturing Purchasing Managers' Index fell to 49.2 from 50.7 in March. A reading below 50 indicates falling output. It is the first time that activity in the sector has fallen since March 2013. Firms blamed soft domestic demand, a fall in new business from overseas and uncertainty ahead of the EU referendum. A slowdown in the oil and gas industry, a major customer for UK companies, is also hitting production. The index for new orders fell to 50.4 in April, from 51.9 the month before, matching February's three-year low. Rob Dobson, senior economist at Markit, said: ""On this evidence manufacturing production is now falling at a quarterly pace of around 1%, and will likely act as a drag on the economy again during the second quarter and putting greater pressure on the service sector to sustain GDP growth. ""The manufacturing labour market is also being impacted, with the data signalling close to 20,000 job losses over the past three months."" Last week, official figures showed UK economic growth slowed to 0.4% in the first quarter of the year from 0.6% in late 2015, propped up by the services sector. David Noble, group chief executive at the Chartered Institute of Procurement and Supply (CIPS), said: ""Recent fears over a stall in the UK's manufacturing sector have now become a reality. ""An atmosphere of deep unease is building throughout the manufacturing supply chain, eating away at new orders, reducing British exports and putting more jobs at risk. ""A sense of apprehension across the sector is being caused by enduring volatility in the oil and gas industry, falling retailer confidence and the uncertainty created by the EU referendum."" The Markit/CIPS survey found new export orders contracted for the fourth straight month in April as the global economy continued to slow. 
A measure of employment in the manufacturing sector was also below the 50 mark for its fourth straight month. Lee Hopley, chief economist at the manufacturers' organisation, EEF, said: ""The sharp drop to a three-year low and another month of reported job cuts could be the clearest sign yet that referendum uncertainty is starting to weigh on the real economy. ""However, this is just another straw on the back of a sector already grappling with the struggling oil and gas sector, softening domestic demand and weak order outlook from other parts of the world, all of which are failing to provide any counterbalance to the political uncertainty at home.""","Concerns over UK manufacturing activity have become more tangible, with activity falling in April for the first time since March of 2013. The culprits of this economic woe are noted to be soft domestic demand, a fall in new international business, and also uncertainty ahead of the EU referendum.","UK manufacturing activity contracted in April for the first time in three years due to soft domestic demand, falling overseas business, and uncertainty ahead of the EU referendum, potentially acting as a drag on the economy.",40,21.0666666667,21.2975,0.5882352941,0.3614457831,0.4941176471,0.4941176471
"An obese mother who enjoyed takeaways and boozy nights out has lost more than seven stone after a child on a bus pointed at her and asked whether she was pregnant. Lizzi Crawford, 32, tipped the scales at 20 stone when she overheard the young bus passenger ask his mum: 'Has she got a baby in her belly?' The embarrassing remark left the mother-of-six, from Stoke-on-Trent, mortified but inspired her to ditch her unhealthy lifestyle and shed the pounds, slimming down to a healthier 12.5st. Lizzi Crawford dropped over 7st after a stranger mistook her for being pregnant. Lizzi had reached a size 24 dress after living on a diet of burgers, pizzas and kebabs. But she also devoured liquid calories in the form of wine and spirits. However Lizzi never realised how big she had got until she heard the pregnancy remark on the bus. She said: 'It started when I was taking my kids to school and we were sitting on the bus. A kid then looked at me and said: ""Has she got a baby in her belly?""' Lizzi now admits that she was living an extremely unhealthy lifestyle. She continued: 'It was terrible. I was being a slob to be honest. Lizzi piled on the pounds thanks to a boozy lifestyle and diet of takeaways, pizzas and kebabs. As well as her battle to lose weight Lizzi also won her battle against cervical cancer. 'I was eating burgers, takeaways, pizzas, kebabs and drinking - mainly wine and spirits mixed with Dr Pepper. 'I knew I had to do something about my weight for the sake of my children.' In a serious bid to slim down Lizzi began cooking healthier meals and joining fitness and self-defense classes which saw her lose over 7st. Incredibly, she achieved her goal despite suffering the set-back of being diagnosed with cervical cancer in October 2012. As well as winning her fight against the disease following a hysterectomy and cancer treatment she has now won her battle against the bulge. 
Lizzi, who works a cleaner, kept fit by attending self-defense classes at T6 Fight Club in Burslem, Stoke-on-Trent. She also discovered Hourglass training - a fitness programme designed to keep a woman's curves while she gets healthy. Lizzi says that having support from other women at her gym helped her to achieve her goal. Lizzi joined the gym and began hourglass training and says that she is now addicted to fitness. 'I began to build relationships with the people at the gym. The girls were egging me on to eat well - they all cheer each other on. 'I've got some of the best friends I've ever made there. They don't look down their noses at you and you're always made to feel welcome.' The slimmer says she is now 'addicted' to her fitness classes and goes five times every week. She added that losing the weight has helped her mental being as well as her physical being. 'I can do a lot more things now. I can walk more places and do more with the kids - I can even do simple tasks like getting up and down the stairs easier now. 'It's helped me mentally because it was depressing when I was heavy, but since I started Hourglass, that has just gone. 'The weight loss has helped me in the workplace too. I find I can get around much quicker and finish earlier - it used to take me ages. 'Now I can spend more time with the kids.' Lizzi's mum, Mary Crawford, 57, said her daughter's slimming efforts had been 'amazing'. She added: 'She's found out about cooking the right way, she's been going to the gym and riding bikes. 'She's stuck at it and I think it's amazing what she's done. I'm really pleased, and I believe she will keep it off as she's found a routine that suits her. 'She's like the old Lizzi I used to know as a little girl.'","Lizzi, an obese mother of six, wouldn't have realized how big she got until a kid on the bus mistook her for being pregnant. She lived on a junk food booze diet, but after hearing that comment, Lizzi was inspired to lose weight. 
Since then, both her physical and mental health has significantly improved.","A mother-of-six lost over seven stone after being mortified when a child on a bus asked if she was pregnant, inspiring her to replace her diet of takeaways and alcohol with healthy meals and regular fitness classes despite battling cervical cancer along the way.",50,21.1472727273,24.0129545455,0.3564356436,0.0606060606,0.2178217822,0.2178217822
"A young Syrian boy has revealed how he saw depraved Islamic State militants playing football with a severed head inside the besieged Yarmouk refugee camp. Amjad Yaaqub, 16, said he stumbled on the barbaric scene shortly after the terrorists beat him unconscious when they burst into his family home at the camp in the Syrian capital Damascus. The schoolboy said the ISIS fighters were looking for his brother, who is a member of the Palestinian rebel group who ran and defended the camp for several years before ISIS carried out a bloody assault that has left more than 200 people dead in just seven days. His story was revealed as refugees in Yarmouk spoke of the daily atrocities they have witnessed since ISIS seized control of 90 per cent of the camp, including innocent children being slaughtered in front of their anguished parents. Scroll down for video. Scene of death: A destroyed graveyard is photographed in Yarmouk camp following the intense fighting. Innocent: Palestinians, who fled the Yarmouk refugee camp, sit on mattresses inside a school in Damascus. After enduring two years of famine and fighting, Ibrahim Abdel Fatah said he saw heads cut off by ISIS in the Palestinian camp of Yarmouk. That was it. He fled and hasn't looked back. Unshaven, pale and gaunt, he has found refuge with his wife and seven children at the Zeinab al-Haliyeh school in Tadamun, a southeastern district of the Syrian capital held by the army. 'I saw severed heads. They killed children in front of their parents. We were terrorised,' he said. 'We had heard of their cruelty from the television, but when we saw it ourselves... I can tell you, their reputation is well-deserved,' the 55-year-old said. The school is currently home to 98 displaced people, among them 40 children, who have been put up in three classrooms. The usual occupants, schoolchildren, have been evacuated temporarily from rooms where mattresses and bedding now blanket the floor. 'I left my house which was the only thing I had. 
My family lived on rations supplied by UNRWA,' the United Nations agency that looks after Palestinian refugees, the former caretaker said. Destroyed: In late December 2012, Yarmouk - just four miles from central Damascus - became a battlefield between pro- and anti-government forces before a merciless siege began. A man stands on a staircase inside a demolished building inside the Yarmouk Palestinian refugee camp. Anwar Abdel Hadi, a Palestine Liberation Organisation official in Damascus, said 500 families, or about 2,500 people, fled Yarmouk before IS fighters attacked the camp last Wednesday. Before the assault, there were around 18,000 people in Yarmouk in a southern neighbourhood of the Syrian capital. Yarmouk was once a thriving district housing 160,000 Palestinian refugees and Syrians. But that was before it too was caught up in the widespread civil unrest which erupted in 2011. In late December 2012, Yarmouk - just four miles from central Damascus - became a battlefield between pro- and anti-government forces before a merciless siege began. The camp has been encircled for more than a year, but is now reported to be almost completely under the control of ISIS and Al Qaeda's local affiliate, Jabhat al-Nusra. Residents who fled the advancing jihadists last week have been put up in regime-held areas nearby. According to Britain-based monitor the Syrian Observatory for Human Rights, nearly 200 people had died in Yarmouk from malnutrition and lack of medicines before last Wednesday's assault. Carnage: Before the Islamic State assault, there were around 18,000 people in Yarmouk in a southern neighbourhood of the Syrian capital Damascus. Keeping the faith: A Palestinian man who fled the Yarmouk refugee camp prays inside a school in Damascus. Speaking of the moment he stumbled on the ISIS militants, 16-year-old Amjad said: 'In Palestine Street, I saw two members of Daesh playing with a severed head as if it was a football. 
Wearing a baseball cap sideways, rapper-style, the youth has a swollen eye and chin. 'Daesh came to my home looking for my brother who's in the Palestinian Popular Committees. They beat me until I passed out and left me for dead,' he added, referring to the group by an Arabic acronym. At the entrance to the school, Umm Usama chatted with fellow refugees who had got out. 'I left the camp despite myself,' said the 40-year-old woman who had lived in Yarmouk for 17 years. 'I'd stayed on despite the bombings and famine. It was terrible, we ate grass, but at least I was at home. 'Daesh's arrival meant destruction and massacre. Their behaviour's not human and their religion is not ours,' added the thin woman with sunken eyes. Rubble: Destruction in Yarmouk Palestinian refugee camp in the Syrian capital Damascus earlier this week. Palestinians demanding the protection of refugees in Yarmouk stage a demonstration in Gaza city on Monday. Men lie around on mattresses as women gather in small groups, smoking cigarettes and drinking fruit juice as children run around the room. 'Everything changed when IS arrived. Before that we didn't fear death, because if there was fighting, the rebels made sure the civilians got to shelters,' said Abir, a 47-year-old woman who was born and raised in Yarmouk. There are no suitcases to be seen in the classrooms -- the families had to leave so quickly there was no time to pack anything. 'I left without bringing any belongings. My husband wasn't able to join me. I walked out hugging the walls so snipers couldn't see me,' said 19-year-old Nadia, nursing her two-month-old baby. Yesterday ISIS launched English-language radio news bulletins on its al-Bayan radio network. The militant group's English bulletin, promoted via Twitter, accompanies Arabic and Russian bulletins already airing on the network. 
The first bulletin, which provided an overview of their activities in Iraq, Syria and Libya, discussed a range of topics including the alleged death of an ISIS commander in Yarmouk, a suicide bombing in the Iraqi city of Kirkuk and mortar attacks on militias in Sirte, Libya. ISIS holds territory in a third of Iraq and Syria and is becoming increasingly active in Libya. The group already publishes a monthly online English-language magazine, Dabiq, with religious lessons, plus news about its activities.","A refugee camp in Syria has been almost completely taken over by ISIS and its local affiliate. People in the camp report atrocities such as children being killed in front of their parents, and football being played with severed heads. Many in the camps have fled to other locations.","A 16-year-old Syrian refugee witnessed ISIS militants playing football with a severed head in the Yarmouk refugee camp, where the terrorist group has taken control, leading to atrocities, mass displacement, and over 200 deaths in just one week.",50,18.4826315789,23.2271052632,0.3820224719,0.0459770115,0.202247191,0.202247191
"A call to deselect a UKIP member of the Welsh assembly has been rejected by the party's ruling body. A letter sent by party activists in north Wales claimed Michelle Brown has been ""abrasive and discourteous"" to them. It was sent to UKIP's national executive committee (NEC) before a row over racial slurs about a Labour MP, for which Ms Brown apologised. But UKIP chairman Paul Oakden said the letter did not follow proper process. A UKIP assembly group spokesman said the letter was written by a group with a ""long-standing grudge"" against the AM. The ruling NEC body discussed the issue at a meeting on Friday, where they also decided to allow a controversial anti-Islam campaigner to run for the UKIP leadership. Mr Oakden said: ""A member of the NEC had contacted the person that is putting this forward and said to them they need to follow the proper process of completing the necessary forms and submitting them to the NEC. ""Members simply emailing the NEC saying we want you to do this is not the correct disciplinary process for the party, by any stretch of the imagination. ""A member of the NEC has gone back and given advice on what they need to do."" Shaun Owen, secretary of UKIP's Delyn branch, wrote to the NEC saying: ""For some time we have been appalled by the abrasive and discourteous manner of Ms Brown towards UKIP locally. ""Her lack of effort in pursuing the aims of the party both locally and nationally is of concern to members across the region."" Mr Owen added he believed members would stop supporting UKIP if Ms Brown remained in the role. However, a spokesman for the party's assembly group dismissed the letter as written by a ""tiny and insignificant group"". In February, Ms Brown denied an allegation she had smoked recreational drugs in a hotel room. 
Later that month, she said she had acted ""with propriety"" after it was revealed she had discussed how an advert for a job in her assembly office could be changed to help her brother get an interview for the post. Meanwhile, UKIP's NEC confirmed that 11 hopefuls in the contest to succeed Paul Nuttall as leader will be able to run as candidates. The list includes Anne Marie Waters, the founder of the Sharia Watch pressure group, who has described Islam as evil. UKIP AM David Rowlands had said Ms Waters is probably ""too extreme"" to be allowed to stand but she claimed the party was trying to ""ostracise"" her. Other candidates who have also cleared the NEC's vetting process and are going forward to a vote of the membership include Welsh activist John Rees-Evans, London Assembly member Peter Whittle and Scottish MEP David Coburn. Mr Nuttall resigned after the general election in June when the party failed to win any seats and saw its vote plummet.","UKIP Welsh assembly member, Michelle Brown, is under scrutiny. While the call to have her deselected has been shut down, it does highlight relevant concerns about Ms. Brown's ability to fulfill her post. Accusations of recreational drug use, nepotism, and the use of racist language are the most relevant concerns brought forward thus far.",UKIP's ruling body rejected a call to deselect Welsh assembly member Michelle Brown despite complaints from party activists about her behavior.,24,15.0761904762,16.6571428571,0.2857142857,0.1333333333,0.2077922078,0.2077922078
"A breakthrough has been made in the development of clean hydrogen power, scientists claim. At the moment, while hydrogen fuel is appealing, the production of hydrogen is incredible difficult - requiring huge amounts of energy. But the researchers say they have made a new material that can generate hydrogen from water, meaning it is less reliant on fossil fuels. Hydrogen-fuel is appealing for use in cars like the Vauxhall Zafira minivan pictured, but producing hydrogen requires huge amounts of energy. With the new breakthrough, it could be possible to make it more easily. Researchers at the University of Bath and Yale University created the invention. It uses a newly designed molecular catalyst to split water in an electrolyser and create clean and storable hydrogen fuel. Lead research Dr Ulrich Hintermair told MailOnline that the main problem with the production of hydrogen through a process known as water electrolysis was the waste oxygen it produces. Water splitting is an electro-chemical process in which two electrodes generate oxygen and hydrogen from water, respectively. The energy required to drive this process gets locked up in the hydrogen as the fuel with oxygen as a by-product. A fuel cell can then harness the energy again elsewhere by recombining the two. The new patented catalyst is more efficient at performing the crucial oxidation half of the reaction than any other existing material, minimising energy losses in the electricity-to-hydrogen conversion process. It can be directly applied to various electrode surfaces in a straightforward and highly economical manner. The process splits water into hydrogen and oxygen but, while the first part can be done quite efficiently, the latter was more difficult and lots of energy is lost. With this in mind the team designed a catalyst - a substance that alters the speed of the chemical reaction - to improve the efficiency. ‘Oxygen is the most difficult bit,’ Dr Hintermair explained. 
Their catalyst, placed on an electrode used in the production of hydrogen, is much more efficient - and although Dr Hintermair didn’t have an exact figure, he said the energy loss using it is ‘almost non-existent’. The major benefit from this breakthrough is that hydrogen could now be used more easily as a way to store energy from renewable sources like wind and solar. ‘We can make electricity out of sunlight and wind, low carbon renewable sources, but we can’t store it very well,’ Dr Hintermair continued. ‘We can put it in a battery but you can’t, for example, fly an airplane on a battery yet. ‘So we need to convert it into a chemical fuel, and for that water electrolysis is a key technology because we can take any renewable technology, connect it to an electrolyster and store it in hydrogen, which is a fantastic fuel.’ This, for example, would make hydrogen fuel cells for cars much more economical. On this right in this image is the catalyst being used in the water electrolysis process. The large bubbles are oxygen, while the smaller bubbles on the left are hydrogen. The team are in discussions with a number of energy companies about utilising this technology on a large scale and hope the breakthrough marks the start of contributing to providing the world with more sustainable fuels. ‘In theory it could be used on all systems, but it depends on cost and scale,’ said Dr Hintermair. As regulations tighten on the use of fossil fuels and their emissions, there is a growing focus on the need for cost effective and efficient ways of creating energy carriers from renewable sources. Solar power is thought to be able to provide up to four per cent of the UK's electricity by the end of the decade. However, while the price of photovoltaic technology has dramatically decreased in recent years as demand has risen, solar energy is problematic as it is intermittent, meaning electricity is only created when it is light. 
One use of the newly developed catalyst could be to store the energy produced by solar power by using the electricity to produce hydrogen which can then be used on demand, regardless of the time of day. Solar power is thought to be able to provide up to four per cent of the UK's electricity by the end of the decade (Wymeswold Solar Farm in Leicestershire, UK shown), but storing it is difficult. This new technology could store energy as hydrogen, which can then be used on demand. Dr Hintermair is a Whorrod research fellow at the Centre for Sustainable Chemical Technologies at the University of Bath. 'Hydrogen is a fantastically versatile and environmentally friendly fuel, however, hydrogen-powered applications are only as ""green"" as the hydrogen on which they run,' he said. 'Currently, over 90 per cent is derived from fossil fuels. If we want to bring about a clean hydrogen economy we must first generate clean hydrogen. 'This new molecular catalyst will hopefully play a large role in helping create hydrogen from renewable energy sources such as solar power. 'We are also interested in applying this technology to other forms of renewable energy such as tidal, wind and wave power.' Professor Matthew Davidson, head of the department of chemistry, added: 'Splitting water into its constituent parts is deceptively simple chemistry, but doing it in a sustainable way is one of the holy grails of chemistry because it is the key step in the goal of artificial photosynthesis. '[Dr Hintermair's] results are extremely exciting because of their potential for practical application.'","The University of Bath and Yale University have developed a new way to create hydrogen from water that leaves very little waste, which they claim has huge potential to provide green energy. 
This method would allow eco-friendly energy sources like wind to be stored and used as a chemical energy source.","Scientists at the University of Bath and Yale University have developed a new molecular catalyst that significantly improves the efficiency of hydrogen production from water, potentially enabling more economical clean energy storage from renewable sources.",37,25.7057142857,25.9448571429,0.3908045977,0.2588235294,0.367816092,0.367816092
"A young Syrian boy has revealed how he saw depraved Islamic State militants playing football with a severed head inside the besieged Yarmouk refugee camp. Amjad Yaaqub, 16, said he stumbled on the barbaric scene shortly after the terrorists beat him unconscious when they burst into his family home at the camp in the Syrian capital Damascus. The schoolboy said the ISIS fighters were looking for his brother, who is a member of the Palestinian rebel group who ran and defended the camp for several years before ISIS carried out a bloody assault that has left more than 200 people dead in just seven days. His story was revealed as refugees in Yarmouk spoke of the daily atrocities they have witnessed since ISIS seized control of 90 per cent of the camp, including innocent children being slaughtered in front of their anguished parents. Scroll down for video. Scene of death: A destroyed graveyard is photographed in Yarmouk camp following the intense fighting. Innocent: Palestinians, who fled the Yarmouk refugee camp, sit on mattresses inside a school in Damascus. After enduring two years of famine and fighting, Ibrahim Abdel Fatah said he saw heads cut off by ISIS in the Palestinian camp of Yarmouk. That was it. He fled and hasn't looked back. Unshaven, pale and gaunt, he has found refuge with his wife and seven children at the Zeinab al-Haliyeh school in Tadamun, a southeastern district of the Syrian capital held by the army. 'I saw severed heads. They killed children in front of their parents. We were terrorised,' he said. 'We had heard of their cruelty from the television, but when we saw it ourselves... I can tell you, their reputation is well-deserved,' the 55-year-old said. The school is currently home to 98 displaced people, among them 40 children, who have been put up in three classrooms. The usual occupants, schoolchildren, have been evacuated temporarily from rooms where mattresses and bedding now blanket the floor. 'I left my house which was the only thing I had. 
My family lived on rations supplied by UNRWA,' the United Nations agency that looks after Palestinian refugees, the former caretaker said. Destroyed: In late December 2012, Yarmouk - just four miles from central Damascus - became a battlefield between pro- and anti-government forces before a merciless siege began. A man stands on a staircase inside a demolished building inside the Yarmouk Palestinian refugee camp. Anwar Abdel Hadi, a Palestine Liberation Organisation official in Damascus, said 500 families, or about 2,500 people, fled Yarmouk before IS fighters attacked the camp last Wednesday. Before the assault, there were around 18,000 people in Yarmouk in a southern neighbourhood of the Syrian capital. Yarmouk was once a thriving district housing 160,000 Palestinian refugees and Syrians. But that was before it too was caught up in the widespread civil unrest which erupted in 2011. In late December 2012, Yarmouk - just four miles from central Damascus - became a battlefield between pro- and anti-government forces before a merciless siege began. The camp has been encircled for more than a year, but is now reported to be almost completely under the control of ISIS and Al Qaeda's local affiliate, Jabhat al-Nusra. Residents who fled the advancing jihadists last week have been put up in regime-held areas nearby. According to Britain-based monitor the Syrian Observatory for Human Rights, nearly 200 people had died in Yarmouk from malnutrition and lack of medicines before last Wednesday's assault. Carnage: Before the Islamic State assault, there were around 18,000 people in Yarmouk in a southern neighbourhood of the Syrian capital Damascus. Keeping the faith: A Palestinian man who fled the Yarmouk refugee camp prays inside a school in Damascus. Speaking of the moment he stumbled on the ISIS militants, 16-year-old Amjad said: 'In Palestine Street, I saw two members of Daesh playing with a severed head as if it was a football. 
Wearing a baseball cap sideways, rapper-style, the youth has a swollen eye and chin. 'Daesh came to my home looking for my brother who's in the Palestinian Popular Committees. They beat me until I passed out and left me for dead,' he added, referring to the group by an Arabic acronym. At the entrance to the school, Umm Usama chatted with fellow refugees who had got out. 'I left the camp despite myself,' said the 40-year-old woman who had lived in Yarmouk for 17 years. 'I'd stayed on despite the bombings and famine. It was terrible, we ate grass, but at least I was at home. 'Daesh's arrival meant destruction and massacre. Their behaviour's not human and their religion is not ours,' added the thin woman with sunken eyes. Rubble: Destruction in Yarmouk Palestinian refugee camp in the Syrian capital Damascus earlier this week. Palestinians demanding the protection of refugees in Yarmouk stage a demonstration in Gaza city on Monday. Men lie around on mattresses as women gather in small groups, smoking cigarettes and drinking fruit juice as children run around the room. 'Everything changed when IS arrived. Before that we didn't fear death, because if there was fighting, the rebels made sure the civilians got to shelters,' said Abir, a 47-year-old woman who was born and raised in Yarmouk. There are no suitcases to be seen in the classrooms -- the families had to leave so quickly there was no time to pack anything. 'I left without bringing any belongings. My husband wasn't able to join me. I walked out hugging the walls so snipers couldn't see me,' said 19-year-old Nadia, nursing her two-month-old baby. Yesterday ISIS launched English-language radio news bulletins on its al-Bayan radio network. The militant group's English bulletin, promoted via Twitter, accompanies Arabic and Russian bulletins already airing on the network. 
The first bulletin, which provided an overview of their activities in Iraq, Syria and Libya, discussed a range of topics including the alleged death of an ISIS commander in Yarmouk, a suicide bombing in the Iraqi city of Kirkuk and mortar attacks on militias in Sirte, Libya. ISIS holds territory in a third of Iraq and Syria and is becoming increasingly active in Libya. The group already publishes a monthly online English-language magazine, Dabiq, with religious lessons, plus news about its activities.","A refugee camp in Syria has been almost completely taken over by ISIS and its local affiliate. People in the camp report atrocities such as children being killed in front of their parents, and football being played with severed heads. Many in the camps have fled to other locations.","A 16-year-old Syrian refugee witnessed ISIS militants playing football with a severed head in the Yarmouk refugee camp after they beat him unconscious while searching for his brother, amid reports of atrocities including children being killed in front of their parents.",49,20.2585365854,25.3770731707,0.3913043478,0.2,0.3260869565,0.3260869565
"The trial of a group of cult members in China who beat a woman to death at a McDonald's restaurant has opened in the city of Yantai in Shandong province. The woman, 37-year-old Wu Shuoyan, is alleged to have been killed last May simply for refusing to hand over her phone number to cult members. The murder, filmed on CCTV and on mobile phones, sparked outrage. The Church of the Almighty God cult is banned in China but claims to have millions of members. Following the brutal killing in May, Chinese authorities said that they detained hundreds of members of the cult, reports the BBC's Martin Patience in Beijing. Interviewed in prison later, one of the defendants, Zhang Lidong showed no remorse. He said: ""I beat her with all my might and stamped on her too. She was a demon. We had to destroy her."" The group had entered a small McDonalds branch in Zhaoyuan in Shandong province last May soliciting phone numbers and hoping to recruit members to their cult. Ms Wu was waiting in the restaurant with her seven-year-old son and when she refused to give her number, an act which prompted the beating while they screamed at other diners to keep away or they would face the same fate. The public face of the Church of the Almighty God is a website full of uplifting hymns and homilies. But its core belief is that God has returned to earth as a Chinese woman to wreak the apocalypse. The only person who claims direct contact with this god is a former physics teacher, Zhao Weishan, who founded the cult 25 years ago and has since fled to the United States, says BBC China Editor Carrie Gracie. No-one knows exactly where he is, but much of the website's message of outright hostility to the Chinese government is delivered in English as well as Chinese. The cult complains that religious faith has suffered from persecution by the Communist Party. 
Since the McDonald's murder, public outrage has forced the authorities to increase pressure on the Church of the Almighty God with almost daily arrests and raids.","Members of a banned cult called Church of the Almighty God have gone to trial for beating a woman to death. The cult has existed for 25 years, and claims God is reborn as a Chinese woman to start the apocalypse. Police have increased raids and arrests of cult members since the murder.","The trial of cult members who beat a woman to death at a McDonald's in China for refusing to give her phone number has begun, sparking outrage and leading to hundreds of arrests of members from the banned Church of the Almighty God.",46,17.9195348837,20.881627907,0.5154639175,0.2105263158,0.2680412371,0.2680412371


We can also view model-level (rather than record-level) summary statistics, such as the mean, variance, and 90th percentile, with the following:

In [0]:
results.metrics

{'flesch_kincaid_grade_level/v1/mean': 19.03158332053003,
 'flesch_kincaid_grade_level/v1/variance': 9.920752976135542,
 'flesch_kincaid_grade_level/v1/p90': 23.55196951219512,
 'ari_grade_level/v1/mean': 22.203561632847464,
 'ari_grade_level/v1/variance': 13.054990385022968,
 'ari_grade_level/v1/p90': 27.252891447368427,
 'rouge1/v1/mean': 0.36465009596853387,
 'rouge1/v1/variance': 0.007300583160622519,
 'rouge1/v1/p90': 0.5,
 'rouge2/v1/mean': 0.14686018524751876,
 'rouge2/v1/variance': 0.005853832207442289,
 'rouge2/v1/p90': 0.25925696594427244,
 'rougeL/v1/mean': 0.2523175808786251,
 'rougeL/v1/variance': 0.007047457459383682,
 'rougeL/v1/p90': 0.35299782541161856,
 'rougeLsum/v1/mean': 0.2523175808786251,
 'rougeLsum/v1/variance': 0.007047457459383682,
 'rougeLsum/v1/p90': 0.35299782541161856}
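
The keys in this dictionary follow a `<metric>/<version>/<aggregation>` pattern, so they can be regrouped to read one metric at a time. The following is a minimal sketch, assuming that pattern holds for every key; `group_metrics` is a hypothetical helper written for illustration (it is not part of MLflow), and the sample dictionary copies a few entries from the output above.

```python
from collections import defaultdict

# A few entries copied from the results.metrics output above.
metrics = {
    "rouge1/v1/mean": 0.36465009596853387,
    "rouge1/v1/variance": 0.007300583160622519,
    "rouge1/v1/p90": 0.5,
    "rouge2/v1/mean": 0.14686018524751876,
    "rouge2/v1/variance": 0.005853832207442289,
    "rouge2/v1/p90": 0.25925696594427244,
}


def group_metrics(flat):
    """Hypothetical helper: group '<metric>/<version>/<aggregation>' keys by metric."""
    grouped = defaultdict(dict)
    for key, value in flat.items():
        name, _version, aggregation = key.split("/")
        grouped[name][aggregation] = value
    return dict(grouped)


summary = group_metrics(metrics)
for name, stats in summary.items():
    # The standard deviation is easier to compare with the mean than the variance.
    std = stats["variance"] ** 0.5
    print(f"{name}: mean={stats['mean']:.3f} std={std:.3f} p90={stats['p90']:.3f}")
```

Reading the standard deviation next to the mean gives a rough sense of how much the record-level scores vary around the model-level average.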

We are also able to review the results in the MLflow Experiment Tracking UI.

### What does good look like?

The ROUGE metrics range between 0 and 1, where 0 indicates extremely dissimilar text and 1 indicates extremely similar text. However, what counts as "good" is usually use-case specific. We don't always want a ROUGE score close to 1, because a summary that overlaps almost completely with the text it is compared against is probably not condensing it much.
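
To make the 0-to-1 range concrete, here is a simplified sketch of ROUGE-1 as the F-measure of unigram overlap between a reference and a candidate. It is an illustration only: `rouge1_f1` is a hypothetical helper that lowercases and splits on whitespace, so its values will not necessarily match the `rouge1/v1/score` column, which comes from a library with its own tokenization. The example sentences are made up.

```python
from collections import Counter


def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1: F-measure of unigram overlap (whitespace tokens)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


reference = "the mayor replaced the police chief"

identical = rouge1_f1(reference, reference)
partial = rouge1_f1(reference, "the mayor fired the chief")
unrelated = rouge1_f1(reference, "stocks fell sharply today")

print(f"identical: {identical:.3f}")
print(f"partial:   {partial:.3f}")
print(f"unrelated: {unrelated:.3f}")
```

A candidate that copies the reference word for word scores 1, one that shares no words scores 0, and a reasonable paraphrase lands somewhere in between.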

To explore what "good" looks like, let's review a couple of our examples.

In [0]:
import pandas as pd
display(
    pd.DataFrame(
        results.tables["eval_results_table"]
    ).loc[0:1, ["inputs", "outputs", "rouge1/v1/score"]]
)

inputs,outputs,rouge1/v1/score
"Baltimore's mayor has sacked the US city's police chief, saying his leadership had become a distraction from fighting a ""crime surge"". Mayor Stephanie Rawlings-Blake said she was replacing Police Commissioner Anthony Batts with his deputy, Kevin Davis, for an interim period. The city was rocked by riots in April when a black man died after suffering injuries in police custody. Six officers were charged over the death of the 25-year-old, Freddie Gray. Speaking at a news conference on Wednesday, Mayor Rawlings-Blake said Mr Batts had ""served this city with distinction"" since becoming police chief in October 2012. But referring to the city's high homicide rate, she said ""too many continue to die"". ""The focus has been too much on the leadership of the department and not enough on the crime fighting,"" she told reporters, adding: ""We need to get the crime surge under control."" The city has seen a sharp increase in violence since Freddie Gray's death on 19 April, with 155 homicides this year, a 48% increase over the same period last year. On Tuesday, the police department announced that an outside organisation will review its response to the civil unrest that followed Mr Gray's death. The US justice department is also conducting a civil rights review of the Baltimore force and Mr Batts has been criticised by the city's police union. Earlier on Wednesday, the union released its report into the police handling of the rioting. It said officers had complained ""that they lacked basic riot equipment, training, and, as events unfolded, direction from leadership"". The report also said ""officers repeatedly expressed concern that the passive response to the civil unrest had allowed the disorder to grow into full scale rioting"". Recent events had ""placed attention on police leadership"", Ms Rawlings-Blake said, but denied her decision was influenced by the union report. 
Mr Davis, who is taking over immediately as interim police chief, praised his ""friend"" Mr Batts and said he was a ""true reform commissioner"". Mayor Rawlings-Blake said Mr Davis would ""bring accountability to police, hold officers who act out of line accountable for their actions"".","Baltimore's mayor fired the city's police chief, citing his leadership as a distraction from addressing the 48% increase in homicides, and appointed his deputy as interim commissioner to focus on crime fighting rather than departmental leadership.",0.3720930233
"Western Sahara has welcomed Morocco's readmission to the African Union, 32 years after members refused to withdraw support for the territory's independence. It was a ""good opportunity"" and ""a chance to work together,"" a top Western Sahara official told the BBC. Morocco controls two-thirds of Western Sahara and sees it as part of its historic territory. However some, including the UN, see Western Sahara as Africa's last colony. Africa Live: More on this and other stories Find out more about Western Sahara A referendum was promised in 1991 but never carried out due to wrangling over who was eligible to vote. Thousands of Sahrawi refugees still live in refugee camps in Algeria - some have been there for 40 years. It is not clear what happens next but Western Sahara is hopeful that a committee set up by the AU will address the issues that both sides have raised. Some AU delegates said that it would be easier to resolve the issue with Morocco inside the AU. Sidi Mohammed, a Western Sahara official, told the BBC that Morocco's return to the AU means that it would now be expected to put ""in practice decisions taken by the AU with regard to a referendum in Western Sahara"". Mr Mohammed dismissed the suggestion that Morocco would now seek to get the AU to change its position, saying that the no country could unilaterally change the AU fundamental agreement, saying it opposed colonisation. In his speech at the AU summit, King Mohammed VI of Morocco said the readmission was not meant to divide the continental body. No. Algeria has always been a big supporter of Western Sahara's Polisario Front and it had wanted Morocco to accept independence of the territory as a condition for readmission. Zimbabwe and South Africa were also supportive of this stance but they were outnumbered by those who wanted Morocco back in the fold. There is no specific provision in the AU charter that bars any country from joining it. 
Morocco simply applied and the request was accepted by more than two-thirds of the 53 members. Morocco has been involved in intense lobbying and applied in July last year to rejoin the continental body. King Mohammed toured various African countries seeking support for the bid. No. While culturally the country's identity aligns with Arab states, its economic interests increasingly lie in Africa. This is a strategic move to continue exploring its interests in mining, construction, medical, insurance and banking sectors on the continent. Moroccan troops went into Western Sahara after Spain withdrew in 1975. Kitesurfing in a danger zone Inside world's most remote film festival Profile: African Union","Western Sahara officials welcomed Morocco's readmission to the African Union after 32 years, viewing it as an opportunity to work together despite ongoing disputes over Western Sahara's independence and Morocco's control of two-thirds of the territory.",0.3333333333


**Discussion Questions:**
1. How do you interpret the ROUGE-1 score?
2. Do the scores reflect the summarization that you think is best?
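
To help with the first question, here is a minimal sketch of what a ROUGE-1 F-measure captures: the overlap of single words (unigrams) between a reference summary and a generated one. The helper `rouge1_f1` and both example sentences are made up for illustration. It splits on whitespace only, whereas the `rouge_score` package used by MLflow applies its own tokenization, so its scores can differ slightly from this sketch.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary.

    Illustrative helper (not part of MLflow or rouge_score); it uses a
    plain lowercase whitespace split instead of a real tokenizer.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: a word counts only as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())  # share of candidate words found in reference
    recall = overlap / sum(ref_counts.values())      # share of reference words found in candidate
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, not taken from the evaluation dataset.
reference = "the mayor fired the police chief"
candidate = "the mayor dismissed the city police chief"

score = rouge1_f1(reference, candidate)
print(round(score, 3))  # 0.769
```

Five of the seven candidate words and five of the six reference words overlap, giving a precision of 5/7, a recall of 5/6, and an F1 of about 0.77. Note that a score like this rewards shared vocabulary only; it says nothing about fluency or factual accuracy.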

## Step 4: Comparing LLM Performance

In practice, we frequently compare LLMs (or larger AI systems) against one another to determine which is best for our use case, so it's important to be comfortable making this kind of comparison.

In the cell below, we compute the same metrics on the same reference dataset, but this time the summaries come from a system that uses a different LLM.

**Note:** This time, we're going to read our reference dataset from Delta.

In [0]:
# Custom function that runs the challenger system on each row of our eval DataFrame
def challenger_query_iteration(inputs):
    answers = []

    for index, row in inputs.iterrows():
        completion = challenger_query_summary_system(row["inputs"])
        answers.append(completion)

    return answers

# Compute challenger results
challenger_results = mlflow.evaluate(
    challenger_query_iteration,           # iterative function from above
    eval_data.head(50),
    targets="writer_summary",             # column with expected or "good" output
    model_type="text-summarization"       # type of model or task
)


 - For traditional ML or deep learning models: Use `mlflow.models.evaluate`, which maintains full compatibility with the original `mlflow.evaluate` API.

 - For LLMs or GenAI applications: Use the new `mlflow.genai.evaluate` API, which offers enhanced features specifically designed for evaluating LLMs and GenAI applications.

2025/07/25 09:46:02 INFO mlflow.models.evaluation.evaluators.default: Computing model predictions.
2025/07/25 09:46:40 INFO mlflow.models.evaluation.default_evaluator: Testing metrics on first row...


Let's take a look at our model-level results.

In [0]:
challenger_results.metrics

{'flesch_kincaid_grade_level/v1/mean': 17.843583371921884,
 'flesch_kincaid_grade_level/v1/variance': 9.53804337086556,
 'flesch_kincaid_grade_level/v1/p90': 21.699272727272735,
 'ari_grade_level/v1/mean': 20.385179164547687,
 'ari_grade_level/v1/variance': 14.55051111110084,
 'ari_grade_level/v1/p90': 25.043977272727272,
 'rouge1/v1/mean': 0.33090330392427647,
 'rouge1/v1/variance': 0.00612536675930271,
 'rouge1/v1/p90': 0.42875226039783004,
 'rouge2/v1/mean': 0.11782701899335928,
 'rouge2/v1/variance': 0.0049505816216773016,
 'rouge2/v1/p90': 0.22854545454545458,
 'rougeL/v1/mean': 0.21761264549798823,
 'rougeL/v1/variance': 0.004764221727284637,
 'rougeL/v1/p90': 0.2891666666666667,
 'rougeLsum/v1/mean': 0.21761264549798823,
 'rougeLsum/v1/variance': 0.004764221727284637,
 'rougeLsum/v1/p90': 0.2891666666666667}
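
The same comparison can also be done in code. The sketch below computes challenger-minus-baseline differences for the mean metrics. The helper `compare_metrics` is hypothetical, and the baseline numbers are placeholders for illustration only; in the notebook you would pass the `.metrics` dictionary from the earlier evaluation run instead. The challenger values are copied from the output above.

```python
def compare_metrics(baseline: dict, challenger: dict, suffix: str = "/mean") -> dict:
    """Return challenger-minus-baseline deltas for metrics found in both dicts.

    Hypothetical helper for illustration; only keys ending in `suffix`
    (the per-metric means by default) are compared.
    """
    deltas = {}
    for name, base_value in baseline.items():
        if name.endswith(suffix) and name in challenger:
            deltas[name] = challenger[name] - base_value
    return deltas


# Challenger means copied from the metrics output above.
challenger_metrics = {
    "rouge1/v1/mean": 0.33090330392427647,
    "rouge2/v1/mean": 0.11782701899335928,
    "rougeL/v1/mean": 0.21761264549798823,
}

# PLACEHOLDER baseline values, invented for this sketch; replace them with
# the metrics from the first evaluation run.
baseline_metrics = {
    "rouge1/v1/mean": 0.30,
    "rouge2/v1/mean": 0.12,
    "rougeL/v1/mean": 0.20,
}

deltas = compare_metrics(baseline_metrics, challenger_metrics)
for name, delta in sorted(deltas.items()):
    # A positive delta means the challenger scored higher on that metric.
    print(f"{name}: {delta:+.4f}")
```

With only 50 evaluation rows, small differences in the means may not be meaningful, so it is worth checking the variance and p90 values alongside them.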

Now let's compare the two runs in the MLflow UI using the experiment's **Chart** tab.

**Note:** We can filter the charts to show only the ROUGE metrics.

### What about other tasks/metrics?

The `mlflow` library contains [a number of LLM task evaluation tools](https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.evaluate) that we can use in our workflows.