# News Summarization


### Scrape the article you want to summarize

In [22]:
from scraping import return_single_article
article = return_single_article('https://www.aljazeera.com/news/2024/7/31/israel-subjecting-palestinian-detainees-to-torture-and-abuse-un-report')


In [23]:
print(article['title'], '\n\n')
print(article['authors'], '\n\n')
print(article['source'], '\n\n')
print(article['article'], '\n\n')

Israel subjecting Palestinian detainees to torture and abuse: UN report 


By  


aljazeera 


The report says ‘thousands’ of Palestinians detained arbitrarily by Israel during the war in Gaza.

Israel has detained thousands of Palestinians during the war in Gaza and stands accused of numerous cases of torture, the Office of the United Nations High Commissioner for Human Rights says in a new report.

The 23-page report, released on Wednesday, noted allegations of widespread abuse of prisoners being held incommunicado in arbitrary, prolonged detention. It was published during a tense standoff in Israel as far-right politicians and demonstrators opposed an investigation into alleged sexual abuse of detainees by soldiers.

Based primarily on interviews with released detainees and other victims from October 7 to June 30, the UN report found that since the war began, “thousands of Palestinians” including medical staff, have been “taken from Gaza to Israel, usually shackled and blindfolded”.

### Summarization

In [24]:
from hf_summarizer import bart_summarize

In [25]:
# bart_summarize uses this model: https://huggingface.co/sshleifer/distilbart-cnn-12-6
# it is a distilled BART (encoder-decoder) model fine-tuned on the CNN/DailyMail dataset
# it's a strong abstractive summarizer in transformers and fast enough for deployment
summary = bart_summarize(article['article'])

In [26]:
summary

" UN report says 'thousands' of Palestinians detained arbitrarily by Israel during the war in Gaza. Report says detainees have been held in secret, without being given a reason for their detention. It also detailed ‘allegations of torture and other forms of cruel, inhuman and degrading treatment’ of women and men."

#### hf_summarizer.py offers other summarization options as well, including several Pegasus models and T5
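The internals of `bart_summarize` are not shown in this notebook. As a rough sketch of how such a helper could work, the block below splits a long article into word-limited chunks (BART-style models accept a bounded input length) and summarizes each chunk with the `transformers` summarization pipeline. The names `chunk_words`, `summarize_long_text`, and `load_distilbart`, and the 400-word chunk size, are assumptions made for illustration and are not taken from `hf_summarizer.py`.

```python
from typing import Callable, List

MODEL_NAME = "sshleifer/distilbart-cnn-12-6"  # model named in the notebook comment


def chunk_words(text: str, max_words: int = 400) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long_text(text: str, summarizer: Callable[[str], str], max_words: int = 400) -> str:
    """Summarize each chunk with the given callable and join the partial summaries."""
    return " ".join(summarizer(chunk).strip() for chunk in chunk_words(text, max_words))


def load_distilbart() -> Callable[[str], str]:
    """Hypothetical loader: wraps the transformers summarization pipeline.

    Imported lazily so the helpers above work without transformers installed.
    The model is downloaded on first use.
    """
    from transformers import pipeline

    pipe = pipeline("summarization", model=MODEL_NAME)
    # truncation=True guards against a chunk that still exceeds the model's token limit
    return lambda chunk: pipe(chunk, truncation=True)[0]["summary_text"]
```

With the model available, `summarize_long_text(article['article'], load_distilbart())` would produce one summary string. Passing the summarizer in as a callable keeps the chunking logic testable without downloading any weights.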

### Statistical Summarization

In [27]:
from statistical_summarize import run_statistical_summarizers

In [28]:
run_statistical_summarizers(text=article['article'], num_sentences=5)

**********Statistical Summarizations**********


TF IDF Summary:
The documented abuse included food, sleep and water deprivation and being burned with cigarettes. “Some detainees said dogs were released on them, and others said they were subjected to waterboarding, or that their hands were tied and they were suspended from the ceiling. Some women and men also spoke of sexual and gender-based violence,” the report said. Palestinian detainees held in Israel are mostly men and boys who are residents, doctors or patients as well as captured Palestinian fighters, it added. Israel also fails to provide information regarding the fate of detainees while the Red Cross has been denied access to prisons and other facilities. 


Word Frequency Summary:
The report says ‘thousands’ of Palestinians detained arbitrarily by Israel during the war in Gaza. The 23-page report, released on Wednesday, noted allegations of widespread abuse of prisoners being held incommunicado in arbitrary, prolonged detenti

#### These statistical techniques do not use any machine learning models, but they are very fast and give decent results
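To illustrate the idea behind a word-frequency summary, here is a minimal sketch using only the standard library. It scores each sentence by the average corpus frequency of its non-stopword tokens and keeps the top `num_sentences` in their original order. The function names, the tiny stopword list, and the naive sentence splitter are assumptions for illustration; the actual code in `statistical_summarize.py` may differ.

```python
import re
from collections import Counter

# Deliberately small stopword list, for illustration only
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were", "of", "to", "in",
             "and", "or", "by", "for", "on", "it", "that", "as", "with"}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def word_frequency_summary(text, num_sentences=5):
    """Return the num_sentences highest-scoring sentences, in document order."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        # average frequency, so long sentences are not favored just for length
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore original ordering
    return " ".join(sentences[i] for i in keep)
```

A TF-IDF variant follows the same pattern, with each word's weight also discounted by how many sentences it appears in.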

### Sentiment Analysis

In [29]:
from sentiment_analysis import hf_topn_sentiment

In [30]:
# this function returns the most positive and most negative sentences in a piece of text;
# it's not exactly summarization, but it surfaces the most polarizing lines of an article
top_positive, top_negative = hf_topn_sentiment(article['article'])

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.

In [31]:
top_positive

[(0.9423840641975403,
  'It also detailed “allegations of torture and other forms of cruel, inhuman and degrading treatment, including sexual abuse of women and men”.')]

In [32]:
top_negative

[(0.9979819059371948,
  'UN High Commissioner for Human Rights Volker Turk said the testimonies gathered by his office and “other entities indicate a range of appalling acts … in flagrant violation of international human rights law and international humanitarian law”.'),
 (0.9996262788772583,
  'Israel also fails to provide information regarding the fate of detainees while the Red Cross has been denied access to prisons and other facilities.')]
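The implementation of `hf_topn_sentiment` is not shown here. The sketch below is one way such a function might be written: it splits the text into sentences, classifies them in a single batch, and returns the highest-scoring positive and negative sentences as `(score, sentence)` tuples. Passing the model name and `device` explicitly should avoid the "No model was supplied" and CPU-fallback messages in the log above. The names `top_n_sentiment` and `load_classifier`, the default `n=3`, and the descending sort order are assumptions for illustration.

```python
import re

# Model name taken from the pipeline log above
SENTIMENT_MODEL = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"


def top_n_sentiment(text, classify, n=3):
    """Return (top_positive, top_negative) lists of (score, sentence) tuples.

    classify: callable mapping a list of sentences to a list of
    {'label': 'POSITIVE' | 'NEGATIVE', 'score': float} dicts.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    results = classify(sentences) if sentences else []
    positive = [(r["score"], s) for r, s in zip(results, sentences) if r["label"] == "POSITIVE"]
    negative = [(r["score"], s) for r, s in zip(results, sentences) if r["label"] == "NEGATIVE"]
    positive.sort(reverse=True)  # most confident first
    negative.sort(reverse=True)
    return positive[:n], negative[:n]


def load_classifier(device=-1):
    """Hypothetical loader: build the pipeline once (device=-1 is CPU, 0 is the first GPU)."""
    from transformers import pipeline  # lazy import; downloads the model on first use

    pipe = pipeline("sentiment-analysis", model=SENTIMENT_MODEL, device=device)
    return lambda sentences: pipe(sentences, truncation=True)
```

With the model available, `top_n_sentiment(article['article'], load_classifier())` would mirror the call in the cell above. Building the pipeline once and reusing it also avoids reloading the model for every call.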