In Python, the standard way to use transformer models is through the Transformers library, which offers access to pre-trained language models and a user-friendly pipeline for a range of tasks, including text classification.

Check out the [Transformers library documentation](https://huggingface.co/docs/transformers/index) for more details.

Browse the repository of pre-trained models for text classification [here](https://huggingface.co/models?pipeline_tag=text-classification&sort=trending).

<br>
<a target="_blank" href="https://colab.research.google.com/drive/10d88csqxt7ClGnVvmwrJmfoGL_5u_RQB#scrollTo=66mf1h_5JKga">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

By default, Colab runs on CPUs. To use GPU hardware and speed up computation, open the Runtime menu, select Change runtime type, choose a GPU, and then click Connect in the upper right of the interface.
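Selecting a GPU runtime does not by itself move a pipeline onto the GPU: the pipeline also needs a `device` argument. The sketch below shows one way to choose that value. `pick_device` is a hypothetical helper (not part of Transformers), and it assumes the PyTorch backend, where `0` refers to the first CUDA GPU and `-1` to the CPU.

```python
def pick_device():
    """Return 0 (first CUDA GPU) if PyTorch can see one, otherwise -1 (CPU)."""
    try:
        import torch  # Imported lazily so the helper still works without PyTorch
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print(device)

# Assumed usage with any pipeline in this notebook, for example:
# fill_mask = pipeline("fill-mask", model="bert-base-uncased", device=device)
```

Passing `device=device` when creating a pipeline should place the model on the GPU whenever one is available.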

---



In [1]:
%%capture

# Equivalent to install.packages("") in R, but here you have to install the packages every time.

!pip install transformers==4.44.1 # The main library for accessing transformers from Hugging Face
!pip install langdetect # A library for language detection


The Transformers library provides a simple tool called a **pipeline** that significantly simplifies the process of using existing pre-trained language models for various tasks. It abstracts away the complexity of model loading, tokenization, and inference, allowing you to easily apply models to tasks like text classification, question answering, translation, and masked language modeling. By calling a pipeline with a specific task, such as "text-classification," you can quickly get predictions with minimal code. In this guide, we will explore different examples to demonstrate how to load a model through a pipeline. Each time, the pipeline requires you to specify:

- The task you want to perform
- The model you want to use from Hugging Face


In [2]:
from transformers import pipeline # Equivalent to library() in R

## Masked language modeling

Masked Language Modeling (MLM) is a training technique in which random words in a sentence are replaced with a special token (e.g., [MASK]) and the model learns to predict the missing words from their context. For example, in the sentence, "The cat sat on the [MASK]," the model predicts "mat" based on the context.

MLM forms the basis for pre-trained models like BERT, which learn rich, contextual representations of words. This enables the models to understand word meanings in various contexts, making them effective for tasks like text classification, question answering, and sentiment analysis.

These models are also key to transfer learning. After being pre-trained with MLM, they can be fine-tuned on specific tasks, saving time and resources. This makes them highly adaptable for a wide range of NLP applications, such as named entity recognition, text classification, and more.
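To make the masking step concrete, here is a minimal sketch of how training examples could be built. It is a simplification: `mask_tokens` is a hypothetical helper, it splits on whitespace rather than using a real tokenizer, and it always substitutes `[MASK]`, whereas BERT's actual recipe sometimes keeps the selected token or swaps in a random one.

```python
import random


def mask_tokens(tokens, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """Replace each token with mask_token with probability mask_prob.

    Returns the masked sequence and a dict {position: original token},
    i.e. the words the model would be trained to recover.
    """
    rng = random.Random(seed)  # Local generator so results are reproducible
    masked, targets = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[position] = token
        else:
            masked.append(token)
    return masked, targets


tokens = "the cat sat on the mat".split()
masked, targets = mask_tokens(tokens, mask_prob=0.3, seed=1)
print(masked)
print(targets)
```

During pre-training, the model only sees `masked` and is scored on how well it recovers the words stored in `targets`.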









In [None]:
# Initialize the fill-mask pipeline with a pre-trained model (e.g., BERT)
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

In [None]:
# Example sentences with a [MASK] token
sentences = [
    "The [MASK] of the Arctic region is changing rapidly, with temperatures rising at an alarming rate.",
    "The political [MASK] has been tense in the country, with protests growing in various cities over the past few months."
]

# Use the pipeline to predict the masked word for both sentences
for sentence in sentences:
    predictions = fill_mask(sentence)
    print(f"Predictions for sentence: {sentence}")

    # Print the top 5 predictions for the masked token
    for prediction in predictions:
        print(f"Prediction: {prediction['token_str']} - Score: {prediction['score']}")
    print("\n")  # Print a newline between the results for readability

As with static embeddings, you can use these contextual representations to probe model biases by comparing how likely the model is to predict a given word in different sentences. Below is an example taken from [Hauke Licht's course on Transformers](https://github.com/haukelicht/advanced_text_analysis/blob/main/notebooks/contextualized_embedding_transformers_explained.ipynb).


In [None]:
# Example sentences with a [MASK] token
sentences = [
    "Homosexuals are making our country [MASK].",
    "Straights are making our country [MASK]."
]
# Use the pipeline to predict the masked word for both sentences
for sentence in sentences:
    predictions = fill_mask(sentence)
    print(f"Predictions for sentence: {sentence}")

    # Print the top 5 predictions for the masked token
    for prediction in predictions:
        print(f"Prediction: {prediction['token_str']} - Score: {prediction['score']}")
    print("\n")  # Print a newline between the results for readability

In [6]:
# Example sentences with a [MASK] token
sentences = [
    "Men should [MASK].",
    "Women should [MASK]."
]
# Use the pipeline to predict the masked word for both sentences
for sentence in sentences:
    predictions = fill_mask(sentence)
    print(f"Predictions for sentence: {sentence}")

    # Print the top 5 predictions for the masked token
    for prediction in predictions:
        print(f"Prediction: {prediction['token_str']} - Score: {prediction['score']}")
    print("\n")  # Print a newline between the results for readability

Predictions for sentence: Men should [MASK].
Prediction: fight - Score: 0.0709475725889206
Prediction: die - Score: 0.06575854867696762
Prediction: know - Score: 0.04549961909651756
Prediction: talk - Score: 0.031104378402233124
Prediction: be - Score: 0.02514948882162571


Predictions for sentence: Women should [MASK].
Prediction: know - Score: 0.10045552998781204
Prediction: be - Score: 0.05500389635562897
Prediction: understand - Score: 0.040554508566856384
Prediction: talk - Score: 0.03237244114279747
Prediction: work - Score: 0.027062635868787766
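
One way to read the two outputs above side by side is to separate the completions that appear for both prompts from those that appear for only one. The sketch below assumes prediction dictionaries with `token_str` and `score` keys, as printed above. `compare_predictions` is a hypothetical helper, and the scores are rounded from the output above.

```python
def compare_predictions(preds_a, preds_b):
    """Compare two fill-mask prediction lists by predicted word."""
    scores_a = {p["token_str"]: p["score"] for p in preds_a}
    scores_b = {p["token_str"]: p["score"] for p in preds_b}
    shared = sorted(set(scores_a) & set(scores_b))
    return {
        "shared": {word: (scores_a[word], scores_b[word]) for word in shared},
        "only_a": sorted(set(scores_a) - set(scores_b)),
        "only_b": sorted(set(scores_b) - set(scores_a)),
    }


# Top-5 predictions copied (rounded) from the output above
men = [
    {"token_str": "fight", "score": 0.0709},
    {"token_str": "die", "score": 0.0658},
    {"token_str": "know", "score": 0.0455},
    {"token_str": "talk", "score": 0.0311},
    {"token_str": "be", "score": 0.0251},
]
women = [
    {"token_str": "know", "score": 0.1005},
    {"token_str": "be", "score": 0.0550},
    {"token_str": "understand", "score": 0.0406},
    {"token_str": "talk", "score": 0.0324},
    {"token_str": "work", "score": 0.0271},
]

comparison = compare_predictions(men, women)
print(comparison["only_a"])  # Words predicted only for "Men should [MASK]."
print(comparison["only_b"])  # Words predicted only for "Women should [MASK]."
```

Keep in mind that a word missing from the top five does not have zero probability. To score specific words directly, the fill-mask pipeline also accepts a `targets` argument.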




## Text classification

As we saw earlier in the course, text classification is one of the most common tasks in text analysis, with many potential applications in political science. Hugging Face hosts plenty of pre-trained models that classify a wide variety of things. Most of them take a version of BERT and fine-tune it for a specific classification task.

### Sentiment

Rather than relying on classical dictionary-based sentiment analysis, you can use existing sentiment classifiers trained for this task.

In [7]:
sentiment_classifier = pipeline("sentiment-analysis", model="cardiffnlp/twitter-roberta-base-sentiment-latest")


Some weights of the model checkpoint at cardiffnlp/twitter-roberta-base-sentiment-latest were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).



Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [8]:
sentiment_tweets = [
    "We're proud to announce our new green energy policy to create a sustainable future for all! 🌱 #GoGreen #FutureFirst",
    "The opposition party's plan lacks substance and is not in the best interest of the public. #PolicyFailure",
    "Join us this weekend for our community outreach program. Let's work together to make a difference. #CommunityFirst",
    "Recent reports show that our healthcare reforms are making a difference, but there's still work to do. #Progress",
    "Shocking incompetence from the government on handling the cost of living crisis. We deserve better! #LeadershipFail",
    "Our vision for the future: a stronger economy, better jobs, and opportunities for everyone. #TogetherWeCan",
    "Debate tonight at 8 PM. Tune in to hear our plans for the next generation. #Election2025",
]

# Classify sentiment and get the results with scores
sentiment_predictions = [sentiment_classifier(text) for text in sentiment_tweets]

# Print each text with its corresponding sentiment label and score
for text, result in zip(sentiment_tweets, sentiment_predictions):
    print(f"Text: {text}")
    for sentiment in result:
        print(f"Sentiment: {sentiment['label']}, Score: {sentiment['score']:.4f}")
    print("-----")

Text: We're proud to announce our new green energy policy to create a sustainable future for all! 🌱 #GoGreen #FutureFirst
Sentiment: positive, Score: 0.9838
-----
Text: The opposition party's plan lacks substance and is not in the best interest of the public. #PolicyFailure
Sentiment: negative, Score: 0.9035
-----
Text: Join us this weekend for our community outreach program. Let's work together to make a difference. #CommunityFirst
Sentiment: positive, Score: 0.9177
-----
Text: Recent reports show that our healthcare reforms are making a difference, but there's still work to do. #Progress
Sentiment: positive, Score: 0.6701
-----
Text: Shocking incompetence from the government on handling the cost of living crisis. We deserve better! #LeadershipFail
Sentiment: negative, Score: 0.9492
-----
Text: Our vision for the future: a stronger economy, better jobs, and opportunities for everyone. #TogetherWeCan
Sentiment: positive, Score: 0.9657
-----
Text: Debate tonight at 8 PM. Tune in to hear

### Emotions

In [9]:
# Use a text classification model to detect emotions in text

emotion_classifier = pipeline("text-classification", model="j-hartmann/emotion-english-distilroberta-base", trust_remote_code=True)


emotion_texts = [
    "The government's failure to address climate change is a betrayal to future generations.",
    "I'm incredibly proud of the progress we’ve made in passing comprehensive healthcare reform.",
    "How can politicians sleep at night knowing how many people are suffering because of their policies?",
    "I can't believe the corruption in the system. It’s become so blatant and nothing ever gets done about it.",
    "The election results were shocking, and many of us are still trying to process everything.",
    "We need urgent action to tackle poverty, but all we see are empty promises and no real solutions.",
    "It's inspiring to see so many young people getting involved in politics and pushing for real change.",
    "I’m deeply concerned about the growing divide in our country. We need to find a way to unite again."
]

# Classify emotions and get the results with scores
emotion_predictions = [emotion_classifier(text) for text in emotion_texts]

# Print the text with corresponding emotion and scores
for text, result in zip(emotion_texts, emotion_predictions):
    print(f"Text: {text}")
    for emotion in result:
        print(f"Emotion: {emotion['label']}, Score: {emotion['score']:.4f}")
    print("-----")



Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


Text: The government's failure to address climate change is a betrayal to future generations.
Emotion: disgust, Score: 0.6021
-----
Text: I'm incredibly proud of the progress we’ve made in passing comprehensive healthcare reform.
Emotion: joy, Score: 0.7973
-----
Text: How can politicians sleep at night knowing how many people are suffering because of their policies?
Emotion: neutral, Score: 0.3800
-----
Text: I can't believe the corruption in the system. It’s become so blatant and nothing ever gets done about it.
Emotion: disgust, Score: 0.7124
-----
Text: The election results were shocking, and many of us are still trying to process everything.
Emotion: surprise, Score: 0.9210
-----
Text: We need urgent action to tackle poverty, but all we see are empty promises and no real solutions.
Emotion: sadness, Score: 0.6377
-----
Text: It's inspiring to see so many young people getting involved in politics and pushing for real change.
Emotion: joy, Score: 0.7624
-----
Text: I’m deeply concer

## Topic classification

If you are working with political texts and want to classify them by policy issue, several models automate coding into the ~20 policy issues of the [Comparative Agendas Project](https://www.comparativeagendas.net/) (CAP). For instance, the [poltext lab repository](https://huggingface.co/poltextlab) contains more than 80 models fine-tuned on CAP data from different languages and types of political texts.



In [10]:
# Use a topic classification model

topic_classifier = pipeline("text-classification", model = "cornelius/partypress-multilingual", tokenizer = "cornelius/partypress-multilingual")


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [11]:
topic_texts = [
    "Addressing the climate crisis demands bold action.",
    "Governments must prioritize investments in renewable energy.",
    "Universal access to affordable healthcare is a moral imperative.",
    "Expanding public health insurance options can save lives.",
    "Education reform must focus on equity and accessibility for all.",
    "Reducing wealth inequality requires progressive taxation policies.",
    "Immigration policies should balance security with compassion.",
    "Investments in infrastructure can drive economic growth and resilience.",
    "We should push for more federalism at the European level",
    "A strong democracy depends on the protection of voting rights.",
    "Addressing systemic racism is essential for social justice.",
    "Affordable housing policies can reduce homelessness.",
    'We disagree with Trump',
    "Supporting small businesses fosters local economic growth.",
    "Foreign policy should prioritize diplomacy and multilateral cooperation."
]


# Classify each text and store results
classified_results = [
    {"text": text, **topic_classifier(text)[0]}
    for text in topic_texts
]

# Print the results
for result in classified_results:
    print(f"Text: {result['text']}")
    print(f"Topic: {result['label']}")
    print(f"Score: {result['score']:.4f}")
    print("-" * 40)


Text: Addressing the climate crisis demands bold action.
Topic: 7 - Environment
Score: 0.9803
----------------------------------------
Text: Governments must prioritize investments in renewable energy.
Topic: 8 - Energy
Score: 0.9814
----------------------------------------
Text: Universal access to affordable healthcare is a moral imperative.
Topic: 3 - Health
Score: 0.9401
----------------------------------------
Text: Expanding public health insurance options can save lives.
Topic: 3 - Health
Score: 0.4718
----------------------------------------
Text: Education reform must focus on equity and accessibility for all.
Topic: 6 - Education
Score: 0.9918
----------------------------------------
Text: Reducing wealth inequality requires progressive taxation policies.
Topic: 1 - Macroeconomics
Score: 0.8376
----------------------------------------
Text: Immigration policies should balance security with compassion.
Topic: 9 - Immigration
Score: 0.9891
--------------------------------------
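
The scores above vary quite a bit: the second health sentence is labeled with a score of only 0.4718. A common precaution is to set low-confidence predictions aside for manual review. The sketch below assumes result dictionaries with `label` and `score` keys, as produced above. `split_by_confidence` is a hypothetical helper, and the 0.5 threshold is an arbitrary choice for illustration, not a recommended value.

```python
def split_by_confidence(results, threshold=0.5):
    """Split predictions into confident and uncertain ones by score."""
    confident = [r for r in results if r["score"] >= threshold]
    uncertain = [r for r in results if r["score"] < threshold]
    return confident, uncertain


# Labels and scores copied from the output above
results = [
    {"label": "7 - Environment", "score": 0.9803},
    {"label": "8 - Energy", "score": 0.9814},
    {"label": "3 - Health", "score": 0.9401},
    {"label": "3 - Health", "score": 0.4718},
    {"label": "6 - Education", "score": 0.9918},
    {"label": "1 - Macroeconomics", "score": 0.8376},
    {"label": "9 - Immigration", "score": 0.9891},
]

confident, uncertain = split_by_confidence(results, threshold=0.5)
print(len(confident), len(uncertain))
print(uncertain)
```

In practice, the threshold is best chosen by checking a hand-coded sample against the model's predictions.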

In [12]:
immigration_multi_sentences = [
    "Immigration is a major topic in today's political discussions.",  # English
    "L'immigration est un sujet majeur dans les discussions politiques actuelles.",  # French
    "Die Einwanderung ist ein wichtiges Thema in den heutigen politischen Diskussionen.",  # German
    "La inmigración es un tema importante en los debates políticos actuales.",  # Spanish
    "L'immigrazione è un tema importante nei dibattiti politici odierni.",  # Italian
    "A imigração é um tema importante nas discussões políticas atuais.",  # Portuguese
    "La inmigración es un tema clave en los debates políticos actuales.",  # Spanish (alternative wording)
    "İmmigrasyon, bugünün politik tartışmalarında önemli bir konudur.",  # Turkish
    "Emigracija je tema, o kojoj se danas često diskutuje u političkim krugovima."  # Serbian
]

# Classify each text and store results
multilingual_results = [
    {"text": text, **topic_classifier(text)[0]}
    for text in immigration_multi_sentences
]

# Print the results
for result in multilingual_results:
    print(f"Text: {result['text']}")
    print(f"Topic: {result['label']}")
    print(f"Score: {result['score']:.4f}")
    print("-" * 40)


Text: Immigration is a major topic in today's political discussions.
Topic: 9 - Immigration
Score: 0.8763
----------------------------------------
Text: L'immigration est un sujet majeur dans les discussions politiques actuelles.
Topic: 9 - Immigration
Score: 0.9874
----------------------------------------
Text: Die Einwanderung ist ein wichtiges Thema in den heutigen politischen Diskussionen.
Topic: 9 - Immigration
Score: 0.6355
----------------------------------------
Text: La inmigración es un tema importante en los debates políticos actuales.
Topic: 9 - Immigration
Score: 0.6872
----------------------------------------
Text: L'immigrazione è un tema importante nei dibattiti politici odierni.
Topic: 9 - Immigration
Score: 0.9790
----------------------------------------
Text: A imigração é um tema importante nas discussões políticas atuais.
Topic: 98 - Non-thematic
Score: 0.9301
----------------------------------------
Text: La inmigración es un tema clave en los debates políticos ac

It is of course possible to run these models on a larger number of texts.

In [13]:
import pandas as pd

politique_generale = pd.read_csv("https://raw.githubusercontent.com/luissattelmayer/intro-css/refs/heads/main/data/politique_generale.csv")
politique_generale

import nltk
from nltk.tokenize import sent_tokenize
nltk.download('punkt')
nltk.download('punkt_tab')

politique_generale["sentences"] = politique_generale["text"].apply(sent_tokenize)

# Explode the lists of sentences so that each sentence gets its own row

politique_generale_sent = politique_generale.explode("sentences")
politique_generale_sent

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.


| Unnamed: 0 | text | date | intervenant | sentences |
|---|---|---|---|---|
| 0 | Madame la Présidente, d'abord le Gouvernement,... | 2025-01-14 | François Bayrou | Madame la Présidente, d'abord le Gouvernement,... |
| 0 | Madame la Présidente, d'abord le Gouvernement,... | 2025-01-14 | François Bayrou | Mesdames et Messieurs les députés, en vérité, ... |
| 0 | Madame la Présidente, d'abord le Gouvernement,... | 2025-01-14 | François Bayrou | Sur ces bancs, même parmi ceux qui sont violem... |
| 0 | Madame la Présidente, d'abord le Gouvernement,... | 2025-01-14 | François Bayrou | Et 84 % des Français jugent, paraît-il, que le... |
| 0 | Madame la Présidente, d'abord le Gouvernement,... | 2025-01-14 | François Bayrou | Et il m'arrive même de me demander où les 16 %... |
| ... | ... | ... | ... | ... |
| 33 | Assurer la dignité et la liberté de la personn... | 1959-01-15 | Michel Debré | Cependant, me semble-t-il, au milieu des diffi... |
| 33 | Assurer la dignité et la liberté de la personn... | 1959-01-15 | Michel Debré | L'autorité du chef de l'État, le souvenir des ... |
| 33 | Assurer la dignité et la liberté de la personn... | 1959-01-15 | Michel Debré | Nous devons, mais nous pouvons aussi donner à ... |
| 33 | Assurer la dignité et la liberté de la personn... | 1959-01-15 | Michel Debré | C'est, en fin de compte, Mesdames, Messieurs l... |


In [14]:
from tqdm import tqdm # Import library to have progress bars

def apply_model_to_text(df, text_column, classifier):
    # Initialize tqdm for progress bar
    tqdm.pandas()  # This allows the progress_apply method to work

    # Apply the classifier to the text column and get the result as a DataFrame with a progress bar
    results = df[text_column].progress_apply(lambda x: classifier(x)[0])  # Apply model on each text

    # Create new columns for the label and score
    df['label'] = results.apply(lambda x: x['label'])  # Extract the label
    df['score'] = results.apply(lambda x: x['score'])  # Extract the score

    return df

# Example usage
df_classified = apply_model_to_text(politique_generale_sent, 'sentences', topic_classifier)

df_classified



  0%|          | 14/13316 [00:01<29:38,  7.48it/s]


KeyboardInterrupt: 
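
As the progress bar above suggests, classifying 13,316 sentences one call at a time on a CPU takes roughly half an hour. Sending the texts in batches is usually faster, especially on a GPU. The sketch below shows the batching logic only: `classify_in_batches` is a hypothetical helper, and `fake_classifier` is a stand-in that mimics the list-in, list-out behaviour of a text-classification pipeline so the example runs without downloading a model.

```python
def classify_in_batches(texts, classifier, batch_size=32):
    """Call classifier on consecutive slices of texts and collect the results."""
    results = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        results.extend(classifier(batch))  # One result dict per input text
    return results


batch_sizes_seen = []


def fake_classifier(batch):
    """Stand-in for a real pipeline: returns one placeholder dict per text."""
    batch_sizes_seen.append(len(batch))
    return [{"label": "placeholder", "score": 1.0} for _ in batch]


texts = ["a", "b", "c", "d", "e"]
results = classify_in_batches(texts, fake_classifier, batch_size=2)
print(len(results), batch_sizes_seen)
```

With a real pipeline, `texts` would be a plain list such as `politique_generale_sent["sentences"].tolist()`. Pipelines also accept a list and a `batch_size` argument directly, for example `topic_classifier(texts, batch_size=32, truncation=True)`, which may be simpler than a manual loop.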

In [15]:
df_classified.value_counts("label")


In [16]:
import pandas as pd
import matplotlib.pyplot as plt

df_classified['date'] = pd.to_datetime(df_classified['date'])

# Count total sentences per date
total_counts = df_classified.groupby('date').size().reset_index(name='total_count')

# Count immigration sentences per date
immigration_counts = df_classified[df_classified['label'] == '9 - Immigration'] \
    .groupby('date').size().reset_index(name='immigration_count')

# Merge total and immigration counts on date
merged_counts = pd.merge(total_counts, immigration_counts, on='date', how='left')

# Fill NaN values for dates without immigration mentions
merged_counts['immigration_count'] = merged_counts['immigration_count'].fillna(0)

# Compute the share of immigration
merged_counts['immigration_share'] = merged_counts['immigration_count'] / merged_counts['total_count']

# Plot the share of immigration over time
plt.figure(figsize=(12, 6))
plt.plot(merged_counts['date'], merged_counts['immigration_share'], marker='o', linestyle='-')
plt.title("Share of Immigration Mentions Over Time", fontsize=16)
plt.xlabel("Date", fontsize=14)
plt.ylabel("Share of Immigration Mentions", fontsize=14)
plt.grid(True)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
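
The count, merge and divide steps above can also be collapsed into a single expression, since the mean of a boolean column is the share of `True` values. The sketch below uses a small hand-made DataFrame: the dates and labels are hypothetical illustration data, not results from the corpus.

```python
import pandas as pd

# Hypothetical toy data standing in for df_classified
toy = pd.DataFrame({
    "date": pd.to_datetime([
        "2024-01-01", "2024-01-01", "2024-01-01",
        "2025-01-14", "2025-01-14",
    ]),
    "label": [
        "9 - Immigration", "1 - Macroeconomics", "9 - Immigration",
        "3 - Health", "6 - Education",
    ],
})

# True where the sentence is labeled as immigration, then averaged per date
immigration_share = (
    toy["label"].eq("9 - Immigration").groupby(toy["date"]).mean()
)
print(immigration_share)
```

Dates without any immigration sentence get a share of 0 automatically, so no `fillna` step is needed.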



## Sexism

There are plenty of other classifiers you can experiment with. Here, for instance, I found a classifier fine-tuned to detect sexism.

In [None]:
sexism_classifier = pipeline("text-classification", model = "NLP-LTU/distilbert-sexism-detector")

# List of texts
texts_sexism = [
    "Women should stay at home and take care of the children, not pursue careers.",
    "Men are naturally better leaders than women because of their strength and decisiveness.",
    "Everyone should have the opportunity to pursue their dreams, regardless of gender.",
    "It’s important to encourage girls and boys equally to pursue careers in STEM fields.",
    "Girls are just not good at math or science; that’s why they don’t excel in these subjects."
]


# Classify the texts
results = sexism_classifier(texts_sexism)

# Print the text with its corresponding label
for text, result in zip(texts_sexism, results):
    print(f"Text: {text}")
    print(f"Label: {result['label']}")
    print("-----")


## Zero-shot classification

In [None]:
# Use a zero-shot model

zero_shot_classifier = pipeline("zero-shot-classification", model="mlburnham/Political_DEBATE_base_v1.0") # To use the base model
hypothesis_template = 'The author of this text is {}.'
test_labels = ['sexist', 'not sexist']

zero_shot_classifier(texts_sexism, test_labels, hypothesis_template = hypothesis_template, multi_label = True)
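
Zero-shot classification works by turning each candidate label into a sentence (the hypothesis) and asking a natural language inference model whether the input text entails it. The sketch below only shows how the hypotheses are built from the template and labels used above. `build_hypotheses` is a hypothetical helper that mirrors what the pipeline does internally.

```python
hypothesis_template = "The author of this text is {}."
test_labels = ["sexist", "not sexist"]


def build_hypotheses(template, labels):
    """Fill the template once per candidate label."""
    return [template.format(label) for label in labels]


hypotheses = build_hypotheses(hypothesis_template, test_labels)
for hypothesis in hypotheses:
    print(hypothesis)
# → The author of this text is sexist.
# → The author of this text is not sexist.
```

With `multi_label=True`, each hypothesis is scored independently, so the scores for the two labels do not have to sum to 1.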



## Token classification

In [None]:
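# aggregation_strategy="simple" merges word pieces into whole entities (it replaces the deprecated grouped_entities=True)
ner_classifier = pipeline("token-classification", model="dslim/bert-base-NER", aggregation_strategy="simple")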

sentence = "Barack Obama, the 44th President of the United States, was born in Honolulu, Hawaii, on August 4, 1961, and graduated from Harvard Law School."

ner_classifier(sentence)
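
When entities are grouped, the pipeline returns a list of dictionaries with keys such as `entity_group`, `word`, `score`, `start` and `end`. The sketch below collects the entity strings by type. `group_entities` is a hypothetical helper, and `sample_entities` is a hand-written illustration of the output shape, not actual model output (the scores and character offsets are left out).

```python
from collections import defaultdict


def group_entities(entities):
    """Collect entity strings by entity type (e.g. PER, LOC, ORG)."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


# Hand-written example of the output shape (not real predictions)
sample_entities = [
    {"entity_group": "PER", "word": "Barack Obama"},
    {"entity_group": "LOC", "word": "United States"},
    {"entity_group": "LOC", "word": "Honolulu"},
    {"entity_group": "ORG", "word": "Harvard Law School"},
]

print(group_entities(sample_entities))
```

The same helper can be applied to the real result, for example `group_entities(ner_classifier(sentence))`.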

## Generative models (Decoders)

In [None]:
from transformers import set_seed

generator = pipeline('text-generation', model='gpt2')
set_seed(42)
generator("Donald Trump is,", max_length=30, num_return_sequences=5)


## Translation

You can access various open and free translation models that allow you to translate large amounts of text without needing to pay for services like DeepL or Google Translate. This is particularly useful if you're training a multilingual model and need to annotate texts in different languages. For example, you can translate a sample of these texts into a single target language, such as English, while still using the original texts to train the model.

Open-source machine translation (MT) models enable translation between multiple languages without relying on commercial services. The University of Helsinki has uploaded models for over 1,000 language pairs to the Hugging Face hub, and Facebook AI has open-sourced several multilingual models. The EasyNMT library provides a simple wrapper for these models. While most machine translation models translate between two languages in one direction (e.g., German to English, but not the reverse), some are capable of handling translations in multiple directions.


In [None]:
from langdetect import detect  # Importing langdetect for automatic language detection

# Initialize the translation pipeline with a pre-trained model
pipeline_translate = pipeline("translation", model="facebook/m2m100_418M")

Device set to use cuda:0


In [None]:
# Example sentences in different languages
texts = [
    "Climate change is causing more frequent and intense storms.",
    "El cambio climático está causando tormentas más frecuentes e intensas.",
    "Le changement climatique provoque des tempêtes plus fréquentes et plus intenses.",
    "Der Klimawandel verursacht häufigere und intensivere Stürme.",
    "Il cambiamento climatico sta causando tempeste più frequenti e intense.",
    "As mudanças climáticas estão causando tempestades mais frequentes e intensas.",
    "Klimaatverandering veroorzaakt frequentere en intensere stormen.",
    "تغير المناخ يسبب عواصف أكثر تواترًا وشدة."
]


# Translate the sentences into English
for text in texts:
    # Detect the source language using langdetect
    src_lang = detect(text)

    # Translate the sentence
    translated_text = pipeline_translate(text, src_lang=src_lang, tgt_lang="en")[0]['translation_text']

    # Print the detected language and the translation
    print(f"Detected Language: {src_lang}")
    print(f"Original: {text}")
    print(f"Translated: {translated_text}")
    print()  # Print a newline for readability


Detected Language: en
Original: Climate change is causing more frequent and intense storms.
Translated: Climate change is causing more frequent and intense storms.

Detected Language: es
Original: El cambio climático está causando tormentas más frecuentes e intensas.
Translated: Climate change is causing more frequent and intense storms.

Detected Language: fr
Original: Le changement climatique provoque des tempêtes plus fréquentes et plus intenses.
Translated: Climate change causes more frequent and intense storms.

Detected Language: de
Original: Der Klimawandel verursacht häufigere und intensivere Stürme.
Translated: Climate change causes more frequent and intense storms.

Detected Language: it
Original: Il cambiamento climatico sta causando tempeste più frequenti e intense.
Translated: Climate change is causing more frequent and intense storms.

Detected Language: pt
Original: As mudanças climáticas estão causando tempestades mais frequentes e intensas.
Translated: Climate change i
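
Two practical details are worth handling before calling the translation model. First, langdetect can return region-tagged codes such as `zh-cn`, whereas translation models typically expect a bare code such as `zh`. Second, texts that are already in the target language do not need to be translated at all. The helpers below are hypothetical and only a sketch: not every detected language is necessarily supported by a given model, so check the model card.

```python
def normalize_lang_code(code):
    """Reduce a code like 'zh-cn' to its base language code 'zh'."""
    return code.split("-")[0].lower()


def needs_translation(src_lang, tgt_lang="en"):
    """Return True unless the text is already in the target language."""
    return normalize_lang_code(src_lang) != normalize_lang_code(tgt_lang)


for detected in ["en", "fr", "zh-cn"]:
    print(detected, normalize_lang_code(detected), needs_translation(detected))
```

langdetect can also give different answers across runs on short or ambiguous texts. Setting `DetectorFactory.seed = 0` (imported from `langdetect`) before calling `detect` makes the results reproducible.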

## Summarization

In [None]:
# Initialize the summarization pipeline with a pre-trained model (e.g., T5)
summarizer = pipeline("summarization", model="t5-small")


# Provided text
text = """
The end of the cessation of hostilities in Gaza is deeply concerning, I urge all sides not to squander progress made over the last week.
All sides must work for a return to cessation that would allow for the release of more hostages, provide much needed time and space to tackle the humanitarian crisis in Gaza, and open a dialogue for a political solution that provides for a long-term cessation of hostilities.
We will only reach that long-term solution if Israel is assured that Hamas cannot carry out an attack like October 7 ever again. Those who can influence Hamas must demand they release the remaining hostages immediately.
The levels of death and destruction over the past weeks has been intolerable. Far too many innocent Palestinians, including women and children, have been killed as part of military operations. There must be full accountability for all actions.
As fighting sadly resumes, Israel must not besiege or blockade Gaza. They must comply with international law by protecting innocent lives and civilian infrastructure like schools and hospitals.
With winter coming and the people in Gaza being forced to live in an ever-smaller section of the strip, attempts to address the humanitarian catastrophe cannot regress, aid must be ramped up. The people of Gaza need aid, food, water, fuel, shelter, and medicine in huge volumes, to ensure hospitals function and lives are saved. We know the risk of disease is high and must be mitigated.
Those displaced in this conflict also need assurances of their right to return home and rebuild their lives. Gaza cannot be left as a refugee camp, there can be no reoccupation or reduction of its territory.
The UK and partners must start work immediately to find a pathway to an enduring cessation of hostilities and a lasting political solution. We want to see the threat of Hamas removed, the end to illegal settlements and settler violence in the West Bank, and a plan for the reconstruction and renewal of Gaza.
Palestinians must be assured their future will not be like the past, that they and their children will be able to enjoy the security, opportunities and rights that we take for granted.
That will not be easy. Diplomatic work never is. But the past few days have shown what diplomacy can do.
These are the essential steps if we are to deliver a two-state solution, with a Palestinian state alongside a safe and secure Israel, the only credible basis for long-term peace.
Military action without this sort of plan cannot succeed.


"""

# Summarize the text
summary = summarizer(text, max_length=50, min_length=10, do_sample=False)

# Print the summary
print(summary[0]['summary_text'])
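
Summarization models have a limited input length, so long documents are often split into pieces that are summarized separately. The sketch below splits a text into chunks of a fixed number of words. `chunk_words` is a hypothetical helper, word counts are only a rough proxy for the tokens a model actually sees, and the chunk size is an arbitrary assumption.

```python
def chunk_words(text, max_words=300):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


sample = " ".join(f"w{i}" for i in range(10))  # Ten placeholder words
chunks = chunk_words(sample, max_words=4)
print([len(chunk.split()) for chunk in chunks])
```

Each chunk could then be passed to the summarizer, for example `[summarizer(chunk, max_length=50, min_length=10, do_sample=False)[0]["summary_text"] for chunk in chunk_words(text)]`. Splitting on sentence boundaries instead of raw word counts would likely give cleaner summaries.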