# **Hands-On GPT Models: Exploring Advanced Language Processing with Hugging Face Transformers**

Welcome to the "Hands-On GPT Models" Jupyter Notebook! This notebook explores pretrained language models from the Hugging Face Transformers library through five short, runnable tasks. It is designed to be informative and engaging for both beginners and experienced practitioners in natural language processing (NLP).

## **Overview:**
### **Text Generation with GPT-2:**
  Discover the power of GPT-2 in generating creative and contextually relevant text based on given prompts.

### **Sentiment Analysis with DistilBERT:**
  Dive into sentiment analysis using DistilBERT to understand the emotional tone of various sentences.

### **Text Summarization with DistilBART:**
  Explore the art of summarization with DistilBART, distilling long articles into concise and informative summaries.

### **Named Entity Recognition (NER) with BERT:**
  Uncover named entities within text using BERT, a bidirectional transformer model, for advanced information extraction.

### **Question Answering with DistilBERT:**
  Engage in dynamic question-answering tasks, extracting relevant information from a given context using DistilBERT.

In [None]:
# Importing the pipeline factory function from the Hugging Face Transformers library
from transformers import pipeline

## **Text Generation using GPT-2**

In [None]:
# Creating a text generation pipeline using the GPT-2 model
generator = pipeline('text-generation', model='gpt2')

# Generating multiple sequences based on a prompt
generator("I read a good novel.", max_length=30, num_return_sequences=5)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "I read a good novel.\n\n\nYou're probably wondering who the real author of all the books on this list is. Some of the reasons were"},
 {'generated_text': 'I read a good novel. I read an essay. I read what George Orwell wrote a long time ago, and I said to myself, this is'},
 {'generated_text': 'I read a good novel. The story tells some of the story behind how you start your life, not how you lose your love.\n\nI'},
 {'generated_text': 'I read a good novel. They used to be people you could write about with some sympathy.\n\nThis sort of thinking has been going on for'},
 {'generated_text': 'I read a good novel. Every time I read a word, it was with excitement, like a game where every character I mentioned would be doing something'}]
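
Each sample above echoes the prompt and stops mid-sentence, because generation halts once `max_length` tokens (prompt included) are reached. As a minimal post-processing sketch, the helper below drops the echoed prompt and keeps only complete sentences. `trim_continuation` is a hypothetical name introduced here for illustration, not part of Transformers, and the simple punctuation rule is an assumption that will not handle every case (closing quotes after a period, for instance).

```python
def trim_continuation(generated_text, prompt):
    """Drop the echoed prompt and cut the text after its last sentence-ending mark."""
    # The pipeline output starts with the prompt, so strip it when present.
    if generated_text.startswith(prompt):
        continuation = generated_text[len(prompt):]
    else:
        continuation = generated_text
    # Position of the last '.', '!' or '?'; -1 if none of them occurs.
    cut = max(continuation.rfind(mark) for mark in ".!?")
    if cut == -1:
        return continuation.strip()
    return continuation[:cut + 1].strip()


prompt = "I read a good novel."
# Two generated_text values copied from the output above.
samples = [
    "I read a good novel.\n\n\nYou're probably wondering who the real author of all the books on this list is. Some of the reasons were",
    "I read a good novel. I read an essay. I read what George Orwell wrote a long time ago, and I said to myself, this is",
]
trimmed = [trim_continuation(sample, prompt) for sample in samples]
for text in trimmed:
    print(repr(text))
```

Because GPT-2 samples its output here, rerunning the generation cell will generally produce different text, so the helper is written to work on any `generated_text` string rather than on these two samples specifically.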

In [None]:
# Generating sequences for a different prompt
generator("We went on a movie date.", max_length=30, num_return_sequences=5)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'We went on a movie date. There were so many stars on set."\n\nThe actress was particularly excited for the movie about her friend, who'},
 {'generated_text': 'We went on a movie date. We went down there in the studio after that and got married. We stayed three days. My partner of two years'},
 {'generated_text': 'We went on a movie date. He got tired of watching the stuff. He went to bed, and the movie date was never over because it was'},
 {'generated_text': 'We went on a movie date. I was like "Oh shit." And they never told me what happened because this guy was out for lunch. So'},
 {'generated_text': "We went on a movie date. My wife and I were standing in front of my bedroom for what seemed like an hour. I didn't realize it"}]

## **Sentiment Analysis using DistilBERT**

In [None]:
# Creating a sentiment analysis pipeline using the DistilBERT model
sentiment_analyzer = pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english')

# Defining a list of sentences for sentiment analysis
sentences = [
    "I love this product! It's amazing.",
    "The weather today is quite gloomy.",
    "The movie was not good.",
    "This restaurant serves delicious food."
]

# Analyzing sentiment for each sentence and printing the results
for sentence in sentences:
    result = sentiment_analyzer(sentence)
    print(f"Sentence: '{sentence}'\nSentiment: {result[0]['label']} ({result[0]['score']:.4f})\n")


Sentence: 'I love this product! It's amazing.'
Sentiment: POSITIVE (0.9999)

Sentence: 'The weather today is quite gloomy.'
Sentiment: NEGATIVE (0.9939)

Sentence: 'The movie was not good.'
Sentiment: NEGATIVE (0.9998)

Sentence: 'This restaurant serves delicious food.'
Sentiment: POSITIVE (0.9999)
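
This checkpoint is fine-tuned on SST-2, so it only ever answers POSITIVE or NEGATIVE; a merely descriptive sentence such as the weather example still gets one of the two labels, here with slightly lower confidence. The sketch below, using the labels and rounded scores printed above, tallies the labels and flags predictions under a cut-off for manual review. `REVIEW_THRESHOLD` and its value are assumptions chosen only for illustration.

```python
from collections import Counter

# (sentence, label, score) triples copied from the printed results above.
results = [
    ("I love this product! It's amazing.", "POSITIVE", 0.9999),
    ("The weather today is quite gloomy.", "NEGATIVE", 0.9939),
    ("The movie was not good.", "NEGATIVE", 0.9998),
    ("This restaurant serves delicious food.", "POSITIVE", 0.9999),
]

# Hypothetical cut-off: predictions scoring below it are set aside for a human to check.
REVIEW_THRESHOLD = 0.995

# Count how many sentences received each label.
label_counts = Counter(label for _, label, _ in results)

# Collect the sentences whose confidence falls below the cut-off.
needs_review = [sentence for sentence, _, score in results if score < REVIEW_THRESHOLD]

print(dict(label_counts))
print(needs_review)
```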



## **Text Summarization using DistilBART**

In [None]:
# Creating a summarization pipeline using the DistilBART model
summarizer = pipeline('summarization', model='sshleifer/distilbart-cnn-12-6')

# Providing an article for summarization
article = """
Hubble Space Telescope has captured a stunning image of a distant galaxy.
The galaxy, known as NGC 4680, is located in the constellation of Hydra.
It is a spiral galaxy with arms stretching outward from a bright central core.
The image reveals intricate details of the galaxy's structure, including numerous stars and dust clouds.
Scientists use such images to study the formation and evolution of galaxies in the universe.
"""

# Generating a summary for the article, then printing the original article and the generated summary
summary = summarizer(article, max_length=150, min_length=50, length_penalty=2.0, num_beams=4)
print("Original Article:")
print(article)
print("\nGenerated Summary:")
print(summary[0]['summary_text'])

Your max_length is set to 150, but your input_length is only 90. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=45)


Original Article:

Hubble Space Telescope has captured a stunning image of a distant galaxy.
The galaxy, known as NGC 4680, is located in the constellation of Hydra.
It is a spiral galaxy with arms stretching outward from a bright central core.
The image reveals intricate details of the galaxy's structure, including numerous stars and dust clouds.
Scientists use such images to study the formation and evolution of galaxies in the universe.


Generated Summary:
 Hubble Space Telescope captures image of galaxy known as NGC 4680 . It is a spiral galaxy with arms stretching outward from a bright central core . The image reveals intricate details of the galaxy's structure, including numerous stars and dust clouds . Scientists use such images to study the formation and evolution of galaxies in the universe .
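
The warning above appears because `max_length=150` exceeds the 90-token input, and the resulting summary repeats most of the article almost verbatim. One way to avoid this is to derive the length bounds from the input length. The sketch below is a rough heuristic under stated assumptions: `suggest_lengths` is a hypothetical helper, and the one-half and one-fifth ratios are illustrative choices (the one-half ratio mirrors the `max_length=45` suggestion in the warning for a 90-token input).

```python
def suggest_lengths(input_length):
    """Return (min_length, max_length) token bounds scaled to the input length.

    Assumed heuristic: allow at most half of the input length and require
    at least one fifth of it, never going below one token.
    """
    max_length = max(1, input_length // 2)
    min_length = max(1, min(input_length // 5, max_length))
    return min_length, max_length


# 90 is the input_length reported in the warning above.
min_len, max_len = suggest_lengths(90)
print(min_len, max_len)
```

The returned values could then be passed as `min_length` and `max_length` in the `summarizer(...)` call; the generated summary would differ from the one shown above.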


## **Named Entity Recognition (NER) using BERT**

In [None]:
# Creating a named entity recognition (NER) pipeline using the BERT model
ner_analyzer = pipeline('ner', model='dbmdz/bert-large-cased-finetuned-conll03-english')

# Analyzing named entities in a given text and printing the results
entities = ner_analyzer("Apple Inc. is planning to open a new store in Paris.")
print(entities)


Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'entity': 'I-ORG', 'score': 0.99973315, 'index': 1, 'word': 'Apple', 'start': 0, 'end': 5}, {'entity': 'I-ORG', 'score': 0.9994981, 'index': 2, 'word': 'Inc', 'start': 6, 'end': 9}, {'entity': 'I-LOC', 'score': 0.9995732, 'index': 12, 'word': 'Paris', 'start': 46, 'end': 51}]
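
The output above is token-level: "Apple" and "Inc" are reported separately even though they form one organization name. The sketch below merges adjacent tokens of the same entity type into a single span, using the `index`, `start`, and `end` fields shown above. `group_entities` is a hypothetical helper written for illustration, and it handles the tagging scheme in a simplified way (a `B-` tag starts a new span, everything else continues the previous one if adjacent). The pipeline also accepts an `aggregation_strategy` argument that performs grouping of this kind internally.

```python
# Token-level predictions copied from the NER output above.
entities = [
    {'entity': 'I-ORG', 'score': 0.99973315, 'index': 1, 'word': 'Apple', 'start': 0, 'end': 5},
    {'entity': 'I-ORG', 'score': 0.9994981, 'index': 2, 'word': 'Inc', 'start': 6, 'end': 9},
    {'entity': 'I-LOC', 'score': 0.9995732, 'index': 12, 'word': 'Paris', 'start': 46, 'end': 51},
]
text = "Apple Inc. is planning to open a new store in Paris."


def group_entities(tokens, text):
    """Merge adjacent tokens that share an entity type into one span."""
    groups = []
    for tok in tokens:
        prefix, _, label = tok['entity'].partition('-')  # 'I-ORG' -> ('I', '-', 'ORG')
        last = groups[-1] if groups else None
        continues = (
            last is not None
            and prefix != 'B'                          # a B- tag always opens a new span
            and last['label'] == label                 # same entity type
            and tok['index'] == last['last_index'] + 1  # directly follows the previous token
        )
        if continues:
            last['end'] = tok['end']
            last['last_index'] = tok['index']
            last['score'] = min(last['score'], tok['score'])
        else:
            groups.append({'label': label, 'start': tok['start'], 'end': tok['end'],
                           'last_index': tok['index'], 'score': tok['score']})
    # Recover the surface text from character offsets; report the lowest token score.
    return [{'label': g['label'], 'text': text[g['start']:g['end']], 'score': g['score']}
            for g in groups]


grouped = group_entities(entities, text)
for span in grouped:
    print(span['label'], repr(span['text']))
```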


## **Question Answering using DistilBERT**

In [None]:
# Creating a question-answering pipeline using the DistilBERT model
question_answerer = pipeline('question-answering', model='distilbert-base-cased-distilled-squad')

# Providing a context and a question for question-answering and printing the answer
context = "Hugging Face is a company that specializes in natural language processing."
question = "What does Hugging Face specialize in?"
answer = question_answerer(question=question, context=context)
print(answer['answer'])

natural language processing
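
This model is extractive: the answer is a span taken directly from the context rather than newly written text. Besides `answer`, the result dictionary typically also carries `score`, `start`, and `end`, where the last two are character offsets into the context. As a model-free sketch, the hypothetical helper below locates the printed answer with `str.find` (standing in for those offsets, which are not shown above) and marks it in the context.

```python
context = "Hugging Face is a company that specializes in natural language processing."
answer_text = "natural language processing"  # the answer printed above


def highlight_answer(context, answer_text, marker="**"):
    """Wrap the first occurrence of the answer in the context with a marker."""
    start = context.find(answer_text)
    if start == -1:
        # The answer is not a literal substring; return the context unchanged.
        return context
    end = start + len(answer_text)
    return context[:start] + marker + context[start:end] + marker + context[end:]


highlighted = highlight_answer(context, answer_text)
print(highlighted)
```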
