
# Building AI Applications with ChatGPT

Sumudu Tennakoon, PhD
<hr>

# NLP With OpenSource Language Models

In this notebook we will explore some basic NLP tasks with open-source language models in Python. It is intended for readers who have prior programming experience.

To learn more about Python, refer to the following website:

- Python : https://www.python.org

To learn more about the Python packages we explore in this notebook, refer to the following website:

- HuggingFace : https://huggingface.co


# Getting Started with HuggingFace

* Run the code cell below to install the required libraries before you continue. Skip it if you have already installed them.

In [None]:
!pip install transformers sentencepiece

## Sentiment Analysis

In [1]:
from transformers import pipeline
classifier = pipeline('sentiment-analysis')
classifier('I enojoy watching this movie!')


No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.9986220598220825}]

In [3]:
from transformers import pipeline
classifier = pipeline('sentiment-analysis')
classifier('This movie was the worst in the series!')

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'NEGATIVE', 'score': 0.9997859597206116}]
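
Both runs above log a warning that no model was supplied. The following is a minimal sketch of pinning the model explicitly, so the pipeline is created once and reused. The model name and revision are copied from the log message; the helper names `build_sentiment_classifier` and `signed_score` are hypothetical and introduced only for illustration.

```python
# Model name and revision copied from the warning printed by the cells above.
SENTIMENT_MODEL = "distilbert-base-uncased-finetuned-sst-2-english"
SENTIMENT_REVISION = "af0f99b"


def build_sentiment_classifier():
    """Create the sentiment pipeline once, with an explicit model and revision."""
    # Imported lazily so the helper below can be used without transformers.
    from transformers import pipeline

    return pipeline(
        "sentiment-analysis",
        model=SENTIMENT_MODEL,
        revision=SENTIMENT_REVISION,
    )


def signed_score(prediction):
    """Map one {'label', 'score'} result to [-1, 1]; negative means NEGATIVE."""
    sign = 1.0 if prediction["label"] == "POSITIVE" else -1.0
    return sign * prediction["score"]


# The two results shown above, reused so the helper can be tried offline.
recorded = [
    {"label": "POSITIVE", "score": 0.9986220598220825},
    {"label": "NEGATIVE", "score": 0.9997859597206116},
]
signed = [signed_score(p) for p in recorded]
```

With the libraries installed, calling the object returned by `build_sentiment_classifier()` on a list of sentences should return one result per sentence, so there is no need to rebuild the pipeline for each input.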

## Question Answering

In [10]:

nlp = pipeline("question-answering")

context = """ Marie Curie, née Maria Sklodowska, was born in Warsaw on November 7, \
1867, the daughter of a secondary-school teacher. She received a general education \
in local schools and some scientific training from her father. She became involved \
in a students’ revolutionary organization and found it prudent to leave Warsaw, then \
in the part of Poland dominated by Russia, for Cracow, which at that time was under \
Austrian rule. In 1891, she went to Paris to continue her studies at the Sorbonne \
where she obtained Licenciateships in Physics and the Mathematical Sciences. She met \
Pierre Curie, Professor in the School of Physics in 1894 and in the following year \
they were married. She succeeded her husband as Head of the Physics Laboratory at \
the Sorbonne, gained her Doctor of Science degree in 1903, and following the tragic \
death of Pierre Curie in 1906, she took his place as Professor of General Physics in \
the Faculty of Sciences, the first time a woman had held this position. She was also \
appointed Director of the Curie Laboratory in the Radium Institute of the University \
of Paris, founded in 1914.
"""

nlp(question="When did Marie Curie Born?", context=context)


No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


{'score': 0.9644644856452942,
 'start': 58,
 'end': 74,
 'answer': 'November 7, 1867'}
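
The `start` and `end` values are character offsets into `context`. A small sketch of checking this, using only the first sentence of the context and the result recorded above (the variable names are illustrative):

```python
# First sentence of the context above, including its leading space.
context_head = (
    " Marie Curie, née Maria Sklodowska, was born in Warsaw on November 7, "
    "1867, the daughter of a secondary-school teacher."
)

# Result dictionary copied from the output above.
result = {
    "score": 0.9644644856452942,
    "start": 58,
    "end": 74,
    "answer": "November 7, 1867",
}

# Slicing the context with the offsets recovers the answer text.
span = context_head[result["start"]:result["end"]]
print(span == result["answer"])
```

This is useful when the answer needs to be highlighted in the original text rather than only displayed on its own.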

In [13]:
nlp(question="What are the positions Marie Curie held at University of Paris?", context=context)

{'score': 0.4384763538837433,
 'start': 1001,
 'end': 1033,
 'answer': 'Director of the Curie Laboratory'}
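
The question asks for several positions, but an extractive model returns a single span per answer, and the score here is fairly low. One option, sketched below, is to request several candidate spans with the pipeline's `top_k` argument and keep those above a threshold. The helper names and the `0.3` threshold are assumptions chosen for illustration, not recommended values.

```python
def top_answers(qa_pipeline, question, context, k=3):
    """Ask for up to k candidate spans instead of only the best one."""
    results = qa_pipeline(question=question, context=context, top_k=k)
    # With a single candidate the pipeline returns a dict rather than a list.
    return results if isinstance(results, list) else [results]


def confident_answers(results, min_score=0.3):
    """Keep the answer text of candidates whose score reaches min_score."""
    return [r["answer"] for r in results if r["score"] >= min_score]


# Offline check using the single result recorded above.
recorded = [
    {
        "score": 0.4384763538837433,
        "start": 1001,
        "end": 1033,
        "answer": "Director of the Curie Laboratory",
    },
]
kept = confident_answers(recorded)
```

In the notebook, `top_answers(nlp, "What are the positions Marie Curie held at University of Paris?", context)` would return the candidate list to pass to `confident_answers`.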

## Text Generation

In [30]:
text_generator = pipeline("text-generation")
text_generator("An apple fell from the", max_length=6, do_sample=True)

No model was supplied, defaulted to gpt2 and revision 6c0e608 (https://huggingface.co/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'An apple fell from the sky'}]
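
Note that `max_length` counts the prompt tokens as well as the generated ones, and that `do_sample=True` makes the output vary between runs. The sketch below bounds only the newly generated tokens with `max_new_tokens` and fixes the random seed with `set_seed`. The helper names `generate_short` and `continuation`, and the seed value, are assumptions for illustration.

```python
def generate_short(prompt, new_tokens=1, seed=42):
    """Sample a short continuation reproducibly (downloads gpt2 on first use)."""
    # Imported lazily so the helper below can be used without transformers.
    from transformers import pipeline, set_seed

    set_seed(seed)  # makes sampling repeatable for a given seed
    generator = pipeline("text-generation", model="gpt2")
    # max_new_tokens limits only the generated part, unlike max_length.
    return generator(prompt, max_new_tokens=new_tokens, do_sample=True)


def continuation(prompt, outputs):
    """Return the text each output added after the prompt."""
    return [o["generated_text"][len(prompt):] for o in outputs]


# Offline check with the output recorded above.
prompt = "An apple fell from the"
recorded = [{"generated_text": "An apple fell from the sky"}]
added = continuation(prompt, recorded)
```

Here `added` holds only the generated part of each output, which is often what an application wants to display.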

## Translation

In [3]:
from transformers import pipeline

MODEL = "Helsinki-NLP/opus-mt-en-fr"

text = "Hello. How are you?"
translator = pipeline("translation", model=MODEL)

translator(text)



[{'translation_text': 'Bonjour, comment allez-vous ?'}]
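
The translator returns a list with one dictionary per input. A minimal sketch of translating several sentences in one call and collecting plain strings follows; `translate_all` and `fake_translator` are hypothetical names, and the stand-in simply replays the output recorded above so the helper can be tried without downloading the model.

```python
def translate_all(translator, sentences):
    """Translate a list of sentences and return plain strings."""
    outputs = translator(sentences)  # one {'translation_text': ...} per sentence
    return [o["translation_text"] for o in outputs]


def fake_translator(sentences):
    """Stand-in for the real pipeline; replays the output recorded above."""
    recorded = {"Hello. How are you?": "Bonjour, comment allez-vous ?"}
    return [{"translation_text": recorded[s]} for s in sentences]


french = translate_all(fake_translator, ["Hello. How are you?"])
```

In the notebook, passing the real `translator` object instead of `fake_translator` should work the same way.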

## Summarization

In [4]:
from transformers import pipeline

MODEL="facebook/bart-large-cnn"

summarizer = pipeline("summarization", model=MODEL)

text = """ Marie Curie, née Maria Sklodowska, was born in Warsaw on November 7, \
1867, the daughter of a secondary-school teacher. She received a general education \
in local schools and some scientific training from her father. She became involved \
in a students’ revolutionary organization and found it prudent to leave Warsaw, then \
in the part of Poland dominated by Russia, for Cracow, which at that time was under \
Austrian rule. In 1891, she went to Paris to continue her studies at the Sorbonne \
where she obtained Licenciateships in Physics and the Mathematical Sciences. She met \
Pierre Curie, Professor in the School of Physics in 1894 and in the following year \
they were married. She succeeded her husband as Head of the Physics Laboratory at \
the Sorbonne, gained her Doctor of Science degree in 1903, and following the tragic \
death of Pierre Curie in 1906, she took his place as Professor of General Physics in \
the Faculty of Sciences, the first time a woman had held this position. She was also \
appointed Director of the Curie Laboratory in the Radium Institute of the University \
of Paris, founded in 1914.
"""

summarizer(text)



[{'summary_text': 'Marie Curie, née Maria Sklodowska, was born in Warsaw on November 7, 1867. She received a general education in local schools and some scientific training from her father. In 1891, she went to Paris to continue her studies at the Sorbonne where she obtained Licenciateships in Physics and the Mathematical Sciences.'}]
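
The summary above is drawn largely from the opening sentences of the text. Its length can be steered with the `max_length` and `min_length` generation arguments, which are measured in tokens. The sketch below wraps that call; the helper names and the default bounds of 60 and 20 are arbitrary assumptions for illustration.

```python
def summarize(summarizer, text, max_length=60, min_length=20):
    """Return a summary string with explicit length bounds (in tokens)."""
    outputs = summarizer(
        text,
        max_length=max_length,
        min_length=min_length,
        do_sample=False,  # deterministic decoding
    )
    return outputs[0]["summary_text"]


def compression_ratio(original, summary):
    """Summary length as a fraction of the original, counted in words."""
    return len(summary.split()) / len(original.split())


# Tiny offline example of the ratio helper.
ratio = compression_ratio("one two three four", "one two")
```

In the notebook, `summarize(summarizer, text)` would return the summary as a plain string, and `compression_ratio(text, ...)` gives a rough idea of how much shorter it is.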

## Classification

In [5]:
from transformers import pipeline

MODEL="facebook/bart-large-mnli"

classifier = pipeline("zero-shot-classification", model=MODEL)

sequence_to_classify = "Today I am going to prepare a dinner for my friends"

candidate_labels = ['travel', 'cooking', 'playing', 'learning']

classifier(sequence_to_classify, candidate_labels)



{'sequence': 'Today I am going to prepare a dinner for my friends',
 'labels': ['cooking', 'learning', 'playing', 'travel'],
 'scores': [0.9570968747138977,
  0.03565406799316406,
  0.005505912937223911,
  0.0017431112937629223]}

In [6]:
sequence_to_classify = "I am going to visit Paris next year"

candidate_labels = ['travel', 'cooking', 'playing', 'learning']

classifier(sequence_to_classify, candidate_labels)

{'sequence': 'I am going to visit Paris next year',
 'labels': ['travel', 'learning', 'playing', 'cooking'],
 'scores': [0.7273194789886475,
  0.17637674510478973,
  0.09052838385105133,
  0.005775348283350468]}

In [7]:
sequence_to_classify = "I scored 75 runs in the cricket match yesterday"

candidate_labels = ['travel', 'cooking', 'playing', 'learning']

classifier(sequence_to_classify, candidate_labels)

{'sequence': 'I scored 75 runs in the cricket match yesterday',
 'labels': ['playing', 'learning', 'travel', 'cooking'],
 'scores': [0.8781684041023254,
  0.09010912477970123,
  0.018548928201198578,
  0.013173486106097698]}
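
In each result the labels are sorted by descending score. A short sketch of picking the top label and of classifying several sentences in one call follows. The helper names are hypothetical; `multi_label` is an argument of the zero-shot pipeline that scores each label independently instead of making the labels compete with each other.

```python
def top_label(result):
    """Return the (label, score) pair with the highest score."""
    return max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])


def classify_many(classifier, sequences, candidate_labels, multi_label=False):
    """Classify several sequences in one call and return their top labels."""
    results = classifier(sequences, candidate_labels, multi_label=multi_label)
    # A single input may come back as one dict rather than a list.
    if isinstance(results, dict):
        results = [results]
    return [top_label(r)[0] for r in results]


# Offline check with the labels and scores of the last result above.
recorded = {
    "labels": ["playing", "learning", "travel", "cooking"],
    "scores": [
        0.8781684041023254,
        0.09010912477970123,
        0.018548928201198578,
        0.013173486106097698,
    ],
}
best = top_label(recorded)
```

In the notebook, `classify_many(classifier, [...], candidate_labels)` would avoid repeating the same cell for every sentence.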

<hr/>
Last update 2023-07-04 by Sumudu Tennakoon

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.