In [1]:
"""
The four sentences are made-up reviews: the first two are positive and the last two are negative.
We load the "sentiment-analysis" pipeline from Hugging Face.
The output shows that the positive and negative labels are classified as intended.
"""
# Sentiment analysis
from transformers import pipeline
classifier = pipeline('sentiment-analysis')
data = ["This is what a true masterpiece looks like", 
     "brilliant film, hard to better",
     "Are you kidding me. A horrible movie about horrible people.",
     "the plot itself is also very boring"]
results = classifier(data)
for result in results:
    print(f"label: {result['label']}, score: {round(result['score'], 3)}")


No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english)
All model checkpoint layers were used when initializing TFDistilBertForSequenceClassification.

All the layers of TFDistilBertForSequenceClassification were initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.


label: POSITIVE, score: 1.0
label: POSITIVE, score: 1.0
label: NEGATIVE, score: 1.0
label: NEGATIVE, score: 1.0
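
The label and score in the output above come from the model's two output logits: as far as I know, the default text-classification pipeline applies a softmax and reports the most probable class. The sketch below reproduces that post-processing in plain Python. The logit values are made up for illustration, the label order is an assumption about the checkpoint's id-to-label mapping, and `logits_to_prediction` is a hypothetical helper, not part of transformers.

```python
import math

# Assumed label order (0: NEGATIVE, 1: POSITIVE); check the checkpoint's config to confirm.
LABELS = ["NEGATIVE", "POSITIVE"]

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def logits_to_prediction(logits):
    """Hypothetical helper: pick the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": LABELS[best], "score": probs[best]}

# Made-up logits for a strongly positive sentence (not real model output).
prediction = logits_to_prediction([-4.2, 4.5])
print(prediction["label"], round(prediction["score"], 3))
```

A probability of about 0.9998 rounds to 1.0 at three decimals, which is why a confident prediction prints with a score of 1.0.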


In [2]:
"""
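By default, the pipeline uses a model and tokenizer pretrained on English text.
The sentence below is negative but written in Korean, and the result comes out as POSITIVE.
This shows that the model did not understand the sentence.
"""
classifier("나는 수학이 어렵다")  # Korean for "Math is hard for me"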

[{'label': 'POSITIVE', 'score': 0.6286845803260803}]

In [3]:
"""
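If we load a model trained on multiple languages instead, the result changes.
The model is expected to rate the sentence 2 stars out of 5, recognizing it as negative.

Note: in this run the cell fails with a 404 error, because no TensorFlow weights
(tf_model.h5) were found for this checkpoint.
"""
# Load the multilingual model
classifier = pipeline('sentiment-analysis', 
                      model="nlptown/bert-base-multilingual-uncased-sentiment")
classifier("나는 수학이 어렵다")  # Korean for "Math is hard for me"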

404 Client Error: Not Found for url: https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment/resolve/main/tf_model.h5
404 Client Error: Not Found for url: https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment/resolve/main/tf_model.h5


ValueError: Could not load model nlptown/bert-base-multilingual-uncased-sentiment with any of the following classes: (<class 'transformers.models.auto.modeling_tf_auto.TFAutoModelForSequenceClassification'>, <class 'transformers.models.bert.modeling_tf_bert.TFBertForSequenceClassification'>).
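
The 404 above points at tf_model.h5, which suggests that the checkpoint had no TensorFlow weights at the time of this run. One possible workaround, assuming PyTorch is installed alongside TensorFlow, is to convert the PyTorch weights on load with `from_pt=True`. This is an untested sketch: `build_multilingual_classifier` is a hypothetical helper, and whether it works depends on the installed transformers version.

```python
MODEL_NAME = "nlptown/bert-base-multilingual-uncased-sentiment"

def build_multilingual_classifier(model_name=MODEL_NAME):
    """Hypothetical helper: build a sentiment pipeline from PyTorch weights.

    Assumes both TensorFlow and PyTorch are installed; downloads the model on first call.
    """
    # Imported lazily so that defining the helper does not trigger any download.
    from transformers import (AutoTokenizer,
                              TFAutoModelForSequenceClassification, pipeline)

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # from_pt=True asks transformers to convert the PyTorch checkpoint to TensorFlow.
    model = TFAutoModelForSequenceClassification.from_pretrained(model_name, from_pt=True)
    return pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

# Usage (requires network access):
# classifier = build_multilingual_classifier()
# print(classifier("나는 수학이 어렵다"))
```

If only PyTorch is available, passing `framework="pt"` to `pipeline` may be a simpler alternative.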

In [None]:
"""
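Let's try a few more Korean sentences with the multilingual model.
The expected result is reasonably accurate: positive sentences get 5 stars, and negative ones get 3 and 2 stars.
Note: this cell was not run because of the model-loading (404) error above.
"""
# Sentiment analysis (multilingual)
classifier = pipeline('sentiment-analysis', 
                      model="nlptown/bert-base-multilingual-uncased-sentiment")

data = ["이 영화 최고",  # "This movie is the best"
        "너무 지루하다",  # "So boring"
        "또 보고싶은 최고의 걸작이다.",  # "A masterpiece I want to watch again."
        "내 취향은 아니다."]  # "Not my taste."

results = classifier(data)

# Pair each input sentence with its prediction
for sentence, result in zip(data, results):
    print(f"sentence: {sentence}, label: {result['label']}, score: {round(result['score'], 3)}")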
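
Because this model reports star ratings rather than POSITIVE/NEGATIVE, a small mapping can help when comparing it with the English classifier. The sketch below assumes labels of the form "1 star" through "5 stars"; `stars_to_sentiment` is a hypothetical helper, the thresholds are an arbitrary choice, and the sample results are hand-written, not model output.

```python
def stars_to_sentiment(label):
    """Hypothetical helper: map a label such as '2 stars' to a coarse polarity."""
    stars = int(label.split()[0])  # '2 stars' -> 2
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return "neutral"  # 3 stars: treated as neutral here, an arbitrary choice

# Hand-written sample results in the pipeline's output format (not real predictions).
sample_results = [
    {"label": "5 stars", "score": 0.7},
    {"label": "2 stars", "score": 0.5},
    {"label": "3 stars", "score": 0.4},
]
polarities = [stars_to_sentiment(r["label"]) for r in sample_results]
print(polarities)
```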

In [4]:
"""
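5-2 Question Answering
Load the pretrained question-answering pipeline.
We pass in the introduction that comes up when searching for TensorFlow on Wikipedia and check the answers.
When asked about the input text, the model finds a suitable answer span within the passage and returns it.
This kind of task resembles a Korean or English reading-comprehension exam, where a passage is given
and several questions are asked about it.
"""
# Question answering
# Excerpt from the TensorFlow introduction at https://en.wikipedia.org/wiki/TensorFlow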
from transformers import pipeline
nlp = pipeline("question-answering")
data = r"""
TensorFlow is Google Brain's second-generation system. Version 1.0.0 was released on February 11, 2017.[14] While the reference implementation runs on single devices, TensorFlow can run on multiple CPUs and GPUs (with optional CUDA and SYCL extensions for general-purpose computing on graphics processing units).[15] TensorFlow is available on 64-bit Linux, macOS, Windows, and mobile computing platforms including Android and iOS.
Its flexible architecture allows for the easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices.
TensorFlow computations are expressed as stateful dataflow graphs. The name TensorFlow derives from the operations that such neural networks perform on multidimensional data arrays, which are referred to as tensors. During the Google I/O Conference in June 2016, Jeff Dean stated that 1,500 repositories on GitHub mentioned TensorFlow, of which only 5 were from Google.[16]
In December 2017, developers from Google, Cisco, RedHat, CoreOS, and CaiCloud introduced Kubeflow at a conference. Kubeflow allows operation and deployment of TensorFlow on Kubernetes.
In March 2018, Google announced TensorFlow.js version 1.0 for machine learning in JavaScript.[17]
In Jan 2019, Google announced TensorFlow 2.0.[18] It became officially available in Sep 2019.[19]
In May 2019, Google announced TensorFlow Graphics for deep learning in computer graphics.[20]
"""
q1 = "What is TensorFlow?"
result = nlp(question=q1, context=data)
print(f"question: {q1}, answer: '{result['answer']}', score: {round(result['score'], 3)}")

q2 = "When is TensorFlow 2.0 announced?"
result = nlp(question=q2, context=data)
print(f"question: {q2}, answer: '{result['answer']}', score: {round(result['score'], 3)}")

No model was supplied, defaulted to distilbert-base-cased-distilled-squad (https://huggingface.co/distilbert-base-cased-distilled-squad)

Some layers from the model checkpoint at distilbert-base-cased-distilled-squad were not used when initializing TFDistilBertForQuestionAnswering: ['dropout_19']
- This IS expected if you are initializing TFDistilBertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some layers of TFDistilBertForQuestionAnswering were not initialized from the model checkpoint at distilbert-base-cased-distilled-squad and are newly initialized: ['dropout_39']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

question: What is TensorFlow?, answer: 'Google Brain's second-generation system', score: 0.801
question: When is TensorFlow 2.0 announced?, answer: 'Jan 2019', score: 0.771
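
The answers above are spans copied out of the context, not newly written text. A common way to pick such a span, sketched here under simplifying assumptions, is to score every candidate start and end position and keep the pair with the highest combined score. The tokens and logits below are made up, and `best_span` is a hypothetical helper; the real pipeline works on subword tokens and converts scores to probabilities, so its numbers will differ.

```python
def best_span(start_logits, end_logits, max_answer_len=5):
    """Return (start, end, score) maximizing start_logits[i] + end_logits[j], with i <= j."""
    best = None
    for i, start_score in enumerate(start_logits):
        # Only consider spans of at most max_answer_len tokens that end at or after i.
        for j in range(i, min(i + max_answer_len, len(end_logits))):
            score = start_score + end_logits[j]
            if best is None or score > best[2]:
                best = (i, j, score)
    return best

# Made-up word-level tokens and logits for illustration only.
tokens = ["In", "Jan", "2019", ",", "Google", "announced", "TensorFlow", "2.0"]
start_logits = [0.1, 3.2, 0.5, -1.0, 0.8, -0.5, 0.2, 0.0]
end_logits = [-0.3, 0.4, 3.0, -1.2, 0.1, -0.4, 0.3, 1.1]

start, end, score = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(answer)
```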


In [5]:
"""
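5-3 Text Generation
Load the text-generation pipeline, pass "I love you, I will" as the seed sentence,
and set the maximum length of the generated text to 10 tokens (max_length counts the prompt as well).
The result is a clean, natural-looking sentence.
"""
# Text generation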
from transformers import pipeline

text_generator = pipeline("text-generation")
data = "I love you, I will"
print(text_generator(data, max_length=10, do_sample=False))

No model was supplied, defaulted to gpt2 (https://huggingface.co/gpt2)


All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence


[{'generated_text': 'I love you, I will never forget you.'}]
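
With do_sample=False and no beam search configured, generation is deterministic: at each step the most likely next token is appended until the length limit is reached. The toy sketch below illustrates that greedy loop. The score table is made up and only looks at the previous token, whereas a real language model conditions on the whole prefix; `greedy_generate` is a hypothetical helper.

```python
# Made-up next-token scores standing in for a language model.
NEXT_TOKEN_SCORES = {
    "will": {"never": 0.6, "always": 0.3, "not": 0.1},
    "never": {"forget": 0.7, "leave": 0.3},
    "forget": {"you": 0.9, "it": 0.1},
    "you": {".": 0.8, ",": 0.2},
}

def greedy_generate(prompt_tokens, max_length):
    """Append the highest-scoring next token until max_length total tokens."""
    tokens = list(prompt_tokens)
    while len(tokens) < max_length:  # the prompt counts toward max_length
        candidates = NEXT_TOKEN_SCORES.get(tokens[-1])
        if not candidates:
            break  # no known continuation in the toy table
        tokens.append(max(candidates, key=candidates.get))
    return tokens

prompt = ["I", "love", "you", ",", "I", "will"]
generated = greedy_generate(prompt, max_length=10)
print(generated)
```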


In [6]:
"""
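5-4 Summarization
We reuse the TensorFlow introduction from Wikipedia, this time to summarize it.
The summary length is limited to between 10 and 50 tokens (min_length and max_length count tokens, not words).
In the output of this run, the summary covers the platforms TensorFlow is available on
and its flexible architecture, and it appears to be cut off mid-word at the length limit.
"""
# Summarization
from transformers import pipeline
summarizer = pipeline("summarization")
data = """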
TensorFlow is Google Brain's second-generation system. Version 1.0.0 was released on February 11, 2017.[14] While the reference implementation runs on single devices, TensorFlow can run on multiple CPUs and GPUs (with optional CUDA and SYCL extensions for general-purpose computing on graphics processing units).[15] TensorFlow is available on 64-bit Linux, macOS, Windows, and mobile computing platforms including Android and iOS.
Its flexible architecture allows for the easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices.
TensorFlow computations are expressed as stateful dataflow graphs. The name TensorFlow derives from the operations that such neural networks perform on multidimensional data arrays, which are referred to as tensors. During the Google I/O Conference in June 2016, Jeff Dean stated that 1,500 repositories on GitHub mentioned TensorFlow, of which only 5 were from Google.[16]
In December 2017, developers from Google, Cisco, RedHat, CoreOS, and CaiCloud introduced Kubeflow at a conference. Kubeflow allows operation and deployment of TensorFlow on Kubernetes.
In March 2018, Google announced TensorFlow.js version 1.0 for machine learning in JavaScript.[17]
In Jan 2019, Google announced TensorFlow 2.0.[18] It became officially available in Sep 2019.[19]
In May 2019, Google announced TensorFlow Graphics for deep learning in computer graphics.[20]
"""

print(summarizer(data, max_length=50, min_length=10, do_sample=False))

No model was supplied, defaulted to t5-small (https://huggingface.co/t5-small)


All model checkpoint layers were used when initializing TFT5ForConditionalGeneration.

All the layers of TFT5ForConditionalGeneration were initialized from the model checkpoint at t5-small.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFT5ForConditionalGeneration for predictions without further training.


[{'summary_text': 'TensorFlow is available on 64-bit Linux, macOS, Windows, and mobile computing platforms including Android and iOS . its flexible architecture allows for the easy deployment of computation across a variety of platforms (CPUs, GPU'}]
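
The summary above stops mid-word, presumably because generation hit max_length. One optional post-processing step, sketched here as an assumption rather than anything the pipeline does, is to drop the trailing fragment after the last complete sentence. `trim_to_last_sentence` is a hypothetical helper.

```python
import re

def trim_to_last_sentence(text):
    """Hypothetical helper: drop a trailing fragment after the last sentence end."""
    # A sentence end is '.', '!' or '?' followed by whitespace or the end of the text,
    # so dots inside tokens such as '1.0.0' are not treated as sentence ends.
    ends = [m.end() for m in re.finditer(r"[.!?](?=\s|$)", text)]
    return text[:ends[-1]].rstrip() if ends else text

summary = ("TensorFlow is available on 64-bit Linux, macOS, Windows, and mobile "
           "computing platforms including Android and iOS . its flexible architecture "
           "allows for the easy deployment of computation across a variety of "
           "platforms (CPUs, GPU")
trimmed = trim_to_last_sentence(summary)
print(trimmed)
```

Raising max_length is the more direct fix when a longer summary is acceptable.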


In [None]:
"""
See the Hugging Face documentation for more details.
"""