# Hugging Face

#### 추준호(20224224)

In [1]:
from transformers import pipeline

# Sentiment Analysis

In [2]:
classifier = pipeline('sentiment-analysis')

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
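The warning above recommends naming the model and revision explicitly. A minimal sketch of that, assuming the defaults printed in the warning are the ones wanted; `PINNED` and `build_classifier` are hypothetical names chosen for illustration, and the import is deferred so the settings can be inspected without loading a model.

```python
# Pin the checkpoint named in the warning instead of relying on the task default.
# PINNED and build_classifier are illustrative names, not part of transformers.
PINNED = {
    "task": "sentiment-analysis",
    "model": "distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "af0f99b",  # revision reported in the warning above
}


def build_classifier():
    """Create the pipeline from the pinned settings (downloads on first use)."""
    from transformers import pipeline  # deferred: only needed when building

    return pipeline(**PINNED)
```

Calling `build_classifier()` should behave like the cell above, but without the warning, since nothing is left to the defaults.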


In [3]:
text = [
    "Fly me to the moon, and let me play among the stars",
    "April is the cruellest month, breeding Lilacs out of the dead land."
]

In [4]:
classifier(text)

[{'label': 'POSITIVE', 'score': 0.9996751546859741},
 {'label': 'NEGATIVE', 'score': 0.9489612579345703}]
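Each result is a dict holding the predicted label and its score. A small sketch of how such results might be filtered by confidence; the `confident` helper and the 0.95 threshold are assumptions for illustration, and the data is copied (rounded) from the output above.

```python
# Results copied from the output above (scores rounded to four places).
results = [
    {"label": "POSITIVE", "score": 0.9997},
    {"label": "NEGATIVE", "score": 0.9490},
]


def confident(results, threshold=0.95):
    """Return (index, label) pairs whose score is at least `threshold`."""
    return [
        (i, r["label"])
        for i, r in enumerate(results)
        if r["score"] >= threshold
    ]


print(confident(results))        # only the first sentence passes 0.95
print(confident(results, 0.90))  # both sentences pass 0.90
```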

# Zero-shot classification

### Zero-shot vs. few-shot learning

In [5]:
classifier = pipeline('zero-shot-classification')

No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [6]:
text = [
    "German finance minister urges EU to rein in public spending",
    "China seeks more island security pacts to boost clout in Pacific"
]

In [7]:
classifier(
    text,
    candidate_labels=[
        'education', 'politics', 'business', 'economy', 'europe', 'asia'
    ]
)

[{'sequence': 'German finance minister urges EU to rein in public spending',
  'labels': ['europe', 'politics', 'economy', 'business', 'education', 'asia'],
  'scores': [0.4018935263156891,
   0.2552796006202698,
   0.2406008243560791,
   0.07709594815969467,
   0.016165118664503098,
   0.008965053595602512]},
 {'sequence': 'China seeks more island security pacts to boost clout in Pacific',
  'labels': ['politics', 'asia', 'business', 'economy', 'europe', 'education'],
  'scores': [0.5034072995185852,
   0.2767115533351898,
   0.14557377994060516,
   0.03344782441854477,
   0.020698733627796173,
   0.020160792395472527]}]
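In the output above, the labels come back sorted by descending score, and the six scores for each headline add up to roughly 1, which suggests the candidate labels compete with each other here. The sketch below checks that on the first result; `top_labels` is a hypothetical helper. The pipeline also accepts a `multi_label` argument for scoring each label independently, which is not used in this notebook.

```python
# Result for the first headline, copied from the output above (scores rounded).
result = {
    "sequence": "German finance minister urges EU to rein in public spending",
    "labels": ["europe", "politics", "economy", "business", "education", "asia"],
    "scores": [0.4019, 0.2553, 0.2406, 0.0771, 0.0162, 0.0090],
}


def top_labels(result, k=3):
    """Return the k highest-scoring (label, score) pairs."""
    pairs = zip(result["labels"], result["scores"])
    return sorted(pairs, key=lambda p: p[1], reverse=True)[:k]


print(top_labels(result, k=2))
print(round(sum(result["scores"]), 2))  # close to 1 in this single-label run
```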

# Text Generation

In [8]:
generator = pipeline('text-generation')

No model was supplied, defaulted to gpt2 and revision 6c0e608 (https://huggingface.co/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [9]:
# "The Myth of Sisyphus" by Albert Camus
text = "There is but one truly serious philosophical problem, and that is suicide. Judging"

In [10]:
generator(text, max_length=256)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "There is but one truly serious philosophical problem, and that is suicide. Judging the value of the man who commits suicide has never been tested scientifically. Yet they consider it a sure thing that the most successful human beings, regardless of their personal skill and genius, would always do so, and in so doing, to put themselves at risk. If that is the best way for a human being to lead a great life, how far will that lead to suicide? Sooner or later, a man's best chance of success is an unlikely one, because there is no guarantee that at any time that he will achieve this. If he does, there is nothing good to do. It is a matter of choice, but I suggest the best course. I can imagine no better course than death. The choice is always in the man's life, and that is why there is no guarantee that at any time that he will be able to achieve that. I do not think that there is even a chance for either of these things, particularly when it comes to the very young ma

# Mask filling

In [11]:
unmasker = pipeline('fill-mask')

No model was supplied, defaulted to distilroberta-base and revision ec58a5b (https://huggingface.co/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [12]:
# Billie Eilish
text = "So you're a <mask> guy, Like it really rough guy"

In [13]:
unmasker(text)

[{'score': 0.9274507164955139,
  'token': 6744,
  'token_str': ' rough',
  'sequence': "So you're a rough guy, Like it really rough guy"},
 {'score': 0.03052147477865219,
  'token': 1828,
  'token_str': ' tough',
  'sequence': "So you're a tough guy, Like it really rough guy"},
 {'score': 0.0017656903946772218,
  'token': 1099,
  'token_str': ' bad',
  'sequence': "So you're a bad guy, Like it really rough guy"},
 {'score': 0.0016831890679895878,
  'token': 15455,
  'token_str': ' nasty',
  'sequence': "So you're a nasty guy, Like it really rough guy"},
 {'score': 0.0015438697300851345,
  'token': 543,
  'token_str': ' hard',
  'sequence': "So you're a hard guy, Like it really rough guy"}]
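Note that every `token_str` above starts with a space: this tokenizer treats the preceding space as part of the token. A short sketch of tidying the candidates for display, using the first three entries from the output; `clean_candidates` is a hypothetical helper.

```python
# First three candidates copied from the output above (scores rounded).
candidates = [
    {"score": 0.9275, "token_str": " rough"},
    {"score": 0.0305, "token_str": " tough"},
    {"score": 0.0018, "token_str": " bad"},
]


def clean_candidates(candidates):
    """Strip the leading space from each token and pair it with its score."""
    return [(c["token_str"].strip(), c["score"]) for c in candidates]


for word, score in clean_candidates(candidates):
    print(f"{word:<6} {score:.4f}")
```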

# NER (Named Entity Recognition)

In [14]:
ner = pipeline('ner')

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [15]:
text = "Steven Paul Jobs (February 24, 1955 – October 5, 2011) was an American entrepreneur, inventor, business magnate, media proprietor, and investor. He was the co-founder, chairman, and CEO of Apple; the chairman and majority shareholder of Pixar; a member of The Walt Disney Company's board of directors following its acquisition of Pixar; and the founder, chairman, and CEO of NeXT. He is widely recognized as a pioneer of the personal computer revolution of the 1970s and 1980s, along with his early business partner and fellow Apple co-founder Steve Wozniak."

In [16]:
ner(text)

[{'entity': 'I-PER',
  'score': 0.99945754,
  'index': 1,
  'word': 'Steven',
  'start': 0,
  'end': 6},
 {'entity': 'I-PER',
  'score': 0.9994562,
  'index': 2,
  'word': 'Paul',
  'start': 7,
  'end': 11},
 {'entity': 'I-PER',
  'score': 0.999501,
  'index': 3,
  'word': 'Job',
  'start': 12,
  'end': 15},
 {'entity': 'I-PER',
  'score': 0.99722654,
  'index': 4,
  'word': '##s',
  'start': 15,
  'end': 16},
 {'entity': 'I-MISC',
  'score': 0.996759,
  'index': 18,
  'word': 'American',
  'start': 62,
  'end': 70},
 {'entity': 'I-ORG',
  'score': 0.9993734,
  'index': 45,
  'word': 'Apple',
  'start': 189,
  'end': 194},
 {'entity': 'I-ORG',
  'score': 0.9989691,
  'index': 53,
  'word': 'Pi',
  'start': 237,
  'end': 239},
 {'entity': 'I-ORG',
  'score': 0.9961754,
  'index': 54,
  'word': '##xa',
  'start': 239,
  'end': 241},
 {'entity': 'I-ORG',
  'score': 0.9993247,
  'index': 55,
  'word': '##r',
  'start': 241,
  'end': 242},
 {'entity': 'I-ORG',
  'score': 0.9991516,
  'index
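The output above is per token, so names are split into WordPiece fragments (`Job` + `##s`, `Pi` + `##xa` + `##r`). The sketch below shows one way to merge them back using the character offsets; `group_entities` is a hypothetical helper and the adjacency rule (a gap of at most one character) is an assumption, not the library's own logic. The pipeline also accepts an `aggregation_strategy` argument (for example `'simple'`) that groups tokens for you.

```python
# Token-level entities copied from the output above (only the fields used here).
tokens = [
    {"entity": "I-PER", "word": "Steven", "start": 0, "end": 6},
    {"entity": "I-PER", "word": "Paul", "start": 7, "end": 11},
    {"entity": "I-PER", "word": "Job", "start": 12, "end": 15},
    {"entity": "I-PER", "word": "##s", "start": 15, "end": 16},
    {"entity": "I-MISC", "word": "American", "start": 62, "end": 70},
    {"entity": "I-ORG", "word": "Apple", "start": 189, "end": 194},
    {"entity": "I-ORG", "word": "Pi", "start": 237, "end": 239},
    {"entity": "I-ORG", "word": "##xa", "start": 239, "end": 241},
    {"entity": "I-ORG", "word": "##r", "start": 241, "end": 242},
]


def group_entities(tokens):
    """Merge neighbouring tokens with the same tag into (tag, text) spans."""
    spans = []
    for tok in tokens:
        tag = tok["entity"].split("-", 1)[-1]  # drop the B-/I- prefix
        is_subword = tok["word"].startswith("##")
        piece = tok["word"][2:] if is_subword else tok["word"]
        last = spans[-1] if spans else None
        # Assumption: same tag and a gap of 0 (subword) or 1 (one space)
        # means the token continues the previous span.
        if last and last["tag"] == tag and tok["start"] - last["end"] <= 1:
            last["text"] += piece if is_subword else " " + piece
            last["end"] = tok["end"]
        else:
            spans.append({"tag": tag, "text": piece, "end": tok["end"]})
    return [(s["tag"], s["text"]) for s in spans]


print(group_entities(tokens))
```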

# Question answering

In [17]:
qa = pipeline('question-answering')

No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [18]:
qa(
    context=text,
    question='Which companies are founded by steve jobs?'
)

{'score': 0.2442311942577362, 'start': 189, 'end': 194, 'answer': 'Apple'}
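The `start` and `end` values are character offsets into the context, so the answer can be recovered by slicing. A minimal check, using only the opening of the passage (up to the answer) to keep the example short:

```python
# Opening of the context passage, copied from the text above up to the answer.
context = (
    "Steven Paul Jobs (February 24, 1955 – October 5, 2011) was an American "
    "entrepreneur, inventor, business magnate, media proprietor, and investor. "
    "He was the co-founder, chairman, and CEO of Apple;"
)

# Result copied from the output above (score rounded).
result = {"score": 0.2442, "start": 189, "end": 194, "answer": "Apple"}

# Slice the context with the returned offsets to recover the answer span.
span = context[result["start"]:result["end"]]
print(span)
```

The model returns one contiguous span, which is why only `Apple` comes back even though the question asks about several companies.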

# Summarization

In [19]:
summarizer = pipeline('summarization')

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [20]:
summarizer(text, max_length=64)

[{'summary_text': " Steven Paul Jobs was the co-founder, chairman, and CEO of Apple . He is widely recognized as a pioneer of the personal computer revolution of the 1970s and 1980s, along with his early business partner Steve Wozniak . He was a member of The Walt Disney Company's board of directors"}]

# Translation

In [21]:
translator = pipeline('translation', model='Helsinki-NLP/opus-mt-en-fr')

In [22]:
translator("Hello Jieun")

[{'translation_text': 'Bonjour Jieun'}]

In [23]:
translator(text)

[{'translation_text': "Steven Paul Jobs (24 février 1955 – 5 octobre 2011) était un entrepreneur américain, inventeur, magnat des affaires, propriétaire de médias et investisseur. Il était le cofondateur, président et directeur général d'Apple; le président et actionnaire majoritaire de Pixar; un membre du conseil d'administration de la Walt Disney Company suite à son acquisition de Pixar; et le fondateur, président et chef de la direction de NeXT. Il est largement reconnu comme un pionnier de la révolution informatique personnelle des années 1970 et 1980, ainsi que son premier associé et cofondateur d'Apple Steve Wozniak."}]

In [24]:
translator = pipeline('translation', model='Helsinki-NLP/opus-mt-ko-en')

In [25]:
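# English gloss: "Investors who lost money in the crash of the Korean cryptocurrencies Luna and TerraUSD (UST) have sued Kwon Do-hyung, CEO of the issuer Terraform Labs."
text = "한국산 가상화폐 루나와 테라USD(UST) 폭락으로 손실을 본 투자자들이 발행사 테라폼랩스의 권도형 최고경영자(CEO)를 고소했다."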

In [26]:
translator(text)

[{'translation_text': "After losing a Korean virtual currency, Luna turusD (UST), investors filed charges against CEO's high-powered top manager for the launch service terafos."}]

# Sentiment Analysis - Korean

In [27]:
classifier = pipeline('sentiment-analysis', model='snunlp/KR-FinBert-SC')

In [28]:
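text = [
    # "Investors who lost money in the crash of the Korean cryptocurrencies Luna and TerraUSD (UST) have sued Kwon Do-hyung, CEO of the issuer Terraform Labs."
    "한국산 가상화폐 루나와 테라USD(UST) 폭락으로 손실을 본 투자자들이 발행사 테라폼랩스의 권도형 최고경영자(CEO)를 고소했다.",
    # "Foreign investors net-sold 15 trillion won of domestic stocks this year; 5 trillion won of that was Samsung alone"
    "외국인, 올해 국내 주식 15조 원 순매도…삼성만 5조 원 팔았다",
    # "Yoon moves to normalize the nuclear phase-out policy: 'Korea and the US will work to boost nuclear plant exports'"
    "尹, 탈원전 정상화 추진 “원전 수출 증진 위해 韓美 노력”",
]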

In [29]:
classifier(text)

[{'label': 'negative', 'score': 0.9798452258110046},
 {'label': 'negative', 'score': 0.9699411988258362},
 {'label': 'positive', 'score': 0.995445728302002}]
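The two sentiment models in this notebook spell their labels differently (`POSITIVE` versus `positive`). If results from both were to be compared, one option is to normalize them to a signed score, as sketched below; `signed_score` is a hypothetical helper, and treating any other label as 0 is an assumption.

```python
# One result from each sentiment model used in this notebook (scores rounded).
english = {"label": "POSITIVE", "score": 0.9997}  # default English model
korean = {"label": "negative", "score": 0.9798}   # snunlp/KR-FinBert-SC


def signed_score(result):
    """Map a result to [-1, 1]: positive -> +score, negative -> -score."""
    label = result["label"].lower()  # ignore differences in capitalization
    if label == "positive":
        return result["score"]
    if label == "negative":
        return -result["score"]
    return 0.0  # assumption: any other label carries no polarity


print(signed_score(english), signed_score(korean))
```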