# Natural Language Processing in Marketing
Oded Netzer (Columbia University) & Jochen Hartmann (University of Groningen)

## *Embeddings and vector semantics*

In [None]:
import gensim.downloader

### word2vec

In [None]:
# Download (on first use) and load the pretrained 300-dimensional Google News word2vec vectors
w2v = gensim.downloader.load('word2vec-google-news-300')

In [None]:
w2v.most_similar('sofa')

[('couch', 0.8309178352355957),
 ('settee', 0.7764685750007629),
 ('sofas', 0.7543261051177979),
 ('loveseat', 0.7152645587921143),
 ('recliner', 0.7101271152496338),
 ('futon', 0.6624690294265747),
 ('leather_sofa', 0.6620596647262573),
 ('plush_sofa', 0.6556485295295715),
 ('ottoman', 0.6525834798812866),
 ('couches', 0.6501914262771606)]

In [None]:
w2v.most_similar('couch')

[('sofa', 0.8309179544448853),
 ('recliner', 0.7366936802864075),
 ('couches', 0.7016552090644836),
 ('comfy_couch', 0.6747691035270691),
 ('futon', 0.6523739695549011),
 ('al_Jabouri_slept', 0.6240309476852417),
 ('loveseat', 0.617920994758606),
 ('beanbag_chair', 0.616889476776123),
 ('recliner_chair', 0.6121512055397034),
 ('settee', 0.6086535453796387)]

In [None]:
w2v.similarity('couch', 'sofa')

0.8309179
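
The score returned by `similarity` is the cosine similarity between the two word vectors. The following is a minimal sketch of that computation using made-up three-dimensional toy vectors, not real word2vec embeddings; the helper name `cosine_similarity` and the toy vectors are assumptions for illustration only.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy vectors, invented for illustration (real embeddings have 300 dimensions)
toy_couch = [1.0, 2.0, 0.5]
toy_sofa = [2.0, 4.0, 1.0]   # same direction as toy_couch, twice the length
toy_tax = [-2.0, 1.0, 0.0]   # orthogonal to toy_couch

sim_same_direction = cosine_similarity(toy_couch, toy_sofa)
sim_orthogonal = cosine_similarity(toy_couch, toy_tax)
print(sim_same_direction, sim_orthogonal)
```

Vector length does not affect the score, only direction does. With the model loaded, `cosine_similarity(w2v['couch'], w2v['sofa'])` should agree with `w2v.similarity('couch', 'sofa')` up to floating-point rounding.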

### GloVe

In [None]:
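# Download (on first use) and load the pretrained 300-dimensional GloVe vectors (Wikipedia + Gigaword)
glove = gensim.downloader.load('glove-wiki-gigaword-300')
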
In [None]:
glove.most_similar('sofa')

[('sofas', 0.6412794589996338),
 ('couch', 0.6295238733291626),
 ('couches', 0.5816987752914429),
 ('cushions', 0.5553663969039917),
 ('upholstered', 0.5553508996963501),
 ('comfy', 0.5491216778755188),
 ('armchairs', 0.5384072065353394),
 ('pillows', 0.514901876449585),
 ('recliner', 0.5056697130203247),
 ('overstuffed', 0.49476155638694763)]

In [None]:
glove.most_similar('couch')

[('sofa', 0.6295238733291626),
 ('couches', 0.5716592669487),
 ('comfy', 0.5274707674980164),
 ('sitting', 0.5219179391860962),
 ('lounging', 0.49838781356811523),
 ('cushions', 0.48835569620132446),
 ('armchair', 0.4837192893028259),
 ('bed', 0.4826846718788147),
 ('recliner', 0.4810296297073364),
 ('asleep', 0.4713967442512512)]

In [None]:
glove.similarity('couch', 'sofa')

0.62952393

### Sentence-BERT

In [None]:
!pip install -U sentence-transformers
from sentence_transformers import SentenceTransformer, util

# Load a small pretrained sentence-embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')

In [None]:
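# Each input is wrapped in a list, so encode() returns one embedding row per sentence
sentence1 = ['The couch is awesome!']
sentence2 = ['What a nice sofa.']

# Encode both sentences as PyTorch tensors
embedding1 = model.encode(sentence1, convert_to_tensor=True)
embedding2 = model.encode(sentence2, convert_to_tensor=True)

# Cosine similarity between the two sentence embeddings (a 1x1 tensor)
util.cos_sim(embedding1, embedding2)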

tensor([[0.6372]], device='cuda:0')

### VADER

In [None]:
!pip install vaderSentiment
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

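In [None]:
sentences = ['really cool, love it',
             'To be or not to be',
             'This is terrible!!']

In [None]:
analyzer = SentimentIntensityAnalyzer()
for sentence in sentences:
    # polarity_scores returns the neg/neu/pos proportions and a normalized compound score
    vs = analyzer.polarity_scores(sentence)
    # Left-align the sentence and pad it with dashes to 65 characters
    print("{:-<65} {}".format(sentence, str(vs)))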

really cool, love it--------------------------------------------- {'neg': 0.0, 'neu': 0.22, 'pos': 0.78, 'compound': 0.7947}
To be or not to be----------------------------------------------- {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
This is terrible!!----------------------------------------------- {'neg': 0.648, 'neu': 0.352, 'pos': 0.0, 'compound': -0.5696}
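
The `compound` value is a single normalized score between -1 and 1. The vaderSentiment README suggests treating scores of 0.05 or above as positive, -0.05 or below as negative, and anything in between as neutral. A minimal sketch of that rule, applied to the compound scores printed above; the helper name `label_from_compound` is hypothetical and not part of the library.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score to a coarse sentiment label.

    The 0.05 threshold follows the convention suggested in the
    vaderSentiment README; adjust it for your own data.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# Compound scores copied from the output above
compound_scores = {
    'really cool, love it': 0.7947,
    'To be or not to be': 0.0,
    'This is terrible!!': -0.5696,
}

labels = {text: label_from_compound(score) for text, score in compound_scores.items()}
for text, label in labels.items():
    print("{:-<30} {}".format(text, label))
```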


### Zero-shot classification

In [None]:
!pip install transformers

In [None]:
from transformers import pipeline
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

In [None]:
sequence1 = "What a great product"
sequence2 = "What a shitty product"
candidate_labels = ['positive', 'negative']

In [None]:
classifier(sequence1, candidate_labels)

{'labels': ['positive', 'negative'],
 'scores': [0.9976462721824646, 0.0023537033703178167],
 'sequence': 'What a great product'}

In [None]:
classifier(sequence2, candidate_labels)

{'labels': ['negative', 'positive'],
 'scores': [0.9989504218101501, 0.0010495946044102311],
 'sequence': 'What a shitty product'}

In [None]:
# multi_label=True scores each label independently, so the scores need not sum to 1
classifier("We need to sell more stuff", ['marketing', 'sales', 'computer science'], multi_label=True)

{'labels': ['sales', 'marketing', 'computer science'],
 'scores': [0.9904853105545044, 0.9404631853103638, 0.012987233698368073],
 'sequence': 'We need to sell more stuff'}
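
Note that the `multi_label=True` scores above add up to more than 1, whereas the two earlier outputs each sum to 1. As far as I understand the Hugging Face pipeline, the single-label mode applies a softmax over the entailment logits of all candidate labels, while the multi-label mode applies a softmax over contradiction vs. entailment separately for each label. The sketch below illustrates the difference with hypothetical logits; the numbers are invented and are not the model's actual outputs.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical (contradiction, entailment) logits per candidate label
logits = {
    'sales': (-2.0, 3.0),
    'marketing': (-1.0, 2.0),
    'computer science': (2.5, -2.0),
}

# Single-label: labels compete, so the scores sum to 1
entailment_logits = [pair[1] for pair in logits.values()]
single_label = dict(zip(logits, softmax(entailment_logits)))

# Multi-label: each label is scored on its own (entailment vs. contradiction)
multi_label = {label: softmax(pair)[1] for label, pair in logits.items()}

print(single_label)
print(multi_label)
```

Use the multi-label mode when several labels can apply to the same text, as with 'sales' and 'marketing' in the example above.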

### Named entity recognition

**Option 1:** flair library

In [None]:
!pip install flair

In [None]:
from flair.data import Sentence
from flair.models import SequenceTagger

In [None]:
# Load the pretrained English NER tagger
tagger = SequenceTagger.load("flair/ner-english")
sentence = Sentence("George Washington went to Washington")
# predict() annotates the sentence in place
tagger.predict(sentence)

In [None]:
for entity in sentence.get_spans('ner'):
    print(entity)

Span [1,2]: "George Washington"   [− Labels: PER (0.9985)]
Span [5]: "Washington"   [− Labels: LOC (0.9706)]


**Option 2:** transformers library

In [None]:
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
import pandas as pd

# Load the tokenizer and the fine-tuned BERT token-classification model
tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")

In [None]:
nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "George Washington went to Washington"

# One dict per tagged token; show the results as a table
ner_results = nlp(example)
pd.DataFrame(ner_results)

  entity     score  index        word  start  end
0  B-PER  0.998344      1      George      0    6
1  I-PER  0.991562      2  Washington      7   17
2  B-LOC  0.999099      5  Washington     26   36
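
The table lists one row per token, using `B-` for the first token of an entity and `I-` for its continuation. To get whole entities comparable to the flair output, the tokens can be merged into spans. A minimal sketch, assuming the token rows shown above (copied here as plain dicts) and a hypothetical helper `group_entities`; the transformers pipeline also offers an `aggregation_strategy` argument (for example `"simple"`) that performs a similar grouping.

```python
# Token-level results copied from the table above (score and index omitted)
ner_results = [
    {"entity": "B-PER", "word": "George", "start": 0, "end": 6},
    {"entity": "I-PER", "word": "Washington", "start": 7, "end": 17},
    {"entity": "B-LOC", "word": "Washington", "start": 26, "end": 36},
]
example = "George Washington went to Washington"

def group_entities(tokens, text):
    """Merge B-/I- tagged tokens into (entity text, label) pairs."""
    spans = []
    for tok in tokens:
        if "-" not in tok["entity"]:
            continue  # skip tags without a B-/I- prefix, such as "O"
        prefix, label = tok["entity"].split("-", 1)
        if prefix == "I" and spans and spans[-1]["label"] == label:
            # Continuation token: extend the current span
            spans[-1]["end"] = tok["end"]
        else:
            # B- token (or a stray I- token): start a new span
            spans.append({"label": label, "start": tok["start"], "end": tok["end"]})
    # Recover the surface text from the character offsets
    return [(text[s["start"]:s["end"]], s["label"]) for s in spans]

entities = group_entities(ner_results, example)
print(entities)
```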


In [None]:
# Related notebooks from the nlp-with-transformers repository:
# https://colab.research.google.com/github/nlp-with-transformers/notebooks/blob/main/01_introduction.ipynb#scrollTo=mwJyTlMWdSys
# https://colab.research.google.com/github/nlp-with-transformers/notebooks/blob/main/11_future-directions.ipynb#scrollTo=wicked-flight

**SOURCES:** 
*   https://radimrehurek.com/gensim/models/word2vec.html
*   https://github.com/RaRe-Technologies/gensim-data
*   https://web.stanford.edu/~jurafsky/slp3/
*   https://www.sbert.net/docs/usage/semantic_textual_similarity.html
*   https://huggingface.co/facebook/bart-large-mnli
*   https://huggingface.co/flair/ner-english
*   https://huggingface.co/dslim/bert-base-NER
*   https://github.com/cjhutto/vaderSentiment



### The End