# PyLadies Workshop

The Watson Natural Language Processing library provides basic natural language processing functions for syntax analysis and out-of-the-box pre-trained models. The following natural language processing tasks are available as blocks in the Watson Natural Language Processing library:
- Syntax analysis
- Noun phrase extraction
- Keyword extraction and ranking
- Entity extraction
- Sentiment classification
- Tone classification
- Emotion classification

Many of the pre-trained models are available in several languages. In this workshop, we will perform some of the tasks listed above in Swedish; where Swedish is not supported, we will use English to demonstrate the functionality. You can test other languages by changing 'sv' or 'en' to another language code wherever it appears in a model name.
Find a list of available languages and language codes here: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/watson-nlp-block-catalog.html?audience=wdp#lang-codes
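
If you plan to try several languages, editing model names by hand gets repetitive. The sketch below shows one way to swap the language segment of a model name. `swap_language` is a hypothetical helper written for this workshop, not part of Watson NLP, and it assumes the language code is the only two-letter lowercase segment between underscores. It only builds the name; whether a model with that name exists still depends on the language list linked above.

```python
def swap_language(model_name: str, new_code: str) -> str:
    """Return model_name with its two-letter language segment replaced."""
    parts = model_name.split('_')
    # Assumption: the language code is the only two-letter lowercase segment.
    hits = [i for i, part in enumerate(parts)
            if len(part) == 2 and part.isalpha() and part.islower()]
    if len(hits) != 1:
        raise ValueError(f'expected exactly one language code in {model_name!r}')
    parts[hits[0]] = new_code
    return '_'.join(parts)


print(swap_language('syntax_izumo_sv_stock', 'en'))
print(swap_language('ensemble_classification-workflow_en_emotion-stock', 'fr'))
```

Multilingual model names (the ones containing `multi`) have no language segment, so the helper raises a `ValueError` for them instead of guessing.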

## Import Watson NLP library and create some text in Swedish

The first step is to import Watson NLP, which is already available to us here in IBM Watson Studio Notebooks.
(If you are getting errors, double-check that you selected 'DO + NLP Runtime 22.1 on Python 3.9' when creating the notebook.)

In [None]:
import watson_nlp

To keep things light, yet emotional, we have provided sample text on the topic 'Salta pinnar/pretzel sticks', but we encourage you to play around with your own text to see what results you get :) The sample text below translates to 'I only have bad experiences and memories of pretzel sticks. The salt itself is the only good part. The rest is dry, boring, and depressing.'

In [None]:
sv_text = 'Jag har bara dåliga erfarenheter och minnen av salta pinnar. \
Själva saltkornen är det enda som är gott. Resten är torrt, tråkigt och deprimerande.'

## Syntax Analysis

The Watson Natural Language Processing Syntax block encapsulates syntax analysis functionality. You can use this block to perform tasks like sentence detection, tokenization, part-of-speech tagging, lemmatization, and dependency parsing in different languages.

Let's now run the syntax analysis on the Swedish text provided above.

The lemma is the root form of a word. For example, the lemma of 'running' is 'run' and the lemma of 'cats' is 'cat'. Part of speech is that thing we learned in school but may have forgotten about by now; examples include 'noun' and 'adjective'.

In [None]:
# Load Syntax for Swedish
syntax_model = watson_nlp.load(watson_nlp.download('syntax_izumo_sv_stock'))

# Detect tokens, lemma and part-of-speech
syntax_prediction = syntax_model.run(sv_text, parsers=('token', 'lemma', 'part_of_speech'))

# Print the syntax result
print(syntax_prediction)
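
To work with individual tokens rather than one printed blob, it can help to flatten the result into rows. The sketch below runs on a hand-written dictionary, not on real model output. The layout (a `tokens` list whose entries hold `span`, `lemma`, and `part_of_speech`) and the sample values are assumptions made for this example, and `token_rows` is a hypothetical helper. Compare the layout with what `print(syntax_prediction)` shows in your notebook before relying on it.

```python
# Hand-written stand-in for a syntax result; layout and values are assumed.
sample_syntax = {
    'tokens': [
        {'span': {'begin': 0, 'end': 5, 'text': 'salta'},
         'lemma': 'salt', 'part_of_speech': 'POS_ADJ'},
        {'span': {'begin': 6, 'end': 12, 'text': 'pinnar'},
         'lemma': 'pinne', 'part_of_speech': 'POS_NOUN'},
    ]
}


def token_rows(syntax_dict):
    """Return one (text, lemma, part_of_speech) tuple per token."""
    rows = []
    for token in syntax_dict.get('tokens', []):
        text = token.get('span', {}).get('text', '')
        rows.append((text, token.get('lemma', ''), token.get('part_of_speech', '')))
    return rows


for text, lemma, pos in token_rows(sample_syntax):
    print(f'{text:10} {lemma:10} {pos}')
```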

## Sentiment Analysis

In this exercise, we will explore the sentiment models that are available out-of-the-box for Swedish. Start by predicting the sentiment on a sentence level, and then the aggregated sentiment for the full text. In Watson NLP, sentiment can be either 'positive', 'negative', or 'neutral'.

In [None]:
import watson_nlp
from watson_nlp.toolkit import predict_document_sentiment

Next, we load the sentiment model. This model is multilingual, so no language code needs to be assigned.

In [None]:
sentiment_model = watson_nlp.load(watson_nlp.download('sentiment_sentence-bert_multi_stock'))

We are now ready to run the sentiment model on the results from the syntax model from the previous section.

For each sentence you can see a score. A negative score means that the model predicts that the sentiment is negative.
A positive score means that the model predicts that the sentiment is positive. A score of 0 means neutral.

You can also see the probabilities that the sentiment is positive/negative/neutral for each sentence.

In [None]:
sentiment_result = sentiment_model.run_batch(syntax_prediction.get_sentence_texts(), syntax_prediction.sentences)
print(sentiment_result)
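
The sign convention described above can be written as a small helper, which is handy if you want to sanity-check the labels in the printed result. `label_from_score` is a hypothetical function for illustration only, not part of Watson NLP, and the scores in the loop are made up.

```python
def label_from_score(score: float) -> str:
    """Map a signed sentiment score to a label using the sign convention above."""
    if score > 0:
        return 'positive'
    if score < 0:
        return 'negative'
    return 'neutral'


# Made-up scores, used only to exercise the helper.
for score in (-0.87, 0.0, 0.42):
    print(score, label_from_score(score))
```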

Our last step of this exercise is to use the sentiment_result from the previous step to predict the aggregated sentiment for the entire text.

In [None]:
document_sentiment = predict_document_sentiment(sentiment_result, sentiment_model.class_idxs)
print(document_sentiment)

## Targeted Sentiment Analysis

In this exercise we will try out targeted sentiment analysis. Targeted sentiment analysis is an innovation from IBM Research that tells us not only the sentiment but also what the sentiment is about.

(Note that this model has been fine-tuned using English data, so results may vary in other languages.)

We start by loading the targets sentiment model.

In [None]:
targets_sentiment_model = watson_nlp.load(watson_nlp.download('targets-sentiment_sequence-bert_multi_stock'))

Next, we define the text that we want to use. The sample text is the same as in the previous exercise.

Don't be afraid to play around with your own text here.

In [None]:
sv_text = 'Jag har bara dåliga erfarenheter och minnen av salta pinnar. \
Själva saltkornen är det enda som är gott. Resten är torrt, tråkigt och deprimerande.'

Our next step is to run the syntax model (as defined in the previous exercise) on the text.

In [None]:
syntax_prediction = syntax_model.run(sv_text, parsers=('token', 'lemma', 'part_of_speech'))
# print(syntax_prediction)

Now we are ready to run the targets sentiment model on the syntax results.

In [None]:
targets_sentiment = targets_sentiment_model.run(syntax_prediction)
print(targets_sentiment)

## Emotion Analysis

The Emotion classification model is a pre-trained document classification model. It identifies the emotion of the input document and classifies it as:

- Anger
- Disgust
- Fear
- Joy
- Sadness

Currently, emotion analysis is only available for English and French.

In [None]:
# Let's first write some emotional text in English.
en_text = 'The UK is a joke. The British government is so embarrassing. Imagine being so bad at your job. It’s just so embarrassing.'

In [None]:
import watson_nlp


# Load the Emotion workflow model for English
emotion_model = watson_nlp.load(watson_nlp.download('ensemble_classification-workflow_en_emotion-stock'))

# Run the Emotion model
emotion_result = emotion_model.run(en_text)
print(emotion_result)
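
If you only care about the strongest emotion, you can pick the class with the highest confidence. The sketch below runs on a hand-written dictionary. The layout (a `classes` list of `class_name` and `confidence` pairs) and the numbers are assumptions for illustration, not output captured from the model, and `top_emotion` is a hypothetical helper. Check the printed `emotion_result` in your notebook to see the actual structure.

```python
# Hand-written stand-in for an emotion result; layout and values are assumed.
sample_emotion = {
    'classes': [
        {'class_name': 'anger', 'confidence': 0.6},
        {'class_name': 'disgust', 'confidence': 0.2},
        {'class_name': 'sadness', 'confidence': 0.1},
        {'class_name': 'fear', 'confidence': 0.05},
        {'class_name': 'joy', 'confidence': 0.05},
    ]
}


def top_emotion(result_dict):
    """Return (class_name, confidence) for the most confident class, or None."""
    classes = result_dict.get('classes', [])
    if not classes:
        return None
    best = max(classes, key=lambda c: c['confidence'])
    return best['class_name'], best['confidence']


print(top_emotion(sample_emotion))
```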

## Want to learn more about Watson NLP? Here are some resources that you can check out.

- Introduction to Watson NLP in Watson Studio: https://medium.com/ibm-data-ai/watson-natural-language-processing-now-generally-available-in-ibm-watson-studio-notebooks-8bc9d464f33d
- IBM's official documentation for Watson NLP, where you will find all available blocks and the languages supported for each: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/watson-nlp.html?context=cpdaas&audience=wdp
- Sample Watson NLP project from Gallery in Cloud Pak for Data: https://eu-de.dataplatform.cloud.ibm.com/exchange/public/entry/view/636001e59902133a4a23fd89f010e4cb?context=cpdaas