# Sentiment Analysis 
#### Telling happy and unhappy people apart automatically

![](img/sentiment_analysis.jpg)

![](img/SentiWordNet.png)

### Step 1: Create a type to represent the row

In [None]:
//Creating a type to represent each row in the SentiWordNet list 
type SentiWordNetEntry = {
                            POS : string;
                            ID  : string;
                            PositiveScore:string;
                            NegativeScore:string;
                            Words:string
                         }

### Step 2: Creating the <font color='red'>_in-memory_</font> representation of the SentiWordNet

In this step we load every word and its polarity scores into an in-memory collection.

In [None]:
open System.IO

In [None]:
let sentiWordList = System.IO.File.ReadAllLines(@"C:\FSX_TALK\SentimentAnalysis\SentiWordNet_3.0.0.txt") 
                       |> Array.filter(fun line -> not (line.StartsWith "#"))
                       |> Array.map (fun line -> line.Split '\t')
                       |> Array.filter (fun tokens -> tokens.Length >= 5)
                       |> Array.map (fun lineTokens -> {POS = lineTokens.[0];
                                                       ID = lineTokens.[1];
                                                       PositiveScore = lineTokens.[2].Trim();
                                                        NegativeScore = lineTokens.[3].Trim();
                                                       Words  = lineTokens.[4]})
                       |> Array.map (fun item -> [item.Words.Substring(0,item.Words.LastIndexOf('#') + 1);
                                                  item.PositiveScore;item.NegativeScore])                                
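
To make the projection above concrete, here is a minimal sketch that runs the same steps on a single row. The `sampleLine` value is invented for illustration; it only follows the tab-separated layout that the loader assumes (POS, ID, positive score, negative score, words, gloss).

```fsharp
// Invented row in the tab-separated layout the loader expects
let sampleLine = "a\t00001740\t0.125\t0\table#1\tillustrative gloss text"

let tokens = sampleLine.Split '\t'

// Mirror the projection used above: keep the word field up to its last '#'
let words = tokens.[4]
let entry =
    [ words.Substring(0, words.LastIndexOf('#') + 1)
      tokens.[2].Trim()
      tokens.[3].Trim() ]

printfn "%A" entry   // ["able#"; "0.125"; "0"]
```

Each element of `sentiWordList` has this three-item shape: the word field, the positive score, and the negative score, all still stored as strings.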

In [None]:
sentiWordList |> Array.take 5

In [None]:
sentiWordList |> Array.length

### Step 3: Create a function to get the <font color="red">pol</font><font color="green">arity</font> of a given word

In [None]:
let getPolarity (sentiWordNetList : string list[]) (word : string) =
    // Entries whose word field contains the query (substring match)
    let matchedItem = sentiWordNetList |> Array.filter (fun item -> item.[0].Contains word)
    match matchedItem.Length with
    | 0 -> (0., 0.) // No value found
    | _ -> (float matchedItem.[0].[1], float matchedItem.[0].[2])

The following cells show how to get the polarity of individual words.

In [None]:
getPolarity sentiWordList "good"

In [None]:
getPolarity sentiWordList "bad"

In [None]:
getPolarity sentiWordList "ugly"

In [None]:
getPolarity sentiWordList "fantastic"
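
Because the lookup uses `Contains`, the first entry whose word field contains the query wins, even when that entry is a longer word. The sketch below illustrates this on a hypothetical three-entry lexicon; `toyLexicon` and `lookup` are names made up for this example, not part of SentiWordNet.

```fsharp
// Hypothetical lexicon in the same [words; positive; negative] shape as sentiWordList
let toyLexicon : string list [] =
    [| ["goodness#"; "0.5";  "0"]
       ["good#";     "0.75"; "0"]
       ["bad#";      "0";    "0.625"] |]

// Same idea as getPolarity: take the first entry whose word field contains the query
let lookup (lexicon : string list []) (word : string) =
    match lexicon |> Array.tryFind (fun item -> item.[0].Contains word) with
    | Some item -> (float item.[1], float item.[2])
    | None -> (0., 0.)

printfn "%A" (lookup toyLexicon "bad")     // (0.0, 0.625)
printfn "%A" (lookup toyLexicon "good")    // (0.5, 0.0): "goodness#" comes first and contains "good"
printfn "%A" (lookup toyLexicon "camera")  // (0.0, 0.0)
```

If exact matches are wanted, one option is to search for `word + "#"` at the start of each space-separated token instead of a plain substring.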

### Step 4: Create a function to find the polarity of a sentence

In [None]:
let getPolarityScore (sentence:string) (sentiWordNetList:string list[]) =
    let words = sentence.Split ' '
    let polarities = words |> Array.map (fun word -> getPolarity sentiWordNetList word)
    
    let totalPositivity =  polarities |> Array.map fst |> Array.sum 
    let totalNegativity =  polarities |> Array.map snd |> Array.sum 
    
    printfn "Positive polarity of this sentence is %f " totalPositivity
    printfn "Negative polarity of this sentence is %f " totalNegativity
    
    if totalPositivity > totalNegativity then 1
    elif totalNegativity = totalPositivity then 0
    else -1
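
The aggregation can be checked by hand on a small case. The sketch below swaps the SentiWordNet lookup for a hypothetical `toyScores` map so the totals are easy to verify; the scores in it are made up.

```fsharp
// Hypothetical per-word (positive, negative) scores standing in for the SentiWordNet lookup
let toyScores =
    Map.ofList [ "love", (0.625, 0.0); "better", (0.5, 0.0); "bummer", (0.0, 0.75) ]

let scoreOf (word : string) =
    defaultArg (Map.tryFind word toyScores) (0.0, 0.0)

// Sum both polarities over the words, then compare the totals
let classify (sentence : string) =
    let polarities = sentence.Split ' ' |> Array.map scoreOf
    let positive = polarities |> Array.sumBy fst
    let negative = polarities |> Array.sumBy snd
    let label = if positive > negative then 1 elif positive = negative then 0 else -1
    (positive, negative, label)

printfn "%A" (classify "I love this product")   // (0.625, 0.0, 1)
printfn "%A" (classify "it gave me a bummer")   // (0.0, 0.75, -1)
printfn "%A" (classify "what a camera")         // (0.0, 0.0, 0)
```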

In [None]:
getPolarityScore "I love this product I thought the camera will be much better though" sentiWordList

In [None]:
getPolarityScore "don't buy this drug . it gave me a bummer" sentiWordList

In [None]:
getPolarityScore "what an awesome service" sentiWordList

In [None]:
getPolarity sentiWordList "though"

In [None]:
let showParts (sentence:string) =
    let words = sentence.Split ' '
    let pairs = words |> Array.map (fun w -> (w, getPolarity sentiWordList w))
    pairs

### Some samples from the real world 

In [None]:
getPolarityScore "just booked my flight to london very excited to be able to be at the #fsharpx!" sentiWordList

In [None]:
showParts "just booked my flight to london very excited to be able to be at the #fsharpx!"

In [None]:
showParts "Yeah, it is not something usual (yet). I had to do a lot of \"marketing\" to convince customers/colleagues to give F# a shot. My April talk on #FsharpX will be about how to possibly get there to have F# as daily job. :)"

In [None]:
getPolarityScore "Yeah, it is not something usual (yet). I had to do a lot of \"marketing\" to convince customers/colleagues to give F# a shot. My April talk on #FsharpX will be about how to possibly get there to have F# as daily job. :)" sentiWordList

In [None]:
getPolarityScore "Worst network now a day of #airtel in delhi  I don't know what to do ?" sentiWordList

In [None]:
getPolarityScore "your customer service is the worst. I have been trying to contact your customer service since the morning and every time I select the correct option it says you have failed a wrong number. You have the worst customer service." sentiWordList

In [None]:
getPolarityScore "good afternoon everyone ! It's MAGA Thursday and the march madness in the media , Democrat Party & swamp establishment continues! That means we're winning bigly" sentiWordList

In [None]:
getPolarityScore "On a scale 1-10 how much pain is a belly button piercing" sentiWordList
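
Splitting on spaces alone leaves punctuation attached to tokens such as `worst.` or `Yeah,` in the samples above, and those tokens are unlikely to match any lexicon entry. A possible normalisation step is sketched below; `tokenize` and its separator list are assumptions made for this example, not something the cells above use.

```fsharp
open System

// Lowercase the sentence and split on whitespace and common punctuation
let tokenize (sentence : string) =
    let separators = [| ' '; '.'; ','; '!'; '?'; '('; ')'; '"'; ':'; ';' |]
    sentence.ToLowerInvariant().Split(separators, StringSplitOptions.RemoveEmptyEntries)

printfn "%A" (tokenize "Yeah, it is not something usual (yet).")
// [|"yeah"; "it"; "is"; "not"; "something"; "usual"; "yet"|]
```

Feeding these tokens to the word lookup, instead of the raw output of `sentence.Split ' '`, would let words followed by punctuation contribute to the score.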

# Handling "Negations" 

![](https://www.mncatholic.org/wp-content/uploads/2016/05/Doubtful_Man_Shrugging_Shoulders1.jpg)

> ### "camera was not good"
This one echoes a <b><font color="red">_negative_</font></b> sentiment.
> ### "camera was not bad"
This one echoes an <b><font color="green">_OK-ish_</font></b> (almost positive) sentiment.

### A helper function to work around the string-valued scores

In [None]:
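let isPositive (x:string) (y : string) =
    // Scores are stored as strings; a value that does not parse as a number counts as 0.0
    let parse (s:string) =
        match System.Double.TryParse(s, System.Globalization.NumberStyles.Float, System.Globalization.CultureInfo.InvariantCulture) with
        | true, v -> v
        | _ -> 0.0
    parse x - parse y > 0.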


In [None]:
isPositive

In [None]:
let allPositiveWords (sentiWordNetList : string list[]) = 
    sentiWordNetList 
        |> Array.filter(fun t -> isPositive t.[1] t.[2])
        |> Array.map (fun t -> t.[0])

In [None]:
let allNegativeWords (sentiWordNetList : string list[]) = 
    sentiWordNetList 
        // Swapping the arguments keeps only strictly negative entries and leaves out neutral ones
        |> Array.filter(fun t -> isPositive t.[2] t.[1])
        |> Array.map (fun t -> t.[0])

In [None]:
sentiWordList.[0].[1]

In [None]:
sentiWordList |> Array.take 15

In [None]:
allPositiveWords sentiWordList |> Array.take 10

In [None]:
allNegativeWords sentiWordList |> Array.take 19

In [None]:
let delims = [|'#';' '|]
let pos = allPositiveWords sentiWordList 
              |> Array.map (fun t -> t.Split delims 
                                     |> Array.filter (fun z -> 
                                         System.Text.RegularExpressions.Regex.Match(z,"[a-zA-Z]+").Success))

In [None]:
let neg = allNegativeWords sentiWordList 
              |> Array.map (fun t -> t.Split delims 
                                     |> Array.filter (fun z -> 
                                         System.Text.RegularExpressions.Regex.Match(z,"[a-zA-Z]+").Success))

In [None]:
neg |> Array.take 13

In [None]:
let mutable posList = [""]
let mutable negList = [""]

In [None]:
printfn "%d" (pos |> Array.length)
printfn "%d" (neg |> Array.length)

let posList = pos |> Array.concat
let negList = neg |> Array.concat
posList

In [None]:
let getPolarity2 (sentiWordNetList : string list[]) (word : string) =
    // Entries whose word field contains the query (substring match)
    let matchedItem = sentiWordNetList |> Array.filter (fun item -> item.[0].Contains word)
    match matchedItem.Length with
    | 0 -> if word = "Negative_detected" then (0.0, 0.675)  // marker for negation + positive word
           elif word = "Ok_detected" then (0.125, 0.0)      // marker for negation + negative word
           else (0.0, 0.0) // No value found
    | _ -> (float matchedItem.[0].[1], float matchedItem.[0].[2])

In [None]:
let negations = ["no";"not";"never";"seldom";"neither";"nor"]
// Negation followed by a positive word, e.g. "camera was not good"
let badCombos = negations |> Seq.collect(fun x -> posList |> Seq.map (fun y -> x + " " + y))
// Negation followed by a negative word, e.g. "camera was not bad"
let okCombos =  negations |> Seq.collect(fun x -> negList |> Seq.map (fun y -> x + " " + y))

In [None]:
badCombos |> Seq.toList |> Seq.contains "never ugly"

In [None]:
badCombos |> Seq.toList

In [None]:
let preprocess (sentence : string) (badCombinations:string seq) (okCombinations : string seq) = 
    let mutable sen = sentence
    
    for badWordCombo in badCombinations do 
        sen <- sen.Replace(badWordCombo , "Negative_detected")
    for okWordCombo in okCombinations do 
        sen <- sen.Replace(okWordCombo , "Ok_detected")
    
    sen
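
The whole negation step can be followed end to end on small inputs. The sketch below uses tiny hypothetical word lists in place of the SentiWordNet-derived ones; `combine`, `mark`, and the `toy...` names exist only in this example.

```fsharp
// Tiny hypothetical word lists; the notebook derives the real ones from SentiWordNet
let negationWords = ["not"; "never"]
let positiveWords = ["good"; "great"]
let negativeWords = ["bad"; "ugly"]

// Cross product: every negation followed by every sentiment word
let combine (sentimentWords : string list) =
    negationWords
    |> List.collect (fun n -> sentimentWords |> List.map (fun w -> n + " " + w))

let toyBadCombos = combine positiveWords   // "not good", "not great", "never good", ...
let toyOkCombos  = combine negativeWords   // "not bad", "not ugly", "never bad", ...

// Replace every phrase in combos with a single marker token
let replaceAll (marker : string) (combos : string list) (text : string) =
    combos |> List.fold (fun (acc : string) (combo : string) -> acc.Replace(combo, marker)) text

let mark (sentence : string) =
    sentence
    |> replaceAll "Negative_detected" toyBadCombos
    |> replaceAll "Ok_detected" toyOkCombos

printfn "%s" (mark "camera was not good")  // camera was Negative_detected
printfn "%s" (mark "camera was not bad")   // camera was Ok_detected
```

The marker tokens are what the second polarity lookup maps to fixed scores, so the negated phrase is scored as one unit rather than word by word.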

In [None]:
badCombos

In [None]:
getPolarity2 sentiWordList "not good"

In [None]:
let mutable remark = "camera was not good" 
remark <- preprocess remark badCombos okCombos
remark

In [None]:
badCombos |> Seq.toList

In [None]:
posList |> Array.contains "good"

In [None]:
let getPolarityScore2 (sentence:string) (sentiWordNetList:string list[]) =
    let words = sentence.Split ' '
    
    let polarities = words |> Array.map (fun word -> getPolarity2 sentiWordNetList word)
    let totalPositivity = polarities |> Array.map fst
                                     |> Array.sum
    let totalNegativity = polarities |> Array.map snd
                                     |> Array.sum 
    printfn "Positive polarity of this sentence is %f " totalPositivity
    printfn "Negative polarity of this sentence is %f " totalNegativity
    if totalPositivity > totalNegativity then 1
    elif totalNegativity = totalPositivity then 0
    else -1

In [None]:
getPolarityScore2 (preprocess "camera was not good" badCombos okCombos) sentiWordList

In [None]:
getPolarityScore "camera was not good" sentiWordList

In [None]:
getPolarityScore "camera was not bad" sentiWordList

In [None]:
getPolarityScore2 (preprocess "camera was not bad" badCombos okCombos) sentiWordList

# Identifying <font color="green">"_Praise_"</font> and <font color="red">"_Criticism_"</font> 

In [None]:
let prob (list:(string list)list) word = 
    let matchCount = list |> List.filter (fun z -> z |> List.contains word) |> List.length |> float
    matchCount / float list.Length

In [None]:
let probBoth list w1 w2 = 
    let matchCount = list |> List.filter(fun z -> z |> List.contains w1 && 
                                                  z |> List.contains w2)
                          |> List.length |> float
    matchCount / float list.Length

In [None]:
//An example usage
let lst = ["A";"B";"C"]
probBoth [lst;["Cx";"Dx"];["A";"Dx";"W";"x";"X"];["A";"Dx";"M";"C";"Z"]] "C" "Dx"
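
To see what these fractions look like, the sketch below restates both helpers under different names (`docFraction` and `docFractionBoth`, chosen here so that nothing above is shadowed) and evaluates them on the same four-document list.

```fsharp
// Fraction of documents that contain a word
let docFraction (docs : string list list) (word : string) =
    let hits = docs |> List.filter (List.contains word) |> List.length
    float hits / float docs.Length

// Fraction of documents that contain both words
let docFractionBoth (docs : string list list) (w1 : string) (w2 : string) =
    let hits =
        docs
        |> List.filter (fun d -> List.contains w1 d && List.contains w2 d)
        |> List.length
    float hits / float docs.Length

let sampleDocs =
    [ ["A"; "B"; "C"]; ["Cx"; "Dx"]; ["A"; "Dx"; "W"; "x"; "X"]; ["A"; "Dx"; "M"; "C"; "Z"] ]

printfn "%.2f" (docFraction sampleDocs "C")            // 0.50 (documents 1 and 4)
printfn "%.2f" (docFraction sampleDocs "Dx")           // 0.75 (documents 2, 3 and 4)
printfn "%.2f" (docFractionBoth sampleDocs "C" "Dx")   // 0.25 (document 4 only)
```

Membership is tested on whole tokens, so `"Cx"` does not count as an occurrence of `"C"`.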

![](https://i.kinja-img.com/gawker-media/image/upload/s--D8pbiLtD--/c_scale,f_auto,fl_progressive,q_80,w_800/qymhr4748b8d8kb6xi4e.jpg)

## Semantic Orientation 

## $$SO(\omega) = \sum_{\omega_p \in \text{positive-words}} A(\omega,\omega_p)-\sum_{\omega_n \in \text{negative-words}}A(\omega,\omega_n)$$

## Pointwise Mutual Information

## $$PMI(word_1,word_2) = \log_2{\frac{p(word_1 \,\& \, word_2)}{p(word_1) \cdot p(word_2)}}$$
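
As a worked example, take the four-document list from the `probBoth` example cell and the words `C` and `Dx`. `C` occurs in two of the four documents, `Dx` in three, and both together in one:

$$p(C \,\&\, Dx) = \tfrac{1}{4}, \quad p(C) = \tfrac{2}{4}, \quad p(Dx) = \tfrac{3}{4}$$

$$PMI(C, Dx) = \log_2 \frac{1/4}{(2/4) \cdot (3/4)} = \log_2 \frac{2}{3} \approx -0.585$$

The negative value says the two words co-occur less often than they would if they were independent.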

In [None]:
let pmi (docs: (string list)list) (word1:string) (word2:string) = 
    let numerator = probBoth docs word1 word2
    let denominator  =  (prob docs word1 ) * (prob docs word2)
    // Base-2 logarithm, matching the PMI formula; pairs that never co-occur score 0
    if numerator > 0. && denominator > 0. then System.Math.Log(numerator / denominator, 2.0) else 0.

In [None]:
let pWords = ["good";"nice";"excellent";"positive";"fortunate";"correct";"superior"]
let nWords = ["bad";"nasty";"poor";"negative";"unfortunate";"wrong";"inferior"]


In [None]:
pmi

In [None]:
let calculateSO (docs :string list list) (words : string list) = 
    let mutable res = 0.
    for  i in 0 .. docs.Length - 1 do 
        for j in 0 .. docs.[i].Length - 1 do
            for pw in words do 
                res <- res + pmi docs docs.[i].[j] pw
    res
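
The semantic orientation formula is defined per word, so it can help to see it evaluated for one word at a time. The sketch below is a minimal, self-contained version under assumed inputs: `toyDocs`, `seedPositive`, `seedNegative`, `pmiBits`, and `semanticOrientation` are all names and data invented for this example, and PMI plays the role of the association measure A.

```fsharp
// Hypothetical tokenised documents and seed words, for illustration only
let toyDocs =
    [ ["good"; "service"]; ["good"; "service"; "nice"]; ["bad"; "location"]; ["poor"; "location"] ]
let seedPositive = ["good"; "nice"]
let seedNegative = ["bad"; "poor"]

// Fraction of documents satisfying a predicate
let fraction (pred : string list -> bool) =
    float (toyDocs |> List.filter pred |> List.length) / float toyDocs.Length

// PMI in bits; pairs that never co-occur contribute 0
let pmiBits (w1 : string) (w2 : string) =
    let joint = fraction (fun d -> List.contains w1 d && List.contains w2 d)
    let indep = fraction (List.contains w1) * fraction (List.contains w2)
    if joint > 0. && indep > 0. then System.Math.Log(joint / indep, 2.0) else 0.

// SO(w) = sum of PMI with the positive seeds minus sum of PMI with the negative seeds
let semanticOrientation (word : string) =
    (seedPositive |> List.sumBy (pmiBits word)) - (seedNegative |> List.sumBy (pmiBits word))

printfn "%.3f" (semanticOrientation "service")    // 2.000
printfn "%.3f" (semanticOrientation "location")   // -2.000
```

In this toy corpus `service` only appears next to positive seeds and `location` only next to negative ones, which is why the two scores are mirror images.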

In [None]:
calculateSO

In [None]:
let soPMI (reviews : string list list list) = 
    // Immutable bindings inside the lambda: closures cannot capture 'let mutable' locals
    reviews |> List.map (fun docs -> 
        let posi = calculateSO docs pWords
        let negi = calculateSO docs nWords
        (docs, posi - negi))

In [None]:
soPMI

In [None]:
soPMI [ 
        //First review of bank #1
        [["positive";"outlook"];["good";"service"];["nice";"people"];["bad";"location"]];
        //second review of bank #2
        [["nasty";"behaviour"];["unfortunate";"outcome"];["poor";"quality"]]
    ]

In [None]:
let review = "Recently I had to fly round trip from Phoenix to Los Angeles. I have flown America West before and have had a few delays before but nothing like what I just experienced. No one should ever fly this airline again until they get their act together. On the flight out I checked in at the gate and asked if my luggage could be checked through to my final destination since I had to change airlines in Los Angeles. I was told that I could not. Later I found out I could have but it didn't matter anyway since my luggage didn't arrive in Los Angeles to begin with. The real nightmare began at the gate(s). A few minutes before boarding was to begin they announced a gate change to another concourse. Once at the new gate we were informed of another gate change, than another and then another back to the first concourse. Once there we were told the flight was canceled and that we were to go back to the other concourse again to try and get on another flight. Needless to say tempers were high amongst the passengers and the gate personnel were no help at all. They must be bullet proof to customer aggravation and complaints due to so many canceled flights. Of course the next flight was delayed also by about an hour. With all the gate changes and the canceled flight I was not the only one whose luggage did not arrive in Los Angeles. I put in a claim and they found it even though it took them three days to have it delivered to me. My return flight went a little smoother, we only had two delays but then after boarding we sat in the plane 30 minutes before taking off. When passengers asked about compensation for all the hassles and delays they were told nothing would be done. I have been waiting for five weeks now to be reimbursed for the clothing I had to buy while my luggage was missing. Needless to say, I won't be flying this airline again anytime in the future. 
The funniest part of the ordeal was when a passenger started passing around a newspaper that had America West ranked last in on time flights around the plane. Everyone got a chuckle out of that."

In [None]:
review

In [None]:
let review2 = "My most recent experience with America West has left a bad taste in my mouth for this Airline. I know that some delays are sometimes inevitable. However, when it becomes the way that an airline does business, it leaves a lot to be desired. I had four segments to my round trip and 3 of those legs were late. This tells me that 75% of the time this airline is either late or does not fulfill its agreement with it's customers. During my trip, I spoke with several customers and the majority seems to have the same experience and opinion that I have. The airline schedules flights that possibly conflict with their capability of getting to the destinations on time and it would seem that all flights in the western part of the US and the international flights have to land in Phoenix, Az. Why can't this airline have more direct nonstop flights from points of origin to final destination. Additionally, the personnel at some of the ticket counters and at the gates seem to have a bad attitude towards the client. Don't they realize that it is the customer who pays their salaries? I cannot totally condemn all of the personnel as I did encounter two of about ten that were quite friendly and helpful. My biggest beef is with the scheduling and the inability of the airline to stay on time. 3 of the 4 segments of my round trip were delayed. I understand that sometimes this is inevitable. However, with America West, this seems to be the way of doing business. This seems to be true both domestically and internationally."

In [None]:
review2

In [None]:
pWords |> List.map (fun pW -> pW, review2.Contains pW)

In [None]:
nWords |> List.map (fun pW -> pW, review2.Contains pW)

In [None]:
getPolarityScore review2 sentiWordList

In [None]:
getPolarityScore review sentiWordList