<a href="https://colab.research.google.com/github/rtkilian/data-science-blogging/blob/main/spaCy_sentiment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sentiment Analysis with spaCy

## Installation
We start by installing [spaCy](https://spacy.io/usage), following the official instructions.

In [1]:
!pip install -U pip setuptools wheel
!pip install -U spacy
!python -m spacy download en_core_web_sm


We also install [spaCyTextBlob](https://spacy.io/universe/project/spacy-textblob), a pipeline component that lets us run TextBlob sentiment analysis inside spaCy.

In [5]:
!pip install spacytextblob

Collecting spacytextblob
  Downloading spacytextblob-3.0.1-py3-none-any.whl (4.1 kB)
Installing collected packages: spacytextblob
Successfully installed spacytextblob-3.0.1


## Set Up
We import our libraries.

In [6]:
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

We also load the spaCy language model and add the `spacytextblob` component to its pipeline.

In [7]:
nlp = spacy.load('en_core_web_sm')
nlp.add_pipe('spacytextblob')

<spacytextblob.spacytextblob.SpacyTextBlob at 0x7f64d02fcb90>

## Sentiment Classification
Our aim is to build a sentiment analysis model that takes a user-supplied string and classifies it as positive, neutral, or negative. The model should return its prediction as a dictionary, which will simplify the development of our API.

In [10]:
# User input text
user_input = 'This is a wonderful campsite. I loved the serenity and the birds chirping in the morning.'

# Process user input
doc = nlp(user_input)

We can now inspect the sentiment of the user input. The polarity score ranges from -1 (most negative) to 1 (most positive).

In [14]:
doc._.polarity

0.85

In [16]:
# Convert the model output to a dictionary
input_polarity = doc._.polarity
sentiment = {
    'score': input_polarity
}
print(sentiment)

{'score': 0.85}
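
The dictionary so far holds only the raw score, while the stated aim is a positive, neutral, or negative label. One way to get there is to threshold the polarity score. The sketch below is a minimal illustration: the helper names `classify_polarity` and `build_sentiment` and the cutoff of 0.1 are assumptions made for this example, not part of spaCy or spaCyTextBlob, and the cutoff would need tuning for real data.

```python
def classify_polarity(score: float, threshold: float = 0.1) -> str:
    """Map a polarity score in [-1, 1] to a sentiment label.

    The threshold is an illustrative assumption: scores within
    +/- threshold of zero are treated as neutral.
    """
    if score > threshold:
        return 'positive'
    if score < -threshold:
        return 'negative'
    return 'neutral'


def build_sentiment(score: float, threshold: float = 0.1) -> dict:
    """Bundle the raw score and its label into one dictionary."""
    return {
        'score': score,
        'label': classify_polarity(score, threshold),
    }


# 0.85 is the polarity obtained for the campsite review above
print(build_sentiment(0.85))  # {'score': 0.85, 'label': 'positive'}
```

In the notebook, the value passed to `build_sentiment` would be `doc._.polarity` rather than a hard-coded number.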


When we develop our API, this dictionary will be serialized to JSON and returned as the response to each sentiment analysis request.
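
As a rough preview of that serialization step, the standard library's `json` module can turn the dictionary into a JSON string. This is only a sketch: the variable name `response_body` is an assumption, and a web framework would typically handle this conversion for us.

```python
import json

# Same shape as the dictionary built above; the value is hard-coded
# here so that the snippet runs on its own
sentiment = {'score': 0.85}

# Serialize the dictionary to a JSON string, as an API response body would be
response_body = json.dumps(sentiment)
print(response_body)  # {"score": 0.85}

# Parsing the string on the client side recovers the original dictionary
assert json.loads(response_body) == sentiment
```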