How to run this notebook:

1. Run code block 0.
2. To use a text already present in the `data/` folder, edit code block 1, then run code blocks 1, 3, and 4.
3. To use your own text, upload it through the file explorer panel on the left, edit code block 1 so that the paths point to it, then run code blocks 1, 2, 3, and 4.
4. The remaining code blocks are optional and are there for your own exploration.

#### 0. Set up the Google Colab environment

In [None]:
! git clone "https://github.com/longyuxi/naive-sentiment-analyzer"
! mv naive-sentiment-analyzer/* ./
! rm -rf sample_data/
! rm -rf naive-sentiment-analyzer/
! pip install afinn
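import nltk
nltk.download('punkt')
# Recent NLTK releases load the sentence tokenizer from 'punkt_tab' instead of 'punkt'
nltk.download('punkt_tab')
nltk.download('wordnet')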

#### 1. Run the following block every time you want to switch to a different input or output file

In [None]:
INPUT_TEXT = 'data/conversion/origen-conversion.txt'
OUTPUT_CSV = 'data/conversion/origen-conversion.csv'

#### 2. Generate a CSV file containing the sentiment information extracted from the text. You can skip this block if the CSV file already exists (i.e. it was generated in an earlier run).

In [None]:
import demensuris_analyze
demensuris_analyze.enhanced_analyze_text(INPUT_TEXT, OUTPUT_CSV)

Print out the nth sentence in the text

In [None]:
SENTENCE_NUMBER = 1

import demensuris_analyze
demensuris_analyze.print_sentence_by_number(INPUT_TEXT, SENTENCE_NUMBER)

Search the text for a word or pattern

In [None]:
WORD_OR_PATTERN = "said"

import demensuris_analyze
demensuris_analyze.find_pattern(INPUT_TEXT, WORD_OR_PATTERN)
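
To give a feel for what numbering sentences and searching them for a pattern involves, here is a minimal sketch. It is not the code in `demensuris_analyze`, whose implementation is not shown in this notebook. The helper names `split_sentences` and `find_sentences` are hypothetical, the regex-based splitter is a naive stand-in for a real sentence tokenizer such as NLTK's, and counting sentences from 1 is an assumption.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_sentences(text, pattern):
    """Return (sentence number, sentence) pairs whose sentence matches the regex pattern."""
    regex = re.compile(pattern)
    numbered = enumerate(split_sentences(text), start=1)  # 1-based numbering (assumption)
    return [(number, sentence) for number, sentence in numbered if regex.search(sentence)]


sample = "He said yes. She left! Who said that?"
for number, sentence in find_sentences(sample, "said"):
    print(number, sentence)
```

Because the pattern is compiled as a regular expression, something like `r"\bsaid\b"` would restrict the search to the whole word.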

## WordNet enhanced AFINN

#### Optional: look up the WordNet-enhanced AFINN score of a particular word
If the code below is commented out, uncomment it by deleting the hash signs (`#`) in front of it

In [None]:
import demensuris_analyze
demensuris_analyze.enhanced_afinn('sin')

In [None]:
import demensuris_analyze
demensuris_analyze.print_synonyms_and_antonyms('sin')
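
As a rough illustration of how a sentiment lexicon could be extended with synonyms and antonyms, here is a minimal sketch. It does not reproduce `demensuris_analyze.enhanced_afinn`, whose implementation is not shown in this notebook. The lexicon, the synonym and antonym tables, and the function `toy_enhanced_score` are all hypothetical, and the fallback rule (average the scores of known synonyms together with the negated scores of known antonyms) is an assumption.

```python
# Toy data for illustration only: NOT the real AFINN lexicon or WordNet.
TOY_LEXICON = {"evil": -3, "bad": -3, "good": 3}
TOY_SYNONYMS = {"wicked": ["evil", "bad"]}
TOY_ANTONYMS = {"wicked": ["good"]}


def toy_enhanced_score(word):
    """Score a word directly if it is in the lexicon, else via related words."""
    if word in TOY_LEXICON:
        return float(TOY_LEXICON[word])
    # Synonyms contribute their own score; antonyms contribute the negated score.
    related = [TOY_LEXICON[s] for s in TOY_SYNONYMS.get(word, []) if s in TOY_LEXICON]
    related += [-TOY_LEXICON[a] for a in TOY_ANTONYMS.get(word, []) if a in TOY_LEXICON]
    if not related:
        return 0.0  # nothing known about this word: treat it as neutral
    return sum(related) / len(related)


for word in ("evil", "wicked", "table"):
    print(word, toy_enhanced_score(word))
```

The point of such a fallback is coverage: a word missing from the base lexicon can still receive a score when related words are present.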

## Plot the sentiment score as a function of sentence count or word count, based on the data in OUTPUT_CSV

Note: The black dotted line marks a score of zero

#### 3. Plot against sentence count

In [None]:
# Plot against sentence count
import csv

import matplotlib.pyplot as plt
import numpy as np

# 'seaborn-whitegrid' was renamed to 'seaborn-v0_8-whitegrid' in Matplotlib 3.6
style = 'seaborn-whitegrid'
if style not in plt.style.available:
    style = 'seaborn-v0_8-whitegrid'
plt.style.use(style)

with open(OUTPUT_CSV, newline='') as csvfile:
    rows = list(csv.reader(csvfile))[1:]  # skip the header row

# Transpose so that each row of `columns` holds one CSV column
columns = np.array(rows).T.astype(float)

fig, ax = plt.subplots()
ax.plot(columns[0], columns[3])  # x: sentence count, y: total AFINN score
ax.axhline(0, color='k', linestyle=':')  # dotted reference line at zero
ax.set_title(OUTPUT_CSV[:-4])
ax.set_xlabel("Sentence count")
ax.set_ylabel("Total AFINN Score")
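
The load-and-transpose step used by the plotting code can be tried on its own, without a generated CSV file. The sketch below runs on a small in-memory sample. The header names and values are made up, and the helper `load_columns` is hypothetical. Only the column positions (0 for sentence count, 3 for total score, 4 for word count) mirror what the plotting code indexes.

```python
import csv
import io

import numpy as np

# Hypothetical stand-in for OUTPUT_CSV: header names and values are invented.
SAMPLE_CSV = """sentence,col_1,col_2,total_score,word_count
1,0,0,2,12
2,0,0,-1,30
3,0,0,4,41
"""


def load_columns(csvfile):
    """Read a CSV with a header row and return its columns as float arrays."""
    rows = list(csv.reader(csvfile))[1:]   # drop the header row
    return np.array(rows).T.astype(float)  # transpose: one array per CSV column


columns = load_columns(io.StringIO(SAMPLE_CSV))
print(columns[0])  # sentence count
print(columns[3])  # total score
print(columns[4])  # word count
```

Transposing turns the list of rows into a list of columns, so each plotted series can be selected with a single index.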

#### 4. Plot against word count

In [None]:
# Plot against word count
import csv

import matplotlib.pyplot as plt
import numpy as np

# 'seaborn-whitegrid' was renamed to 'seaborn-v0_8-whitegrid' in Matplotlib 3.6
style = 'seaborn-whitegrid'
if style not in plt.style.available:
    style = 'seaborn-v0_8-whitegrid'
plt.style.use(style)

with open(OUTPUT_CSV, newline='') as csvfile:
    rows = list(csv.reader(csvfile))[1:]  # skip the header row

# Transpose so that each row of `columns` holds one CSV column
columns = np.array(rows).T.astype(float)

fig, ax = plt.subplots()
ax.plot(columns[4], columns[3])  # x: word count, y: total AFINN score
ax.axhline(0, color='k', linestyle=':')  # dotted reference line at zero
ax.set_title(OUTPUT_CSV[:-4])
ax.set_xlabel("Word count")
ax.set_ylabel("Total AFINN Score")
