# Experimentation in Psychology and Linguistics

## Assignment 2: Analysis of adjectives in Twitter data

Frans Adriaans, f.w.adriaans@uu.nl

### Submission instructions: 
You will find questions and tasks throughout this Notebook. Provide answers to the questions and discuss your findings in the designated spaces throughout the Notebook. Make sure you save your work and submit your assignment on Blackboard before the deadline. You are allowed to work in teams of 2 students. Individual submissions are allowed, but not recommended.

**Deadline: March 8, 2019, 17:00**


## Goals of the assignment

Adjectives are important for sentiment analysis (e.g., 'happy' vs 'sad'). The problem addressed in this assignment is that Twitter data doesn't come with linguistic annotations. This makes it difficult to perform quantitative linguistic analyses on the data. We will experiment with different methods to discover adjectives in the data. For this you will have to rely on linguistic resources. We will use our automatically annotated data to see whether there are differences in the frequencies of positive and negative adjectives in the data, thereby giving a very rudimentary estimate of the overall sentiment in the corpus. By doing this assignment you will get experience with linguistic data collection and annotation in Python/NLTK, and you will get a feeling for corpus-based research.

The assignment is structured as follows:
* In Part 1 you will be provided with basic snippets of Python code that will allow you to load corpora from NLTK (Natural Language Toolkit) and perform basic quantitative analyses.
* In Part 2 you will be guided through a basic method for discovering adjectives in the data. 
* In Part 3 you will conduct several follow-up analyses on your own to get better results. Here you will explore new strategies and report on the findings.



# PART 1: Getting started

### Accessing corpora via NLTK


The code below imports the nltk module and downloads the Brown corpus which was discussed during class. We will use it to replicate the data shown in class. You can use this code to verify that NLTK is working properly, and you can study the code to get an idea of how to work with NLTK.

N.B. If you have a hard time understanding the code in this assignment, then take a few minutes to consult the NLTK book: http://www.nltk.org/book/ (especially Chapter 1, section 3: *Computing with Language: Simple Statistics*)

In [1]:
# The following code imports NLTK and makes sure it works.
import nltk
try:
    nltk.pos_tag("ok".split())
    print("The nltk is ready to use")
except LookupError:
    print("Downloading the nltk's \"book\" bundle. This will take a few minutes.")
    nltk.download("book")

# The code below can be used to download and import a specific corpus
nltk.download("brown")
from nltk.corpus import brown



The nltk is ready to use


[nltk_data] Downloading package brown to /home/asun/nltk_data...
[nltk_data]   Package brown is already up-to-date!


### Inspecting the data

We first want to get an impression of what the data looks like, before we start doing any analyses. This involves some basic printing, counting, and data preprocessing.

In [2]:
# As a first inspection we print the different text categories that were used to create the corpus
print(brown.categories())

# A list of all words in the corpus can be obtained using brown.words(). Let's print the first 20 words in the list:
print(brown.words()[0:20])

# Let's print the total number of words in the corpus:
print("There are", len(brown.words()), "words in the Brown corpus.")

# You will notice that the number of words is a bit higher than one million. 
# This is due to NLTK's tokenization, which treats punctuation characters as words.
# We can use a regular expression to only select "proper" words, containing alphanumeric characters:
import re
properwords = [w.lower() for w in brown.words() if re.search(r"^\w+$", w)]
print("There are", len(properwords), "words, excluding punctuation.")





['adventure', 'belles_lettres', 'editorial', 'fiction', 'government', 'hobbies', 'humor', 'learned', 'lore', 'mystery', 'news', 'religion', 'reviews', 'romance', 'science_fiction']
['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', 'Friday', 'an', 'investigation', 'of', "Atlanta's", 'recent', 'primary', 'election', 'produced', '``', 'no', 'evidence', "''", 'that']
There are 1161192 words in the Brown corpus.
There are 988331 words, excluding punctuation.


### Creating a frequency distribution

You can use NLTK's FreqDist method to build a frequency distribution. The code below should give you the same results as the table on Slide 8 from last Tuesday's class.

In [3]:
# nltk.FreqDist takes a list as input
fdist = nltk.FreqDist(properwords)

# We can print the most common words:
fdist.most_common(10)



[('the', 69971),
 ('of', 36412),
 ('and', 28853),
 ('to', 26158),
 ('a', 23195),
 ('in', 21337),
 ('that', 10594),
 ('is', 10109),
 ('was', 9815),
 ('he', 9548)]

## NLTK's Twitter Sample Corpus

Now that we have a basic understanding of corpora in NLTK, we are ready to start working with the Twitter data.

NLTK has a Twitter sample corpus ('`twitter_samples`'). The sample contains 20,000 tweets (mostly on British politics), plus a set of 10,000 which have been annotated with regards to sentiment (positive and negative).

In [4]:
# YOUR TURN: download and import the 'twitter_samples' corpus
nltk.download("twitter_samples")
from nltk.corpus import twitter_samples



[nltk_data] Downloading package twitter_samples to
[nltk_data]     /home/asun/nltk_data...
[nltk_data]   Package twitter_samples is already up-to-date!


### Inspecting the data

In [5]:
# Let's see which files make up the corpus
twitter_samples.fileids()

# The tweet text contents are available as strings. Let's print a few to see what they look like:
tweets = twitter_samples.strings()

print(tweets[0]) # the first tweet
print(tweets[1]) # the second tweet
print(tweets[5:10]) # tweets 5-10



hopeless for tmr :(
Everything in the kids section of IKEA is so cute. Shame I'm nearly 19 in 2 months :(
["oh god, my babies' faces :( https://t.co/9fcwGvaki0", '@RileyMcDonough make me smile :((', '@f0ggstar @stuartthull work neighbour on motors. Asked why and he said hates the updates on search :( http://t.co/XvmTUikWln', 'why?:("@tahuodyy: sialan:( https://t.co/Hv1i0xcrL2"', 'Athabasca glacier was there in #1948 :-( #athabasca #glacier #jasper #jaspernationalpark #alberta #explorealberta #… http://t.co/dZZdqmf7Cz']


In [6]:
# YOUR TURN: count how many tweets there are in total
print("There are", len(tweets), "tweets")




There are 30000 tweets


In [7]:
# The tweet texts shown above will be sufficient for the current assignment, but if you want to use the full twitter 
# JSON format (https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object)
# then you can use the following: 

full = twitter_samples.docs()

print(full[0]) # the first tweet
print(full[1]) # the second tweet
print(full[5:10]) # tweets 5-10

#happydatamining

{'contributors': None, 'coordinates': None, 'text': 'hopeless for tmr :(', 'user': {'screen_name': 'yuwraxkim', 'time_zone': 'Jakarta', 'profile_background_image_url': 'http://pbs.twimg.com/profile_background_images/585476378365014016/j1mvQu3c.png', 'profile_background_image_url_https': 'https://pbs.twimg.com/profile_background_images/585476378365014016/j1mvQu3c.png', 'default_profile_image': False, 'url': None, 'profile_text_color': '000000', 'following': False, 'listed_count': 3, 'entities': {'description': {'urls': []}}, 'utc_offset': 25200, 'profile_sidebar_border_color': '000000', 'name': 'yuwra ✈ ', 'favourites_count': 196, 'followers_count': 1281, 'location': 'wearegsd;favor;pucukfams;barbx', 'protected': False, 'notifications': False, 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/622631732399898624/kmYsX_k1_normal.jpg', 'profile_use_background_image': True, 'profile_image_url': 'http://pbs.twimg.com/profile_images/622631732399898624/kmYsX_k1_normal.jpg', 'lan

### Creating a frequency distribution for the twitter sample

We will now create a frequency distribution for words that occur in tweets. Note that the .words() method is not available for this corpus, so you will have to split each tweet into words yourself (e.g. by using string.split()) in order to create one long list of words.

In [8]:
# YOUR TURN: create a frequency distribution for words used in tweets. Print the 25 most frequent words.

# Empty list to store the splitted words
tweet_words=[]

# Split
for i in tweets:
    tweet_words.append(i.split())

import itertools

fTW= list (itertools.chain(*tweet_words))

#frequency
fdist = nltk.FreqDist(fTW)

# We can print the most common words:
fdist.most_common(25)













[('RT', 13500),
 ('the', 12282),
 ('to', 9590),
 ('a', 7283),
 ('of', 6278),
 ('in', 5934),
 ('is', 5179),
 ('I', 4729),
 ('Miliband', 4609),
 ('and', 4597),
 ('with', 3966),
 ('on', 3812),
 (':(', 3802),
 ('for', 3721),
 ('you', 3677),
 ('Tories', 3403),
 (':)', 3311),
 ('SNP', 2866),
 ('it', 2695),
 ('that', 2591),
 ('Ed', 2539),
 ('will', 2230),
 ('-', 2201),
 ('have', 2166),
 ('Labour', 2089)]

#### Question:
At this point you have collected sufficient data to form a first impression of the overall sentiment in this corpus. What is the overall sentiment, and what data supports that conclusion? Can you put a number on the overall sentiment (e.g. X% positive, Y% negative)?

#### Answer:
Based on the actual words of the first data we collected we cannot really determine the overall sentiment. However two smiley faces (":(",":)") are also included, and from them we conclude that the overall sentiment is slightly more negative than positive; negative: 53%, positive: 47%).

### Creating a frequency distribution of hashtags

You can think of hashtags as a particular type of word. They are easy to find in the data: they start with '#'. In the cell below, create a frequency distribution of hashtags.


In [9]:
# YOUR TURN: create a frequency distribution of hashtags. Print the 25 most frequent hashtags. Also print the total
# number of hashtags found.

# We iterate over the list of flattened tweet words and store the ones starting with a hashtag

hList=[]

for i in fTW:
    for j in i:
        if j == "#":
            hList.append(i)

#frequency
fdist = nltk.FreqDist(hList)

# We can print the most common words:
fdist.most_common(25)






[('#bbcqt', 2025),
 ('#AskNigelFarage', 1104),
 ('#UKIP', 867),
 ('#GE2015', 611),
 ('#SNP', 550),
 ('#BBCQT', 231),
 ('#AskFarage', 210),
 ('#BBCqt', 201),
 ('#Labour', 186),
 ('#ukip', 186),
 ('#VoteSNP', 181),
 ('#GE15', 140),
 ('#Plaid15', 114),
 ('#Farage', 108),
 ('#voteSNP', 94),
 ('#Miliband', 92),
 ('#SNPbecause', 89),
 ('#Tories', 89),
 ('#snp', 89),
 ('#VoteUKIP', 76),
 ('#NHS', 74),
 ('#…', 73),
 ('#SNPout', 64),
 ('#TeamNigel', 62),
 ('#BBcqt', 59)]

#### Question:
(a) How many hashtags are there in this corpus? 
(b) During class we discussed the difference between Tags and Commentaries. Are the most common hashtags mostly Tags or Commentaries? Explain why.

#### Answer:
(a) There are 14870 hashtags.
(b) The most common hashtags are mainly Tags. Commentaries are usually full sentences (so, longer than Tags and they often include verbs) and from the data we can see that the majority are abbreviations. 

# PART 2: Finding adjectives in the data

At this point we have collected basic lexical information (words and their frequencies) from the corpus. We will now explore ways to find adjectives in the data. For this you will need to rely on other linguistc resources, which are conveniently provided by NLTK. We will start with a lexical analysis using WordNet. (In Part 3 you will experiment with alternative strategies.)



### WordNet

The approach that we will be exploring here is one in which we first want to create a list of English adjectives, and then want to check words in the Twitter corpus against that list to find out whether or not a word in a tweet is an adjective. A list of adjectives can be created with the use of WordNet, which is available through NLTK.

"WordNet® is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept." https://wordnet.princeton.edu/

Read how to use WordNet in NLTK:
http://www.nltk.org/howto/wordnet.html



#### Step 1: Create a list of adjectives

In [10]:
# YOUR TURN: Import WordNet and create a list of adjectives. 
# HINT: You will need to study the NLTK WordNet page listed above to figure out how to extract all adjectives
# ("Access to all Synsets" is where you want to look...)

from nltk.corpus import wordnet as wn

# We create a list with all the adjectives in WordNet

sy = []

for synset in list(wn.all_synsets('a'))[:]:
    sy.append(synset)
    
print(len(sy))
 











18156


#### Question:
How many different adjectives are there in WordNet? 

(your number should be close to what is reported here: https://wordnet.princeton.edu/documentation/wnstats7wn )

#### Answer:
There are 18156	different adjectives.

#### Step 2: Find adjectives in tweets

In [51]:
# YOUR TURN: Use your WordNet adjectives list to find adjectives in the twitter corpus. 
# Collect all twitter adjective occurrences in a list. 
# NB this might take about 5-10 minutes to run. Time for coffee?

adjList=[]
adjTweets=[]

# We take the adjectives from the list obtained from WordNet because the type of the variable was a WordNet type
# We iterate over the list of flattened tweet words and also over the list with the adjectives from WordNet
# We also add the adjectives that are on both lists to a new one

for i in sy:
    adjList.append(str(i)[8:-7])
 
for j in fTW:
    if j in adjList:
        adjTweets.append(j)
    
print (len(adjTweets))
print (adjTweets)
    

    








50940
['hopeless', 'in', 'in', 'sliding', 'starting', 'on', 'on', 'in', 'good', 'going', 'happy', 'tired', 'up', 'about', 'same', 'kind', 'in', 'sure', 'stupid', 'just', 'digital', 'any', 'lonely', 'just', 'hard', 'no', 'available', 'much', 'far', 'away', 'safe', 'sad', 'in', 'late', 'much', 'sick', 'on', 'first', 'back', 'old', 'u', 'pale', 'after', 'in', 'massive', 'rash', 'all', 'most', 'painful', 'go', 'home', 'busy', 'one', 'through', 'all', 'in', 'go', 'on', 'very', 'u', 'off', 'worst', 'still', 'bad', 'about', 'on', 'driving', 'like', 'powerful', 'influential', 'sad', 'in', 'regional', 'doomed', 'other', 'easy', 'green', 'in', 'sure', 'sane', 'three', 'much', 'up', 'sad', 'just', 'still', 'tired', 'massive', 'about', 'genuine', 'up', 'late', 'u', 'out', 'u', 'sound', 'upset', 'much', 'much', 'serious', 'hot', 'still', 'found', 'cut', 'own', 'much', 'short', 'received', 'like', 'mum', 'out', 'loud', 'up', 'some', 'up', 'like', 'beautiful', 'cute', 'on', 'u', 'no', 'back', 'accept

#### Step 3: Create a frequency distribution of adjectives in Twitter

In [52]:
# YOUR TURN: Create a frequency distribution and print the 50 most common adjectives.


#frequency
fdist = nltk.FreqDist(adjTweets)

# We can print the most common words:
fdist.most_common(50)





[('in', 5934),
 ('on', 3812),
 ('out', 1816),
 ('just', 1342),
 ('all', 1260),
 ('about', 1078),
 ('no', 1055),
 ('like', 1023),
 ('going', 931),
 ('up', 890),
 ('more', 847),
 ('back', 684),
 ('one', 604),
 ('off', 487),
 ('u', 466),
 ('even', 452),
 ('good', 394),
 ('most', 394),
 ('go', 385),
 ('any', 369),
 ('well', 362),
 ('after', 360),
 ('down', 359),
 ('much', 358),
 ('then', 328),
 ('still', 311),
 ('last', 309),
 ('right', 305),
 ('new', 298),
 ('same', 282),
 ('working', 281),
 ('very', 277),
 ('some', 271),
 ('best', 266),
 ('another', 253),
 ('every', 242),
 ('left', 226),
 ('other', 225),
 ('here', 218),
 ('made', 215),
 ('cut', 202),
 ('done', 202),
 ('great', 193),
 ('many', 190),
 ('better', 190),
 ('own', 177),
 ('must', 176),
 ('five', 173),
 ('heard', 169),
 ('front', 167)]

#### Question:
Report on the outcome of your analysis. Are you satisfied with the results so far? Do you get an impression of the overall sentiment in the corpus? If you are not satisfied with the analysis, then describe what seems to be going wrong.


#### Answer:
The results do not really give us a good impression of the overall sentiment. We observe that the most frequent adjectives are also prepositions and so, it is possible that they were not mainly used as adjectives. 

# PART 3: Follow-up analyses

## Follow-up 1: Create a better WordNet adjective list

You might not be very happy with the results so far. Can you come up with ways to make a better list of adjectives?

In [126]:
# We flatten the adjective list to only have each word once

alF=[]

for i in adjList:
    if i not in alF:
        alF.append(i)

# Tagging of the flattened list        
        
adjtag = nltk.pos_tag(alF)

# definitive adjective list
dal = []

# Keeping only the adjectives tagged as adjectives ('JJ')

for i in range(len(alF)):
    for j in adjtag[i][1]:
        if j == 'J' and adjtag[i][0] not in dal:
            dal.append(adjtag[i][0])

print(len(dal))

aTw = []

# Comparing the list of flattened tweet words with the last list we created with the adjectives
# Storing the words that are in both lists in a new one

for k in fTW:
    if k in dal:
        aTw.append(k)

#frequency
fdist = nltk.FreqDist(aTw)

# We can print the most common words:
fdist.most_common(50)


8485


[('more', 847),
 ('u', 466),
 ('good', 394),
 ('most', 394),
 ('last', 309),
 ('right', 305),
 ('new', 298),
 ('same', 282),
 ('best', 266),
 ('other', 225),
 ('great', 193),
 ('many', 190),
 ('better', 190),
 ('own', 177),
 ('sure', 154),
 ('political', 154),
 ('global', 149),
 ('live', 142),
 ('top', 140),
 ('racist', 140),
 ('hard', 137),
 ('first', 128),
 ('sad', 127),
 ('happy', 125),
 ('such', 119),
 ('bad', 118),
 ('unrelated', 118),
 ('late', 115),
 ('clear', 110),
 ('final', 106),
 ('whole', 105),
 ('full', 103),
 ('sound', 101),
 ('real', 101),
 ('nice', 100),
 ('public', 97),
 ('little', 96),
 ('wanted', 91),
 ('free', 91),
 ('common', 90),
 ('big', 89),
 ('few', 88),
 ('wrong', 87),
 ('old', 86),
 ('correct', 85),
 ('broken', 82),
 ('strong', 76),
 ('poor', 75),
 ('able', 74),
 ('least', 71)]

First we flattened the adjective list we had in order to take out the words that were repeated. To improve the adjective list further, we decided that a good approach is to take out all the words not tagged as adjectives (JJ), because we did not want to work with prepositions since they were possibly mostly not used as adjectives and even when they were, they do not really give us any information about sentiment.


## Follow-up 2: A manual list of positive and negative adjectives

Another approach would be to prespecify a set of positive and negative adjectives, and create a frequency distribution for only those words. Make a list of at least 10 positive and 10 negative words, and get their word frequencies from the Twitter corpus. Given this experiment, is the sentiment in this corpus mostly positive or mostly negative? 


In [114]:
# List with positive and negative adjectives
Words= ["happy", "good", "nice", "content", "beautiful", "wonderful", "cool", "positive", "overjoyed", "thrilled", "sad", "depressed", "stressed", "tired", "miserable", "horrible", "desolate", "bad", "negative", "terrible"]

# Empty list to store the adjectives and comparing both lists as before
futweets=[]

for j in fTW:
    if j in Words:
        futweets.append(j)

#frequency
fdist = nltk.FreqDist(futweets)

# We can print the most common words:
fdist.most_common(50)
    








[('good', 394),
 ('sad', 127),
 ('happy', 125),
 ('bad', 118),
 ('nice', 100),
 ('tired', 52),
 ('beautiful', 46),
 ('positive', 41),
 ('wonderful', 40),
 ('horrible', 36),
 ('cool', 34),
 ('terrible', 29),
 ('negative', 20),
 ('content', 4),
 ('miserable', 2),
 ('stressed', 1),
 ('thrilled', 1)]

According to the list we created the overall sentiment is positive.

## Follow-up 3: Use POS tagging to find adjectives

Instead of a lexical approach, you might get better results if you use a part-of-speech tagger. Study the NLTK book chapter on Tagging (http://www.nltk.org/book/ch05.html) and use the techniques described there to add POS tags to our corpus of tweets. Use the tagged corpus to create a new adjective frequency distribution. Describe your findings with respect to positive and negative adjectives: what is the overall sentiment?

In [125]:
# Tagging the list of flattened tweet words
ptag = nltk.pos_tag(fTW)

tagtw = []

# Keeping only the adjectives tagged as adjectives ('JJ')

for i in range(len(fTW)):
    for j in ptag[i][1]:
        if j == 'J':
            tagtw.append(ptag[i][0])

print(len(tagtw))

#frequency
fdist = nltk.FreqDist(tagtw)

# We can print the most common words:
fdist.most_common(50)

73554


[('w/', 1260),
 (':(', 1170),
 (':-)', 1156),
 ('more', 1110),
 (':)', 902),
 (':-(', 778),
 ('good', 778),
 ('Scottish', 624),
 ('last', 618),
 ('new', 596),
 ('u', 566),
 ('same', 564),
 ('next', 556),
 ('best', 520),
 ('#bbcqt', 504),
 ('much', 500),
 ('(-1)', 456),
 ('other', 450),
 ('only', 388),
 ('great', 386),
 ('(+1)', 386),
 ('many', 380),
 ('i', 350),
 ('own', 326),
 ('Labour', 326),
 ('political', 308),
 ('it.', 300),
 ('global', 298),
 ('right', 290),
 ('financial', 288),
 ("tonight's", 274),
 ('&amp;', 260),
 ('…', 252),
 ('happy', 248),
 ('better', 240),
 ('bad', 236),
 ('unrelated', 236),
 ('sure', 234),
 ('(-)', 226),
 ('sad', 220),
 ('first', 218),
 ('#UKIP', 218),
 ('clear', 216),
 ('final', 212),
 ('hard', 206),
 ('full', 206),
 ('follow', 204),
 ('child', 204),
 ('real', 202),
 ('most', 200)]

We decided to include smiley faces, because they are a strong indicator of sentiment, so we did not exclude symbols. This led to some trivial results (for instance the first one) that we decided to ignore.
So, based on both the smiley faces and on the adjectives we collected we conclude that the overall sentiment is positive :). 

## Follow-up 4: Adjective distributions in hashtags

Perhaps hashtags provide more useful information about sentiment than running text? Explore this idea by doing an adjective frequency analysis on hashtags. Describe the difference with the running text analysis.



In [148]:
haf = []

#Comparing the list of hashtags with the list created in the first follow-up

for j in hList:
    i = str(j[1:])
    if i in alF:
        haf.append(i)
print (len(haf))

#frequency
fdist = nltk.FreqDist(haf)

# We can print the most common words:
fdist.most_common(50)

156


[('simple', 28),
 ('sexy', 11),
 ('wet', 6),
 ('gay', 5),
 ('amateur', 5),
 ('lesbian', 5),
 ('free', 5),
 ('hot', 5),
 ('happy', 5),
 ('sexual', 4),
 ('french', 4),
 ('interracial', 4),
 ('cute', 4),
 ('bankrupt', 4),
 ('sad', 2),
 ('bored', 2),
 ('marine', 2),
 ('clueless', 2),
 ('gross', 1),
 ('jealous', 1),
 ('u', 1),
 ('naked', 1),
 ('horny', 1),
 ('isolated', 1),
 ('unloved', 1),
 ('romance', 1),
 ('fantastic', 1),
 ('open', 1),
 ('brilliant', 1),
 ('pretty', 1),
 ('soulful', 1),
 ('philosophical', 1),
 ('sharing', 1),
 ('organic', 1),
 ('commercial', 1),
 ('excited', 1),
 ('patent', 1),
 ('funny', 1),
 ('together', 1),
 ('electoral', 1),
 ('exposed', 1),
 ('fair', 1),
 ('bluff', 1),
 ('gutless', 1),
 ('nutty', 1),
 ('low', 1),
 ('orange', 1),
 ('balanced', 1),
 ('proud', 1),
 ('irrelevant', 1)]

The hashtag adjective distribution provides us with a qualitatively better list of adjectives, but with lower frequencies (due to the lower overall number of adjectives in the list). We have used the flattened list because people usually do not use prepositions in hashtags (and as we see in the results, no prepositions came up in this analysis).
To see the overall sentiment, we think it is better to conduct a text analysis, in the sense that you can easily get an impression of the overall sentiment out of it. However, the hashtag distribution illustates better the content of the tweets (for instance here we can observe a sex-related theme), but we believe that these results would require further analysis so as to determine whether their overall sentiment is positive or not. 
Finally, to improve this kind of analysis, we strongly believe that it is required that some context is taken into account as well.