# **Project - Part 3: Ranking**
## **Ranking Score**
Given a query, we want to get the top-20 documents related to the query.
## **Goal**
**Find all the documents that contain all the words in the query and sort them by their relevance with regard to the query.** 
## **Score**
>1. You’re asked to provide 2 different ways of ranking:
>>* **TF-IDF + cosine similarity**: Classical scoring, we have also seen during the
practical labs
>>* **Your-Score + cosine similarity**: Here the task is to create a new score, and it’s up to you to create a new one.
>>* **BM25**
>* Explain how the ranking differs when using TF-IDF and BM25 and think about the
pros and cons of using them.
Regarding your own score, justify the choice of the score (pros and cons). The only
constraint you have is that the score needs to involve the tweets information
regarding the popularity over the social network (number of likes, number of tweets,
number of comments, etc...)
>2. Return a top-20 list of documents for each of the 5 queries, using word2vec + cosine
similarity.
To use the word2vec, you need to generate the tweet representation, which here is
expressed as a unique vector of the same dimension of the words, generated as the
average of the vectors representing the words included in the tweet:
Ex. “I won the election”
Having a vector (generated through word2vec) representing each word, e.g. I=v_1,
won = v_2, the = v_3, election = v_4, all of the same number of dimensions n, it is
possible to represent the text above and generating a unique representation, by
averaging v_1, v_2, v_3, v_4. The result will be a new vector v, of the same dimension
n representing the text “I won the election”. Since it’s a tweet in our case we will talk
about tweet2vec.
>3. Can you imagine a better representation than word2vec? Justify your answer.
(**HINT** - what about Doc2vec? Sentence2vec? Which are the pros and cons?)

## Setup

#### Mount Drive

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


#### Import packages

In [10]:
import nltk
nltk.download('stopwords')
nltk.download('punkt')
from collections import defaultdict
from array import array
from nltk.stem import PorterStemmer
from nltk.corpus import stopwords
import collections
import json
import re
from tabulate import tabulate
stemmer = nltk.stem.SnowballStemmer('english')
stopwords = set(stopwords.words('english'))
# Packets needed for lab 3
import math
import numpy as np
import pandas as pd
import collections
from numpy import linalg as la
from gensim.models.word2vec import Word2Vec

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


#### Load JSON Data

In [7]:
# Load file path
file_name = '/content/drive/Shareddrives/IRWA/PROJECT/data/tw_hurricane_data.json'
# Use json.loads function with list comprehension to obtain all the tweets
lines = [json.loads(line) for line in open(file_name,'r')]
# Print first tweet for checking purposes
print(lines[0]['entities']['media'][0]['url'])

https://t.co/VROTxNS9rz


In [8]:
# Print total number of tweets
print("Total number of Tweets: {}".format(len(lines)))

Total number of Tweets: 4000


#### Load Map Data

In [9]:
id_map= pd.read_csv("/content/drive/Shareddrives/IRWA/PROJECT/data/tweet_document_ids_map.csv", sep='\t', 
                    engine='python', names = ["doc_id","tweet_id"])
id_map.head()

Unnamed: 0,doc_id,tweet_id
0,doc_1,1575918182698979328
1,doc_2,1575918151862304768
2,doc_3,1575918140839673873
3,doc_4,1575918135009738752
4,doc_5,1575918119251419136


## Preprocess

In [11]:
class Tweet:
  def __init__(self, doc_id,id, tweet, username, date, hashtags, likes, retweets, url):
    self.doc_id = doc_id
    self.id = id
    self.tweet = tweet
    self.username = username
    self.date = date
    self.hashtags = hashtags
    self.likes = likes
    self.retweets = retweets
    self.url = url
  def aslist(self):
        return [self.doc_id,self.id, self.tweet, self.username, self.date, self.hashtags, self.likes, self.retweets, self.url]
  def __iter__(self):
        return iter(self.aslist())
 
tweets = []

for i in range(len(lines)):

    hashtags = []
    url = ""
    doc_id = id_map.loc[id_map['tweet_id'] == lines[i]['id'], 'doc_id'].iloc[0]

    if 'media' in lines[i]['entities']:
      url = lines[i]['entities']['media'][0]['url']

    for j in range(len(lines[i]['entities']['hashtags'])):
      hashtags.append(lines[i]['entities']['hashtags'][j].get('text'))

    tweets.append(Tweet(doc_id,
                        lines[i]['id'], 
                        lines[i]['full_text'], 
                        lines[i]['user']['screen_name'], 
                        lines[i]['created_at'], 
                        hashtags, 
                        lines[i]['favorite_count'], 
                        lines[i]['retweet_count'], 
                        url))

In [12]:
# Remove white spaces
def remove_white_space(text):
    return ' '.join(text.split())
# Remove stopwords
def remove_stopwords(words):
    return [w for w in words if w.lower() not in stopwords]
# Remove emojis
def remove_emojis(data):
    emoj = re.compile("["
        u"\U0001F600-\U0001F64F"  # emoticons
        u"\U0001F300-\U0001F5FF"  # symbols & pictographs
        u"\U0001F680-\U0001F6FF"  # transport & map symbols
        u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
        u"\U00002500-\U00002BEF"  # chinese char
        u"\U00002702-\U000027B0"
        u"\U00002702-\U000027B0"
        u"\U000024C2-\U0001F251"
        u"\U0001f926-\U0001f937"
        u"\U00010000-\U0010ffff"
        u"\u2640-\u2642" 
        u"\u2600-\u2B55"
        u"\u200d"
        u"\u23cf"
        u"\u23e9"
        u"\u231a"
        u"\ufe0f"  # dingbats
        u"\u3030"
                      "]+", re.UNICODE)
    return emoj.sub(r'', data)
# Remove punctuation and hashtags
def remove_punctuation(data):
    return re.sub(r'[^\w\s]', '', data)
# Remove numbers
def remove_numbers(data):
    return re.sub(r'[0-9]', '', data)
# Remove https
def remove_https(words):
  return [w for w in words if not w.startswith("https") ]

# Preprocess text
def preprocess(text):
    text = text.replace('\\n', '')
    text = remove_emojis(text)
    text = remove_punctuation(text)
    text = remove_numbers(text)
    text = remove_white_space(text)
    words = nltk.tokenize.word_tokenize(text)
    words = [stemmer.stem(word) for word in words]
    words = remove_stopwords(words)
    words = remove_https(words)
    return words


## Ranking
In this section are going to implement TD-IDF + cosine similarity, BM25 and OwnScore + cosine similarity ranking methods 

### Index Creation Function


In [13]:
# Create index function
def create_global_index(tweets,num_tweets):
    index = defaultdict(list)
    tweet_index = {}  # dictionary to map tweet id with index in tweets list
    counter = 0 # keep track of index inside tweets
    tf = defaultdict(list)  # term frequencies of terms in documents (documents in the same order as in the main index)
    df = defaultdict(int)  # document frequencies of terms in the corpus
    idf = defaultdict(float) # inverse document frequency for each term
    tweet_features = defaultdict(list) #length likes retweets of each tweet (in this way we don't need to pass tweets in rank docs function)

    avg_tweet_length = 0 #store average length of tweets
    avg_tweet_likes = 0 #store average likes of tweets
    avg_tweet_rt = 0 #store average rt of tweets

    for t in tweets:  # for all tweets

        tweet_id = t.id
        terms = preprocess(t.tweet) #preprocess tweet and return list of terms
        tweet_index[tweet_id] = counter # Save original tweets position with tweet id to recover all the information
        current_page_index = {}

        tweet_features[tweet_id] = [len(terms),t.likes, t.retweets] #Store useful features
        avg_tweet_length += len(terms) # sum length of actual tweet
        avg_tweet_likes += t.likes # sum likes of actual tweet
        avg_tweet_rt += t.retweets # sum rt of actual tweet

        counter = counter + 1 # Move to next tweets position

        for position, term in enumerate(terms):  
            try:
                # if the term is already in the dict append the position to the corresponding list
                current_page_index[term][1].append(position) 
            except:
                # Add the new term as dict key and initialize the array of positions and add the position
                current_page_index[term] = [tweet_id, array('I', [position])]  #'I' indicates unsigned int (int in Python)

        # normalize term frequencies
        # Compute the denominator to normalize term frequencies (formula 2 above)
        # norm is the same for all terms of a document.
        norm = 0
        for term, posting in current_page_index.items():
            # posting will contain the list of positions for current term in current document. 
            # posting ==> [current_doc, [list of positions]] 
            # you can use it to infer the frequency of current term.
            norm += len(posting[1]) ** 2
        norm = math.sqrt(norm)

        #calculate the tf(dividing the term frequency by the above computed norm) and df weights
        for term, posting in current_page_index.items():
            # append the tf for current term (tf = term frequency in current doc/norm)
            tf[term].append(np.round(len(posting[1]) / norm, 4)) ## SEE formula (1) above
            #increment the document frequency of current term (number of documents containing the current term)
            df[term] += 1 # increment DF for current term
            
        # Compute IDF
        for term in df:
            idf[term] = np.round(np.log(float(num_tweets / df[term])), 4)

        #merge the current page index with the main index
        for term_page, posting_page in current_page_index.items():
            index[term_page].append(posting_page)

    avg_tweet_length = avg_tweet_length/num_tweets #Average length of all tweets
    avg_tweet_likes = avg_tweet_likes/num_tweets #Average likes of all tweets
    avg_tweet_rt = avg_tweet_rt/num_tweets #Average rt of all tweets

    return index, tf, df, idf, tweet_index, avg_tweet_length, avg_tweet_likes, avg_tweet_rt, tweet_features


In [14]:
# Create tfidf index with all the tweets WARNING: THIS PROCESS MAY TAKE LONG TIME (we run it for 4 min)
num_tweets= len(tweets)
index, tf, df, idf, tweet_index, avg_tweet_length, avg_tweet_likes, avg_tweet_rt, tweet_features = create_global_index(tweets, num_tweets)

### TF-IDF + Cosine Similarity Rank


In [15]:
def rank_tfidf_documents(terms, docs, index, idf, tf, title_index):
  
    # I'm interested only on the element of the docVector corresponding to the query terms 
    # The remaining elements would became 0 when multiplied to the query_vector
    doc_vectors = defaultdict(lambda: [0] * len(terms)) # I call doc_vectors[k] for a nonexistent key k, the key-value pair (k,[0]*len(terms)) will be automatically added to the dictionary
    query_vector = [0] * len(terms)

    # compute the norm for the query tf
    query_terms_count = collections.Counter(terms)  # get the frequency of each term in the query. 

    query_norm = la.norm(list(query_terms_count.values()))

    for termIndex, term in enumerate(terms):  #termIndex is the index of the term in the query
        if term not in index:
            continue
        # query_vector[termIndex]=idf[term]  # original
        ## Compute tf*idf(normalize TF as done with documents)
        query_vector[termIndex] = query_terms_count[term] / query_norm * idf[term]

        # Generate doc_vectors for matching docs
        for doc_index, (doc, postings) in enumerate(index[term]):          
            if doc in docs:
                doc_vectors[doc][termIndex] = tf[term][doc_index] * idf[term]  

    # Calculate the score of each doc 
    # compute the cosine similarity between queyVector and each docVector:
    
    doc_scores = [[np.dot(curDocVec, query_vector), doc] for doc, curDocVec in doc_vectors.items()]
    doc_scores.sort(reverse=True)
    result_docs = [x[1] for x in doc_scores]
    result_rank = [x[0] for x in doc_scores] #get rank value
    #print document titles instead if document id's
    #result_docs=[ title_index[x] for x in result_docs ]
    if len(result_docs) == 0:
        print("No results found, try again")
        query = input()
        docs = search_tweets(query, index,0)
    return result_docs, result_rank

### BM25 Rank

In [16]:
def rank_BM25_documents(terms, docs, index, idf, tf, title_index, avg_tweet_length, k1 = 1.2, b = 0.75):
  
    # I'm interested only on the element of the docVector corresponding to the query terms 
    # The remaining elements would became 0 when multiplied to the query_vector
    doc_vectors = defaultdict(lambda: [0] * len(terms)) # I call doc_vectors[k] for a nonexistent key k, the key-value pair (k,[0]*len(terms)) will be automatically added to the dictionary

    for termIndex, term in enumerate(terms):  #termIndex is the index of the term in the query
        if term not in index:
            continue
 
        # Generate BM25 Score
        for doc_index, (doc, postings) in enumerate(index[term]):          
            if doc in docs:
              doc_vectors[doc][termIndex] = (tf[term][doc_index] * idf[term] * (k1 + 1)) / k1*((1-b)+b*(tweet_features[doc][0]/avg_tweet_length)) + tf[term][doc_index]
              
              
    # Calculate the score of each doc 
    # compute the sum between BM25 scores for each query term
    
    doc_scores = [[np.sum(curDocVec), doc] for doc, curDocVec in doc_vectors.items()]
    doc_scores.sort(reverse=True)
    result_docs = [x[1] for x in doc_scores]
    result_rank = [x[0] for x in doc_scores] #get rank value
    
    if len(result_docs) == 0:
        print("No results found, try again")
        query = input()
        docs = search_tweets(query, index, 1)
    return result_docs, result_rank

### Own-score + Cosine Similarity Rank
We are gonna use a variation of TF-IDF combined with likes and retweets relevance factor in order to scale rank according to the tweet popularity. We decided to not use comments since each user can comment a number of uncertain times in a tweet which could lead to biased results.

In [17]:
def popularity_score (cos_similarity, doc, avg_tweet_likes, avg_tweet_rt, tweet_features, like_param = 1, rt_param = 1):
  #like/rt param to assign weights to like or rt
  like_factor = like_param*(math.log2((tweet_features[doc][1]/ avg_tweet_likes)+1))
  rt_factor = rt_param*(math.log2((tweet_features[doc][2]/ avg_tweet_rt)+1))

  popularity_score = cos_similarity * (like_factor + rt_factor + 1)
  return popularity_score

In [18]:
def rank_ownscore_documents(terms, docs, index, idf, tf, avg_tweet_likes, avg_tweet_rt, tweet_features):
  
    # I'm interested only on the element of the docVector corresponding to the query terms 
    # The remaining elements would became 0 when multiplied to the query_vector
    doc_vectors = defaultdict(lambda: [0] * len(terms)) # I call doc_vectors[k] for a nonexistent key k, the key-value pair (k,[0]*len(terms)) will be automatically added to the dictionary
    query_vector = [0] * len(terms)

    # compute the norm for the query tf
    query_terms_count = collections.Counter(terms)  # get the frequency of each term in the query. 

    query_norm = la.norm(list(query_terms_count.values()))

    for termIndex, term in enumerate(terms):  #termIndex is the index of the term in the query
        if term not in index:
            continue
        # query_vector[termIndex]=idf[term]  # original
        ## Compute tf*idf(normalize TF as done with documents)
        query_vector[termIndex] = query_terms_count[term] / query_norm * idf[term]

        # Generate doc_vectors for matching docs
        for doc_index, (doc, postings) in enumerate(index[term]):          
            if doc in docs:
                doc_vectors[doc][termIndex] = tf[term][doc_index] * idf[term]  

    # Calculate the score of each doc 
    # compute the cosine similarity between queyVector and each docVector:
    
    doc_scores = [[popularity_score(np.dot(curDocVec, query_vector),doc,avg_tweet_likes,avg_tweet_rt,tweet_features), doc] for doc, curDocVec in doc_vectors.items()]
    doc_scores.sort(reverse=True)
    result_docs = [x[1] for x in doc_scores]
    result_rank = [x[0] for x in doc_scores] #get rank value
    #print document titles instead if document id's
    #result_docs=[ title_index[x] for x in result_docs ]
    if len(result_docs) == 0:
        print("No results found, try again")
        query = input()
        docs = search_tweets(query, index,2)
    return result_docs, result_rank

### Search Function + Results

In [19]:
def search_tweets(query, index, rank_method=0):
    query = preprocess(query)#create list of query terms (each term is preprocessed to match terms in index)
    docs = set()
    for term in query:
        try:
            # store in term_docs the ids of the docs that contain "term"                        
            term_docs = [posting[0] for posting in index[term]]
            
            # docs = docs Union term_docs
            docs |= set(term_docs)
        except:
            #term is not in index
            pass
    docs = list(docs)

    if rank_method == 0:
        ranked_docs, ranked_score = rank_tfidf_documents(query, docs, index, idf, tf, tweet_index)#TFIDF

    elif rank_method == 1:
        ranked_docs, ranked_score = rank_BM25_documents(query, docs, index, idf, tf, tweet_index,avg_tweet_length)#BM25

    else:
        ranked_docs, ranked_score = rank_ownscore_documents(query, docs, index, idf, tf, avg_tweet_likes, avg_tweet_rt, tweet_features)#OwnScore

    return ranked_docs, ranked_score

In [20]:
# Define query to visualize - top 20 ranked tweets displayed
# example used in our report: "Hurricane Ian in Florida"
def query_visualizer(rank_method,top=20):
  print("Insert your query (i.e.: 'hurricane ian'):\n")
  query = input()
  ranked_docs, ranked_score = search_tweets(query, index, rank_method)
  visualization_tweets = []
  #create table headers
  headers = ['DOC_ID','ID','TWEET','USERNAME','DATE','HASHTAGS','LIKES', 'RETWEETS', 'URL']
  print("\n======================\nSample of {} results out of {} for the searched query:\n".format(top, len(ranked_docs)))
  #create table of tweets for each match
  for d_id in ranked_docs[:top]:
      t = tweet_index[d_id]
      visualization_tweets.append(tweets[t])
  #print ranked score
  print("Ranked Scores:", ranked_score[:top])
  #print table
  print(tabulate(visualization_tweets, headers=headers, tablefmt='grid'))

In [None]:
# Use function query_visualizer with params: 0-TFIDF + cosine similarity 1-BM25  2-OwnScore + cosine similarity
# QUERY VISUALIZER FOR TF-IDF + COSINE SIMILARITY
query_visualizer(0)

Insert your query (i.e.: 'hurricane ian'):

Hurricane Ian in Florida

Sample of 20 results out of 1652 for the searched query:

Ranked Scores: [1.8258286378520352, 1.789552785189727, 1.7459817961083353, 1.7346067117505881, 1.691911179067111, 1.691911179067111, 1.6918745404611746, 1.6684997189010455, 1.6549087220364438, 1.6410892170707685, 1.6337945885507572, 1.6216500200380557, 1.6050197167887936, 1.6050197167887936, 1.569714508761535, 1.569714508761535, 1.566258059943268, 1.535680600009897, 1.5351564739582635, 1.5351564739582635]
+----------+---------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+--------------------------------+---------------------------------------------------------------------------------

In [None]:
# Use function query_visualizer with params: 0-TFIDF + cosine similarity 1-BM25  2-OwnScore + cosine similarity
# QUERY VISUALIZER FOR BM25
query_visualizer(1)

Insert your query (i.e.: 'hurricane ian'):

Hurricane Ian in Florida

Sample of 20 results out of 1652 for the searched query:

Ranked Scores: [7.221504364275249, 7.056026939215071, 6.365508096282052, 5.017353855754623, 4.773291475854702, 4.748736619365952, 4.748736619365952, 4.656271401098901, 4.6052996703296705, 4.581951360650837, 4.579611019792431, 4.563808592369833, 4.562947685099207, 4.562947685099207, 4.562947685099207, 4.559576023321997, 4.538182970861025, 4.538182970861025, 4.487052152486482, 4.4844957618131875]
+----------+---------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+--------------------------------+--------------------------------------------------------------------------------------------

In [21]:
# Use function query_visualizer with params: 0-TFIDF + cosine similarity 1-BM25  2-OwnScore + cosine similarity
# QUERY VISUALIZER FOR OWNSCORE + COSINE SIMILARITY
query_visualizer(2)

Insert your query (i.e.: 'hurricane ian'):

Hurricane Ian in Florida

Sample of 20 results out of 1652 for the searched query:

Ranked Scores: [14.106781309772229, 9.555826487034496, 8.691054464657139, 8.432189617124324, 7.634858492546442, 6.926221731147095, 6.812073145799741, 6.219436157220656, 6.051752620255814, 6.040894998289596, 5.930623857082698, 5.7296037551199674, 5.571924715046872, 5.49046656272171, 5.4372456089414944, 5.2169895443875856, 5.216937976992326, 5.120850375565973, 5.053470400543737, 4.897441903320323]
+----------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+--------------------------------+----------------------------------------------------------------------------------

## Word2vec



In [22]:
tokenized_tweets = []
for t in tweets:
  terms = preprocess(t.tweet)
  tokenized_tweets.append(terms)

model = Word2Vec(sentences=tokenized_tweets, window=5, min_count=1, workers=4)
model.save("word2vec.model")




In [23]:
def custom_cosine_similarity(v1, v2):
  return np.dot(v1,v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))


In [24]:
def searchrank_tweet2vec(query, index):
    query = preprocess(query)#create list of query terms (each term is preprocessed to match terms in index)
    docs = set()
    word2vec_query = 0
    for term in query:
        try:
            # store in term_docs the ids of the docs that contain "term"                        
            term_docs = [posting[0] for posting in index[term]]
            
            # docs = docs Union term_docs
            docs |= set(term_docs)

            # Store value of term
            word2vec_query += model.wv[term]
        except:
            #term is not in index
            pass

    word2vec_query = word2vec_query / len(query)
    docs = list(docs)
    
    #create doc_vectors dict to store word2vec tweets with tweet id as key
    doc_vectors = defaultdict(lambda: [0])
    
    # for each tweet, serialize data and add vector of each word in dict, then average dividing by len of terms
    for doc in docs: 
        i = tweet_index[doc]
        tweet_text = preprocess(tweets[i].tweet) 
        for term in tweet_text:       
          doc_vectors[doc] +=  model.wv[term]
        doc_vectors[doc] = doc_vectors[doc] / len(tweet_text)

    # Apply custom cosine similarity between tweet vector and query vector and sort results
    doc_scores = [[custom_cosine_similarity(word2vec_t, word2vec_query), doc] for doc, word2vec_t in doc_vectors.items()]
    doc_scores.sort(reverse=True)
    result_docs = [x[1] for x in doc_scores]
    result_rank = [x[0] for x in doc_scores] #get rank value
  
    if len(result_docs) == 0:
        print("No results found, try again")
        query = input()
        docs = search_tweets(query, index,0)
    
    return result_docs, result_rank


In [25]:
# Define query to visualize - top 20 ranked tweets displayed
# example used in our report: "Hurricane Ian in Florida"
def visualize_tweet2vec(top=20):
  print("Insert your query (i.e.: 'hurricane ian'):\n")
  query = input()
  ranked_docs, ranked_score = searchrank_tweet2vec(query, index)
  visualization_tweets = []
  #create table headers
  headers = ['DOC_ID','ID','TWEET','USERNAME','DATE','HASHTAGS','LIKES', 'RETWEETS', 'URL']
  print("\n======================\nSample of {} results out of {} for the searched query:\n".format(top, len(ranked_docs)))
  #create table of tweets for each match
  for d_id in ranked_docs[:top]:
      t = tweet_index[d_id]
      visualization_tweets.append(tweets[t])
  #print ranked score
  print("Ranked Scores:", ranked_score[:top])
  #print table
  print(tabulate(visualization_tweets, headers=headers, tablefmt='grid'))

In [26]:
visualize_tweet2vec()

Insert your query (i.e.: 'hurricane ian'):

Hurricane Ian in Florida

Sample of 20 results out of 1652 for the searched query:

Ranked Scores: [0.9999934415894935, 0.9999863350575942, 0.9999863032475308, 0.9999856716375896, 0.9999848657005996, 0.9999840388071167, 0.999983756445964, 0.9999833057768138, 0.9999828485515999, 0.9999815942571244, 0.9999795324126657, 0.9999794273834453, 0.9999789382268979, 0.9999781963441102, 0.9999778916131046, 0.9999765407122465, 0.999976464752372, 0.9999760704044498, 0.9999755782482284, 0.999975260034233]
+----------+---------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+--------------------------------+------------------------------------------------------------------------------