# Text Summarization
Document summarization is a bit different from keyphrase extraction or topic
modeling. The end result is still a document, but one that is only a few sentences
long, depending on how long we want the summary to be. This is similar to the
abstract of a research paper or an executive summary. The main objective of
automated document summarization is to produce this summary without human
input beyond running computer programs. Mathematical and statistical models
automate the task by analyzing the content and context of the documents.

There are two broad approaches to document summarization using automated
techniques. They are described as follows:
- __Extraction-based techniques:__ These methods use mathematical
and statistical concepts such as singular value decomposition (SVD) to extract some key subset of the
content from the original document such that this subset of content
contains the core information and acts as the focal point of the entire
document. This content can be words, phrases, or even sentences.
The end result from this approach is a short executive summary of a
couple of lines extracted from the original document. No new content
is generated in this technique, hence the name extraction-based.
- __Abstraction-based techniques:__ These methods are more complex
and sophisticated. They leverage language semantics to create
representations and use natural language generation (NLG)
techniques where the machine uses knowledge bases and semantic
representations to generate text on its own and create summaries
just like a human would write them. Thanks to deep learning, we can
implement these techniques easily but they require a lot of data and
compute.

We will cover extraction-based methods here, because abstraction-based methods need a lot of data and compute. You can, however, apply the seq2seq models you learnt in language translation to an appropriate dataset to build deep learning based abstractive summarizers.
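
As a rough illustration of the extraction-based idea described above, the sketch below scores each sentence by the average frequency of its words and keeps the top-scoring sentences in their original order. It is a simplified stand-in, not the method built in the rest of this notebook: the names `summarize` and `toy_text` are hypothetical, the sentence splitter is a naive regular expression, and no stopwords are removed.

```python
import re
from collections import Counter

def summarize(text, num_sentences=2):
    """Return the `num_sentences` highest-scoring sentences in document order."""
    # Naive split after ., ! or ? followed by whitespace (an assumption;
    # nltk.sent_tokenize handles abbreviations far better).
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    tokenized = [re.findall(r'[a-z]+', s.lower()) for s in sentences]
    # Word frequencies over the whole text.
    freqs = Counter(w for words in tokenized for w in words)
    # Score a sentence by the average frequency of its words.
    scores = [sum(freqs[w] for w in words) / max(len(words), 1) for words in tokenized]
    # Pick the best-scoring indices, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:num_sentences])
    return [sentences[i] for i in keep]

toy_text = ("Dragons attack towns. Dragons attack the player. "
            "Farming is a job.")
summary = summarize(toy_text, num_sentences=2)
print(summary)
```

No new text is generated: every sentence in the result is copied from the input, which is what makes the approach extraction-based.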


# Extraction-based techniques


## Install necessary dependencies

In [None]:
import nltk
import numpy as np
import pandas as pd
nltk.download('punkt')
nltk.download('stopwords')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.


True

## Get Text Document

We use the description of a very popular role-playing game (RPG) Skyrim from
Bethesda Softworks for summarization. 

In [None]:
DOCUMENT = """
The Elder Scrolls V: Skyrim is an action role-playing video game developed by Bethesda Game Studios 
and published by Bethesda Softworks. It is the fifth main installment in The Elder Scrolls series, 
following The Elder Scrolls IV: Oblivion.
The game's main story revolves around the player character's quest to defeat Alduin the World-Eater, 
a dragon who is prophesied to destroy the world. The game is set 200 years after the events of Oblivion 
and takes place in the fictional province of Skyrim. Over the course of the game, the player completes 
quests and develops the character by improving skills. The game continues the open-world tradition of 
its predecessors by allowing the player to travel anywhere in the game world at any time, and to ignore 
or postpone the main storyline indefinitely.
The team opted for a unique and more diverse open world than Oblivion's Imperial Province of Cyrodiil, 
which game director and executive producer Todd Howard considered less interesting by comparison. 
The game was released to critical acclaim, with reviewers particularly mentioning the character advancement 
and setting, and is considered to be one of the greatest video games of all time.


The Elder Scrolls V: Skyrim is an action role-playing game, playable from either a first or 
third-person perspective. The player may freely roam over the land of Skyrim which is an open world 
environment consisting of wilderness expanses, dungeons, cities, towns, fortresses, and villages. 
Players may navigate the game world more quickly by riding horses or by utilizing a fast-travel system 
which allows them to warp to previously discovered locations. The game's main quest can be completed or 
ignored at the player's preference after the first stage of the quest is finished. However, some quests 
rely on the main storyline being at least partially completed. Non-player characters (NPCs) populate the 
world and can be interacted with in a number of ways: the player may engage them in conversation, 
marry an eligible NPC, kill them or engage in a nonlethal "brawl". The player may 
choose to join factions which are organized groups of NPCs — for example, the Dark Brotherhood, a band 
of assassins. Each of the factions has an associated quest path to progress through. Each city and town 
in the game world has jobs that the player can engage in, such as farming.

Players have the option to develop their character. At the beginning of the game, players create 
their character by selecting their sex and choosing between one of several races including humans, 
orcs, elves, and anthropomorphic cat or lizard-like creatures and then customizing their character's 
appearance. Over the course of the game, players improve their character's skills which are numerical 
representations of their ability in certain areas. There are eighteen skills divided evenly among the 
three schools of combat, magic, and stealth. When players have trained skills enough to meet the 
required experience, their character levels up. Health is depleted primarily when the player 
takes damage and the loss of all health results in death. Magicka is depleted by the use of spells, 
certain poisons and by being struck by lightning-based attacks. Stamina determines the player's 
effectiveness in combat and is depleted by sprinting, performing heavy "power attacks" 
and being struck by frost-based attacks. Skyrim is the first entry in The Elder Scrolls to 
include dragons in the game's wilderness. Like other creatures, dragons are generated randomly in 
the world and will engage in combat with NPCs, creatures and the player. Some dragons may attack 
cities and towns when in their proximity. The player character can absorb the souls of dragons 
in order to use powerful spells called "dragon shouts" or "Thu'um". A regeneration 
period limits the player's use of shouts in gameplay.

Skyrim is set around 200 years after the events of The Elder Scrolls IV: Oblivion, although it is 
not a direct sequel. The game takes place in Skyrim, a province of the Empire on the continent of 
Tamriel, amid a civil war between two factions: the Stormcloaks, led by Ulfric Stormcloak, and the 
Imperial Legion, led by General Tullius. The player character is a Dragonborn, a mortal born with 
the soul and power of a dragon. Alduin, a large black dragon who returns to the land after being 
lost in time, serves as the game's primary antagonist. Alduin is the first dragon created by Akatosh, 
one of the series' gods, and is prophesied to destroy and consume the world.
"""

In [None]:
import re
DOCUMENT = re.sub(r'\n|\r', ' ', DOCUMENT) #Combining all the paragraphs
DOCUMENT = re.sub(r' +', ' ', DOCUMENT)
DOCUMENT = DOCUMENT.strip()

In [None]:
print(DOCUMENT)

The Elder Scrolls V: Skyrim is an action role-playing video game developed by Bethesda Game Studios and published by Bethesda Softworks. It is the fifth main installment in The Elder Scrolls series, following The Elder Scrolls IV: Oblivion. The game's main story revolves around the player character's quest to defeat Alduin the World-Eater, a dragon who is prophesied to destroy the world. The game is set 200 years after the events of Oblivion and takes place in the fictional province of Skyrim. Over the course of the game, the player completes quests and develops the character by improving skills. The game continues the open-world tradition of its predecessors by allowing the player to travel anywhere in the game world at any time, and to ignore or postpone the main storyline indefinitely. The team opted for a unique and more diverse open world than Oblivion's Imperial Province of Cyrodiil, which game director and executive producer Todd Howard considered less interesting by comparison.

## Sentences Collection

In [None]:
sentences = nltk.sent_tokenize(DOCUMENT)
print(sentences)
print("No. of sentences:",len(sentences))

['The Elder Scrolls V: Skyrim is an action role-playing video game developed by Bethesda Game Studios and published by Bethesda Softworks.', 'It is the fifth main installment in The Elder Scrolls series, following The Elder Scrolls IV: Oblivion.', "The game's main story revolves around the player character's quest to defeat Alduin the World-Eater, a dragon who is prophesied to destroy the world.", 'The game is set 200 years after the events of Oblivion and takes place in the fictional province of Skyrim.', 'Over the course of the game, the player completes quests and develops the character by improving skills.', 'The game continues the open-world tradition of its predecessors by allowing the player to travel anywhere in the game world at any time, and to ignore or postpone the main storyline indefinitely.', "The team opted for a unique and more diverse open world than Oblivion's Imperial Province of Cyrodiil, which game director and executive producer Todd Howard considered less intere

## Basic Text pre-processing

In [None]:
stop_words = nltk.corpus.stopwords.words('english')

def normalize_document(doc):
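    # remove special characters and digits; flags must be passed by keyword,
    # because the fourth positional argument of re.sub is `count`
    doc = re.sub(r'[^a-zA-Z\s]', '', doc, flags=re.I|re.A)
    # lower case and strip surrounding whitespace
    doc = doc.lower()
    doc = doc.strip()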
    # tokenize document
    tokens = nltk.word_tokenize(doc)
    # filter stopwords out of document
    filtered_tokens = [token for token in tokens if token not in stop_words]
    # re-create document from filtered tokens
    doc = ' '.join(filtered_tokens)
    return doc

normalize_corpus = np.vectorize(normalize_document)  # apply normalize_document to every sentence
norm_sentences = normalize_corpus(sentences)
norm_sentences[:3]

array(['elder scrolls v skyrim action roleplaying video game developed bethesda game studios published bethesda softworks',
       'fifth main installment elder scrolls series following elder scrolls iv oblivion',
       'games main story revolves around player characters quest defeat alduin worldeater dragon prophesied destroy world'],
      dtype='<U183')
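
One detail worth keeping in mind with `re.sub`: its signature is `re.sub(pattern, repl, string, count=0, flags=0)`, so a flag placed in the fourth position is read as a replacement limit rather than as a flag. The small sketch below contrasts the two readings on a made-up string; `text`, `as_count` and `as_flag` are illustrative names only.

```python
import re

text = "Alduin, the World-Eater (2011)!"
pattern = r'[^a-z\s]'

# re.I has the integer value 2, so in the count slot it limits the call to
# two replacements and leaves case-insensitive matching switched off.
as_count = re.sub(pattern, '', text, count=int(re.I))

# Passed by keyword it acts as a flag, with no limit on replacements.
as_flag = re.sub(pattern, '', text, flags=re.I)

print(repr(as_count))
print(repr(as_flag))
```

In the first call only `A` and the comma are removed; in the second, letters of either case are kept and all punctuation and digits are stripped.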

In [None]:
stop_words.sort()

## *1. UNI-GRAM*

In [None]:
# Creation of unigrams
from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer(ngram_range=(1, 1))
X1 = vectorizer.fit_transform(norm_sentences)
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
unigrams = vectorizer.get_feature_names_out().tolist()
print(unigrams)
print(len(unigrams))

['ability', 'absorb', 'acclaim', 'action', 'advancement', 'akatosh', 'alduin', 'allowing', 'allows', 'although', 'amid', 'among', 'antagonist', 'anthropomorphic', 'anywhere', 'appearance', 'areas', 'around', 'assassins', 'associated', 'attack', 'attacks', 'band', 'beginning', 'bethesda', 'black', 'born', 'brawl', 'brotherhood', 'called', 'cat', 'certain', 'character', 'characters', 'choose', 'choosing', 'cities', 'city', 'civil', 'combat', 'comparison', 'completed', 'completes', 'considered', 'consisting', 'consume', 'continent', 'continues', 'conversation', 'course', 'create', 'created', 'creatures', 'critical', 'customizing', 'cyrodiil', 'damage', 'dark', 'death', 'defeat', 'depleted', 'destroy', 'determines', 'develop', 'developed', 'develops', 'direct', 'director', 'discovered', 'diverse', 'divided', 'dragon', 'dragonborn', 'dragons', 'dungeons', 'effectiveness', 'eighteen', 'either', 'elder', 'eligible', 'elves', 'empire', 'engage', 'enough', 'entry', 'environment', 'evenly', 'eve



### TF

In [None]:
print("Norm Sentences:", norm_sentences)
print("Len of Norm Sentences:", len(norm_sentences))

Norm Sentences: ['elder scrolls v skyrim action roleplaying video game developed bethesda game studios published bethesda softworks'
 'fifth main installment elder scrolls series following elder scrolls iv oblivion'
 'games main story revolves around player characters quest defeat alduin worldeater dragon prophesied destroy world'
 'game set years events oblivion takes place fictional province skyrim'
 'course game player completes quests develops character improving skills'
 'game continues openworld tradition predecessors allowing player travel anywhere game world time ignore postpone main storyline indefinitely'
 'team opted unique diverse open world oblivions imperial province cyrodiil game director executive producer todd howard considered less interesting comparison'
 'game released critical acclaim reviewers particularly mentioning character advancement setting considered one greatest video games time'
 'elder scrolls v skyrim action roleplaying game playable either first thirdp

In [None]:
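def computeTF(doc):
  valTF = []
  for each in doc:
    wordDict = dict.fromkeys(unigrams, 0)
    # drop single-character tokens (e.g. 'v'), which CountVectorizer also ignores;
    # building a new list avoids removing items from a list while iterating over it
    sentence = [word for word in each.split(" ") if len(word) > 1]

    # count how often each vocabulary term occurs in this sentence
    for word in sentence:
      wordDict[word]+=1

    res = []
    for i in wordDict:
      # NOTE: len(each) is the character length of the sentence string, so counts
      # are scaled by characters; use len(sentence) to scale by token count instead
      comp = float(wordDict[i] / len(each))
      res.append(round(comp, 4))

    valTF.append(res)
  return(valTF)

TF = computeTF(norm_sentences)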

In [None]:
TF = np.array(TF)
TF = TF.T
type(TF)

numpy.ndarray

In [None]:
df = pd.DataFrame(TF, index=unigrams)
df.sort_index(ascending=True).head(10)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,25,26,27,28,29,30,31,32,33,34
ability,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
absorb,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0111,0.0,0.0,0.0,0.0,0.0,0.0
acclaim,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0074,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
action,0.0088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0109,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
advancement,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0074,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
akatosh,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0119
alduin,0.0,0.0,0.0088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0119
allowing,0.0,0.0,0.0,0.0,0.0,0.0072,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
allows,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
although,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0127,0.0,0.0,0.0,0.0
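
The values in this table come from `computeTF`, which divides each count by `len(each)`, the character length of the sentence string. The more common definition of term frequency divides by the number of tokens in the sentence. A minimal sketch of that variant follows; `term_frequency`, `toy_sentences` and `toy_vocab` are hypothetical names and the two sentences are made up.

```python
from collections import Counter

def term_frequency(sentence, vocabulary):
    """Relative frequency of each vocabulary term within one sentence."""
    # Ignore single-character tokens, as CountVectorizer's default tokenizer does.
    tokens = [tok for tok in sentence.split() if len(tok) > 1]
    counts = Counter(tokens)
    total = len(tokens)
    return [round(counts[term] / total, 4) if total else 0.0 for term in vocabulary]

toy_sentences = ["game world game", "dragon v world"]
toy_vocab = sorted({tok for s in toy_sentences for tok in s.split() if len(tok) > 1})
toy_tf = [term_frequency(s, toy_vocab) for s in toy_sentences]

for sentence, row in zip(toy_sentences, toy_tf):
    print(sentence, dict(zip(toy_vocab, row)))
```

With this normalization the frequencies of the terms in a sentence sum to 1 (up to rounding), which makes sentences of different lengths directly comparable.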


### DF (Document Frequency/Term Presence)

In [None]:
# Term counts per sentence; after the transpose below, rows are terms and columns are sentences
DF = vectorizer.fit_transform(norm_sentences)
DF = DF.toarray()
DF = DF.T
DF

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 1, ..., 0, 0, 1],
       [0, 0, 1, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])

In [None]:
df = pd.DataFrame(DF, index=unigrams)
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,25,26,27,28,29,30,31,32,33,34
ability,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
absorb,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,0,0
acclaim,0,0,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
action,1,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,0,0
advancement,0,0,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
ways,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
wilderness,0,0,0,0,0,0,0,0,0,1,...,1,0,0,0,0,0,0,0,0,0
world,0,0,1,0,0,1,1,0,0,1,...,0,1,0,0,0,0,0,0,0,1
worldeater,0,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
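
The matrix above holds per-sentence term counts. A single document-frequency number per term can be derived from it by counting the sentences in which the term appears at least once. The sketch below does this on a small made-up matrix and then computes the smoothed inverse document frequency, `ln((1 + n) / (1 + df)) + 1`, which is the formula scikit-learn documents for `smooth_idf=True`. The names `toy_terms`, `toy_counts`, `doc_freq` and `idf` are illustrative.

```python
import math

# Toy term-by-sentence count matrix (rows = terms, columns = sentences);
# the values are made up for illustration.
toy_terms = ["dragon", "game", "world"]
toy_counts = [
    [0, 1, 2],   # dragon
    [2, 1, 1],   # game
    [1, 0, 0],   # world
]
n_sentences = len(toy_counts[0])

# Term presence: 1 if the term occurs in the sentence at all, else 0.
presence = [[1 if c > 0 else 0 for c in row] for row in toy_counts]

# Document frequency: number of sentences that contain the term.
doc_freq = {term: sum(row) for term, row in zip(toy_terms, presence)}

# Smoothed inverse document frequency: rarer terms get larger weights.
idf = {term: math.log((1 + n_sentences) / (1 + df)) + 1
       for term, df in doc_freq.items()}

print(doc_freq)
print({term: round(value, 4) for term, value in idf.items()})
```

A term that occurs in every sentence, like `game` here, receives the minimum weight of 1.0, while a term confined to one sentence is weighted highest.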


### TF-IDF

In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer

tv = TfidfVectorizer(min_df=0., max_df=1., use_idf=True)
dt_matrix = tv.fit_transform(norm_sentences)
dt_matrix = dt_matrix.toarray()
dt_matrix

array([[0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.18074668, 0.30828241,
        0.        ],
       ...,
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.19328565, 0.        ,
        0.        ]])

In [None]:
td_matrix = dt_matrix.T #Transpose Matrix
td_matrix
print(td_matrix.shape)

(270, 35)


In [None]:
pd.DataFrame(np.round(td_matrix, 2), index=unigrams)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,25,26,27,28,29,30,31,32,33,34
ability,0.00,0.0,0.00,0.00,0.0,0.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.0,0.00,0.0,0.0,0.0,0.0,0.0,0.00
absorb,0.00,0.0,0.00,0.00,0.0,0.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.0,0.31,0.0,0.0,0.0,0.0,0.0,0.00
acclaim,0.00,0.0,0.00,0.00,0.0,0.00,0.00,0.28,0.00,0.00,...,0.00,0.00,0.0,0.00,0.0,0.0,0.0,0.0,0.0,0.00
action,0.25,0.0,0.00,0.00,0.0,0.00,0.00,0.00,0.32,0.00,...,0.00,0.00,0.0,0.00,0.0,0.0,0.0,0.0,0.0,0.00
advancement,0.00,0.0,0.00,0.00,0.0,0.00,0.00,0.28,0.00,0.00,...,0.00,0.00,0.0,0.00,0.0,0.0,0.0,0.0,0.0,0.00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
ways,0.00,0.0,0.00,0.00,0.0,0.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.0,0.00,0.0,0.0,0.0,0.0,0.0,0.00
wilderness,0.00,0.0,0.00,0.00,0.0,0.00,0.00,0.00,0.00,0.24,...,0.37,0.00,0.0,0.00,0.0,0.0,0.0,0.0,0.0,0.00
world,0.00,0.0,0.18,0.00,0.0,0.16,0.14,0.00,0.00,0.16,...,0.00,0.19,0.0,0.00,0.0,0.0,0.0,0.0,0.0,0.19
worldeater,0.00,0.0,0.31,0.00,0.0,0.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.0,0.00,0.0,0.0,0.0,0.0,0.0,0.00
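
To make the numbers in this table less opaque, here is a pure-Python sketch of the weighting that `TfidfVectorizer` applies under its documented defaults (raw counts as term frequency, `smooth_idf=True`, `norm='l2'`). It is an approximation for illustration on made-up counts, not a replacement for the library; `tfidf_rows` and `toy_counts` are hypothetical names.

```python
import math

def tfidf_rows(count_rows):
    """TF-IDF for a sentence-by-term count matrix, L2-normalized per sentence."""
    n_docs = len(count_rows)
    n_terms = len(count_rows[0])
    # Document frequency of each term: in how many sentences it appears.
    df = [sum(1 for row in count_rows if row[j] > 0) for j in range(n_terms)]
    # Smoothed inverse document frequency.
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    result = []
    for row in count_rows:
        weights = [count * idf[j] for j, count in enumerate(row)]
        norm = math.sqrt(sum(w * w for w in weights))
        # Leave all-zero rows untouched to avoid dividing by zero.
        result.append([w / norm for w in weights] if norm else weights)
    return result

# Toy counts: rows = sentences, columns = terms (made-up values).
toy_counts = [
    [1, 1, 0],
    [0, 1, 1],
]
toy_tfidf = tfidf_rows(toy_counts)
for row in toy_tfidf:
    print([round(w, 4) for w in row])
```

Because each sentence vector is scaled to unit length, the weights within a sentence are comparable across sentences; the term shared by both toy sentences ends up with a lower weight than the term unique to each.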


## *2. BI-GRAM* 

In [None]:
# Forming bigrams
from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer(ngram_range=(2, 2))
bigrams_matrix = vectorizer.fit_transform(norm_sentences)
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
bigrams = vectorizer.get_feature_names_out().tolist()
print(bigrams)

['ability certain', 'absorb souls', 'acclaim reviewers', 'action roleplaying', 'advancement setting', 'akatosh one', 'alduin first', 'alduin large', 'alduin worldeater', 'allowing player', 'allows warp', 'although direct', 'amid civil', 'among three', 'anthropomorphic cat', 'anywhere game', 'around player', 'around years', 'associated quest', 'attack cities', 'attacks struck', 'band assassins', 'beginning game', 'bethesda game', 'bethesda softworks', 'black dragon', 'born soul', 'brotherhood band', 'called dragon', 'cat lizardlike', 'certain areas', 'certain poisons', 'character absorb', 'character advancement', 'character dragonborn', 'character improving', 'character levels', 'character selecting', 'characters appearance', 'characters npcs', 'characters quest', 'characters skills', 'choose join', 'choosing one', 'cities towns', 'city town', 'civil war', 'combat depleted', 'combat magic', 'combat npcs', 'completed ignored', 'completes quests', 'considered less', 'considered one', 'con
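
The entries above are pairs of adjacent tokens within a sentence. As a rough sketch of how such n-grams can be formed by hand, the helper below slides a window of size `n` over the tokens, after dropping single-character tokens in the same way as `CountVectorizer`'s default tokenizer. `word_ngrams` and `toy_sentence` are hypothetical names introduced for this illustration.

```python
def word_ngrams(sentence, n):
    """Return the space-joined word n-grams of a whitespace-tokenized sentence."""
    # Drop single-character tokens first, mirroring CountVectorizer's default.
    tokens = [tok for tok in sentence.split() if len(tok) > 1]
    # Slide a window of n tokens across the sentence.
    return [' '.join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

toy_sentence = "elder scrolls v skyrim action"
toy_bigrams = word_ngrams(toy_sentence, 2)
toy_trigrams = word_ngrams(toy_sentence, 3)
print(toy_bigrams)
print(toy_trigrams)
```

A sentence with fewer than `n` usable tokens yields an empty list, so short sentences contribute nothing to the higher-order vocabularies.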



### TF-IDF

In [None]:
# Applying TFIDF
vectorizer = TfidfVectorizer(ngram_range = (2, 2))
bigram_matrix = vectorizer.fit_transform(norm_sentences)
bigram_matrix = bigram_matrix.toarray()
bigram_matrix = bigram_matrix.T
print("\n\nScores : \n", bigram_matrix)



Scores : 
 [[0.         0.         0.         ... 0.         0.         0.        ]
 [0.         0.         0.         ... 0.         0.         0.        ]
 [0.         0.         0.         ... 0.         0.         0.        ]
 ...
 [0.         0.         0.         ... 0.         0.         0.        ]
 [0.         0.         0.27111489 ... 0.         0.         0.        ]
 [0.         0.         0.         ... 0.         0.         0.        ]]


In [None]:
pd.DataFrame(np.round(bigram_matrix, 2), index=bigrams)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,25,26,27,28,29,30,31,32,33,34
ability certain,0.00,0.0,0.00,0.00,0.0,0.00,0.00,0.00,0.0,0.0,...,0.0,0.0,0.0,0.00,0.0,0.00,0.0,0.0,0.0,0.0
absorb souls,0.00,0.0,0.00,0.00,0.0,0.00,0.00,0.00,0.0,0.0,...,0.0,0.0,0.0,0.29,0.0,0.00,0.0,0.0,0.0,0.0
acclaim reviewers,0.00,0.0,0.00,0.00,0.0,0.00,0.00,0.26,0.0,0.0,...,0.0,0.0,0.0,0.00,0.0,0.00,0.0,0.0,0.0,0.0
action roleplaying,0.26,0.0,0.00,0.00,0.0,0.00,0.00,0.00,0.3,0.0,...,0.0,0.0,0.0,0.00,0.0,0.00,0.0,0.0,0.0,0.0
advancement setting,0.00,0.0,0.00,0.00,0.0,0.00,0.00,0.26,0.0,0.0,...,0.0,0.0,0.0,0.00,0.0,0.00,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
world oblivions,0.00,0.0,0.00,0.00,0.0,0.00,0.23,0.00,0.0,0.0,...,0.0,0.0,0.0,0.00,0.0,0.00,0.0,0.0,0.0,0.0
world quickly,0.00,0.0,0.00,0.00,0.0,0.00,0.00,0.00,0.0,0.0,...,0.0,0.0,0.0,0.00,0.0,0.00,0.0,0.0,0.0,0.0
world time,0.00,0.0,0.00,0.00,0.0,0.25,0.00,0.00,0.0,0.0,...,0.0,0.0,0.0,0.00,0.0,0.00,0.0,0.0,0.0,0.0
worldeater dragon,0.00,0.0,0.27,0.00,0.0,0.00,0.00,0.00,0.0,0.0,...,0.0,0.0,0.0,0.00,0.0,0.00,0.0,0.0,0.0,0.0


## *3. TRI-GRAMS*

In [None]:
# Forming trigrams
from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer(ngram_range=(3, 3))
trigrams_matrix = vectorizer.fit_transform(norm_sentences)
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
trigrams = vectorizer.get_feature_names_out().tolist()
print(trigrams)

['ability certain areas', 'absorb souls dragons', 'acclaim reviewers particularly', 'action roleplaying game', 'action roleplaying video', 'advancement setting considered', 'akatosh one series', 'alduin first dragon', 'alduin large black', 'alduin worldeater dragon', 'allowing player travel', 'allows warp previously', 'although direct sequel', 'amid civil war', 'among three schools', 'anthropomorphic cat lizardlike', 'anywhere game world', 'around player characters', 'around years events', 'associated quest path', 'attack cities towns', 'attacks struck frostbased', 'beginning game players', 'bethesda game studios', 'black dragon returns', 'born soul power', 'brotherhood band assassins', 'called dragon shouts', 'cat lizardlike creatures', 'certain poisons struck', 'character absorb souls', 'character advancement setting', 'character dragonborn mortal', 'character improving skills', 'character selecting sex', 'characters npcs populate', 'characters quest defeat', 'characters skills numer



### TF-IDF

In [None]:
# Applying TFIDF
vectorizer = TfidfVectorizer(ngram_range = (3, 3))
trigrams_matrix = vectorizer.fit_transform(norm_sentences)
trigrams_matrix = trigrams_matrix.toarray()
trigrams_matrix = trigrams_matrix.T
print("\n\nScores : \n", trigrams_matrix)



Scores : 
 [[0.        0.        0.        ... 0.        0.        0.       ]
 [0.        0.        0.        ... 0.        0.        0.       ]
 [0.        0.        0.        ... 0.        0.        0.       ]
 ...
 [0.        0.        0.2773501 ... 0.        0.        0.       ]
 [0.        0.        0.        ... 0.        0.        0.       ]
 [0.        0.        0.        ... 0.        0.        0.       ]]


In [None]:
pd.DataFrame(np.round(trigrams_matrix, 2), index=trigrams)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,25,26,27,28,29,30,31,32,33,34
ability certain areas,0.0,0.0,0.00,0.00,0.0,0.00,0.0,0.00,0.00,0.0,...,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,0.0,0.0
absorb souls dragons,0.0,0.0,0.00,0.00,0.0,0.00,0.0,0.00,0.00,0.0,...,0.0,0.0,0.0,0.3,0.0,0.00,0.0,0.0,0.0,0.0
acclaim reviewers particularly,0.0,0.0,0.00,0.00,0.0,0.00,0.0,0.27,0.00,0.0,...,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,0.0,0.0
action roleplaying game,0.0,0.0,0.00,0.00,0.0,0.00,0.0,0.00,0.34,0.0,...,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,0.0,0.0
action roleplaying video,0.3,0.0,0.00,0.00,0.0,0.00,0.0,0.00,0.00,0.0,...,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
world quickly riding,0.0,0.0,0.00,0.00,0.0,0.00,0.0,0.00,0.00,0.0,...,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,0.0,0.0
world time ignore,0.0,0.0,0.00,0.00,0.0,0.26,0.0,0.00,0.00,0.0,...,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,0.0,0.0
worldeater dragon prophesied,0.0,0.0,0.28,0.00,0.0,0.00,0.0,0.00,0.00,0.0,...,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,0.0,0.0
years events elder,0.0,0.0,0.00,0.00,0.0,0.00,0.0,0.00,0.00,0.0,...,0.0,0.0,0.0,0.0,0.0,0.32,0.0,0.0,0.0,0.0


## N-Gram