
Text Summarization
==================


Shows how to summarize text by extracting the most important sentences from it.

This module automatically summarizes the given text by extracting one or more important sentences from the text. Similarly, it can also extract keywords. 

This summarizer is based on the "TextRank" algorithm. 



**Extractive summarization** involves selecting and combining crucial sentences, phrases, or words from an original text to create a shorter version. Generally, the extracted information remains unchanged from the input: the relative order of sentences may change, but every sentence is taken from the input.

In [35]:
from pprint import pprint as print
from gensim.summarization import summarize

`gensim` is a robust, efficient open-source Python library designed for Natural Language Processing (NLP) tasks such as topic modeling, document indexing, and similarity retrieval over corpora.

`gensim.summarization.summarize` is the function the `gensim` library provides for summarizing text. It uses a modified version of the TextRank algorithm to select the important sentences from the input text and build a summary that contains the most important information. Note that the `gensim.summarization` module was removed in Gensim 4.0, so the examples here require a 3.x release.

**TextRank** is an unsupervised, graph-based ranking algorithm used for Natural Language Processing tasks such as extracting keywords from text or generating summaries. It works by treating the text as a graph in which the nodes are words or sentences and the edges indicate how the nodes relate to each other. TextRank then applies PageRank to find the most important nodes, which are used to generate summaries or highlight important sentences in the text.
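To make the graph idea concrete, here is a minimal pure-Python sketch of sentence ranking in the spirit of TextRank. It is an illustration under stated assumptions, not gensim's implementation: the helper names (`split_sentences`, `similarity`, `rank_sentences`, `extract_summary`), the naive sentence splitter, and the word-overlap similarity are all choices made for this example.

```python
import math
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Shared words, normalized by the log of each sentence's vocabulary size."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(words_a & words_b) / (math.log(len(words_a)) + math.log(len(words_b)))


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with PageRank over the weighted similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    # weights[i][j] is the edge weight between sentence i and sentence j.
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence receives a share of its neighbours' scores.
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def extract_summary(text, top_k=1):
    """Return the top_k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    scores = rank_sentences(sentences)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]
```

Calling `extract_summary(some_text, top_k=2)` returns the two sentences that share the most vocabulary with the rest of the text. A sentence with no words in common with any other sentence only ever receives the baseline score.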

Let's consider this example for better understanding.

Extractive summarization is a text summarization technique based on identifying and separating the primary sentences or phrases in the source text to create summary. The extractive summarization systems employ statistical algorithms and linguistic analysis to assess word frequency, sentence position, and keyword occurrence to gauge the importance of each type of textual input. The prioritized sentences are then placed together to develop a brief, information summary. The primary benefit of extractive summarization is its simplicity and the ability for computational deployment. Additionally, the process is relatively straight forward, as the summary is based on the pre-existing text and its extraction. However, in the operational mode, the summaries may lose interpersonal aspects and lack a wholistic context.


In [36]:
# text = (
#     "Extractive summarization is a text summarization technique based on identifying and separating the primary sentences or phrases in the source text to create summary. The extractive summarization systems employ statistical algorithms and linguistic analysis to assess word frequency, sentence position, and keyword occurrence to gauge the importance of each type of textual input. The prioritized sentences are then placed together to develop a brief, information summary. The primary benefit of extractive summarization is its simplicity and the ability for computational deployment. Additionally, the process is relatively straight forward, as the summary is based on the pre-existing text and its extraction. However, in the operational mode, the summaries may lose interpersonal aspects and lack a wholistic context."
# )
text = input("Enter text: ")
print(text)

('Extractive summarization is a text summarization technique based on '
 'identifying and separating the primary sentences or phrases in the source '
 'text to create summary. The extractive summarization systems employ '
 'statistical algorithms and linguistic analysis to assess word frequency, '
 'sentence position, and keyword occurrence to gauge the importance of each '
 'type of textual input. The prioritized sentences are then placed together to '
 'develop a brief, information summary. The primary benefit of extractive '
 'summarization is its simplicity and the ability for computational '
 'deployment. Additionally, the process is relatively straight forward, as the '
 'summary is based on the pre-existing text and its extraction. However, in '
 'the operational mode, the summaries may lose interpersonal aspects and lack '
 'a wholistic context.')


When we pass the input text (a string) to the `summarize` function, it processes the text and returns a summary of it.

In [37]:
print(summarize(text))

('Extractive summarization is a text summarization technique based on '
 'identifying and separating the primary sentences or phrases in the source '
 'text to create summary.')


Because this is extractive summarization, the output is not newly written text: it is simply the part of the input that is most significant among all the sentences.

In [38]:
print(summarize(text, ratio=0.6))

('Extractive summarization is a text summarization technique based on '
 'identifying and separating the primary sentences or phrases in the source '
 'text to create summary.\n'
 'The primary benefit of extractive summarization is its simplicity and the '
 'ability for computational deployment.\n'
 'Additionally, the process is relatively straight forward, as the summary is '
 'based on the pre-existing text and its extraction.')


In [39]:
print(summarize(text, split=True))

['Extractive summarization is a text summarization technique based on '
 'identifying and separating the primary sentences or phrases in the source '
 'text to create summary.']


In [40]:
print(summarize(text, word_count=50))

('Extractive summarization is a text summarization technique based on '
 'identifying and separating the primary sentences or phrases in the source '
 'text to create summary.\n'
 'Additionally, the process is relatively straight forward, as the summary is '
 'based on the pre-existing text and its extraction.')


Each call above passes `summarize` two arguments: the input text and one of `ratio`, `split`, or `word_count`. Each of these parameters has its own purpose.

*ratio* controls the length of the summary as a fraction of the input text. It is a number between 0 and 1.

*split*, when set to True, returns the summary as a list of sentences instead of a single newline-separated string.

*word_count* specifies the maximum number of words in the summary. If both `ratio` and `word_count` are given, `ratio` is ignored.
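The way these three parameters shape the result can be sketched with a small stand-alone helper. This is a hypothetical illustration of the semantics described above, not gensim's own selection code. The name `select_sentences` and the exact rounding and cut-off rules are assumptions, and the sentence scores are taken as given.

```python
def select_sentences(sentences, scores, ratio=0.2, word_count=None, split=False):
    """Pick sentences by score, then return them in document order."""
    # Rank sentence indices from most to least important.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)

    if word_count is None:
        # ratio: keep this fraction of the sentences (assumed: at least one).
        keep = ranked[: max(1, int(len(sentences) * ratio))]
    else:
        # word_count: add sentences in rank order while the word budget allows.
        keep, used = [], 0
        for i in ranked:
            n_words = len(sentences[i].split())
            if used + n_words > word_count:
                break
            keep.append(i)
            used += n_words

    chosen = [sentences[i] for i in sorted(keep)]
    # split=True returns a list; otherwise one newline-separated string.
    return chosen if split else "\n".join(chosen)
```

With four sentences and `ratio=0.5`, the helper keeps the two best-scoring sentences. Passing `word_count` switches to the word budget and ignores `ratio`.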


Keyword Identification
======================

As mentioned earlier, this module also supports keyword extraction. Keyword extraction works in the same way as summary generation (i.e. sentence extraction), in that the algorithm tries to find words that are important or seem to be representative of the text as a whole.

In [41]:
from gensim.summarization import keywords

The `gensim.summarization.keywords` function identifies and extracts the most important words and phrases from the text.

Like `summarize`, `keywords` is based on a variant of TextRank: it builds a graph whose nodes are the words of the text and returns the highest-ranked ones as keywords.
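As a rough illustration of graph-based keyword ranking, the sketch below links words that occur next to each other and scores them with PageRank. It is a simplified example under stated assumptions, not what `gensim` does internally: the function name `keyword_scores`, the tiny stopword list, and the unweighted co-occurrence graph are all choices made here.

```python
import re
from collections import defaultdict

# A deliberately tiny stopword list, assumed for this example only.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on", "for", "its", "or"}


def keyword_scores(text, window=2, damping=0.85, iterations=50):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

    # Link every pair of distinct words that fall inside the same window
    # of `window` consecutive words.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        # Each word splits its score evenly among its neighbours.
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

`keyword_scores(text)[:10]` gives the ten highest-ranked words with their scores. Words that co-occur with many different words end up near the top, which is the intuition behind the keyword lists in this section.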

In [42]:
print(keywords(text))

('sentences\n'
 'sentence\n'
 'summarization\n'
 'text\n'
 'straight\n'
 'interpersonal\n'
 'deployment\n'
 'summary\n'
 'summaries\n'
 'information')


In [43]:
print(keywords(text, lemmatize=True))

('sentence\n'
 'summarization\n'
 'text\n'
 'interpersonal\n'
 'deployment\n'
 'straight\n'
 'information\n'
 'summaries')


In [44]:
print(keywords(text, split=True))

['sentences',
 'sentence',
 'summarization',
 'text',
 'straight',
 'interpersonal',
 'deployment',
 'summary',
 'information',
 'summaries']


In [45]:
print(keywords(text, ratio=0.4))

('summarization\n'
 'primary sentences\n'
 'frequency sentence\n'
 'text\n'
 'interpersonal\n'
 'straight\n'
 'deployment\n'
 'summary\n'
 'summaries\n'
 'information\n'
 'word\n'
 'technique\n'
 'systems employ statistical\n'
 'analysis')


Similarly, each `keywords` call above takes two arguments: the input text and one of `lemmatize`, `split`, or `ratio`.

*lemmatize*, when set to True, lemmatizes the keywords. (Lemmatization reduces a word to its root form.) In the output above, variants such as `sentences` and `sentence` collapse into a single entry.

*split*, when set to True, returns the keywords as a list.

*ratio* specifies the number of keywords relative to the number of words in the input text.


Let's take a larger example, with a **text file** as input data.

In [46]:
with open('avatar.txt', 'r') as file:
    text1 = file.read()

print(text1)

("In 2154, humans have depleted Earth's natural resources, leading to a severe "
 'energy crisis. The Resources Development Administration (RDA) mines a '
 'valuable mineral Unobtanium on Pandora, a densely forested habitable moon '
 'orbiting Polyphemus, a fictional gas giant in the Alpha Centauri star '
 'system. Pandora, whose atmosphere is poisonous to humans, is inhabited by '
 "the Na'Vi, a species of 10-foot tall (3.0 m), blue-skinned, sapient "
 'humanoids that live in harmony with nature and worship a mother goddess '
 'named Eywa. It takes 6 years to get from Earth to Pandora in cryogenic '
 'sleep.\n'
 '\n'
 "To explore Pandora's biosphere, scientists use Na'Vi-human hybrids (grown "
 'from human + native DNA) called "avatars", operated by genetically matched '
 'humans. Jake Sully (Sam Worthington), a paraplegic former Marine, replaces '
 'his deceased identical twin brother as an operator of one. Jake was leading '
 'a purposeless life on Earth and was contacted by RDA whe

In [47]:
print(summarize(text1))

("To explore Pandora's biosphere, scientists use Na'Vi-human hybrids (grown "
 'from human + native DNA) called "avatars", operated by genetically matched '
 'humans.\n'
 'Tracy (Michelle Rodriguez) is the pilot assigned to Grace and her team of '
 "Na'Vis. While escorting the avatars of Grace and fellow scientist Dr. Norm "
 "Spellman (Joel David Moore), Jake's avatar is attacked by a Thanator (while "
 'they were visiting the school that Grace was operating to teach the '
 'Omaticaya.\n'
 "Colonel Miles Quaritch (Stephen Lang), head of RDA's private security force, "
 'promises Jake that the company will restore his legs if he gathers '
 "information about the Na'Vi and the clan's gathering place, a giant tree "
 'called Hometree, which stands above the richest deposit of Unobtanium in the '
 'area.\n'
 'She even takes Jake to the tree of souls, their most sacred site), he and '
 'Neytiri choose each other as mates.\n'
 "When Quaritch shows a video recording of Jake's attack on the b

In [48]:
print(summarize(text1, ratio=0.7))

("Pandora, whose atmosphere is poisonous to humans, is inhabited by the Na'Vi, "
 'a species of 10-foot tall (3.0 m), blue-skinned, sapient humanoids that live '
 'in harmony with nature and worship a mother goddess named Eywa.\n'
 "To explore Pandora's biosphere, scientists use Na'Vi-human hybrids (grown "
 'from human + native DNA) called "avatars", operated by genetically matched '
 'humans.\n'
 'Jake was leading a purposeless life on Earth and was contacted by RDA when '
 'his brother died.\n'
 'his brother represented a significant investment by RDA, since the avatars '
 'are linked to the human DNA/genome.\n'
 'Since Jake is a twin, he has the same exact DNA as his brother and can take '
 'his place in the Avatar program.\n'
 'Dr. Grace Augustine (Sigourney Weaver), head of the Avatar Program, '
 'considers Sully an inadequate replacement (as she considers Jake a mere '
 'Jarhead) but accepts his assignment as a bodyguard for excursions deep into '
 "Na'Vi territory.\n"
 'Tracy (

In [49]:
print(summarize(text1, split=True))

["To explore Pandora's biosphere, scientists use Na'Vi-human hybrids (grown "
 'from human + native DNA) called "avatars", operated by genetically matched '
 'humans.',
 'Tracy (Michelle Rodriguez) is the pilot assigned to Grace and her team of '
 "Na'Vis. While escorting the avatars of Grace and fellow scientist Dr. Norm "
 "Spellman (Joel David Moore), Jake's avatar is attacked by a Thanator (while "
 'they were visiting the school that Grace was operating to teach the '
 'Omaticaya.',
 "Colonel Miles Quaritch (Stephen Lang), head of RDA's private security force, "
 'promises Jake that the company will restore his legs if he gathers '
 "information about the Na'Vi and the clan's gathering place, a giant tree "
 'called Hometree, which stands above the richest deposit of Unobtanium in the '
 'area.',
 'She even takes Jake to the tree of souls, their most sacred site), he and '
 'Neytiri choose each other as mates.',
 "When Quaritch shows a video recording of Jake's attack on the bulld

In [50]:
print(summarize(text1, word_count=100))

("Colonel Miles Quaritch (Stephen Lang), head of RDA's private security force, "
 'promises Jake that the company will restore his legs if he gathers '
 "information about the Na'Vi and the clan's gathering place, a giant tree "
 'called Hometree, which stands above the richest deposit of Unobtanium in the '
 'area.\n'
 'The clan attempts to transfer Grace from her human body into her avatar with '
 'the aid of the Tree of Souls, but she dies before the process can be '
 'completed.\n'
 'Jake destroys a makeshift bomber before it can reach the Tree of Souls; '
 'Quaritch, wearing an AMP suit, escapes from his own damaged aircraft and '
 "breaks open the avatar link unit containing Jake's human body, exposing it "
 "to Pandora's poisonous atmosphere.")


In [51]:
print(keywords(text1))

('jake\n'
 'grace\n'
 'quaritch\n'
 'neytiri\n'
 'humans\n'
 'human\n'
 'calls\n'
 'hometree\n'
 'tree\n'
 'selfridge\n'
 'predator\n'
 'dna called\n'
 'scientists\n'
 'scientist\n'
 'native\n'
 'natives\n'
 'tracy\n'
 'trudy\n'
 'destroyed\n'
 'destroy\n'
 'destroying\n'
 'destroys\n'
 'avatars\n'
 'avatar\n'
 'resources\n'
 'neural\n'
 'brother\n'
 'chief\n'
 'centauri\n'
 'sign\n'
 'night\n'
 'energy\n'
 'vortex\n'
 'mineral\n'
 'banshee\n'
 'gathers\n'
 'gathering\n'
 'gather\n'
 'replaces\n'
 'replacement\n'
 'slywanin\n'
 'kill\n'
 'killed\n'
 'killing\n'
 'kills\n'
 'tsu\n'
 'pilot\n'
 'force\n'
 'forces\n'
 'forced\n'
 'sapient\n'
 'wildlife unexpectedly\n'
 'rda\n'
 'administration\n'
 'administrator\n'
 'takes\n'
 'escape\n'
 'escapes\n'
 'michelle\n'
 'sully\n'
 'miles\n'
 'forest\n'
 'orders\n'
 'forested habitable moon orbiting\n'
 'considers\n'
 'unites\n'
 'unit')


In [52]:
print(keywords(text1, lemmatize=True))

('jake\n'
 'grace\n'
 'quaritch\n'
 'neytiri\n'
 'human\n'
 'calls\n'
 'hometree\n'
 'tree\n'
 'selfridge\n'
 'predator\n'
 'scientist\n'
 'dna\n'
 'natives\n'
 'tracy\n'
 'trudy\n'
 'destroys\n'
 'avatar\n'
 'resources\n'
 'neural\n'
 'brother\n'
 'chief\n'
 'sign\n'
 'slywanin\n'
 'banshee\n'
 'mineral\n'
 'centauri\n'
 'vortex\n'
 'gather\n'
 'night\n'
 'replacement\n'
 'energy\n'
 'kills\n'
 'tsu\n'
 'pilot\n'
 'forced\n'
 'wildlife unexpectedly\n'
 'sapient\n'
 'rda\n'
 'administrator\n'
 'takes\n'
 'escapes\n'
 'michelle\n'
 'sully\n'
 'miles\n'
 'forest\n'
 'orders\n'
 'considers\n'
 'habitable moon orbiting\n'
 'unit')


In [53]:
print(keywords(text1, ratio=0.05))

('jake\n'
 'grace\n'
 'quaritch\n'
 'neytiri\n'
 'humans\n'
 'human\n'
 'calls\n'
 'hometree\n'
 'tree\n'
 'selfridge\n'
 'predator\n'
 'dna called\n'
 'scientists\n'
 'scientist\n'
 'native\n'
 'natives')


In [54]:
print(keywords(text1, split=True))

['jake',
 'grace',
 'quaritch',
 'neytiri',
 'humans',
 'human',
 'calls',
 'hometree',
 'tree',
 'selfridge',
 'predator',
 'dna called',
 'scientists',
 'scientist',
 'native',
 'natives',
 'tracy',
 'trudy',
 'destroyed',
 'destroy',
 'destroying',
 'destroys',
 'avatars',
 'avatar',
 'resources',
 'neural',
 'brother',
 'chief',
 'gathers',
 'gathering',
 'gather',
 'replaces',
 'replacement',
 'sign',
 'night',
 'banshee',
 'centauri',
 'vortex',
 'mineral',
 'energy',
 'slywanin',
 'kill',
 'killed',
 'killing',
 'kills',
 'tsu',
 'pilot',
 'force',
 'forces',
 'forced',
 'wildlife unexpectedly',
 'sapient',
 'rda',
 'administration',
 'administrator',
 'takes',
 'escape',
 'escapes',
 'michelle',
 'sully',
 'miles',
 'forest',
 'orders',
 'forested habitable moon orbiting',
 'considers',
 'unites',
 'unit']


Now we will look at **PDF** files as input data.

In [55]:
import PyPDF2

`PyPDF2` is a Python library for reading and manipulating PDF files.
It offers many features for handling PDF files:
*  Reading PDF files: Read and extract text and other content from PDF files.
*  Merging PDF files: Combine multiple PDF files into a single file.
*  Splitting PDF files: Split a PDF file into multiple files.
*  Encrypting and decrypting PDFs: Add passwords to PDF files for security.
*  Rotating pages: Rotate pages in PDF files.

In [56]:
reader = PyPDF2.PdfReader('CRED casestudy.pdf')

# Concatenate the extracted text of every page into one string.
fulltext = ""
for page in reader.pages:
    fulltext += page.extract_text()

print(fulltext)

(' CRED - CASE STUDY \n'
 ' Technology-Entrepreneur \n'
 ' Kunal Shah, a famous Indian entrepreneur , and founder of CRED, was born on '
 'May 20th, 1983. \n'
 ' His father owned a small pharmaceutical distribution business in their '
 'hometown, while his \n'
 ' mother worked in the insurance sector . Unfortunately , when Kunal was just '
 '16, his family faced \n'
 " financial dif ficulties due to his father's struggling business. As a "
 'result, he had to start working to \n'
 ' support his family . At the age of 16, he began working in a data entry '
 'job, where he gained \n'
 ' valuable experience in data management and analysis. Additionally , he used '
 'to teach computer \n'
 ' science to the children in his neighborhood and ran a cyber cafe from his '
 'house. \n'
 ' He went to a regular school for his schooling. He wanted to study science '
 'in college, but due to \n'
 " his family's financial situation, he ended up earning a Bachelor's degree "
 'in Philosophy from \n'
 '

In [57]:
print(summarize(fulltext))

('simplify credit card management and improve financial control.\n'
 'platform that aims to motivate users to make good financial choices and, in '
 'turn, create a more \n'
 'Cred is a FinT ech company that was founded in 2018 with a mission to '
 'simplify the credit card \n'
 'The platform aims to revolutionize how credit card users interact with their '
 'finances by of fering \n'
 'exclusive privileges and benefits to those with good credit scores, thereby '
 'creating a flywheel \n'
 'It is a platform that aims to reward creditworthy individuals while '
 'addressing trust issues within \n'
 "The company's mission is to create a platform that not only facilitates "
 'credit card \n'
 'bill payments but also of fers rewards for timely payments, thereby '
 'encouraging financial \n'
 'The idea was to of fer rewards to people who pay their credit card \n'
 "Cred’ s unique approach as a 'T rustT ech' platform, rather than just "
 "'Fintech', reflects Shah's belief \n"
 'company aims t

In [58]:
print(summarize(fulltext, split=True))

['simplify credit card management and improve financial control.',
 'platform that aims to motivate users to make good financial choices and, in '
 'turn, create a more ',
 'Cred is a FinT ech company that was founded in 2018 with a mission to '
 'simplify the credit card ',
 'The platform aims to revolutionize how credit card users interact with their '
 'finances by of fering ',
 'exclusive privileges and benefits to those with good credit scores, thereby '
 'creating a flywheel ',
 'It is a platform that aims to reward creditworthy individuals while '
 'addressing trust issues within ',
 "The company's mission is to create a platform that not only facilitates "
 'credit card ',
 'bill payments but also of fers rewards for timely payments, thereby '
 'encouraging financial ',
 'The idea was to of fer rewards to people who pay their credit card ',
 "Cred’ s unique approach as a 'T rustT ech' platform, rather than just "
 "'Fintech', reflects Shah's belief ",
 'company aims to build tr

In [59]:
print(summarize(fulltext, ratio=0.1, split=True))

['simplify credit card management and improve financial control.',
 'platform that aims to motivate users to make good financial choices and, in '
 'turn, create a more ',
 'Cred is a FinT ech company that was founded in 2018 with a mission to '
 'simplify the credit card ',
 'The platform aims to revolutionize how credit card users interact with their '
 'finances by of fering ',
 "The company's mission is to create a platform that not only facilitates "
 'credit card ',
 'The idea was to of fer rewards to people who pay their credit card ',
 'Cred is a members-only credit card bill payment platform that rewards its '
 'members for clearing ',
 'from premier brands upon clearing their credit card bills on cred.',
 'and services designed to enhance the credit card payment experience.',
 '●  Utility Bill Payments  : Launched in April 2022, this  feature allows '
 'their users to pay ',
 'NPCI, this feature enables customers to make UPI payments using their credit '
 'cards.',
 'Cred’ s 

In [60]:
print(summarize(fulltext, word_count=150))

('Cred is a FinT ech company that was founded in 2018 with a mission to '
 'simplify the credit card \n'
 "The company's mission is to create a platform that not only facilitates "
 'credit card \n'
 'Cred is a members-only credit card bill payment platform that rewards its '
 'members for clearing \n'
 '●  Cred App:  It offers easy sign-up and attractive credit card bill of '
 'fers.\n'
 '●  Businesses that pr ovide offers on the app:  Cred of fers its users many '
 'exclusive deals \n'
 'Cred is a financial services company that focuses on fintech solutions.\n'
 'products designed for credit card users, including rewards for timely '
 'payments and other \n'
 '●  Diverse Financial Services  : Beyond credit card payments,  Cred of fers '
 'services like \n'
 'Secondly , Cred simplifies credit card payments by providing a single '
 'platform that allows users \n'
 'promoting financial responsibility , Cred is helping users take control of '
 'their finances and create ')
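The PDF summaries above are made of line fragments rather than whole sentences, which suggests that each hard line break in the extracted text is being treated as a sentence boundary. One possible remedy is to join the wrapped lines before calling `summarize`. The following is a minimal sketch under that assumption: `normalize_pdf_text` is a hypothetical helper, not part of PyPDF2 or gensim.

```python
import re


def normalize_pdf_text(raw):
    """Join hard-wrapped lines and collapse stray whitespace."""
    # Treat blank lines as paragraph breaks; any other newline is a soft wrap.
    paragraphs = re.split(r"\n\s*\n", raw)
    cleaned = []
    for paragraph in paragraphs:
        # Replace every run of whitespace (including newlines) with one space.
        text = re.sub(r"\s+", " ", paragraph).strip()
        # Drop the space that extraction often leaves before punctuation.
        text = re.sub(r"\s+([.,;:!?])", r"\1", text)
        if text:
            cleaned.append(text)
    return "\n".join(cleaned)
```

One would then pass `normalize_pdf_text(fulltext)` to `summarize` or `keywords` instead of `fulltext`. This sketch does not repair words that extraction split in the middle (such as `FinT ech` or `of fers`), and it merges headings and bullet items into the surrounding paragraph, so the result should be checked before relying on it.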


In [61]:
print(keywords(fulltext))

('cred\n'
 'users\n'
 'user\n'
 'credit\n'
 'credits\n'
 'payments\n'
 'payment\n'
 'marketing\n'
 'market\n'
 'platform\n'
 'financially responsible\n'
 'include\n'
 'including\n'
 'data\n'
 'customer service\n'
 'fintech\n'
 'capital\n'
 'risk\n'
 'risks\n'
 'faced financial dif\n'
 'rewarding\n'
 'services\n'
 'exclusive\n'
 'includes offering\n'
 'innovation\n'
 'innovative\n'
 'cards\n'
 'card\n'
 'companies\n'
 'base\n'
 'banking\n'
 'bank\n'
 'allows customers\n'
 'business\n'
 'businesses\n'
 'launched\n'
 'security\n'
 'secure\n'
 'fers rewards\n'
 'management\n'
 'manage\n'
 'ech company\n'
 'trust\n'
 'consumer\n'
 'consumers\n'
 'pay\n'
 'paying\n'
 'offers\n'
 'offerings\n'
 'investors\n'
 'investor\n'
 'reward creditworthy individuals\n'
 'economic\n'
 'making\n'
 'money\n'
 'feature\n'
 'featur\n'
 'individual\n'
 'strong\n'
 'strategy\n'
 'strategies\n'
 'features uniqueness\n'
 'providing\n'
 'provides\n'
 'provide\n'
 'unique\n'
 'strategic\n'
 'calls\n'
 'called\n'
 

In [62]:
print(keywords(fulltext, lemmatize=True))

('cred\n'
 'financially\n'
 'user\n'
 'credits\n'
 'payment\n'
 'market\n'
 'platform\n'
 'data\n'
 'customer service\n'
 'fintech\n'
 'risks\n'
 'rewarding\n'
 'exclusive\n'
 'includes offering\n'
 'innovative\n'
 'card\n'
 'companies\n'
 'base\n'
 'bank\n'
 'businesses\n'
 'launched\n'
 'secure\n'
 'manage\n'
 'trust\n'
 'consumers\n'
 'paying\n'
 'investor\n'
 'economic\n'
 'ech\n'
 'making\n'
 'money\n'
 'individual\n'
 'strong\n'
 'strategies\n'
 'features uniqueness\n'
 'provide\n'
 'strategic\n'
 'called\n'
 'brand\n'
 'finance\n'
 'face\n'
 'fer\n'
 'good\n'
 'key\n'
 'works\n'
 'apps\n'
 'indians\n'
 'behavior\n'
 'allow\n'
 'private\n'
 'shah\n'
 'sector\n'
 'diverse\n'
 'global\n'
 'new\n'
 'low\n'
 'timely\n'
 'significant\n'
 'edge\n'
 'creditworthiness\n'
 'score\n'
 'institutional\n'
 'park\n'
 'wealth\n'
 'model\n'
 'technological\n'
 'bills\n'
 'responsibly\n'
 'ones\n'
 'incentivize\n'
 'rent\n'
 'vuca\n'
 'growth\n'
 'enables\n'
 'small\n'
 'utilization\n'
 'dif\n'
 

In [63]:
print(keywords(fulltext, split=True))

['cred',
 'users',
 'user',
 'credit',
 'credits',
 'payments',
 'payment',
 'marketing',
 'market',
 'platform',
 'financially responsible',
 'include',
 'including',
 'data',
 'customer service',
 'fintech',
 'capital',
 'risk',
 'risks',
 'faced financial dif',
 'rewarding',
 'services',
 'exclusive',
 'includes offering',
 'innovation',
 'innovative',
 'cards',
 'card',
 'companies',
 'base',
 'banking',
 'bank',
 'allows customers',
 'business',
 'businesses',
 'launched',
 'security',
 'secure',
 'fers rewards',
 'management',
 'manage',
 'ech company',
 'trust',
 'consumer',
 'consumers',
 'pay',
 'paying',
 'offers',
 'offerings',
 'investors',
 'investor',
 'reward creditworthy individuals',
 'economic',
 'making',
 'money',
 'feature',
 'featur',
 'individual',
 'strong',
 'strategy',
 'strategies',
 'features uniqueness',
 'providing',
 'provides',
 'provide',
 'unique',
 'strategic',
 'calls',
 'called',
 'brands',
 'brand',
 'finances',
 'finance',
 'faces',
 'face',
 'fer

In [64]:
print(keywords(fulltext, ratio=0.1))

('cred\n'
 'financially\n'
 'users\n'
 'user\n'
 'credit\n'
 'credits\n'
 'payments\n'
 'payment\n'
 'marketing\n'
 'market\n'
 'platform\n'
 'faced financial\n'
 'include\n'
 'including\n'
 'customers\n'
 'data\n'
 'customer service\n'
 'fintech\n'
 'capital\n'
 'risk\n'
 'risks\n'
 'reward\n'
 'rewarding\n'
 'services\n'
 'exclusive\n'
 'includes offering\n'
 'innovation\n'
 'innovative\n'
 'cards\n'
 'card\n'
 'companies\n'
 'base\n'
 'banking\n'
 'bank\n'
 'business\n'
 'businesses\n'
 'launched\n'
 'security\n'
 'secure\n'
 'fers rewards\n'
 'management\n'
 'manage\n'
 'ech company\n'
 'trust\n'
 'consumer\n'
 'consumers\n'
 'pay\n'
 'paying\n'
 'offers\n'
 'offerings\n'
 'investors\n'
 'investor\n'
 'economic\n'
 'making\n'
 'money\n'
 'feature\n'
 'featur\n'
 'individuals\n'
 'individual\n'
 'strong\n'
 'strategy\n'
 'strategies\n'
 'features uniqueness\n'
 'providing\n'
 'provides\n'
 'provide\n'
 'unique\n'
 'strategic\n'
 'calls\n'
 'called\n'
 'brands\n'
 'brand\n'
 'finance