## NLP Class Exercise 6

Author: Aashish Singh

In [1]:
# !pip install -U spacy
# !python -m spacy download en_core_web_sm
# !pip install pandarallel

In [1]:
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

import matplotlib.pyplot as plt
import nltk
import nltk.corpus
import pandas as pd
import seaborn as sns
import spacy
from nltk import ne_chunk, pos_tag, word_tokenize
from nltk.corpus import stopwords
from nltk.text import Text
from nltk.tree import Tree
from pandarallel import pandarallel
from spacy.tokens import DocBin

pd.set_option('display.max_rows', 100)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', 500)

nltk.download('stopwords')
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('maxent_ne_chunker')
nltk.download('words')

[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/aashishsingh/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to
[nltk_data]     /Users/aashishsingh/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /Users/aashishsingh/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package maxent_ne_chunker to
[nltk_data]     /Users/aashishsingh/nltk_data...
[nltk_data]   Package maxent_ne_chunker is already up-to-date!
[nltk_data] Downloading package words to
[nltk_data]     /Users/aashishsingh/nltk_data...
[nltk_data]   Package words is already up-to-date!


True

In [2]:
import multiprocessing

num_processors = multiprocessing.cpu_count()

print(f'Available CPUs: {num_processors}')

pandarallel.initialize(nb_workers=num_processors-1, use_memory_fs=False)

Available CPUs: 12
INFO: Pandarallel will run on 11 workers.
INFO: Pandarallel will use standard multiprocessing data transfer (pipe) to transfer data between the main process and workers.


In [3]:
news_path = 'https://storage.googleapis.com/msca-bdp-data-open/news/news_some_company.json'

df = pd.read_json(news_path, orient='records', lines=True)

In [4]:
df.head(5)

Unnamed: 0,crawled,language,text,title
0,2019-05-07T04:18:26.000+03:00,english,"No comments\nPacking can be stressful for anyone, right? If you’re like me, you do that thing where you pack 8 pairs of underwear for 3 days (like what do you actually think is going to happen?) and 10 shirts because you just don’t know what you’re going to “feel” like wearing. Am I alone on this one?\nFortunately, after so many trips, we kind of go on autopilot when we’re packing. We have a shared checklist on our phone that we reuse every time to make sure we’ve packed the essentials. But ...",The Most Useful Things I Bring to Disney
1,2019-05-07T04:19:12.028+03:00,english,"I couldn't find another thread for this, so I apologize if this is duplicated somewhere.\nWith the 50th of WDW coming in 2021 there are a number of new attractions on the way as well as some other things.\nWe know Tron for MK, GotG for EPCOT. I would guess there will be something new coming for the other two parks as well.\nOn top of that they'll have several new hotels opening in time for the celebration.\nToday I was told Cinderella Castle is slated to get a new, special paint (or overlay)...",Walt Disney World 50th Rumored Plans
2,2019-05-07T04:19:38.000+03:00,english,"05-06-2019, 01:01 PM Here we go again with another professional football league. This is the second go-around for Vince\nMcMahon's XFL. McMahon is putting $500 million of his own money into the league that is scheduled\nto start one week after next year's Super Bowl. He has a three-year deal with Fox and Disney to\nbroadcast the games. McMahon said there would be no gimmicks or hokey stuff this time.\nWhat's the over/under on this league lasting three years? ""A trophy carries dust. Memories ...",XFL Strikes Deal with Fox and Disney
3,2019-05-07T04:27:37.005+03:00,english,"Wednesday, July 11, 2018 McDonald's Disney World Millennium Cups We're almost finished with summer. Most families are trying to get their vacations out of the way so that they can prepare for the next school season to begin. Most families head to Walt Disney World in Orlando, Florida for their vacations. I remember the mid-to-late 90s, when WDW had the heaviest television advertising with their ""Remember the Magic"" ad campaign. While I can't offer my readers trips to Walt Disney World, I can...",McDonald's Disney World Millennium Cups
4,2019-05-07T04:36:07.017+03:00,english,"As Disney’s Hollywood Studios celebrated its 30th anniversary last week in Florida, the theme park is in the midst of a growth spurt. Still bustling from the opening of Toy Story Land last summer, fans are gearing up for the largest expansion in Walt Disney World history, according to Florida Today , which is part of the USA TODAY Network.\nStar Wars Galaxy’s Edge will open at the Florida park with much-anticipated fanfare Aug. 29, following the opening of the new land at Disneyland in Calif...",Disney World Star Wars: Galaxy's Edge: Hollywood Studios looks ahead


In [6]:
nltk.download('punkt')

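# Build the stop-word set once rather than on every call
STOP_WORDS = set(stopwords.words('english'))

# Define the text cleaning function
def clean_text(text):
    # Remove punctuation (anything that is not a word character or whitespace)
    text = re.sub(r'[^\w\s]', '', text)
    # Remove numbers
    text = re.sub(r'\d+', '', text)
    # Remove stop words; the match is case-sensitive, so capitalized
    # stop words such as "The" and "I" are kept
    text = ' '.join(word for word in text.split() if word not in STOP_WORDS)
    return text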

[nltk_data] Downloading package punkt to
[nltk_data]     /Users/aashishsingh/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
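
The NLTK English stop-word list is lowercase, so `clean_text` leaves capitalized stop words such as "The", "If", and "I" in place. If that is not wanted, one option is to compare the lowercased word against the list while keeping the original casing of the words that survive. The sketch below is a minimal illustration, not part of the exercise: `clean_text_ci` is a hypothetical name, and `STOP_WORDS_DEMO` is a small hand-written stand-in for `stopwords.words('english')` so the snippet runs without any downloads.

```python
import re

# Hypothetical stand-in for stopwords.words('english'); the real list is much longer.
STOP_WORDS_DEMO = {"the", "i", "to", "if", "we", "for", "with", "and"}

def clean_text_ci(text, stop_words=STOP_WORDS_DEMO):
    """Strip punctuation and digits, then drop stop words regardless of case."""
    text = re.sub(r"[^\w\s]", "", text)  # remove punctuation
    text = re.sub(r"\d+", "", text)      # remove digits
    # Compare the lowercased word so "The" and "the" are treated alike
    return " ".join(w for w in text.split() if w.lower() not in stop_words)

print(clean_text_ci("The Most Useful Things I Bring to Disney"))
```

Passing `set(stopwords.words('english'))` as `stop_words` would apply the same logic with the full NLTK list.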


In [7]:
# Let's apply cleaning to news text and title
df['clean_title'] = df['title'].apply(clean_text)
df['clean_text'] = df['text'].apply(clean_text)
df['clean_title_text'] = df['clean_title'] + ' ' + df['clean_text']
df.head(5)

Unnamed: 0,crawled,language,text,title,clean_title,clean_text,clean_title_text
0,2019-05-07T04:18:26.000+03:00,english,"No comments\nPacking can be stressful for anyone, right? If you’re like me, you do that thing where you pack 8 pairs of underwear for 3 days (like what do you actually think is going to happen?) and 10 shirts because you just don’t know what you’re going to “feel” like wearing. Am I alone on this one?\nFortunately, after so many trips, we kind of go on autopilot when we’re packing. We have a shared checklist on our phone that we reuse every time to make sure we’ve packed the essentials. But ...",The Most Useful Things I Bring to Disney,The Most Useful Things I Bring Disney,No comments Packing stressful anyone right If youre like thing pack pairs underwear days like actually think going happen shirts dont know youre going feel like wearing Am I alone one Fortunately many trips kind go autopilot packing We shared checklist phone reuse every time make sure weve packed essentials But musthaves I make sure pack every time go Disney Im gonna let secrets FuelRod Im sure portable charging pack bring along I personally love FuelRod This company partnered Disney FuelRod...,The Most Useful Things I Bring Disney No comments Packing stressful anyone right If youre like thing pack pairs underwear days like actually think going happen shirts dont know youre going feel like wearing Am I alone one Fortunately many trips kind go autopilot packing We shared checklist phone reuse every time make sure weve packed essentials But musthaves I make sure pack every time go Disney Im gonna let secrets FuelRod Im sure portable charging pack bring along I personally love FuelRod...
1,2019-05-07T04:19:12.028+03:00,english,"I couldn't find another thread for this, so I apologize if this is duplicated somewhere.\nWith the 50th of WDW coming in 2021 there are a number of new attractions on the way as well as some other things.\nWe know Tron for MK, GotG for EPCOT. I would guess there will be something new coming for the other two parks as well.\nOn top of that they'll have several new hotels opening in time for the celebration.\nToday I was told Cinderella Castle is slated to get a new, special paint (or overlay)...",Walt Disney World 50th Rumored Plans,Walt Disney World th Rumored Plans,I couldnt find another thread I apologize duplicated somewhere With th WDW coming number new attractions way well things We know Tron MK GotG EPCOT I would guess something new coming two parks well On top theyll several new hotels opening time celebration Today I told Cinderella Castle slated get new special paint overlay golden anniversary I didnt get details Id assume much tasteful th cake castle Has anyone else heard things tap th Id guess new night time parade planning along shows,Walt Disney World th Rumored Plans I couldnt find another thread I apologize duplicated somewhere With th WDW coming number new attractions way well things We know Tron MK GotG EPCOT I would guess something new coming two parks well On top theyll several new hotels opening time celebration Today I told Cinderella Castle slated get new special paint overlay golden anniversary I didnt get details Id assume much tasteful th cake castle Has anyone else heard things tap th Id guess new night time...
2,2019-05-07T04:19:38.000+03:00,english,"05-06-2019, 01:01 PM Here we go again with another professional football league. This is the second go-around for Vince\nMcMahon's XFL. McMahon is putting $500 million of his own money into the league that is scheduled\nto start one week after next year's Super Bowl. He has a three-year deal with Fox and Disney to\nbroadcast the games. McMahon said there would be no gimmicks or hokey stuff this time.\nWhat's the over/under on this league lasting three years? ""A trophy carries dust. Memories ...",XFL Strikes Deal with Fox and Disney,XFL Strikes Deal Fox Disney,PM Here go another professional football league This second goaround Vince McMahons XFL McMahon putting million money league scheduled start one week next years Super Bowl He threeyear deal Fox Disney broadcast games McMahon said would gimmicks hokey stuff time Whats overunder league lasting three years A trophy carries dust Memories last forever Old Coach,XFL Strikes Deal Fox Disney PM Here go another professional football league This second goaround Vince McMahons XFL McMahon putting million money league scheduled start one week next years Super Bowl He threeyear deal Fox Disney broadcast games McMahon said would gimmicks hokey stuff time Whats overunder league lasting three years A trophy carries dust Memories last forever Old Coach
3,2019-05-07T04:27:37.005+03:00,english,"Wednesday, July 11, 2018 McDonald's Disney World Millennium Cups We're almost finished with summer. Most families are trying to get their vacations out of the way so that they can prepare for the next school season to begin. Most families head to Walt Disney World in Orlando, Florida for their vacations. I remember the mid-to-late 90s, when WDW had the heaviest television advertising with their ""Remember the Magic"" ad campaign. While I can't offer my readers trips to Walt Disney World, I can...",McDonald's Disney World Millennium Cups,McDonalds Disney World Millennium Cups,Wednesday July McDonalds Disney World Millennium Cups Were almost finished summer Most families trying get vacations way prepare next school season begin Most families head Walt Disney World Orlando Florida vacations I remember midtolate WDW heaviest television advertising Remember Magic ad campaign While I cant offer readers trips Walt Disney World I provide taste The Happiest Place Earth four cups sold McDonalds commemorate Disneys millennium advertising campaign Theres little bit personal...,McDonalds Disney World Millennium Cups Wednesday July McDonalds Disney World Millennium Cups Were almost finished summer Most families trying get vacations way prepare next school season begin Most families head Walt Disney World Orlando Florida vacations I remember midtolate WDW heaviest television advertising Remember Magic ad campaign While I cant offer readers trips Walt Disney World I provide taste The Happiest Place Earth four cups sold McDonalds commemorate Disneys millennium advertis...
4,2019-05-07T04:36:07.017+03:00,english,"As Disney’s Hollywood Studios celebrated its 30th anniversary last week in Florida, the theme park is in the midst of a growth spurt. Still bustling from the opening of Toy Story Land last summer, fans are gearing up for the largest expansion in Walt Disney World history, according to Florida Today , which is part of the USA TODAY Network.\nStar Wars Galaxy’s Edge will open at the Florida park with much-anticipated fanfare Aug. 29, following the opening of the new land at Disneyland in Calif...",Disney World Star Wars: Galaxy's Edge: Hollywood Studios looks ahead,Disney World Star Wars Galaxys Edge Hollywood Studios looks ahead,As Disneys Hollywood Studios celebrated th anniversary last week Florida theme park midst growth spurt Still bustling opening Toy Story Land last summer fans gearing largest expansion Walt Disney World history according Florida Today part USA TODAY Network Star Wars Galaxys Edge open Florida park muchanticipated fanfare Aug following opening new land Disneyland California May The acre land transport guests galaxy far far away We want go discover terms said Scott Mallwitz executive creative d...,Disney World Star Wars Galaxys Edge Hollywood Studios looks ahead As Disneys Hollywood Studios celebrated th anniversary last week Florida theme park midst growth spurt Still bustling opening Toy Story Land last summer fans gearing largest expansion Walt Disney World history according Florida Today part USA TODAY Network Star Wars Galaxys Edge open Florida park muchanticipated fanfare Aug following opening new land Disneyland California May The acre land transport guests galaxy far far away ...


In [12]:
# Keep only the documents that mention "Florida" (case-insensitive)
florida_relevant = df[df['clean_title_text'].str.contains("florida", case=False, na=False)]

# Rank by text length as a crude stand-in for relevance (longest first).
# Length says nothing about how prominent "Florida" is in a document.
florida_relevant_sorted = florida_relevant.sort_values(by='clean_title_text', key=lambda x: x.str.len(), ascending=False)

# Retrieve top 10
top_10_florida_relevant = florida_relevant_sorted.head(10)
top_10_florida_relevant

Unnamed: 0,crawled,language,text,title,clean_title,clean_text,clean_title_text
2479,2019-05-09T17:34:45.005+03:00,english,"Prepared Remarks:\nOperator\nGood day, ladies and gentlemen, and welcome to the Walt Disney Fiscal 2019 Second Quarter Financial Results Conference Call. At this time, all participants are in a listen-only mode. Later we will conduct a question-and-answer session and instructions will follow at that time. (Operator Instructions) As a reminder, this conference call is being recorded.\nI would now like to introduce your host for today's conference, Lowell Singer, Vice President, Investor Relat...",Walt Disney Company (DIS) Q2 2019 Earnings Call Transcript,Walt Disney Company DIS Q Earnings Call Transcript,Prepared Remarks Operator Good day ladies gentlemen welcome Walt Disney Fiscal Second Quarter Financial Results Conference Call At time participants listenonly mode Later conduct questionandanswer session instructions follow time Operator Instructions As reminder conference call recorded I would like introduce host todays conference Lowell Singer Vice President Investor Relations Sir please go ahead Lowell Singer Senior Vice President Investor Relations Good afternoon welcome The Walt Disney...,Walt Disney Company DIS Q Earnings Call Transcript Prepared Remarks Operator Good day ladies gentlemen welcome Walt Disney Fiscal Second Quarter Financial Results Conference Call At time participants listenonly mode Later conduct questionandanswer session instructions follow time Operator Instructions As reminder conference call recorded I would like introduce host todays conference Lowell Singer Vice President Investor Relations Sir please go ahead Lowell Singer Senior Vice President Invest...
1193,2019-05-08T09:13:45.028+03:00,english,"Home / America / Walt Disney World Resort Update for May 7-13, 2019 by Alan S. Dalinka Walt Disney World Resort Update for May 7-13, 2019 by Alan S. Dalinka\nWalt Disney World Resort Update for May 7-13, 2019\nPage Navigation\nRecently, we soft launched our new format for the Update. The content categories remain the same and are presented in pretty much the same order as always, but we have added enough new readability enhancements that the overall Update has a somewhat refreshed look. The ...","Walt Disney World Resort Update for May 7-13, 2019 by Alan S. Dalinka",Walt Disney World Resort Update May Alan S Dalinka,Home America Walt Disney World Resort Update May Alan S Dalinka Walt Disney World Resort Update May Alan S Dalinka Walt Disney World Resort Update May Page Navigation Recently soft launched new format Update The content categories remain presented pretty much order always added enough new readability enhancements overall Update somewhat refreshed look The new light green button top Update lets readers quickly skip ahead jumping introductory like even Writers Note major sections way dropdown ...,Walt Disney World Resort Update May Alan S Dalinka Home America Walt Disney World Resort Update May Alan S Dalinka Walt Disney World Resort Update May Alan S Dalinka Walt Disney World Resort Update May Page Navigation Recently soft launched new format Update The content categories remain presented pretty much order always added enough new readability enhancements overall Update somewhat refreshed look The new light green button top Update lets readers quickly skip ahead jumping introductory ...
938,2019-05-08T03:16:04.057+03:00,english,"× Navigating Our Walt Disney World Resort Update\nRecently, we soft launched our new format for the Update. The content categories remain the same and are presented in pretty much the same order as always, but we have added enough new readability enhancements that the overall Update has a somewhat refreshed look. The new light green button at the top of the Update lets readers quickly skip ahead (jumping over introductory comments like this and even the Writer's Note) to the major sections b...","Walt Disney World Resort Update for May 7-13, 2019",Walt Disney World Resort Update May,Navigating Our Walt Disney World Resort Update Recently soft launched new format Update The content categories remain presented pretty much order always added enough new readability enhancements overall Update somewhat refreshed look The new light green button top Update lets readers quickly skip ahead jumping introductory comments like even Writers Note major sections way dropdown menu categories used listed Writers Note Each major section heading highlighted light green take next one click...,Walt Disney World Resort Update May Navigating Our Walt Disney World Resort Update Recently soft launched new format Update The content categories remain presented pretty much order always added enough new readability enhancements overall Update somewhat refreshed look The new light green button top Update lets readers quickly skip ahead jumping introductory comments like even Writers Note major sections way dropdown menu categories used listed Writers Note Each major section heading highlig...
2073,2019-05-09T05:48:45.001+03:00,english,"Walt Disney Co (NYSE: DIS ) Q2 2019 Earnings Conference Call May 8, 2019 4:30 PM ET\nCompany Participants\nLowell Singer - SVP, IR\nRobert Iger - Chairman & CEO\nChristine McCarthy - Senior EVP & CFO\nConference Call Participants\nBenjamin Swinburne - Morgan Stanley\nJessica Reif Ehrlich - Bank of America Merrill Lynch\nAlexia Quadrani - JPMorgan Chase & Co.\nMichael Nathanson - MoffettNathanson\nMarci Ryvicker - Wolfe Research\nKannan Venkateshwar - Barclays Bank\nDouglas Mitchelson - Crédi...",Walt Disney Co (DIS) CEO Robert Iger on Q2 2019 Results - Earnings Call Transcript,Walt Disney Co DIS CEO Robert Iger Q Results Earnings Call Transcript,Walt Disney Co NYSE DIS Q Earnings Conference Call May PM ET Company Participants Lowell Singer SVP IR Robert Iger Chairman CEO Christine McCarthy Senior EVP CFO Conference Call Participants Benjamin Swinburne Morgan Stanley Jessica Reif Ehrlich Bank America Merrill Lynch Alexia Quadrani JPMorgan Chase Co Michael Nathanson MoffettNathanson Marci Ryvicker Wolfe Research Kannan Venkateshwar Barclays Bank Douglas Mitchelson Crédit Suisse John Hodulik UBS Investment Bank Timothy Nollen Macquarie...,Walt Disney Co DIS CEO Robert Iger Q Results Earnings Call Transcript Walt Disney Co NYSE DIS Q Earnings Conference Call May PM ET Company Participants Lowell Singer SVP IR Robert Iger Chairman CEO Christine McCarthy Senior EVP CFO Conference Call Participants Benjamin Swinburne Morgan Stanley Jessica Reif Ehrlich Bank America Merrill Lynch Alexia Quadrani JPMorgan Chase Co Michael Nathanson MoffettNathanson Marci Ryvicker Wolfe Research Kannan Venkateshwar Barclays Bank Douglas Mitchelson C...
1007,2019-05-08T04:49:29.006+03:00,english,"Disney slates Fox films, 'Avatar' pushed another year Photo Credit: AP Photo/20th Century Fox, File FILE - This image released by 20th Century Fox shows the characters Neytiri, right, and Jake in a scene from the 2009 movie ""Avatar."" The Walt Disney Co. on Tuesday laid out its plans for upcoming 20th Century Fox films. James Cameron’s long-delayed “Avatar 2” will now open in theaters in December 2021 instead of its most recent date of December 2020. The two subsequent “Avatar” sequels will m...","Disney slates Fox films, 'Avatar' pushed another year",Disney slates Fox films Avatar pushed another year,Disney slates Fox films Avatar pushed another year Photo Credit AP Phototh Century Fox File FILE This image released th Century Fox shows characters Neytiri right Jake scene movie Avatar The Walt Disney Co Tuesday laid plans upcoming th Century Fox films James Camerons longdelayed Avatar open theaters December instead recent date December The two subsequent Avatar sequels move respectively AP Phototh Century Fox File Disney slates Fox films Avatar pushed another year By AP Film Writer May PM...,Disney slates Fox films Avatar pushed another year Disney slates Fox films Avatar pushed another year Photo Credit AP Phototh Century Fox File FILE This image released th Century Fox shows characters Neytiri right Jake scene movie Avatar The Walt Disney Co Tuesday laid plans upcoming th Century Fox films James Camerons longdelayed Avatar open theaters December instead recent date December The two subsequent Avatar sequels move respectively AP Phototh Century Fox File Disney slates Fox films ...
1751,2019-05-09T00:26:58.032+03:00,english,"The Mandalorian, a big-budget Star Wars spinoff starring Pedro Pascal, will be one of the Disney Plus original series available at launch in November.\nDisney At an epic presentation last month, Disney laid out a vast catalog of new and legacy shows you'll be able to stream starting in November with the launch of the company's Netflix competitor, Disney Plus , on Nov. 12 in the US.\nWith the massive libraries of Disney -- and, now 21st Century Fox -- as candidates for the service, what do we...",Disney Plus: Every show and movie that will (or may be) available to stream - CNET,Disney Plus Every show movie may available stream CNET,The Mandalorian bigbudget Star Wars spinoff starring Pedro Pascal one Disney Plus original series available launch November Disney At epic presentation last month Disney laid vast catalog new legacy shows youll able stream starting November launch companys Netflix competitor Disney Plus Nov US With massive libraries Disney st Century Fox candidates service know included This article includes Disneys official list every show movie release date confirmed writing plus titles executives indicate...,Disney Plus Every show movie may available stream CNET The Mandalorian bigbudget Star Wars spinoff starring Pedro Pascal one Disney Plus original series available launch November Disney At epic presentation last month Disney laid vast catalog new legacy shows youll able stream starting November launch companys Netflix competitor Disney Plus Nov US With massive libraries Disney st Century Fox candidates service know included This article includes Disneys official list every show movie release...
1317,2019-05-08T13:18:17.023+03:00,english,"In the bright sun of Orlando, a photographer snaps a picture. His lens pointing toward the rather familiar sight of Sabur and Lil Dick. The pair are in the front of Disney World with tickets in hand. They have just arrived inside the park. The photographer hands Sabur the claim check for the picture, as he shoves it in his pocket.\nLil Dick: Dude...I'm so stoked...I can't believe we are here. In Russia we always heard about Disney, but now I am here...and I am the man, thanks Sabur....thank ...",Disney World....SUCKS!,Disney WorldSUCKS,In bright sun Orlando photographer snaps picture His lens pointing toward rather familiar sight Sabur Lil Dick The pair front Disney World tickets hand They arrived inside park The photographer hands Sabur claim check picture shoves pocket Lil Dick DudeIm stokedI cant believe In Russia always heard Disney I hereand I man thanks Saburthank much Sabur Easy kidenough gushingI figured hell yearso closing triple threat I figured could take moment relax bit clear heads fun The two begin walking la...,Disney WorldSUCKS In bright sun Orlando photographer snaps picture His lens pointing toward rather familiar sight Sabur Lil Dick The pair front Disney World tickets hand They arrived inside park The photographer hands Sabur claim check picture shoves pocket Lil Dick DudeIm stokedI cant believe In Russia always heard Disney I hereand I man thanks Saburthank much Sabur Easy kidenough gushingI figured hell yearso closing triple threat I figured could take moment relax bit clear heads fun The tw...
781,2019-05-08T00:32:07.000+03:00,english,Reasons You May Want to Stay on Disney World Property: The theming at many of the resorts is fabulous (NOT all of them…but many). There are lots of fun activities and events for on property guests at the resorts. You truly have an immersive experience when you NEVER leave the “Disney bubble” during your stay. Disney Cast Members are phenominal . Disney takes care of their guests. Period. I’ve never had a negative experience at a Disney Resort. If you’re only visiting Disney World one time in...,Why Your Family Shouldn’t Stay On-Property at Walt Disney World Resort,Why Your Family Shouldnt Stay OnProperty Walt Disney World Resort,Reasons You May Want Stay Disney World Property The theming many resorts fabulous NOT thembut many There lots fun activities events property guests resorts You truly immersive experience NEVER leave Disney bubble stay Disney Cast Members phenominal Disney takes care guests Period Ive never negative experience Disney Resort If youre visiting Disney World one time entire life unlimited funds may worth Yes short list But honestly For LOT people Disney bubble worth It outweighs every reason stay...,Why Your Family Shouldnt Stay OnProperty Walt Disney World Resort Reasons You May Want Stay Disney World Property The theming many resorts fabulous NOT thembut many There lots fun activities events property guests resorts You truly immersive experience NEVER leave Disney bubble stay Disney Cast Members phenominal Disney takes care guests Period Ive never negative experience Disney Resort If youre visiting Disney World one time entire life unlimited funds may worth Yes short list But honestly...
912,2019-05-08T02:40:33.024+03:00,english,"Print\nIf you’re headed to Orlando for the first time this summer, then my top tips for your Walt Disney World Summer Holiday are going to come in very handy! Make sure you save this and my other Disney Tips to your Pinterest boards to help with all of your Disney World planning!\nWe had our first family vacation to Walt Disney World in Orlando, Florida back in 2011 – it seems so long ago now! With a seventh trip under our belts, a Disney Vacation Club purchase at Polynesian Villas & Bungalo...",12 Tips For Your First Walt Disney World Summer Holiday,Tips For Your First Walt Disney World Summer Holiday,Print If youre headed Orlando first time summer top tips Walt Disney World Summer Holiday going come handy Make sure save Disney Tips Pinterest boards help Disney World planning We first family vacation Walt Disney World Orlando Florida back seems long ago With seventh trip belts Disney Vacation Club purchase Polynesian Villas Bungalows I safely say comes planning holiday Walt Disney World I know stuff Tips For Your First Walt Disney World Summer Holiday Know basics I often hear people descr...,Tips For Your First Walt Disney World Summer Holiday Print If youre headed Orlando first time summer top tips Walt Disney World Summer Holiday going come handy Make sure save Disney Tips Pinterest boards help Disney World planning We first family vacation Walt Disney World Orlando Florida back seems long ago With seventh trip belts Disney Vacation Club purchase Polynesian Villas Bungalows I safely say comes planning holiday Walt Disney World I know stuff Tips For Your First Walt Disney World...
4,2019-05-07T04:36:07.017+03:00,english,"As Disney’s Hollywood Studios celebrated its 30th anniversary last week in Florida, the theme park is in the midst of a growth spurt. Still bustling from the opening of Toy Story Land last summer, fans are gearing up for the largest expansion in Walt Disney World history, according to Florida Today , which is part of the USA TODAY Network.\nStar Wars Galaxy’s Edge will open at the Florida park with much-anticipated fanfare Aug. 29, following the opening of the new land at Disneyland in Calif...",Disney World Star Wars: Galaxy's Edge: Hollywood Studios looks ahead,Disney World Star Wars Galaxys Edge Hollywood Studios looks ahead,As Disneys Hollywood Studios celebrated th anniversary last week Florida theme park midst growth spurt Still bustling opening Toy Story Land last summer fans gearing largest expansion Walt Disney World history according Florida Today part USA TODAY Network Star Wars Galaxys Edge open Florida park muchanticipated fanfare Aug following opening new land Disneyland California May The acre land transport guests galaxy far far away We want go discover terms said Scott Mallwitz executive creative d...,Disney World Star Wars Galaxys Edge Hollywood Studios looks ahead As Disneys Hollywood Studios celebrated th anniversary last week Florida theme park midst growth spurt Still bustling opening Toy Story Land last summer fans gearing largest expansion Walt Disney World history according Florida Today part USA TODAY Network Star Wars Galaxys Edge open Florida park muchanticipated fanfare Aug following opening new land Disneyland California May The acre land transport guests galaxy far far away ...
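
Sorting by length pushes long documents such as earnings-call transcripts to the top even when "Florida" appears only in passing. One possible alternative is to score each document by the share of its tokens that match the query term. The sketch below is a plain-Python illustration under that assumption: `term_density` and `top_k_by_density` are hypothetical helper names, and the three sample strings are made up for the demonstration.

```python
def term_density(text, term):
    """Fraction of whitespace-separated tokens equal to `term`, ignoring case."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return tokens.count(term.lower()) / len(tokens)

def top_k_by_density(docs, term, k=10):
    """Return up to k (index, score) pairs, highest density first."""
    scored = [(term_density(d, term), i) for i, d in enumerate(docs)]
    scored = [(s, i) for s, i in scored if s > 0]  # drop documents without the term
    scored.sort(key=lambda p: (-p[0], p[1]))       # ties keep original order
    return [(i, s) for s, i in scored[:k]]

# Made-up sample documents (not taken from the news dataset)
docs = [
    "Florida theme park opens new land Florida",  # short, two mentions
    "long transcript " * 50 + "Florida",          # long, one mention
    "Orlando vacation tips",                      # no mention
]

for idx, score in top_k_by_density(docs, "florida"):
    print(idx, round(score, 4))
```

Here the short document ranks first even though the transcript-like one is far longer, which is the opposite of what a length-based sort would produce.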


In [13]:
# Identify top 10 documents most relevant to "Florida" but not related to "Disney"
non_disney_relevant = florida_relevant[~florida_relevant['clean_title_text'].str.contains("disney", case=False, na=False)]

# Sort by relevance (same assumption as above)
non_disney_relevant_sorted = non_disney_relevant.sort_values(by='clean_title_text', key=lambda x: x.str.len(), ascending=False)

# Retrieve top 10
top_10_non_disney_relevant = non_disney_relevant_sorted.head(10)
top_10_non_disney_relevant

Unnamed: 0,crawled,language,text,title,clean_title,clean_text,clean_title_text


In [21]:
# Step 1: Count occurrences of "Florida" in the clean_title_text column (case insensitive)
df['florida_count'] = df['clean_title_text'].str.lower().str.count('florida')

# Sort all documents by the number of "Florida" mentions, highest first
florida_relevant_sorted = df.sort_values(by='florida_count', ascending=False)

# Retrieve top 10
top_10_florida_relevant = florida_relevant_sorted.head(10)

# Step 2: Identify top 10 documents most relevant to "Florida" but not related to "Disney"
non_disney_relevant = florida_relevant_sorted[~florida_relevant_sorted['clean_title_text'].str.contains("disney", case=False, na=False)]

# Retrieve top 10
top_10_non_disney_relevant = non_disney_relevant.head(10)

In [19]:
top_10_florida_relevant

Unnamed: 0,crawled,language,text,title,clean_title,clean_text,clean_title_text,florida_count
288,2019-05-07T18:26:29.017+03:00,english,"Tweet\nIt’s now May, which means lovebug season has ‘officially’ returned to Walt Disney World. In this post, we’ll rant a bit about these insects, share info about them, and tips for avoiding love and other bugs during their peak months in Florida.\nLike Pop Warner and Jersey Week, love bug seasons are seemingly unexplainable natural phenomenons that’s spoken of in hushed whispers among Walt Disney World fans. No one likes the annual infestations, but we don’t want to anger our new insect o...",Love Bugs at Disney World,Love Bugs Disney World,Tweet Its May means lovebug season officially returned Walt Disney World In post well rant bit insects share info tips avoiding love bugs peak months Florida Like Pop Warner Jersey Week love bug seasons seemingly unexplainable natural phenomenons thats spoken hushed whispers among Walt Disney World fans No one likes annual infestations dont want anger new insect overlords tough bug documentary Animal Kingdom warned us After heading Walt Disney World yesterday I thought second light rain star...,Love Bugs Disney World Tweet Its May means lovebug season officially returned Walt Disney World In post well rant bit insects share info tips avoiding love bugs peak months Florida Like Pop Warner Jersey Week love bug seasons seemingly unexplainable natural phenomenons thats spoken hushed whispers among Walt Disney World fans No one likes annual infestations dont want anger new insect overlords tough bug documentary Animal Kingdom warned us After heading Walt Disney World yesterday I thought...,10
491,2019-05-07T21:53:38.019+03:00,english,"MILITARY RATE Active and Retired Military Members only?!?! included in these rates. For Bahama Cruises, approximately $66 per person. For Caribbean Cruises, approximately $95 per person. For Alaskan Cruises, approximately $182 per person. 7-Night Alaska Cruise departing from Vancouver, Canada: $230 per person, per night, based on double occupancy. Cruise fare for two Guests $3,220. Available Sail Date: July 29. Disney Wonder. 7-Night Alaska Cruise departing from Vancouver, Canada: $200 per ...","Disney Cruise Line MILITARY RATE SPECIALS! Available May 7-12, 2019",Disney Cruise Line MILITARY RATE SPECIALS Available May,MILITARY RATE Active Retired Military Members included rates For Bahama Cruises approximately per person For Caribbean Cruises approximately per person For Alaskan Cruises approximately per person Night Alaska Cruise departing Vancouver Canada per person per night based double occupancy Cruise fare two Guests Available Sail Date July Disney Wonder Night Alaska Cruise departing Vancouver Canada per person per night based double occupancy Cruise fare two Guests Available Sail Date August Disne...,Disney Cruise Line MILITARY RATE SPECIALS Available May MILITARY RATE Active Retired Military Members included rates For Bahama Cruises approximately per person For Caribbean Cruises approximately per person For Alaskan Cruises approximately per person Night Alaska Cruise departing Vancouver Canada per person per night based double occupancy Cruise fare two Guests Available Sail Date July Disney Wonder Night Alaska Cruise departing Vancouver Canada per person per night based double occupancy...,9
108,2019-05-07T09:59:55.016+03:00,english,"Join us for the inaugural Disney Cruise Line Blog Group Cruise aboard the Disney Dream. Click here for booking details.\nThe overall number of special offers from Disney Cruise Line continues to climb this week. The number of sailings with the MTO rate is up to 40 sail dates with 3 new dates. The Military rate features cruises across the spectrum of upcoming itineraries from The Bahamas, Caribbean, Baja, Alaska, and Europe. The Canadian Resident sailings now feature dates into early 2020. Th...",Special Offers on Disney Cruise Line Sailings as of 5/6/2019,Special Offers Disney Cruise Line Sailings,Join us inaugural Disney Cruise Line Blog Group Cruise aboard Disney Dream Click booking details The overall number special offers Disney Cruise Line continues climb week The number sailings MTO rate sail dates new dates The Military rate features cruises across spectrum upcoming itineraries The Bahamas Caribbean Baja Alaska Europe The Canadian Resident sailings feature dates early The Florida Resident rates includes new July dates Fantasy total dates week The final offer week Disney Cruise ...,Special Offers Disney Cruise Line Sailings Join us inaugural Disney Cruise Line Blog Group Cruise aboard Disney Dream Click booking details The overall number special offers Disney Cruise Line continues climb week The number sailings MTO rate sail dates new dates The Military rate features cruises across spectrum upcoming itineraries The Bahamas Caribbean Baja Alaska Europe The Canadian Resident sailings feature dates early The Florida Resident rates includes new July dates Fantasy total dat...,9
912,2019-05-08T02:40:33.024+03:00,english,"Print\nIf you’re headed to Orlando for the first time this summer, then my top tips for your Walt Disney World Summer Holiday are going to come in very handy! Make sure you save this and my other Disney Tips to your Pinterest boards to help with all of your Disney World planning!\nWe had our first family vacation to Walt Disney World in Orlando, Florida back in 2011 – it seems so long ago now! With a seventh trip under our belts, a Disney Vacation Club purchase at Polynesian Villas & Bungalo...",12 Tips For Your First Walt Disney World Summer Holiday,Tips For Your First Walt Disney World Summer Holiday,Print If youre headed Orlando first time summer top tips Walt Disney World Summer Holiday going come handy Make sure save Disney Tips Pinterest boards help Disney World planning We first family vacation Walt Disney World Orlando Florida back seems long ago With seventh trip belts Disney Vacation Club purchase Polynesian Villas Bungalows I safely say comes planning holiday Walt Disney World I know stuff Tips For Your First Walt Disney World Summer Holiday Know basics I often hear people descr...,Tips For Your First Walt Disney World Summer Holiday Print If youre headed Orlando first time summer top tips Walt Disney World Summer Holiday going come handy Make sure save Disney Tips Pinterest boards help Disney World planning We first family vacation Walt Disney World Orlando Florida back seems long ago With seventh trip belts Disney Vacation Club purchase Polynesian Villas Bungalows I safely say comes planning holiday Walt Disney World I know stuff Tips For Your First Walt Disney World...,7
1896,2019-05-09T02:09:09.017+03:00,english,"Walt Disney World Resort honors top Disney VoluntEARS with $2,500 grants to gift to the nonprofit organizations of their choosing Posted By: Chip on: Print\nWhile Walt Disney World Resort VoluntEARS are recognized year-round for their dedication to serving the Central Florida community, it’s only once a year top Disney VoluntEARS receive the highly sought-after title of “VoluntEARS of the Year.”\nEach year, Walt Disney World Resort honors a VoluntEAR of the Year, VoluntEARS Team of the Year ...","Walt Disney World Resort honors top Disney VoluntEARS with $2,500 grants to gift to the nonprofit organizations of their choosing",Walt Disney World Resort honors top Disney VoluntEARS grants gift nonprofit organizations choosing,Walt Disney World Resort honors top Disney VoluntEARS grants gift nonprofit organizations choosing Posted By Chip Print While Walt Disney World Resort VoluntEARS recognized yearround dedication serving Central Florida community year top Disney VoluntEARS receive highly soughtafter title VoluntEARS Year Each year Walt Disney World Resort honors VoluntEAR Year VoluntEARS Team Year VoluntEAR Family Year taking Central Florida community commitments beyond The VoluntEAR Year VoluntEARS Team Year ...,Walt Disney World Resort honors top Disney VoluntEARS grants gift nonprofit organizations choosing Walt Disney World Resort honors top Disney VoluntEARS grants gift nonprofit organizations choosing Posted By Chip Print While Walt Disney World Resort VoluntEARS recognized yearround dedication serving Central Florida community year top Disney VoluntEARS receive highly soughtafter title VoluntEARS Year Each year Walt Disney World Resort honors VoluntEAR Year VoluntEARS Team Year VoluntEAR Famil...,6
2592,2019-05-09T20:22:02.022+03:00,english,"0\nA 69-year-old great-grandmother was recently arrested at the gates of Disney World in Florida for having CBD oil in her purse, in spite of having a letter from her doctor attesting to the fact that she needed it for her arthritis pain.\nHester Jordan Burkhalter had been using the oil since her doctor recommended it.\n“I have really bad arthritis in my legs, in my arms and in my shoulder,” said Burkhalter. “I use (CBD oil) for the pain because it helps.”\nWhen she was stopped by security o...","Great-grandmother arrested at Disney, jailed for doctor-recommended CBD oil sparks legal firestorm",Greatgrandmother arrested Disney jailed doctorrecommended CBD oil sparks legal firestorm,A yearold greatgrandmother recently arrested gates Disney World Florida CBD oil purse spite letter doctor attesting fact needed arthritis pain Hester Jordan Burkhalter using oil since doctor recommended I really bad arthritis legs arms shoulder said Burkhalter I use CBD oil pain helps When stopped security outside Magic Kingdom family going bagcheck process arrested She Orlando visit Disney parks family onceinalifetime trip planned past two years In notsomagical twist quietspoken grandmother...,Greatgrandmother arrested Disney jailed doctorrecommended CBD oil sparks legal firestorm A yearold greatgrandmother recently arrested gates Disney World Florida CBD oil purse spite letter doctor attesting fact needed arthritis pain Hester Jordan Burkhalter using oil since doctor recommended I really bad arthritis legs arms shoulder said Burkhalter I use CBD oil pain helps When stopped security outside Magic Kingdom family going bagcheck process arrested She Orlando visit Disney parks family ...,6
4,2019-05-07T04:36:07.017+03:00,english,"As Disney’s Hollywood Studios celebrated its 30th anniversary last week in Florida, the theme park is in the midst of a growth spurt. Still bustling from the opening of Toy Story Land last summer, fans are gearing up for the largest expansion in Walt Disney World history, according to Florida Today , which is part of the USA TODAY Network.\nStar Wars Galaxy’s Edge will open at the Florida park with much-anticipated fanfare Aug. 29, following the opening of the new land at Disneyland in Calif...",Disney World Star Wars: Galaxy's Edge: Hollywood Studios looks ahead,Disney World Star Wars Galaxys Edge Hollywood Studios looks ahead,As Disneys Hollywood Studios celebrated th anniversary last week Florida theme park midst growth spurt Still bustling opening Toy Story Land last summer fans gearing largest expansion Walt Disney World history according Florida Today part USA TODAY Network Star Wars Galaxys Edge open Florida park muchanticipated fanfare Aug following opening new land Disneyland California May The acre land transport guests galaxy far far away We want go discover terms said Scott Mallwitz executive creative d...,Disney World Star Wars Galaxys Edge Hollywood Studios looks ahead As Disneys Hollywood Studios celebrated th anniversary last week Florida theme park midst growth spurt Still bustling opening Toy Story Land last summer fans gearing largest expansion Walt Disney World history according Florida Today part USA TODAY Network Star Wars Galaxys Edge open Florida park muchanticipated fanfare Aug following opening new land Disneyland California May The acre land transport guests galaxy far far away ...,5
2305,2019-05-09T12:46:11.004+03:00,english,"Tags: Disney Citizenship , Disney VoluntEARS , Walt Disney World\nWalt Disney World recently recognized their top Disney VoluntEARS by honoring these individuals and groups with the VoluntEAR of the Year Awards. In addition to receiving the awards, the top VoluntEARS also received a $2,500 grant to gift to a nonprofit organization of their choosing.\nWhat’s happening: In celebration of hard work and dedication to their community, Walt Disney World Resort has named their 2019 Disney VoluntEAR...",Walt Disney World Recognizes 2019 Disney VoluntEARS of the Year,Walt Disney World Recognizes Disney VoluntEARS Year,Tags Disney Citizenship Disney VoluntEARS Walt Disney World Walt Disney World recently recognized top Disney VoluntEARS honoring individuals groups VoluntEAR Year Awards In addition receiving awards top VoluntEARS also received grant gift nonprofit organization choosing Whats happening In celebration hard work dedication community Walt Disney World Resort named Disney VoluntEARS Disney recently recognized VoluntEAR Year VoluntEARS Team Year VoluntEAR Family Year taking Central Florida commun...,Walt Disney World Recognizes Disney VoluntEARS Year Tags Disney Citizenship Disney VoluntEARS Walt Disney World Walt Disney World recently recognized top Disney VoluntEARS honoring individuals groups VoluntEAR Year Awards In addition receiving awards top VoluntEARS also received grant gift nonprofit organization choosing Whats happening In celebration hard work dedication community Walt Disney World Resort named Disney VoluntEARS Disney recently recognized VoluntEAR Year VoluntEARS Team Year...,5
2067,2019-05-09T05:32:32.000+03:00,english,"By Morgan Sung 2019-05-09 01:25:09 UTC\nA family trip to Disney World came to a halt when a great-grandmother was arrested for carrying CBD oil, which her doctor recommended to ease her arthritis.\nHester Burkhalter, 69, was arrested on Apr. 15 and charged with felony possession of hashish. The Tampa Bay Times reports that Burkhalter was stopped at a bag check just outside of Magic Kingdom that morning, and Disney security found her 1-ounce bottle of peppermint-flavored CBD tincture. In phot...",Woman arrested for bringing doctor-recommended CBD oil to Disney World,Woman arrested bringing doctorrecommended CBD oil Disney World,By Morgan Sung UTC A family trip Disney World came halt greatgrandmother arrested carrying CBD oil doctor recommended ease arthritis Hester Burkhalter arrested Apr charged felony possession hashish The Tampa Bay Times reports Burkhalter stopped bag check outside Magic Kingdom morning Disney security found ounce bottle peppermintflavored CBD tincture In photos obtained Orlandos Fox bottle labeled mg CBD mg THC I really bad arthritis legs arms shoulder Burkhalter told Fox I use pain helps Acco...,Woman arrested bringing doctorrecommended CBD oil Disney World By Morgan Sung UTC A family trip Disney World came halt greatgrandmother arrested carrying CBD oil doctor recommended ease arthritis Hester Burkhalter arrested Apr charged felony possession hashish The Tampa Bay Times reports Burkhalter stopped bag check outside Magic Kingdom morning Disney security found ounce bottle peppermintflavored CBD tincture In photos obtained Orlandos Fox bottle labeled mg CBD mg THC I really bad arthrit...,5
2832,2019-05-10T01:37:14.033+03:00,english,"Photo: Getty Images Like every single live-action Disney movie out this year, measles is making the comeback no one asked for. Already, there have been more than 700 cases of the viral and vaccine-preventable disease reported in the U.S. this year—a record high since it was eradicated in the country nearly 20 years ago.\nThings could get much worse , though. And according to new research published Thursday, there are certain U.S. counties where measles could still hit hard this year.\nAdvert...","Like every single live-action Disney movie out this year, me",Like every single liveaction Disney movie year,Photo Getty Images Like every single liveaction Disney movie year measles making comeback one asked Already cases viral vaccinepreventable disease reported US yeara record high since eradicated country nearly years ago Things could get much worse though And according new research published Thursday certain US counties measles could still hit hard year Advertisement Measles longer found naturally US thanks largely mandatory vaccination program made impossible disease maintain constant foothol...,Like every single liveaction Disney movie year Photo Getty Images Like every single liveaction Disney movie year measles making comeback one asked Already cases viral vaccinepreventable disease reported US yeara record high since eradicated country nearly years ago Things could get much worse though And according new research published Thursday certain US counties measles could still hit hard year Advertisement Measles longer found naturally US thanks largely mandatory vaccination program ma...,5


In [20]:
top_10_non_disney_relevant

Unnamed: 0,crawled,language,text,title,clean_title,clean_text,clean_title_text,florida_count
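
The empty frame above is consistent with every document in this corpus mentioning "Disney" somewhere in `clean_title_text`, so the exclusion filter leaves nothing to rank. The following is a minimal sketch of the same "Florida but not Disney" query on a hypothetical toy frame (the rows are invented for illustration and are not from the dataset). It assumes two refinements that the cells above do not make: matching "florida" as a whole word, and requiring at least one mention before ranking.

```python
import pandas as pd

# Hypothetical toy corpus; only the column name mirrors the notebook.
toy = pd.DataFrame({
    "clean_title_text": [
        "Florida beaches reopen as Florida tourism rebounds",
        "Walt Disney World Florida resident rates",
        "Measles cases rise across US counties",
        "Floridas governor signs bill",  # 'Floridas' is not the whole word 'florida'
    ]
})

# Whole-word, case-insensitive count of "florida" (assumption: word-boundary
# matching is wanted; the notebook counts plain substrings instead).
toy["florida_count"] = toy["clean_title_text"].str.count(r"(?i)\bflorida\b")

# Keep documents that mention Florida at least once and never mention Disney.
mask = (toy["florida_count"] > 0) & ~toy["clean_title_text"].str.contains(
    "disney", case=False, na=False
)

# Rank by count and keep at most 10 rows.
result = toy[mask].sort_values("florida_count", ascending=False).head(10)
print(result)
```

On the real data, the same mask applied to `df` would presumably still return an empty frame if all documents mention Disney; the sketch only shows that the filter logic behaves as intended when non-Disney documents exist.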


In [25]:
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Combine the title and text for analysis (missing values become empty strings)
df['combined'] = df['title'].fillna('') + " " + df['text'].fillna('')

# Vectorize the text using TF-IDF
vectorizer = TfidfVectorizer(stop_words='english')
X = vectorizer.fit_transform(df['combined'])

# Cluster the text using KMeans
n_clusters = 5
kmeans = KMeans(n_clusters=n_clusters, random_state=42)
df['cluster'] = kmeans.fit_predict(X)

# Print summarized themes and text snippets for each cluster
for cluster in range(n_clusters):
    cluster_titles = df[df['cluster'] == cluster]['title'][:3].tolist()  # Get first 3 titles
    cluster_texts = df[df['cluster'] == cluster]['text'][:3].tolist()    # Get first 3 text snippets
    
    print(f"Cluster {cluster}:")
    print("Main Themes:")
    for title in cluster_titles:
        print(f"- {title}")
    
    print("\nText Snippets:")
    for text in cluster_texts:
        print(f"- {text[:30]}...")  # Print only the first 30 characters of each text snippet
    print("\n" + "="*50 + "\n")

Cluster 0:
Main Themes:
- Investors Sell Walt Disney (DIS) on Strength (DIS)
- Walt Disney Co (DIS) Expected to Post Earnings of $1.58 Per Share
- Disney Is Set To Finalize The Last Piece Of Its Fox Acquisition

Text Snippets:
- Investors Sell Walt Disney (DI...
- Walt Disney Co (DIS) Expected ...
- As part of the approval for Di...


Cluster 1:
Main Themes:
- The Most Useful Things I Bring to Disney
- Walt Disney World 50th Rumored Plans
- XFL Strikes Deal with Fox and Disney

Text Snippets:
- No comments
Packing can be str...
- I couldn't find another thread...
- 05-06-2019, 01:01 PM Here we g...


Cluster 2:
Main Themes:
- Disney-Fox Updates Release Schedule: Sets Three Untitled ‘Star Wars’ Movies, ‘New Mutants’ Heads To 2020, ‘Ad Astra’ To Open Fall & More
- New Star Wars trilogy starting Christmas 2022, Disney confirms - CNET
- New Star Wars Release Dates Unveiled by Disney!

Text Snippets:
- Tumblr Disney
Following their ...
- Star Wars Episode 9: The Rise ...
- 
BEGIN SLIDESHOW 