<center><font size=8>Hands-on - Analyzing Text Data</font></center>

# **Problem Statement**

## **Business Context**

In the rapidly evolving landscape of the entertainment industry, understanding audience feedback through movie reviews is essential for refining content and shaping marketing strategies. However, the sheer volume of reviews presents challenges in efficiently processing and analyzing this information. To remain competitive, entertainment companies must find effective ways to clean and structure this data, enabling them to derive valuable insights for enhancing viewer experiences and making informed decisions.

## **Objective**

As a data scientist, your objective is to develop an efficient text preprocessing pipeline that will clean and structure a dataset of movie reviews. This preprocessing step will ensure that the data is standardized and ready for further analysis, ultimately supporting the identification of trends and insights that can drive content and marketing strategies in the entertainment industry.

In [None]:
!pip install pandas==2.2.2 numpy==2.0.2 nltk==3.9.1 scikit-learn==1.6.1

# In Jupyter or Google Colab, the %pip magic installs into the environment of the running kernel, so it can be used instead:
# %pip install pandas==2.2.2 numpy==2.0.2 nltk==3.9.1 scikit-learn==1.6.1




In [39]:
# to read and manipulate the data
import pandas as pd
import numpy as np

# to work with regular expressions
import re

# to remove stopwords and stem words
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
nltk.download('wordnet')
from nltk.stem.porter import PorterStemmer

# to build a bag-of-words representation
from sklearn.feature_extraction.text import CountVectorizer

# showing the full content of each column instead of truncating it
pd.set_option('display.max_colwidth', None)

[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/vishalkhapre/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     /Users/vishalkhapre/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


In [40]:
# uncomment and run the lines below if the dataset is stored in Google Drive
# from google.colab import drive
# drive.mount('/content/drive')

In [41]:
# loading data into a pandas dataframe
reviews = pd.read_csv("movie_reviews.csv")

In [42]:
# creating a copy of the data
data = reviews.copy()

In [43]:
data.head(5)

Unnamed: 0,review
0,"Okay, I know this does'nt project India in a good light. But the overall theme of the movie is not India, it's Shakti. The power of a warlord, and the power of a mother. The relationship between Nandini and her husband and son swallow you up in their warmth. Then things go terribly wrong. The interaction between Nandini and her father in law - the power of their dysfunctional relationship - and the lives changed by it are the strengths of this movie. Shah Rukh Khan's performance seems to be a mere cameo compared to the believable desperation of Karisma Kapoor. It is easy to get caught up in the love, violence and redemption of lives in this film, and find yourself heaving a sigh of relief and sadness at the climax. The musical interludes are strengths, believable and well done."
1,"Despite John Travolta's statements in interviews that this was his favorite role of his career, ""Be Cool"" proves to be a disappointing sequel to 1995's witty and clever ""Get Shorty.""<br /><br />Travolta delivers a pleasant enough performance in this mildly entertaining film, but ultimately the movie falls flat due to an underdeveloped plot, unlikeable characters, and a surprising lack of chemistry between leads Travolta and Uma Thurman. Although there are some laughs, this unfunny dialog example (which appeared frequently in the trailers) kind of says it all: Thurman: Do you dance? Travolta: Hey, I'm from Brooklyn.<br /><br />The film suggests that everyone in the entertainment business is a gangster or aspires to be one, likening it to organized crime. In ""Get Shorty,"" the premise of a gangster ""going legitimate"" by getting into movies was a clever fish-out-of water idea, but in ""Be Cool,"" it seems the biz has entirely gone crooked since then.<br /><br />The film is interestingly casted and the absolute highlight is a ""monolgue"" delivered by The Rock, whose character is an aspiring actor as well as a goon, where he reenacts a scene between Gabrielle Union and Kirsten Dunst from ""Bring It On."" Vince Vaughan's character thinks he's black and he's often seen dressed as a pimp-- this was quite funny in the first scene that introduces him and gets tired and embarrassing almost immediately afterward.<br /><br />Overall, ""Be Cool"" may be worth a rental for John Travolta die-hards (of which I am one), but you may want to keep your finger close to the fast forward button to get through it without feeling that you wasted too much time. Fans of ""Get Shorty"" may actually wish to avoid this, as the sequel is devoid of most things that made that one a winner. I rate this movie an admittedly harsh 4/10."
2,"I am a kung fu fan, but not a Woo fan. I have no interest in gangster movies filled with over-the-top gun-play. Now, martial arts; *that's* beautiful! And John Woo surprised me here by producing a highly entertaining kung fu movie, which almost has *too much* fighting, if such a thing is possible! This is good stuff.<br /><br />Many of the fight scenes are very good (and some of them are less good), and the main characters are amusing and likable. The bad guys are a bit too unbelievably evil, but entertaining none the less. You gotta see the Sleeping Wizard!! He can only fight when he's asleep - it's hysterical!<br /><br />Upon repeated viewings, however, Last Hurrah For Chivalry can tend to get a little boring and long-winded, also especially because many of the fight scenes are actually not that good. Hence, I rate it ""only"" a 7 out of 10. But it really is almost an ""8"".<br /><br />All in all one of the better kung fu movies, made smack-dab in the heart of kung fu cinema's prime. All the really good kung fu movies are from the mid- to late 1970ies, with some notable exceptions from the late '60ies and early '70ies (and early '80ies, to be fair)."
3,"He seems to be a control freak. I have heard him comment on ""losing control of the show"" and tell another guest who brought live animals that he had one rule-""no snakes."" He needs to hire a comedy writer because his jokes are lame. The only reason I watch him is because he some some great guests and bands. <br /><br />I watched the Craig Ferguson show for a while but his show is even worse. He likes to bull sh** to burn time.I don't think either man has much of a future in late night talk shows.<br /><br />Daily also has the annoying habit of sticking his tongue out to lick his lips. He must do this at least 10 times a show. I do like the Joe Firstman band. Carson Daily needs to lighten up before it is too late."
4,"Admittedly, there are some scenes in this movie that seem a little unrealistic. The ravishing woman first panics and then, only a few minutes later, she starts kissing the young lad while the old guy is right next to her. But as the film goes along we learn that she is a little volatile girl (or slut) and that partly explains her behavior. The cinematography of this movie is well done. We get to see the elevator from almost every angle and perspective, and some of those images and scenes really raise the tension. Götz George plays his character well, a wannabe hot-shot getting old and being overpowered by young men like the Jaennicke character. Wolfgang Kieling who I admired in Hitchcock's THE TORN CURTAIN delivers a great performance that, although he doesn't say much, he is by far the best actor in this play. One critic complained about how unrealistic the film was and that in a real case of emergency nothing would really happen. But then again, how realistic are films such as Mission impossible or Phone Booth. Given the fact that we are talking about a movie here, and that in a movie you always have to deal with some scenes that aren't very likely to occur in real life, you can still enjoy this movie. It's a lot better than many things that I see on German TV these days and I think that the vintage 80's style added something to this film."


In [44]:
data.shape

(10000, 1)

In [45]:
data.isnull().sum()

review    0
dtype: int64

In [46]:
# checking for duplicate values
data.duplicated().sum()

np.int64(18)

In [47]:
# keeping only the first occurrence of duplicate values and dropping the rest
data = data.drop_duplicates(keep = 'first')

In [48]:
# resetting the index of the dataframe
data = data.reset_index(drop = True)
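
The reason for resetting the index after dropping duplicates is easier to see on a small example. The sketch below uses a made-up four-row DataFrame (`toy` and its contents are illustrative assumptions, not part of the dataset) and records the index labels before and after the reset.

```python
import pandas as pd

# hypothetical toy data: the third row is an exact duplicate of the second
toy = pd.DataFrame({"review": ["great film", "dull plot", "dull plot", "fine cast"]})

# duplicated() flags every repeat of an earlier row
n_duplicates = int(toy.duplicated().sum())

# drop_duplicates keeps the first occurrence, but the original labels survive,
# which leaves a gap where the duplicate used to be
deduped = toy.drop_duplicates(keep="first")
labels_before_reset = deduped.index.tolist()

# reset_index(drop=True) renumbers the rows from 0 and discards the old labels
deduped = deduped.reset_index(drop=True)
labels_after_reset = deduped.index.tolist()

print(n_duplicates, labels_before_reset, labels_after_reset)
```

Without the reset, a label-based lookup such as `data['review'][2]` could fail on a row that was dropped.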

In [49]:
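# taking the first review as an example
review = data['review'][0]
# pattern matching any single lowercase letter
pattern = r'[a-z]'

# replacing every match with a space (re.sub already returns a string)
cleaned_review = re.sub(pattern, ' ', review)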

print(review)
print(cleaned_review)

Okay, I know this does'nt project India in a good light. But the overall theme of the movie is not India, it's Shakti. The power of a warlord, and the power of a mother. The relationship between Nandini and her husband and son swallow you up in their warmth. Then things go terribly wrong. The interaction between Nandini and her father in law - the power of their dysfunctional relationship - and the lives changed by it are the strengths of this movie. Shah Rukh Khan's performance seems to be a mere cameo compared to the believable desperation of Karisma Kapoor. It is easy to get caught up in the love, violence and redemption of lives in this film, and find yourself heaving a sigh of relief and sadness at the climax. The musical interludes are strengths, believable and well done.
O   , I               '           I                    . B                                         I    ,   '  S     . T                     ,                          . T                        N               
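
In the output above, every lowercase letter has been replaced by its own space, so the result is as long as the input. A minimal sketch on a short made-up string (`sample` is an assumption, not taken from the dataset) contrasts this with a negated character class followed by `+`, the pattern used in the next step.

```python
import re

# hypothetical sample string, not taken from the dataset
sample = "It's a 10/10 film!"

# [a-z] matches one lowercase letter at a time; each match becomes one space,
# so the output has the same length as the input
only_lower_removed = re.sub(r'[a-z]', ' ', sample)

# [^A-Za-z0-9]+ matches a whole run of non-alphanumeric characters;
# each run collapses into a single space
specials_removed = re.sub(r'[^A-Za-z0-9]+', ' ', sample)

print(repr(only_lower_removed))
print(repr(specials_removed))
```

The `^` inside the brackets negates the class, and the `+` makes a run of punctuation and spaces count as one match rather than several.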

In [50]:
# defining a function to remove special characters
def remove_special_characters(text):
    # Defining the regex pattern to match runs of non-alphanumeric characters
    pattern = r'[^A-Za-z0-9]+'

    # Replacing each run of non-alphanumeric characters with a single space
    new_text = re.sub(pattern, ' ', text)

    return new_text

In [51]:
# Applying the function to remove special characters
data['cleaned_text'] = data['review'].apply(remove_special_characters)

In [52]:
# checking a couple of instances of cleaned data
data.loc[0:3, ['review','cleaned_text']]

Unnamed: 0,review,cleaned_text
0,"Okay, I know this does'nt project India in a good light. But the overall theme of the movie is not India, it's Shakti. The power of a warlord, and the power of a mother. The relationship between Nandini and her husband and son swallow you up in their warmth. Then things go terribly wrong. The interaction between Nandini and her father in law - the power of their dysfunctional relationship - and the lives changed by it are the strengths of this movie. Shah Rukh Khan's performance seems to be a mere cameo compared to the believable desperation of Karisma Kapoor. It is easy to get caught up in the love, violence and redemption of lives in this film, and find yourself heaving a sigh of relief and sadness at the climax. The musical interludes are strengths, believable and well done.",Okay I know this does nt project India in a good light But the overall theme of the movie is not India it s Shakti The power of a warlord and the power of a mother The relationship between Nandini and her husband and son swallow you up in their warmth Then things go terribly wrong The interaction between Nandini and her father in law the power of their dysfunctional relationship and the lives changed by it are the strengths of this movie Shah Rukh Khan s performance seems to be a mere cameo compared to the believable desperation of Karisma Kapoor It is easy to get caught up in the love violence and redemption of lives in this film and find yourself heaving a sigh of relief and sadness at the climax The musical interludes are strengths believable and well done
1,"Despite John Travolta's statements in interviews that this was his favorite role of his career, ""Be Cool"" proves to be a disappointing sequel to 1995's witty and clever ""Get Shorty.""<br /><br />Travolta delivers a pleasant enough performance in this mildly entertaining film, but ultimately the movie falls flat due to an underdeveloped plot, unlikeable characters, and a surprising lack of chemistry between leads Travolta and Uma Thurman. Although there are some laughs, this unfunny dialog example (which appeared frequently in the trailers) kind of says it all: Thurman: Do you dance? Travolta: Hey, I'm from Brooklyn.<br /><br />The film suggests that everyone in the entertainment business is a gangster or aspires to be one, likening it to organized crime. In ""Get Shorty,"" the premise of a gangster ""going legitimate"" by getting into movies was a clever fish-out-of water idea, but in ""Be Cool,"" it seems the biz has entirely gone crooked since then.<br /><br />The film is interestingly casted and the absolute highlight is a ""monolgue"" delivered by The Rock, whose character is an aspiring actor as well as a goon, where he reenacts a scene between Gabrielle Union and Kirsten Dunst from ""Bring It On."" Vince Vaughan's character thinks he's black and he's often seen dressed as a pimp-- this was quite funny in the first scene that introduces him and gets tired and embarrassing almost immediately afterward.<br /><br />Overall, ""Be Cool"" may be worth a rental for John Travolta die-hards (of which I am one), but you may want to keep your finger close to the fast forward button to get through it without feeling that you wasted too much time. Fans of ""Get Shorty"" may actually wish to avoid this, as the sequel is devoid of most things that made that one a winner. 
I rate this movie an admittedly harsh 4/10.",Despite John Travolta s statements in interviews that this was his favorite role of his career Be Cool proves to be a disappointing sequel to 1995 s witty and clever Get Shorty br br Travolta delivers a pleasant enough performance in this mildly entertaining film but ultimately the movie falls flat due to an underdeveloped plot unlikeable characters and a surprising lack of chemistry between leads Travolta and Uma Thurman Although there are some laughs this unfunny dialog example which appeared frequently in the trailers kind of says it all Thurman Do you dance Travolta Hey I m from Brooklyn br br The film suggests that everyone in the entertainment business is a gangster or aspires to be one likening it to organized crime In Get Shorty the premise of a gangster going legitimate by getting into movies was a clever fish out of water idea but in Be Cool it seems the biz has entirely gone crooked since then br br The film is interestingly casted and the absolute highlight is a monolgue delivered by The Rock whose character is an aspiring actor as well as a goon where he reenacts a scene between Gabrielle Union and Kirsten Dunst from Bring It On Vince Vaughan s character thinks he s black and he s often seen dressed as a pimp this was quite funny in the first scene that introduces him and gets tired and embarrassing almost immediately afterward br br Overall Be Cool may be worth a rental for John Travolta die hards of which I am one but you may want to keep your finger close to the fast forward button to get through it without feeling that you wasted too much time Fans of Get Shorty may actually wish to avoid this as the sequel is devoid of most things that made that one a winner I rate this movie an admittedly harsh 4 10
2,"I am a kung fu fan, but not a Woo fan. I have no interest in gangster movies filled with over-the-top gun-play. Now, martial arts; *that's* beautiful! And John Woo surprised me here by producing a highly entertaining kung fu movie, which almost has *too much* fighting, if such a thing is possible! This is good stuff.<br /><br />Many of the fight scenes are very good (and some of them are less good), and the main characters are amusing and likable. The bad guys are a bit too unbelievably evil, but entertaining none the less. You gotta see the Sleeping Wizard!! He can only fight when he's asleep - it's hysterical!<br /><br />Upon repeated viewings, however, Last Hurrah For Chivalry can tend to get a little boring and long-winded, also especially because many of the fight scenes are actually not that good. Hence, I rate it ""only"" a 7 out of 10. But it really is almost an ""8"".<br /><br />All in all one of the better kung fu movies, made smack-dab in the heart of kung fu cinema's prime. 
All the really good kung fu movies are from the mid- to late 1970ies, with some notable exceptions from the late '60ies and early '70ies (and early '80ies, to be fair).",I am a kung fu fan but not a Woo fan I have no interest in gangster movies filled with over the top gun play Now martial arts that s beautiful And John Woo surprised me here by producing a highly entertaining kung fu movie which almost has too much fighting if such a thing is possible This is good stuff br br Many of the fight scenes are very good and some of them are less good and the main characters are amusing and likable The bad guys are a bit too unbelievably evil but entertaining none the less You gotta see the Sleeping Wizard He can only fight when he s asleep it s hysterical br br Upon repeated viewings however Last Hurrah For Chivalry can tend to get a little boring and long winded also especially because many of the fight scenes are actually not that good Hence I rate it only a 7 out of 10 But it really is almost an 8 br br All in all one of the better kung fu movies made smack dab in the heart of kung fu cinema s prime All the really good kung fu movies are from the mid to late 1970ies with some notable exceptions from the late 60ies and early 70ies and early 80ies to be fair
3,"He seems to be a control freak. I have heard him comment on ""losing control of the show"" and tell another guest who brought live animals that he had one rule-""no snakes."" He needs to hire a comedy writer because his jokes are lame. The only reason I watch him is because he some some great guests and bands. <br /><br />I watched the Craig Ferguson show for a while but his show is even worse. He likes to bull sh** to burn time.I don't think either man has much of a future in late night talk shows.<br /><br />Daily also has the annoying habit of sticking his tongue out to lick his lips. He must do this at least 10 times a show. I do like the Joe Firstman band. Carson Daily needs to lighten up before it is too late.",He seems to be a control freak I have heard him comment on losing control of the show and tell another guest who brought live animals that he had one rule no snakes He needs to hire a comedy writer because his jokes are lame The only reason I watch him is because he some some great guests and bands br br I watched the Craig Ferguson show for a while but his show is even worse He likes to bull sh to burn time I don t think either man has much of a future in late night talk shows br br Daily also has the annoying habit of sticking his tongue out to lick his lips He must do this at least 10 times a show I do like the Joe Firstman band Carson Daily needs to lighten up before it is too late
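
The cleaned text above still contains stray `br` tokens, because the letters inside each `<br />` tag are alphanumeric and survive the substitution. One possible remedy, which is not part of the original notebook, is to strip the tags before removing special characters. In the sketch below, `remove_html_tags` is a hypothetical helper, and its pattern `<[^>]+>` is a simple heuristic rather than a full HTML parser.

```python
import re

def remove_html_tags(text):
    # hypothetical helper: replace anything that looks like <...> with a space;
    # a simple heuristic, not a full HTML parser
    return re.sub(r'<[^>]+>', ' ', text)

def remove_special_characters(text):
    # same logic as the notebook's function above
    return re.sub(r'[^A-Za-z0-9]+', ' ', text)

# excerpt from the third review shown above
sample = 'This is good stuff.<br /><br />Many of the fight scenes are very good'

without_tag_step = remove_special_characters(sample)
with_tag_step = remove_special_characters(remove_html_tags(sample))

print(without_tag_step)
print(with_tag_step)
```

If this step were adopted, it would have to run before `remove_special_characters`, since the angle brackets are gone afterwards.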


In [53]:
# changing the case of the text data to lower case
data['cleaned_text'] = data['cleaned_text'].str.lower()
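
Lowercasing matters once words are counted, because `The` and `the` would otherwise be treated as different tokens. The sketch below illustrates this with the standard library's `Counter` on a short phrase in the style of the first review; the variable names are illustrative assumptions.

```python
from collections import Counter

# short phrase in the style of the cleaned first review
text = "The power of a warlord and the power of a mother"

# without lowercasing, "The" and "the" are counted as different tokens
counts_raw = Counter(text.split())

# after lowercasing, both spellings fall into one bucket
counts_lower = Counter(text.lower().split())

print(counts_raw["the"], counts_lower["the"])
```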

In [54]:
# checking a couple of instances of cleaned data
data.loc[0:3, ['review','cleaned_text']]

Unnamed: 0,review,cleaned_text
0,"Okay, I know this does'nt project India in a good light. But the overall theme of the movie is not India, it's Shakti. The power of a warlord, and the power of a mother. The relationship between Nandini and her husband and son swallow you up in their warmth. Then things go terribly wrong. The interaction between Nandini and her father in law - the power of their dysfunctional relationship - and the lives changed by it are the strengths of this movie. Shah Rukh Khan's performance seems to be a mere cameo compared to the believable desperation of Karisma Kapoor. It is easy to get caught up in the love, violence and redemption of lives in this film, and find yourself heaving a sigh of relief and sadness at the climax. The musical interludes are strengths, believable and well done.",okay i know this does nt project india in a good light but the overall theme of the movie is not india it s shakti the power of a warlord and the power of a mother the relationship between nandini and her husband and son swallow you up in their warmth then things go terribly wrong the interaction between nandini and her father in law the power of their dysfunctional relationship and the lives changed by it are the strengths of this movie shah rukh khan s performance seems to be a mere cameo compared to the believable desperation of karisma kapoor it is easy to get caught up in the love violence and redemption of lives in this film and find yourself heaving a sigh of relief and sadness at the climax the musical interludes are strengths believable and well done
1,"Despite John Travolta's statements in interviews that this was his favorite role of his career, ""Be Cool"" proves to be a disappointing sequel to 1995's witty and clever ""Get Shorty.""<br /><br />Travolta delivers a pleasant enough performance in this mildly entertaining film, but ultimately the movie falls flat due to an underdeveloped plot, unlikeable characters, and a surprising lack of chemistry between leads Travolta and Uma Thurman. Although there are some laughs, this unfunny dialog example (which appeared frequently in the trailers) kind of says it all: Thurman: Do you dance? Travolta: Hey, I'm from Brooklyn.<br /><br />The film suggests that everyone in the entertainment business is a gangster or aspires to be one, likening it to organized crime. In ""Get Shorty,"" the premise of a gangster ""going legitimate"" by getting into movies was a clever fish-out-of water idea, but in ""Be Cool,"" it seems the biz has entirely gone crooked since then.<br /><br />The film is interestingly casted and the absolute highlight is a ""monolgue"" delivered by The Rock, whose character is an aspiring actor as well as a goon, where he reenacts a scene between Gabrielle Union and Kirsten Dunst from ""Bring It On."" Vince Vaughan's character thinks he's black and he's often seen dressed as a pimp-- this was quite funny in the first scene that introduces him and gets tired and embarrassing almost immediately afterward.<br /><br />Overall, ""Be Cool"" may be worth a rental for John Travolta die-hards (of which I am one), but you may want to keep your finger close to the fast forward button to get through it without feeling that you wasted too much time. Fans of ""Get Shorty"" may actually wish to avoid this, as the sequel is devoid of most things that made that one a winner. 
I rate this movie an admittedly harsh 4/10.",despite john travolta s statements in interviews that this was his favorite role of his career be cool proves to be a disappointing sequel to 1995 s witty and clever get shorty br br travolta delivers a pleasant enough performance in this mildly entertaining film but ultimately the movie falls flat due to an underdeveloped plot unlikeable characters and a surprising lack of chemistry between leads travolta and uma thurman although there are some laughs this unfunny dialog example which appeared frequently in the trailers kind of says it all thurman do you dance travolta hey i m from brooklyn br br the film suggests that everyone in the entertainment business is a gangster or aspires to be one likening it to organized crime in get shorty the premise of a gangster going legitimate by getting into movies was a clever fish out of water idea but in be cool it seems the biz has entirely gone crooked since then br br the film is interestingly casted and the absolute highlight is a monolgue delivered by the rock whose character is an aspiring actor as well as a goon where he reenacts a scene between gabrielle union and kirsten dunst from bring it on vince vaughan s character thinks he s black and he s often seen dressed as a pimp this was quite funny in the first scene that introduces him and gets tired and embarrassing almost immediately afterward br br overall be cool may be worth a rental for john travolta die hards of which i am one but you may want to keep your finger close to the fast forward button to get through it without feeling that you wasted too much time fans of get shorty may actually wish to avoid this as the sequel is devoid of most things that made that one a winner i rate this movie an admittedly harsh 4 10
2,"I am a kung fu fan, but not a Woo fan. I have no interest in gangster movies filled with over-the-top gun-play. Now, martial arts; *that's* beautiful! And John Woo surprised me here by producing a highly entertaining kung fu movie, which almost has *too much* fighting, if such a thing is possible! This is good stuff.<br /><br />Many of the fight scenes are very good (and some of them are less good), and the main characters are amusing and likable. The bad guys are a bit too unbelievably evil, but entertaining none the less. You gotta see the Sleeping Wizard!! He can only fight when he's asleep - it's hysterical!<br /><br />Upon repeated viewings, however, Last Hurrah For Chivalry can tend to get a little boring and long-winded, also especially because many of the fight scenes are actually not that good. Hence, I rate it ""only"" a 7 out of 10. But it really is almost an ""8"".<br /><br />All in all one of the better kung fu movies, made smack-dab in the heart of kung fu cinema's prime. 
All the really good kung fu movies are from the mid- to late 1970ies, with some notable exceptions from the late '60ies and early '70ies (and early '80ies, to be fair).",i am a kung fu fan but not a woo fan i have no interest in gangster movies filled with over the top gun play now martial arts that s beautiful and john woo surprised me here by producing a highly entertaining kung fu movie which almost has too much fighting if such a thing is possible this is good stuff br br many of the fight scenes are very good and some of them are less good and the main characters are amusing and likable the bad guys are a bit too unbelievably evil but entertaining none the less you gotta see the sleeping wizard he can only fight when he s asleep it s hysterical br br upon repeated viewings however last hurrah for chivalry can tend to get a little boring and long winded also especially because many of the fight scenes are actually not that good hence i rate it only a 7 out of 10 but it really is almost an 8 br br all in all one of the better kung fu movies made smack dab in the heart of kung fu cinema s prime all the really good kung fu movies are from the mid to late 1970ies with some notable exceptions from the late 60ies and early 70ies and early 80ies to be fair
3,"He seems to be a control freak. I have heard him comment on ""losing control of the show"" and tell another guest who brought live animals that he had one rule-""no snakes."" He needs to hire a comedy writer because his jokes are lame. The only reason I watch him is because he some some great guests and bands. <br /><br />I watched the Craig Ferguson show for a while but his show is even worse. He likes to bull sh** to burn time.I don't think either man has much of a future in late night talk shows.<br /><br />Daily also has the annoying habit of sticking his tongue out to lick his lips. He must do this at least 10 times a show. I do like the Joe Firstman band. Carson Daily needs to lighten up before it is too late.",he seems to be a control freak i have heard him comment on losing control of the show and tell another guest who brought live animals that he had one rule no snakes he needs to hire a comedy writer because his jokes are lame the only reason i watch him is because he some some great guests and bands br br i watched the craig ferguson show for a while but his show is even worse he likes to bull sh to burn time i don t think either man has much of a future in late night talk shows br br daily also has the annoying habit of sticking his tongue out to lick his lips he must do this at least 10 times a show i do like the joe firstman band carson daily needs to lighten up before it is too late


In [55]:
# removing leading and trailing whitespace from the text
data['cleaned_text'] = data['cleaned_text'].str.strip()
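
Note that `str.strip()` only trims the two ends of each string. In this notebook, runs of special characters have already been collapsed into single spaces, so trimming is enough. For text where repeated internal whitespace may remain, a sketch along the following lines could be used; `normalize_whitespace` is a hypothetical helper, not part of the original notebook.

```python
import re

def normalize_whitespace(text):
    # hypothetical helper: collapse every run of whitespace into one space,
    # then trim the ends
    return re.sub(r'\s+', ' ', text).strip()

# made-up sample with leading, trailing, and repeated internal whitespace
sample = "  great   film \n overall "

stripped_only = sample.strip()
normalized = normalize_whitespace(sample)

print(repr(stripped_only))
print(repr(normalized))
```

On a pandas column, the same effect should be achievable with `.str.replace(r'\s+', ' ', regex=True).str.strip()`.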

In [56]:
# checking a couple of instances of cleaned data
data.loc[0:3, ['review','cleaned_text']]

Unnamed: 0,review,cleaned_text
0,"Okay, I know this does'nt project India in a good light. But the overall theme of the movie is not India, it's Shakti. The power of a warlord, and the power of a mother. The relationship between Nandini and her husband and son swallow you up in their warmth. Then things go terribly wrong. The interaction between Nandini and her father in law - the power of their dysfunctional relationship - and the lives changed by it are the strengths of this movie. Shah Rukh Khan's performance seems to be a mere cameo compared to the believable desperation of Karisma Kapoor. It is easy to get caught up in the love, violence and redemption of lives in this film, and find yourself heaving a sigh of relief and sadness at the climax. The musical interludes are strengths, believable and well done.",okay i know this does nt project india in a good light but the overall theme of the movie is not india it s shakti the power of a warlord and the power of a mother the relationship between nandini and her husband and son swallow you up in their warmth then things go terribly wrong the interaction between nandini and her father in law the power of their dysfunctional relationship and the lives changed by it are the strengths of this movie shah rukh khan s performance seems to be a mere cameo compared to the believable desperation of karisma kapoor it is easy to get caught up in the love violence and redemption of lives in this film and find yourself heaving a sigh of relief and sadness at the climax the musical interludes are strengths believable and well done
1,"Despite John Travolta's statements in interviews that this was his favorite role of his career, ""Be Cool"" proves to be a disappointing sequel to 1995's witty and clever ""Get Shorty.""<br /><br />Travolta delivers a pleasant enough performance in this mildly entertaining film, but ultimately the movie falls flat due to an underdeveloped plot, unlikeable characters, and a surprising lack of chemistry between leads Travolta and Uma Thurman. Although there are some laughs, this unfunny dialog example (which appeared frequently in the trailers) kind of says it all: Thurman: Do you dance? Travolta: Hey, I'm from Brooklyn.<br /><br />The film suggests that everyone in the entertainment business is a gangster or aspires to be one, likening it to organized crime. In ""Get Shorty,"" the premise of a gangster ""going legitimate"" by getting into movies was a clever fish-out-of water idea, but in ""Be Cool,"" it seems the biz has entirely gone crooked since then.<br /><br />The film is interestingly casted and the absolute highlight is a ""monolgue"" delivered by The Rock, whose character is an aspiring actor as well as a goon, where he reenacts a scene between Gabrielle Union and Kirsten Dunst from ""Bring It On."" Vince Vaughan's character thinks he's black and he's often seen dressed as a pimp-- this was quite funny in the first scene that introduces him and gets tired and embarrassing almost immediately afterward.<br /><br />Overall, ""Be Cool"" may be worth a rental for John Travolta die-hards (of which I am one), but you may want to keep your finger close to the fast forward button to get through it without feeling that you wasted too much time. Fans of ""Get Shorty"" may actually wish to avoid this, as the sequel is devoid of most things that made that one a winner. 
I rate this movie an admittedly harsh 4/10.",despite john travolta s statements in interviews that this was his favorite role of his career be cool proves to be a disappointing sequel to 1995 s witty and clever get shorty br br travolta delivers a pleasant enough performance in this mildly entertaining film but ultimately the movie falls flat due to an underdeveloped plot unlikeable characters and a surprising lack of chemistry between leads travolta and uma thurman although there are some laughs this unfunny dialog example which appeared frequently in the trailers kind of says it all thurman do you dance travolta hey i m from brooklyn br br the film suggests that everyone in the entertainment business is a gangster or aspires to be one likening it to organized crime in get shorty the premise of a gangster going legitimate by getting into movies was a clever fish out of water idea but in be cool it seems the biz has entirely gone crooked since then br br the film is interestingly casted and the absolute highlight is a monolgue delivered by the rock whose character is an aspiring actor as well as a goon where he reenacts a scene between gabrielle union and kirsten dunst from bring it on vince vaughan s character thinks he s black and he s often seen dressed as a pimp this was quite funny in the first scene that introduces him and gets tired and embarrassing almost immediately afterward br br overall be cool may be worth a rental for john travolta die hards of which i am one but you may want to keep your finger close to the fast forward button to get through it without feeling that you wasted too much time fans of get shorty may actually wish to avoid this as the sequel is devoid of most things that made that one a winner i rate this movie an admittedly harsh 4 10
2,"I am a kung fu fan, but not a Woo fan. I have no interest in gangster movies filled with over-the-top gun-play. Now, martial arts; *that's* beautiful! And John Woo surprised me here by producing a highly entertaining kung fu movie, which almost has *too much* fighting, if such a thing is possible! This is good stuff.<br /><br />Many of the fight scenes are very good (and some of them are less good), and the main characters are amusing and likable. The bad guys are a bit too unbelievably evil, but entertaining none the less. You gotta see the Sleeping Wizard!! He can only fight when he's asleep - it's hysterical!<br /><br />Upon repeated viewings, however, Last Hurrah For Chivalry can tend to get a little boring and long-winded, also especially because many of the fight scenes are actually not that good. Hence, I rate it ""only"" a 7 out of 10. But it really is almost an ""8"".<br /><br />All in all one of the better kung fu movies, made smack-dab in the heart of kung fu cinema's prime. 
All the really good kung fu movies are from the mid- to late 1970ies, with some notable exceptions from the late '60ies and early '70ies (and early '80ies, to be fair).",i am a kung fu fan but not a woo fan i have no interest in gangster movies filled with over the top gun play now martial arts that s beautiful and john woo surprised me here by producing a highly entertaining kung fu movie which almost has too much fighting if such a thing is possible this is good stuff br br many of the fight scenes are very good and some of them are less good and the main characters are amusing and likable the bad guys are a bit too unbelievably evil but entertaining none the less you gotta see the sleeping wizard he can only fight when he s asleep it s hysterical br br upon repeated viewings however last hurrah for chivalry can tend to get a little boring and long winded also especially because many of the fight scenes are actually not that good hence i rate it only a 7 out of 10 but it really is almost an 8 br br all in all one of the better kung fu movies made smack dab in the heart of kung fu cinema s prime all the really good kung fu movies are from the mid to late 1970ies with some notable exceptions from the late 60ies and early 70ies and early 80ies to be fair
3,"He seems to be a control freak. I have heard him comment on ""losing control of the show"" and tell another guest who brought live animals that he had one rule-""no snakes."" He needs to hire a comedy writer because his jokes are lame. The only reason I watch him is because he some some great guests and bands. <br /><br />I watched the Craig Ferguson show for a while but his show is even worse. He likes to bull sh** to burn time.I don't think either man has much of a future in late night talk shows.<br /><br />Daily also has the annoying habit of sticking his tongue out to lick his lips. He must do this at least 10 times a show. I do like the Joe Firstman band. Carson Daily needs to lighten up before it is too late.",he seems to be a control freak i have heard him comment on losing control of the show and tell another guest who brought live animals that he had one rule no snakes he needs to hire a comedy writer because his jokes are lame the only reason i watch him is because he some some great guests and bands br br i watched the craig ferguson show for a while but his show is even worse he likes to bull sh to burn time i don t think either man has much of a future in late night talk shows br br daily also has the annoying habit of sticking his tongue out to lick his lips he must do this at least 10 times a show i do like the joe firstman band carson daily needs to lighten up before it is too late


In [57]:
# Building the set of English stop words once; set membership checks are fast
english_stopwords = set(stopwords.words('english'))

# defining a function to remove stop words using the NLTK library
def remove_stopwords(text):
    # Split text into separate words
    words = text.split()

    # Keeping only the words that are not English stop words
    new_text = ' '.join([word for word in words if word not in english_stopwords])

    return new_text

In [58]:
# Applying the function to remove stop words using the NLTK library
data['cleaned_text_without_stopwords'] = data['cleaned_text'].apply(remove_stopwords)

In [59]:
# comparing the first four reviews before and after stop word removal
data.loc[0:3,['cleaned_text','cleaned_text_without_stopwords']]

Unnamed: 0,cleaned_text,cleaned_text_without_stopwords
0,okay i know this does nt project india in a good light but the overall theme of the movie is not india it s shakti the power of a warlord and the power of a mother the relationship between nandini and her husband and son swallow you up in their warmth then things go terribly wrong the interaction between nandini and her father in law the power of their dysfunctional relationship and the lives changed by it are the strengths of this movie shah rukh khan s performance seems to be a mere cameo compared to the believable desperation of karisma kapoor it is easy to get caught up in the love violence and redemption of lives in this film and find yourself heaving a sigh of relief and sadness at the climax the musical interludes are strengths believable and well done,okay know nt project india good light overall theme movie india shakti power warlord power mother relationship nandini husband son swallow warmth things go terribly wrong interaction nandini father law power dysfunctional relationship lives changed strengths movie shah rukh khan performance seems mere cameo compared believable desperation karisma kapoor easy get caught love violence redemption lives film find heaving sigh relief sadness climax musical interludes strengths believable well done
1,despite john travolta s statements in interviews that this was his favorite role of his career be cool proves to be a disappointing sequel to 1995 s witty and clever get shorty br br travolta delivers a pleasant enough performance in this mildly entertaining film but ultimately the movie falls flat due to an underdeveloped plot unlikeable characters and a surprising lack of chemistry between leads travolta and uma thurman although there are some laughs this unfunny dialog example which appeared frequently in the trailers kind of says it all thurman do you dance travolta hey i m from brooklyn br br the film suggests that everyone in the entertainment business is a gangster or aspires to be one likening it to organized crime in get shorty the premise of a gangster going legitimate by getting into movies was a clever fish out of water idea but in be cool it seems the biz has entirely gone crooked since then br br the film is interestingly casted and the absolute highlight is a monolgue delivered by the rock whose character is an aspiring actor as well as a goon where he reenacts a scene between gabrielle union and kirsten dunst from bring it on vince vaughan s character thinks he s black and he s often seen dressed as a pimp this was quite funny in the first scene that introduces him and gets tired and embarrassing almost immediately afterward br br overall be cool may be worth a rental for john travolta die hards of which i am one but you may want to keep your finger close to the fast forward button to get through it without feeling that you wasted too much time fans of get shorty may actually wish to avoid this as the sequel is devoid of most things that made that one a winner i rate this movie an admittedly harsh 4 10,despite john travolta statements interviews favorite role career cool proves disappointing sequel 1995 witty clever get shorty br br travolta delivers pleasant enough performance mildly entertaining film ultimately movie falls flat due 
underdeveloped plot unlikeable characters surprising lack chemistry leads travolta uma thurman although laughs unfunny dialog example appeared frequently trailers kind says thurman dance travolta hey brooklyn br br film suggests everyone entertainment business gangster aspires one likening organized crime get shorty premise gangster going legitimate getting movies clever fish water idea cool seems biz entirely gone crooked since br br film interestingly casted absolute highlight monolgue delivered rock whose character aspiring actor well goon reenacts scene gabrielle union kirsten dunst bring vince vaughan character thinks black often seen dressed pimp quite funny first scene introduces gets tired embarrassing almost immediately afterward br br overall cool may worth rental john travolta die hards one may want keep finger close fast forward button get without feeling wasted much time fans get shorty may actually wish avoid sequel devoid things made one winner rate movie admittedly harsh 4 10
2,i am a kung fu fan but not a woo fan i have no interest in gangster movies filled with over the top gun play now martial arts that s beautiful and john woo surprised me here by producing a highly entertaining kung fu movie which almost has too much fighting if such a thing is possible this is good stuff br br many of the fight scenes are very good and some of them are less good and the main characters are amusing and likable the bad guys are a bit too unbelievably evil but entertaining none the less you gotta see the sleeping wizard he can only fight when he s asleep it s hysterical br br upon repeated viewings however last hurrah for chivalry can tend to get a little boring and long winded also especially because many of the fight scenes are actually not that good hence i rate it only a 7 out of 10 but it really is almost an 8 br br all in all one of the better kung fu movies made smack dab in the heart of kung fu cinema s prime all the really good kung fu movies are from the mid to late 1970ies with some notable exceptions from the late 60ies and early 70ies and early 80ies to be fair,kung fu fan woo fan interest gangster movies filled top gun play martial arts beautiful john woo surprised producing highly entertaining kung fu movie almost much fighting thing possible good stuff br br many fight scenes good less good main characters amusing likable bad guys bit unbelievably evil entertaining none less gotta see sleeping wizard fight asleep hysterical br br upon repeated viewings however last hurrah chivalry tend get little boring long winded also especially many fight scenes actually good hence rate 7 10 really almost 8 br br one better kung fu movies made smack dab heart kung fu cinema prime really good kung fu movies mid late 1970ies notable exceptions late 60ies early 70ies early 80ies fair
3,he seems to be a control freak i have heard him comment on losing control of the show and tell another guest who brought live animals that he had one rule no snakes he needs to hire a comedy writer because his jokes are lame the only reason i watch him is because he some some great guests and bands br br i watched the craig ferguson show for a while but his show is even worse he likes to bull sh to burn time i don t think either man has much of a future in late night talk shows br br daily also has the annoying habit of sticking his tongue out to lick his lips he must do this at least 10 times a show i do like the joe firstman band carson daily needs to lighten up before it is too late,seems control freak heard comment losing control show tell another guest brought live animals one rule snakes needs hire comedy writer jokes lame reason watch great guests bands br br watched craig ferguson show show even worse likes bull sh burn time think either man much future late night talk shows br br daily also annoying habit sticking tongue lick lips must least 10 times show like joe firstman band carson daily needs lighten late
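
The output above shows two side effects of the default stop word list: the token `br` (left over from `<br />` tags) survives, and negations such as `not` and `no` are removed, so "not india" becomes "india" and "no snakes" becomes "snakes". The following is a minimal sketch of one way to adjust the list, not part of the original pipeline. The helper names `build_stopword_set` and `remove_stopwords_custom` are hypothetical, and `base_words` is a small hand-written stand-in for `stopwords.words('english')` so that the sketch runs without any downloads.

```python
def build_stopword_set(base_words, extra=(), keep=()):
    """Return the base stop words plus `extra`, minus the words in `keep`."""
    return (set(base_words) | set(extra)) - set(keep)


def remove_stopwords_custom(text, stopword_set):
    """Drop every whitespace-separated word that is in `stopword_set`."""
    return ' '.join(word for word in text.split() if word not in stopword_set)


# Assumption: a tiny stand-in list; in the notebook this would be stopwords.words('english')
base_words = ['the', 'of', 'is', 'not', 'no', 'it', 's', 'he', 'had', 'that']

# Treat the HTML residue 'br' as a stop word, but keep the negations
custom_stopwords = build_stopword_set(base_words, extra=['br'], keep=['not', 'no'])

sample = 'the theme of the movie is not india br br he had one rule no snakes'
result = remove_stopwords_custom(sample, custom_stopwords)
print(result)  # theme movie not india one rule no snakes
```

Whether to keep negations depends on the downstream task; for sentiment-style analysis they often carry meaning, while for topic-style analysis they may not.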


In [60]:
# Loading the Porter Stemmer
ps = PorterStemmer()

# defining a function to perform stemming
def apply_porter_stemmer(text):
    # Split text into separate words
    words = text.split()

    # Applying the Porter Stemmer to every word of a review and joining the stemmed words back into a single string
    new_text = ' '.join([ps.stem(word) for word in words])

    return new_text

In [61]:
# Applying the function to perform stemming
data['final_cleaned_text'] = data['cleaned_text_without_stopwords'].apply(apply_porter_stemmer)

In [62]:
# comparing the first three reviews before and after stemming
data.loc[0:2,['cleaned_text_without_stopwords','final_cleaned_text']]

Unnamed: 0,cleaned_text_without_stopwords,final_cleaned_text
0,okay know nt project india good light overall theme movie india shakti power warlord power mother relationship nandini husband son swallow warmth things go terribly wrong interaction nandini father law power dysfunctional relationship lives changed strengths movie shah rukh khan performance seems mere cameo compared believable desperation karisma kapoor easy get caught love violence redemption lives film find heaving sigh relief sadness climax musical interludes strengths believable well done,okay know nt project india good light overal theme movi india shakti power warlord power mother relationship nandini husband son swallow warmth thing go terribl wrong interact nandini father law power dysfunct relationship live chang strength movi shah rukh khan perform seem mere cameo compar believ desper karisma kapoor easi get caught love violenc redempt live film find heav sigh relief sad climax music interlud strength believ well done
1,despite john travolta statements interviews favorite role career cool proves disappointing sequel 1995 witty clever get shorty br br travolta delivers pleasant enough performance mildly entertaining film ultimately movie falls flat due underdeveloped plot unlikeable characters surprising lack chemistry leads travolta uma thurman although laughs unfunny dialog example appeared frequently trailers kind says thurman dance travolta hey brooklyn br br film suggests everyone entertainment business gangster aspires one likening organized crime get shorty premise gangster going legitimate getting movies clever fish water idea cool seems biz entirely gone crooked since br br film interestingly casted absolute highlight monolgue delivered rock whose character aspiring actor well goon reenacts scene gabrielle union kirsten dunst bring vince vaughan character thinks black often seen dressed pimp quite funny first scene introduces gets tired embarrassing almost immediately afterward br br overall cool may worth rental john travolta die hards one may want keep finger close fast forward button get without feeling wasted much time fans get shorty may actually wish avoid sequel devoid things made one winner rate movie admittedly harsh 4 10,despit john travolta statement interview favorit role career cool prove disappoint sequel 1995 witti clever get shorti br br travolta deliv pleasant enough perform mildli entertain film ultim movi fall flat due underdevelop plot unlik charact surpris lack chemistri lead travolta uma thurman although laugh unfunni dialog exampl appear frequent trailer kind say thurman danc travolta hey brooklyn br br film suggest everyon entertain busi gangster aspir one liken organ crime get shorti premis gangster go legitim get movi clever fish water idea cool seem biz entir gone crook sinc br br film interestingli cast absolut highlight monolgu deliv rock whose charact aspir actor well goon reenact scene gabriel union kirsten dunst bring vinc vaughan charact 
think black often seen dress pimp quit funni first scene introduc get tire embarrass almost immedi afterward br br overal cool may worth rental john travolta die hard one may want keep finger close fast forward button get without feel wast much time fan get shorti may actual wish avoid sequel devoid thing made one winner rate movi admittedli harsh 4 10
2,kung fu fan woo fan interest gangster movies filled top gun play martial arts beautiful john woo surprised producing highly entertaining kung fu movie almost much fighting thing possible good stuff br br many fight scenes good less good main characters amusing likable bad guys bit unbelievably evil entertaining none less gotta see sleeping wizard fight asleep hysterical br br upon repeated viewings however last hurrah chivalry tend get little boring long winded also especially many fight scenes actually good hence rate 7 10 really almost 8 br br one better kung fu movies made smack dab heart kung fu cinema prime really good kung fu movies mid late 1970ies notable exceptions late 60ies early 70ies early 80ies fair,kung fu fan woo fan interest gangster movi fill top gun play martial art beauti john woo surpris produc highli entertain kung fu movi almost much fight thing possibl good stuff br br mani fight scene good less good main charact amus likabl bad guy bit unbeliev evil entertain none less gotta see sleep wizard fight asleep hyster br br upon repeat view howev last hurrah chivalri tend get littl bore long wind also especi mani fight scene actual good henc rate 7 10 realli almost 8 br br one better kung fu movi made smack dab heart kung fu cinema prime realli good kung fu movi mid late 1970i notabl except late 60i earli 70i earli 80i fair
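
The Porter Stemmer strips suffixes by rule, so its output is often not a dictionary word. The small sketch below applies it to a few words taken from the reviews above to make the effect easier to see; the variable names `sample_words` and `stems` are only illustrative.

```python
from nltk.stem.porter import PorterStemmer

ps = PorterStemmer()

# A few words that appear in the reviews above
sample_words = ['movie', 'movies', 'overall', 'terribly', 'interaction']

# Map each word to its stem
stems = {word: ps.stem(word) for word in sample_words}

for word, stem in stems.items():
    print(f'{word} -> {stem}')
```

Note that `movie` and `movies` collapse to the same stem, which is the point of this step: different inflections of a word end up as a single feature later on.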


In [63]:
# Initializing CountVectorizer, keeping only the 1000 most frequent terms in the corpus
bow_vec = CountVectorizer(max_features = 1000)

# Fitting CountVectorizer on the cleaned text and transforming it into a matrix of word counts
data_features_BOW = bow_vec.fit_transform(data['final_cleaned_text'])

# Converting the sparse feature matrix to a dense NumPy array
data_features_BOW = data_features_BOW.toarray()

# Shape of the feature vector
data_features_BOW.shape

(9982, 1000)

In [64]:
# Getting the 1000 words considered by the BoW model
words = bow_vec.get_feature_names_out()

In [65]:
# Checking the words considered by BoW model
words

array(['10', '100', '20', '30', '50', '60', '70', '80', '90', 'abil',
       'abl', 'absolut', 'accent', 'accept', 'achiev', 'across', 'act',
       'action', 'actor', 'actress', 'actual', 'ad', 'adapt', 'add',
       'admit', 'adult', 'adventur', 'age', 'ago', 'agre', 'air', 'alien',
       'aliv', 'allow', 'almost', 'alon', 'along', 'alreadi', 'also',
       'although', 'alway', 'amaz', 'america', 'american', 'among',
       'amount', 'amus', 'anim', 'ann', 'annoy', 'anoth', 'answer',
       'anti', 'anyon', 'anyth', 'anyway', 'apart', 'appar', 'appeal',
       'appear', 'appreci', 'approach', 'armi', 'around', 'arriv', 'art',
       'artist', 'ask', 'aspect', 'atmospher', 'attack', 'attempt',
       'attent', 'attract', 'audienc', 'averag', 'avoid', 'aw', 'award',
       'away', 'awesom', 'babi', 'back', 'background', 'bad', 'badli',
       'band', 'bare', 'base', 'basic', 'battl', 'beat', 'beauti',
       'becam', 'becom', 'begin', 'behind', 'believ', 'best', 'better',
       'beyo

In [66]:
# Creating a DataFrame from the data features
df_BOW = pd.DataFrame(data_features_BOW, columns=bow_vec.get_feature_names_out())
df_BOW.head()

Unnamed: 0,10,100,20,30,50,60,70,80,90,abil,...,writer,written,wrong,wrote,ye,year,yet,york,young,zombi
0,0,0,0,0,0,0,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
1,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,1,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,2,0
