# Comprehensive Sentiment Analysis Dataset

I combined four datasets, each introduced in its own section below, into a unified dataset covering a diverse range of text, including movie reviews and social media posts. The combined dataset is intended to support sentiment analysis tasks and the training of more comprehensive sentiment analysis models. It contains the 'corpus_name' and 'raw_sentence' columns so that the source corpus of every row is easy to identify and analyze.

By merging these datasets, I aim to create a more powerful and robust dataset for sentiment analysis, leveraging the strengths of each individual dataset and providing a richer and more representative sample of sentiment-labeled text data.

Please note that the URLs listed in each section below are for reference and access to the original datasets. The combined dataset is available in the CSV file named 'combined_data.csv'.
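
A minimal sketch of how the per-corpus frames could be merged into one table with the 'corpus_name' and 'raw_sentence' columns. The tiny frames `df_a` and `df_b` are hypothetical stand-ins for the processed corpora, and the use of `pd.concat` and the commented-out write step are assumptions, not necessarily how 'combined_data.csv' was produced.

```python
import pandas as pd

# Hypothetical stand-ins for two processed corpora that already share
# the common schema: corpus_name, raw_sentence.
df_a = pd.DataFrame({
    'corpus_name': ['stanford'],
    'raw_sentence': ['Blue Crush is as predictable as the tides .'],
})
df_b = pd.DataFrame({
    'corpus_name': ['large_movie_review', 'large_movie_review'],
    'raw_sentence': ['A stunning film of high quality.', 'Avoid it!'],
})

# Stack the frames row-wise and rebuild a clean 0..n-1 index.
combined = pd.concat([df_a, df_b], ignore_index=True)

# Assumed final step (file name taken from the text above):
# combined.to_csv('combined_data.csv', index=False)
print(combined['corpus_name'].value_counts())
```

Counting rows per 'corpus_name' after the merge is a quick way to confirm that no corpus was dropped or duplicated.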

In [1]:
import pandas as pd
pd.set_option('display.max_colwidth', None)

# 01 Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
You can find more details and the dataset at: https://nlp.stanford.edu/sentiment/code.html

In [2]:
# Load "stanford.csv" and display a random sample of 10 rows

df_stanford = pd.read_csv('../dataset/stanford.csv')
df_stanford.sample(10)

Unnamed: 0,sentence_index,sentence
2630,2631,"A lovely film ... elegant , witty and beneath a prim exterior unabashedly romantic ... hugely enjoyable in its own right though not really faithful to its source 's complexity ."
7260,7261,Video games are more involving than this mess .
11336,11337,overburdened with complicated plotting and banal dialogue
5385,5386,"Another in a long line of ultra-violent war movies , this one is not quite what it could have been as a film , but the story and theme make up for it ."
6586,6587,"Frankly , it 's pretty stupid ."
4057,4058,The Santa Clause 2 proves itself a more streamlined and thought out encounter than the original could ever have hoped to be .
10722,10723,Do we really need another film that praises female self-sacrifice ?
4324,4325,Rarely have I seen a film so willing to champion the fallibility of the human heart .
7131,7132,"After seeing the film , I can tell you that there 's no other reason why anyone should bother remembering it ."
10111,10112,It 's mindless junk like this that makes you appreciate original romantic comedies like Punch-Drunk Love .


In [3]:
# stanford data overview
df_stanford.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11855 entries, 0 to 11854
Data columns (total 2 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   sentence_index  11855 non-null  int64 
 1   sentence        11855 non-null  object
dtypes: int64(1), object(1)
memory usage: 185.4+ KB


In [4]:
# Data preprocessing for stanford corpus
# Rename only the exact 'sentence' column; 'sentence_index' keeps its name
df_stanford = df_stanford.rename(columns={'sentence': 'raw_sentence'})
df_stanford = df_stanford.assign(corpus_name='stanford')
df_stanford_keep = ['corpus_name', 'raw_sentence']
df_stanford = df_stanford[df_stanford_keep]

# Display a random sample of 10 rows from the processed data
df_stanford.sample(10)

Unnamed: 0,corpus_name,raw_sentence
697,stanford,"It risks seeming slow and pretentious , because it thinks the gamble is worth the promise ."
590,stanford,"Possession is Elizabeth Barrett Browning meets Nancy Drew , and it 's directed by ... Neil LaBute ."
2973,stanford,Narc is a no-bull throwback to 1970s action films .
4029,stanford,And the positive change in tone here seems to have recharged him .
215,stanford,"Solid , lump-in-the-throat family entertainment that derives its power by sticking to the facts ."
2142,stanford,The film sparkles with the the wisdom and humor of its subjects .
2901,stanford,"this film is not a love letter for the slain rappers , it 's a taunt - a call for justice for two crimes from which many of us have not yet recovered ."
11704,stanford,Almost peerlessly unsettling .
2379,stanford,"Transcends its agenda to deliver awe-inspiring , at times sublime , visuals and offer a fascinating glimpse into the subculture of extreme athletes whose derring-do puts the X into the games ."
9513,stanford,Blue Crush is as predictable as the tides .
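
Every corpus in this notebook goes through the same three preprocessing steps: rename the text column, tag the rows with a corpus name, and keep only two columns. A minimal sketch of a reusable helper follows. The function name `to_common_schema` and its parameters are hypothetical and do not appear in the original notebook.

```python
import pandas as pd

def to_common_schema(df: pd.DataFrame, text_col: str, corpus_name: str) -> pd.DataFrame:
    """Return a new frame with exactly two columns: corpus_name, raw_sentence."""
    # rename() matches the exact column name, so columns such as
    # 'sentence_index' are not touched when text_col is 'sentence'.
    out = df.rename(columns={text_col: 'raw_sentence'})
    out = out.assign(corpus_name=corpus_name)
    return out[['corpus_name', 'raw_sentence']]

# Small illustrative frame shaped like stanford.csv
demo = pd.DataFrame({
    'sentence_index': [1, 2],
    'sentence': ['Almost peerlessly unsettling .',
                 'Video games are more involving than this mess .'],
})
result = to_common_schema(demo, text_col='sentence', corpus_name='stanford')
print(result)
```

The helper returns a new frame and leaves its input unchanged, so the original columns stay available for inspection.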


# 02 Large Movie Review Dataset
You can find more details and the dataset at: https://ai.stanford.edu/~amaas/data/sentiment/

In [5]:
# Load "large_movie_review.csv" and display a random sample of 10 rows

df_large_movie_review = pd.read_csv('../dataset/large_movie_review.csv')
df_large_movie_review.sample(10)

Unnamed: 0,Filename,Content
20907,6317_1.txt,"I've no idea what dimwit from San Francisco came up with this stupid plot, but apparently they need to get off whatever drugs they are taking and put their analyst on danger money -- NOW.<br /><br />Yeah, this is a plausible story, if you regard the alien abduction sequence in ""Life of Brian"" as plausible.<br /><br />This film is little more than a leftist pipedream. Had the US and USSR give up nuclear weapons, the result would've been to eliminate the only real obstacle that kept the two from engaging in a war. Bad as Korea, Vietnam and other wars of the era were, they were ""proxy wars"" fought to keep the superpowers from a direct engagement.<br /><br />This film makes me think about how realistic it was when some group of high school kids would go on a hunger strike against nuclear proliferation. As if someone would say ""Mr. President, some kids at Drastic High are not eating!"" and Ronald Reagan would reply ""My God! I'd better revise my Defense policy!"" Right.<br /><br />Like this film? Wouldn't it be better if the Soviet Union would've collapsed because they could not support their massive arms build... wait, that happened!"
24530,9579_4.txt,"Here's a horror version of PRISCILLA: QUEEN OF THE DESERT (they wish!) starring Melinda/Mindy (RETURN OF THE LIVING DEAD 3) Clarke as Candy, a desert dweller who pulls off a bank heist with boyfriend Johnny (Jason Durr). He ends up in a South-of-the-border prison run by the sadistic Chief Screw (an overacting Robert Englund in a toupee). She and her beloved pet poodles end up in hiding at a gas station convent until they're transformed by a newly fallen meteor. The dogs turn into obnoxious drag queen ""bitches"" and Candy develops a VERY long, talking, killing forked tongue she can't control. Thugs looking for the stolen loot and other assorted numbskulls add extra complications.<br /><br />First off, Clarke is fantastic and makes what there is to make of this movie. You watch her and see someone very funny during the slapstick scenes, very convincing during the horror scenes and VERY sexy in various wigs and disguises, including an eye-popping, skin tight latex bodysuit...and wonder how come this actress isn't a huge star. It's too bad the rest of this cult attempt doesn't live up to her promise.<br /><br />Blame director/scripter Sciamma, who thinks the outlandish premise alone is enough to sustain laughs...but his vulgar gags, annoying supporting characters and stupid dialogue are no substitute for a real sense of humor. Another nail in the coffin; the film looks cheap, lots of garish colors and sets are strangely muted by muddy photography and the dusty desert locales. Luckily for Sciamma that Clarke is in his film, because she alone keeps you watching."
11961,9516_8.txt,"Tom Hanks like you've never seen him before. Hanks plays Michael Sullivan, ""The Angel of Death"". He is a hitman for his surrogate father John Rooney(Paul Newman)an elderly Irish mob boss. Sullivan's young son(Tyler Hoechlin)witnesses what his father does for a living and both are soon on the road for seven weeks robbing banks to avenge the murder of Sullivan's wife and other son. Enter Jude Law as a reporter/photographer willing to kill Sullivan himself for the chance to add to his collection of photos of dead mobsters. Filmed beautifully catching the drama of life in the 30's. Sometimes the pace bogs down, but then a burst of graphic violence sustains the story. Director Sam Mendes directs this powerful drama about loyalty, responsibility, betrayal and the bonding of a secretive man and his young son. Other notable cast members are: Dylan Baker, Stanley Tucci, Daniel Craig and Jennifer Jason Leigh. Hanks again proves to be excellent in a very memorable movie. Make room for some Oscars!"
6203,4333_10.txt,"saw this in preview- great movie- wonderful characterizations- witty and intelligent dialog- actors were fantastic- Peter Falk will be up for an Oscar- Paul Reiser was charming- photography was marvelous Reiser was at the theater when we saw the film, and he gave a vivid account about the making of the film- it had been a long dream of his to write a semi-autobiographical account of relationships between sons and fathers, and more specifically between him and his father- this was achieved in a dramatic and entertaining fashion- the supporting cast was well chosen and gave the film a feeling of family- i recommend this film to anyone who is longing to see intelligent drama and wonderful performances"
20688,611_4.txt,"Okay, 'enjoy' is a pretty relative term, but flexibility is in order when you're dealing with a filmmaker of James Glickenhaus' calibre.<br /><br />McBain is truly one of the most ridiculous, over the top action films I've ever seen, without the nasty edge of The Exterminator. Other reviews have commented on a suspension of disbelief regarding the film's heroic middle aged commandos, but how about making a film in the Philippines that is set in Colombia? All the extras are Filipino. In fact the only character who looks remotely Hispanic is good ol' Victor Argo as the much reviled 'El Presidente'! Oh yes, we also have Maria Conchita Alonso overemoting like crazy as a rebel leader. There are tons of explosions and bodies flying everywhere in this amusing paean to the glories of American imperialism."
4080,2422_9.txt,"Yeah,it's low budget. Yeah,it's one of Candy's earliest films, but it is maybe his funniest! John Candy was not so far removed from his SCTV days in ""Going Berserk"" and it shows. If you don't crack up when Candy tries to help a guy with his groceries while being hand-cuffed to an escaped cohort in the process of having sex with his girlfriend with only the apartment door separating them (huh?, see the movie!), or the way Euguene Levy (the creator of Kong Fu Yu) is talking to his mom on the phone, or just the countless number of facial expressions that only Candy could deliver you better check your pulse! If you like the John Candy of ""Only the Lonely"" this is may not be a movie I would advise you to see, but if you enjoy the SCTV days of John Candy, this movie is a must see!"
6547,4643_10.txt,"I am currently on vacation in Israel for summer, and so was able to see this incredible film. A bit of a warning before I begin writing: I speak fluent Hebrew, and so the Hebrew parts were no problem; however, about a quarter (a bit less) of the film is in Arabic, and I was unable to understand a bit of this subtitled bit. This did not detract from my understanding of the film, but did cause me to miss a few jokes which evoked some strong laughs in the theater.<br /><br />After a year of American Cinema which many hailed as one of the greatest years for homosexual cinema and relationships, it takes something truly special to stand head and shoulders above the rest; yet, ""The Bubble"" surpasses all others with its blend of excellent acting, witty dialogue, and relevant political climate.<br /><br />The film opens on a checkpoint on the Israeli-Palestinian border; For the first few moments, we are unsure about the type of movie we have walked in on. Yet, this is an important element of this film's strength. The political situation, and the extreme tension in the air is constantly in the background. Most importantly, Tel Aviv serves as a character of its own in this film. It is constantly referenced. Street names and restaurant names are constantly exchanged. The skyline and city development is critiqued quite harshly, and ultimately the city evolves along with the film The film focuses on the love between Noam (Ohad Knoller) and a Palestinian immigrant, Ashraf(Yousef 'Joe' Sweid), with the societies of Tel Aviv and Palestine serving as a constant foil. We always know that their relationship is forbidden, and this creates a sense of urgency rarely present in cinema. The love is incredibly strong, and stands as the centerpiece of the film. 
The secondary relationships and friendships are equally strong: flamboyant restaurant owner Yelli's ( Yousef 'Joe' Sweid) relationship with the ultra-butch and grating golani solider, Golan (Zohar Liba), is particularly a source of amusement. The love scenes which abound in this film are all exquisite, fine crafted works of art, and the cinematography is astounding: In the first love scene of the film, the camera pans down as a male character gives oral sex to Lulu (Daniela Virtzer), and dissolves into a shot of Noam and Ashraf. This shot any many others lead the viewer to realize that all of these relationships are expressions of the very same form of love.<br /><br />To give away more of the storyline would be a tragedy, but know that there is a lot of political tension and tragedy which touches onto the current world political climate, so I will instead focus on the witty dialogue. Even when watching this movie in my second language, I could not stop laughing throughout. Lines of particular amusement include the question of whether gay suicide bombers receive virgin women or men in heaven, and an analogy of Sampson from the bible as the worlds first suicide bomber. This dialogue shows a particular sense of purity and reality which is rarely seen in Cinema. The music used in the film is also particularly powerful. Music is only used in times when characters legitimately could or should be listening to it, and in one scene the music weakens when a character removes one earphone and stops when he removes the other. Little elements like this truly elevate the film.<br /><br />I could not give greater recommendation to a film; this is a superb work of cinema which is catharthic as well as extremely well crafted."
4891,3152_9.txt,"This show was absolutely great, and I always look forward to watching it.All the characters were funny and awesome in their own way, each and every episode provided non-stop laughter, and it was completely entertaining and different from a lot of other shows.Everybody was just absolutely insane and breathtakingly funny, that you couldn't help but love this show.There were a few dead weight episodes, but That '70s Show always managed to create some kind of likable atmosphere, to where it just really didn't matter.This was one of the best shows to ever be aired, and I will watch this show anytime I can, for it never gets old, never gets unfunny, and never gets uninteresting."
21026,6424_3.txt,"A broke would be screenwriter and his would be agent (Tom Wood and Arye Gross) are forced to live in a self storage facility run by an eccentric and intimidating manager (Ron Perlman) whom they come to believe is the serial murderer that is terrorizing the city, the ""Costume Killer"" (so named because, after injecting his victims with Windex, he dresses them in silly costumes). They convince him his life story would make a great film and gather together a group of misfit wannabe film makers (John Considine, Joe Pantoliano, Kristy Swanson) and discover that the art of movie making can be murder.<br /><br />There is more to this movie but it was unfortunately left on the editing room floor and it shows (rumor is the studio wanted a ""lighter"" dark comedy). Our loss (and the actors, who all do fine jobs and deserve better) as this has the makings of an exceptional black comedy but only rises to mediocre cute.<br /><br />If you're a Ron Perlman fan this is absolutely worth getting just for his performance. His comedic timing is excellent and he has the chance to do some really great impressions (he wasn't kidding when he said on the Hellboy movie commentary that he needed an intervention when he gets into Jerry Lewis mode). He's just simply fun to watch in this one. <br /><br />David Dukes also shines in a two-scener (but pivotal) role."
22396,7658_1.txt,"Alone in the Dark is Uwe Boll's kick in the nuts to Hollywood after House of the Dead's punch in the face.<br /><br />If anything it proves just how much of a master manipulator Boll is. After forcing Artisan out of business over the flop that was House of the Dead, one can only assume the normally credible Lion's Gate Films only released AITD under contractual obligation after acquiring Artisan's assets. Because AITD is an even bigger example of complete lack of coherent film-making ability, plot exposition and just plain stealing poorly from other movies because it was supposed to look cool instead of because it fitted within the movie's framework.<br /><br />But then that's the point, isn't it. Boll isn't trying to make a coherent film because he isn't trying to direct Alone in the Dark. He's just trying to manipulate Hollywood.<br /><br />Alone in the Dark, like House of the Dead, Dungeon Siege, Far Cry, Bloodrayne and the other 3 or 4 projects that are ""announced"" or in ""pre-production"".<br /><br />These aren't movies to be directed, but investment portfolios. Every single one of them rushed into production under the pretence that the tax law Boll and his investors are exploiting may be closed within the next 2 to 3 years. The more bomb projects he can release within that time-frame, the more money he and his investors can gain. Why bother making a good movie when a bad movie's making you a mint anyway? 
The result is movies like the awfulness of Alone in the Dark.<br /><br />Alone in the Dark, like all his other movies are just a cynical exploitation of Hollywood's current trend for lazy film-making.<br /><br />And to those who support Boll by calling him misunderstood or the next Ed Wood, congratulations, by making a cult figure out of the man, you're just making it easier for him to get investors but giving him notoriety.<br /><br />For more information, read here: http://www.cinemablend.com/feature.php?id=209 http://www.cinemablend.com/forum/showthread.php?s=&threadid=21699 As an aside, just don't ask me how he's getting his cast-lists together. Unless the actors are in on the investment-scam somehow, that mystery has still to be uncovered."


In [6]:
# large_movie_review data overview
df_large_movie_review.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25000 entries, 0 to 24999
Data columns (total 2 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   Filename  25000 non-null  object
 1   Content   25000 non-null  object
dtypes: object(2)
memory usage: 390.8+ KB


In [7]:
# Data preprocessing for large_movie_review corpus
df_large_movie_review = df_large_movie_review.rename(columns={'Content': 'raw_sentence'})
df_large_movie_review = df_large_movie_review.assign(corpus_name='large_movie_review')
df_large_movie_review_keep = ['corpus_name', 'raw_sentence']
df_large_movie_review = df_large_movie_review[df_large_movie_review_keep]

# Display a random sample of 10 rows from the processed data
df_large_movie_review.sample(10)

Unnamed: 0,corpus_name,raw_sentence
3471,large_movie_review,"A stunning film of high quality.<br /><br />Apparently based on true events which, as told, has the clear ring of truth about it, this movie is highly emotional and deeply moving.<br /><br />An abused and neglected child often becomes wayward in adulthood, as one of life's failures, be it as a gangster, drug addict or burden on society.<br /><br />Antwone Fisher as a young adult in the navy, is troubled. He is on the brink of being a loser. He is counselled in therapy by a psychiatrist and it is that relationship which takes center stage in the play.<br /><br />In flash-backs and therapy the source and remedies to Antwones angst are revealed.<br /><br />Outstanding performances from the whole cast. The story is in effect a family tragedy with emotional and physical torment. All the actors give full blooded performances with conviction and realism.<br /><br />One message from the movie is the importance of raising children decently.<br /><br />The real Antwone deserves success. To have endured wickedness as a child but to rise above that, shows a magnificent character.<br /><br />And to all those out there who have endured such torment but to have survived and succeeded: you are all winners. 10 out of 10."
9756,large_movie_review,"I liked it... just that... i liked it, not like the animated series... i love it!!!. The fact that this make less appealing is that we all try to compare and not to appreciate, but this cartoon was awesome, but it really didn't like it that much. There's too much people talking about Bruce being so cold, but if this is around 5 years later, anybody in a crime-fighting gang would get this angry and darker attitude, so to me it isn't a flaw. Batgirl was awesome she really fit there, as there isn't more Dick Grayson as a robin, batman needed a good teammate, not like the new robin, he is just a child and you cant rely that much on a child. But heres what didn't work: The new artwork... it isn't horrible but... to me it does'nt work in a series like batman. This is a dark character, with a maniac killer like the joker, so you cant put this kind of artwork in this cartoon, The joker isn't a bad design but i still like the past joker (but to me the BEST joker ever was the one who appeared in batman beyond:return of the joker) , so this joker isn't near as good. The good thing about the joker is that it still mark Hamil voice. My favorite character: Harley Quinn (im in love for her) They put an awesome episode for her: Mad love (to me the best episode of this series). Here we finally know how she turned Harley Quinn, and how the joker twisted her mind, and it feel that atmosphere that you feel in the animated series, darker, no happy ending, brutal fight with the joker (but too short), this is how it was to be ALL the series. BUT in general i didn't like how she made Harley in this series... in almost every episode they put funny but in a ridiculous way, she get punched, she say nonsenses, she make flaws... c'mon she is funny in a way you can laugh with her, not from her... 
and here they put ridiculous (like i said the only episode where i don't think that its in mad love and beware of the creeper) So in general its a good series, it has it upsides and downs, the drawn could be better ( MY GOOD!!! KILL THAT CATWOMAN!!!!) nice sound effects, nice music, nice voices and nice episodes: my favorites, Mad love, Jokers millions, Old Wounds, Sins of the father, and Cold comfort. If you enjoyed Batman:TAS you can watch this but don't spec too much, in the other hand if you didn't watched TAS, watch this first and then watch TAS in that way you're really gonna love TAS :D"
6446,large_movie_review,"Daniel Day Lewis in My Left Foot gives us one of the best performances ever by an actor. He is brilliant as Christy Brown, a man who has cerebral palsy, who then learned to write and paint with his left foot. A well deserved Oscar for him and Brenda Fricker who plays his loving mother. Hugh O'Conner is terrific as the younger Christy Brown and Ray McAnally is great as the father. Worth watching for the outstanding performances."
15152,large_movie_review,"**WARNING: POSSIBLE SPOILER**<br /><br />If you can get by the extremely unpleasant subject matter, this film does offer a heaping helping of outrageously campy melodrama. Surprisingly enough, this movie has been copied and ripped-off several times over the years, although it's hard to fathom ANY filmmaker being inspired by this trashy drama. Neither one of the Hemingway women can act here (although Mariel HAS improved over the years), Anne Bancroft offers the only touch of class as a prosecuting attorney, and Chris Sarandon is by turns pathetic and unintentionally hilarious as the smirking, smarmy bad guy of the piece.<br /><br />Veteran director Lamont Johnson can't make a silk purse out of this sow's ear of a script, which is stuffed to bursting with howlingly bad dialogue and outlandish situations. For example, the final sequence, where Margaux grabs her shotgun and chases Sarandon down after his latest shocking act is meant to be exciting but elicits hearty chuckles instead. Add a notoriously shrill and spacy musical score by Michel Polnareff and you have a true guilty pleasure, even though you're likely to feel grubby and needing a hot shower after viewing it. Don't say you weren't warned."
19321,large_movie_review,"A movie about Vixen (Erica Gavin) who has a Mountie husband who she loves...but she loves sex too! In the course of the movie she gets multiple men in bed--including her husband AND brother! Also there's a (tame) lesbian sequence.<br /><br />This film put Russ Meyer on the map and was (I believe) the first critically acclaimed X rated film ever. It was a big hit when it came out. Unfortunately, it doesn't date well.<br /><br />It is well-directed and Erica Gavin is just great (whatever happened to her), and it was VERY colorful...but by today's standards it's extremely tame. I'm surprised it has an NC-17 rating now--there's no hardcore sex and it only has topless females and no male nudity at all. Also it's (sadly) pretty dull and the addition of politics at the end was confusing (and pretty silly). It is worth catching though to see what was considered very shocking in 1968. Purportedly I saw the cut version (which has an R rating) but I've heard only a few seconds here and there are missing. <br /><br />Meyer's next film ""Beyond the Valley of the Dolls"" is much better and dates VERY well. Catch that instead."
12971,large_movie_review,i should love this movie . the acting is very good and Barbara Stanwyck is great but the the movie has always seemed very trite to me . the movie makes working class people look low and cheap .the fact that the daughter is ashamed of her mother and that the daughter does not rise above it has always made me a bit uneasy . Barbara Stanwyck as the mother worships the daughter but the daughter forgoes a mothers love to find happiness with her well to do fathers family . i wonder how many others who have seen this film feel this way about it.again the acting was very very good and worth watching . i really don't like the story line . just a personal preference .thank you
18213,large_movie_review,"I was thinking that the main character, the astronaut with the bad case of the runs(in his case, his skin, hair, muscles, etc) could always get more movie work after he'd been reduced to a puddle. All he has to do is get a job as the Blob. The premise of this flick is pretty lame. An astronaut gets exposed to sunspot radiation(I think), and so begins to act like an ice cream cone on a hot day. Not only is this a puzzler, but apparently he has to kill humans and consume their flesh so that he can maintain some kind of cell integrity. Huh? Have you ever noticed that whenever any kind of radiation accident or experiment happens, the person instantly turns into a killing machine? Why is that?<br /><br />The astronaut lumbers off into the night from the 'secret facility'(which has no security whatsoever), shedding parts of himself as he goes. Apparently he retains just enough memory to make him head for the launch pad, maybe because he wanted to return to space. <br /><br />Thus begins the part of the movie that's pretty much filler, with a doctor wandering around with a Geiger counter, trying to find the melting man by the buzz he gives off. He kills a stupid Bill Gates look-alike fisherman, scares a little girl a la the Frankenstein monster movie, and finishes off a wacky older couple(punishing them karmically for stealing some lemons). Then there's a short scene where he whacks his former General, and a very long scene where he kills a young pothead and chases his girlfriend around. You'd think that after she cuts his arm off and he run away, the scene would shift. But no...we're treated to about ten minutes of the woman huddled into a corner panting and screaming in terror, even though the monster is gone. All I could think was..director's girlfriend, anyone?<br /><br />The end of the movie is even lamer than the rest of it. The melting man finishes turning into a pile of goo, and then...nothing. That's it. That's the end of the movie. 
Well, at least that meant that there was no room for a sequel."
20249,large_movie_review,"2005 gave us the very decent ""gore porn"" flick Hostel, and 2006 gave us Live Feed; a not so decent rip-off of Hostel. Live Feed follows pretty much the same formula as Eli Roth's earlier film, except this time the dumb kids are in Asia rather than central Europe. The plot focuses on these dumb kids, and one of them has annoyed one of the locals so they find themselves in trouble. The locals decide to lock them all in a theatre, and kill them. Despite the fact that I'd heard some less than favourable things about this film before seeing it, I still hoped that it might be at least half decent because director Ryan Nicholson previously made the very decent 45 minute rape and revenge film 'Torched', but this film falls down simply because most of it is either ridiculous or boring. The film is obviously trying to hark back to the good old days of Grindhouse cinema (which Hostel did, successfully), but it really doesn't come off. Surprisingly, considering Nicholson's previous work in special effects - not even the gore is impressive...although it is a lot better than the acting! There's not much else I can say about this film...it's bad and not in a good way. Avoid it!"
2356,large_movie_review,"Reading web sites on Bette Davis one can find instances where authors claim that there is nothing special about her acting. I even found a site which claimed that Bette Davis' success was probably due to her luck. But Ms Davis films of 1934 tell quite the opposite. The most evident example are two films that she did only few weeks apart: Fog over Frisco and On Human Bondage. Characters she played in these movies, though both being negative, are quite different. Arlene in the former is a beautiful, glamorous and frivolous heiress and much more likable character than Mildred in the latter, which is a pale, uneducated and impudent Cockney waitress. Needless to say that Ms Davis played both characters very authentic and with the same enthusiasm. But even that is not all. The point is that the former role, which would be wished by most actresses of the day, was the one she was forced to play. The latter role, which seemed to most actresses as undesirable, career destroying role, was the one she fought for ferociously for months. And it was the latter role that launched her among the greatest stars. So there is no question that Ms Davis knew from the start what she was doing.<br /><br />The film, which tells about a medical student Phillip Carey (Leslie Howard) which falls unhappily in love with Cockney waitress Mildred Rogers (Bette Davis), has a few week points, but many more strong ones. The story is simply too big to be told in mere 83 minutes. For example, it is quite unclear why refined student found any interest in an impudent waitress in the first place. Well, there is one scene in which we are exposed to Ms Davis captivating eyes, but this is when his emotions are already fully evolved. Nevertheless, the integrity of the story is preserved by superior acting from Howard and Davis as well as fantastic Steiner's music which tells tons of emotions even when we do not see characters' faces. 
In fact the film is amalgamated by Phillip's walking sequences showing him from the back supplemented with shuddering two-tone repetition. Every detail is well thought - Max Steiner wrote a beautiful leitmotif for each women in Phillip's life, which is consistently used through the film. And a beautiful scene in which we see Sally's face in front of calendar is one of the sweetest scenes I've ever seen exactly due to Francis Dee's breathtaking beauty (Ms Dee was by the way considered to be too beautiful to play leading role in Gone with a Wind) as well as Steiner's captivating music. Camera movements between the some scenes is also original and refreshing.<br /><br />But my strongest objection is that events are presented too two-dimensionally, which induce viewer that Mildred is an ultimate slut. The most disgusting characters ought to be men which lure her into relationship, despite well knowing that they will abandon her after taking use of her, but they, curiously, finished portrayed as likable characters. After all, Mildred always - in her own specific, but still a honest way - lets Phillip know that she despises him and had no interest in him. Which he just refuses to hear. It is Phillips masochistic nature connected to his club foot and infantile experiences that is the principal reason of his love problem. He is enslaved to his club foot as much as to Mildred and perhaps has to be free of both to start a normal life. Of course, selfish and impudent Mildred, after discovering voluntary Phillip's bondage to her, did its own share to make his life hell. Even taking into account that she exploded after realizing that the bondage has loosen, it is less than clear why would she burn Phillip's money (Maugham intended different in his novel). 
After all, she could as well steal it and drunk gallons of champagne.<br /><br />For modern standards the film is a bit outdated, but each subsequent time you watch it, you can reveal new interesting details due to superior acting, fascinating music and original editing, so it does deserve the highest possible mark."
12288,large_movie_review,"Albuquerque is a film that has all the elements of a class A western, except one: the story, that really belongs to a class B or C. That was acceptable at the time the film was made, when people were so thrilled to see a western in color, but nowadays it just looks very primitive. Nonetheless for people who enjoy old westerns, it is entertaining, the original color and sound are very well kept on the DVD that recently came out. Gabby Hayes is a good sidekick, Lon Chaney is mean as always, and Randolph Scott a bit more cheerful than usual. In a film named Albuquerque you would expect to see something that would remind you of the city, but the town that is shown here could be just anywhere."


# 03 Twitter US Airline Sentiment
You can find more details and the dataset at: https://www.kaggle.com/datasets/crowdflower/twitter-airline-sentiment

In [8]:
# Load "twitter_us_airline.csv" and display a random sample of 10 rows

df_twitter_us_airline = pd.read_csv('../dataset/twitter_us_airline.csv')
df_twitter_us_airline.sample(10)

Unnamed: 0,tweet_id,airline_sentiment,airline_sentiment_confidence,negativereason,negativereason_confidence,airline,airline_sentiment_gold,name,negativereason_gold,retweet_count,text,tweet_coord,tweet_created,tweet_location,user_timezone
14382,5.69624e+17,positive,0.6779,,0.0,American,,mathardin,,0,"@AmericanAir I'm flying with your competitor today, starts with an U and ends with D. I will never make that mistake again. #americanforlife",,2/22/2015 14:24,Corpus Christi,
6110,5.6822e+17,neutral,1.0,,,Southwest,,FNCNerd,,0,"@SouthwestAir So I am flying Chicago-LAX-PHX just to go spotting at LAX and PHX airports, then I am flying back to Chicago :)",,2/18/2015 17:28,"Lake Buena Vista, Florida",Central Time (US & Canada)
14118,5.69665e+17,negative,1.0,Customer Service Issue,0.6579,American,,otisday,,0,"@AmericanAir @JimDayTV no, you couldn't care less. That's why their customer service is run by bots",,2/22/2015 17:06,Pekin,Eastern Time (US & Canada)
10894,5.68795e+17,negative,1.0,Bad Flight,1.0,US Airways,,carrieryan,,0,@USAirways headed to NYC from CLT. Funny to hear all phones ring at once and then the entire plane groan (has happened twice more).,,2/20/2015 7:33,"Charlotte, NC",Eastern Time (US & Canada)
11210,5.68424e+17,negative,1.0,Customer Service Issue,0.3663,US Airways,,Heartliss,,0,@USAirways that's it?!?!?,,2/19/2015 6:56,Here. There. Everywhere.,Quito
5530,5.68918e+17,negative,1.0,Customer Service Issue,1.0,Southwest,,davidgoodson71,,2,@SouthwestAir @JulGood1 she was traveling with me the one that got miscommunicated with,,2/20/2015 15:41,,
10398,5.69332e+17,negative,1.0,longlines,0.6753,US Airways,,AuroraBIZ,,0,@USAirways standing in line with 100 people all looking to do the same,,2/21/2015 19:05,"Boston, MA",Eastern Time (US & Canada)
12373,5.70215e+17,negative,1.0,Bad Flight,0.6717,American,,Aero0729,,0,@AmericanAir zoom in on the sauce and potatoes. This stuff is vile. And I mean vile. http://t.co/m2PHoavRxC,,2/24/2015 5:35,,
9233,5.70046e+17,negative,1.0,Flight Booking Problems,0.3505,US Airways,,JustPlain_Jake,,1,@USAirways did a pre schooler develop your app? The thing crashes every single time.,,2/23/2015 18:21,"ÜT: 38.8676126,-77.0831512",Quito
14200,5.69655e+17,negative,1.0,Bad Flight,0.3363,American,,point_princess,,0,@AmericanAir any idea what's up with flight AA3181?,,2/22/2015 16:28,"Ann Arbor, MI",


In [9]:
# twitter_us_airline data overview
df_twitter_us_airline.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14640 entries, 0 to 14639
Data columns (total 15 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   tweet_id                      14640 non-null  float64
 1   airline_sentiment             14640 non-null  object 
 2   airline_sentiment_confidence  14640 non-null  float64
 3   negativereason                9178 non-null   object 
 4   negativereason_confidence     10522 non-null  float64
 5   airline                       14640 non-null  object 
 6   airline_sentiment_gold        40 non-null     object 
 7   name                          14640 non-null  object 
 8   negativereason_gold           32 non-null     object 
 9   retweet_count                 14640 non-null  int64  
 10  text                          14640 non-null  object 
 11  tweet_coord                   1019 non-null   object 
 12  tweet_created                 14640 non-null  object 
 13  t

In [10]:
# Data preprocessing for twitter_us_airline corpus
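# Rename only the exact 'text' column; str.replace on the column index would also rewrite any name containing 'text'
df_twitter_us_airline = df_twitter_us_airline.rename(columns={'text': 'raw_sentence'})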
df_twitter_us_airline = df_twitter_us_airline.assign(corpus_name='twitter_us_airline')
df_twitter_us_airline_keep = ['corpus_name', 'raw_sentence']
df_twitter_us_airline = df_twitter_us_airline[df_twitter_us_airline_keep]

# Display a random sample of 10 rows from the processed data
df_twitter_us_airline.sample(10)

Unnamed: 0,corpus_name,raw_sentence
7708,twitter_us_airline,@JetBlue she did. she's still waiting on a callback....the lack of follow through is what's concerning. smh
799,twitter_us_airline,@united man I can't wait to book my ticket now! Thanks JP you're a life sabe
5973,twitter_us_airline,@SouthwestAir boarding passes now compatible with #iPhone #Passbook \nhttp://t.co/1ESmMnIZEk
6716,twitter_us_airline,@SouthwestAir Where should I fly by May 3rd? Plz advise.
5918,twitter_us_airline,@SouthwestAir do you have any flights from NAS-BWI ton March 23?
8759,twitter_us_airline,@JetBlue they couldn't do it sun then it was supposed to be mon after 6pm then today from 11 to 3 then from 6 to 8
11423,twitter_us_airline,@USAirways a $100 @Samsonite - totaled. Not happy. Not at all.
2684,twitter_us_airline,"@united Don't know her last name, but Karen at your call center is terrific. Friendly, helpful. Terrific representative. Kudos."
9006,twitter_us_airline,@USAirways - been standing at the gate for 45 min trying to go standby bc I will miss my connection. No help! Do NOT fly US AIRWAYS!
13247,twitter_us_airline,@AmericanAir flight 353
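
The rename, tag, and select steps above repeat for every corpus, so they could be wrapped in one small function. The sketch below is only an illustration: `to_corpus_frame` and the toy `toy_tweets` frame are hypothetical names that do not appear in the notebook, and the two sample tweets are made up.

```python
import pandas as pd


def to_corpus_frame(df: pd.DataFrame, text_col: str, corpus_name: str) -> pd.DataFrame:
    """Hypothetical helper: reduce a corpus to the columns corpus_name and raw_sentence."""
    # Selecting with a list returns a new DataFrame, so the input frame is left untouched
    out = df[[text_col]].rename(columns={text_col: 'raw_sentence'})
    # Put the corpus label first to match the column order used in the notebook
    out.insert(0, 'corpus_name', corpus_name)
    return out


# Toy stand-in for the airline tweets (made-up rows, not taken from the dataset)
toy_tweets = pd.DataFrame({
    'airline': ['United', 'Delta'],
    'text': ['@united thanks for the help!', '@delta delayed again'],
})

toy_processed = to_corpus_frame(toy_tweets, 'text', 'twitter_us_airline')
print(toy_processed)
```

With such a helper, each corpus would need a single call that names its text column and its corpus label.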


# 04 Sentiment140 dataset
You can find more details and the dataset at: https://www.kaggle.com/datasets/kazanova/sentiment140

In [11]:
# Load "sentiment140.csv" and display a random sample of 10 rows
# Note: reading the file without an explicit encoding raised a UnicodeDecodeError.
# Passing encoding='latin1' loads the file successfully.

df_sentiment140 = pd.read_csv('../dataset/sentiment140.csv', encoding='latin1')
df_sentiment140.sample(10)

Unnamed: 0,0,1467810369,Mon Apr 06 22:19:45 PDT 2009,NO_QUERY,_TheSpecialOne_,"@switchfoot http://twitpic.com/2y1zl - Awww, that's a bummer. You shoulda got David Carr of Third Day to do it. ;D"
1113027,4,1972524370,Sat May 30 09:23:15 PDT 2009,NO_QUERY,hillaroo,3-cheers for earl grey tea
1357685,4,2048052578,Fri Jun 05 14:35:20 PDT 2009,NO_QUERY,daveredford,@adrianandsabine Thanks! You too!
1456702,4,2063520613,Sun Jun 07 02:59:56 PDT 2009,NO_QUERY,Geekwithalife,@mileycyrus looks like an amazing place to be
733778,0,2264395184,Sun Jun 21 03:44:13 PDT 2009,NO_QUERY,antonioval,@misspid I envy you - been raining non stop wherever I've been so far boooo
348951,0,2016971096,Wed Jun 03 07:39:17 PDT 2009,NO_QUERY,tenderheartjb,"@dapmd couldn't breathe or swallow, yikes, seriously allergic to soy and pineapple and five other foods"
929297,4,1759932365,Sun May 10 20:16:45 PDT 2009,NO_QUERY,uhohcaitie,@CarloHilton cooool if i see you ill come up and say hello
228461,0,1978265902,Sat May 30 22:52:14 PDT 2009,NO_QUERY,akadjtonyc,I feel bad for these seniors at GKHS - their admin handcuffed me and I went from packed dance floor to nothing
1219384,4,1989942186,Mon Jun 01 03:20:56 PDT 2009,NO_QUERY,brittcullen,@RobbySTEREOS BEST MOVIE OF THE YEAR!!!!
1075325,4,1967171382,Fri May 29 19:06:03 PDT 2009,NO_QUERY,hoping_for_sin,@kconWHOA Got it.
197099,0,1970934535,Sat May 30 05:42:36 PDT 2009,NO_QUERY,smurfberry,"oh, i am so conflicted @joincidence, you do english lang at manchester, right? pros/cons?"
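
The header row of the output above is actually the first tweet record, which suggests the CSV file has no header line. One possible way to keep that record as data is to pass `header=None` together with explicit column names. The sketch below is a minimal illustration on two in-memory rows; the names in `SENTIMENT140_COLUMNS` are an assumption based on the dataset's public description, not something stored in the file, and the first tweet text is shortened.

```python
import io

import pandas as pd

# Assumed column names (not present in the CSV itself)
SENTIMENT140_COLUMNS = ['target', 'ids', 'date', 'flag', 'user', 'text']

# Two rows in the same layout as the real file, used here instead of the full CSV
sample_csv = io.StringIO(
    '0,1467810369,Mon Apr 06 22:19:45 PDT 2009,NO_QUERY,_TheSpecialOne_,"Awww, that\'s a bummer."\n'
    '4,1972524370,Sat May 30 09:23:15 PDT 2009,NO_QUERY,hillaroo,3-cheers for earl grey tea\n'
)

# header=None tells pandas that the first line is data, not column names
df_sample = pd.read_csv(sample_csv, header=None, names=SENTIMENT140_COLUMNS)
print(df_sample[['target', 'text']])
```

Applied to the real file (with `encoding='latin1'` as above), this approach would yield one more row than the 1,599,999 entries reported by the notebook, because the first record would no longer be consumed as the header.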


In [12]:
# sentiment140 data overview
df_sentiment140.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1599999 entries, 0 to 1599998
Data columns (total 6 columns):
 #   Column                                                                                                               Non-Null Count    Dtype 
---  ------                                                                                                               --------------    ----- 
 0   0                                                                                                                    1599999 non-null  int64 
 1   1467810369                                                                                                           1599999 non-null  int64 
 2   Mon Apr 06 22:19:45 PDT 2009                                                                                         1599999 non-null  object
 3   NO_QUERY                                                                                                             1599999 non-null  object
 4   _

In [13]:
# Data preprocessing for sentiment140 corpus
# The CSV has no header row, so pandas used the first record as the column names.
# The last column holds the tweet text; rename it by position rather than matching its literal value.
df_sentiment140 = df_sentiment140.rename(columns={df_sentiment140.columns[-1]: 'raw_sentence'})
df_sentiment140 = df_sentiment140.assign(corpus_name='sentiment140')
df_sentiment140_keep = ['corpus_name', 'raw_sentence']
df_sentiment140 = df_sentiment140[df_sentiment140_keep]

# Display a random sample of 10 rows from the processed data
df_sentiment140.sample(10)

Unnamed: 0,corpus_name,raw_sentence
1469291,sentiment140,had an awesome weekend!! packing for wisconsin and germany
574937,sentiment140,"eating spaghetti, yum today its not that great weather.."
552483,sentiment140,@emjonaslover no i dont see it
291265,sentiment140,And hotmail won't let me send it either
30015,sentiment140,@RayRay_Sodmg qood morninq ! Today's school
1391426,sentiment140,Thanks to all who voted I won!!! Get back to you inawhile i need a nap.. www.iamstayingalive.blogspot.com
207797,sentiment140,I want pain to go away so I can concentrate on designing
560067,sentiment140,Swell is bigger if a little choppy -sun gone away though- Can't have everything I suppose
526296,sentiment140,thinkin about my sister. Haven't seen her in a while
517281,sentiment140,My industrial piercing has been giving me grief all day. It's all swollen and red plus it was throbbi... - http://mobypicture.com/?drhsv2


# Combine the 4 Datasets

In [14]:
# Concatenate all DataFrames into a single DataFrame
combined_df = pd.concat([df_stanford, df_large_movie_review, df_twitter_us_airline, df_sentiment140], ignore_index=True)

# Save the combined DataFrame to a CSV file
combined_df.to_csv('../dataset/combined_data.csv', index=False)

In [15]:
# Load "combined_data.csv" and display a random sample of 10 rows

combined_df = pd.read_csv('../dataset/combined_data.csv')
combined_df.sample(10)

Unnamed: 0,corpus_name,raw_sentence
210037,sentiment140,@myria101 Like he's an inmate! Disgusting and it breaks my heart
1042079,sentiment140,@rdewijngaert Leuk! Je zit dus helemaal in de goede richting! what to do:use it to support your conclusion:'other experts also think...'
1062586,sentiment140,JUST GOT BACK FROM LIVING WATERS DRAMA THING borrrrring ours was sooooooo much better
267244,sentiment140,"to me :S but he coudnt be druk on a thursday. ih haate thi, im in love and wont forget aobut him. might go movies with him today if he"
586783,sentiment140,@mayabutterfly Yeah - I'm already sad 4 leaving NYC again
925315,sentiment140,@davidarchie i love zero gravity!!! kisses from chile!! hope you are having fun with mcfly!!
1521578,sentiment140,thank you steph i shall try!
960317,sentiment140,Lil Kim ; Download omq ; i think i like that sonqq noww lol .
1135485,sentiment140,everyone should follow @A1O they are the nicest guys i have seriously ever met. do it up.
747609,sentiment140,@underscored that stinks are you still coming today?


In [16]:
# combined_data data overview
combined_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1651494 entries, 0 to 1651493
Data columns (total 2 columns):
 #   Column        Non-Null Count    Dtype 
---  ------        --------------    ----- 
 0   corpus_name   1651494 non-null  object
 1   raw_sentence  1651494 non-null  object
dtypes: object(2)
memory usage: 25.2+ MB


In [17]:
# check the size of the dataframe
combined_df.shape

(1651494, 2)
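
A quick sanity check after concatenation is to confirm that each corpus contributes exactly as many rows as its source frame. The sketch below shows the idea on toy frames; `toy_parts` and its contents are made up for illustration, and in the notebook the same comparison would be run against the four processed DataFrames.

```python
import pandas as pd

# Toy stand-ins for the processed per-corpus frames (made-up sentences)
toy_parts = {
    'large_movie_review': pd.DataFrame(
        {'corpus_name': 'large_movie_review', 'raw_sentence': ['a fine western', 'too long']}
    ),
    'twitter_us_airline': pd.DataFrame(
        {'corpus_name': 'twitter_us_airline', 'raw_sentence': ['@airline thanks']}
    ),
    'sentiment140': pd.DataFrame(
        {'corpus_name': 'sentiment140', 'raw_sentence': ['good morning', 'so tired', 'yay']}
    ),
}

toy_combined = pd.concat(list(toy_parts.values()), ignore_index=True)

# Rows per corpus in the combined frame versus rows in each source frame
expected_counts = {name: len(part) for name, part in toy_parts.items()}
actual_counts = toy_combined['corpus_name'].value_counts().to_dict()

print(actual_counts)
print(actual_counts == expected_counts)
```

If the two dictionaries differ, rows were lost or duplicated somewhere between preprocessing and the combined file.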