# Entity Extraction with Generative Models

This notebook demonstrates how to use Cohere's generative models to extract the name of a film from the title of an article. It illustrates the broader use case of structured generation based on providing multiple examples in the prompt.



![Extracting Entities from text](https://github.com/cohere-ai/notebooks/raw/main/notebooks/images/keyword-extraction-gpt-models.png)


We'll use post titles from the r/Movies subreddit and, for each title, extract which movie the post is about. If the model cannot detect a movie name in the title, it will return "none".

## Setup
Let's start by installing the packages we need.

In [None]:
!pip install cohere requests tqdm

We'll then import these packages and declare the function that fetches post titles from the Pushshift API.

In [256]:
import cohere
import pandas as pd
import requests
import datetime
from tqdm import tqdm
pd.set_option('display.max_colwidth', None)

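def get_post_titles(**kwargs):
    """ Gets data from the pushshift api. Read more: https://github.com/pushshift/api """
    base_url = "https://api.pushshift.io/reddit/search/submission/"
    # Keyword arguments are passed through unchanged as query parameters
    response = requests.get(base_url, params=kwargs, timeout=30)
    # Raise on HTTP errors instead of failing later on a missing 'data' key
    response.raise_for_status()
    return [post['title'] for post in response.json()['data']]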


You'll need your API key for this next cell. [Sign up to Cohere](https://os.cohere.ai/) and get one if you haven't yet.

In [None]:
# Paste your API key here. Remember to not share publicly
api_key = ''

# Create the Cohere client (API keys are available from os.cohere.ai)
co = cohere.Client(api_key)

## Preparing examples for the prompt

In our prompt, we'll present the model with examples of the type of output we're after. We take a set of subreddit post titles and label them ourselves. The label is the name of the movie mentioned in the title (or "none" if no movie is mentioned).


![Labeled dataset of text and extracted text](https://github.com/cohere-ai/notebooks/raw/main/notebooks/images/keyword-extraction-dataset.png)



In [215]:

movie_examples = [
("Deadpool 2", "Deadpool 2 | Official HD Deadpool's \"Wet on Wet\" Teaser | 2018"),
("none", "Jordan Peele Just Became the First Black Writer-Director With a $100M Movie Debut"),
("Joker", "Joker Officially Rated “R”"),
("Free Guy", "Ryan Reynolds’ 'Free Guy' Receives July 3, 2020 Release Date - About a bank teller stuck in his routine that discovers he’s an NPC character in brutal open world game."),
("none", "James Cameron congratulates Kevin Feige and Marvel!"),
("Guardians of the Galaxy", "The Cast of Guardians of the Galaxy release statement on James Gunn"),
]




## Creating the prompt


![Extraction prompt containing the examples and the input text](https://github.com/cohere-ai/notebooks/raw/main/notebooks/images/extraction-prompt-example.png)


In [331]:

class cohereExtractor():
    def __init__(self, examples, example_labels, labels, task_desciption, example_prompt):
        self.examples = examples
        self.example_labels = example_labels
        self.labels = labels
        self.task_desciption = task_desciption
        self.example_prompt = example_prompt

    def make_prompt(self, example):
        examples = self.examples + [example]
        labels = self.example_labels + [""]
        return (self.task_desciption +
                "\n---\n".join( [examples[i] + "\n" +
                                self.example_prompt + 
                                 labels[i] for i in range(len(examples))]))

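    def extract(self, example):
        # Low temperature keeps the output close to deterministic; generation
        # stops at the first newline so only the label line is returned.
        extraction = co.generate(
            model='large',
            prompt=self.make_prompt(example),
            max_tokens=10,
            temperature=0.1,
            stop_sequences=["\n"])
        # Drop the last character, which is the "\n" stop sequence when generation ended on it
        return extraction.generations[0].text[:-1]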


cohereMovieExtractor = cohereExtractor([e[1] for e in movie_examples], 
                                       [e[0] for e in movie_examples], [],
                                       "", 
                                       "extract the movie title from the post:")

# Uncomment to inspect the full prompt:
# print(cohereMovieExtractor.make_prompt('<input text here>'))
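
To see what the model actually receives, the prompt layout can be reproduced without calling the API. The sketch below mirrors `make_prompt` as a standalone function. `build_prompt` is a name introduced here for illustration, and only two of the labeled examples are used to keep the output short.

```python
def build_prompt(examples, labels, instruction, new_text, task_description=""):
    """Standalone mirror of cohereExtractor.make_prompt (illustrative helper)."""
    texts = examples + [new_text]
    # The final label is left empty so the model completes it.
    targets = labels + [""]
    blocks = [f"{text}\n{instruction}{target}" for text, target in zip(texts, targets)]
    return task_description + "\n---\n".join(blocks)


# Two of the notebook's labeled examples, in the same (label, title) order.
demo_examples = [
    ("Joker", "Joker Officially Rated “R”"),
    ("none", "James Cameron congratulates Kevin Feige and Marvel!"),
]

prompt = build_prompt(
    [title for _, title in demo_examples],
    [label for label, _ in demo_examples],
    "extract the movie title from the post:",
    "First poster for Pixar's Luca",
)
print(prompt)
```

Each example appears as the title, a newline, the instruction, and the label, with `---` lines between examples. As in the class above, the label follows the instruction directly, and the prompt ends on the bare instruction for the new title.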

## Getting the data
Let's now make the API call to get the top posts for 2021 from r/movies.

In [299]:
num_posts = 10

movies_list = get_post_titles(size=num_posts, 
      after=str(int(datetime.datetime(2021,1,1,0,0).timestamp())), 
      before=str(int(datetime.datetime(2022,1,1,0,0).timestamp())), 
      subreddit="movies", 
      sort_type="score", 
      sort="desc")
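
One detail to keep in mind: calling `.timestamp()` on a naive `datetime` interprets it in the machine's local time zone, so the exact window depends on where the notebook runs. A minimal sketch of a time-zone-independent variant, assuming UTC boundaries are what is wanted (`year_bounds_utc` is a hypothetical helper, not part of the original notebook):

```python
import datetime


def year_bounds_utc(year):
    """Return (after, before) epoch-second strings spanning `year` in UTC."""
    utc = datetime.timezone.utc
    start = datetime.datetime(year, 1, 1, tzinfo=utc)
    end = datetime.datetime(year + 1, 1, 1, tzinfo=utc)
    return str(int(start.timestamp())), str(int(end.timestamp()))


after, before = year_bounds_utc(2021)
print(after, before)  # 1609459200 1640995200
```

These two strings could be passed as the `after` and `before` arguments in place of the inline expressions.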

In [300]:
movies_list

['Hayao Miyazaki Got So Bored with Retirement He Started Directing Again ‘in Order to Live’',
 "First poster for Pixar's Luca",
 'New images from Space Jam: A New Legacy',
 'Official Poster for "Sonic the Hedgehog 2"',
 'Ng Man Tat, legendary HK actor and frequent collborator of Stephen Chow (Shaolin Soccer, God of Gambler) died at 70',
 'Zack Snyder’s Justice League has officially been Rated R for for violence and some language',
 'HBOMax and Disney+ NEED to improve their apps if they want to compete with Netflix.',
 'I want a sequel to Rat Race where John Cleese’s character dies and invites everyone from the first film to his funeral, BUT, he’s secretly set up a Rat Maze to trap them all in. A sort of post-mortem revenge on them for donating all his wealth to charity.',
 "'Trainspotting' at 25: How an Indie Film About Heroin Became a Feel-Good Classic",
 '‘Avatar: The Last Airbender’ Franchise To Expand With Launch Of Nickelodeon’s Avatar Studios, Animated Theatrical Film To Start Pr

## Running the model
And now we loop over the posts and process each one of them with our extractor.

In [302]:
results = []
for text in tqdm(movies_list):
    try:
        extracted_text = cohereMovieExtractor.extract(text)
        results.append(extracted_text)
    except Exception as e:
        print('ERROR: ', e)
        # Keep results aligned with movies_list so the DataFrame below can be built
        results.append(None)

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:13<00:00,  1.35s/it]


Let's look at the results:

In [303]:
pd.DataFrame(data={'text': movies_list, 'extracted_text': results})

| | text | extracted_text |
|---|---|---|
| 0 | Hayao Miyazaki Got So Bored with Retirement He Started Directing Again ‘in Order to Live’ | none |
| 1 | First poster for Pixar's Luca | Pixar's Luca |
| 2 | New images from Space Jam: A New Legacy | Space Jam: A New Legacy |
| 3 | Official Poster for "Sonic the Hedgehog 2" | Sonic the Hedgehog 2 |
| 4 | Ng Man Tat, legendary HK actor and frequent collborator of Stephen Chow (Shaolin Soccer, God of Gambler) died at 70 | none |
| 5 | Zack Snyder’s Justice League has officially been Rated R for for violence and some language | Justice League |
| 6 | HBOMax and Disney+ NEED to improve their apps if they want to compete with Netflix. | none |
| 7 | I want a sequel to Rat Race where John Cleese’s character dies and invites everyone from the first film to his funeral, BUT, he’s secretly set up a Rat Maze to trap them all in. A sort of post-mortem revenge on them for donating all his wealth to charity. | none |
| 8 | 'Trainspotting' at 25: How an Indie Film About Heroin Became a Feel-Good Classic | Trainspotting |
| 9 | ‘Avatar: The Last Airbender’ Franchise To Expand With Launch Of Nickelodeon’s Avatar Studios, Animated Theatrical Film To Start Production Later This Year | Avatar: The Last Airbender |


Looking at these results, the model got 9/10 correct. It didn't pick up on Shaolin Soccer and God of Gambler in row 4. It also returned "Pixar's Luca" instead of "Luca" for the second example (row 1), but we'll let that one slide.

When experimenting with extraction prompts, we'll often find edge cases along the way. What if a post mentions two movies, for example? The more such cases we run into, the more examples we can add to the prompt to address them.
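
Growing the example list as edge cases turn up can be sketched as follows. This is an illustration only: the two-movie title comes from the results above, but the labeling convention (joining several titles with a comma) is an assumption, and `add_examples` is a hypothetical helper rather than part of the notebook.

```python
def add_examples(existing, new_examples):
    """Return a new list with extra (label, title) pairs appended, skipping titles already present."""
    seen_titles = {title for _, title in existing}
    merged = list(existing)
    for label, title in new_examples:
        if title not in seen_titles:
            merged.append((label, title))
            seen_titles.add(title)
    return merged


# A subset of the notebook's labeled examples, in (label, title) order.
base_examples = [
    ("Joker", "Joker Officially Rated “R”"),
    ("none", "James Cameron congratulates Kevin Feige and Marvel!"),
]

# Hypothetical label: joining both titles with a comma is an assumed convention.
edge_cases = [
    ("Shaolin Soccer, God of Gambler",
     "Ng Man Tat, legendary HK actor and frequent collborator of Stephen Chow "
     "(Shaolin Soccer, God of Gambler) died at 70"),
]

extended_examples = add_examples(base_examples, edge_cases)
print(len(extended_examples))  # 3
```

The extended list can then be split into titles and labels and passed to `cohereExtractor` in the same way as `movie_examples`.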

## How well does this work?
We can better measure the performance of this extraction method using a larger labeled dataset. So let's load a test set of 100 examples:

In [314]:
test_df = pd.read_csv('movie_extraction_test_set_100.csv',index_col=0)
test_df

| | text | label |
|---|---|---|
| 0 | Disney's streaming service loses some movies due to old licensing deals | none |
| 1 | Hi, I’m Sam Raimi, producer of THE GRUDGE which hits theaters tonight. Ask Me Anything! | The Grudge |
| 2 | 'Parasite' Named Best Picture by Australia's AACTA Awards | Parasite |
| 3 | Danny Trejo To Star In Vampire Spaghetti Western ‘Death Rider in the House of Vampires’ | Death Rider in the House of Vampires |
| 4 | I really wish the 'realistic' CGI animal trend would end. | none |
| ... | ... | ... |
| 95 | Hair Love \| Oscar Winning Short Film (Full) | Hair Love |
| 96 | First image of Jason Alexander in Christian film industry satire 'Faith Based' | Faith Based |
| 97 | 'Borderlands' Movie in the Works From Eli Roth, Lionsgate | Borderlands |
| 98 | Taika Waititi putting his Oscar "away" after winning best adapted screenplay for JOJO RABBIT | Jojo Rabbit |
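
Once predictions exist for the test set, one simple way to score them is exact-match accuracy. The sketch below is one possible scoring function, not necessarily the metric this notebook uses. It assumes the comparison should ignore case and surrounding whitespace (so "JOJO RABBIT" would match "Jojo Rabbit"), and `normalize` and `exact_match_accuracy` are names introduced here.

```python
def normalize(text):
    """Lowercase and trim so that casing and stray whitespace do not count as errors."""
    return text.strip().lower()


def exact_match_accuracy(predictions, labels):
    """Fraction of predictions equal to their label after normalization."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    hits = sum(normalize(p) == normalize(l) for p, l in zip(predictions, labels))
    return hits / len(labels)


# Toy inputs for illustration; real predictions would come from the extractor.
toy_labels = ["none", "The Grudge", "Parasite", "Jojo Rabbit"]
toy_predictions = ["none", "the grudge ", "Parasite", "Taika Waititi"]
print(exact_match_accuracy(toy_predictions, toy_labels))  # 0.75
```

Exact match is strict: an output such as "Pixar's Luca" for the label "Luca" would count as a miss, so a looser comparison may be preferable depending on how the labels were written.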


In [180]:
test_set =[("none", "Disney's streaming service loses some movies due to old licensing deals"),
("The Grudge", "Hi, I’m Sam Raimi, producer of THE GRUDGE which hits theaters tonight. Ask Me Anything!"),
("Parasite", "'Parasite' Named Best Picture by Australia's AACTA Awards"),
("Death Rider in the House of Vampires", "Danny Trejo To Star In Vampire Spaghetti Western ‘Death Rider in the House of Vampires’"),
("none", "I really wish the 'realistic' CGI animal trend would end."),
("none", "Dolemite is My Name might be my favourite film of 2019"),
("Goodfellas", "De Niro recreating a scene from Goodfellas to test Irishman deaging (3:30 in)"),
("none", "Appreciation Post: Adam Driver"),
("Marriage Story", "Laura Dern loves both of her awards movies — Marriage Story and Little Women — equally"),
("Ad Astra", "A Lot of the Sound Effects in Ad Astra Were Just Tommy Lee Jones's Voice"),
("End of Watch", "The on-screen chemistry between Jake Gyllenhaal and Michael Peña in End of Watch (2012) is one of the best pairings of performances I have ever seen. Such a fantastic film. Cannot believe it's been 8 years since its release, and only just coming onto my radar now."),
("Cats", "'Cats' Visual Woes Began Early On In Production"),
("1917", "Is there anyway way I could get a copy of 1917 for my dying father in law?"),
("Fun Home", "Jake Gyllenhaal to Produce &amp; Star in Movie Musical Adaptation of Fun Home"),
("none", "Actors who criticized their movies shortly after release"),
("none", "Netflix Movies Set for 2020 Release: Fincher, Spike Lee, Kaufman and more"),
("Ni No Kuni", "Ni No Kuni anime movie hits Netflix this month"),
("none", "Feel good and fun comedies to lighten the mood?"),
("none", "What is your favorite movie trailer of all time?"),
("Knives Out", "Movies Like Knives Out"),
("none", "Collider cancels Movie Talk, Jedi Council, Heroes, &amp; Collider Live"),
("none", "Disney Explodes Box Office Records with $11.1 Billion Worldwide for 2019"),
("The Big Year", "Please do yourself a favor and consider watching 'The Big Year' (2011, Jack Black, Owen Wilson, Steve Martin)"),
("The Flintstones", "the Flintstones movie (1994) might have some of best production design in film."),
("none", "Movies with the worst 'facts'"),
("Togo", "Have you seen Togo? If not.. you should. The real story of the Nome serum run that the Balto movie is based on."),
("Dracula Untold", "Dracula Untold is a guilty pleasure of mine, fun movie and chemistry between Luke Evans and Sarah Gadon was off the charts"),
("none", "Movies you thought bombed at the box office but didn't"),
("Parasite", "National Society of Film Critics names 'Parasite' best picture"),
("The Grudge", "Official Discussion: The Grudge (2020) [SPOILERS]"),
("Uncut Gems", "How Uncut Gems Won Over the Diamond District"),
("Back to the Future II", "Michael J. Fox and Christopher Lloyd posing for the Back to the Future II poster in 1989 that would later be illustrated by Drew Struzan"),
("Blood Simple", "* Coen Brothers Marathon MOVIE #1: 'Blood Simple' *"),
("none", "Which respected actor surprised you with their second career of slumming it in crappy cinema?"),
("none", "What happened to subtitles without closed captioning?"),
("none", "What's one type of movie you wish they made more of?"),
("Once Upon a Time in Hollywood", "Quentin Tarantino says you'll probably see that 4-hour 'Once Upon a Time in Hollywood' in a year or so."),
("King Kong", "When will Warner Bros release King Kong (1933) on 4K?"),
("Austin Powers", "Poster for new Austin Powers movie"),
("none", "2019 in film - with 'Movies' by Weyes Blood"),
("Gretel and Hansel", "Poster for Gretel and Hansel (2020) starring Sophia Lillis and Sam Leakey as the title characters."),
("2001: A Space Odyssey", "A poster I made for 2001: A Space Odyssey"),
("ROBOCOP", "'I'd buy that for a dollar.' - Possibly the greatest quote in movie history. (ROBOCOP)"),
("Brief Encounter", "Brief Encounter is a damn masterpiece and if you like Romance films and haven't seen it I highly recommend checking it out."),
("The Two Popes", "The Two Popes | VFX Breakdown"),
("Once Upon a Time in Hollywood", "In Once Upon a Time in Hollywood, I found Rick and Cliff's relationship to be one-sided and borderline resentful, not the wholesome bromance everyone makes it out to be"),
("The Grudge", "‘The Grudge’ Film Review: Classy Cast Gets Trapped Inside Played-Out Sequel"),
("The Grude", "The Poster for The Grude (2020) has actual hair dropping from the top. Definitely one of the best posters I've ever seen."),
("none", "What movie of the past decade had the best special effects?"),
("none", "Which Are Your Favorite Exposition Scenes In Film?"),
("Unknown Soldier", "Unknown Soldier - the most expensive Finnish movie ever made"),
("Wonder Woman 1984", "New Image For Wonder Woman 1984"),
("Hard Boiled", "What does Reddit think of the movie, Hard Boiled (1992)?"),
("Little Women", "Little Women"),
("Bohemian Rhapsody", "Bohemian Rhapsody was most-watched film at home in 2019"),
("Parasite", "Parasite script available to read"),
("The Lighthouse", "The Lighthouse Review: The best shitfaced acting I've seen."),
("Mad Max", "The Mad Max franchise is my all time favorite movie series. I finally watched Waterworld tonight. Oh man why didnt I see this sooner?"),
("none", "Reddit I am currently stuck in my house for the next few days due my wheelchair being out for repairs what are some movies I should check out to pass the time?"),
("Rocky", "More than mere schmaltz, Rocky tells an underdog story of mythic force"),
("none", "Movies that make the urban environment one of the main characters"),
("Tumbbad", "Tumbbad"),
("Shin godzilla", "What are your thoughts on the 2016 best picture award winning Shin godzilla?"),
("Run", "What is happening with Aneesh Chaganty's Run?"),
("Dune", "DUNE: The first image of the Ornithopter (?)"),
("none", "What Are Your Top 5 Favorite Animated Movies Of All Time?"),
("Interstellar", "Interstellar's video messages scene is one of the most emotional scenes I've ever seen"),
("Parasite", "Bong Joon Ho's Parasite was INCREDIBLE"),
("none", "10 SURPRISING TOP COMIC BOOK SONGS THAT WILL MAKE YOU HAPPY!"),
("Star Wars", "How A New Hope created Pixar Animation Studios"),
("Willow", "Why didn't Willow do better?"),
("none", "What Would Be A Good Film That Adam Sandler and Nicolas Cage Stars In?"),
("Ricochet", "Ricochet 1991"),
("none", "What's the best Nicolas Cage performance?"),
("none", "Looking for recommendations for two specific types of movies."),
("1917", "A scene from the movie 1917 was recreated from the stroyboards."),
("Ratatouille", "When does the movie 'Ratatouille' actually play?"),
("John Wick", "Has anyone taken a close listen to the sound track from John Wick? It’s seriously the key to the intensity of the action scenes. The use of non mainstream music helps to not distract from the action"),
("Dreams", "When Martin Scorsese played the role of Vincent van Gogh in the surreal Akira Kurosawa film ‘Dreams’"),
("Congo", "Can we all take a moment to appreciate 'Congo?' (1995)"),
("none", "I spent 2019 watching, keeping track and ranking 237 movies I had never seen before."),
("none", "A montage of movie characters falling down set to the song 'Can't Help Falling In Love' by Elvis Presley"),
("Wonder Woman", "New Wonder Woman image"),
("Jojo Rabbit", "Jojo Rabbit (2020)"),
("none", "No, Marvel Studios Won’t Debut Its First Trans Character ‘Very Soon’"),
("El Topo", "El Topo 4K restoration trailer"),
("none", "Who Are The Greatest Modern Directors?"),
("none", "Help me pick a movie for tonight!"),
("none", "Thoughts on the Irishman ..."),
("none", "Where to start with Bergman's filmography?"),
("Alice Doesn’t Live Here Anymore", "Monterey Dreamin’: 45 Years of Alice Doesn’t Live Here Anymore"),
("Edge of Tomorrow", "Edge of Tomorrow is such a well-made sci-fi flick and its conceit is so effectively utilized (rather than being a gimmick) but I do have a question..."),
("The Ice Road", "First Image from Thriller 'The Ice Road' - Starring Liam Neeson &amp; Laurence Fishburne - After a remote diamond mine collapses in far northern Canada, an ice driver leads a rescue mission over a frozen ocean to save the lives of trapped miners despite thawing waters and a threat they never see coming."),
("Swallow", "Swallow - Official Trailer I 2020 I IFC Films."),
("Rocky Balboa", "Sylvester Stallone's performance in Rocky Balboa (2006) is so damn good"),
("Hair Love", "Hair Love | Oscar Winning Short Film (Full)"),
("Faith Based", "First image of Jason Alexander in Christian film industry satire 'Faith Based'"),
("Borderlands", "'Borderlands' Movie in the Works From Eli Roth, Lionsgate"),
("Jojo Rabbit", "Taika Waititi putting his Oscar “away” after winning best adapted screenplay for JOJO RABBIT"),
("Parasite", "Oscar-Winning 'Parasite' Lands One-Week IMAX Release Starting February 21 in 200+ Theaters"),
("Cus And Mike", "Sir Anthony Hopkins To Star As Mike Tyson’s Legendary Trainer Cus D’Amato In Boxing Movie ‘Cus And Mike’ directed and written by The Notebook filmmaker"),
("Birds of Prey", "‘Birds of Prey’ opens to #1 with terrible $33.3M, the lowest opening ever for a DCEU film"),
("none", "It's a new decade so we are voting to determine Reddit's Top 100 of the 2010s! Come in and cast your ballot for the films you think deserve to make the list."),
("none", "Christian Bale is one of the best actor of this generation"),
("Connected", "Phil Lord and Chris Miller's new animated film, 'Connected'"),
("none", "Paul Newman Movies: 20 Greatest Films Ranked Worst to Best"),
("Siberia", "First Full Trailer for Abel Ferrara's 'Siberia' Film Starring Willem Dafoe"),
("Transformers: War For Cybertron Trilogy", "Transformers: War For Cybertron Trilogy: Siege (by RoosterTeeth)"),
("Songs From The Second Floor", "Director Roy Andersson pulls out of Berlin Film Festival due to health issues, known for 'Songs From The Second Floor' and ' Pigeon Sat on a Branch Reflecting on Existence'"),
("A Farewell To Arms", "“A Farewell To Arms” idea"),
("none", "Which film is remade would actually be better?"),
("The Voice of Nemo", "The Voice of Nemo: Same Interview, 17 Years Apart"),
("none", "Your favourite actor who has never been nominated for an Oscar?"),
("Cats", "Oscars ‘Cats’ Joke Sparks Fury From Visual Effects Society"),
("none", "ICYMI: Bong Joon Ho is Developing an Action/Horror Film as One of His Next Projects"),
("Antlers", "New Poster for 'Antlers'"),
("The Good Nurse", "Netlfix Buys Jessica Chastain-Eddie Redmayne Thriller ‘The Good Nurse’ for Record $25M - True story of the pursuit and capture of Charlie Cullen, a nurse who is regarded as one of the most prolific serial killers in history."),
("Sonic the Hedgehog", "I've seen Sonic the Hedgehog (2020) AMA"),
("The Matrix 4", "'Sense8' Alum Brian J. Smith to Reteam with Lana Wachowski for 'The Matrix 4'"),
("none", "Federico Fellini - BFI video about how he pulled influence from old comic books"),
("Happy Gilmore", "Happy Gilmore's Cable Friendly Voiceover"),
("The Oscars", "The Oscars - All Best Actress Winners"),
("Ex Machina", "The Real Implications Of Ex Machina's Turing Test"),
("none", "All 30 Akira Kurosawa Movies Ranked From Worst To Best"),
("Chlorine", "A feature-length, student film made on a tiny budget: Chlorine."),
("Birds of Prey", "'Birds of Prey (and the Fantabulous Emancipation of One Harley Quinn)' Changes Title To 'Harley Quinn: Birds of Prey'"),
("Parasite", "The Illustrious Career of Park So-dam (Parasite)"),
("none", "Pedro Almodovar Plans New Short Film With Tilda Swinton"),
("none", "Recommend directors’ filmographies"),
("none", "10 Great Cyberpunk Movies You’ve Probably Never Seen"),
("Knives Out", "‘Knives Out’ 4K Blu-Ray Review: Sharp"),
("none", "What's A 'Cult Classic' Film That You Watched But Didn't Like?"),
("none", "Move over Criterion, Richard Stanley visits the SEVERIN CELLAR"),
("Jojo Rabbit", "Jojo Rabbit: “Jews look just like us” is the weakest argument against racism"),
("none", "[HELP] I don't remember the title of a movie i watched a while ago"),
("none", "What is the most disappointing or worst movie of 2019"),
("none", "Do you agree with this guy about the current state of movies/Hollywood?"),
("Casino Royale", "Today in 2006 the Casino Royale crew shot the opening scene, in which James Bond becomes a 00 agent."),
("The Batman", "Matt Reeve's The Batman camera test with Robert Pattinson"),
("Indiana Jones 5", "Steven Spielberg Won’t Direct ‘Indiana Jones 5'"),
("Get Out", "Is 'Get Out (2017) ' actually a woke movie?"),
("Die Hard", "The McClane Canon - what if the bad “Die Hard” sequels were swapped out with better Bruce Willis movies??"),
("none", "Is there a resource/database on the internet that can tell you what order a scene was filmed in a certain movie?"),
("none", "What are Some Films that are Worthy of Owning?"),
("none", "I can't find a movie I watched as a kid."),
("none", "Been bugging me for years..."),
("none", "A lot of reviews lately mention 'pushing an agenda' - do you agree or did audience expectations change during the last decade?"),
("The Gambler", "Film where a main character is making bets throughout the film and there’s one massive bet towards the end of the film."),
("Hamilton", "Disney to Release Film Version of 'Hamilton' Stage Performance With Original Broadway Cast"),
("Ragnarok", "Netflix: Ragnarok Series Thoughts"),
("Mission: Impossible", "'Mission: Impossible' Sequels Won't Send Tom Cruise to Space, But Will Make Christopher McQuarrie Puke in a Bucket"),
("none", "Looking for a movie! Please help!"),
("none", "The World's First Vertical Format Blockbuster is Coming, Whether We Want It Or Not"),
("My Spy", "My Spy looks interesting"),
("Beast", "Morena Baccarin To Star In Espen Sandberg’s ‘Beast’ For Rakuten H Collective"),
("Parasite", "'Parasite' Ending Explained by Bong Joon-ho"),
("And Then We Dance", "I'm Levan Akin, director of AND THEN WE DANCED. AMA!!!"),
("Family Jewels", "Goldie Hawn, Bette Midler, Diane Keaton Starring in ‘Family Jewels’"),
("none", "Which unknown or slightly unknown actress/actor will become highly known next"),
("PAULA", "PAULA- Spanish short film with English subtitles about a father who gets stuck in the elevator with the young man that forced himself on his daughter, shot in 2018."),
("none", "Hi every one can someone please tell me the name of this me thank you"),
("Cats", "VFX Society Slams Oscar Jokes About ‘Cats,’ Says CGI ‘Will Not Compensate for a Story Told Badly’"),
("none", "Digital copies should always come with physical copies of movies"),
("Honey, I Shrunk the Kids", "Rick Moranis Returning for ‘Honey, I Shrunk the Kids’ Sequel Starring Josh Gad"),
("The Raid", "The Raid Remake Has a New Title and Isn't a Raid Remake Anymore"),
("none", "The Independent Film &amp; Television Alliance Files Complaint with U.S. Government, Saying U.S. Should Punish China for Cheating on Film Trade Deal By Making It Harder for Indie Films to Secure China Theatrical Release"),
("Sonic the Hedgehog", "‘Sonic The Hedgehog’ China Release Date Postponed Due To Coronavirus"),
("none", "Movies that had great potential/nice story idea, but let down by the execution?"),
("none", "Every Harrison Ford Movie Performance, Ranked"),
("Aliens", "Re-watching Aliens and The Terminator with this new perspective is awesome!"),
("The Bike", "Eric Bana to Write, Direct and Star in Film Based on Racer Mike 'The Bike' Hailwood"),
("none", "James Mangold addresses the magic of editing and not just the very long scenes that are popular now"),
("The Iron Mask", "THE IRON MASK Official Trailer (2020)"),
("Parasite", "Oscars 2020: What Parasite says about South Korea"),
("Knives Out", "Rian Johnson Reveals ‘Knives Out’ Sequel Details on Oscars Red Carpet"),
("Doctor Strange 2", "Doctor Strange 2 Writers Didn't Get the Chance to Submit Draft Before Exit"),
("Lord of the Rings", "I love movies especially lord of the rings. drew this in sharpie and thought this sub would like it"),
("none", "Now playing: A smalltown movie theater lives"),
("none", "Workflow Breakdown of Every 2020 Oscars Best Picture and Editing Nominee"),
("none", "Join us tomorrow March 19 for an AMA with Anthony Daniels here in r/movies"),
("none", "What do you consider to be the most wasted opportunity in a movie?"),
("none", "Golden Globes Changes Film Eligibility Rules in Wake of Coronavirus Crisis - The organization has temporarily suspended the rule that a film has to be screened for in the greater Los Angeles area this year."),
("none", "NBCUniversal to release new movies online to rent because of coronavirus"),
("Cats", "‘Cats’ Sweeps Razzie Awards With Six Wins Including Worst Picture After Ceremony Was Cancelled Due To Coronavirus"),
("Scooby Doo 2: Monsters Unleashed", "How was Scooby Doo 2: Monsters Unleashed(2004) not succesful enough for a third film?"),
("none", "Coronavirus: Universal to make current theatrical movies available for home viewing on Friday"),
("Jurassic Park Dominion", "Poster surfaced for the new Jurassic Park Dominion (2021)"),
("Quigley Down Under", "Quigley Down Under is an overlooked masterpiece."),
("Bloodshot", "Sony to release Bloodshot on VOD early, starting March 24th"),
("Lupin the 3rd: The First", "‘Lupin the 3rd: The First’ Anime Film Based on Monkey Punch Franchise Getting US Release From GKIDS"),
("none", "Jonah Hill Says Nobody Saw His Best Performance: Amazon F*cked Up"),
("none", "This is the the most powerful and moving scene in movie history to me, what is your favorite scene from a movie?"),
("none", "Official Coronavirus/Covid-19 Movies Megathread - Release Date Changes and Conditions of Actors (ongoing)"),
("Star Wars", "All theatrical Star Wars trailers remastered to 4K"),
("none", "Tom Hanks and Rita Wilson released from the hospital following coronavirus diagnosis"),
("none", "Idris Elba tests positive for coronavirus"),
("Bloodshot", "Vin Diesel's ‘Bloodshot’ Releasing on Digital Months Early (March 24th)"),
("Portrait of a Lady on Fire", "Not my post, but I wanted to share here!!!: The Criterion Collection to release Portrait of a Lady on Fire in June!"),
("none", "What are some great movie series to watch?"),
("Trolls World Tour", "Universal to release movies online while they are in theaters, starting with 'Trolls World Tour'"),
("Blast From the Past", "Blast From the Past (1999)"),
("none", "Staying open: Admiral Twin Drive-In receives permission to continue showing movies"),
("Harry Potter", "[Harry Potter] Why Umbridge's Theme Song is so Fitting (Let me know your thoughts!)"),
("none", "Suggestion: /r/Movies Nights for Community Viewing while in Self-Isolation"),
("The World's End", "The World's End"),
("none", "A day in the life under Coronavirus quarantine."),
("none", "Now is the Perfect Time to Support your Local Drive-In Theatre!"),
("Heat", "Anybody want to talk about 'Heat'?"),
("Cats", "The people that made CATS should have turned the production process into a documentary."),
("none", "‘It’s a Mess’: Studios, Unions Grapple With Pay for Production Crew Members Amid Shutdowns"),
("none", "An updated list of my top 25 favorite movies"),
("Parasite", "Parasite — The Power of Symbols"),
("none", "Best movies about American history"),
("none", "$20 for a Theatrical Movie On-Demand? Americans Want to Pay More Like $6"),
("none", "Anyone think they're gonna make a movie about the COVID pandemic in a decade or so?"),
("The Way Back", "Ben Affleck Drama ‘The Way Back’ Heads Into Homes As Theaters Close. Available March 24th."),
("L.A. Confidential", "'L.A. Confidential' is amazing"),
("Trolls World Tour", "Universal will begin to release new films online while they were still in theaters. The upcoming 'Trolls World Tour' will be the first movie to open simultaneously online and in theaters for the company."),
("none", "Cinemark Shuts Down Movie Theaters Indefinitely Due to Coronavirus"),
("none", "The Razzies just posted their results"),
("none", "Tom Hanks and Rita Wilson are playing gin rummy and singing while quarantined with a case of 'the blahs'"),
("Blade Runner", "Connecting with Blade Runner (1982)"),
("none", "Movies where the studio intro's are integrated into or transition smoothly into the movie"),
("Crocodile Dundee", "Crocodile Dundee actor Mark Blum dies of coronavirus aged 69"),
("Pig", "Neon Wins Domestic Rights to Nicolas Cage Revenge Thriller ‘Pig’ - Writer-director Michael Sarnoski’s film sees Cage as a reclusive truffle hunter in Oregon whose prize hunting pig is kidnapped, forcing him to return to old stomping grounds in Portland and confront his past."),
("none", "What are some bad movies that are just plain bad?"),
("Stand and Deliver", "Stand and Deliver (1988) remains a great film"),
("none", "Educational Movies for the Stay at Home Student"),
("Uncharted", "Sony’s Uncharted Postpones Filming Due to Coronavirus"),
("none", "New Sub: Reddit Writes a Screenplay - a new collaborative project at /r/wras"),
("Final Destination", "In honor of Final Destination turning 20 this is great read on what makes the series so great despite being so absurd."),
("Magnolia", "Magnolia: 'How To Fake Like You Are Nice And Caring'"),
("none", "Simple website that displays the current Health status of Tom Hanks"),
("The Three Thousand Years of Longing", "George Miller's Three Thousand Years Of Longing Sets Up A Bizarre Tilda Swinton/Idris Elba Film"),
("none", "Coronavirus Could Result In $5 Billion Loss For Global Box Office"),
("none", "Casting Director Says Adèle Haenel Has A “Well-Deserved Dead Career” After That Awards Protest"),
("none", "Weinstein Sentenced to 23 Years"),
("none", "I'm going into self isolation soon. I love horror, fantasy, and animated films. Gimme some suggestions on what I can watch during my two to three week quarantine."),
("none", "Staying inside? Every movie worth watching on Canadian streaming services, for every kind of viewer"),
("Voyage of Sinbad", "Celebrate Ray Harryhausen's 100th Birthday With This Amazing 7th Voyage of Sinbad Merch"),
("none", "The Movie Gauntlet: The Homemade Quarantine March Madness"),
("none", "The list of the 100 Italian films to be saved (Italian: 100 film italiani da salvare) was created with the aim to report “100 films that have changed the collective memory of the country between 1942 and 1978”."),
("none", "Top 10 Funniest Movie Moments"),
("none", "Cannes Postponed"),
("Terribly Happy", "Terribly Happy (2008) is insanely overlooked. It is almost like a very dark version of Hot Fuzz (2007). Definitely worth a watch!"),
("none", "Anthony Hopkins plays piano for cat during ‘preventative’ coronavirus quarantine"),
("none", "Swedish Distributors Teaming With Streaming Service To Release New Films Digitally Faster &amp; Financially Assist Closed Cinemas"),
("Dune", "Dennis Villeneuve is by far one of the best directors today"),
("none", "Spawn Creator Confirms Jamie Foxx Hasn’t Quit Movie"),
("Caddyshack 2", "Apparently making Caddyshack 2 is a tragic fate like...... dying. Post image"),
("Margin Call", "Margin Call - Senior Partners Emergency Meeting (2011)"),
("none", "Hollywood star Tom Hanks and his wife test positive For coronavirus"),
("Eyes Wide Shut", "EYES WIDE SHUT was nearly a Steve Martin comedy. Would it have worked?"),
("Uncut Gems", "Adam Sandler deserved at least a nomination for Uncut Gems"),
("Green Goblin", "2002 Green Goblin: Norman possessed or simply a mental disorder?"),
("none", "A PSA from Arnold Schwarzenegger"),
("True History of the Kelly Gang", "New Poster for 'True History of the Kelly Gang' - Starring George MacKay, Nicholas Hoult, Russell Crowe, Thomasin McKenzie, and Charlie Hunnam - The story of Australian bushranger Ned Kelly and his gang as they flee from authorities during the 1870s. - Directed by Justin Kurzel ('Assassin's Creed')"),
("Indiana Jones", "BREAKING NEWS! There's a FOURTH INDIANA JONES movie"),
("The Graduate", "Great analysis of The Graduate"),
("Cinerama", "This is Cinerama (1952) - 'People sat back in spellbound wonder as the scenic program flowed across the screen. It was really as though most of them were seeing motion pictures for the first time.... Cinerama is frankly and exclusively 'sensational,' in the literal sense of that word.'"),
("none", "What’s the greatest dialogue involving, about, or with a devil-esque character you’ve ever heard?"),
("Taxi Driver", "Is the ending of Taxi Driver all in Travis’ head?"),
("Dune", "First Image of Timothée Chalamet in Dune"),
("Apollo 13", "If you begin 'Apollo 13' at 9:17pm ET tonight, Tom Hanks will utter the infamous, 'Houston, we have a problem' line *exactly* 50 years after Jim Lovell said it for real - 10:08pm, on April 13, 1970."),
("Dune", "Exclusive new look at Dune"),
("none", "My wife and I started watching movies listed on this Top 100 Movies scratch-off poster when we started staying at home because of the Coronavirus. We just hit the halfway mark with 50/100 watched! 50 more to go!"),
("Minari", "First Poster for A24's Drama 'Minari' - Starring Steven Yeun - A Korean family moves to Arkansas to start a farm in the 1980s."),
("A Goofy Movie", "25 Years Ago, 'A Goofy Movie' Became Disney’s Most Unlikely Sleeper Hit - Originally Planned as a Small Direct-to-Video Film with Little Fanfare &amp; Attention, It's Grown to Become One of the Most Popular Animated Cult Classics of that Decade"),
("The Silencing", "First Image from Crime-Thriller 'The Silencing' - Starring Nikolaj Coster-Waldau - A reformed hunter living in isolation becomes involved in a deadly game of cat and mouse when he and the local Sheriff set out to track a vicious killer who may have kidnapped his daughter years ago."),
("Splash", "Disney Edited Daryl Hannah’s Butt Out of ‘Splash’ with Horrific CGI"),
("none", "Star Cinema Grill owner suing insurance company after told 'pandemic insurance' doesn't cover COVID-19 crisis"),
("Drive-Ins", "The Return of Drive-Ins Could Be the Theater Industry’s Last Hope"),
("Kung Fu hustle", "Kung Fu hustle is the best attempt a live action anime by far.."),
("none", "Different Storyboards and their end results."),
("Parasite", "‘Parasite’ Has Monster Streaming Debut for Hulu and Sets All-Time Records in One Week, Becoming the Most Streamed Independent or Foreign Language Film Ever on the Platform"),
("Soul", "Pixar’s ‘Soul’ Release Delayed Until November"),
("none", "Brian Dennehy Dies Aged 81"),
("none", "Cinemark Lays Off 17,500 Workers, Furloughs 50% of Corporate Staff"),
("none", "I never realized how much I relied on movie theaters to bring me happiness"),
("E.T.", "‘E.T.’ cinematographer Allen Daviau has died at 77"),
("3:10 to Yuma", "'3:10 to Yuma' (2007) is a great movie. If you're not familiar with Westerns, you'll enjoy it anyways"),
("Porno", "Horror-Comedy ‘Porno’ Cancelling Its Theatrical Release and Going Straight-to-VOD - It follows a group of seemingly wholesome young movie theatre employees who are tempted and terrorized by a sex demon."),
("Not Another Teen Movie", "Not Another Teen Movie is a f**king masterpiece"),
("none", "Wynn Handman Dies of COVID-19 at 97 - Famed Acting Teacher Who Worked With Such Actors as Dustin Hoffman, Morgan Freeman, John Leguizamo, Alec Baldwin, Allison Janney, Michael Douglas, James Caan, Christopher Walken, Denzel Washington, Burt Reynolds, and Robert de Niro Early In Their Careers"),
("Parasite", "English dubs for parasite?"),
("Capone", "Tom Hardy's Fonzo Retitled 'Capone' and Set for VOD Release"),
("none", "Cinemark Reveals Plans To Gradually Reopen Starting July 1st With Older Films, Lower Ticket Prices, and Reduced Capacities"),
("Dune", "New Image of Zendaya as Chani from Denis Villeneuve’s ‘Dune’"),
("none", "Steven Soderbergh has been tasked with leading a new Directors Guild of America committee that seeks to assess when halted productions may restart and Hollywood can resume work"),
("none", "Burt Reynolds as James Bond Deep Fake"),
("Dune", "The Official Treatment Poster for Denis Villeneuve's 'Dune'"),
("Capone", "CAPONE Trailer (2020) Tom Hardy as Al Capone"),
("The Cat Came Back", "The Cat Came Back (1988) - Academy Award-nominated animated short. Beautifully remastered and totally bonkers."),
("Macbeth", "Joel Coen &amp; Frances McDormand Describe Their ‘Macbeth’ As Thriller, Say the Characters’ Ages Changes the Story of the A24 Film Co-Starring Denzel Washington"),
("The Peanut Butter Falcon", "Shia LaBeouf gave two great performances in 'The Peanut Butter Falcon' and 'Honey Boy'."),
("Doctor Strange 2", "Marvel's Doctor Strange 2 gets Spider-Man's Sam Raimi as new director"),
("Fight Club", "David Fincher thinks Fight Club would make a great TV show. “I think Chuck Palahniuk’s characters are so rich and dense and layered and faceted. Certainly 80 per cent of his other writings would make amazing mini-series, if nothing else”."),
("Honey Boy", "Honey Boy (2019) was the most devastatingly honest film I've seen in years."),
("Dune", "'Dune' confirmed to be split into two movies"),
("Lord of The Rings", "HAPPENING NOW: Cast members of Lord of The Rings (Sean Astin &amp; Dominic Monaghan) are having a reunion to raise money for covid-19"),
("Bruce Lee: His Greatest Hits", "Criterion Collection Unveils Bruce Lee: His Greatest Hits Tribute"),
("Doctor Sleep", "The Doctor Sleep director's cut is excellent"),
("Coffee Shop Names", "The great Danny Pudi in something new: Coffee Shop Names (2020, trailer)"),
("none", "Gore Verbinski Set To Direct Untitled Animated Film For Netflix"),
("none", "Why aren't many adventure movies made anymore?"),
("none", "Canadian director Denis Villeneuve declared 'filmmaker of the decade' by Hollywood Critics Association"),
("Wolverine", "Hugh Jackman Has Made Peace With MCU Rebooting Wolverine - “I knew it was the right time for me to leave the party—not just for me, but for the character. Somebody else will pick it up and run with it. It’s too good of a character not to.'"),
("none", "Roger Corman Launches Short Film Fest Competition During Quarantine"),
("Trolls World Tour", "The Only Theater Screening New Movies Is a Drive-In in Ocala, Florida — The owner of the Ocala Drive-In told Vice why people still need a place to see 'Trolls World Tour.'"),
("none", "Leonardo DiCaprio, Robert De Niro Offer Walk-On Role In Upcoming Film"),
("none", "Gene Wilder - How to React Naturally"),
("none", "Cannes 2020: New Films By Edgar Wright, Chloe Zhao, Wes Anderson, Paul Verhoeven Were Expected To Debut (&amp; Still Might)"),
("none", "Brad Pitt's Best Performance"),
("none", "Bob Iger Thought He Was Leaving on Top. Now, He’s Fighting for Disney’s Life."),
("License to Kill", "License to Kill (1989) Has to be the most Gruesome James Bond film"),
("Star Wars", "Daisy Ridley wonders where all of the 'love' for Star Wars went"),
("Tigertail", "Netflix drama ‘Tigertail’ quietly roars thanks to Tzi Ma - Joan Chen also stars in the story about three generations of a Taiwanese family"),
("none", "12 Classic Chinese Films are Now Free on YouTube with English Subtitles"),
("none", "Movie trends post-Coronavirus"),
("none", "The home experience will never surpass going to the actual theater (for me)"),
("Venom: Let There Be Carnage", "‘Venom: Let There Be Carnage’ Delayed Until June 25, 2021"),
("none", "Amid Cash Crunch, AMC Theatres Plans to Raise $500 Million In Private Offering"),
("Mortal Kombat Legends: Scorpion's Revenge", "'Mortal Kombat Legends: Scorpion's Revenge' came out today and is one of the more awesome things released this year. And it's a straight to video release!"),
("The Thing", "Puppets, Prosthetics, and Bubblegum: How They Did The Iconic Chest Chomp Scene in 'The Thing'"),
("none", "Johnny Depp Officially Joins Instagram, Thanks Fans For ‘Unwavering Support’"),
("Freaked", "Freaked - re:View"),
("none", "San Diego Comic-Con Postponed for the First Time in 51-Year History"),
("Donnie Darko", "Is Donnie Darko worth a rewatch?"),
("none", "Sean Bean’s character always dies, but what other things have become associated with one particular actor?"),
("American Psycho", "American Psycho at 20: a vicious satire that remains as sharp as ever"),
("none", "Movies often associated to the wrong director?"),
("Blade Runner 2049", "Roger Deakins Refused to Shoot ‘Blade Runner 2049’ the ‘Sloppy’ Way Hollywood Studios Expect"),
("Dune", "Dune (2020) poster by me"),
("none", "Workers at Hollywood Reporter and Billboard Vandalize Website After Getting Laid Off"),
("Transformers", "All of Shia LaBeouf's screaming in the Transformers movies"),
("The Invitation", "“The Invitation” is one of the best indie suspense films I have seen in a long time."),
("Wolverine", "Hugh Jackman Dodged a 'Cats'-Shaped Bullet; Rules Out Playing Wolverine Again for the 100th Time"),
("none", "You can really feel the hole Bill Paxton left"),
("Kick-Ass", "'No Studio Would Touch It:' The Big Gamble Behind 'Kick-Ass' - As the genre-bending comic book movie turns ten, the key players look back at its battle to hit the big screen: 'Literally every person who saw it or read the script said 'No.' '"),
("Network", "Network (1976) is a must watch now more than ever before"),
("Die Hard With A Vengeance", "Die Hard With A Vengeance is the best Die Hard"),
("none", "A guide of where to start and what to expect from Stanley Kubrick"),
("none", "The original Terminator showed us how to properly give expostion in a movie."),
("News of the World", "Tom Hank's 'News of the World' (December 2020), is First Movie Filmed at New, Native American Owned Film Studio in New Mexico"),
("none", "Criterion to release all Bruce Lee films"),
("Dune", "'Dune': Zendaya Shares a New Photo of Her Character in Denis Villeneuve's Sci-Fi Epic"),
("A Nightmare on Elm Street: Dream Warriors", "'A Nightmare on Elm Street: Dream Warriors' is the best ANOES sequel. What are your thoughts on it?"),
("none", "Cannes Film Festival Won’t Take Place in June"),
("none", "Fred Willard has passed away. Jamie Lee Curtis posts news on social media."),
("none", "I worked for Paramount Pictures in 2012. All employees were given this poster as a commemorative gift - help me figure out a few of these! (More info in comments)"),
("Game Night", "Game Night is a phenomenal movie"),
("Rescue Rangers", "Lonely Island's Akiva Schaffer To Direct Disney's 'Rescue Rangers' Film"),
("Empire Strikes Back", "Beautiful new poster commemorating the 40th anniversary of Empire Strikes Back by Matt Ferguson"),
("none", "Italy will reopen cinemas and theatres on 15 June"),
("Best in Show", "Fred Willard in 'Best in Show'. RIP."),
("Kingsman: The Secret Service", "Kingsman: The Secret Service is one of the most rewatchable movies I’ve ever seen."),
("Da 5 Bloods", "First Image from Spike Lee's War Film 'Da 5 Bloods' - Starring Chadwick Boseman, Jonathan Majors, Jean Reno, and Paul Walter Hauser - Four veterans battle the forces of man and nature when they return to Vietnam seeking the remains of their fallen leader and the gold fortune he helped them hide."),
("Yesterday", "Hollywood Accounting In Action: Even though Danny Boyle's 2019 hit 'Yesterday' grossed $154M internationally on a $26M budget, a recent document from Universal shows that the movie is being reported as an $88M loss, meaning the studio doesn't have to pay any bonuses/points on the back-end."),
("Forrest Gump", "What Forrest Gump Actually Said During His Silenced Rally Speech"),
("Hugo", "Reflections on ‘Hugo’: The Late-Scorsese Masterpiece That Finds Love in Art and Family"),
("none", "I’m Ivo Gerscovich, Chief Brand Officer for Sonic Studio at SEGA of America. AMA!"),
("none", "Owen Wilson"),
("Hercules", "Joe and Anthony Russo Say Their ‘Hercules’ Remake Won’t Be A “Literal Translation”, and 'Music Will Certainly Be A Part Of It'."),
("Tenet", "The trailer for Tenet will debut in Fortnite later on tonight. Yes, you read that correctly."),
("The Batman", "The Batman Will Be 'Darker' Than Previous Films, Andy Serkis Hints"),
("John Wick", "New John Wick poster announcing that it’ll be streaming free on YouTube starting this Friday"),
("Eurotrip", "Eurotrip (2004), who else likes this gem?"),
("DA 5 BLOODS", "The first poster for Spike Lee’s ‘DA 5 BLOODS’"),
("Despicable Me", "Despicable Me (the first) is actually such a heartwarming movie"),
("Irresistible", "Jon Stewart’s Political Comedy ‘Irresistible’ to Debut on VOD June 26th"),
("The Matrix", "Wachowskis Release Cut Of 'The Matrix' Where Neo Just Takes The Blue Pill"),
("Edge of Tomorrow", "Edge of Tomorrow - how was this not a huge hit?"),
("none", "Jonah Hill Swears the Most of Any Film Actor, Study Shows"),
("Black Bear", "Aubrey Plaza's acclaimed psychological-thriller 'Black Bear' secures US distribution - A female filmmaker at a creative impasse seeks solace from her tumultuous past at rural retreat, only to find that the woods summon her inner demons in intense and surprising ways."),
("none", "I Really Miss Going To The Theatre"),
("none", "Portrait poster I made 😊"),
("none", "What are your favourite 30mins from any film?"),
("Baby Driver", "‘Baby Driver’ Director Edgar Wright Launches New Production Company 'Complete Fiction' - Will Exclusively Team With Netflix to Develop Film &amp; TV Projects, Starting with a Supernatural TV Series &amp; a Documentary"),
("Cornetto Trilogy", "A celebration of the Cornetto trilogy"),
("none", "Favorite directors with 5 movies or fewer"),
("Moana", "Just finished Moana (2016). Why did I sleep on this Disney pic for so long?"),
("Indiana Jones and The Last Crusade", "Indiana Jones and The Last Crusade has the best movie ending of all time."),
("none", "David Lynch Emerges From Quarantine to Give Weather Report"),
("Dune", "New Image From Dune"),
("Dune", "New Image from Denis Villeneuve’s Dune (2020)"),
("Interstellar", "That video log scene from Interstellar always manages to make me cry."),
("Last Night in Soho", "Last Night in Soho official poster (dir. Edgar Wright)"),
("Singin' in the Rain", "Singin' in the Rain (1952)"),
("none", "Just got this excellent poster as a gift. There's more that I haven't seen then I would have thought, and a few that I can't remember very well, so going to use this as a reason to watch again!"),
("The Old Guard", "Official Poster for ‘The Old Guard’ Starring Charlize Theron"),
("Titanic", "Titanic Departure Real Video 1912"),
("none", "Hoping you guys might appreciate this. I’m a (very) amateur woodworker and a few years ago decided to make a new tv stand. Decided to include images from my favorite movies and franchises in the top"),
("Tetsuo: Iron Man", "A Guide to Shinya Tsukamoto, director of cult Japanese classic 'Tetsuo: Iron Man'"),
("none", "Hayao Miyazaki on his new project (English subs, 2019)"),
("Insomnia", "Insomnia is a Christopher Nolan gem that doesn’t get enough attention"),
("none", "ABC announces the return of 'The Wonderful World of Disney', to air movies at home this Summer (starting May 20th)"),
("Candyman", "Wow! Candyman (1992) is a fantastic film!"),
("none", "JustWatch just isn't accurate anymore: Missing more than 25% of new Netflix titles"),
("Grave of the Fireflies", "Grave of the Fireflies is the saddest movie I've ever watched, and I can't get over it"),
("American Son", "Russell Crowe to Star in 'American Son', the Hollywood Remake of 'A Prophet' written by Dennis Lehane (Mystic River &amp; Gone Baby Gone)"),
("Species II", "‘Species II’ Is a Sleazy, Gory, Nutty Creature Feature Masterpiece"),
("none", "Have any other directors made fundamental changes to their movies after they've been released like George Lucas did?"),
("High Noon", "High Noon(1952) is one of the best westerns ever made and is a must watch for all cinephiles."),
("Justice League", "The only 'good' scene in Justice League (2017)"),
("The Mummy", "This Day in Horror History: THE MUMMY Was Unearthed in 1999"),
("Dune", "Dune Movie Image Reveals Josh Brolin &amp; Timothee Chalamet Action Scene"),
("Tommaso", "Tommaso – Abel Ferrare, Willem Dafoe – Official U.S. Trailer - Arrives VOD June 5th"),
("Shrek the Third", "Shrek the Third is not nearly as much of a trainwreck as everyone says. Anyone with me?"),
("none", "I decided to watch every film from the wiki list of films with 0% critic ratings on Rotten Tomatoes"),
("Capone", "'Capone' Starring Tom Hardy - Review Thread"),
("Empire Strikes Back", "Redditers who saw Empire Strikes Back 40 years ago: what was the Vader reveal like?"),
("Greyhound", "Tom Hanks WWII Movie 'Greyhound' Cancels Theatrical Release Due to Coronavirus, Will Release on Apple TV Instead"),
("Capone", "Tom Hardy ‘Capone’ Movie Takes In $2.5M+ On VOD Repping A Record For Vertical Entertainment"),
("DUNE", "A new image for Denis Villeneuve’s ‘DUNE’ has been officially released."),
("none", "Berlin's Windowflicks film projections – great idea!"),
("none", "RIP Jerry Stiller, back then this guy was so hot! he could take a crap, wrap it in tinfoil, put a couple of fishhooks on it, and sell it to Queen Elizabeth as earrings."),
("American Son", "Russell Crowe to Star as Mobster in Thriller ‘American Son’"),
("none", "Fandango (1985) Kevin Costner's first starring role is a lesser known cult classic"),
("Legally Blonde 3", "‘Legally Blonde 3’ Taps Mindy Kaling, Dan Goor as Writers"),
("none", "Mubi has made its whole archive of arthouse films available"),
("none", "Christopher Nolan, Edgar Wright, and Rian Johnson Surprise NYU Tisch Cinema Studies Students with Epic Graduation Speeches"),
("Avatar 2", "James Cameron Says ‘Avatar 2’ Release on Schedule Despite Shutdown"),
("Tenet", "TENET - NEW TRAILER"),
("Rio Bravo", "Rio Bravo (1959) vs. El Dorado (1966)"),
("none", "Cannes Will Announce Official Selection in June, but There’ll Be No Physical Edition This Year"),
("none", "Amazon Sued for Saying You've 'Bought' Movies that It can Take Away from You"),
("The Lighthouse", "The Lighthouse | Official Spongebob Trailer | A24"),]

Let's run the extractor on these post titles (calling the API in parallel for quicker results):

In [326]:
from concurrent.futures import ThreadPoolExecutor

extracted = []
# Run the model to extract the entities
with ThreadPoolExecutor(max_workers=8) as executor:
    for i in executor.map(cohereMovieExtractor.extract, test_df['text']):
        extracted.append(str(i).strip())
# Save results
test_df['extracted_text'] = extracted
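
Assigning the results straight back to the DataFrame works because `executor.map` returns results in the order of its inputs, even when the calls finish out of order. The sketch below illustrates this with `fake_extract`, a hypothetical stand-in for `cohereMovieExtractor.extract` that makes no API call.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fake_extract(title):
    """Hypothetical stand-in for cohereMovieExtractor.extract (no API call)."""
    # Longer titles sleep longer, so the tasks finish out of order
    time.sleep(0.01 * len(title))
    return f" {title.upper()} "

titles = ["a long post title", "short", "mid title"]

with ThreadPoolExecutor(max_workers=3) as executor:
    # map() yields results in input order, not completion order
    results = [str(r).strip() for r in executor.map(fake_extract, titles)]

print(results)
```

The first title takes the longest to process, yet its result still comes first, so each extracted value lines up with the row it came from.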

Let's look at some results:

In [306]:
test_df.head()

| Unnamed: 0 | text | label | extracted_text |
| --- | --- | --- | --- |
| 0 | Disney's streaming service loses some movies due to old licensing deals | none | none |
| 1 | Hi, I’m Sam Raimi, producer of THE GRUDGE which hits theaters tonight. Ask Me Anything! | The Grudge | The Grudge |
| 2 | 'Parasite' Named Best Picture by Australia's AACTA Awards | Parasite | Parasite |
| 3 | Danny Trejo To Star In Vampire Spaghetti Western ‘Death Rider in the House of Vampires’ | Death Rider in the House of Vampires | Death Rider |
| 4 | I really wish the 'realistic' CGI animal trend would end. | none | none |


Let's calculate the accuracy by comparing the extracted text to the labels:

In [332]:
# Compare the label to the extracted text
test_df['correct'] = (test_df['label'].str.lower() == test_df['extracted_text'].str.lower()).astype(int)

# Print the accuracy
print(f'Classification accuracy {test_df["correct"].mean() *100}%')

Classification accuracy 90.0%
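
This accuracy is an exact-match score: an extraction counts as correct only if it equals the label once both are lowercased. As a small sketch of what that implies, the snippet below applies the same comparison to the first five rows shown earlier, using plain Python lists in place of the `test_df` columns. The helper name `is_exact_match` is made up for illustration.

```python
# (label, extracted_text) pairs copied from the first five rows shown earlier
rows = [
    ("none", "none"),
    ("The Grudge", "The Grudge"),
    ("Parasite", "Parasite"),
    ("Death Rider in the House of Vampires", "Death Rider"),
    ("none", "none"),
]

def is_exact_match(label, extracted):
    """Case-insensitive exact match, mirroring the comparison above."""
    return label.lower() == extracted.lower()

correct = [int(is_exact_match(label, extracted)) for label, extracted in rows]
accuracy = 100 * sum(correct) / len(correct)

print(correct)
print(f"Accuracy on these five rows: {accuracy}%")
```

The fourth row is scored as wrong even though "Death Rider" is part of the labeled title, so exact match is a strict measure that gives no credit for partial extractions.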


We can look at the ones it got wrong:

In [328]:
test_df[test_df['correct']==0]

| Unnamed: 0 | text | label | extracted_text | correct |
| --- | --- | --- | --- | --- |
| 3 | Danny Trejo To Star In Vampire Spaghetti Western ‘Death Rider in the House of Vampires’ | Death Rider in the House of Vampires | Death Rider | 0 |
| 6 | De Niro recreating a scene from Goodfellas to test Irishman deaging (3:30 in) | Goodfellas | none | 0 |
| 12 | Is there anyway way I could get a copy of 1917 for my dying father in law? | 1917 | none | 0 |
| 30 | How Uncut Gems Won Over the Diamond District | Uncut Gems | none | 0 |
| 39 | 2019 in film - with 'Movies' by Weyes Blood | none | Movies | 0 |
| 57 | The Mad Max franchise is my all time favorite movie series. I finally watched Waterworld tonight. Oh man why didnt I see this sooner? | Mad Max | Waterworld | 0 |
| 63 | What is happening with Aneesh Chaganty's Run? | Run | none | 0 |
| 69 | How A New Hope created Pixar Animation Studios | Star Wars | none | 0 |
| 82 | New Wonder Woman image | Wonder Woman | none | 0 |
| 88 | Thoughts on the Irishman ... | The Irishman | none | 0 |


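The model got ten of the 100 test titles wrong. In most of these it returned "none" for a title that does mention a movie, such as "Goodfellas" or "Uncut Gems". The rest are understandable edge cases: it shortened "Death Rider in the House of Vampires" to "Death Rider", it chose "Waterworld" for a title that names both Mad Max and Waterworld, and it returned "none" for a title that says "A New Hope" while the label is "Star Wars".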
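
To see which kinds of mistakes dominate, the errors can be grouped by type. The sketch below does this for the ten misclassified rows listed above. The category names and the `error_type` helper are illustrative assumptions, not part of the original notebook.

```python
from collections import Counter

# (label, extracted_text) pairs copied from the ten misclassified rows above
errors = [
    ("Death Rider in the House of Vampires", "Death Rider"),
    ("Goodfellas", "none"),
    ("1917", "none"),
    ("Uncut Gems", "none"),
    ("none", "Movies"),
    ("Mad Max", "Waterworld"),
    ("Run", "none"),
    ("Star Wars", "none"),
    ("Wonder Woman", "none"),
    ("The Irishman", "none"),
]

def error_type(label, extracted):
    """Assign a misclassified row to a hypothetical error category."""
    label, extracted = label.lower(), extracted.lower()
    if extracted == "none":
        return "missed"        # a movie was labeled but nothing was extracted
    if label == "none":
        return "spurious"      # something was extracted but no movie was labeled
    if extracted in label or label in extracted:
        return "partial"       # the extraction overlaps with the labeled title
    return "wrong title"       # a different title was extracted

counts = Counter(error_type(label, extracted) for label, extracted in errors)
print(counts)
```

Seven of the ten errors fall into the "missed" group, which suggests the model leans toward answering "none" when a title is short or reads like ordinary words.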


We can look at the classification report for a more detailed look at what's included in the test set, and what the model got right and wrong:

In [329]:
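from sklearn.metrics import classification_report
import warnings

# Silence warnings, such as those sklearn raises for labels with no predicted samples
warnings.filterwarnings('ignore')

# Each distinct lowercased title is treated as its own class
print(classification_report(test_df['label'].str.lower(), test_df['extracted_text'].str.lower()))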

                                      precision    recall  f1-score   support

                                1917       1.00      0.50      0.67         2
               2001: a space odyssey       1.00      1.00      1.00         1
                            ad astra       1.00      1.00      1.00         1
     alice doesn't live here anymore       1.00      1.00      1.00         1
                       austin powers       1.00      1.00      1.00         1
               back to the future ii       1.00      1.00      1.00         1
                        blood simple       1.00      1.00      1.00         1
                   bohemian rhapsody       1.00      1.00      1.00         1
                         borderlands       1.00      1.00      1.00         1
                     brief encounter       1.00      1.00      1.00         1
                                cats       1.00      1.00      1.00         1
                               congo       1.00      1.00      
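
Because the report treats every distinct title as its own class, most rows have a support of 1. A coarser view, sketched below as one possible alternative, is to ask only whether the model detected that a movie was mentioned at all. The example uses rows 0-4, 6, and 39 from the outputs above with plain Python lists. The `mentions_movie` helper is a hypothetical name.

```python
# (label, extracted_text) pairs copied from rows 0-4, 6, and 39 shown above
pairs = [
    ("none", "none"),
    ("The Grudge", "The Grudge"),
    ("Parasite", "Parasite"),
    ("Death Rider in the House of Vampires", "Death Rider"),
    ("none", "none"),
    ("Goodfellas", "none"),
    ("none", "Movies"),
]

def mentions_movie(value):
    """True when the value is anything other than the "none" placeholder."""
    return value.strip().lower() != "none"

# Count detection outcomes, ignoring whether the extracted title is exact
tp = sum(mentions_movie(l) and mentions_movie(e) for l, e in pairs)
fp = sum(not mentions_movie(l) and mentions_movie(e) for l, e in pairs)
fn = sum(mentions_movie(l) and not mentions_movie(e) for l, e in pairs)

precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(f"precision={precision}, recall={recall}")
```

This view counts the shortened "Death Rider" extraction as a successful detection, so it complements the exact-match accuracy rather than replacing it.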

This type of extraction is interesting because it does not rely only on surface patterns in the text. The model picked up knowledge about movies during training, and that helps it perform the task from only a few examples.
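
As a rough sketch of what "a few examples" looks like in practice, the snippet below assembles a few-shot prompt from labeled `(label, title)` pairs taken from this notebook's labeled data. The `Post:` / `Movie:` layout, the `--` separator, and the `build_prompt` helper are assumptions made for illustration; the prompt format used by the extractor in this notebook may differ.

```python
# (label, title) pairs in the same order as the labeled tuples in this notebook
examples = [
    ("Magnolia", "Magnolia: 'How To Fake Like You Are Nice And Caring'"),
    ("none", "Weinstein Sentenced to 23 Years"),
    ("Uncut Gems", "Adam Sandler deserved at least a nomination for Uncut Gems"),
]

def build_prompt(examples, new_title):
    """Hypothetical prompt layout: labeled examples, then the unlabeled title."""
    blocks = [f"Post: {title}\nMovie: {label}" for label, title in examples]
    # The final block is left open so the model completes the movie name
    blocks.append(f"Post: {new_title}\nMovie:")
    return "\n--\n".join(blocks)

prompt = build_prompt(examples, "Is Donnie Darko worth a rewatch?")
print(prompt)
```

Including "none" examples alongside real titles shows the model what to answer when no movie is mentioned.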

You can think about extending this to other subreddits, to extract other kinds of entities and information. [Let us know in the forum](https://community.cohere.ai/) what you experiment with and what kinds of results you see!

Happy building!