# House of Gucci (Market Analysis)

### The goal of this project is to identify the optimal target audience for the movie 'House of Gucci':
- This notebook uses audience movie ratings for a list of 288 movies and demographic data corresponding to 210 cities in the United States.
- First, we find the 10 movies most similar to House of Gucci out of the 288 movies, based on overlapping movie tags in the content_metadata_tags dataset.
- Next, we find the top 5 geographic locations by market share for each of the 10 similar movies and obtain a list of average demographic data across each geographic location for each movie.
- Finally, we aggregate the top 10 movies' demographic data to determine the optimal set of demographics for the House of Gucci audience.

In [1]:
'''Imports Libraries and Data.'''

import pandas as pd
import numpy as np
from statistics import mode
import warnings

warnings.filterwarnings('ignore')

# Shows all columns in DataFrame
pd.set_option('display.max_columns', None)

# Imports data
demographics = pd.read_csv('Data/demographics.csv')
dma_market_share = pd.read_csv('Data/dma_market_share.csv')
house_of_gucci_tags = set(pd.read_csv('Data/house_of_gucci_tags.csv')['House of Gucci'].values)
content_metadata_tags = pd.read_csv('Data/content_metadata_tags.csv').drop(['US Release'], axis=1)

### Data Preprocessing

In [2]:
'''Converts Numeric Columns in String Form to Float.'''

for col in demographics.columns:
    if isinstance(demographics.at[0, col], str) and col not in ['DMA', 'DMA Name', 'Metric Type', 'DMA + Metric Type']:
        # regex=False treats ',' and '$' as literal characters ('$' is otherwise a regex anchor)
        demographics[col] = demographics[col].str.replace(',', '', regex=False)
        demographics[col] = demographics[col].str.replace('$', '', regex=False)
        demographics[col] = demographics[col].astype(float)
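
The conversion above can be tried on a small frame before running it on the full file. The sketch below is illustrative only: the column names and values are made up, and `pd.to_numeric` with `errors='coerce'` is swapped in for `astype(float)` so that unexpected strings become `NaN` instead of raising.

```python
import pandas as pd

# Hypothetical sample mimicking string-formatted numeric columns
sample = pd.DataFrame({
    'DMA Name': ['City A', 'City B'],
    'Median Income': ['$72,108', '$68,044'],
    'Population': ['7,452,000', '5,735,000'],
})

id_cols = ['DMA Name']  # identifier columns that must stay as text
for col in sample.columns.difference(id_cols):
    # Strip '$' and ',' in one pass with an explicit character class
    cleaned = sample[col].str.replace(r'[$,]', '', regex=True)
    # Unparseable values become NaN rather than raising an error
    sample[col] = pd.to_numeric(cleaned, errors='coerce')

print(sample.dtypes)
```

Whether coercing bad values to `NaN` is preferable to failing loudly depends on how clean the real file is; that choice is an assumption here, not something the notebook states.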

In [3]:
'''Excludes Canada and Superfluous Values From dma_market_share.'''

# Drops rows with placeholder ('-') or spreadsheet-error values
dma_market_share = dma_market_share.loc[~((dma_market_share['Film Share (All Weeks)'] == '-') | (dma_market_share['Index'] == '#VALUE!') | (dma_market_share['Index'] == '#DIV/0!'))]
# Drops Canadian markets
dma_market_share = dma_market_share.loc[~(dma_market_share['DMA Code'] == 'NA: Canada')]

# Converts columns in dma_market_share to float
dma_market_share['Typical Market Share'] = dma_market_share['Typical Market Share'].str.replace('%', '').astype(float)
dma_market_share['Film Share (All Weeks)'] = dma_market_share['Film Share (All Weeks)'].str.replace('%', '').astype(float)
dma_market_share['Index'] = dma_market_share['Index'].str.replace('%', '').astype(float)
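
As a rough check of the filtering and percent conversion, the same steps can be run on a toy frame. The rows below are invented for illustration; only the column names and the placeholder strings come from the notebook. Taking an explicit `.copy()` after filtering is an added precaution against chained-assignment warnings, not part of the original cell.

```python
import pandas as pd

# Hypothetical rows; the real data comes from Data/dma_market_share.csv
share = pd.DataFrame({
    'DMA Code': ['501', '803', 'NA: Canada', '602'],
    'Film Share (All Weeks)': ['7.5%', '-', '1.2%', '3.1%'],
    'Index': ['110%', '#VALUE!', '95%', '#DIV/0!'],
})

# Rows to exclude: placeholder share, spreadsheet errors, or Canada
mask = (
    (share['Film Share (All Weeks)'] == '-')
    | share['Index'].isin(['#VALUE!', '#DIV/0!'])
    | (share['DMA Code'] == 'NA: Canada')
)
clean = share.loc[~mask].copy()

# Strip the trailing percent sign and convert to float
for col in ['Film Share (All Weeks)', 'Index']:
    clean[col] = clean[col].str.rstrip('%').astype(float)

print(clean)
```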

### Finding All Existing Tags Within 288 Movies

In [4]:
'''Gets Intersection of Assignment 2 and Assignment 3's Movies and Stores in DataFrame.'''

# Drops unnamed columns
demographics = demographics.drop([i for i in demographics.columns if i[:7] == 'Unnamed'], axis=1)

# Gets unique film lists
unique_films = list(dma_market_share.Film.unique())
assignment_2_movies = list(content_metadata_tags.Name.unique())

# Finds movie overlap in lists
intersection_movies = list(set(unique_films).intersection(set(assignment_2_movies)))
intersection_df = content_metadata_tags.loc[content_metadata_tags['Name'].isin(intersection_movies)]
intersection_df = intersection_df.reset_index(drop=True)

# Changes index to movie names, transposes, and removes superfluous values
intersection_df.index = list(intersection_df.Name.values)
intersection_df = intersection_df.drop(['Name'], axis=1).T
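# regex=False matches the parentheses literally instead of treating them as a regex group
intersection_df = intersection_df[(intersection_df.index.str.contains('_KW', regex=False)) &
                                  ~(intersection_df.index.str.contains('(character)', regex=False)) &
                                  ~(intersection_df.index.str.contains('(franchise)', regex=False))]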
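
The difference between literal and regex matching matters for the tag filter above. In a regular expression, `(character)` is a capture group that matches the bare word `character`, so a tag merely containing that word would also be dropped. The tag names in this sketch are hypothetical and chosen only to show the difference.

```python
import pandas as pd

# Hypothetical tag labels, not taken from the dataset
tags = pd.Index([
    'spy_KW',
    'Batman (character)_KW',
    'character study_KW',
    'Marvel (franchise)_KW',
    'Drama_GENRE',
])

# Literal match: only tags containing the exact text "(character)"
literal = tags.str.contains('(character)', regex=False)

# Regex match on the bare word: what the pattern '(character)' reduces to
as_regex = tags.str.contains('character', regex=True)

# Keyword tags that are neither character nor franchise tags
kept = tags[
    tags.str.contains('_KW', regex=False)
    & ~literal
    & ~tags.str.contains('(franchise)', regex=False)
]
print(list(kept))
```

With literal matching, `character study_KW` survives the filter; with regex matching it would be removed along with the true character tags.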

In [5]:
'''Creates DataFrame With Movies' Tags.'''

film_tags = pd.DataFrame()
for col in intersection_df.columns:
    tags_i = list(intersection_df.loc[intersection_df[col] == 1][col].keys())
    
    for i, t in enumerate(tags_i):
        film_tags.at[i, col] = t
        
film_tags.head()

Unnamed: 0,10 Cloverfield Lane,12 Strong,About Last Night,Abraham Lincoln: Vampire Hunter,Act of Valor,Action Point,Addicted,Admission,Adrift,Alien: Covenant,Allied,Almost Christmas,American Hustle,American Made,American Ultra,Annabelle,Annabelle: Creation,Annihilation,Ant-Man,Argo,Arrival,Assassin's Creed,Atomic Blonde,Avengers: Age Of Ultron,Avengers: Infinity War,Bad Moms,Bad Samaritan,Baggage Claim,Barbershop: The Next Cut,Battle of the Sexes,Beauty and the Beast,Begin Again,Beirut,Big Hero 6,Black Mass,Black Panther,Blade Runner 2049,Blair Witch,Blended,Blockers,Book Club,Breaking In,Brick Mansions,Bridget Jones's Baby,Broken City,Burnt,Captain America: Civil War,Captain Underpants: The First Epic Movie,Cars 3,Chappaquiddick,Creed,Crimson Peak,Daddy's Home,Daddy's Home 2,Deadpool,Deadpool 2,Death Wish,Den of Thieves,Despicable Me 3,Detroit,Dirty Grandpa,Do You Believe?,Doctor Strange,Dolphin Tale 2,Don't Breathe,Dope,Downsizing,Dracula Untold,Dunkirk,Early Man,Elysium,Entourage,Epic,Escape From Planet Earth,Every Day,Everybody Wants Some!!,"Everything, Everything",Fantastic Four,Fences,Ferdinand,Fifty Shades Darker,Fifty Shades Freed,Finding Dory,Fist Fight,Flatliners,Florence Foster Jenkins,Focus,Frankenweenie,Free Birds,Fruitvale Station,Furious 7,Fury,Game Night,Geostorm,Get On Up,Get Out,Ghost in the Shell,Gifted,Girls Trip,God's Not Dead,God's Not Dead 2,Goosebumps,Gravity,Gringo,Grown Ups 2,Grudge Match,Hacksaw Ridge,Happy Death Day,Hereditary,Hitman: Agent 47,Home,Home Again,Hope Springs,Hostiles,Hot Tub Time Machine 2,Hotel Artemis,I Can Only Imagine,I Feel Pretty,"I, Tonya",Ice Age: Collision Course,Identity Thief,If I Stay,Inferno,Inside Out,Insidious: The Last Key,Interstellar,It,It Comes At Night,Jack Reacher,Jack Reacher: Never Go Back,Jack Ryan: Shadow Recruit,Jackass Presents: Bad Grandpa,Jackie,Jason Bourne,Jersey Boys,Jigsaw,John Wick,Joy,Jumanji: Welcome to the Jungle,Jurassic World,Justice League,Keanu,Kidnap,Kong: Skull Island,Krampus,La La 
Land,Last Vegas,Leap!,Lee Daniels' The Butler,Let's Be Cops,Life,Lights Out,Logan,Logan Lucky,London Has Fallen,Lone Survivor,"Love, Simon",Lucy,Mad Max: Fury Road,Magic Mike,Magic Mike XXL,Maleficent,Marshall,Max,Maze Runner: The Death Cure,Me Before You,Mechanic: Resurrection,Megan Leavey,Midnight Sun,Million Dollar Arm,Miracles From Heaven,Moana,Molly's Game,Monster Trucks,Moonlight,My Big Fat Greek Wedding 2,Neighbors,Neighbors 2: Sorority Rising,Nine Lives,No Good Deed,Nocturnal Animals,Non-Stop,Ocean's 8,Oculus,Overboard,Pacific Rim,Paddington,Paddington 2,Paper Towns,Paranormal Activity: The Ghost Dimension,Paranormal Activity: The Marked Ones,ParaNorman,Parental Guidance,Peter Rabbit,Pete's Dragon,Pitch Perfect 3,Pixels,Planes,Planes: Fire and Rescue,Poltergeist,Project Almanac,Project X,Prometheus,Proud Mary,Rampage,Ready Player One,Red Sparrow,Red Tails,Resident Evil: Retribution,Resident Evil: The Final Chapter,Ride Along,Ride Along 2,Rings,Rise of the Guardians,Risen,Rogue One: A Star Wars Story,Rough Night,Run All Night,Rush,Safe Haven,San Andreas,Savages,Saving Mr. Banks,Self/Less,Selma,Sgt. Stubby: An American Hero,Sherlock Gnomes,Show Dogs,Sicario,Sing,Sinister,Sinister 2,Skyfall,Sleepless,Smurfs: The Lost Village,Snowden,Solo: A Star Wars Story,Southpaw,Spectre,Split,Spotlight,Spy,Star Wars: The Force Awakens,Star Wars: The Last Jedi,Steve Jobs,Storks,Suicide Squad,Sully,Taken 2,Taken 3,Tammy,Thank You For Your Service,The Wedding Ringer,This Is 40,This Is The End,Thor: Ragnarok,Tomb Raider,Tomorrowland,Transcendence,Transformers: Age of Extinction,Transformers: The Last Knight,Trolls,Tully,Turbo,Tyler Perry's Acrimony,Unbroken,Underworld: Blood Wars,Unforgettable,Unfriended,Unsane,Upgrade,War Room,Warcraft,What To Expect When You're Expecting,Whiskey Tango Foxtrot,Why Him?,Wild,Winchester,Wind River,Wish Upon,Wonder,Wonder Woman,World War Z,Wreck-It Ralph,X-Men: Apocalypse,Zootopia
0,2010s_KW,IMAX_KW,2010s_KW,1800s_KW,terrorism_KW,2010s_KW,2010s_KW,high school_KW,female protagonist_KW,IMAX_KW,war_KW,2010s_KW,CIA_KW,CIA_KW,CIA_KW,eerie_KW,eerie_KW,2010s_KW,IMAX_KW,escape_KW,female protagonist_KW,2010s_KW,female protagonist_KW,IMAX_KW,IMAX_KW,2010s_KW,2010s_KW,female protagonist_KW,2010s_KW,female protagonist_KW,IMAX_KW,2010s_KW,CIA_KW,book_KW,book_KW,book_KW,crisis_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,intense action_KW,2010s_KW,New York_KW,2010s_KW,IMAX_KW,2010s_KW,2010s_KW,car accident_KW,2010s_KW,eerie_KW,2010s_KW,2010s_KW,2010s_KW,book_KW,mysterious_KW,2010s_KW,2010s_KW,CIA_KW,2010s_KW,2010s_KW,2010s_KW,true story_KW,2010s_KW,2010s_KW,CIA_KW,battle_KW,IMAX_KW,survival_KW,CIA_KW,2010s_KW,teen_KW,escape_KW,2010s_KW,true story_KW,2010s_KW,2010s_KW,African-American_KW,escape_KW,2010s_KW,2010s_KW,IMAX_KW,2010s_KW,2010s_KW,CIA_KW,2010s_KW,IMAX_KW,buddy_KW,early 2000s_KW,2010s_KW,survival_KW,2010s_KW,IMAX_KW,CIA_KW,2010s_KW,female protagonist_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,female protagonist_KW,2010s_KW,buddy_KW,CIA_KW,battle_KW,2010s_KW,2010s_KW,2010s_KW,book_KW,2010s_KW,Black List_KW,survival_KW,buddy_KW,violent_KW,2010s_KW,2010s_KW,female protagonist_KW,funny_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,apocalypse_KW,eerie_KW,2010s_KW,intense action_KW,2010s_KW,IMAX_KW,buddy_KW,female protagonist_KW,2010s_KW,CIA_KW,2010s_KW,2010s_KW,female protagonist_KW,2010s_KW,intense action_KW,book_KW,2010s_KW,2010s_KW,IMAX_KW,2010s_KW,2010s_KW,New York_KW,female protagonist_KW,CIA_KW,2010s_KW,contained thriller_KW,2010s_KW,IMAX_KW,2010s_KW,mysterious_KW,survival_KW,2010s_KW,2010s_KW,escape_KW,2010s_KW,2010s_KW,female protagonist_KW,true story_KW,teen_KW,survival_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,teen_KW,book_KW,female protagonist_KW,female 
protagonist_KW,2010s_KW,psychological_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,IMAX_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,buddy_KW,2010s_KW,parenthood_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,Navy (U.S.)_KW,2010s_KW,2010s_KW,2010s_KW,teen_KW,IMAX_KW,2010s_KW,IMAX_KW,escape_KW,2010s_KW,military_KW,3-D_KW,survival_KW,2010s_KW,2010s_KW,2010s_KW,IMAX_KW,military_KW,IMAX_KW,2010s_KW,2010s_KW,car accident_KW,crime_KW,2010s_KW,kidnap_KW,true story_KW,2010s_KW,CIA_KW,hero_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,murder_KW,2010s_KW,escape_KW,2010s_KW,2010s_KW,2010s_KW,hero_KW,2010s_KW,IMAX_KW,2010s_KW,2001_KW,2010s_KW,female protagonist_KW,female protagonist_KW,book_KW,war_KW,IMAX_KW,IMAX_KW,2010s_KW,2010s_KW,2010s_KW,psychological_KW,2010s_KW,music_KW,apocalypse_KW,IMAX_KW,female protagonist_KW,2010s_KW,CIA_KW,2010s_KW,IMAX_KW,buddy_KW,2010s_KW,fish-out-of-water_KW,2010s_KW,survival_KW,2010s_KW,2010s_KW,2010s_KW,2010s_KW,CIA_KW,2010s_KW,survival_KW,book_KW,female protagonist_KW,2010s_KW,female protagonist_KW,female protagonist_KW,2010s_KW,female protagonist_KW,2010s_KW,female protagonist_KW,apocalypse_KW,hero_KW,apocalypse_KW,perseverance_KW
1,apocalypse_KW,2001_KW,friendship_KW,Black List_KW,armed forces (U.S.)_KW,1970s_KW,female protagonist_KW,mother/son_KW,isolated_KW,survival_KW,spy_KW,touching_KW,true story_KW,true story_KW,government_KW,1960s_KW,survival_KW,eerie_KW,book_KW,CIA_KW,mysterious_KW,intense action_KW,mysterious_KW,book_KW,battle_KW,women_KW,kidnap_KW,African-American_KW,CIA_KW,CIA_KW,music_KW,CIA_KW,undercover_KW,funny_KW,true story_KW,CIA_KW,IMAX_KW,mysterious_KW,South_KW,teen_KW,female protagonist_KW,contained thriller_KW,kidnap_KW,female protagonist_KW,Black List_KW,Black List_KW,book_KW,book_KW,touching_KW,crisis_KW,crisis_KW,IMAX_KW,funny_KW,buddy_KW,IMAX_KW,CIA_KW,intense action_KW,intense action_KW,crisis_KW,true story_KW,Black List_KW,inspirational_KW,IMAX_KW,war_KW,contained thriller_KW,survival_KW,money_KW,war_KW,hero_KW,overcoming adversity_KW,kidnap_KW,entertainment_KW,war_KW,hero_KW,mysterious_KW,coming-of-age_KW,female protagonist_KW,battle_KW,race relations_KW,book_KW,book_KW,book_KW,war_KW,high school_KW,eerie_KW,true story_KW,IMAX_KW,3-D_KW,funny_KW,true story_KW,IMAX_KW,hero_KW,mysterious_KW,hero_KW,true story_KW,isolated_KW,IMAX_KW,battle_KW,Louisiana_KW,courage_KW,war_KW,teen_KW,IMAX_KW,kidnap_KW,friendship_KW,retirement_KW,hero_KW,college/university_KW,female protagonist_KW,battle_KW,buddy_KW,single mother_KW,ship_KW,dark_KW,friendship_KW,Black List_KW,music_KW,female protagonist_KW,true story_KW,animal_KW,buddy_KW,car accident_KW,crisis_KW,psychological_KW,eerie_KW,IMAX_KW,mysterious_KW,isolated_KW,military_KW,intense action_KW,Afghanistan_KW,journey_KW,true story_KW,survival_KW,musician_KW,psychological_KW,intense action_KW,true story_KW,survival_KW,nature_KW,CIA_KW,buddy_KW,Louisiana_KW,mysterious_KW,ship_KW,CIA_KW,buddy_KW,touching_KW,true story_KW,buddy_KW,survival_KW,mysterious_KW,book_KW,crime_KW,CIA_KW,Afghanistan_KW,teen_KW,female protagonist_KW,IMAX_KW,performer_KW,South_KW,IMAX_KW,inspirational_KW,teenage_KW,teen_KW,book_KW,intense action_KW,female 
protagonist_KW,female protagonist_KW,early 2000s_KW,true story_KW,hero_KW,psychological_KW,teen_KW,teen_KW,teen_KW,college/university_KW,battle_KW,funny_KW,survival_KW,female protagonist_KW,contained thriller_KW,New York_KW,eerie_KW,fish-out-of-water_KW,survival_KW,book_KW,book_KW,mysterious_KW,New York_KW,high school_KW,touching_KW,parent_KW,book_KW,war_KW,music_KW,war_KW,friendship_KW,hero_KW,scary_KW,teen_KW,high school_KW,mysterious_KW,female protagonist_KW,survival_KW,IMAX_KW,female protagonist_KW,war_KW,sequel_KW,intense action_KW,buddy_KW,funny_KW,teen_KW,hero_KW,Antiquity_KW,hero_KW,female protagonist_KW,New York_KW,true story_KW,police_KW,survival_KW,drugs_KW,war_KW,Louisiana_KW,true story_KW,heroic_KW,funny_KW,kidnap_KW,CIA_KW,courage_KW,investigation_KW,mysterious_KW,IMAX_KW,intense action_KW,funny_KW,book_KW,heroic_KW,dark_KW,CIA_KW,teen_KW,early 2000s_KW,female protagonist_KW,IMAX_KW,IMAX_KW,true story_KW,funny_KW,book_KW,book_KW,kidnap_KW,escape_KW,female protagonist_KW,Afghanistan_KW,buddy_KW,motherhood_KW,survival_KW,book_KW,IMAX_KW,female protagonist_KW,2012_KW,IMAX_KW,intense action_KW,music_KW,female protagonist_KW,accident_KW,psychological_KW,book_KW,battle_KW,survival_KW,teen_KW,female protagonist_KW,violent_KW,war_KW,intense action_KW,pregnancy_KW,Afghanistan_KW,college/university_KW,true story_KW,true story_KW,mysterious_KW,teen_KW,war_KW,book_KW,intense action_KW,friendship_KW,battle_KW,race relations_KW
2,car accident_KW,Afghanistan_KW,funny_KW,3-D_KW,,1979_KW,sex_KW,motherhood_KW,stranded_KW,CIA_KW,1940s_KW,funny_KW,betrayal_KW,Black List_KW,drugs_KW,period_KW,period_KW,female protagonist_KW,hero_KW,Middle East_KW,military_KW,war_KW,intense action_KW,hero_KW,book_KW,motherhood_KW,kidnapping_KW,marriage_KW,African-American_KW,true story_KW,touching_KW,musician_KW,political_KW,comic_KW,dark_KW,hero_KW,CIA_KW,survival_KW,mother/son_KW,2012_KW,book_KW,survival_KW,kidnapping_KW,motherhood_KW,political_KW,money_KW,hero_KW,hero_KW,funny_KW,CIA_KW,inspirational_KW,1800s_KW,broad comedy_KW,broad comedy_KW,book_KW,hero_KW,violence_KW,heist_KW,funny_KW,African-American_KW,funny_KW,United States_KW,book_KW,touching_KW,survival_KW,drugs_KW,funny_KW,dark_KW,heroic_KW,funny_KW,money_KW,friendship_KW,father/daughter_KW,secrets_KW,teen_KW,period_KW,teen_KW,book_KW,period_KW,war_KW,sea_KW,sex_KW,touching_KW,school_KW,mysterious_KW,musician_KW,Louisiana_KW,dog_KW,animal_KW,African-American_KW,intense action_KW,heroic_KW,kidnap_KW,heroic_KW,musician_KW,mysterious_KW,book_KW,Black List_KW,African-American_KW,uplifting_KW,high school_KW,funny_KW,isolated_KW,violent_KW,ship_KW,father/son_KW,heroic_KW,student_KW,psychological_KW,intense action_KW,fish-out-of-water_KW,funny_KW,marriage_KW,Mexico_KW,funny_KW,crime_KW,singer_KW,psychological_KW,violence_KW,sequel_KW,crime_KW,teen_KW,IMAX_KW,school_KW,scary_KW,mysterious_KW,dark_KW,mysterious_KW,soldier_KW,military_KW,CIA_KW,bawdy_KW,Black List_KW,CIA_KW,mob_KW,survival_KW,New York_KW,New York_KW,teen_KW,suspenseful_KW,hero_KW,undercover_KW,kidnap_KW,intense action_KW,relationship_KW,musician_KW,singer_KW,school_KW,African-American_KW,law enforcement_KW,scary_KW,survival_KW,CIA_KW,heist_KW,intense action_KW,battle_KW,coming-of-age_KW,survival_KW,survival_KW,touching_KW,entertainment_KW,betrayal_KW,race relations_KW,Afghanistan_KW,dark_KW,touching_KW,kidnap_KW,hero_KW,teen_KW,true story_KW,inspirational_KW,war_KW,book_KW,high 
school_KW,African-American_KW,funny_KW,student_KW,college/university_KW,father/daughter_KW,courage_KW,wife_KW,terrorism_KW,heist_KW,death_KW,funny_KW,intense action_KW,war_KW,war_KW,teen_KW,death_KW,school_KW,funny_KW,family_KW,England_KW,uplifting_KW,singer_KW,funny_KW,ship_KW,heroic_KW,suspenseful_KW,2012_KW,school_KW,CIA_KW,intense action_KW,intense action_KW,teen_KW,book_KW,race relations_KW,evil_KW,suspenseful_KW,crime_KW,police_KW,death_KW,3-D_KW,death_KW,heroic_KW,Black List_KW,violence_KW,celebrity_KW,ship_KW,hero_KW,novel_KW,Black List_KW,mysterious_KW,African-American_KW,military_KW,3-D_KW,kidnapping_KW,violence_KW,music_KW,journalism_KW,mother/son_KW,kidnap_KW,kidnap_KW,fan_KW,CIA_KW,war_KW,overcoming adversity_KW,intense action_KW,kidnap_KW,true story_KW,nuclear_KW,hero_KW,battle_KW,1970s_KW,animal_KW,CIA_KW,early 2000s_KW,father/daughter_KW,CIA_KW,buddy_KW,book_KW,friendship_KW,fatherhood_KW,drugs_KW,CIA_KW,survival_KW,IMAX_KW,Black List_KW,survival_KW,3-D_KW,funny_KW,New York_KW,animal_KW,betrayal_KW,hero_KW,intense action_KW,violence_KW,revenge_KW,psychological_KW,murder_KW,inspirational_KW,war_KW,,book_KW,funny_KW,overcoming adversity_KW,1800s_KW,CIA_KW,teenage_KW,New York_KW,CIA_KW,war_KW,invasion_KW,book_KW,buddy_KW
3,contained thriller_KW,battle_KW,sex_KW,novel_KW,,funny_KW,marriage_KW,college/university_KW,survival_KW,intense action_KW,marriage_KW,reconciliation_KW,New York_KW,drug trafficking_KW,funny_KW,scary_KW,death_KW,mysterious_KW,heroic_KW,true story_KW,2012_KW,murder_KW,war_KW,heroic_KW,hero_KW,funny_KW,crime_KW,siblings_KW,political_KW,inspirational_KW,father/daughter_KW,New York_KW,1970s_KW,3-D_KW,violence_KW,intense action_KW,dark_KW,student_KW,single mother_KW,Black List_KW,friendship_KW,death_KW,undercover_KW,funny_KW,police_KW,touching_KW,intense action_KW,school_KW,Florida_KW,true story_KW,perseverance_KW,19th century_KW,fatherhood_KW,fatherhood_KW,CIA_KW,violent_KW,crime_KW,law enforcement_KW,siblings_KW,race relations_KW,spring break_KW,Christianity_KW,hero_KW,love_KW,teen_KW,high school_KW,social issues_KW,violent_KW,military_KW,underdog_KW,social issues_KW,funny_KW,death_KW,siblings_KW,teenage_KW,college/university_KW,coming-of-age_KW,hero_KW,prejudice_KW,courage_KW,sex_KW,discovery_KW,friendship_KW,funny_KW,drugs_KW,New York_KW,revenge_KW,stop motion animation_KW,travel_KW,race relations_KW,Middle East_KW,military_KW,kidnapping_KW,intense action_KW,Black List_KW,CIA_KW,intense action_KW,mother/son_KW,women_KW,college/university_KW,school_KW,television series_KW,survival_KW,drugs_KW,marriage_KW,sports_KW,military_KW,funny_KW,dark_KW,violence_KW,funny_KW,middle-aged_KW,relationship_KW,journey_KW,murder_KW,Los Angeles_KW,touching_KW,New York_KW,Black List_KW,talking animal_KW,funny_KW,CIA_KW,mysterious_KW,funny_KW,suspenseful_KW,survival_KW,period_KW,husband_KW,murder_KW,friendship_KW,terrorism_KW,child_KW,1960s_KW,intense action_KW,1960s_KW,violence_KW,violence_KW,perseverance_KW,book_KW,vacation_KW,heroic_KW,fish-out-of-water_KW,kidnapping_KW,soldier_KW,scary_KW,music_KW,friendship_KW,funny_KW,race relations_KW,mob_KW,suspenseful_KW,dark_KW,hero_KW,funny_KW,terrorism_KW,book_KW,high 
school_KW,CIA_KW,dark_KW,friendship_KW,music_KW,touching_KW,period_KW,military_KW,novel_KW,tragedy_KW,kidnapping_KW,military_KW,teenage_KW,war_KW,death_KW,South_KW,early 2000s_KW,friendship_KW,betrayal_KW,Greek_KW,funny_KW,broad comedy_KW,wife_KW,scary_KW,death_KW,New York_KW,funny_KW,ship_KW,accident_KW,war_KW,touching_KW,touching_KW,teenage_KW,scary_KW,murder_KW,suspenseful_KW,,animal_KW,friendship_KW,friendship_KW,invasion_KW,competition_KW,courage_KW,3-D_KW,Black List_KW,,death_KW,violent_KW,military_KW,teenage_KW,CIA_KW,1940s_KW,video game_KW,sequel_KW,law enforcement_KW,Florida_KW,scary_KW,child_KW,sea_KW,intense action_KW,crime_KW,2012_KW,1970s_KW,relationship_KW,intense action_KW,cartel_KW,1960s_KW,Black List_KW,human rights_KW,patriotic_KW,sequel_KW,New York_KW,violent_KW,performer_KW,,single mother_KW,kidnapping_KW,kidnapping_KW,animal_KW,true story_KW,space_KW,tragedy_KW,crime_KW,kidnapping_KW,Black List_KW,CIA_KW,war_KW,CIA_KW,father/daughter_KW,3-D_KW,hero_KW,hero_KW,wife_KW,intense action_KW,funny_KW,military_KW,funny_KW,sex_KW,celebrity_KW,hero_KW,hero_KW,mysterious_KW,experiment_KW,battle_KW,sequel_KW,journey_KW,women_KW,nature_KW,dark_KW,heroic_KW,betrayal_KW,violent_KW,ship_KW,survival_KW,experiment_KW,overcoming adversity_KW,violence_KW,,true story_KW,father/daughter_KW,uplifting_KW,guns_KW,violence_KW,dark_KW,overcoming adversity_KW,hero_KW,South_KW,ship_KW,hero_KW,law enforcement_KW
4,crisis_KW,book_KW,ship_KW,historical_KW,,slapstick_KW,therapy_KW,school_KW,two-hander_KW,gory_KW,suspenseful_KW,estranged_KW,Black List_KW,guns_KW,police_KW,suspenseful_KW,scary_KW,psychological_KW,heist_KW,kidnap_KW,government_KW,father/daughter_KW,graphic novel_KW,intense action_KW,intense action_KW,irreverent_KW,suspenseful_KW,sister_KW,friendship_KW,political_KW,love_KW,music_KW,period_KW,technology_KW,violent_KW,South_KW,3-D_KW,scary_KW,funny_KW,high school_KW,funny_KW,suspenseful_KW,police_KW,slapstick_KW,revenge_KW,fish-out-of-water_KW,government_KW,broad comedy_KW,competition_KW,Black List_KW,coming-of-age_KW,period_KW,parenthood_KW,parenthood_KW,hero_KW,funny_KW,wife_KW,police_KW,sequel_KW,racism_KW,Florida_KW,faith_KW,comic_KW,animal_KW,kidnap_KW,school_KW,discovery_KW,Black List_KW,true story_KW,technology_KW,space_KW,television series_KW,queen_KW,3-D_KW,fan_KW,friendship_KW,love_KW,heroic_KW,father/son_KW,touching_KW,discovery_KW,novel_KW,funny_KW,broad comedy_KW,college/university_KW,music_KW,sex_KW,short_KW,historical_KW,racism_KW,violent_KW,war_KW,crime_KW,war_KW,drugs_KW,New York_KW,government_KW,touching_KW,friendship_KW,student_KW,student_KW,father/daughter_KW,two-hander_KW,Mexico_KW,child_KW,competition_KW,true story_KW,irreverent_KW,death_KW,violent_KW,invasion_KW,mother/daughter_KW,Black List (2008)_KW,New Mexico_KW,ship_KW,future_KW,father/son_KW,funny_KW,crime_KW,king_KW,Florida_KW,musician_KW,book_KW,discovery_KW,sinister_KW,hero_KW,friendship_KW,nature_KW,novel_KW,ship_KW,kidnap_KW,parent_KW,touching_KW,Europe_KW,celebrity_KW,murder_KW,violent_KW,single mother_KW,funny_KW,3-D_KW,intense action_KW,funny_KW,mother/son_KW,1970s_KW,child_KW,performer_KW,fish-out-of-water_KW,France_KW,racism_KW,buddy cop_KW,ensemble_KW,scary_KW,intense action_KW,racing_KW,United States_KW,CIA_KW,touching_KW,intense action_KW,violent_KW,funny_KW,performer_KW,invasion_KW,prejudice_KW,true story_KW,novel series_KW,England_KW,revenge_KW,true 
story_KW,coming-of-age_KW,fish-out-of-water_KW,afterlife_KW,uplifting_KW,true story_KW,school_KW,violence_KW,celebration_KW,fraternity_KW,fraternity_KW,father/son_KW,suspenseful_KW,discovery_KW,law enforcement_KW,ocean_KW,relationship_KW,boat_KW,death_KW,fish-out-of-water_KW,funny_KW,book_KW,3-D_KW,scary_KW,3-D_KW,,child_KW,death_KW,funny_KW,1980s_KW,racing_KW,touching_KW,child_KW,high school_KW,,suspenseful_KW,crime_KW,animal_KW,intense action_KW,dark_KW,ensemble_KW,game_KW,evil_KW,undercover_KW,marriage_KW,ticking clock_KW,children_KW,epic_KW,war_KW,women_KW,violent_KW,sex_KW,secrets_KW,estranged_KW,car_KW,behind the scenes_KW,cancer_KW,inspirational_KW,true story_KW,disappearance_KW,buddy_KW,crime_KW,singer_KW,,murder_KW,government_KW,drugs_KW,3-D_KW,government_KW,origin story_KW,father/daughter_KW,Mexico_KW,suspenseful_KW,touching_KW,war_KW,betrayal_KW,hero_KW,1980s_KW,heartwarming_KW,Special Forces_KW,heroic_KW,revenge_KW,kidnap_KW,family_KW,true story_KW,slapstick_KW,ship_KW,friendship_KW,intense action_KW,intense action_KW,teen_KW,terrorist_KW,intense action_KW,save the world_KW,quest_KW,motherhood_KW,siblings_KW,revenge_KW,military_KW,love_KW,murder_KW,relationship_KW,overcoming adversity_KW,death_KW,marriage_KW,warrior_KW,,war_KW,competition_KW,death_KW,revenge_KW,violent_KW,Black List_KW,uplifting_KW,intense action_KW,Black List_KW,3-D_KW,intense action_KW,funny_KW


### Finds Overlap of 288 Movies' and House of Gucci's Tags

- Sorts Columns by Most-Similar Movies

In [6]:
'''Creates Dictionary with Each Film's Tag Overlap with House of Gucci.'''

film_ratings = {}
for col in film_tags.columns:
    # Tags assigned to this film (NaN padding never matches a real tag)
    film_i_tags = set(film_tags[col].values)
    overlap = house_of_gucci_tags.intersection(film_i_tags)
    
    film_ratings[col] = [len(overlap), list(overlap)]
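
The overlap count used here is a plain set intersection size. A minimal sketch of the ranking idea follows, using made-up films and tags. The Jaccard score is shown only as one possible normalisation for films with very different tag counts; it is an assumption of this example and not a method the notebook uses.

```python
import pandas as pd

# Hypothetical target tags and per-film tag sets
target_tags = {'family_KW', 'murder_KW', 'fashion_KW', 'true story_KW'}
film_tag_sets = {
    'Film A': {'family_KW', 'murder_KW', 'true story_KW'},
    'Film B': {'funny_KW', 'family_KW'},
    'Film C': {'IMAX_KW', 'hero_KW'},
}

# Raw overlap: number of tags shared with the target film
overlap_counts = pd.Series(
    {film: len(target_tags & tags) for film, tags in film_tag_sets.items()}
)

# Optional normalisation: shared tags divided by all distinct tags
jaccard = pd.Series(
    {film: len(target_tags & tags) / len(target_tags | tags)
     for film, tags in film_tag_sets.items()}
)

# The two most similar films by raw overlap
top = overlap_counts.sort_values(ascending=False).head(2)
print(top)
```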

In [7]:
'''Creates DataFrame with Overlapping Tags.'''

overlap_df = pd.DataFrame()
for film in film_ratings.keys():
    # Stored as a string so the column can also hold tag names; cast back to int below
    tags_list_len = str(film_ratings[film][0])
    shared_tags = film_ratings[film][1]
    
    overlap_df.at['length', film] = tags_list_len
    
    for t in range(int(tags_list_len)):
        overlap_df.at[t+1, film] = shared_tags[t]
        
# Sorts by length of tag overlap list
overlap_df = overlap_df.T
overlap_df['length'] = overlap_df['length'].astype(int)
overlap_df = overlap_df.sort_values(by=['length'], ascending=False)
overlap_df = overlap_df.T
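
A ragged table like this can also be built in one step from a dictionary of `pd.Series`, which avoids cell-by-cell assignment. This is an alternative sketch on invented data, not the notebook's method; the names `overlaps`, `table`, and `lengths` are hypothetical.

```python
import pandas as pd

# Hypothetical overlap lists of different lengths
overlaps = {
    'Film A': ['family_KW', 'murder_KW'],
    'Film B': ['family_KW'],
    'Film C': [],
}

# Each list becomes a column; shorter columns are padded with NaN
table = pd.DataFrame(
    {film: pd.Series(tags, dtype='object') for film, tags in overlaps.items()}
)

# Overlap length per film, computed from the non-missing cells
lengths = table.notna().sum()

# Reorder columns from most to least overlap
table = table[lengths.sort_values(ascending=False).index]
print(table)
```

Keeping the lengths in a separate Series means the tag columns never mix counts with tag names, so no string-to-integer round trip is needed.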

In [8]:
overlap_df.head()

Unnamed: 0,"I, Tonya",Molly's Game,Creed,Black Mass,Taken 3,Hereditary,Star Wars: The Force Awakens,Jackie,Saving Mr. Banks,Moonlight,Tully,Gifted,Jack Ryan: Shadow Recruit,Rush,Run All Night,Red Sparrow,Steve Jobs,Unforgettable,Fruitvale Station,Joy,About Last Night,This Is 40,Blended,Daddy's Home,American Hustle,Wild,Nocturnal Animals,Almost Christmas,Addicted,Battle of the Sexes,Book Club,Self/Less,Florence Foster Jenkins,Ride Along,Death Wish,Bridget Jones's Baby,Non-Stop,Nine Lives,Beauty and the Beast,Neighbors 2: Sorority Rising,Flatliners,Southpaw,San Andreas,La La Land,"Everything, Everything",Adrift,Spy,Rogue One: A Star Wars Story,Skyfall,Oculus,Dirty Grandpa,Winchester,Tyler Perry's Acrimony,Assassin's Creed,Baggage Claim,Gravity,Bad Moms,Tammy,Fences,Wish Upon,Proud Mary,Taken 2,Last Vegas,Daddy's Home 2,Interstellar,Breaking In,Spotlight,Action Point,Blockers,If I Stay,Star Wars: The Last Jedi,Tomb Raider,John Wick,Tomorrowland,Get On Up,Midnight Sun,Chappaquiddick,My Big Fat Greek Wedding 2,Max,Sinister 2,Atomic Blonde,Argo,Me Before You,Marshall,Sicario,Grudge Match,Girls Trip,Get Out,Focus,Megan Leavey,Paranormal Activity: The Ghost Dimension,Wonder,Admission,Allied,Poltergeist,Whiskey Tango Foxtrot,No Good Deed,Begin Again,Neighbors,Beirut,Spectre,Magic Mike,Transcendence,Epic,Rough Night,Maleficent,Jigsaw,Safe Haven,Lights Out,Snowden,Million Dollar Arm,Paddington 2,Sleepless,Pacific Rim,Jersey Boys,Kidnap,Pete's Dragon,10 Cloverfield Lane,Game Night,War Room,Dracula Untold,Escape From Planet Earth,Dolphin Tale 2,Detroit,The Wedding Ringer,Thank You For Your Service,Unsane,Sully,Brick Mansions,Upgrade,I Can Only Imagine,Black Panther,Wind River,Barbershop: The Next Cut,Why Him?,Arrival,Parental Guidance,Lucy,Mad Max: Fury Road,American Made,Don't Breathe,Ride Along 2,American Ultra,Annabelle,Underworld: Blood Wars,Unbroken,Warcraft,Deadpool,Crimson Peak,Annihilation,Unfriended,"Love, Simon",Burnt,Broken City,I Feel Pretty,London Has 
Fallen,Transformers: Age of Extinction,Dunkirk,Identity Thief,Split,Inferno,Hot Tub Time Machine 2,Hope Springs,Home Again,Hitman: Agent 47,It Comes At Night,Happy Death Day,Hacksaw Ridge,Goosebumps,Jackass Presents: Bad Grandpa,Suicide Squad,Ghost in the Shell,Paranormal Activity: The Marked Ones,ParaNorman,Fifty Shades Darker,Everybody Wants Some!!,Lee Daniels' The Butler,Leap!,Geostorm,Krampus,Fantastic Four,Fifty Shades Freed,Finding Dory,Selma,Turbo,Planes: Fire and Rescue,Sgt. Stubby: An American Hero,Ready Player One,This Is The End,Wonder Woman,Prometheus,Kong: Skull Island,Paper Towns,Ferdinand,It,Grown Ups 2,Fury,Furious 7,Free Birds,Fist Fight,Every Day,Logan Lucky,Downsizing,Despicable Me 3,Captain America: Civil War,Blair Witch,Blade Runner 2049,Bad Samaritan,Jack Reacher: Never Go Back,Hotel Artemis,Mechanic: Resurrection,Miracles From Heaven,Moana,Paddington,Ocean's 8,Den of Thieves,Gringo,God's Not Dead 2,Storks,God's Not Dead,Magic Mike XXL,Project Almanac,Elysium,Trolls,Overboard,Maze Runner: The Death Cure,Hostiles,Planes,Pixels,What To Expect When You're Expecting,Avengers: Infinity War,Ant-Man,Annabelle: Creation,Alien: Covenant,Abraham Lincoln: Vampire Hunter,World War Z,Wreck-It Ralph,X-Men: Apocalypse,Cars 3,Zootopia,Sinister,Keanu,Jason Bourne,12 Strong,Sing,Let's Be Cops,Jurassic World,Life,Logan,Resident Evil: The Final Chapter,Jack Reacher,Lone Survivor,Risen,Insidious: The Last Key,Inside Out,Rings,Show Dogs,Big Hero 6,Pitch Perfect 3,Justice League,Peter Rabbit,Savages,Avengers: Age Of Ultron,Act of Valor,Rise of the Guardians,Captain Underpants: The First Epic Movie,Do You Believe?,Deadpool 2,Project X,Ice Age: Collision Course,Solo: A Star Wars Story,Smurfs: The Lost Village,Red Tails,Resident Evil: Retribution,Frankenweenie,Monster Trucks,Entourage,Jumanji: Welcome to the Jungle,Thor: Ragnarok,Early Man,Dope,Transformers: The Last Knight,Doctor Strange,Home,Sherlock Gnomes,Rampage
length,17,13,12,12,11,11,11,11,11,11,11,11,11,11,10,10,10,10,10,10,10,9,9,9,9,9,9,9,9,9,8,8,8,8,8,8,8,8,8,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,fall from grace_KW,New York_KW,parent_KW,biopic_KW,wrongly accused_KW,surprise ending_KW,family_KW,biopic_KW,family_KW,mother_KW,motherhood_KW,brother_KW,fiancé_KW,biopic_KW,New York_KW,suspenseful_KW,biopic_KW,murder_KW,family_KW,New York_KW,ex-girlfriend_KW,motherhood_KW,family_KW,parent_KW,New York_KW,parent_KW,family_KW,parent_KW,therapy_KW,biographical_KW,relationship_KW,wealthy_KW,New York_KW,brother_KW,family_KW,motherhood_KW,New York_KW,family_KW,romantic_KW,parent_KW,addiction_KW,shooting_KW,family_KW,relationship_KW,mother_KW,desperation_KW,revenge_KW,suspenseful_KW,femme fatale_KW,parent_KW,parent_KW,revenge_KW,therapy_KW,family_KW,family_KW,motherhood_KW,motherhood_KW,family_KW,family_KW,parent_KW,revenge_KW,family_KW,New York_KW,parent_KW,family_KW,parent_KW,scandal_KW,desperation_KW,parent_KW,parent_KW,mother_KW,obsession_KW,New York_KW,female_KW,abandonment_KW,romantic_KW,scandal_KW,family_KW,brother_KW,murder_KW,double cross_KW,historical_KW,wealthy_KW,biopic_KW,revenge_KW,ex-girlfriend_KW,women_KW,New York_KW,femme fatale_KW,love_KW,New York_KW,New York_KW,motherhood_KW,romantic_KW,family_KW,relationship_KW,parent_KW,New York_KW,family_KW,1980s_KW,deception_KW,sexy_KW,assassination_KW,mother_KW,women_KW,betrayal_KW,murder_KW,relationship_KW,siblings_KW,political_KW,business_KW,narcissism_KW,desperation_KW,brother_KW,fame_KW,desperation_KW,parent_KW,escape_KW,death_KW,wealthy_KW,son_KW,escape_KW,touching_KW,law enforcement_KW,fiancé_KW,alcohol_KW,suspenseful_KW,New York_KW,racy_KW,murder_KW,touching_KW,revenge_KW,murder_KW,political_KW,family_KW,surprise ending_KW,parenthood_KW,crime_KW,escape_KW,scandal_KW,crime_KW,brother_KW,racy_KW,pregnancy_KW,betrayal_KW,biopic_KW,weapon_KW,revenge_KW,tragedy_KW,suspenseful_KW,relationship_KW,touching_KW,therapy_KW,New York_KW,New 
York_KW,political_KW,father/daughter_KW,suspenseful_KW,couple_KW,desperation_KW,historical_KW,business_KW,marriage_KW,separation_KW,murder_KW,family_KW,murder_KW,biopic_KW,father/daughter_KW,parent_KW,law enforcement_KW,crime_KW,murder_KW,wrongly accused_KW,seduction_KW,relationship_KW,1970s_KW,touching_KW,racy_KW,family_KW,ex-girlfriend_KW,couple_KW,parent_KW,political_KW,brother_KW,fight_KW,touching_KW,escape_KW,celebrity_KW,female_KW,death_KW,suspenseful_KW,relationship_KW,escape_KW,dark_KW,marriage_KW,loyalty_KW,revenge_KW,couple_KW,conflict_KW,romantic_KW,brother_KW,money_KW,brother_KW,political_KW,brother_KW,traitor_KW,crime_KW,racy_KW,crime_KW,revenge_KW,death_KW,female_KW,family_KW,New York_KW,law enforcement_KW,business_KW,lawyer_KW,family_KW,conflict_KW,racy_KW,suspenseful_KW,money_KW,couple_KW,deception_KW,dark_KW,dark_KW,international_KW,1980s_KW,pregnancy_KW,New York_KW,1980s_KW,death_KW,suspenseful_KW,historical_KW,suspenseful_KW,romance_KW,1980s_KW,touching_KW,law enforcement_KW,murder_KW,break-up_KW,assassin_KW,true story_KW,ambition_KW,law enforcement_KW,suspenseful_KW,suspenseful_KW,dark_KW,suspenseful_KW,murder_KW,true story_KW,death_KW,suspenseful_KW,family_KW,death_KW,New York_KW,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,husband_KW,fall from grace_KW,family_KW,murder_KW,murder_KW,parent_KW,betrayal_KW,historical_KW,mother_KW,betrayal_KW,New York_KW,parent_KW,New York_KW,celebrity_KW,father/son_KW,revenge_KW,1980s_KW,husband_KW,murder_KW,biopic_KW,relationship_KW,family_KW,mother_KW,husband_KW,betrayal_KW,mother_KW,husband_KW,family_KW,addiction_KW,true story_KW,romantic_KW,suspenseful_KW,biopic_KW,family_KW,revenge_KW,pregnancy_KW,wrongly accused_KW,husband_KW,love_KW,revenge_KW,death_KW,tragedy_KW,death_KW,romantic_KW,romantic_KW,true story_KW,love_KW,rebellion_KW,family_KW,brother_KW,son_KW,inheritance_KW,revenge_KW,murder_KW,female_KW,mother_KW,parent_KW,mother_KW,father/son_KW,death_KW,killer_KW,revenge_KW,engagement_KW,husband_KW,single father_KW,mother_KW,true story_KW,estranged_KW,father/daughter_KW,romantic_KW,rebellion_KW,female_KW,revenge_KW,female protagonist_KW,fame_KW,love_KW,tragedy_KW,mother_KW,family_KW,mother_KW,assassination_KW,true story_KW,romantic_KW,legal_KW,violence_KW,father/son_KW,female_KW,family_KW,revenge_KW,true story_KW,family_KW,family_KW,relationship_KW,assassin_KW,mother_KW,true story_KW,mother_KW,relationship_KW,revenge_KW,political_KW,Italy_KW,racy_KW,death_KW,death_KW,female_KW,female_KW,ruthless_KW,crime_KW,brother_KW,legal_KW,family_KW,son_KW,son_KW,death_KW,celebrity_KW,son_KW,death_KW,female_KW,murder_KW,marriage_KW,father_KW,brother_KW,compassion_KW,historical_KW,engagement_KW,alcoholism_KW,therapy_KW,biographical_KW,ex-girlfriend_KW,death_KW,son_KW,father_KW,crime_KW,business_KW,father/daughter_KW,death_KW,parent_KW,female_KW,death_KW,true story_KW,millionaire_KW,brother-in-law_KW,couple_KW,couple_KW,love_KW,son_KW,fight_KW,terminally ill_KW,love 
triangle_KW,female_KW,revenge_KW,deception_KW,money_KW,political_KW,female_KW,murder_KW,daughter_KW,historical_KW,crime_KW,son_KW,dark_KW,murder_KW,relationship_KW,mother_KW,assassin_KW,husband_KW,suspenseful_KW,biographical_KW,daughter_KW,elderly_KW,son_KW,female_KW,mother_KW,suspenseful_KW,sex_KW,true story_KW,true story_KW,female_KW,political_KW,relationship_KW,brother_KW,seduction_KW,touching_KW,biopic_KW,siblings_KW,touching_KW,true story_KW,culture_KW,death_KW,female protagonist_KW,suspenseful_KW,1970s_KW,revenge_KW,touching_KW,1980s_KW,family_KW,violence_KW,international_KW,historical_KW,fight_KW,love_KW,crime_KW,romance_KW,siblings_KW,law enforcement_KW,siblings_KW,dark_KW,suspenseful_KW,wrongly accused_KW,assassin_KW,assassin_KW,true story_KW,female protagonist_KW,touching_KW,fashion_KW,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,mother_KW,biopic_KW,famous_KW,ruthless_KW,death_KW,family_KW,father/son_KW,biographical_KW,father/son_KW,love_KW,parent_KW,mother_KW,suspenseful_KW,true story_KW,violence_KW,desperation_KW,business_KW,mother_KW,mother_KW,parent_KW,romantic_KW,relationship_KW,single father_KW,envy_KW,true story_KW,death_KW,death_KW,death_KW,female_KW,female_KW,female_KW,mother_KW,relationship_KW,engagement_KW,violence_KW,mother_KW,killer_KW,father/son_KW,romance_KW,marriage_KW,father/daughter_KW,father/daughter_KW,father_KW,love_KW,love_KW,female_KW,weapon_KW,rebel_KW,tragedy_KW,relationship_KW,wedding_KW,true story_KW,betrayal_KW,killer_KW,wedding_KW,death_KW,family_KW,female_KW,affair_KW,female_KW,female_KW,father/daughter_KW,wedding_KW,marriage_KW,father/daughter_KW,desperation_KW,racy_KW,father/daughter_KW,daughter_KW,tragedy_KW,female_KW,female protagonist_KW,violence_KW,father/daughter_KW,biopic_KW,female_KW,death_KW,tradition_KW,death_KW,father/son_KW,female_KW,political_KW,love_KW,lawyer_KW,international_KW,son_KW,affair_KW,relationship_KW,sexy_KW,female_KW,uncle_KW,siblings_KW,mother_KW,romance_KW,father_KW,female_KW,desperation_KW,break-up_KW,marriage_KW,international_KW,international_KW,romance_KW,marriage_KW,father/daughter_KW,female protagonist_KW,romance_KW,killer_KW,suspenseful_KW,dark_KW,true story_KW,culture_KW,wrongly accused_KW,father_KW,fight_KW,historical_KW,mother_KW,family_KW,love_KW,crime_KW,daughter_KW,father/son_KW,family_KW,true story_KW,true story_KW,deception_KW,compassion_KW,female_KW,historical_KW,weapon_KW,revenge_KW,father_KW,father/son_KW,suspenseful_KW,family_KW,daughter_KW,female_KW,family_KW,female protagonist_KW,dark_KW,1980s_KW,suspenseful_KW,marriage_KW,assassin_KW,suspenseful_KW,romance_KW,true story_KW,violence_KW,health_KW,love_KW,female protagonist_KW,suspenseful_KW,romance_KW,touching_KW,revenge_KW,female protagonist_KW,death_KW,father_KW,true 
story_KW,raunchy_KW,suspenseful_KW,Italy_KW,raunchy_KW,family_KW,daughter_KW,violence_KW,father_KW,killer_KW,true story_KW,father_KW,raunchy_KW,dark_KW,female protagonist_KW,suspenseful_KW,touching_KW,sexy_KW,1980s_KW,1980s_KW,female protagonist_KW,brother_KW,tradition_KW,CEO_KW,sex_KW,family_KW,true story_KW,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
4,famous_KW,relationship_KW,father/son_KW,violence_KW,racy_KW,mother_KW,rebellion_KW,death_KW,business_KW,violence_KW,relationship_KW,family_KW,murder_KW,ambition_KW,father_KW,assassination_KW,biographical_KW,violence_KW,son_KW,mother_KW,love_KW,mother_KW,romance_KW,exes_KW,ambition_KW,true story_KW,female_KW,estranged_KW,female protagonist_KW,female protagonist_KW,romance_KW,death_KW,love_KW,law enforcement_KW,death_KW,romantic_KW,alcoholism_KW,father/daughter_KW,father/daughter_KW,raunchy_KW,daughter_KW,daughter_KW,estranged_KW,ambition_KW,female_KW,female protagonist_KW,female_KW,father/daughter_KW,death_KW,family_KW,marriage_KW,female_KW,obsession_KW,assassin_KW,romance_KW,female_KW,mother_KW,female protagonist_KW,conflict_KW,female protagonist_KW,female protagonist_KW,daughter_KW,romance_KW,ex-husband_KW,daughter_KW,death_KW,touching_KW,daughter_KW,parenthood_KW,death_KW,rebel_KW,father/daughter_KW,death_KW,daughter_KW,celebrity_KW,romance_KW,true story_KW,wedding_KW,true story_KW,estranged_KW,female protagonist_KW,1970s_KW,tragedy_KW,historical_KW,crime_KW,fight_KW,raunchy_KW,racy_KW,sex_KW,female protagonist_KW,death_KW,touching_KW,romance_KW,marriage_KW,daughter_KW,female protagonist_KW,daughter_KW,couple_KW,raunchy_KW,1970s_KW,crime_KW,touching_KW,death of a spouse_KW,daughter_KW,raunchy_KW,female protagonist_KW,violence_KW,family_KW,family_KW,suspenseful_KW,true story_KW,touching_KW,father/son_KW,siblings_KW,biopic_KW,suspenseful_KW,death of a parent_KW,female protagonist_KW,couple_KW,family_KW,dark_KW,siblings_KW,love_KW,suspenseful_KW,wedding_KW,true story_KW,female protagonist_KW,true story_KW,suspenseful_KW,death of a spouse_KW,father/son_KW,son_KW,violence_KW,relationship_KW,father_KW,female protagonist_KW,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


### Capturing House of Gucci's Demographics

In [9]:
dma_market_share.head()

,Rank,DMA Long Name,DMA,DMA Code,Typical Market Share,Film Share (All Weeks),Index,Film
0,1,Los Angeles (2),Los Angeles,803,8.19,7.42,-9.51,Jumanji: Welcome to the Jungle
1,2,New York (1),New York,501,6.41,4.96,-22.59,Jumanji: Welcome to the Jungle
2,3,Dallas-Ft. Worth (5),Dallas-Ft. Worth,623,2.77,2.79,0.86,Jumanji: Welcome to the Jungle
3,4,San Francisco-Oak-San Jose (6),San Francisco-Oak-San Jose,807,3.07,2.73,-11.09,Jumanji: Welcome to the Jungle
4,5,Chicago (3),Chicago,602,2.93,2.68,-8.54,Jumanji: Welcome to the Jungle


In [10]:
'''Gets the Top 5 DMAs by Index for Each of the 10 Most Similar Movies.'''

# Filters dataframe to top 10 movies
top_10_movies_list = ['I, Tonya', "Molly's Game", 'Creed', 'Black Mass', 'Taken 3', 'Hereditary',
                      'Star Wars: The Force Awakens', 'Jackie', 'Saving Mr. Banks', 'Moonlight']
top_10_movies = dma_market_share.loc[dma_market_share['Film'].isin(top_10_movies_list)]

# Gets top 5 DMAs
top_dma = {}
for movie in top_10_movies_list:
    temp_df = top_10_movies.loc[top_10_movies['Film'] == movie]
    top_dma[movie] = list(temp_df.sort_values(['Index'], ascending=False)['DMA Code'].values[:5])
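
Sorting on `Index` ranks DMAs as intended only when that column is numeric. If it is still stored as text at this point (the earlier filter on `'#VALUE!'` hints that it was read in as strings), values such as `'9.5'` sort above `'12.4'`. The sketch below shows one way to guard against that. It uses a small invented frame in place of `dma_market_share`, and `top_dmas_by_index` is a hypothetical helper, not part of the notebook.

```python
import pandas as pd

# Toy stand-in for dma_market_share; all values are invented for illustration.
shares = pd.DataFrame({
    'Film': ['A', 'A', 'A', 'B', 'B', 'B'],
    'DMA Code': ['501', '803', '602', '501', '803', '602'],
    'Index': ['9.5', '12.4', '-3.0', '3.1', '-9.51', '7.0'],
})

def top_dmas_by_index(df, films, n=5):
    """Return {film: [DMA codes]} for the n highest Index values per film."""
    subset = df.loc[df['Film'].isin(films)].copy()
    # Coerce Index to numbers so the sort is numeric rather than lexicographic
    subset['Index'] = pd.to_numeric(subset['Index'], errors='coerce')
    ranked = subset.sort_values('Index', ascending=False)
    return {film: ranked.loc[ranked['Film'] == film, 'DMA Code'].head(n).tolist()
            for film in films}

top = top_dmas_by_index(shares, ['A', 'B'], n=2)
print(top)
```

Rows whose `Index` cannot be parsed become `NaN` and sort last, so they are only picked when a film has fewer than `n` valid rows.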

In [11]:
'''Gathers Top Values Per Demographic For Each Similar Movie (Top 2 for Most, Top 5 Occupations, Top 1 Gender and Occupation Class).'''

top_10_movies_demos = pd.DataFrame()
for movie in top_10_movies_list:
    top_pops, top_lang, top_gend, top_age, top_married, top_education, top_income, top_family, top_occ, top_occ_cl = [], [], [], [], [], [], [], [], [], []

    for dma in top_dma[movie]:
        temp_df = demographics.loc[demographics['DMA Area Code'] == int(dma)]
        
        # Population by Single-Classification Race
        # Population Top 1
        population_cols = ['White Alone', 'Black or African American Alone', 'Amer. Indian and Alaska Native Alone', 'Asian Alone',
                           'Native Hawaiian and Other Pac. Isl. Alone', 'Some Other Race Alone', 'Two or More Races']
        population_1 = temp_df[population_cols].reset_index(drop=True)
        top_pops.append(population_1.idxmax(axis=1).values[0])
        # Population Top 2
        population_cols.remove(population_1.idxmax(axis=1).values[0])
        population_2 = temp_df[population_cols].reset_index(drop=True)
        top_pops.append(population_2.idxmax(axis=1).values[0])
        
        
        # Language
        # Language Top 1
        language_cols = ['Speak Only English at Home', 'Speak Asian/Pac. Isl. Lang. at Home',
                         'Speak IndoEuropean Language at Home', 'Speak Spanish at Home', 'Speak Other Language at Home']
        language_1 = temp_df[language_cols].reset_index(drop=True)
        top_lang.append(language_1.idxmax(axis=1).values[0])
        # Language Top 2
        language_cols.remove(language_1.idxmax(axis=1).values[0])
        language_2 = temp_df[language_cols].reset_index(drop=True)
        top_lang.append(language_2.idxmax(axis=1).values[0])
        
        
        # Gender
        gender_cols = ['Male', 'Female']
        gender_1 = temp_df[gender_cols].reset_index(drop=True)
        top_gend.append(gender_1.idxmax(axis=1).values[0])
        
        
        # Age
        # Age Top 1
        age_cols = ['Age 0 - 4.1', 'Age 5 - 9.1', 'Age 10 - 14.1', 'Age 15 - 17.1', 'Age 18 - 20.1', 'Age 21 - 24.1',
                    'Age 25 - 34.1', 'Age 35 - 44.1', 'Age 45 - 54.1', 'Age 55 - 64.1', 'Age 65 - 74.1', 'Age 75 - 84.1', 'Age 85 and over.1']
        age_1 = temp_df[age_cols].reset_index(drop=True)
        top_age.append(age_1.idxmax(axis=1).values[0])
        # Age Top 2
        age_cols.remove(age_1.idxmax(axis=1).values[0])
        age_2 = temp_df[age_cols].reset_index(drop=True)
        top_age.append(age_2.idxmax(axis=1).values[0])
        
        
        # Marital Status
        # Married Top 1
        married_cols = ['Total, Never Married', 'Married, Spouse present', 'Married, Spouse absent', 'Widowed', 'Divorced']
        married_1 = temp_df[married_cols].reset_index(drop=True)
        top_married.append(married_1.idxmax(axis=1).values[0])
        # Married Top 2
        married_cols.remove(married_1.idxmax(axis=1).values[0])
        married_2 = temp_df[married_cols].reset_index(drop=True)
        top_married.append(married_2.idxmax(axis=1).values[0])
        
        
        # Education
        # Education Top 1
        education_cols = ['Less than 9th grade', 'Some High School, no diploma', 'High School Graduate (or GED)', 'Some College, no degree',
                        'Associate Degree', "Bachelor's Degree", "Master's Degree", 'Professional School Degree', 'Doctorate Degree']
        education_1 = temp_df[education_cols].reset_index(drop=True)
        top_education.append(education_1.idxmax(axis=1).values[0])
        # Education Top 2
        education_cols.remove(education_1.idxmax(axis=1).values[0])
        education_2 = temp_df[education_cols].reset_index(drop=True)
        top_education.append(education_2.idxmax(axis=1).values[0])
        
        
        # Income
        # Income Top 1
        income_cols = ['Income < $15,000', 'Income $15,000 - $24,999', 'Income $25,000 - $34,999', 'Income $35,000 - $49,999', 'Income $50,000 - $74,999',
                       'Income $75,000 - $99,999', 'Income $100,000 - $124,999', 'Income $125,000 - $149,999', 'Income $150,000 - $199,999',
                       'Income $200,000 - $249,999', 'Income $250,000 - $499,999', 'Income $500,000+']
        income_1 = temp_df[income_cols].reset_index(drop=True)
        top_income.append(income_1.idxmax(axis=1).values[0])
        # Income Top 2
        income_cols.remove(income_1.idxmax(axis=1).values[0])
        income_2 = temp_df[income_cols].reset_index(drop=True)
        top_income.append(income_2.idxmax(axis=1).values[0])
        
        
        # Family Type
        # Family Top 1
        family_cols = ['Married-Couple Family, own children', 'Married-Couple Family, no own children', 'Male Householder, own children',
                       'Male Householder, no own children', 'Female Householder, own children', 'Female Householder, no own children']
        family_1 = temp_df[family_cols].reset_index(drop=True)
        top_family.append(family_1.idxmax(axis=1).values[0])
        # Family Top 2
        family_cols.remove(family_1.idxmax(axis=1).values[0])
        family_2 = temp_df[family_cols].reset_index(drop=True)
        top_family.append(family_2.idxmax(axis=1).values[0])
        
        
        # Occupation
        # Occupation Top 1
        occ_cols = ['Architect/Engineer', 'Arts/Entertainment/Sports', 'Building Grounds Maintenance', 'Business/Financial Operations', 'Community/Social Services',
                    'Computer/Mathematical', 'Construction/Extraction', 'Education/Training/Library', 'Farming/Fishing/Forestry', 'Food Prep/Serving',
                    'Health Practitioner/Technician', 'Healthcare Support', 'Maintenance Repair', 'Legal', 'Life/Physical/Social Science', 'Management',
                    'Office/Admin. Support', 'Production', 'Protective Services', 'Sales/Related', 'Personal Care/Service', 'Transportation/Moving']
        occ_1 = temp_df[occ_cols].reset_index(drop=True)
        top_occ.append(occ_1.idxmax(axis=1).values[0])
        # Occupation Top 2
        occ_cols.remove(occ_1.idxmax(axis=1).values[0])
        occ_2 = temp_df[occ_cols].reset_index(drop=True)
        top_occ.append(occ_2.idxmax(axis=1).values[0])
        # Occupation Top 3
        occ_cols.remove(occ_2.idxmax(axis=1).values[0])
        occ_3 = temp_df[occ_cols].reset_index(drop=True)
        top_occ.append(occ_3.idxmax(axis=1).values[0])
        # Occupation Top 4
        occ_cols.remove(occ_3.idxmax(axis=1).values[0])
        occ_4 = temp_df[occ_cols].reset_index(drop=True)
        top_occ.append(occ_4.idxmax(axis=1).values[0])
        # Occupation Top 5
        occ_cols.remove(occ_4.idxmax(axis=1).values[0])
        occ_5 = temp_df[occ_cols].reset_index(drop=True)
        top_occ.append(occ_5.idxmax(axis=1).values[0])
        
        
        # Occupation Class
        occ_cl_cols = ['Blue Collar', 'White Collar', 'Service and Farm']
        occ_cl_1 = temp_df[occ_cl_cols].reset_index(drop=True)
        top_occ_cl.append(occ_cl_1.idxmax(axis=1).values[0])

        
    # Aggregates top demographics across all 5 DMAs.
    # Population Aggregation
    pop_top_1 = mode(top_pops)
    pop_top_2 = mode([i for i in top_pops if i != pop_top_1])
    top_10_movies_demos.at[movie, 'Population'] = ', '.join([pop_top_1, pop_top_2])
    
    # Language Aggregation
    lang_top_1 = mode(top_lang)
    lang_top_2 = mode([i for i in top_lang if i != lang_top_1])
    top_10_movies_demos.at[movie, 'Language'] = ', '.join([lang_top_1, lang_top_2])
    
    # Gender Aggregation
    top_10_movies_demos.at[movie, 'Gender'] = mode(top_gend)
    
    # Age Aggregation
    age_top_1 = mode(top_age)
    age_top_2 = mode([i for i in top_age if i != age_top_1])
    top_10_movies_demos.at[movie, 'Age'] = ', '.join([age_top_1, age_top_2])
    
    # Married Aggregation
    married_top_1 = mode(top_married)
    married_top_2 = mode([i for i in top_married if i != married_top_1])
    top_10_movies_demos.at[movie, 'Marital Status'] = ', '.join([married_top_1, married_top_2])
    
    # Education Aggregation
    education_top_1 = mode(top_education)
    education_top_2 = mode([i for i in top_education if i != education_top_1])
    top_10_movies_demos.at[movie, 'Education'] = ', '.join([education_top_1, education_top_2])
    
    # Income Aggregation
    income_top_1 = mode(top_income)
    income_top_2 = mode([i for i in top_income if i != income_top_1])
    top_10_movies_demos.at[movie, 'Income'] = ', '.join([income_top_1, income_top_2])
    
    # Family Aggregation
    family_top_1 = mode(top_family)
    family_top_2 = mode([i for i in top_family if i != family_top_1])
    top_10_movies_demos.at[movie, 'Family Type'] = ', '.join([family_top_1, family_top_2])
    
    # Occupation Aggregation
    occ_top_1 = mode(top_occ)
    occ_top_2 = mode([i for i in top_occ if i != occ_top_1])
    occ_top_3 = mode([i for i in top_occ if i != occ_top_1 and i != occ_top_2])
    occ_top_4 = mode([i for i in top_occ if i != occ_top_1 and i != occ_top_2 and i != occ_top_3])
    occ_top_5 = mode([i for i in top_occ if i != occ_top_1 and i != occ_top_2 and i != occ_top_3 and i != occ_top_4])
    top_10_movies_demos.at[movie, 'Occupation'] = ', '.join([occ_top_1, occ_top_2, occ_top_3, occ_top_4, occ_top_5])
    
    # Occupation Class Aggregation
    top_10_movies_demos.at[movie, 'Occupation Class'] = mode(top_occ_cl)
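
The repeated `idxmax` / `remove` pattern in the cell above can be condensed into one call per demographic. This is a minimal sketch, assuming each DMA filter yields a single-row frame as it does in the loop. `top_n_columns` is a hypothetical helper and the numbers in `demo` are invented.

```python
import pandas as pd

def top_n_columns(row_df, cols, n=2):
    """Return the n column labels holding the largest values in a one-row frame."""
    values = row_df[cols].iloc[0].astype(float)
    # nlargest returns the labels in descending order of value
    return values.nlargest(n).index.tolist()

# Invented one-row stand-in for a filtered demographics frame
demo = pd.DataFrame({
    'DMA Area Code': [501],
    'Male': [48.0], 'Female': [52.0],
    'Blue Collar': [20.0], 'White Collar': [65.0], 'Service and Farm': [15.0],
})
temp_df = demo.loc[demo['DMA Area Code'] == 501]

gender = top_n_columns(temp_df, ['Male', 'Female'], n=1)
occ_class = top_n_columns(temp_df, ['Blue Collar', 'White Collar', 'Service and Farm'], n=2)
print(gender, occ_class)
```

Inside the loop, a call such as `top_occ.extend(top_n_columns(temp_df, occ_cols, n=5))` would stand in for the five occupation blocks, and the column lists would no longer need to be rebuilt on every iteration because nothing mutates them.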

In [12]:
top_10_movies_demos.head(3)

,Population,Language,Gender,Age,Marital Status,Education,Income,Family Type,Occupation,Occupation Class
"I, Tonya","White Alone, Amer. Indian and Alaska Native Alone","Speak Only English at Home, Speak IndoEuropean...",Male,"Age 55 - 64.1, Age 45 - 54.1","Married, Spouse present, Total, Never Married","High School Graduate (or GED), Some College, n...","Income $50,000 - $74,999, Income $35,000 - $49...","Married-Couple Family, no own children, Marrie...","Office/Admin. Support, Sales/Related, Manageme...",White Collar
Molly's Game,"White Alone, Black or African American Alone","Speak Only English at Home, Speak Spanish at Home",Female,"Age 25 - 34.1, Age 35 - 44.1","Married, Spouse present, Total, Never Married","High School Graduate (or GED), Some College, n...","Income $50,000 - $74,999, Income $35,000 - $49...","Married-Couple Family, no own children, Marrie...","Office/Admin. Support, Sales/Related, Manageme...",White Collar
Creed,"White Alone, Some Other Race Alone","Speak Only English at Home, Speak Spanish at Home",Female,"Age 25 - 34.1, Age 35 - 44.1","Married, Spouse present, Total, Never Married","High School Graduate (or GED), Some College, n...","Income $50,000 - $74,999, Income < $15,000","Married-Couple Family, no own children, Marrie...","Office/Admin. Support, Sales/Related, Manageme...",White Collar


In [13]:
'''Gets Top Values In Each Demographic Column and Assigns to House of Gucci.'''

house_of_gucci_dem = pd.DataFrame()
for col in top_10_movies_demos.columns:
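    # Flattens every movie's ', '-joined entry into one list of values.
    # Caveat: labels that contain ', ' themselves (e.g. 'Married, Spouse present') are split here too.
    total_vals = [j for i in top_10_movies_demos[col].values for j in i.split(', ')]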
    
    if col in ['Gender', 'Occupation Class']:
        top_1 = mode(total_vals)
        house_of_gucci_dem.at['House of Gucci', col] = top_1
    elif col == 'Occupation':
        top_1 = mode(total_vals)
        top_2 = mode([i for i in total_vals if i != top_1])
        top_3 = mode([i for i in total_vals if i != top_1 and i != top_2])
        top_4 = mode([i for i in total_vals if i != top_1 and i != top_2 and i != top_3])
        top_5 = mode([i for i in total_vals if i != top_1 and i != top_2 and i != top_3 and i != top_4])
        house_of_gucci_dem.at['House of Gucci', col] = ', '.join([top_1, top_2, top_3, top_4, top_5])
    elif col == 'Population':
        top_1 = mode(total_vals)
        top_2 = mode([i for i in total_vals if i != top_1])
        top_3 = mode([i for i in total_vals if i != top_1 and i != top_2])
        house_of_gucci_dem.at['House of Gucci', col] = ', '.join([top_1, top_2, top_3])
    else:
        top_1 = mode(total_vals)
        top_2 = mode([i for i in total_vals if i != top_1])
        house_of_gucci_dem.at['House of Gucci', col] = ', '.join([top_1, top_2])

# Lists House of Gucci's top demographic data
house_of_gucci_dem

,Population,Language,Gender,Age,Marital Status,Education,Income,Family Type,Occupation,Occupation Class
House of Gucci,"White Alone, Some Other Race Alone, Black or A...","Speak Only English at Home, Speak Spanish at Home",Female,"Age 25 - 34.1, Age 35 - 44.1","Married, Spouse present","High School Graduate (or GED), Some College","Income $50,000 - $74,999, Income $35,000 - $49...","Married-Couple Family, no own children","Office/Admin. Support, Sales/Related, Manageme...",White Collar
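
A note on the result above: several labels, such as `'Married, Spouse present'` and `'Some College, no degree'`, contain the same `', '` sequence that is used to join and split the per-movie entries. That appears to be why Marital Status, Education and Family Type each show what looks like one label (or a truncated one) instead of two. The sketch below reproduces the effect on invented picks and shows one possible alternative: keeping the labels in lists until the final tally. `per_movie` and `top_labels` are hypothetical names.

```python
from collections import Counter

# Invented per-movie picks, kept as lists rather than ', '-joined strings
per_movie = {
    'Movie A': ['Married, Spouse present', 'Total, Never Married'],
    'Movie B': ['Married, Spouse present', 'Total, Never Married'],
    'Movie C': ['Married, Spouse present', 'Divorced'],
}

def top_labels(picks_by_movie, n=2):
    """Tally whole labels across movies and return the n most frequent."""
    counts = Counter(label for picks in picks_by_movie.values() for label in picks)
    return [label for label, _ in counts.most_common(n)]

whole = top_labels(per_movie)

# Joining and re-splitting on ', ' breaks the labels into fragments
joined = [', '.join(picks) for picks in per_movie.values()]
fragments = [part for entry in joined for part in entry.split(', ')]
split_top = [label for label, _ in Counter(fragments).most_common(2)]

print(whole)
print(split_top)
```

With whole labels the top two are distinct categories; with the split fragments the top two are `'Married'` and `'Spouse present'`, which rejoin into what looks like a single label.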


In [14]:
'''Lists Top 5 DMAs for House of Gucci.'''

# Flattens each movie's top DMAs into a single list
all_dmas = [dma for dmas in top_dma.values() for dma in dmas]

top_1_dma = mode(all_dmas)
top_2_dma = mode([i for i in all_dmas if i != top_1_dma])
top_3_dma = mode([i for i in all_dmas if i != top_1_dma and i != top_2_dma])
top_4_dma = mode([i for i in all_dmas if i != top_1_dma and i != top_2_dma and i != top_3_dma])
top_5_dma = mode([i for i in all_dmas if i != top_1_dma and i != top_2_dma and i != top_3_dma and i != top_4_dma])

for i, dma in enumerate([top_1_dma, top_2_dma, top_3_dma, top_4_dma, top_5_dma]):
    dma_name = dma_market_share.loc[dma_market_share['DMA Code'] == str(dma)]['DMA'].values[0]
    print(f"DMA {i+1}: {dma_name} ({dma})")

DMA 1: Palm Springs (804)
DMA 2: Helena (766)
DMA 3: West Palm Beach-Ft. Pierce (625)
DMA 4: Charleston, SC (519)
DMA 5: Meridian (711)
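
The chained `mode` calls used for this ranking (and for the demographic aggregation earlier) can also be written as a frequency count. This is a minimal sketch using `collections.Counter`. `top_n_by_frequency` is a hypothetical helper and the per-film lists are invented, reusing a few DMA codes from the output above only as sample values.

```python
from collections import Counter

def top_n_by_frequency(values, n=5):
    """Return the n most frequent items, most frequent first."""
    return [item for item, _ in Counter(values).most_common(n)]

# Invented top-DMA lists per film
top_dma_demo = {
    'Film A': ['804', '766', '625'],
    'Film B': ['804', '766', '519'],
    'Film C': ['804', '711', '501'],
}
all_dmas_demo = [dma for dmas in top_dma_demo.values() for dma in dmas]

ranked = top_n_by_frequency(all_dmas_demo, n=2)
print(ranked)
```

`Counter.most_common` orders items with equal counts by first appearance, so with only five DMAs per film across ten films, the lower ranks can depend on list order rather than on any real difference in frequency. Printing the counts alongside the names makes such ties visible.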
