# BoxOfficeMojo Pipeline

In [1]:
import pandas as pd
import boxOfficeMojoMethods as mojoMethods 
import datetime

In [2]:
help(mojoMethods)

Help on module boxOfficeMojoMethods:

NAME
    boxOfficeMojoMethods - Jonathan L Chu, for Metis Data Science, 7 April 2020

DESCRIPTION
    Methods for scraping and downloading movie data from
    www.boxofficemojo.com. Actual pipeline can be found in the
    accompanying boxOfficeMojoPipeline.py file.

FUNCTIONS
    get_dataframe_from_year(year, num_releases=-1)
        Method to retrieve movie data from one year of
        www.boxofficemojo.com/year/
        
        Number of releases defaults to all [0:-1]
        
        Note that the execution time is equal to 
        2 * len(releases in the year) + 1 seconds
        
        Returns pandas dataframe
    
    get_movie_info_from_title(url)
        Parse the following data from a boxofficemojo.com Title url: 
        ['Movie_Title','Domestic_Distributor','Domestic_Total_Gross',
        'Runtime','Rating','Release_Date','Budget', 'Cast1','Cast2','Cast3','Cast4']
        
        Input: boxofficemojo.com url like:
        'https://

In [3]:
df = pd.DataFrame()
year = 2019
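
The docstring above gives the execution time as `2 * len(releases in the year) + 1` seconds, so it helps to estimate the wait before starting a full-year scrape. The sketch below applies that formula. The helper name `estimated_scrape_time` is hypothetical (it is not part of `boxOfficeMojoMethods`), and 903 is the row count the 2019 run in this notebook ends with, used here as a stand-in for the number of releases.

```python
from datetime import timedelta

def estimated_scrape_time(num_releases):
    """Apply the docstring's formula: 2 * releases + 1 seconds.

    Hypothetical helper for illustration; not part of boxOfficeMojoMethods.
    """
    return timedelta(seconds=2 * num_releases + 1)

# 903 is the final row count of the 2019 run shown in this notebook.
estimate = estimated_scrape_time(903)
print(estimate)  # 0:30:07
```

A full year therefore takes roughly half an hour, which is why passing a small `num_releases` is useful for a first test.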

In [4]:
df = mojoMethods.get_dataframe_from_year(year)  # pass num_releases=5 for a quick test run

200   https://www.boxofficemojo.com/year/2019
200   https://www.boxofficemojo.com/release/rl3059975681/?ref_=bo_yld_table_1
200   https://www.boxofficemojo.com/title/tt4154796/credits/?ref=bo_tt_tab
dataframe shape:  (1, 15)
200   https://www.boxofficemojo.com/release/rl3321923073/?ref_=bo_yld_table_2
200   https://www.boxofficemojo.com/title/tt6105098/credits/?ref=bo_tt_tab
dataframe shape:  (2, 15)
200   https://www.boxofficemojo.com/release/rl3798500865/?ref_=bo_yld_table_3
200   https://www.boxofficemojo.com/title/tt1979376/credits/?ref=bo_tt_tab
dataframe shape:  (3, 15)
200   https://www.boxofficemojo.com/release/rl2424210945/?ref_=bo_yld_table_4
200   https://www.boxofficemojo.com/title/tt4520988/credits/?ref=bo_tt_tab
dataframe shape:  (4, 15)
200   https://www.boxofficemojo.com/release/rl3009644033/?ref_=bo_yld_table_5
200   https://www.boxofficemojo.com/title/tt4154664/credits/?ref=bo_tt_tab
dataframe shape:  (5, 15)
200   https://www.boxofficemojo.com/release/rl3305145857/?r

200   https://www.boxofficemojo.com/title/tt6324278/credits/?ref=bo_tt_tab
dataframe shape:  (46, 15)
200   https://www.boxofficemojo.com/release/rl3993339393/?ref_=bo_yld_table_47
200   https://www.boxofficemojo.com/title/tt7547410/credits/?ref=bo_tt_tab
dataframe shape:  (47, 15)
200   https://www.boxofficemojo.com/release/rl2852423169/?ref_=bo_yld_table_48
200   https://www.boxofficemojo.com/title/tt5886046/credits/?ref=bo_tt_tab
dataframe shape:  (48, 15)
200   https://www.boxofficemojo.com/release/rl2651096577/?ref_=bo_yld_table_49
200   https://www.boxofficemojo.com/title/tt3224458/credits/?ref=bo_tt_tab
dataframe shape:  (49, 15)
200   https://www.boxofficemojo.com/release/rl2634515969/?ref_=bo_yld_table_50
200   https://www.boxofficemojo.com/title/tt6924650/credits/?ref=bo_tt_tab
dataframe shape:  (50, 15)
200   https://www.boxofficemojo.com/release/rl2139063809/?ref_=bo_yld_table_51
200   https://www.boxofficemojo.com/title/tt4701182/credits/?ref=bo_tt_tab
dataframe shape:  (5

200   https://www.boxofficemojo.com/release/rl470189569/?ref_=bo_yld_table_92
200   https://www.boxofficemojo.com/title/tt7549996/credits/?ref=bo_tt_tab
dataframe shape:  (92, 15)
200   https://www.boxofficemojo.com/release/rl2742978049/?ref_=bo_yld_table_93
200   https://www.boxofficemojo.com/title/tt5848272/credits/?ref=bo_tt_tab
dataframe shape:  (93, 15)
200   https://www.boxofficemojo.com/release/rl1577420289/?ref_=bo_yld_table_94
200   https://www.boxofficemojo.com/title/tt6513120/credits/?ref=bo_tt_tab
dataframe shape:  (94, 15)
200   https://www.boxofficemojo.com/release/rl1359316481/?ref_=bo_yld_table_95
200   https://www.boxofficemojo.com/title/tt4669788/credits/?ref=bo_tt_tab
dataframe shape:  (95, 15)
200   https://www.boxofficemojo.com/release/rl3036972545/?ref_=bo_yld_table_96
200   https://www.boxofficemojo.com/title/tt8385474/credits/?ref=bo_tt_tab
dataframe shape:  (96, 15)
200   https://www.boxofficemojo.com/release/rl2986771969/?ref_=bo_yld_table_97
200   https://www

200   https://www.boxofficemojo.com/release/rl788956673/?ref_=bo_yld_table_137
200   https://www.boxofficemojo.com/title/tt9019352/credits/?ref=bo_tt_tab
dataframe shape:  (137, 15)
200   https://www.boxofficemojo.com/release/rl2567276033/?ref_=bo_yld_table_138
200   https://www.boxofficemojo.com/title/tt2365580/credits/?ref=bo_tt_tab
dataframe shape:  (138, 15)
200   https://www.boxofficemojo.com/release/rl1611236865/?ref_=bo_yld_table_139
200   https://www.boxofficemojo.com/title/tt8760684/credits/?ref=bo_tt_tab
dataframe shape:  (139, 15)
200   https://www.boxofficemojo.com/release/rl2282063361/?ref_=bo_yld_table_140
200   https://www.boxofficemojo.com/title/tt6476140/credits/?ref=bo_tt_tab
dataframe shape:  (140, 15)
200   https://www.boxofficemojo.com/release/rl3574171137/?ref_=bo_yld_table_141
200   https://www.boxofficemojo.com/title/tt7456310/credits/?ref=bo_tt_tab
dataframe shape:  (141, 15)
200   https://www.boxofficemojo.com/release/rl3641017857/?ref_=bo_yld_table_142
200   

200   https://www.boxofficemojo.com/release/rl3406333441/?ref_=bo_yld_table_182
200   https://www.boxofficemojo.com/title/tt6521876/credits/?ref=bo_tt_tab
dataframe shape:  (182, 15)
200   https://www.boxofficemojo.com/release/rl1779009025/?ref_=bo_yld_table_183
200   https://www.boxofficemojo.com/title/tt8879946/credits/?ref=bo_tt_tab
dataframe shape:  (183, 15)
200   https://www.boxofficemojo.com/release/rl2466678273/?ref_=bo_yld_table_184
200   https://www.boxofficemojo.com/title/tt7212726/credits/?ref=bo_tt_tab
dataframe shape:  (184, 15)
200   https://www.boxofficemojo.com/release/rl654870017/?ref_=bo_yld_table_185
200   https://www.boxofficemojo.com/title/tt7721800/credits/?ref=bo_tt_tab
dataframe shape:  (185, 15)
200   https://www.boxofficemojo.com/release/rl168330753/?ref_=bo_yld_table_186
200   https://www.boxofficemojo.com/title/tt8151874/credits/?ref=bo_tt_tab
dataframe shape:  (186, 15)
200   https://www.boxofficemojo.com/release/rl1024165377/?ref_=bo_yld_table_187
200   h

200   https://www.boxofficemojo.com/release/rl2382923265/?ref_=bo_yld_table_227
200   https://www.boxofficemojo.com/title/tt10442108/credits/?ref=bo_tt_tab
dataframe shape:  (227, 15)
200   https://www.boxofficemojo.com/release/rl1426556417/?ref_=bo_yld_table_228
200   https://www.boxofficemojo.com/title/tt7137380/credits/?ref=bo_tt_tab
dataframe shape:  (228, 15)
200   https://www.boxofficemojo.com/release/rl3825567233/?ref_=bo_yld_table_229
200   https://www.boxofficemojo.com/title/tt3750872/credits/?ref=bo_tt_tab
dataframe shape:  (229, 15)
200   https://www.boxofficemojo.com/release/rl3792274945/?ref_=bo_yld_table_230
200   https://www.boxofficemojo.com/title/tt7374952/credits/?ref=bo_tt_tab
dataframe shape:  (230, 15)
200   https://www.boxofficemojo.com/release/rl2886436353/?ref_=bo_yld_table_231
200   https://www.boxofficemojo.com/title/tt8258074/credits/?ref=bo_tt_tab
dataframe shape:  (231, 15)
200   https://www.boxofficemojo.com/release/rl2969994753/?ref_=bo_yld_table_232
200 

200   https://www.boxofficemojo.com/release/rl4161504769/?ref_=bo_yld_table_272
200   https://www.boxofficemojo.com/title/tt0078748/credits/?ref=bo_tt_tab
dataframe shape:  (272, 15)
200   https://www.boxofficemojo.com/release/rl1258718721/?ref_=bo_yld_table_273
200   https://www.boxofficemojo.com/title/tt8359848/credits/?ref=bo_tt_tab
dataframe shape:  (273, 15)
200   https://www.boxofficemojo.com/release/rl1292338689/?ref_=bo_yld_table_274
200   https://www.boxofficemojo.com/title/tt6675244/credits/?ref=bo_tt_tab
dataframe shape:  (274, 15)
200   https://www.boxofficemojo.com/release/rl4137452033/?ref_=bo_yld_table_275
200   https://www.boxofficemojo.com/title/tt9430698/credits/?ref=bo_tt_tab
dataframe shape:  (275, 15)
200   https://www.boxofficemojo.com/release/rl201885185/?ref_=bo_yld_table_276
200   https://www.boxofficemojo.com/title/tt10260672/credits/?ref=bo_tt_tab
dataframe shape:  (276, 15)
200   https://www.boxofficemojo.com/release/rl3339290113/?ref_=bo_yld_table_277
200  

200   https://www.boxofficemojo.com/release/rl3708388865/?ref_=bo_yld_table_317
200   https://www.boxofficemojo.com/title/tt5791098/credits/?ref=bo_tt_tab
dataframe shape:  (317, 15)
200   https://www.boxofficemojo.com/release/rl3355477505/?ref_=bo_yld_table_318
200   https://www.boxofficemojo.com/title/tt4218572/credits/?ref=bo_tt_tab
dataframe shape:  (318, 15)
200   https://www.boxofficemojo.com/release/rl2366146049/?ref_=bo_yld_table_319
200   https://www.boxofficemojo.com/title/tt0113824/credits/?ref=bo_tt_tab
dataframe shape:  (319, 15)
200   https://www.boxofficemojo.com/release/rl1560839681/?ref_=bo_yld_table_320
200   https://www.boxofficemojo.com/title/tt0087544/credits/?ref=bo_tt_tab
dataframe shape:  (320, 15)
200   https://www.boxofficemojo.com/release/rl1157858817/?ref_=bo_yld_table_321
200   https://www.boxofficemojo.com/title/tt3104988/credits/?ref=bo_tt_tab
dataframe shape:  (321, 15)
200   https://www.boxofficemojo.com/release/rl1913357825/?ref_=bo_yld_table_322
200  

200   https://www.boxofficemojo.com/release/rl1611171329/?ref_=bo_yld_table_362
200   https://www.boxofficemojo.com/title/tt7778680/credits/?ref=bo_tt_tab
dataframe shape:  (362, 15)
200   https://www.boxofficemojo.com/release/rl4043933185/?ref_=bo_yld_table_363
200   https://www.boxofficemojo.com/title/tt6705860/credits/?ref=bo_tt_tab
dataframe shape:  (363, 15)
200   https://www.boxofficemojo.com/release/rl520848897/?ref_=bo_yld_table_364
200   https://www.boxofficemojo.com/title/tt0291350/credits/?ref=bo_tt_tab
dataframe shape:  (364, 15)
200   https://www.boxofficemojo.com/release/rl621381121/?ref_=bo_yld_table_365
200   https://www.boxofficemojo.com/title/tt5323662/credits/?ref=bo_tt_tab
dataframe shape:  (365, 15)
200   https://www.boxofficemojo.com/release/rl4144596481/?ref_=bo_yld_table_366
200   https://www.boxofficemojo.com/title/tt10050136/credits/?ref=bo_tt_tab
dataframe shape:  (366, 15)
200   https://www.boxofficemojo.com/release/rl3590948353/?ref_=bo_yld_table_367
200   

200   https://www.boxofficemojo.com/release/rl1728677377/?ref_=bo_yld_table_407
200   https://www.boxofficemojo.com/title/tt4669296/credits/?ref=bo_tt_tab
dataframe shape:  (407, 15)
200   https://www.boxofficemojo.com/release/rl2192212481/?ref_=bo_yld_table_408
200   https://www.boxofficemojo.com/title/tt11448176/credits/?ref=bo_tt_tab
dataframe shape:  (408, 15)
200   https://www.boxofficemojo.com/release/rl3030811137/?ref_=bo_yld_table_409
200   https://www.boxofficemojo.com/title/tt0098635/credits/?ref=bo_tt_tab
dataframe shape:  (409, 15)
200   https://www.boxofficemojo.com/release/rl4060644865/?ref_=bo_yld_table_410
200   https://www.boxofficemojo.com/title/tt8228884/credits/?ref=bo_tt_tab
dataframe shape:  (410, 15)
200   https://www.boxofficemojo.com/release/rl4211508737/?ref_=bo_yld_table_411
200   https://www.boxofficemojo.com/title/tt6183104/credits/?ref=bo_tt_tab
dataframe shape:  (411, 15)
200   https://www.boxofficemojo.com/release/rl1258915329/?ref_=bo_yld_table_412
200 

200   https://www.boxofficemojo.com/release/rl2013824513/?ref_=bo_yld_table_453
200   https://www.boxofficemojo.com/title/tt8999864/credits/?ref=bo_tt_tab
dataframe shape:  (451, 15)
200   https://www.boxofficemojo.com/release/rl1477019137/?ref_=bo_yld_table_454
200   https://www.boxofficemojo.com/title/tt7117594/credits/?ref=bo_tt_tab
dataframe shape:  (452, 15)
200   https://www.boxofficemojo.com/release/rl2903082497/?ref_=bo_yld_table_455
200   https://www.boxofficemojo.com/title/tt4481066/credits/?ref=bo_tt_tab
dataframe shape:  (453, 15)
200   https://www.boxofficemojo.com/release/rl1805615617/?ref_=bo_yld_table_456
200   https://www.boxofficemojo.com/title/tt0934553/credits/?ref=bo_tt_tab
dataframe shape:  (454, 15)
200   https://www.boxofficemojo.com/release/rl2594079233/?ref_=bo_yld_table_457
200   https://www.boxofficemojo.com/title/tt11197154/credits/?ref=bo_tt_tab
dataframe shape:  (455, 15)
200   https://www.boxofficemojo.com/release/rl2147583489/?ref_=bo_yld_table_458
200 

200   https://www.boxofficemojo.com/title/tt11100434/credits/?ref=bo_tt_tab
dataframe shape:  (495, 15)
200   https://www.boxofficemojo.com/release/rl2433189377/?ref_=bo_yld_table_499
200   https://www.boxofficemojo.com/title/tt4645358/credits/?ref=bo_tt_tab
dataframe shape:  (496, 15)
200   https://www.boxofficemojo.com/release/rl1795589633/?ref_=bo_yld_table_500
200   https://www.boxofficemojo.com/title/tt8286894/credits/?ref=bo_tt_tab
dataframe shape:  (497, 15)
200   https://www.boxofficemojo.com/release/rl1124828673/?ref_=bo_yld_table_501
200   https://www.boxofficemojo.com/title/tt10980000/credits/?ref=bo_tt_tab
dataframe shape:  (498, 15)
200   https://www.boxofficemojo.com/release/rl2282259969/?ref_=bo_yld_table_502
200   https://www.boxofficemojo.com/title/tt8510206/credits/?ref=bo_tt_tab
dataframe shape:  (499, 15)
200   https://www.boxofficemojo.com/release/rl772310529/?ref_=bo_yld_table_503
200   https://www.boxofficemojo.com/title/tt8328740/credits/?ref=bo_tt_tab
dataframe

200   https://www.boxofficemojo.com/title/tt1657517/credits/?ref=bo_tt_tab
dataframe shape:  (540, 15)
200   https://www.boxofficemojo.com/release/rl134841857/?ref_=bo_yld_table_544
200   https://www.boxofficemojo.com/title/tt5749596/credits/?ref=bo_tt_tab
dataframe shape:  (541, 15)
200   https://www.boxofficemojo.com/release/rl1208714753/?ref_=bo_yld_table_545
200   https://www.boxofficemojo.com/title/tt10954574/credits/?ref=bo_tt_tab
dataframe shape:  (542, 15)
200   https://www.boxofficemojo.com/release/rl4127753729/?ref_=bo_yld_table_546
200   https://www.boxofficemojo.com/title/tt8956390/credits/?ref=bo_tt_tab
dataframe shape:  (543, 15)
200   https://www.boxofficemojo.com/release/rl503875073/?ref_=bo_yld_table_547
200   https://www.boxofficemojo.com/title/tt10234494/credits/?ref=bo_tt_tab
dataframe shape:  (544, 15)
200   https://www.boxofficemojo.com/release/rl2382988801/?ref_=bo_yld_table_548
200   https://www.boxofficemojo.com/title/tt1389098/credits/?ref=bo_tt_tab
dataframe 

200   https://www.boxofficemojo.com/title/tt9018916/credits/?ref=bo_tt_tab
dataframe shape:  (585, 15)
200   https://www.boxofficemojo.com/release/rl1778812417/?ref_=bo_yld_table_589
200   https://www.boxofficemojo.com/title/tt5942864/credits/?ref=bo_tt_tab
dataframe shape:  (586, 15)
200   https://www.boxofficemojo.com/release/rl0558593/?ref_=bo_yld_table_590
200   https://www.boxofficemojo.com/title/tt0103776/credits/?ref=bo_tt_tab
dataframe shape:  (587, 15)
200   https://www.boxofficemojo.com/release/rl3959981569/?ref_=bo_yld_table_591
200   https://www.boxofficemojo.com/title/tt3823098/credits/?ref=bo_tt_tab
dataframe shape:  (588, 15)
200   https://www.boxofficemojo.com/release/rl1426621953/?ref_=bo_yld_table_592
200   https://www.boxofficemojo.com/title/tt6522668/credits/?ref=bo_tt_tab
dataframe shape:  (589, 15)
200   https://www.boxofficemojo.com/release/rl2343207425/?ref_=bo_yld_table_593
200   https://www.boxofficemojo.com/title/tt8130904/credits/?ref=bo_tt_tab
dataframe sha

200   https://www.boxofficemojo.com/title/tt4587656/credits/?ref=bo_tt_tab
dataframe shape:  (630, 15)
200   https://www.boxofficemojo.com/release/rl1762362881/?ref_=bo_yld_table_634
200   https://www.boxofficemojo.com/title/tt10551420/credits/?ref=bo_tt_tab
dataframe shape:  (631, 15)
200   https://www.boxofficemojo.com/release/rl4027287041/?ref_=bo_yld_table_635
200   https://www.boxofficemojo.com/title/tt10572668/credits/?ref=bo_tt_tab
dataframe shape:  (632, 15)
200   https://www.boxofficemojo.com/release/rl2785576449/?ref_=bo_yld_table_636
200   https://www.boxofficemojo.com/title/tt5859882/credits/?ref=bo_tt_tab
dataframe shape:  (633, 15)
200   https://www.boxofficemojo.com/release/rl3356001793/?ref_=bo_yld_table_637
200   https://www.boxofficemojo.com/title/tt9617456/credits/?ref=bo_tt_tab
dataframe shape:  (634, 15)
200   https://www.boxofficemojo.com/release/rl856196609/?ref_=bo_yld_table_638
200   https://www.boxofficemojo.com/title/tt8923500/credits/?ref=bo_tt_tab
dataframe

200   https://www.boxofficemojo.com/title/tt6838918/credits/?ref=bo_tt_tab
dataframe shape:  (675, 15)
200   https://www.boxofficemojo.com/release/rl2644673025/?ref_=bo_yld_table_679
200   https://www.boxofficemojo.com/title/tt8863066/credits/?ref=bo_tt_tab
dataframe shape:  (676, 15)
200   https://www.boxofficemojo.com/release/rl1846052353/?ref_=bo_yld_table_680
200   https://www.boxofficemojo.com/title/tt0118688/credits/?ref=bo_tt_tab
dataframe shape:  (677, 15)
200   https://www.boxofficemojo.com/release/rl2064090625/?ref_=bo_yld_table_681
200   https://www.boxofficemojo.com/title/tt6263618/credits/?ref=bo_tt_tab
dataframe shape:  (678, 15)
200   https://www.boxofficemojo.com/release/rl1644791297/?ref_=bo_yld_table_682
200   https://www.boxofficemojo.com/title/tt8725958/credits/?ref=bo_tt_tab
dataframe shape:  (679, 15)
200   https://www.boxofficemojo.com/release/rl3976824321/?ref_=bo_yld_table_683
200   https://www.boxofficemojo.com/title/tt8426594/credits/?ref=bo_tt_tab
dataframe 

200   https://www.boxofficemojo.com/title/tt8586088/credits/?ref=bo_tt_tab
dataframe shape:  (720, 15)
200   https://www.boxofficemojo.com/release/rl3641214465/?ref_=bo_yld_table_724
200   https://www.boxofficemojo.com/title/tt0050634/credits/?ref=bo_tt_tab
dataframe shape:  (721, 15)
200   https://www.boxofficemojo.com/release/rl1694926337/?ref_=bo_yld_table_725
200   https://www.boxofficemojo.com/title/tt4003440/credits/?ref=bo_tt_tab
dataframe shape:  (722, 15)
200   https://www.boxofficemojo.com/release/rl4245390849/?ref_=bo_yld_table_726
200   https://www.boxofficemojo.com/title/tt10703826/credits/?ref=bo_tt_tab
dataframe shape:  (723, 15)
200   https://www.boxofficemojo.com/release/rl2768799233/?ref_=bo_yld_table_727
200   https://www.boxofficemojo.com/title/tt3009772/credits/?ref=bo_tt_tab
dataframe shape:  (724, 15)
200   https://www.boxofficemojo.com/release/rl336299521/?ref_=bo_yld_table_728
200   https://www.boxofficemojo.com/title/tt10844816/credits/?ref=bo_tt_tab
dataframe

200   https://www.boxofficemojo.com/title/tt5796176/credits/?ref=bo_tt_tab
dataframe shape:  (765, 15)
200   https://www.boxofficemojo.com/release/rl285967873/?ref_=bo_yld_table_769
200   https://www.boxofficemojo.com/title/tt6071092/credits/?ref=bo_tt_tab
dataframe shape:  (766, 15)
200   https://www.boxofficemojo.com/release/rl520652289/?ref_=bo_yld_table_770
200   https://www.boxofficemojo.com/title/tt0082964/credits/?ref=bo_tt_tab
dataframe shape:  (767, 15)
200   https://www.boxofficemojo.com/release/rl3708323329/?ref_=bo_yld_table_771
200   https://www.boxofficemojo.com/title/tt8159562/credits/?ref=bo_tt_tab
dataframe shape:  (768, 15)
200   https://www.boxofficemojo.com/release/rl3087500801/?ref_=bo_yld_table_772
200   https://www.boxofficemojo.com/title/tt6977442/credits/?ref=bo_tt_tab
dataframe shape:  (769, 15)
200   https://www.boxofficemojo.com/release/rl1812366849/?ref_=bo_yld_table_773
200   https://www.boxofficemojo.com/title/tt5668850/credits/?ref=bo_tt_tab
dataframe sh

200   https://www.boxofficemojo.com/release/rl419857921/?ref_=bo_yld_table_814
200   https://www.boxofficemojo.com/title/tt5891122/credits/?ref=bo_tt_tab
dataframe shape:  (810, 15)
200   https://www.boxofficemojo.com/release/rl939951617/?ref_=bo_yld_table_815
200   https://www.boxofficemojo.com/title/tt5501104/credits/?ref=bo_tt_tab
dataframe shape:  (811, 15)
200   https://www.boxofficemojo.com/release/rl2198570497/?ref_=bo_yld_table_816
200   https://www.boxofficemojo.com/title/tt6619250/credits/?ref=bo_tt_tab
dataframe shape:  (812, 15)
200   https://www.boxofficemojo.com/release/rl2533918209/?ref_=bo_yld_table_817
200   https://www.boxofficemojo.com/title/tt8247470/credits/?ref=bo_tt_tab
dataframe shape:  (813, 15)
200   https://www.boxofficemojo.com/release/rl2030798337/?ref_=bo_yld_table_818
200   https://www.boxofficemojo.com/title/tt8287690/credits/?ref=bo_tt_tab
dataframe shape:  (814, 15)
200   https://www.boxofficemojo.com/release/rl3600581121/?ref_=bo_yld_table_819
200   h

200   https://www.boxofficemojo.com/release/rl2752218625/?ref_=bo_yld_table_859
200   https://www.boxofficemojo.com/title/tt6305892/credits/?ref=bo_tt_tab
dataframe shape:  (855, 15)
200   https://www.boxofficemojo.com/release/rl470517249/?ref_=bo_yld_table_860
200   https://www.boxofficemojo.com/title/tt9047474/credits/?ref=bo_tt_tab
dataframe shape:  (856, 15)
200   https://www.boxofficemojo.com/release/rl470320641/?ref_=bo_yld_table_861
200   https://www.boxofficemojo.com/title/tt6871338/credits/?ref=bo_tt_tab
dataframe shape:  (857, 15)
200   https://www.boxofficemojo.com/release/rl2426438145/?ref_=bo_yld_table_862
200   https://www.boxofficemojo.com/title/tt4558200/credits/?ref=bo_tt_tab
dataframe shape:  (858, 15)
200   https://www.boxofficemojo.com/release/rl2316011009/?ref_=bo_yld_table_863
200   https://www.boxofficemojo.com/title/tt6983908/credits/?ref=bo_tt_tab
dataframe shape:  (859, 15)
200   https://www.boxofficemojo.com/release/rl3288958465/?ref_=bo_yld_table_864
200   h

200   https://www.boxofficemojo.com/release/rl1241941505/?ref_=bo_yld_table_904
200   https://www.boxofficemojo.com/title/tt3076510/credits/?ref=bo_tt_tab
dataframe shape:  (900, 15)
200   https://www.boxofficemojo.com/release/rl3506603521/?ref_=bo_yld_table_905
200   https://www.boxofficemojo.com/title/tt5815492/credits/?ref=bo_tt_tab
dataframe shape:  (901, 15)
200   https://www.boxofficemojo.com/release/rl17401345/?ref_=bo_yld_table_906
200   https://www.boxofficemojo.com/title/tt6314766/credits/?ref=bo_tt_tab
dataframe shape:  (902, 15)
200   https://www.boxofficemojo.com/release/rl268928513/?ref_=bo_yld_table_907
200   https://www.boxofficemojo.com/title/tt8184202/credits/?ref=bo_tt_tab
dataframe shape:  (903, 15)
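
In the log above, the rank in each release URL (`bo_yld_table_N`) and the row count in the following `dataframe shape` line start out equal and then drift apart: rank 453 coincides with 451 rows, and rank 907 with 903 rows. The log does not say why; one possible reading is that a few releases did not yield a row. A minimal sketch for measuring that gap from captured log text follows. The function name `rank_row_gaps` and both regular expressions are assumptions made for illustration, and the sample lines are copied from the output above.

```python
import re

# Rank suffix of a release URL, e.g. ".../?ref_=bo_yld_table_453"
RANK_RE = re.compile(r"bo_yld_table_(\d+)")
# Progress line, e.g. "dataframe shape:  (451, 15)"
SHAPE_RE = re.compile(r"dataframe shape:\s*\((\d+),\s*\d+\)")

def rank_row_gaps(log_text):
    """Return (rank, rows, rank - rows) for each release followed by a shape line."""
    gaps = []
    rank = None
    for line in log_text.splitlines():
        match = RANK_RE.search(line)
        if match:
            rank = int(match.group(1))
            continue
        match = SHAPE_RE.search(line)
        if match and rank is not None:
            rows = int(match.group(1))
            gaps.append((rank, rows, rank - rows))
            rank = None
    return gaps

# Lines copied from the log above.
sample_log = """\
200   https://www.boxofficemojo.com/release/rl1728677377/?ref_=bo_yld_table_407
200   https://www.boxofficemojo.com/title/tt4669296/credits/?ref=bo_tt_tab
dataframe shape:  (407, 15)
200   https://www.boxofficemojo.com/release/rl2013824513/?ref_=bo_yld_table_453
200   https://www.boxofficemojo.com/title/tt8999864/credits/?ref=bo_tt_tab
dataframe shape:  (451, 15)
200   https://www.boxofficemojo.com/release/rl268928513/?ref_=bo_yld_table_907
200   https://www.boxofficemojo.com/title/tt8184202/credits/?ref=bo_tt_tab
dataframe shape:  (903, 15)
"""

for rank, rows, gap in rank_row_gaps(sample_log):
    print(rank, rows, gap)
```

On these sample lines the gap grows from 0 to 2 to 4, so about four ranked releases are missing from the final frame.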


In [5]:
df.head()

Unnamed: 0,index,Movie_Title,Domestic_Distributor,Domestic_Total_Gross,Runtime,Rating,Release_Date,Budget,Cast1,Cast2,Cast3,Cast4,Director,Writer,Producer,Cinematographer
0,0,Avengers: Endgame,Walt Disney Studios Motion Pictures,858373000,181,PG-13,2019-04-24,356000000,Robert Downey Jr.,Chris Evans,Mark Ruffalo,Chris Hemsworth,Anthony Russo,Christopher Markus,Kevin Feige,Trent Opaloch
1,0,The Lion King,Walt Disney Studios Motion Pictures,543638043,118,PG,2019-07-12,260000000,Donald Glover,Beyoncé,Seth Rogen,Chiwetel Ejiofor,Jon Favreau,Jeff Nathanson,Jon Favreau,Caleb Deschanel
2,0,Toy Story 4,Walt Disney Studios Motion Pictures,434038008,100,G,2019-06-20,200000000,Tom Hanks,Tim Allen,Annie Potts,Tony Hale,Josh Cooley,John Lasseter,Mark Nielsen,Jean-Claude Kalache
3,0,Frozen II,Walt Disney Studios Motion Pictures,477373578,103,PG,2019-11-20,150000000,Kristen Bell,Idina Menzel,Josh Gad,Jonathan Groff,Chris Buck,Jennifer Lee,Peter Del Vecho,
4,0,Captain Marvel,Walt Disney Studios Motion Pictures,426829839,123,PG-13,2019-03-06,160000000,Brie Larson,Samuel L. Jackson,Ben Mendelsohn,Jude Law,Anna Boden,Anna Boden,Kevin Feige,Ben Davis


In [6]:
df.to_pickle(f'./data/mojo_{year}_movies.pkl')
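
`to_pickle` does not create missing directories, so the call above fails if `./data` does not exist yet. The sketch below creates the directory first and reads the file back with `pd.read_pickle` to confirm the round trip. The helper name `save_year` and the small `demo` frame are assumptions for illustration; the file name pattern follows the cell above.

```python
from pathlib import Path
import tempfile

import pandas as pd

def save_year(df, year, data_dir="./data"):
    """Pickle df as mojo_<year>_movies.pkl, creating data_dir if needed.

    Hypothetical helper for illustration.
    """
    out_dir = Path(data_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"mojo_{year}_movies.pkl"
    df.to_pickle(out_path)
    return out_path

# Tiny stand-in for the scraped frame, written to a temporary directory.
demo = pd.DataFrame({"Movie_Title": ["Avengers: Endgame"], "Runtime": [181]})
with tempfile.TemporaryDirectory() as tmp:
    path = save_year(demo, 2019, data_dir=tmp)
    restored = pd.read_pickle(path)
```

Only unpickle files you produced yourself, since loading a pickle can execute arbitrary code.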

In [7]:
df.shape

(903, 16)
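
The final shape has 16 columns, while the scraping log reported 15. The extra one appears to be the `index` column visible in `df.head()`, which is 0 in every displayed row and so looks like a leftover from combining one-row frames (an inference from the output, not something the module documents). A minimal sketch of removing it, shown on a small stand-in frame rather than the scraped data:

```python
import pandas as pd

# Stand-in frame that mimics the leftover all-zero "index" column.
demo = pd.DataFrame({
    "index": [0, 0, 0],
    "Movie_Title": ["Avengers: Endgame", "The Lion King", "Toy Story 4"],
    "Runtime": [181, 118, 100],
})

# Drop the leftover column and renumber the rows 0..n-1.
cleaned = demo.drop(columns=["index"]).reset_index(drop=True)
print(cleaned.shape)  # (3, 2)
```

Applied to the scraped frame, the same call would bring it back to the 15 columns listed in the log.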