### Web Scraping Project on Rotten Tomatoes
- 140 ESSENTIAL ACTION MOVIES TO WATCH NOW BY ROTTEN TOMATOES
    - https://editorial.rottentomatoes.com/guide/140-essential-action-movies-to-watch-now
    

    
    
<img src = 'img.png'>

### To do
   - Extract the details of all 140 movies and present them in a structured form.
    
       - Make a GET request with the `get` function from the requests library
       - Choose a parser and create a BeautifulSoup object
       - Extract movie title
       - Extract movie year
       - Extract movie score
       - Extract critics consensus
       - Extract director names
       - Extract cast information
       - Extract synopsis
       - Extract adjusted score
       - Represent the data in a structured form
       - Export the data to CSV, Excel and JSON

### Setup

- pip install requests
     - a Python library for making HTTP requests
- pip install beautifulsoup4
     - a Python library for pulling data out of HTML and XML files
- pip install pandas
     - a Python library that offers various data structures and operations for manipulating numerical data and time series

### Import relevant libraries

In [1]:
import requests
from bs4 import BeautifulSoup

In [2]:
#Define the url of the site
base_url = "https://editorial.rottentomatoes.com/guide/140-essential-action-movies-to-watch-now/"

### Making a get request

In [3]:
#Sending a request to the webpage
response = requests.get(base_url)

In [242]:
#check the response code
response.status_code

200
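
A status code of 200 means the request succeeded. As a slightly more defensive variant, the request could be wrapped in a small helper. This is only a sketch: the name `fetch_html`, the 10-second timeout and the User-Agent string are assumptions made for illustration and are not part of the original notebook.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the raw HTML bytes of `url`, raising on HTTP errors."""
    # A timeout stops the call from hanging indefinitely on a slow server.
    # The User-Agent value below is a hypothetical placeholder.
    headers = {"User-Agent": "rt-scraper-example/0.1"}
    response = requests.get(url, headers=headers, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx/5xx responses.
    response.raise_for_status()
    return response.content
```

With this helper, `html = fetch_html(base_url)` would stand in for the separate request, status check and `response.content` steps.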

In [5]:
#Get the HTML from the webpage
html = response.content

### Choosing a parser

In [6]:
# Convert the HTML to a BeautifulSoup object
soup = BeautifulSoup(html, 'html.parser')

In [7]:
# Exporting the HTML to a file
with open('Rotten_tomatoes_html_parsere.html', 'wb') as file:
    file.write(soup.prettify('utf-8'))
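
Once the page is saved, it can be parsed again from disk without sending another request. The sketch below shows the idea on a tiny stand-in file written to a temporary directory; `sample_html`, `saved_page` and `cached_soup` are illustrative names, and in the notebook the path would be the file written in the cell above.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup

# Stand-in for the saved page (hypothetical content).
sample_html = b"<html><body><h2>Running Scared</h2></body></html>"

with tempfile.TemporaryDirectory() as tmp_dir:
    saved_page = Path(tmp_dir) / "page.html"
    saved_page.write_bytes(sample_html)

    # Re-create the soup from disk instead of requesting the page again.
    cached_soup = BeautifulSoup(saved_page.read_bytes(), "html.parser")

first_heading = cached_soup.find("h2").string
print(first_heading)
```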

### Obtaining the element containing all data

In [8]:
soup

<!DOCTYPE html>

<html class="hitim" lang="en-US">
<head prefix="og: http://ogp.me/ns# flixstertomatoes: http://ogp.me/ns/apps/flixstertomatoes#">
<meta content="text/html; charset=utf-8" http-equiv="content-type"/>
<meta content="From John Wick and Die Hard to Mad Max and Atomic Blonde, these best action movies ever will thrill you and get the adrenaline pumping!" property="og.description"/>
<meta content="From John Wick and Die Hard to Mad Max and Atomic Blonde, these best action movies ever will thrill you and get the adrenaline pumping!" name="description"/>
<meta content="140 Essential Action Movies To Watch Now" property="og:title"/>
<meta content="article" property="og:type"/>
<meta content="https://prd-rteditorial.s3.us-west-2.amazonaws.com/wp-content/uploads/2019/06/01073141/600Crank.jpg" property="og:image"/>
<meta content="https://editorial.rottentomatoes.com/guide/140-essential-action-movies-to-watch-now/" property="og:url"/>
<meta content="175594" name="editorialID"/>
<met

In [9]:
# Find all div tags on the webpage containing the information we want to scrape
div_set = soup.find_all('div', class_ ="col-sm-18 col-full-xs countdown-item-content")
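
Passing several classes as one string to `class_` compares against the exact value of the `class` attribute, so the match depends on the classes appearing in that order. A CSS selector on a single class is one possible alternative. The sketch below uses hypothetical two-item markup, not the real page, to show the difference.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the second div lists its classes in a different order.
sample_html = """
<div class="col-sm-18 col-full-xs countdown-item-content"><h2>First</h2></div>
<div class="countdown-item-content col-sm-18 col-full-xs"><h2>Second</h2></div>
"""
sample_soup = BeautifulSoup(sample_html, "html.parser")

# A multi-class string is compared against the exact attribute value.
exact_matches = sample_soup.find_all(
    "div", class_="col-sm-18 col-full-xs countdown-item-content"
)

# A CSS selector matches on one class, regardless of order or neighbours.
selector_matches = sample_soup.select("div.countdown-item-content")

print(len(exact_matches), len(selector_matches))
```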

In [10]:
# Extracting all 'h2' tags
headings = [div.find('h2') for div in div_set]
headings

[<h2><a href="https://www.rottentomatoes.com/m/1018009-running_scared">Running Scared</a> <span class="subtle start-year">(1986)</span> <span class="icon tiny rotten" title="Rotten"></span> <span class="tMeterScore">59%</span></h2>,
 <h2><a href="https://www.rottentomatoes.com/m/equilibrium">Equilibrium</a> <span class="subtle start-year">(2002)</span> <span class="icon tiny rotten" title="Rotten"></span> <span class="tMeterScore">41%</span></h2>,
 <h2><a href="https://www.rottentomatoes.com/m/hero">Hero</a> <span class="subtle start-year">(2002)</span> <span class="icon tiny certified" title="Certified Fresh"></span> <span class="tMeterScore">94%</span></h2>,
 <h2><a href="https://www.rottentomatoes.com/m/1017666-road_house">Road House</a> <span class="subtle start-year">(1989)</span> <span class="icon tiny rotten" title="Rotten"></span> <span class="tMeterScore">37%</span></h2>,
 <h2><a href="https://www.rottentomatoes.com/m/unstoppable-2010">Unstoppable</a> <span class="subtle start

### Extracting movie title

In [11]:
# Filtering only the link containing the title and Extracting the title string
movie_titles = [heading.find('a').string for heading in headings]
movie_titles

### Extracting movie year

In [12]:
# Keep only the span that contains the year and extract its string
years = [heading.find('span', class_ = "subtle start-year").string for heading in headings]
# Remove the surrounding parentheses
year = [y.strip('()') for y in years]
# Convert to int
year = [int(y) for y in year]
year

[1986,
 2002,
 2002,
 1989,
 2010,
 1971,
 2017,
 1986,
 1990,
 2004,
 2005,
 2017,
 1992,
 1971,
 1986,
 1997,
 2012,
 1999,
 2005,
 1998,
 2014,
 2016,
 1997,
 1988,
 1998,
 1995,
 1995,
 1987,
 1985,
 2007,
 2006,
 2010,
 2011,
 1989,
 1992,
 1996,
 1968,
 2008,
 1978,
 1998,
 1988,
 1993,
 2012,
 2007,
 1979,
 1997,
 2010,
 1991,
 1996,
 2014,
 2008,
 2006,
 1994,
 1993,
 2015,
 1985,
 2001,
 2014,
 1997,
 1986,
 2017,
 1995,
 2004,
 1984,
 2003,
 2004,
 1993,
 1981,
 2000,
 2004,
 2010,
 1992,
 1989,
 2004,
 1986,
 2008,
 2018,
 2017,
 1964,
 1976,
 2017,
 1972,
 2014,
 2003,
 1971,
 2015,
 1990,
 1992,
 1971,
 2014,
 2003,
 1993,
 2018,
 2010,
 1995,
 2002,
 2019,
 2012,
 2002,
 2008,
 1997,
 1985,
 2008,
 2011,
 2011,
 1987,
 1996,
 1987,
 2017,
 2006,
 2017,
 1994,
 1989,
 2014,
 1973,
 1985,
 1982,
 2015,
 1984,
 2000,
 2003,
 1994,
 1994,
 1994,
 2014,
 2000,
 1987,
 2007,
 1990,
 1981,
 1995,
 2011,
 2018,
 1981,
 1986,
 1992,
 1999,
 1991,
 1988,
 2015]
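
The `strip('()')` and `int()` steps assume every heading carries a year in the form `(1986)`. If that assumption ever fails, `int()` raises an error. A regular expression is one way to make the parsing tolerant; `parse_year` and `YEAR_PATTERN` are hypothetical names used only in this sketch.

```python
import re

# Four digits wrapped in parentheses, e.g. "(1986)".
YEAR_PATTERN = re.compile(r"\((\d{4})\)")


def parse_year(text):
    """Return the four-digit year inside parentheses, or None if absent."""
    if text is None:
        return None
    match = YEAR_PATTERN.search(text)
    return int(match.group(1)) if match else None


print(parse_year("(1986)"))
print(parse_year("TBA"))
```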

### Extracting movie score

In [13]:
# Filtering only the spans containing the score and Extracting the score string
scores = [heading.find('span', class_ = "tMeterScore").string for heading in headings]
# Removing the '%' sign
score = [score.strip('%') for score in scores]
#converting the score to integer
score = [int(s) for s in score]
score

[59,
 41,
 94,
 37,
 87,
 86,
 85,
 70,
 69,
 46,
 53,
 93,
 91,
 97,
 57,
 55,
 67,
 61,
 60,
 61,
 60,
 90,
 78,
 46,
 57,
 42,
 59,
 66,
 70,
 67,
 61,
 72,
 93,
 72,
 79,
 67,
 98,
 71,
 94,
 69,
 85,
 67,
 91,
 91,
 87,
 66,
 91,
 70,
 68,
 92,
 59,
 61,
 69,
 62,
 51,
 93,
 73,
 75,
 71,
 75,
 79,
 80,
 80,
 83,
 85,
 86,
 91,
 85,
 88,
 93,
 95,
 88,
 88,
 90,
 93,
 94,
 91,
 94,
 99,
 95,
 93,
 83,
 90,
 81,
 97,
 82,
 89,
 96,
 89,
 91,
 85,
 96,
 96,
 87,
 76,
 90,
 94,
 79,
 84,
 86,
 92,
 85,
 94,
 93,
 77,
 80,
 68,
 91,
 89,
 94,
 92,
 100,
 98,
 82,
 95,
 69,
 87,
 94,
 100,
 78,
 85,
 73,
 94,
 84,
 86,
 97,
 80,
 92,
 82,
 94,
 88,
 87,
 97,
 95,
 97,
 94,
 88,
 93,
 94,
 97]

### Extracting critics consensus

In [56]:
# Keep only the div that contains the critics consensus and extract its text
critics_cons = [div.find('div', class_ = 'info critics-consensus') for div in div_set]
cons_text = [cons.text for cons in critics_cons]
cons_text

['Critics Consensus: Running Scared struggles to strike a consistent balance between violent action and humor, but the chemistry between its well-matched leads keeps things entertaining.',
 'Critics Consensus: Equilibrium is a reheated mishmash of other sci-fi movies.',
 'Critics Consensus: With death-defying action sequences and epic historic sweep, Hero offers everything a martial arts fan could ask for.',
 "Critics Consensus: Whether Road House is simply bad or so bad it's good depends largely on the audience's fondness for Swayze -- and tolerance for violently cheesy action.",
 "Critics Consensus: As fast, loud, and relentless as the train at the center of the story, Unstoppable is perfect popcorn entertainment -- and director Tony Scott's best movie in years.",
 'Critics Consensus: This is the man that would risk his neck for his brother, man. Can you dig it?',
 'Critics Consensus: The Villainess offers enough pure kinetic thrills to satisfy genre enthusiasts -- and carve out a bl

#### Text processing
- Remove the leading 'Critics Consensus: ' label from each string

In [49]:
# Save the label to a variable
common_phrase = 'Critics Consensus: '

# Length of the label
common_length = len(common_phrase)

In [147]:
consensus_text = [cons[common_length:] if cons.startswith(common_phrase) else cons for cons in cons_text ]
consensus_text

['Running Scared struggles to strike a consistent balance between violent action and humor, but the chemistry between its well-matched leads keeps things entertaining.',
 'Equilibrium is a reheated mishmash of other sci-fi movies.',
 'With death-defying action sequences and epic historic sweep, Hero offers everything a martial arts fan could ask for.',
 "Whether Road House is simply bad or so bad it's good depends largely on the audience's fondness for Swayze -- and tolerance for violently cheesy action.",
 "As fast, loud, and relentless as the train at the center of the story, Unstoppable is perfect popcorn entertainment -- and director Tony Scott's best movie in years.",
 'This is the man that would risk his neck for his brother, man. Can you dig it?',
 'The Villainess offers enough pure kinetic thrills to satisfy genre enthusiasts -- and carve out a bloody niche for itself in modern Korean action cinema.',
 "People hate Highlander because it's cheesy, bombastic, and absurd. And peop
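
On Python 3.9 or later, `str.removeprefix` does the same job without computing the label length by hand. The sketch below uses two sample strings, the second of which is a hypothetical case without the label.

```python
common_phrase = "Critics Consensus: "

# Two sample strings: one with the label, one without (hypothetical).
sample_texts = [
    "Critics Consensus: Equilibrium is a reheated mishmash of other sci-fi movies.",
    "No consensus yet.",
]

# str.removeprefix (Python 3.9+) leaves strings without the prefix unchanged.
cleaned = [text.removeprefix(common_phrase) for text in sample_texts]
print(cleaned)
```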

### Extracting directors

In [64]:
directors = [div.find('div', class_ = 'info director') for div in div_set]

In [68]:
# Some entries may have no director link, so fall back to None
director = [None if d.find('a') is None else d.find('a').string for d in directors]
director

['Peter Hyams',
 'Kurt Wimmer',
 'Zhang Yimou',
 'Rowdy Herrington',
 'Tony Scott',
 'Gordon Parks',
 'Jeong Byeong-gil',
 'Russell Mulcahy',
 'Renny Harlin',
 'Jon Turteltaub',
 'Prachya Pinkaew',
 'Coralie Fargeat',
 'Robert Rodriguez',
 'King Hu',
 'Tony Scott',
 'Simon West',
 'Simon West',
 'Stephen Sommers',
 'Doug Liman',
 'Brett Ratner',
 'Antoine Fuqua',
 'Anthony Russo',
 'Wolfgang Petersen',
 'Newt Arnold',
 'Stephen Norrington',
 'Michael Bay',
 'John McTiernan',
 'Paul Michael Glaser',
 'Andrew Davis',
 'Michael Davis',
 'Mark Neveldine',
 'Robert Rodriguez',
 'Nicolas Winding Refn',
 'Tim Burton',
 'Andrew Davis',
 'Roland Emmerich',
 'Peter Yates',
 'Timur Bekmambetov',
 'Richard Donner',
 'John Frankenheimer',
 'John Carpenter',
 'Renny Harlin',
 'Joss Whedon',
 'Edgar Wright',
 'Walter Hill',
 'Paul Verhoeven',
 'José Padilha',
 'Kathryn Bigelow',
 'Renny Harlin',
 'Adam Wingard',
 'Pierre Morel',
 'Zack Snyder',
 'James Cameron',
 'Marco Brambilla',
 'Ilya Naishuller'
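
Note that `find('a')` returns only the first link, so if a movie lists more than one director only the first name is kept. A possible variant joins every linked name, in the same way the cast is handled. The markup below is hypothetical (the `Directed By:` label and the placeholder names are assumptions), modelled on the `info director` divs.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one div with two directors, one div with none.
sample_html = """
<div class="info director"><span class="descriptor">Directed By:</span>
<a href="#">First Director</a>, <a href="#">Second Director</a></div>
<div class="info director"><span class="descriptor">Directed By:</span></div>
"""
sample_divs = BeautifulSoup(sample_html, "html.parser").find_all(
    "div", class_="info director"
)

# Join every linked name; an empty join gives "", which falls back to None.
all_directors = [
    ", ".join(link.get_text(strip=True) for link in div.find_all("a")) or None
    for div in sample_divs
]
print(all_directors)
```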

### Extracting cast Info

In [87]:
casts = [div.find('div', class_ = "info cast") for div in div_set]

In [88]:
casts[0]

<div class="info cast">
<span class="descriptor">Starring:</span> <a class="" href="//www.rottentomatoes.com/celebrity/gregory_hines">Gregory Hines</a>, <a class="" href="//www.rottentomatoes.com/celebrity/billy_crystal">Billy Crystal</a>, <a class="" href="//www.rottentomatoes.com/celebrity/jimmy_smits">Jimmy Smits</a>, <a class="" href="//www.rottentomatoes.com/celebrity/steven_bauer">Steven Bauer</a></div>

In [100]:
cast_links = casts[0].find_all('a')
cast_links

[<a class="" href="//www.rottentomatoes.com/celebrity/gregory_hines">Gregory Hines</a>,
 <a class="" href="//www.rottentomatoes.com/celebrity/billy_crystal">Billy Crystal</a>,
 <a class="" href="//www.rottentomatoes.com/celebrity/jimmy_smits">Jimmy Smits</a>,
 <a class="" href="//www.rottentomatoes.com/celebrity/steven_bauer">Steven Bauer</a>]

In [106]:
# Use a for loop to join each movie's cast names into a single string
result = []
for cast in casts: 
    cast_links = cast.find_all('a')
    cast_names = [link.string for link in cast_links]
    cast = ', '.join(cast_names)
    result.append(cast)
print(result)

['Gregory Hines, Billy Crystal, Jimmy Smits, Steven Bauer', 'Christian Bale, Emily Watson, Taye Diggs, Angus Macfadyen', 'Jet Li, Tony Leung Chiu Wai, Maggie Cheung Man-yuk, Donnie Yen', 'Patrick Swayze, Kelly Lynch, Sam Elliott, Ben Gazzara', 'Denzel Washington, Chris Pine, Rosario Dawson, Kevin Dunn', 'Richard Roundtree, Moses Gunn, Christopher St. John, Charles Cioffi', 'Kim Ok-bin, Shin Ha-kyun, Sung Joon, Kim Seo-hyung', 'Christopher Lambert, Sean Connery, Roxanne Hart, Clancy Brown', 'Bruce Willis, Bonnie Bedelia, William Atherton, Reginald VelJohnson', 'Nicolas Cage, Diane Kruger, Justin Bartha, Sean Bean', 'Tony Jaa, Johnny Nguyen, Nathan Jones, Petchtai Wongkamlao', 'Matilda Lutz, Kevin Janssens, Vincent Colombe, Guillaume Bouchède', 'Carlos Gallardo, Consuelo Gómez, Reinol Martinez, Peter Marquardt', 'Feng Hsu, Chun Shih, Pai Ying, Roy Chiao', 'Tom Cruise, Kelly McGillis, Anthony Edwards, Val Kilmer', 'Nicolas Cage, John Cusack, John Malkovich, Steve Buscemi', 'Sylvester Stal

In [145]:
# Use a list comprehension to join each movie's cast names into a single string
cast = [', '.join(link.string for link in cast.find_all('a')) for cast in casts]
cast

['Gregory Hines, Billy Crystal, Jimmy Smits, Steven Bauer',
 'Christian Bale, Emily Watson, Taye Diggs, Angus Macfadyen',
 'Jet Li, Tony Leung Chiu Wai, Maggie Cheung Man-yuk, Donnie Yen',
 'Patrick Swayze, Kelly Lynch, Sam Elliott, Ben Gazzara',
 'Denzel Washington, Chris Pine, Rosario Dawson, Kevin Dunn',
 'Richard Roundtree, Moses Gunn, Christopher St. John, Charles Cioffi',
 'Kim Ok-bin, Shin Ha-kyun, Sung Joon, Kim Seo-hyung',
 'Christopher Lambert, Sean Connery, Roxanne Hart, Clancy Brown',
 'Bruce Willis, Bonnie Bedelia, William Atherton, Reginald VelJohnson',
 'Nicolas Cage, Diane Kruger, Justin Bartha, Sean Bean',
 'Tony Jaa, Johnny Nguyen, Nathan Jones, Petchtai Wongkamlao',
 'Matilda Lutz, Kevin Janssens, Vincent Colombe, Guillaume Bouchède',
 'Carlos Gallardo, Consuelo Gómez, Reinol Martinez, Peter Marquardt',
 'Feng Hsu, Chun Shih, Pai Ying, Roy Chiao',
 'Tom Cruise, Kelly McGillis, Anthony Edwards, Val Kilmer',
 'Nicolas Cage, John Cusack, John Malkovich, Steve Buscemi',


### Extracting synopsis

In [134]:
# The synopsis is located inside a 'div' tag with the class 'info synopsis'
synopsis = [div.find('div', class_ = "info synopsis") for div in div_set]

In [135]:
#inspecting the element
synopsis[0]

<div class="info synopsis"><span class="descriptor">Synopsis:</span> Ray and Danny (Gregory Hines, Billy Crystal) are two Chicago police detectives hot on the trail of drug kingpin Julio...<a class="" data-pageheader="" href="https://www.rottentomatoes.com/m/1018009-running_scared" target="_top"> [More]</a></div>

In [138]:
# The text is the second child
synopsis[0].contents[1]

'Ray and Danny (Gregory Hines, Billy Crystal) are two Chicago police detectives hot on the trail of drug kingpin Julio...'

In [141]:
# Extracting the text
synopsis_text = [syn.contents[1].strip() for syn in synopsis]
synopsis_text

['Ray and Danny (Gregory Hines, Billy Crystal) are two Chicago police detectives hot on the trail of drug kingpin Julio...',
 'In a futuristic world, a regime has eliminated war by suppressing emotions: books, art and music are strictly forbidden and...',
 'In this visually arresting martial arts epic set in ancient China, an unnamed fighter (Jet Li) is being honored for...',
 'The Double Deuce is the meanest, loudest and rowdiest bar south of the Mason-Dixon Line, and Dalton (Patrick Swayze) has...',
 'When a massive, unmanned locomotive roars out of control, the threat is more ominous than just a derailment. The train...',
 'John Shaft (Richard Roundtree) is the ultimate in suave black detectives. He first finds himself up against Bumpy (Moses Gunn),...',
 'Honed from childhood to be an elite assassin, Sook-hee embarks on a rampage of violence and revenge to finally earn...',
 'When the mystical Russell Nash (Christopher Lambert) kills a man in a sword fight in a New York City parkin
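
Indexing with `contents[1]` relies on the text always being the second child of the div. A position-independent sketch is to collect only the div's direct text nodes, which skips the `Synopsis:` label and the `[More]` link. The helper name `synopsis_from` is an assumption; the markup is the first synopsis div inspected above.

```python
from bs4 import BeautifulSoup, NavigableString

# Markup copied from the first synopsis div shown above.
sample_html = (
    '<div class="info synopsis"><span class="descriptor">Synopsis:</span> '
    "Ray and Danny (Gregory Hines, Billy Crystal) are two Chicago police "
    "detectives hot on the trail of drug kingpin Julio..."
    '<a class="" data-pageheader="" '
    'href="https://www.rottentomatoes.com/m/1018009-running_scared" '
    'target="_top"> [More]</a></div>'
)
sample_div = BeautifulSoup(sample_html, "html.parser").find("div")


def synopsis_from(div):
    """Join the div's direct text nodes, skipping nested tags."""
    # Only direct children that are plain strings are kept, so the
    # label span and the "[More]" link are ignored.
    pieces = [child for child in div.children if isinstance(child, NavigableString)]
    return "".join(pieces).strip()


print(synopsis_from(sample_div))
```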

### Extracting adjusted score


#### The adjusted score does not appear to be displayed on the web page, but it is present in the HTML
   - According to a quick Google search, the adjusted score takes into account the number of reviews, the year of release, and the average Tomatometer scores of other films released contemporaneously. It is primarily used when comparing or ranking films across several decades.
   - Maybe the site no longer uses this metric and relies on the percentage score alone.

In [121]:
# The adjusted scores can be found in a div with class 'info countdown-adjusted-score'
adj_scores = [div.find('div', class_ = "info countdown-adjusted-score") for div in div_set]

In [122]:
#inspect the element
adj_scores[0]

<div class="info countdown-adjusted-score"><span class="descriptor">Adjusted Score: </span>59009% <span class="glyphicon glyphicon-question-sign" data-html="true" data-original-title="The Adjusted Score comes from a weighted formula (Bayesian) that we use that accounts for variation in the number of reviews per movie." data-placement="top" data-toggle="tooltip" rel="tooltip" title=""></span></div>

In [124]:
# By inspection we see that the string we are looking for is the second child of the 'div' tag
adj_scores[0].contents[1] #notice the white space

'59009% '

In [129]:
# Extracting the string (without '%' sign and extra space)
scores_clean = [score.contents[1].strip('% ') for score in adj_scores]
scores_clean

['59009',
 '42592',
 '102656',
 '40028',
 '93763',
 '88924',
 '90990',
 '73021',
 '72285',
 '52231',
 '56400',
 '99773',
 '96212',
 '98390',
 '63042',
 '57647',
 '72405',
 '65058',
 '68278',
 '63500',
 '67736',
 '118224',
 '80154',
 '46394',
 '62506',
 '45929',
 '63859',
 '67919',
 '70820',
 '72965',
 '64693',
 '79670',
 '103418',
 '78119',
 '80820',
 '71601',
 '101574',
 '79849',
 '101450',
 '71256',
 '88264',
 '71211',
 '106750',
 '100055',
 '90697',
 '69788',
 '92093',
 '73927',
 '68478',
 '95540',
 '65402',
 '70698',
 '71887',
 '61799',
 '60174',
 '92134',
 '76855',
 '85083',
 '73788',
 '79847',
 '106407',
 '80313',
 '84671',
 '86661',
 '89259',
 '95898',
 '93402',
 '90206',
 '88056',
 '104599',
 '100004',
 '67863',
 '94118',
 '98122',
 '92518',
 '105243',
 '91144',
 '126304',
 '104128',
 '98801',
 '129165',
 '83268',
 '102769',
 '87288',
 '105199',
 '92724',
 '90865',
 '98775',
 '93216',
 '104775',
 '93373',
 '102583',
 '129316',
 '101217',
 '81554',
 '97986',
 '128404',
 '86263',

In [144]:
# Convert the strings to integers
final_adj_scores = [int(s) for s in scores_clean]
final_adj_scores

[59009,
 42592,
 102656,
 40028,
 93763,
 88924,
 90990,
 73021,
 72285,
 52231,
 56400,
 99773,
 96212,
 98390,
 63042,
 57647,
 72405,
 65058,
 68278,
 63500,
 67736,
 118224,
 80154,
 46394,
 62506,
 45929,
 63859,
 67919,
 70820,
 72965,
 64693,
 79670,
 103418,
 78119,
 80820,
 71601,
 101574,
 79849,
 101450,
 71256,
 88264,
 71211,
 106750,
 100055,
 90697,
 69788,
 92093,
 73927,
 68478,
 95540,
 65402,
 70698,
 71887,
 61799,
 60174,
 92134,
 76855,
 85083,
 73788,
 79847,
 106407,
 80313,
 84671,
 86661,
 89259,
 95898,
 93402,
 90206,
 88056,
 104599,
 100004,
 67863,
 94118,
 98122,
 92518,
 105243,
 91144,
 126304,
 104128,
 98801,
 129165,
 83268,
 102769,
 87288,
 105199,
 92724,
 90865,
 98775,
 93216,
 104775,
 93373,
 102583,
 129316,
 101217,
 81554,
 97986,
 128404,
 86263,
 89617,
 85330,
 96126,
 87186,
 108306,
 103466,
 85280,
 83900,
 71892,
 95770,
 109334,
 105858,
 122974,
 74779,
 99840,
 88391,
 98829,
 70186,
 87115,
 106921,
 104608,
 84781,
 92953,
 768

### Representing the data in a structured form

- Using a pandas DataFrame

In [142]:
import pandas as pd

In [148]:
movies_info = pd.DataFrame()

movies_info['Movie_Title'] = movie_titles
movies_info['Year'] = year
movies_info['Score'] =  score
movies_info['Adjusted_Score'] = final_adj_scores
movies_info['Director'] = director
movies_info['Synopsis'] = synopsis_text
movies_info['Cast'] = cast
movies_info['Critics_Consensus'] = consensus_text
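
Because each column comes from a separate list, the rows only line up if every list has exactly one entry per movie. A quick length check before building the DataFrame makes a mismatch easier to diagnose. The sketch below uses the first two rows as stand-in data; `columns`, `lengths` and `sample_info` are illustrative names.

```python
import pandas as pd

# First two rows of the scraped lists, used as stand-in data.
columns = {
    "Movie_Title": ["Running Scared", "Equilibrium"],
    "Year": [1986, 2002],
    "Score": [59, 41],
}

# Every list must have one entry per movie, otherwise rows would misalign.
lengths = {name: len(values) for name, values in columns.items()}
assert len(set(lengths.values())) == 1, f"Column lengths differ: {lengths}"

# Build the DataFrame in one step from the dict of lists.
sample_info = pd.DataFrame(columns)
print(sample_info.shape)
```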

In [149]:
#view top 5 rows
movies_info.head()

| | Movie_Title | Year | Score | Adjusted_Score | Director | Synopsis | Cast | Critics_Consensus |
|---|---|---|---|---|---|---|---|---|
| 0 | Running Scared | 1986 | 59 | 59009 | Peter Hyams | Ray and Danny (Gregory Hines, Billy Crystal) a... | Gregory Hines, Billy Crystal, Jimmy Smits, Ste... | Running Scared struggles to strike a consisten... |
| 1 | Equilibrium | 2002 | 41 | 42592 | Kurt Wimmer | In a futuristic world, a regime has eliminated... | Christian Bale, Emily Watson, Taye Diggs, Angu... | Equilibrium is a reheated mishmash of other sc... |
| 2 | Hero | 2002 | 94 | 102656 | Zhang Yimou | In this visually arresting martial arts epic s... | Jet Li, Tony Leung Chiu Wai, Maggie Cheung Man... | With death-defying action sequences and epic h... |
| 3 | Road House | 1989 | 37 | 40028 | Rowdy Herrington | The Double Deuce is the meanest, loudest and r... | Patrick Swayze, Kelly Lynch, Sam Elliott, Ben ... | Whether Road House is simply bad or so bad it'... |
| 4 | Unstoppable | 2010 | 87 | 93763 | Tony Scott | When a massive, unmanned locomotive roars out ... | Denzel Washington, Chris Pine, Rosario Dawson,... | As fast, loud, and relentless as the train at ... |


In [154]:
#set column width option
pd.set_option('display.max_colwidth', None)

In [155]:
#view top 5 rows
movies_info.head()

| | Movie_Title | Year | Score | Adjusted_Score | Director | Synopsis | Cast | Critics_Consensus |
|---|---|---|---|---|---|---|---|---|
| 0 | Running Scared | 1986 | 59 | 59009 | Peter Hyams | Ray and Danny (Gregory Hines, Billy Crystal) are two Chicago police detectives hot on the trail of drug kingpin Julio... | Gregory Hines, Billy Crystal, Jimmy Smits, Steven Bauer | Running Scared struggles to strike a consistent balance between violent action and humor, but the chemistry between its well-matched leads keeps things entertaining. |
| 1 | Equilibrium | 2002 | 41 | 42592 | Kurt Wimmer | In a futuristic world, a regime has eliminated war by suppressing emotions: books, art and music are strictly forbidden and... | Christian Bale, Emily Watson, Taye Diggs, Angus Macfadyen | Equilibrium is a reheated mishmash of other sci-fi movies. |
| 2 | Hero | 2002 | 94 | 102656 | Zhang Yimou | In this visually arresting martial arts epic set in ancient China, an unnamed fighter (Jet Li) is being honored for... | Jet Li, Tony Leung Chiu Wai, Maggie Cheung Man-yuk, Donnie Yen | With death-defying action sequences and epic historic sweep, Hero offers everything a martial arts fan could ask for. |
| 3 | Road House | 1989 | 37 | 40028 | Rowdy Herrington | The Double Deuce is the meanest, loudest and rowdiest bar south of the Mason-Dixon Line, and Dalton (Patrick Swayze) has... | Patrick Swayze, Kelly Lynch, Sam Elliott, Ben Gazzara | Whether Road House is simply bad or so bad it's good depends largely on the audience's fondness for Swayze -- and tolerance for violently cheesy action. |
| 4 | Unstoppable | 2010 | 87 | 93763 | Tony Scott | When a massive, unmanned locomotive roars out of control, the threat is more ominous than just a derailment. The train... | Denzel Washington, Chris Pine, Rosario Dawson, Kevin Dunn | As fast, loud, and relentless as the train at the center of the story, Unstoppable is perfect popcorn entertainment -- and director Tony Scott's best movie in years. |


In [151]:
#shape of dataframe
movies_info.shape

(140, 8)

### Exporting data to CSV, Excel and JSON

In [None]:
#export to csv
movies_info.to_csv('movies_info.csv', index = False, header = True)

#export to excel (writing .xlsx needs an Excel engine such as openpyxl to be installed)
movies_info.to_excel('movies_info.xlsx', index = False, header = True)

#export to json
movies_info.to_json('movies_info.json')
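
By default `to_json` writes a column-oriented object. If one JSON object per movie is preferred, `orient="records"` is an option. The sketch below writes a two-row stand-in DataFrame to a temporary directory and reads the files back as a quick check; the file names and variable names are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Two-row stand-in for movies_info.
sample_info = pd.DataFrame(
    {"Movie_Title": ["Running Scared", "Equilibrium"], "Year": [1986, 2002]}
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "sample_info.csv"
    json_path = Path(tmp_dir) / "sample_info.json"

    sample_info.to_csv(csv_path, index=False)
    # orient="records" writes a list with one JSON object per movie.
    sample_info.to_json(json_path, orient="records", indent=2)

    # Read both files back to confirm that nothing was lost.
    csv_round_trip = pd.read_csv(csv_path)
    json_round_trip = pd.read_json(json_path)

print(csv_round_trip.shape, json_round_trip.shape)
```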