# **IMDB Python web scraper**

This scraper retrieves top-rated movie data from the IMDB website and saves it to a CSV file.
The scraper uses the requests module to make an HTTP GET request to the top-rated movies page on the IMDB website. 
It then uses BeautifulSoup to parse the HTML content of the website and extract the movie title, year, 
and rating for each movie. The data is stored in a list of lists, with each inner list representing the data 
for a single movie.

The scraper then uses the csv module to write the scraped data to a CSV file called top-rated-movies.csv. 
The data is written to the file in comma-separated value format, with each row representing the data for a 
single movie and the first row containing the column names.
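
The file layout described above can be sketched with the standard-library `csv` module and an in-memory buffer in place of the real file. The single movie row is a placeholder for illustration, not scraped data, and the names `buffer` and `lines` are hypothetical.

```python
import csv
import io

# Placeholder row for illustration only; not real scraped data.
movies = [["Example Movie A", "2001", "8.5"]]

buffer = io.StringIO()  # in-memory stand-in for top-rated-movies.csv
writer = csv.writer(buffer)
writer.writerow(["Title", "Year", "Rating"])  # first row: column names
writer.writerows(movies)                      # one row per movie

# Split the written text into lines to inspect the layout.
lines = buffer.getvalue().splitlines()
print(lines)
```

Each inner list becomes one comma-separated line, and the header line comes first.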

## Code and Description

### Importing 
The `requests` package lets us make HTTP requests to the IMDB website, `BeautifulSoup` parses the HTML content of the page, `csv` reads and writes data in CSV (comma-separated values) format, and `pandas` holds the scraped rows in a DataFrame for inspection.

In [4]:
# Importing required packages
import csv

import requests
from bs4 import BeautifulSoup
import pandas as pd

### Requesting

The lines below request the top-rated movies page from the IMDB website and parse its HTML content.

In [5]:
# store the URL of the page to request in the 'url' variable
url = 'https://www.imdb.com/chart/top'

# request the website and retrieve its content
response = requests.get(url)

# parse the HTML content of the website and store the result in a variable called soup
soup = BeautifulSoup(response.content, 'html.parser')
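
The request above does not check whether it succeeded, and some sites reject requests that lack a browser-like `User-Agent` header. A minimal sketch of a more defensive fetch follows. The helper name `fetch_html`, the header string, and the timeout value are assumptions for illustration, not part of the original notebook.

```python
import requests

# Assumed values for illustration: the header string and the timeout are
# placeholders, not settings taken from the original notebook.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; example-scraper)"}
TIMEOUT_SECONDS = 10


def fetch_html(url):
    """Return the page body as text, raising an exception on an HTTP error status."""
    response = requests.get(url, headers=HEADERS, timeout=TIMEOUT_SECONDS)
    response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx responses
    return response.text
```

With this helper, the parsing step could become `soup = BeautifulSoup(fetch_html(url), 'html.parser')`, so a failed request stops the run instead of silently parsing an error page.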


### Extracting

In [20]:
# locate the table that holds the chart
movie_table = soup.find('table', {'class': 'chart full-width'})
movies = []
# skip the header row, then read one movie per table row
for row in movie_table.find_all('tr')[1:]:
    title_column = row.find('td', {'class': 'titleColumn'})
    title = title_column.find('a').text
    # the year is wrapped in parentheses, so strip them
    year = title_column.find('span', {'class': 'secondaryInfo'}).text.strip('()')
    rating = row.find('td', {'class': 'ratingColumn imdbRating'}).text.strip()
    movies.append([title, year, rating])
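
The loop above assumes the table and its columns are always present. If `soup.find` returns `None` (for example because the page markup has changed), the next call raises an `AttributeError`. The sketch below wraps the same extraction in a hypothetical `parse_movies` function that returns an empty list in that case. It runs against a small hand-written HTML snippet that mirrors the class names used above, and the live page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup mirroring the class names used by the loop above;
# the row content is a placeholder, not real scraped data.
SAMPLE_HTML = """
<table class="chart full-width">
  <tr><th>Rank and Title</th><th>Rating</th></tr>
  <tr>
    <td class="titleColumn">1. <a href="#">Example Movie A</a>
      <span class="secondaryInfo">(2001)</span></td>
    <td class="ratingColumn imdbRating"><strong>8.5</strong></td>
  </tr>
</table>
"""


def parse_movies(html):
    """Extract [title, year, rating] rows, or return [] if the table is missing."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "chart full-width"})
    if table is None:
        return []
    movies = []
    for row in table.find_all("tr")[1:]:  # skip the header row
        title_column = row.find("td", {"class": "titleColumn"})
        rating_column = row.find("td", {"class": "ratingColumn imdbRating"})
        if title_column is None or rating_column is None:
            continue  # skip rows that do not match the expected layout
        title = title_column.find("a").text
        year = title_column.find("span", {"class": "secondaryInfo"}).text.strip("()")
        rating = rating_column.text.strip()
        movies.append([title, year, rating])
    return movies


sample_movies = parse_movies(SAMPLE_HTML)
print(sample_movies)
```

Keeping the parsing in a function that takes an HTML string also makes it possible to test the extraction without contacting the website.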

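### Saving

The scraped rows are loaded into a pandas DataFrame for a quick preview and then written to top-rated-movies.csv with the csv module.

In [None]: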
# load the scraped rows into a DataFrame and preview the first 21 rows
df = pd.DataFrame(movies, columns=['Title', 'Year', 'Rating'])
print(df.head(21))

# write the header row followed by one row per movie
with open('top-rated-movies.csv', mode='w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Title', 'Year', 'Rating'])
    for movie in movies:
        writer.writerow(movie)
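
Since the rows are already in a DataFrame, pandas could also write the file directly. The sketch below is an alternative to the `csv` loop, not the notebook's own method. It uses a placeholder row and an in-memory buffer, and it assumes the year and rating strings are always numeric so they can be converted with `pd.to_numeric`.

```python
import io

import pandas as pd

# Placeholder row for illustration only; not real scraped data.
movies = [["Example Movie A", "2001", "8.5"]]

df = pd.DataFrame(movies, columns=["Title", "Year", "Rating"])

# Convert the text columns to numbers so they can be sorted and filtered.
# Assumption: every scraped year and rating is a clean numeric string.
df["Year"] = pd.to_numeric(df["Year"])
df["Rating"] = pd.to_numeric(df["Rating"])

buffer = io.StringIO()          # in-memory stand-in for the CSV file
df.to_csv(buffer, index=False)  # index=False omits the row-number column
csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```

To write to disk instead, the buffer could be replaced with a file name, as in `df.to_csv('top-rated-movies.csv', index=False)`.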