Mykola-Yatsyk/imdb-spark-project


imdb-spark-project

Big Data diploma project using PySpark, with support by

Description

Run project

To run the project, open a terminal, navigate to the project directory, and execute the shell script by typing ./run.sh and pressing Enter.

Solution tasks
  • Task 1. Get all titles of series/movies etc. that are available in Ukrainian.
  • Task 2. Get the list of people’s names, who were born in the 19th century.
  • Task 3. Get titles of all movies that last more than 2 hours.
  • Task 4. Get names of people, corresponding movies/series and characters they played in those films.
  • Task 5. Get information about how many adult movies/series etc. there are per region. Get the top 100 of them from the region with the biggest count to the region with the smallest one.
  • Task 6. Get information about how many episodes there are in each TV series. Get the top 50 of them, starting from the TV series with the largest number of episodes.
  • Task 7. Get 10 titles of the most popular movies/series etc. by each decade.
  • Task 8. Get 10 titles of the most popular movies/series etc. by each genre.
