IMDB_watchlist_stats

Data analysis and visualization project based on my personal watchlist from IMDB during all the time

In this project, my IMDB watchlist was analyzed, consisting of all the films that I watched in my life (approximately)

New version of this analysis

These dashboards have the following types of summary visualization that interact with each other through the filter:

the area chart of the number of pictures watched by their release date
the donut chart the number of pictures by their release decade and percentage distribution
the donut chart the number of pictures by their runtime and percentage distribution
the donut chart the number of pictures by their genres and percentage distribution
the table grouping by project types
the tables / lists of pictures displayed during filtering (LIST, LIST OF MOVIES)
the table grouping by project directors
the histogram of the dependence of the films rating on the year, titles filtered by director

According to these criteria, visualization was compiled and presented on 2 dashboards

DASHBOARDS

https://public.tableau.com/app/profile/sergei5857/viz/watchlist_new_1/MAINDASH

https://public.tableau.com/app/profile/sergei5857/viz/watchlist_new_2/DIRECTORDASH

This project is being implemented as a training project in order to master the basic principles of data analysis and visualization

Tools used: Oracle SQL, PL/SQL developer, Tableau and Excel

- Views of dashboards:

Map of watched

In addition to this analysis I decided to make the dashboard with a map of all the movies and TV shows we watched, countries of origin map

It's not as accurate as we would like, since many countries can participate in the production of the film. But still it can reflect the real affiliation of the film to the country

Since downloading a watchlist from IMDB doen't take into account the country of origin I had to resort to parsing movie pages by their URL using Python and Jupiter Notebook for that

DASHBOARD

https://public.tableau.com/app/profile/sergei5857/viz/watchlist_new_map/COUNTRIESDASH

Tools used: Oracle SQL, PL/SQL developer, Tableau, Jupiter Notebook, Python, pandas, BeautifulSoup and Excel

- View of dashboard:

Python script for setting the entire procedure to autorun on a schedule

To automate the uploading and getting the final dataset for the dashboard, I created a python script

It does these following steps:

load the watchlist into csv file from IMDB
find a difference between the new list and current list (only new items)
parse countries for each new item
merge new rows with countries into the main table in the database
select unified SQL query for the dashboard
save the results into excel file that connected to Tableau

this application also keeps logs of this operation and writes them to the .log file

####### Launch of this python app is scheduled every 2 weeks by TaskTillDawn, it also shows logs after that

Tools used: Oracle database, Oracle SQL, Tableau, Jupiter Notebook, Python (libs: pandas, BeautifulSoup, cx_Oracle, logging), TaskTillDawn and Excel

- Python script

https://github.com/Wreiler/IMDB_watchlist_stats/blob/main/parse_and_insert_oracle.py

Example of logs:

-----[INFO]: New unloading from IMDB - 07.01.2023-----
[07.01.2023 19:00:18 | INFO]: Unloading .csv from IMDB is completed
[07.01.2023 19:00:20 | INFO]: All counties are parsed.
[07.01.2023 19:00:20 | INFO]: Number of inserted rows = 2
[07.01.2023 19:00:20 | INFO]: Insert is completed!
[07.01.2023 19:00:21 | INFO]: Number of selected rows = 8629
[07.01.2023 19:00:21 | INFO]: Select is completed!
[07.01.2023 19:00:24 | INFO]: DataFrame is written to Tableau Excel File successfully. All is done!

Name		Name	Last commit message	Last commit date
Latest commit History 46 Commits
old_dash		old_dash
.gitattributes		.gitattributes
README.md		README.md
Tableau unified table.xlsx		Tableau unified table.xlsx
WATCHLIST.csv		WATCHLIST.csv
creation of types.sql		creation of types.sql
creation of watchlist table.sql		creation of watchlist table.sql
improved_global_numbers.sql		improved_global_numbers.sql
map_of_movies.png		map_of_movies.png
movie_stats_pkg.pck		movie_stats_pkg.pck
parse_and_insert_oracle.py		parse_and_insert_oracle.py
parse_country_for_watchlist.py		parse_country_for_watchlist.py
unified_query.sql		unified_query.sql
watchlist_new_work1.png		watchlist_new_work1.png
watchlist_new_work2.png		watchlist_new_work2.png
watchlist_with_countries.xlsx		watchlist_with_countries.xlsx

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IMDB_watchlist_stats

New version of this analysis

According to these criteria, visualization was compiled and presented on 2 dashboards

DASHBOARDS

Tools used: Oracle SQL, PL/SQL developer, Tableau and Excel

- Views of dashboards:

Map of watched

DASHBOARD

Tools used: Oracle SQL, PL/SQL developer, Tableau, Jupiter Notebook, Python, pandas, BeautifulSoup and Excel

- View of dashboard:

Python script for setting the entire procedure to autorun on a schedule

Tools used: Oracle database, Oracle SQL, Tableau, Jupiter Notebook, Python (libs: pandas, BeautifulSoup, cx_Oracle, logging), TaskTillDawn and Excel

- Python script

Example of logs:

About

Releases

Packages

Languages

Wreiler/IMDB_watchlist_stats

Folders and files

Latest commit

History

Repository files navigation

IMDB_watchlist_stats

New version of this analysis

According to these criteria, visualization was compiled and presented on 2 dashboards

DASHBOARDS

Tools used: Oracle SQL, PL/SQL developer, Tableau and Excel

- Views of dashboards:

Map of watched

DASHBOARD

Tools used: Oracle SQL, PL/SQL developer, Tableau, Jupiter Notebook, Python, pandas, BeautifulSoup and Excel

- View of dashboard:

Python script for setting the entire procedure to autorun on a schedule

Tools used: Oracle database, Oracle SQL, Tableau, Jupiter Notebook, Python (libs: pandas, BeautifulSoup, cx_Oracle, logging), TaskTillDawn and Excel

- Python script

Example of logs:

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages