Flask App with Jupyter Notebook HTML version

ETL Project - The Best 50 Superhero Movies

ETL is a type of data integration that refers to the three steps (extract, transform, load) used to blend data from multiple sources. It's often used to build a data warehouse. During this process, data is taken (extracted) from a source system, converted (transformed) into a format that can be analyzed, and stored (loaded) into a data warehouse or other system. Extract, load, transform (ELT) is an alternate but related approach designed to push processing down to the database for improved performance.
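To make the three steps concrete, here is a minimal sketch using only the Python standard library. It is not the notebook's actual pipeline: the CSV content, the column names, the `movies` table, and the in-memory SQLite target are all placeholder assumptions chosen for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real run would read from a file, an API, or a scrape.
RAW_CSV = """title,year,gross_usd
Movie A,2008,585000000
Movie B,2012,
Movie C,2018,1347000000
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing gross and cast fields to numbers."""
    cleaned = []
    for row in rows:
        if not row["gross_usd"]:
            continue  # skip incomplete records
        cleaned.append((row["title"], int(row["year"]), int(row["gross_usd"])))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movies (title TEXT, year INTEGER, gross_usd INTEGER)"
    )
    conn.executemany("INSERT INTO movies VALUES (?, ?, ?)", records)
    conn.commit()


# In-memory SQLite stands in for the warehouse (an assumption, not the project's database).
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
row_count = conn.execute("SELECT COUNT(*) FROM movies").fetchone()[0]
```

In an ELT variant, `load` would run before `transform`, and the cleaning would be expressed as SQL inside the database.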

Project Proposal

Combine the Best 50 Superhero Movies dataset with the Superhero dataset for analysis and data visualization.

Three questions relating to the data (for all three, I decided to focus exclusively on American superheroes and movies):

Is there a gender bias in the superhero movie industry?
What are the most important variables in predicting a movie's profitability?
For production companies that were acquired, how much on average has each superhero earned?

Extract Data:

Data sources:
• Superheroes API
• Web Scraping
• Movies API
• Python package - IMDbPY

Data formats:
• CSV
• XLSX
• PostgreSQL database (via pgAdmin)
• HTML

Transform Data:
• cleaning
• joining
• filtering
• aggregating (group by, sort, average, count)
• deleting columns
• data visualization
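The transform steps listed above might look roughly like this in pandas. This is a sketch under stated assumptions: the README does not say which library the notebook uses, and the two small tables, their column names, and all values are hypothetical placeholders rather than the project's data.

```python
import pandas as pd

# Hypothetical stand-ins for the movies and superheroes tables.
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "hero": ["Hero X", "Hero Y", "Hero X", None],
    "gross_musd": [300.0, 150.0, 500.0, 80.0],
    "notes": ["", "", "", ""],
})
heroes = pd.DataFrame({
    "hero": ["Hero X", "Hero Y", "Hero Z"],
    "gender": ["Female", "Male", "Male"],
    "country": ["USA", "USA", "UK"],
})

# Cleaning: drop rows with no hero. Deleting columns: remove an unused one.
movies_clean = movies.dropna(subset=["hero"]).drop(columns=["notes"])

# Joining: attach hero attributes to each movie.
merged = movies_clean.merge(heroes, on="hero", how="inner")

# Filtering: keep American heroes only.
american = merged[merged["country"] == "USA"]

# Aggregating: group by gender, count movies, average the gross, then sort.
summary = (
    american.groupby("gender")
    .agg(movie_count=("title", "count"), avg_gross_musd=("gross_musd", "mean"))
    .sort_values("avg_gross_musd", ascending=False)
    .reset_index()
)
```

A grouped summary like `summary` is the kind of table that could feed a bar chart when looking at the gender question, though the notebook's own approach may differ.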

Load Data: Load the final dataset into the database.

Flask Application: Renders an image for each movie and superhero, along with a short movie summary, from the database.
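A minimal sketch of how a Flask route could read rows from a database and render them is shown below. It is not the code in `app.py`: SQLite is swapped in as a stand-in database, and the `movies` table, its columns, the inline template, and the seeded placeholder row are all assumptions made for illustration.

```python
import os
import sqlite3
import tempfile

from flask import Flask, render_template_string

# Hypothetical database file; the real app's database and schema may differ.
DB_PATH = os.path.join(tempfile.mkdtemp(), "movies.db")


def init_db():
    """Create the placeholder table and seed it with one example row."""
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute(
            "CREATE TABLE movies (title TEXT, hero TEXT, image_url TEXT, summary TEXT)"
        )
        conn.execute(
            "INSERT INTO movies VALUES (?, ?, ?, ?)",
            ("Movie A", "Hero X", "/static/movie_a.jpg", "A short placeholder summary."),
        )


# Inline template for brevity; a real app would keep this under templates/.
PAGE = """
<ul>
{% for m in movies %}
  <li>
    <img src="{{ m['image_url'] }}" alt="{{ m['title'] }}">
    <strong>{{ m['title'] }}</strong> ({{ m['hero'] }}): {{ m['summary'] }}
  </li>
{% endfor %}
</ul>
"""

app = Flask(__name__)


@app.route("/")
def index():
    # Open a connection per request and return rows addressable by column name.
    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row
    try:
        movies = conn.execute(
            "SELECT title, hero, image_url, summary FROM movies"
        ).fetchall()
    finally:
        conn.close()
    return render_template_string(PAGE, movies=movies)


init_db()
```

Saved as a module, this could be served locally with the `flask run` command; the image path assumes a file under the `static` folder.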

I deployed the app to Heroku: https://super-heroes.herokuapp.com/

The Jupyter Notebook HTML version, which covers the whole ETL process, is available under the ETL-Project(Jupyter Notebook) link in the app's navigation bar.


LinaSachuk/SuperHeroes-ETL

Folders and files

Latest commit

History

Repository files navigation

Flask App with Jupyter Notebook HTML version

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages