This repository has been archived by the owner on Jan 15, 2022. It is now read-only.

Movie Analysis Mini-Project for CZ1115 - Introduction to Data Science and Artificial Intelligence

nicklimmm/movie-analysis

Welcome to the movie-analysis repository

About

This is a mini-project for CZ1115 (Introduction to Data Science and Artificial Intelligence) that analyzes movie data retrieved from The Movie Database (TMDB) API. For a detailed walkthrough, view the source code in the following order:

  1. Data Extraction
  2. Data Visualization
  3. Data Resampling and Splitting
  4. Logistic Regression
  5. Neural Network
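As a rough illustration of step 3 (resampling and splitting), the sketch below splits a toy dataset while keeping class proportions similar, then balances only the training portion. It uses plain random oversampling as a simple stand-in; the project itself reports using SMOTEENN from imblearn, and its notebooks may order or implement these steps differently. The function names, the 80/20 split, and the toy data are all assumptions made for this example.

```python
import random
from collections import Counter


def stratified_split(rows, labels, test_fraction=0.2, seed=0):
    """Split rows into train/test parts, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    train, test = [], []
    for label, members in by_class.items():
        members = members[:]  # copy so the caller's data is not shuffled
        rng.shuffle(members)
        n_test = max(1, round(len(members) * test_fraction))
        test += [(row, label) for row in members[:n_test]]
        train += [(row, label) for row in members[n_test:]]
    return train, test


def random_oversample(pairs, seed=0):
    """Duplicate randomly chosen minority rows until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(label for _, label in pairs)
    target = max(counts.values())
    balanced = list(pairs)
    for label, count in counts.items():
        pool = [pair for pair in pairs if pair[1] == label]
        balanced += [rng.choice(pool) for _ in range(target - count)]
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 90 "not good" movies (0) and 10 "good" movies (1).
rows = [[i] for i in range(100)]
labels = [0] * 90 + [1] * 10

train, test = stratified_split(rows, labels)
train_balanced = random_oversample(train)  # the test set is left untouched
train_counts = Counter(label for _, label in train_balanced)
test_counts = Counter(label for _, label in test)
print(train_counts, test_counts)
```

Resampling only the training portion keeps the test set representative of the original class imbalance, so the reported metrics are not inflated by duplicated rows.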

Contributors

  • @nicklimmm - Neural Networks, Data Resampling, Data Extraction
  • @TCaken - Logistic Regression
  • @coolcoolwhat - Data Visualization, Data Extraction

Problem Definition

  • Can we predict whether a movie is good (rating above 7.2) from its attributes?
  • Which model predicts this best?
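The first question turns a continuous rating into a binary target. A minimal sketch of that labeling step follows, assuming "above 7.2" means strictly greater than 7.2; the function name and the sample ratings are hypothetical and not taken from the project's notebooks.

```python
GOOD_RATING_THRESHOLD = 7.2  # threshold stated in the problem definition


def is_good(rating, threshold=GOOD_RATING_THRESHOLD):
    """Return 1 if the rating is strictly above the threshold, else 0."""
    return 1 if rating > threshold else 0


# Hypothetical ratings, for illustration only.
ratings = [6.5, 7.2, 7.3, 8.1]
labels = [is_good(r) for r in ratings]
print(labels)  # [0, 0, 1, 1]
```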

Models Used

  1. Logistic Regression
  2. Neural Networks

Conclusion

  • Popularity and budget have a low linear correlation with ratings (watch out for bandwagons 🤣)
  • The popularity of the cast and crew has a higher linear correlation with ratings
  • Resampling the imbalanced data improved model performance, especially on the minority class
  • Logistic Regression did not perform well with non-linearly correlated variables
  • Neural Networks combined with the SMOTEENN resampling method consistently did well in predicting good movies across 100 training attempts (around 72% accuracy and 70% recall)
  • Yes, it is possible to predict whether a movie is good with an acceptable level of accuracy and recall
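The "linear correlation value" in the first two points is presumably the Pearson correlation coefficient; that is an assumption, since the README does not name the measure. The sketch below computes it by hand on made-up numbers and shows why a low value does not rule out a relationship: a perfectly predictable but curved relationship can still score zero, which is the kind of pattern a linear model struggles with.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up data: y = 2x is perfectly linear, y = x^2 is symmetric around zero.
linear = pearson([1, 2, 3, 4], [2, 4, 6, 8])
curved = pearson([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
print(linear, curved)
```

Here `linear` comes out at 1 and `curved` at 0, even though both targets are fully determined by the input.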

What did we learn from this project?

  • Handling imbalanced datasets using resampling methods and the imblearn package
  • Neural Networks, Keras, and TensorFlow
  • Logistic Regression from sklearn
  • API usage
  • Other packages such as tqdm, json, and requests
  • Collaborating using GitHub
  • Concepts of Precision, Recall, and F1 Score
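For reference, the metrics in the last point can be computed directly from true and predicted labels. The sketch below uses the standard definitions with "good movie" (1) as the positive class; the labels are invented for illustration and are not results from this project.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Invented labels: 3 good movies and 5 not-good movies.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(precision, recall, f1, accuracy)
```

On imbalanced data, accuracy alone can look good even when the minority class is mostly missed, which is why recall on the "good" class is worth reporting alongside it.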
