AnveshaM/Enhancing-performance-of-big-data-machine-learning-models-on-Google-Cloud-Platform

Enhancing-performance-of-big-data-machine-learning-models-on-Google-Cloud-Platform

The project is focused on parallelising pre-processing, measuring and machine learning in the cloud, as well as the evaluation and analysis of the cloud performance.

Dataset - A public dataset “Flowers” (3600 images, 5 classes) is used for the analysis.
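A minimal sketch of how class labels might be derived for such a dataset, assuming the images are stored one folder per class. The class names in `CLASSES` and the example paths are assumptions for illustration; this README only states that there are 5 classes.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumed class names; the README only states that there are 5 classes.
CLASSES = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]


def label_from_path(path: str) -> int:
    """Return the class index, taken from the image's parent folder name."""
    folder = PurePosixPath(path).parent.name
    return CLASSES.index(folder)  # raises ValueError for an unknown folder


def class_counts(paths):
    """Count how many images fall into each class."""
    return Counter(CLASSES[label_from_path(p)] for p in paths)


if __name__ == "__main__":
    # Hypothetical paths, not taken from the repository.
    sample = ["flowers/daisy/001.jpg", "flowers/tulips/002.jpg", "flowers/daisy/003.jpg"]
    print(class_counts(sample))
```

Checking the per-class counts before training is a cheap way to spot an unbalanced or incomplete copy of the data.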

About the project - An in-depth analysis of how parallelisation affects the performance of various cluster configurations on GCP's Dataproc, measured by CPU and memory utilisation, disk I/O operations and network bandwidth. The project also experiments with different VM configurations and distribution strategies to identify the most efficient combination for training the ML model on Google Cloud's AI Platform.
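The kind of measurement described above can be illustrated with a small local sketch that times the same pre-processing job at several levels of parallelism. It is not the notebook's code: a thread pool stands in for the Dataproc cluster, and `preprocess`, `timed_run` and `sweep` are hypothetical names introduced for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def preprocess(item: int) -> int:
    """Hypothetical stand-in for decoding and resizing one image."""
    time.sleep(0.001)  # simulate I/O-bound work such as reading a file
    return item * 2


def timed_run(items, workers):
    """Process all items with `workers` threads; return (results, items per second)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(preprocess, items))  # map preserves input order
    elapsed = time.perf_counter() - start
    return results, len(items) / elapsed


def sweep(items, worker_counts):
    """Measure throughput for each worker count."""
    return {w: timed_run(items, w)[1] for w in worker_counts}


if __name__ == "__main__":
    for workers, rate in sweep(list(range(200)), [1, 2, 4, 8]).items():
        print(f"{workers} workers: {rate:.0f} items/s")
```

On a real cluster the same sweep would vary the number of partitions or worker machines, and throughput would be read alongside the CPU, disk and network metrics that the cloud monitoring tools report.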

Repository Contents - BigData_ML_Model_on_Google_Cloud_Platform.ipynb - Google Colaboratory file containing the code.
