The project is focused on parallelising pre-processing, measuring and machine learning in the cloud, as well as the evaluation and analysis of the cloud performance.

AnveshaM/Enhancing-performance-of-big-data-machine-learning-models-on-Google-Cloud-Platform

Enhancing-performance-of-big-data-machine-learning-models-on-Google-Cloud-Platform

The project is focused on parallelising pre-processing, measuring and machine learning in the cloud, as well as the evaluation and analysis of the cloud performance.

Dataset - The public “Flowers” dataset (3600 images, 5 classes) is used for the analysis.

About the project - An in-depth analysis of how parallelisation affects the performance of various cluster configurations (GCP's Dataproc) in terms of CPU/memory utilisation, disk I/O operations and network bandwidth. The project also experiments with different VM configurations and distribution strategies to find the most efficient combination for training the ML model on Google Cloud's AI Platform.

Repository Contents - BigData_ML_Model_on_Google_Cloud_Platform.ipynb - Google Colaboratory file containing the code.
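
The kind of measurement described above can be illustrated with a small, local sketch. It is not taken from the notebook: `fake_preprocess`, `measure_throughput` and `sweep` are hypothetical names, the per-image work is simulated with a short sleep, and Python threads stand in for the Dataproc workers. It only shows the general idea of timing the same pre-processing job at several levels of parallelism and comparing throughput.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_preprocess(item):
    """Hypothetical stand-in for decoding and resizing one image."""
    time.sleep(0.002)  # simulate I/O-bound work on a single item
    return item * 2


def measure_throughput(items, workers):
    """Process all items with `workers` threads; return items per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fake_preprocess, items))
    elapsed = time.perf_counter() - start
    assert len(results) == len(items)  # every item was processed
    return len(items) / elapsed


def sweep(items, worker_counts):
    """Measure throughput once for each parallelism level."""
    return {w: measure_throughput(items, w) for w in worker_counts}


if __name__ == "__main__":
    # Assumed parallelism levels, chosen only for illustration.
    for workers, rate in sweep(list(range(200)), [1, 2, 4, 8]).items():
        print(f"{workers} workers: {rate:.1f} items/s")
```

On a real cluster the same comparison would be made across cluster and VM configurations rather than thread counts, with CPU, disk and network metrics read from the cloud monitoring tools.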
