This is my capstone project (final project) of Nanodegree Machine Learning Engineer Udacity course.

messerzen/Malaria_capstone

CNN model for cell images classification

This is the capstone project of the "Machine Learning Engineer" Nanodegree course from Udacity. The classification task is to detect, based on image attributes, whether a cell is infected with the malaria parasite or is uninfected. In this project, the performance of three CNN models is compared:

  • Model_1: CNN model created from scratch.
  • Model_2: Model_1 updated with more layers and adjusted parameters.
  • Pre-trained: model built on the pre-trained VGG16 model.

The best performance was achieved by Model_2.

Getting started

This project was developed using Google Colaboratory because it allows you to use a GPU for free. You are encouraged to use Google Colaboratory to reduce the time needed to preprocess the data and to build and train the models.

Instructions

There are two options for uploading the '*.tar.xz' dataset files to a Google Colaboratory notebook:

  • Upload button:

  • Using the PyDrive module. In this case, perform the following steps:
  1. Install the pydrive module in the Google Colaboratory notebook:

!pip install pydrive

  2. Upload all tar.xz files to a Google Drive account.

  3. Import the necessary modules. This must run before the authentication step, which uses these names.

import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

  4. Authenticate and create the PyDrive client.

auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

When prompted, click the link to authorize Google to access your Drive account. You should see a screen with “Google Cloud SDK wants to access your Google Account” at the top. After you grant permission, copy the verification code and paste it into the box in Colab. Info source

  5. Download the files from Google Drive.

To get the file ID, use the file's shareable link, e.g.: https://drive.google.com/open?id=14DiZ1ZbsfbZaDFqwDDJWKFcb1M0wXvIW

Copy the file ID and paste it as the value of the 'id' key:

download = drive.CreateFile({'id': '14DiZ1ZbsfbZaDFqwDDJWKFcb1M0wXvIW'})
download.GetContentFile('test.tar')  # Download the file and save it under the name 'test.tar'
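If you prefer not to copy the ID out of the link by hand, it can be extracted in code. The following is a minimal sketch using only the Python standard library. The helper name `extract_drive_file_id` is hypothetical (it is not part of PyDrive), and support for the `/file/d/<ID>/view` link form is an assumption added alongside the `open?id=<ID>` form shown above.

```python
from urllib.parse import urlparse, parse_qs


def extract_drive_file_id(link):
    """Return the file ID contained in a Google Drive shareable link.

    Hypothetical helper, not part of PyDrive. It handles the
    'open?id=<ID>' form and, as an assumption, the '/file/d/<ID>/view' form.
    """
    parsed = urlparse(link)

    # Form 1: the ID is passed as the 'id' query parameter.
    query_id = parse_qs(parsed.query).get("id")
    if query_id:
        return query_id[0]

    # Form 2: the ID is the path segment that follows 'd'.
    parts = [p for p in parsed.path.split("/") if p]
    if "d" in parts and parts.index("d") + 1 < len(parts):
        return parts[parts.index("d") + 1]

    raise ValueError(f"No file ID found in link: {link}")


file_id = extract_drive_file_id(
    "https://drive.google.com/open?id=14DiZ1ZbsfbZaDFqwDDJWKFcb1M0wXvIW"
)
print(file_id)  # prints 14DiZ1ZbsfbZaDFqwDDJWKFcb1M0wXvIW
```

The returned string can then be used as the value of the 'id' key in `drive.CreateFile`.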

You should create a CreateFile object for each of the validation, training, and test files.
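Repeating the two download lines for each file can be wrapped in a small loop. This is only a sketch: the function name `download_all`, the local file names, and the placeholder IDs are hypothetical, and `drive` is assumed to be the authenticated `GoogleDrive` client created in the authentication step.

```python
def download_all(drive, file_ids):
    """Download every Drive file listed in file_ids.

    drive    -- an authenticated PyDrive GoogleDrive client
    file_ids -- dict mapping a local file name to a Drive file ID
    Returns the list of local file names that were written.
    """
    saved = []
    for local_name, file_id in file_ids.items():
        handle = drive.CreateFile({'id': file_id})  # One CreateFile object per file
        handle.GetContentFile(local_name)           # Save under the chosen local name
        saved.append(local_name)
    return saved


# Hypothetical placeholders: replace the names and IDs with your own.
FILE_IDS = {
    'train.tar': '<TRAIN_FILE_ID>',
    'valid.tar': '<VALIDATION_FILE_ID>',
    'test.tar': '<TEST_FILE_ID>',
}

# In the Colab notebook, after authentication:
# download_all(drive, FILE_IDS)
```

Keeping the name-to-ID mapping in one dictionary makes it easier to see which archive each ID refers to.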

  6. Upload & extract .tar files.

upload = drive.CreateFile({'title': 'test.tar'})  # Title is the same name chosen in GetContentFile
upload.SetContentFile('test.tar')
upload.Upload()

!mkdir -p content/cell_images  # Create a directory to extract the dataset into

Feel free to change the directory name.

  7. Extract the content.

!tar -xvf test.tar -C content/cell_images
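As an alternative to the shell commands, the archive can be extracted from Python with the standard `tarfile` module, which detects xz compression automatically when opened in `"r:*"` mode. This is a minimal sketch, not the method used in the notebook. The function name `extract_archive` is hypothetical, and the paths simply reuse the ones from the steps above.

```python
import os
import tarfile


def extract_archive(archive_path, target_dir):
    """Extract a (possibly xz-compressed) tar archive into target_dir.

    Creates target_dir if needed and returns the sorted names of its
    top-level entries after extraction.
    """
    os.makedirs(target_dir, exist_ok=True)

    # Use the safer 'data' extraction filter where this Python version supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

    # "r:*" lets tarfile detect the compression (none, gz, bz2, or xz).
    with tarfile.open(archive_path, "r:*") as archive:
        archive.extractall(target_dir, **kwargs)

    return sorted(os.listdir(target_dir))


# Only runs if the archive from the download step is present.
if os.path.exists('test.tar'):
    print(extract_archive('test.tar', 'content/cell_images'))
```

Printing the top-level entries is a quick way to confirm that the archive was unpacked where you expected.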

Useful links:

Colab upload tutorial

Article 1

Forum discussion

Author

  • Paulo Henrique Zen Messershmidt - LinkedIn

email: phzm.engmec@gmail.com

Built with

Acknowledgments
