pranealmerchant/pmerchant

This repository collects all the relevant components of the capstone project for CIND 820 at Toronto Metropolitan University in one place for reference.

The capstone project for this semester aims to determine which characteristics in the Wisconsin (Diagnostic) Breast Cancer dataset predict whether a patient receives a benign or malignant diagnosis.

The steps involved in the data analysis of this project so far include the following:

a) Understanding and Cleaning the Data: This involves understanding the characteristics of the dataset (for example, how many benign and malignant cases there are) as well as identifying and removing any parts of the dataset that are not needed for further analysis. Additional steps will involve transforming any categorical data to make analysis easier in the later stages of the project.

b) Establishing Correlations: Using code to determine how strongly each characteristic correlates with diagnosis. Understanding this makes it possible to begin answering some of the questions outlined in the first stages of the project, such as "What characteristics are most closely linked to a diagnosis of benign or malignant breast cancer among patients from this dataset?"

c) Comparing Machine Learning Models: The last phase of this project will compare how well each machine learning model fits this dataset. The goal of this stage is to determine which model best identifies the characteristics in the dataset that predict benign or malignant diagnoses among patients. The comparison will use metrics such as accuracy, recall and precision to determine the best-fitting model. As the data analysis is still in its initial stages, any conclusions drawn at this point may change once further analysis is completed by the end of the project.
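
The three steps above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code. It assumes pandas and scikit-learn are installed, loads the dataset from scikit-learn's bundled copy (`load_breast_cancer`) rather than from a project file, and uses logistic regression and a random forest purely as example models, since the models to be compared are not named here. The function names are hypothetical.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def load_and_clean():
    """Step (a): load the data and encode diagnosis as 1 = malignant, 0 = benign."""
    bunch = load_breast_cancer(as_frame=True)
    X = bunch.data  # 30 numeric features; the bundled copy has no ID column to drop
    # scikit-learn encodes malignant as 0 and benign as 1. Flip the encoding so
    # that malignant is the positive class for recall and precision.
    y = (bunch.target == 0).astype(int).rename("malignant")
    return X, y


def rank_correlations(X, y):
    """Step (b): absolute Pearson correlation of each feature with diagnosis."""
    return X.corrwith(y).abs().sort_values(ascending=False)


def compare_models(X, y, seed=0):
    """Step (c): fit example models and report accuracy, recall and precision."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Assumption: these two models are illustrative choices only.
    models = {
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=seed),
    }
    rows = []
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        rows.append(
            {
                "model": name,
                "accuracy": accuracy_score(y_test, pred),
                "recall": recall_score(y_test, pred),
                "precision": precision_score(y_test, pred),
            }
        )
    return pd.DataFrame(rows).set_index("model")


if __name__ == "__main__":
    X, y = load_and_clean()
    print(y.value_counts())                  # class balance, step (a)
    print(rank_correlations(X, y).head(10))  # strongest correlates, step (b)
    print(compare_models(X, y))              # metric table, step (c)
```

Treating malignant as the positive class means recall measures the share of malignant cases the model catches, which is usually the metric of most interest in a diagnostic setting. A single train/test split is used here for brevity; cross-validation would give more stable estimates.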
