In this assignment, we tackle a regression problem using a dataset consolidated from census data in the USA. The goal is to accurately predict cancer mortality based on information about US counties. The dataset contains 33 different features (demographic and medical information).

buithehai1994/Regression-Model-on-Cancer-US-County-

Regression Model on Cancer Mortality Project README

Cancer Mortality

Overview:

This project aims to predict cancer mortality rates in US counties using machine learning techniques. The dataset contains demographic and medical features of US counties, consolidated from census data. The project is divided into four parts (A to D), each focusing on a different aspect of regression modeling.

Repository Structure:

  • data/: Contains the dataset files (cancer_us_county-training.csv and cancer_us_county-testing.csv).
  • notebooks/: Contains Jupyter notebooks for each part of the assignment.
    • Part_A_Univariate_Linear_Regression.ipynb
    • Part_B_Multivariate_Linear_Regression.ipynb
    • Part_C_Experiment_On_Multivariate_Linear_Regression_With_Feature_Engineering.ipynb
  • EXPERIMENT REPORT: Contains experiment reports in Word format.
    • EXPERIMENT REPORT - Part A
    • EXPERIMENT REPORT - Part B
    • EXPERIMENT REPORT - Part C
  • FINAL REPORT - Part D: Contains the final report for the project in Word format.
  • README.md: Overview of the project (this file).
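
As a rough illustration of what Part A (univariate linear regression) involves, the sketch below fits a one-feature least-squares line using only the Python standard library. It is not the code from the notebooks, which may well use other libraries. The helper names (load_columns, fit_univariate, rmse) and the column names in the usage comment are hypothetical placeholders; replace them with the actual column headers from the CSV files in data/.

```python
import csv
from statistics import mean


def load_columns(path, x_col, y_col):
    """Read two numeric columns from a CSV file.

    Rows where either value is missing or non-numeric are skipped.
    x_col and y_col are column headers supplied by the caller.
    """
    xs, ys = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                x, y = float(row[x_col]), float(row[y_col])
            except (TypeError, ValueError):
                continue  # skip rows with missing or unparsable values
            xs.append(x)
            ys.append(y)
    return xs, ys


def fit_univariate(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the feature has zero variance; no line can be fitted")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the fitted line on (xs, ys)."""
    squared = ((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))
    return (sum(squared) / len(xs)) ** 0.5


# Hypothetical usage; "feature_column" and "target_column" are placeholders,
# not real column names from the dataset:
#
# train_x, train_y = load_columns(
#     "data/cancer_us_county-training.csv", "feature_column", "target_column")
# test_x, test_y = load_columns(
#     "data/cancer_us_county-testing.csv", "feature_column", "target_column")
# slope, intercept = fit_univariate(train_x, train_y)
# print(rmse(test_x, test_y, slope, intercept))
```

Fitting on the training file and reporting the error on the testing file keeps the evaluation separate from the data used to estimate the line.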

Instructions:

  1. Ensure you have Python and Jupyter Notebook installed on your system.
  2. Clone this repository to your local machine.

For more information, see the web app presentation of the project at the following link:

Regression-Model-on-Cancer-US-County

Acknowledgments

This project is based on a dataset consolidated from census data in the USA, together with its documentation, provided for educational purposes. The dataset comes from the Master of Data Science and Innovation course at the University of Technology Sydney and is an asset of TD School.
