chongjason914/seaborn-tutorial

Seaborn Tutorial - Who Pays More for Health Insurance?

Introduction

Seaborn is a Python data visualisation library built on top of matplotlib. This repository contains a tutorial notebook that demonstrates some of the most common visualisation techniques in seaborn, along with a medical cost dataset from Kaggle. I use a range of plots to explore the distribution of each variable in the dataset and the relationships between them.

Data description

The Kaggle medical cost dataset contains information about 1,338 insurance beneficiaries living in the United States and the corresponding amount they pay for their health insurance premium.

I have chosen this dataset for the following reasons:

  • Good mix of categorical and numerical variables
  • Not too many features (columns)
  • Intuitive and straightforward relationship between the predictor and target variables

Below is a description of each column in the dataset:

  • age: Age of primary beneficiary
  • sex: Insurance contractor gender
  • bmi: Body mass index
  • children: Number of children covered by health insurance / Number of dependents
  • smoker: Whether the beneficiary smokes
  • region: The beneficiary's residential area in the US
  • charges: Individual medical costs billed by health insurance
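
As a rough illustration of how these columns might be explored with seaborn, here is a minimal sketch. The rows below are invented purely to mirror the column layout and are not taken from the Kaggle dataset; the variable names (`sample`, `categorical_cols`, `numerical_cols`) and the choice of plots are assumptions for illustration, not necessarily what the tutorial notebook does.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical rows, made up to match the column layout described above.
# They are NOT real records from the Kaggle dataset.
sample = pd.DataFrame({
    "age": [23, 45, 31, 52, 38, 60],
    "sex": ["female", "male", "male", "female", "male", "female"],
    "bmi": [24.5, 31.2, 27.8, 29.9, 33.1, 26.4],
    "children": [0, 2, 1, 3, 0, 1],
    "smoker": ["no", "yes", "no", "no", "yes", "no"],
    "region": ["southwest", "northeast", "southeast",
               "northwest", "southeast", "northeast"],
    "charges": [2500.0, 38000.0, 4200.0, 11000.0, 41000.0, 13500.0],
})

# Separate categorical from numerical columns, since each kind
# usually calls for a different type of plot.
categorical_cols = sample.select_dtypes(exclude="number").columns.tolist()
numerical_cols = sample.select_dtypes(include="number").columns.tolist()

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Categorical vs. numerical: spread of charges for each smoker group.
sns.boxplot(data=sample, x="smoker", y="charges", ax=axes[0])

# Numerical vs. numerical, coloured by a categorical variable.
sns.scatterplot(data=sample, x="age", y="charges", hue="smoker", ax=axes[1])

fig.tight_layout()
```

To try this on the real data, the hypothetical `sample` frame could be replaced with the result of `pd.read_csv(...)` pointed at the downloaded dataset file.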

Medium article

Link to full write-up on Towards Data Science here.

Follow me
