FilipeGood/Data-Visualization-and-Analysis

Data Visualization Courses

This repo contains Jupyter notebooks with the code from two data visualization courses and one exploratory data analysis (EDA) course.

Coursera Applied Plotting

url => https://www.coursera.org/learn/python-plotting

A course about matplotlib and the various types of charts that are available in the matplotlib framework.

Kaggle - Data Visualization

url => https://www.kaggle.com/learn/data-visualization

A course that overviews the most common charts using the Seaborn package.

EDA - IBM Coursera course

url => https://www.coursera.org/learn/data-analysis-with-python (week 3)

Simple EDA techniques that give good insight into the dataset.

Exploratory Data Analysis

  Steps:
     1. Identification of variables, data types and shape of the dataset
        - Numerical => Discrete or Continuous
        - Categorical => Ordinal or Nominal
     2. Analyzing basic metrics => Statistical Summary
     3. Non-Graphical Univariate Analysis
        - Get the count and list of unique values
        - Filtering based on conditions + grouping
     4. Graphical Univariate Analysis
        - Analyzing individual feature patterns using visualization
        - Histograms and box plots (numeric features), count plots (categorical features)
     5. Bivariate Analysis
        - Identifying relationships between features
        - Scatter plots (seaborn regplot), box plots (x: a categorical feature (labels), y: a numeric feature, for example the target), heatmaps
        - sns.countplot() - compare the distribution of two categorical columns => sex vs. disease/no disease
     6. Analyzing outliers and missing values
     7. Correlation Analysis
        - Regplots (seaborn)
        - Correlation matrix
        - Check redundant variables
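
The steps above can be sketched with pandas and seaborn. This is a minimal illustration, not the code from the course notebooks: the toy DataFrame, its column names (`age`, `sex`, `chol`, `disease`), the 1.5 * IQR outlier rule, and the 0.9 correlation threshold are all assumptions made up for the example. The plotting helper `plot_overview` is a hypothetical name and is only defined here, not called.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "age": [29, 41, 35, 58, 63, 47, 52, 38],
    "sex": ["F", "M", "M", "F", "M", "F", "M", "F"],
    "chol": [180.0, 210.0, np.nan, 260.0, 540.0, 230.0, 245.0, 199.0],
    "disease": [0, 0, 0, 1, 1, 0, 1, 0],
})

# 1. Variables, data types, and shape of the dataset
print(df.shape)
print(df.dtypes)

# 2. Statistical summary of numeric and categorical columns
print(df.describe(include="all"))

# 3. Non-graphical univariate analysis: counts, unique values, filter + group
sex_counts = df["sex"].value_counts()
unique_sex = df["sex"].unique()
mean_chol_by_sex = df[df["age"] > 40].groupby("sex")["chol"].mean()

# 5. Bivariate analysis without a plot: two categorical columns side by side
sex_vs_disease = pd.crosstab(df["sex"], df["disease"])

# 6. Missing values, and outliers by the 1.5 * IQR rule (an assumed choice)
missing = df.isna().sum()
q1, q3 = df["chol"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["chol"] < q1 - 1.5 * iqr) | (df["chol"] > q3 + 1.5 * iqr)]

# 7. Correlation matrix and candidate redundant pairs (threshold is assumed)
corr = df.select_dtypes(include="number").corr()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
redundant = [
    (row, col)
    for row in upper.index
    for col in upper.columns
    if abs(upper.loc[row, col]) > 0.9
]


def plot_overview(data):
    """Steps 4, 5, and 7 as plots; imports are local so the code above runs without seaborn."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, axes = plt.subplots(2, 3, figsize=(15, 8))
    sns.histplot(data=data, x="chol", ax=axes[0, 0])                  # numeric feature
    sns.countplot(data=data, x="sex", ax=axes[0, 1])                  # categorical feature
    sns.countplot(data=data, x="sex", hue="disease", ax=axes[0, 2])   # sex vs. disease
    sns.boxplot(data=data, x="sex", y="chol", ax=axes[1, 0])          # categorical x, numeric y
    sns.regplot(data=data, x="age", y="chol", ax=axes[1, 1])          # scatter + fitted line
    sns.heatmap(data.select_dtypes(include="number").corr(), annot=True, ax=axes[1, 2])
    fig.tight_layout()
    return fig
```

In a notebook, calling `plot_overview(df)` would draw the six panels in one figure. On a real dataset the column names passed to each seaborn call would need to be replaced.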
