Supporting material for the Open Risk Academy course "Exploratory Data Analysis using Pandas, Seaborn and Statsmodels"

Open-Risk-Academy/Academy-Course-DAT31048

Academy-Course-DAT31048

Exploratory Data Analysis using Pandas, Seaborn and Statsmodels

This course is a CrashProgram (short course) that introduces exploratory data analysis, using credit risk data as the use case.

EDA Image

Course objectives

  • Learn the concepts and techniques of Exploratory Data Analysis (EDA)
  • Touch upon the issue of bias and how to mitigate it
  • Learn about more advanced data formats such as HDF
  • Perform basic exploratory data analysis using pandas
  • Create standard graphs using seaborn
  • Calculate contingency tables, Weight of Evidence (WoE) and Information Value using pandas, scipy and statsmodels
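
As an illustration of the last objective, here is a minimal sketch of a Weight of Evidence and Information Value calculation in pandas. It is not taken from the course scripts: the toy data, the column names `grade` and `default`, and the helper `woe_iv` are hypothetical. It assumes the convention WoE = ln(share of non-defaults / share of defaults); some texts use the opposite sign.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 'grade' is a categorical predictor,
# 'default' is a binary outcome (1 = default, 0 = no default).
df = pd.DataFrame({
    "grade":   ["A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C"],
    "default": [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1],
})


def woe_iv(data, feature, target):
    """Return (WoE per category, total Information Value).

    Assumed convention: WoE = ln(share of non-defaults / share of defaults).
    """
    # Contingency table: rows are categories, columns are outcomes 0 and 1
    counts = pd.crosstab(data[feature], data[target])
    good = counts[0] / counts[0].sum()  # distribution of non-defaults
    bad = counts[1] / counts[1].sum()   # distribution of defaults
    woe = np.log(good / bad)
    iv = ((good - bad) * woe).sum()
    return woe, iv


woe, iv = woe_iv(df, "grade", "default")
print(woe)
print(f"Information Value: {iv:.4f}")
```

A category with no defaults (or no non-defaults) produces an infinite WoE under this formula, so real data usually needs binning or a smoothing adjustment first.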

The course is live at the Open Risk Academy. This repository hosts the Python scripts used in the course. The scripts can be used standalone, but documentation is minimal.

Brief Description

  • Step 1: Importing data using pandas
  • Step 2: Blindfolding data and saving in HDF format to preserve metadata
  • Step 3: Univariate statistics for numerical and categorical variables
  • Step 4: Histograms and Barplots using Seaborn
  • Step 5: Identifying outliers visually and numerically
  • Step 6: Scatterplots, correlations and correlation heatmaps
  • Step 7: Contingency tables and mosaic plots
  • Step 8: Assessing association using Chi-Square tests and Information Value
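
Some of the numerical steps above (3, 5, 7 and 8) can be sketched in a few lines. This is a rough illustration only, not the course's own scripts: the `loans` data and its column names are made up, the 1.5 × IQR rule is just one common choice for flagging outliers, and the sample is far too small for the Chi-Square approximation to be reliable.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical toy loan data (not the course dataset)
loans = pd.DataFrame({
    "income":  [32, 45, 38, 51, 29, 300, 41, 36, 47, 33],
    "housing": ["rent", "own", "rent", "own", "rent",
                "own", "own", "rent", "own", "rent"],
    "default": [1, 0, 1, 0, 1, 0, 0, 1, 0, 0],
})

# Step 3: univariate statistics for a numerical and a categorical variable
numeric_summary = loans["income"].describe()
category_counts = loans["housing"].value_counts()

# Step 5: flag outliers numerically with the 1.5 * IQR rule (an assumed choice)
q1, q3 = loans["income"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (loans["income"] < q1 - 1.5 * iqr) | (loans["income"] > q3 + 1.5 * iqr)
outliers = loans[is_outlier]

# Step 7: contingency table of housing status against default outcome
table = pd.crosstab(loans["housing"], loans["default"])

# Step 8: Chi-Square test of independence on the contingency table
chi2, p_value, dof, expected = chi2_contingency(table)

print(numeric_summary)
print(category_counts)
print(outliers)
print(table)
print(f"chi2={chi2:.3f}, p={p_value:.3f}, dof={dof}")
```

For a 2×2 table `chi2_contingency` applies Yates' continuity correction by default; pass `correction=False` to get the uncorrected statistic.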

Where To Get Help:

If you get stuck on any issue with the course or the Academy:

  • If the issue is related to the course topics or material, check the Course Forum (Chat) first
  • Join the course discussion in the Open Risk Commons
  • If the issue is related to the operation of the Open Risk Academy, check the Academy FAQ first
  • If the issue persists contact us at info at openrisk dot eu

Academy Course Catalog
