EricSchles/datascience_book

Programming Probabilistically: An introduction to the world of data science

By

Eric Schles

Hello and welcome to my book! You'll find the following sections:

  1. Descriptive Statistics and Hypothesis testing
  2. Applied Statistical Tests - A/B testing
  3. Regression Introduction
  4. Classification Introduction
  5. Information Theory, Entropy and Tree Models
  6. Neural Network Models
  7. Introduction to Time Series Analysis

Each section covers about 4 to 5 chapters' worth of material, broken out into:

  • Basics
  • Mathematical Intuition
  • Implementation
  • Typical API
  • Advanced Use Cases

In addition to the main chapters, I've added a number of engineering-focused chapters that are somewhat supplemental.

Sections to come:

  • Reinforcement Learning
  • Engineering for Data Science
  • Text Processing
  • Image Processing
  • Support Vector Machines
  • Genetic Algorithms
  • Neural Network Optimizers
  • Recommender Systems
  • A/B Testing and Other Related Workflows
  • SQL Best Practices
  • Time Series Forecasting and Analysis
  • Geospatial Analysis
  • Geospatial and Time Series Forecasting
  • Video Processing
  • Building Data Dashboards
  • Working With Search
  • Building An OCR System
  • Advanced Python Usage
  • Active Learning
  • Recurrent Neural Networks
  • Convolutional Neural Networks
  • Capsule Networks
  • Adversarial Machine Learning
  • Open World - in-distribution vs. out-of-distribution
  • Bayesian Machine Learning
  • Graph Based Neural Networks
  • Monitoring
  • Working with Spark
  • Working with Streaming Data
  • Ensembling - scikit-learn ensembling strategies
  • Random Forests
  • Additive Models:
    • Gradient Boosted Trees
    • Splines
    • Generalized Additive Models
    • AdaBoost
  • Explainability Metrics
    • Litany of examples
    • Showing when and how they can fail
  • Metrics
  • Hyperparameter Tuning
  • Randomness in Your Models
  • Counterfactual Examples
  • Testing in Machine Learning Applications

To Dos

  • fix Decision Tree Implementation
  • add SVM chapter
  • add dimensionality reduction chapter
  • add clustering chapter
  • add RNN chapter
  • add conv net chapter
  • discuss attention
  • create engineering productionization chapter
  • hypothesis test as a ticket within engineering scrum context
  • reproducibility of results
