data-science.md

data science

list of useful tools

references to further similar lists

basics

exploratory data analysis (EDA)

anomaly detection

visualization

scaling pandas

experiment design

experiment tracking

MLOps overview: https://github.com/visenger/awesome-mlops

bayesian analyses

deep learning

neighbor search

subgroup discovery

NLP

rules

reproducibility

population drift monitoring

exploration of models

fairness

explainable AI

version control

streaming data science

libraries

data

audio

streams

feature store

  • hopsworks
  • Behaviors compose better than states
    • perhaps a simple Python library with some transformations, plus dbt and Dagster, will do the job very well, with https://metriql.com/ for metrics
  • https://feast.dev/

forecasting (time series)

generic timeseries operations

website tracking

Google Analytics

CDM/CDP: customer data management & customer data platform

basically SQL-based sync and incremental updates of data to many systems and reports, without sending a lot of CSV files around
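
A minimal sketch of what "SQL-based incremental sync" can mean in practice, assuming a hypothetical `customers` table with an `updated_at` column used as a watermark. The table, column names, and the `incremental_sync` helper are illustrative only and are not taken from any particular CDP product; two in-memory SQLite databases stand in for the source system and the target system.

```python
import sqlite3

# Hypothetical schema shared by the source and the target system.
SCHEMA = "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, updated_at TEXT)"


def incremental_sync(source, target, last_synced_at):
    """Copy rows changed after last_synced_at and return the new watermark."""
    rows = source.execute(
        "SELECT id, email, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_synced_at,),
    ).fetchall()
    # INSERT OR REPLACE acts as an upsert keyed on the primary key.
    target.executemany(
        "INSERT OR REPLACE INTO customers (id, email, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    # Keep the old watermark if nothing changed.
    return rows[-1][2] if rows else last_synced_at


source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute(SCHEMA)
target.execute(SCHEMA)

source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "a@example.com", "2024-01-01"), (2, "b@example.com", "2024-01-02")],
)

# First run: an empty watermark copies everything.
watermark = incremental_sync(source, target, "")

# Second run: only the row updated since the last watermark is transferred.
source.execute(
    "UPDATE customers SET email = ?, updated_at = ? WHERE id = ?",
    ("a2@example.com", "2024-01-03", 1),
)
watermark = incremental_sync(source, target, watermark)

synced = target.execute("SELECT id, email, updated_at FROM customers ORDER BY id").fetchall()
```

Only rows newer than the stored watermark cross the wire on each run, which is what replaces the repeated full CSV exports.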

deployment

cloud instance handling

use cases

marketing

text processing

OCR