Starred repositories
An open-source ML pipeline development platform
Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.
An open-source, low-code machine learning library in Python
Best Practices on Recommendation Systems
A game theoretic approach to explain the output of any machine learning model.
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow
Algorithms for explaining machine learning models
DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphic…
Kats, a kit to analyze time series data, a lightweight, easy-to-use, generalizable, and extendable framework to perform time series analysis, from understanding the key statistics and characteristi…
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
The Hitchhiker's Guide to Data Science for Social Good
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
An R package for causal inference in time series
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …
PyTorch extensions for high performance and large scale training.
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
Natural Gradient Boosting for Probabilistic Prediction
A model-agnostic visual debugging tool for machine learning
Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
Pretrain and finetune ANY AI model of ANY size on multiple GPUs or TPUs with zero code changes.
STUMPY is a powerful and scalable Python library for modern time series analysis
Deep universal probabilistic programming with Python and PyTorch
The machine learning toolkit for time series analysis in Python
Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
A library of sklearn compatible categorical variable encoders
Visualize and compare datasets, target values and associations, with one line of code.
TFX is an end-to-end platform for deploying production ML pipelines