This curated list contains 910 awesome open-source projects with a total of 3.9M stars grouped into 34 categories. All projects are ranked by a project-quality score, which is calculated from various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome!
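Every project entry in this list starts with the same pattern: name, medal, score, and star count. The sketch below is not part of the list's tooling; it only illustrates how such lines could be parsed and re-ranked. The names `Entry`, `parse_count`, `parse_entry`, and `rank` are hypothetical, and breaking score ties by star count is an assumption.

```python
import re
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    score: int
    stars: int


# Multipliers for the abbreviated counts used in the list (e.g. 8.5K, 15M).
_SUFFIXES = {"K": 1_000, "M": 1_000_000}

# Matches the start of an entry such as "XGBoost (🥇 43 · ⭐ 24K) - ...".
# An optional leading "- " is accepted for bulleted entries.
_ENTRY = re.compile(
    r"^(?:- )?(?P<name>.+?) \((?:🥇|🥈|🥉) (?P<score>\d+) · ⭐\ufe0f? "
    r"(?P<stars>[\d.]+[KM]?)"
)


def parse_count(text: str) -> int:
    """Convert an abbreviated count like '8.5K' or '660' to an integer."""
    if text[-1] in _SUFFIXES:
        return round(float(text[:-1]) * _SUFFIXES[text[-1]])
    return int(text)


def parse_entry(line: str) -> Entry:
    """Extract name, score, and star count from one entry line."""
    match = _ENTRY.match(line)
    if match is None:
        raise ValueError(f"not a project entry: {line!r}")
    return Entry(match["name"], int(match["score"]), parse_count(match["stars"]))


def rank(lines: list[str]) -> list[Entry]:
    """Sort entries by score, highest first (assumption: stars break ties)."""
    entries = [parse_entry(line) for line in lines]
    return sorted(entries, key=lambda e: (-e.score, -e.stars))


sample = [
    "LightGBM (🥇 43 · ⭐ 15K) - A fast, distributed, high performance gradient boosting.. MIT",
    "Tensorflow (🥇 55 · ⭐ 180K) - An Open Source Machine Learning Framework for Everyone. Apache-2",
    "XGBoost (🥇 43 · ⭐ 24K) - Scalable, Portable and Distributed Gradient Boosting.. Apache-2",
]
for entry in rank(sample):
    print(entry.name, entry.score, entry.stars)
```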
Contents
- Machine Learning Frameworks 59 projects
- Data Visualization 54 projects
- Text Data & NLP 101 projects
- Image Data 64 projects
- Graph Data 36 projects
- Audio Data 29 projects
- Geospatial Data 22 projects
- Financial Data 25 projects
- Time Series Data 29 projects
- Medical Data 19 projects
- Tabular Data 5 projects
- Optical Character Recognition 12 projects
- Data Containers & Structures 1 project
- Data Loading & Extraction 1 project
- Web Scraping & Crawling 1 project
- Data Pipelines & Streaming 1 project
- Distributed Machine Learning 36 projects
- Hyperparameter Optimization & AutoML 52 projects
- Reinforcement Learning 23 projects
- Recommender Systems 17 projects
- Privacy Machine Learning 7 projects
- Workflow & Experiment Tracking 39 projects
- Model Serialization & Deployment 20 projects
- Model Interpretability 54 projects
- Vector Similarity Search (ANN) 13 projects
- Probabilistics & Statistics 23 projects
- Adversarial Robustness 9 projects
- GPU & Accelerator Utilities 20 projects
- Tensorflow Utilities 16 projects
- Jax Utilities 3 projects
- Sklearn Utilities 19 projects
- Pytorch Utilities 32 projects
- Database Clients 1 project
- Others 66 projects
Explanation
- 🥇 🥈 🥉 Combined project-quality score
- ⭐️ Star count from GitHub
- 🐣 New project (less than 6 months old)
- 💤 Inactive project (6 months no activity)
- 💀 Dead project (12 months no activity)
- 📈 📉 Project is trending up or down
- ➕ Project was recently added
- ❗️ Warning (e.g. missing/risky license)
- 👨‍💻 Contributors count from GitHub
- 🔀 Fork count from GitHub
- 📋 Issue count from GitHub
- ⏱️ Last update timestamp on package manager
- 📥 Download count from package manager
- 📦 Number of dependent projects
- Tensorflow related project
- Sklearn related project
- PyTorch related project
- MxNet related project
- Apache Spark related project
- Jupyter related project
- PaddlePaddle related project
- Pandas related project
- Jax related project
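The activity markers in the legend can be reproduced from the DD.MM.YYYY timestamps shown in the entries. The following is a minimal sketch, not the generator's actual logic: it assumes that the ⏱️ timestamp is what counts as the last activity and that months are counted as whole calendar months. The function names are hypothetical.

```python
from datetime import date, datetime


def parse_listing_date(text: str) -> date:
    """Parse a DD.MM.YYYY timestamp as printed in the project entries."""
    return datetime.strptime(text, "%d.%m.%Y").date()


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the last month is not complete yet
    return months


def activity_marker(last_activity: date, today: date) -> str:
    """Return the legend marker for a project, or '' if it counts as active."""
    months = months_between(last_activity, today)
    if months >= 12:
        return "💀"  # dead: 12 months no activity
    if months >= 6:
        return "💤"  # inactive: 6 months no activity
    return ""


def is_new(created: date, today: date) -> bool:
    """True if the project qualifies for 🐣 (less than 6 months old)."""
    return months_between(created, today) < 6


# Assumption: 01.06.2023 (the most recent timestamp in the entries) as "today".
reference = parse_listing_date("01.06.2023")
print(activity_marker(parse_listing_date("21.06.2022"), reference))
```

With that reference date, a last update on 21.06.2022 falls in the inactive range and one on 05.12.2022 still counts as active.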
Machine Learning Frameworks
General-purpose machine learning and deep learning frameworks.
Tensorflow (🥇 55 · ⭐ 180K) - An Open Source Machine Learning Framework for Everyone. Apache-2

- GitHub (👨‍💻 4.4K · 🔀 71K · 📦 280K · 📋 37K - 5% open · ⏱️ 01.06.2023): git clone https://github.com/tensorflow/tensorflow
- PyPI (📥 15M / month): pip install tensorflow
- Conda (📥 4.2M · ⏱️ 27.03.2023): conda install -c conda-forge tensorflow
- Docker Hub (📥 73M · ⭐ 2.2K · ⏱️ 01.06.2023): docker pull tensorflow/tensorflow
scikit-learn (🥇 52 · ⭐ 54K) - scikit-learn: machine learning in Python. BSD-3

StatsModels (🥇 45 · ⭐ 8.5K) - Statsmodels: statistical modeling and econometrics in Python. BSD-3
XGBoost (🥇 43 · ⭐ 24K) - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or.. Apache-2
LightGBM (🥇 43 · ⭐ 15K) - A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT,.. MIT
pytorch-lightning (🥈 42 · ⭐ 24K) - Deep learning framework to train, deploy, and ship AI.. Apache-2

PaddlePaddle (🥈 42 · ⭐ 20K) - PArallel Distributed Deep LEarning: Machine Learning.. Apache-2

Jina (🥈 39 · ⭐ 18K) - Build multimodal AI services via cloud native technologies. Apache-2
- GitHub (👨‍💻 170 · 🔀 2.1K · 📦 600 · 📋 1.9K - 1% open · ⏱️ 31.05.2023): git clone https://github.com/jina-ai/jina
- PyPI (📥 400K / month): pip install jina
- Conda (📥 46K · ⏱️ 16.08.2022): conda install -c conda-forge jina-core
- Docker Hub (📥 1.2M · ⭐ 8 · ⏱️ 29.05.2023): docker pull jinaai/jina
Thinc (🥈 36 · ⭐ 2.7K) - A refreshing functional take on deep learning, compatible with your favorite.. MIT
Vowpal Wabbit (🥈 35 · ⭐ 8.2K) - Vowpal Wabbit is a machine learning system which pushes the.. BSD-3
tensorflow-upstream (🥉 32 · ⭐ 650) - TensorFlow ROCm port. Apache-2

tensorpack (🥉 31 · ⭐ 6.3K) - A Neural Net Training Interface on TensorFlow, with focus.. Apache-2

Neural Network Libraries (🥉 30 · ⭐ 2.6K) - Neural Network Libraries. Apache-2
Neural Tangents (🥉 27 · ⭐ 2K) - Fast and Easy Infinite Neural Networks in Python. Apache-2
xLearn (🥉 25 · ⭐ 3K · 💤 ) - High performance, easy-to-use, and scalable machine learning (ML).. Apache-2
ThunderSVM (🥉 20 · ⭐ 1.5K) - ThunderSVM: A Fast SVM Library on GPUs and CPUs. Apache-2
chefboost (🥉 18 · ⭐ 400) - A Lightweight Decision Tree Framework supporting regular algorithms:.. MIT
ThunderGBM (🥉 16 · ⭐ 660 · 💤 ) - ThunderGBM: Fast GBDTs and Random Forests on GPUs. Apache-2
Show 16 hidden projects...
- dlib (🥈 41 · ⭐ 12K) - A toolkit for making real world machine learning and data analysis.. ❗️BSL-1.0
- MindsDB (🥈 34 · ⭐ 16K) - MindsDB is a Server for Artificial Intelligence Logic. Enabling.. ❗️GPL-3.0
- Theano (🥈 34 · ⭐ 9.7K) - Theano was a Python library that allows you to define, optimize,.. ❗Unlicensed
- Turi Create (🥈 33 · ⭐ 11K · 💀 ) - Turi Create simplifies the development of custom machine.. BSD-3
- ivy (🥉 31 · ⭐ 11K) - The Unified Machine Learning Framework. ❗Unlicensed
- TFlearn (🥉 30 · ⭐ 9.6K · 💀 ) - Deep learning library featuring a higher-level API for.. ❗Unlicensed
- NuPIC (🥉 28 · ⭐ 6.3K · 💀 ) - Numenta Platform for Intelligent Computing is an implementation.. ❗️AGPL-3.0
- Lasagne (🥉 28 · ⭐ 3.8K · 💀 ) - Lightweight library to build and train neural networks in Theano. MIT
- CNTK (🥉 26 · ⭐ 17K · 💤 ) - Microsoft Cognitive Toolkit (CNTK), an open source deep-learning.. ❗Unlicensed
- SHOGUN (🥉 26 · ⭐ 2.9K · 💀 ) - Unified and efficient Machine Learning. BSD-3
- mace (🥉 23 · ⭐ 4.8K · 💀 ) - MACE is a deep learning inference framework optimized for mobile.. Apache-2
- neon (🥉 22 · ⭐ 3.9K · 💀 ) - Intel Nervana reference deep learning framework committed to best.. Apache-2
- Torchbearer (🥉 21 · ⭐ 630 · 💀 ) - torchbearer: A model fitting library for PyTorch. MIT
- Objax (🥉 20 · ⭐ 740) - Apache-2
- elegy (🥉 18 · ⭐ 450 · 💀 ) - A High Level API for Deep Learning in JAX. MIT
- StarSpace (🥉 16 · ⭐ 3.9K · 💀 ) - Learning embeddings for classification, retrieval and ranking. MIT
Data Visualization
General-purpose and task-specific data visualization libraries.
Matplotlib (🥇 48 · ⭐ 17K) - matplotlib: plotting with Python. ❗Unlicensed
Plotly (🥈 36 · ⭐ 14K · 📉 ) - The interactive graphing library for Python This project now includes.. MIT
VisPy (🥈 35 · ⭐ 3.1K) - High-performance interactive 2D/3D data visualization library. BSD-3

- GitHub (👨‍💻 190 · 🔀 610 · 📦 1.2K · 📋 1.4K - 22% open · ⏱️ 29.05.2023): git clone https://github.com/vispy/vispy
- PyPI (📥 68K / month · 📦 130 · ⏱️ 14.11.2022): pip install vispy
- Conda (📥 390K · ⏱️ 13.05.2023): conda install -c conda-forge vispy
- npm (📥 6 / month · 📦 1 · ⏱️ 15.03.2020): npm install vispy
datashader (🥈 33 · ⭐ 3K) - Quickly and accurately render even the largest data. BSD-3
Perspective (🥈 30 · ⭐ 6.3K) - A data visualization and analytics component, especially.. Apache-2

- GitHub (👨‍💻 88 · 🔀 690 · 📦 9 · 📋 640 - 14% open · ⏱️ 31.05.2023): git clone https://github.com/finos/perspective
- PyPI (📥 5.3K / month · 📦 11 · ⏱️ 20.01.2023): pip install perspective-python
- Conda (📥 350K · ⏱️ 31.05.2023): conda install -c conda-forge perspective
- npm (📥 1.7K / month): npm install @finos/perspective-jupyterlab
D-Tale (🥈 30 · ⭐ 4.1K) - Visualizer for pandas data structures. ❗️LGPL-2.1


bqplot (🥈 30 · ⭐ 3.4K) - Plotting library for IPython/Jupyter notebooks. Apache-2

- GitHub (👨‍💻 62 · 🔀 470 · 📦 43 · 📋 610 - 40% open · ⏱️ 11.04.2023): git clone https://github.com/bqplot/bqplot
- PyPI (📥 140K / month · 📦 100 · ⏱️ 02.09.2022): pip install bqplot
- Conda (📥 1.2M · ⏱️ 12.04.2023): conda install -c conda-forge bqplot
- npm (📥 3.9K / month · 📦 14 · ⏱️ 11.04.2023): npm install bqplot
hvPlot (🥈 30 · ⭐ 760) - A high-level plotting API for pandas, dask, xarray, and networkx built on.. BSD-3
Facets Overview (🥉 29 · ⭐ 7.1K) - Visualizations for machine learning datasets. Apache-2

mpld3 (🥉 29 · ⭐ 2.3K) - D3 Renderings of Matplotlib Graphics. BSD-3
- GitHub (👨‍💻 51 · 🔀 350 · 📦 4.7K · 📋 360 - 59% open · ⏱️ 10.12.2022): git clone https://github.com/mpld3/mpld3
- PyPI (📥 240K / month · 📦 410 · ⏱️ 10.12.2022): pip install mpld3
- Conda (📥 180K · ⏱️ 10.12.2022): conda install -c conda-forge mpld3
- npm (📥 880 / month · 📦 8 · ⏱️ 10.12.2022): npm install mpld3
pythreejs (🥉 28 · ⭐ 870) - A Jupyter - Three.js bridge. BSD-3

- GitHub (👨‍💻 30 · 🔀 180 · 📦 26 · 📋 230 - 25% open · ⏱️ 20.02.2023): git clone https://github.com/jupyter-widgets/pythreejs
- PyPI (📥 86K / month · 📦 56 · ⏱️ 20.02.2023): pip install pythreejs
- Conda (📥 480K · ⏱️ 16.03.2023): conda install -c conda-forge pythreejs
- npm (📥 4.2K / month · 📦 11 · ⏱️ 20.02.2023): npm install jupyter-threejs
data-validation (🥉 28 · ⭐ 720) - Library for exploring and validating machine learning.. Apache-2


pandas-profiling (🥉 26 · ⭐ 11K · 📈 ) - Deprecated pandas-profiling package, use ydata-.. MIT


Chartify (🥉 26 · ⭐ 3.3K) - Python library that makes it easy for data scientists to create.. Apache-2
Plotly-Resampler (🥉 26 · ⭐ 730) - Visualize large time series data with plotly.py. MIT
Multicore-TSNE (🥉 25 · ⭐ 1.8K) - Parallel t-SNE implementation with Python and Torch.. BSD-3

AutoViz (🥉 25 · ⭐ 1.3K) - Automatically Visualize any dataset, any size with a single line of.. Apache-2
Sweetviz (🥉 24 · ⭐ 2.4K · 💤 ) - Visualize and compare datasets, target values and associations, with.. MIT
Pandas-Bokeh (🥉 24 · ⭐ 840) - Bokeh Plotting Backend for Pandas and GeoPandas. MIT

python-ternary (🥉 22 · ⭐ 640) - Ternary plotting library for python with matplotlib. MIT
Show 14 hidden projects...
- cartopy (🥈 32 · ⭐ 1.2K) - Cartopy - a cartographic python library with matplotlib support. ❗️LGPL-3.0
- Cufflinks (🥉 29 · ⭐ 2.8K · 💀 ) - Productivity Tools for Plotly + Pandas. MIT
- HyperTools (🥉 25 · ⭐ 1.8K · 💀 ) - A Python toolbox for gaining geometric insights into high-.. MIT
- PandasGUI (🥉 24 · ⭐ 2.9K · 💀 ) - A GUI for Pandas DataFrames. ❗️MIT-0
- PDPbox (🥉 24 · ⭐ 750 · 💀 ) - python partial dependence plot toolbox. MIT
- pivottablejs (🥉 22 · ⭐ 550 · 💀 ) - Drag'n'drop Pivot Tables and Charts for Jupyter/IPython.. MIT
- joypy (🥉 21 · ⭐ 480 · 💀 ) - Joyplots in Python with matplotlib & pandas. MIT
- vegafusion (🥉 21 · ⭐ 240) - Serverside scaling for Vega and Altair visualizations. BSD-3
- ivis (🥉 19 · ⭐ 300) - Dimensionality reduction in very large datasets using Siamese.. Apache-2
- animatplot (🥉 18 · ⭐ 400 · 💀 ) - A python package for animating plots built on matplotlib. MIT
- data-describe (🥉 17 · ⭐ 290 · 💀 ) - datadescribe: Pythonic EDA Accelerator for Data Science. Apache-2
- pdvega (🥉 16 · ⭐ 340 · 💀 ) - Interactive plotting for Pandas using Vega-Lite. MIT
- nx-altair (🥉 16 · ⭐ 210 · 💀 ) - Draw interactive NetworkX graphs with Altair. MIT
- nptsne (🥉 9 · ⭐ 30 · 💀 ) - nptsne is a numpy compatible python binary package that offers a.. Apache-2
Text Data & NLP
Libraries for processing, cleaning, manipulating, and analyzing text data as well as libraries for NLP tasks such as language detection, fuzzy matching, classification, seq2seq learning, conversational AI, keyword extraction, and translation.
transformers (🥇 49 · ⭐ 100K) - Transformers: State-of-the-art Machine Learning for.. Apache-2


nltk (🥇 41 · ⭐ 12K) - Suite of libraries and programs for symbolic and statistical natural.. Apache-2
gensim (🥇 40 · ⭐ 14K) - Topic Modelling for Humans. ❗️LGPL-2.1
flair (🥇 40 · ⭐ 13K) - A very simple framework for state-of-the-art Natural Language Processing.. MIT

sentencepiece (🥇 37 · ⭐ 7.5K) - Unsupervised text tokenizer for Neural Network-based text.. Apache-2
haystack (🥇 36 · ⭐ 9K) - Haystack is an open source NLP framework to interact with your data.. Apache-2
sentence-transformers (🥇 35 · ⭐ 11K) - Multilingual Sentence & Image Embeddings with BERT. Apache-2

TensorFlow Text (🥇 35 · ⭐ 1.1K) - Making text a first-class citizen in TensorFlow. Apache-2

TextBlob (🥈 34 · ⭐ 8.6K) - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech.. MIT
Tokenizers (🥈 34 · ⭐ 7.1K) - Fast State-of-the-Art Tokenizers optimized for Research and.. Apache-2
jellyfish (🥈 32 · ⭐ 1.9K) - a python library for doing approximate and phonetic matching of strings. MIT
spacy-transformers (🥈 31 · ⭐ 1.3K) - Use pretrained transformers like BERT, XLNet and GPT-2.. MIT spacy
snowballstemmer (🥈 31 · ⭐ 660) - Snowball compiler and stemming algorithms. BSD-3
DeepPavlov (🥈 30 · ⭐ 6.2K) - An open source library for deep learning end-to-end dialog.. Apache-2

SciSpacy (🥈 30 · ⭐ 1.4K) - A full spaCy pipeline and models for scientific/biomedical documents. Apache-2
english-words (🥈 28 · ⭐ 9K · 💤 ) - A text file containing 479k English words for all your.. Unlicense
fastNLP (🥈 28 · ⭐ 2.9K) - fastNLP: A Modularized and Extensible NLP Framework. Currently still.. Apache-2
Ciphey (🥈 27 · ⭐ 13K) - Automatically decrypt encryptions without knowing the key or cipher, decode.. MIT
- GitHub (👨‍💻 47 · 🔀 790 · 📋 300 - 15% open · ⏱️ 05.12.2022): git clone https://github.com/Ciphey/Ciphey
- PyPI (📥 40K / month): pip install ciphey
- Docker Hub (📥 19K · ⭐ 14 · ⏱️ 10.03.2023): docker pull remnux/ciphey
TextDistance (🥈 27 · ⭐ 3.1K · 💤 ) - Compute distance between sequences. 30+ algorithms, pure.. MIT
scattertext (🥈 27 · ⭐ 2.1K) - Beautiful visualizations of how language differs among document.. Apache-2
qdrant (🥉 26 · ⭐ 11K) - Qdrant - Vector Database for the next generation of AI applications... Apache-2
- GitHub (👨‍💻 50 · 🔀 520 · 📥 120 · 📋 530 - 17% open · ⏱️ 31.05.2023): git clone https://github.com/qdrant/qdrant
OpenPrompt (🥉 26 · ⭐ 3.3K) - An Open-Source Framework for Prompt-Learning. Apache-2
PyTextRank (🥉 26 · ⭐ 2K · 💤 ) - Python implementation of TextRank algorithms (textgraphs) for.. MIT
YouTokenToMe (🥉 23 · ⭐ 880) - Unsupervised text tokenizer focused on computational efficiency. MIT
lightseq (🥉 22 · ⭐ 2.8K) - LightSeq: A High Performance Library for Sequence Processing and.. Apache-2
promptsource (🥉 22 · ⭐ 1.8K) - Toolkit for creating, sharing and using natural language.. Apache-2
NLP Architect (🥉 21 · ⭐ 2.9K · 💤 ) - A model library for exploring state-of-the-art deep.. Apache-2
Texthero (🥉 21 · ⭐ 2.7K · 💤 ) - Text preprocessing, representation and visualization from zero to.. MIT
small-text (🥉 21 · ⭐ 460) - Active Learning for Text Classification in Python. MIT


happy-transformer (🥉 20 · ⭐ 420) - A package built on top of Hugging Face's transformers.. Apache-2 huggingface
textaugment (🥉 18 · ⭐ 320) - TextAugment: Text Augmentation Library. MIT
TextBox (🥉 17 · ⭐ 970) - TextBox 2.0 is a text generation library with pre-trained language models. MIT
OpenNRE (🥉 16 · ⭐ 4K) - An Open-Source Package for Neural Relation Extraction (NRE). MIT
- GitHub (👨‍💻 12 · 🔀 1K · 📋 360 - 2% open · ⏱️ 03.01.2023): git clone https://github.com/thunlp/OpenNRE
Show 38 hidden projects...
- ChatterBot (🥇 35 · ⭐ 13K · 💀 ) - ChatterBot is a machine learning, conversational dialog engine.. BSD-3
- fuzzywuzzy (🥈 33 · ⭐ 8.9K · 💀 ) - Fuzzy String Matching in Python. ❗️GPL-2.0
- stanza (🥈 31 · ⭐ 6.7K) - Official Stanford NLP Python Library for Many Human Languages. ❗Unlicensed
- langid (🥈 28 · ⭐ 2.1K · 💀 ) - Stand-alone language identification system. BSD-3
- vaderSentiment (🥈 27 · ⭐ 4K · 💀 ) - VADER Sentiment Analysis. VADER (Valence Aware Dictionary and.. MIT
- polyglot (🥈 27 · ⭐ 2.2K · 💀 ) - Multilingual text (NLP) processing toolkit. ❗️GPL-3.0
- flashtext (🥉 26 · ⭐ 5.4K · 💀 ) - Extract Keywords from sentence or Replace keywords in sentences. MIT
- neuralcoref (🥉 26 · ⭐ 2.7K · 💀 ) - Fast Coreference Resolution in spaCy with Neural Networks. MIT
- underthesea (🥉 26 · ⭐ 1.2K) - Underthesea - Vietnamese NLP Toolkit. ❗️GPL-3.0
- pytorch-nlp (🥉 25 · ⭐ 2.2K · 💀 ) - Basic Utilities for PyTorch Natural Language Processing.. BSD-3
- textgenrnn (🥉 24 · ⭐ 4.9K · 💀 ) - Easily train your own text-generating neural network of any.. MIT
- Snips NLU (🥉 24 · ⭐ 3.8K · 💀 ) - Snips Python library to extract meaning from text. Apache-2
- MatchZoo (🥉 24 · ⭐ 3.8K · 💀 ) - Facilitating the design, comparison and sharing of deep.. Apache-2
- Kashgari (🥉 23 · ⭐ 2.4K · 💀 ) - Kashgari is a production-level NLP Transfer learning.. Apache-2
- pySBD (🥉 23 · ⭐ 620 · 💀 ) - pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence.. MIT
- gpt-2-simple (🥉 22 · ⭐ 3.3K · 💀 ) - Python package to easily retrain OpenAI's GPT-2 text-.. MIT
- Texar (🥉 22 · ⭐ 2.4K · 💀 ) - Toolkit for Machine Learning, Natural Language Processing, and.. Apache-2
- stop-words (🥉 22 · ⭐ 150 · 💀 ) - Get list of common stop words in various languages in Python. BSD-3
- DELTA (🥉 21 · ⭐ 1.5K · 💀 ) - DELTA is a deep learning based natural language and speech.. Apache-2
- anaGo (🥉 21 · ⭐ 1.5K · 💀 ) - Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition,.. MIT
- PyText (🥉 20 · ⭐ 6.4K · 💤 ) - A natural language modeling framework based on PyTorch. ❗Unlicensed
- pyfasttext (🥉 20 · ⭐ 230 · 💀 ) - Yet another Python binding for fastText. ❗️GPL-3.0
- fastT5 (🥉 19 · ⭐ 470 · 💀 ) - boost inference speed of T5 models by 5x & reduce the model size.. Apache-2
- numerizer (🥉 19 · ⭐ 200) - A Python module to convert natural language numerics into ints and.. MIT
- DeepMatcher (🥉 18 · ⭐ 490 · 💀 ) - Python package for performing Entity and Text Matching using.. BSD-3
- NeuroNER (🥉 17 · ⭐ 1.7K · 💀 ) - Named-entity recognition using neural networks. Easy-to-use and.. MIT
- Camphr (🥉 17 · ⭐ 340 · 💀 ) - Camphr - NLP library for creating pipeline components. Apache-2 spacy
- skift (🥉 17 · ⭐ 230 · 💤 ) - scikit-learn wrappers for Python fastText. MIT
- nboost (🥉 16 · ⭐ 660 · 💀 ) - NBoost is a scalable, search-api-boosting platform for deploying.. Apache-2
- whoosh (🥉 16 · ⭐ 410 · 💀 ) - Pure-Python full-text search library. ❗Unlicensed
- textpipe (🥉 16 · ⭐ 300 · 💀 ) - Textpipe: clean and extract metadata from text. MIT
- Headliner (🥉 15 · ⭐ 230 · 💀 ) - Easy training and deployment of seq2seq models. MIT
- BLINK (🥉 14 · ⭐ 1K · 💀 ) - Entity Linker solution. MIT
- spacy-dbpedia-spotlight (🥉 14 · ⭐ 88) - A spaCy wrapper for DBpedia Spotlight. MIT spacy
- TransferNLP (🥉 13 · ⭐ 290 · 💀 ) - NLP library designed for reproducible experimentation.. MIT
- ONNX-T5 (🥉 13 · ⭐ 230 · 💀 ) - Summarization, translation, sentiment-analysis, text-generation.. Apache-2
- NeuralQA (🥉 13 · ⭐ 220 · 💀 ) - NeuralQA: A Usable Library for Question Answering on Large Datasets.. MIT
- textvec (🥉 11 · ⭐ 190 · 💤 ) - Text vectorization tool to outperform TFIDF for classification.. MIT
Image Data
Libraries for image & video processing, manipulation, and augmentation as well as libraries for computer vision tasks such as facial recognition, object detection, and classification.
MMDetection (🥇 36 · ⭐ 24K) - OpenMMLab Detection Toolbox and Benchmark. Apache-2

torchvision (🥇 36 · ⭐ 14K) - Datasets, Transforms and Models specific to Computer Vision. BSD-3

InsightFace (🥈 35 · ⭐ 15K) - State-of-the-art 2D and 3D Face Analysis Project. MIT

PyTorch Image Models (🥈 34 · ⭐ 25K) - PyTorch image models, scripts, pretrained weights --.. Apache-2

detectron2 (🥈 34 · ⭐ 25K) - Detectron2 is a platform for object detection, segmentation.. Apache-2

Albumentations (🥈 34 · ⭐ 12K) - Fast image augmentation library and an easy-to-use wrapper.. MIT

PaddleDetection (🥈 32 · ⭐ 11K) - Object Detection toolkit based on PaddlePaddle. It.. Apache-2

Face Recognition (🥈 30 · ⭐ 48K · 💤 ) - The world's simplest facial recognition api for Python.. MIT

vit-pytorch (🥈 29 · ⭐ 14K) - Implementation of Vision Transformer, a simple way to achieve.. MIT

facenet-pytorch (🥈 29 · ⭐ 3.5K) - Pretrained Pytorch face detection (MTCNN) and facial.. MIT

sahi (🥈 29 · ⭐ 2.7K) - Framework agnostic sliced/tiled inference + interactive ui + error analysis.. MIT
opencv-python (🥉 28 · ⭐ 3.5K · 📉 ) - Automated CI toolchain to produce precompiled opencv-python,.. MIT
imageai (🥉 27 · ⭐ 7.8K · 📉 ) - A python library built to empower developers to build applications.. MIT
Face Alignment (🥉 26 · ⭐ 6.3K) - 2D and 3D Face alignment library built using pytorch. BSD-3

layout-parser (🥉 26 · ⭐ 3.7K · 💤 ) - A Unified Toolkit for Deep Learning Based Document Image.. Apache-2
vidgear (🥉 26 · ⭐ 2.8K) - A High-performance cross-platform Video Processing Python framework.. Apache-2
Image Deduplicator (🥉 25 · ⭐ 4.5K) - Finding duplicate images made easy!. Apache-2

segmentation_models (🥉 25 · ⭐ 4.3K · 💤 ) - Segmentation models with pretrained backbones. Keras.. MIT

Norfair (🥉 25 · ⭐ 1.8K) - Lightweight Python library for adding real-time multi-object tracking.. BSD-3
pytorchvideo (🥉 23 · ⭐ 2.9K) - A deep learning library for video understanding research. Apache-2

Classy Vision (🥉 23 · ⭐ 1.6K) - An end-to-end PyTorch framework for image and video.. MIT

tensorflow-graphics (🥉 22 · ⭐ 2.7K) - TensorFlow Graphics: Differentiable Graphics Layers.. Apache-2

icevision (🥉 22 · ⭐ 820) - An Agnostic Computer Vision Framework - Pluggable to any Training.. Apache-2
image-match (🥉 20 · ⭐ 2.9K) - Quickly search over billions of images. Apache-2
DE⫶TR (🥉 19 · ⭐ 11K) - End-to-End Object Detection with Transformers. Apache-2

- GitHub (👨‍💻 26 · 🔀 2K · 📋 490 - 42% open · ⏱️ 07.02.2023): git clone https://github.com/facebookresearch/detr
PySlowFast (🥉 19 · ⭐ 5.7K) - PySlowFast: video understanding codebase from FAIR for.. Apache-2

scenic (🥉 19 · ⭐ 2.2K) - Scenic: A Jax Library for Computer Vision Research and Beyond. Apache-2

- GitHub (👨‍💻 63 · 🔀 310 · 📋 150 - 52% open · ⏱️ 01.06.2023): git clone https://github.com/google-research/scenic
Show 19 hidden projects...
- scikit-image (🥇 42 · ⭐ 5.4K) - Image processing in Python. ❗Unlicensed
- imgaug (🥈 35 · ⭐ 14K · 💀 ) - Image augmentation for machine learning experiments. MIT
- glfw (🥈 35 · ⭐ 11K) - A multi-platform library for OpenGL, OpenGL ES, Vulkan, window and input. ❗️Zlib
- PyTorch3D (🥈 30 · ⭐ 7.3K) - PyTorch3D is FAIR's library of reusable components for.. ❗Unlicensed
- imutils (🥈 29 · ⭐ 4.3K · 💀 ) - A series of convenience functions to make basic image processing.. MIT
- chainercv (🥉 27 · ⭐ 1.5K · 💀 ) - ChainerCV: a Library for Deep Learning in Computer Vision. MIT
- mtcnn (🥉 26 · ⭐ 2K · 💀 ) - MTCNN face detection implementation for TensorFlow, as a PIP package. MIT
- Pillow-SIMD (🥉 26 · ⭐ 2K · 💤 ) - The friendly PIL fork. ❗️PIL
- mahotas (🥉 25 · ⭐ 800) - Computer Vision in Python. ❗Unlicensed
- CellProfiler (🥉 25 · ⭐ 770) - An open-source application for biological image analysis. ❗Unlicensed
- deep-daze (🥉 22 · ⭐ 4.4K · 💀 ) - Simple command line tool for text to image generation using.. MIT
- Image Super-Resolution (🥉 22 · ⭐ 4.2K · 💀 ) - Super-scale your images and run experiments with.. Apache-2
- Luminoth (🥉 22 · ⭐ 2.4K · 💀 ) - Deep Learning toolkit for Computer Vision. BSD-3
- nude.py (🥉 21 · ⭐ 910 · 💀 ) - Nudity detection with Python. MIT
- detecto (🥉 20 · ⭐ 590 · 💀 ) - Build fully-functioning computer vision models with PyTorch. MIT
- Caer (🥉 18 · ⭐ 690 · 💀 ) - A lightweight Computer Vision library. Scale your models, not boilerplate. MIT
- solt (🥉 18 · ⭐ 260 · 💤 ) - Streaming over lightweight data transformations. MIT
- Torch Points 3D (🥉 14 · ⭐ 140 · 💀 ) - Pytorch framework for doing deep learning on point.. BSD-3
- HugsVision (🥉 13 · ⭐ 180) - HugsVision is an easy-to-use huggingface wrapper for state-of-the-.. MIT huggingface
Graph Data
Libraries for graph processing, clustering, embedding, and machine learning tasks.
dgl (🥇 37 · ⭐ 12K) - Python package built to ease deep learning on graph, on top of existing DL.. Apache-2
PyTorch Geometric (🥇 35 · ⭐ 18K · 📉 ) - Graph Neural Network Library for PyTorch. MIT

ogb (🥈 31 · ⭐ 1.7K) - Benchmark datasets, data loaders, and evaluators for graph machine learning. MIT
pygraphistry (🥈 28 · ⭐ 1.9K) - PyGraphistry is a Python library to quickly load, shape,.. BSD-3

Paddle Graph Learning (🥈 28 · ⭐ 1.5K) - Paddle Graph Learning (PGL) is an efficient and.. Apache-2

AmpliGraph (🥈 26 · ⭐ 1.9K) - Python library for Representation Learning on Knowledge.. Apache-2

PyKEEN (🥈 26 · ⭐ 1.2K) - A Python library for learning and evaluating knowledge graph embeddings. MIT
pytorch_geometric_temporal (🥈 24 · ⭐ 2.1K) - PyTorch Geometric Temporal: Spatiotemporal Signal.. MIT

torch-cluster (🥉 21 · ⭐ 650) - PyTorch Extension Library of Optimized Graph Cluster.. MIT

graph-nets (🥉 18 · ⭐ 5.3K) - Build Graph Nets in Tensorflow. Apache-2

GraphEmbedding (🥉 16 · ⭐ 3.3K · 💤 ) - Implementation and experiments of graph embedding.. MIT

- GitHub (👨‍💻 9 · 🔀 940 · 📦 27 · 📋 67 - 62% open · ⏱️ 21.06.2022): git clone https://github.com/shenweichen/GraphEmbedding
kglib (🥉 16 · ⭐ 540 · 💤 ) - TypeDB-ML is the Machine Learning integrations library for TypeDB. Apache-2
OpenKE (🥉 15 · ⭐ 3.5K · 💤 ) - An Open-Source Package for Knowledge Embedding (KE). MIT
- GitHub (👨‍💻 11 · 🔀 950 · 📋 370 - 4% open · ⏱️ 03.11.2022): git clone https://github.com/thunlp/OpenKE
OpenNE (🥉 14 · ⭐ 1.6K · 💤 ) - An Open-Source Package for Network Embedding (NE). MIT

- GitHub (👨‍💻 11 · 🔀 480 · 📋 100 - 4% open · ⏱️ 02.11.2022): git clone https://github.com/thunlp/OpenNE
Show 16 hidden projects...
- networkx (🥇 42 · ⭐ 13K) - Network Analysis in Python. ❗Unlicensed
- igraph (🥇 33 · ⭐ 1.1K) - Python interface for igraph. ❗️GPL-2.0
- StellarGraph (🥈 28 · ⭐ 2.7K · 💀 ) - StellarGraph - Machine Learning on Graphs. Apache-2
- pygal (🥈 27 · ⭐ 2.5K) - PYthon svg GrAph plotting Library. ❗️LGPL-3.0
- Karate Club (🥈 23 · ⭐ 1.9K) - Karate Club: An API Oriented Open-source Python Framework for.. ❗️GPL-3.0
- DIG (🥈 23 · ⭐ 1.5K) - A library for graph deep learning research. ❗️GPL-3.0
- PyTorch-BigGraph (🥉 21 · ⭐ 3.2K) - Generate embeddings from large-scale graph-structured.. ❗Unlicensed
- DeepWalk (🥉 21 · ⭐ 2.6K · 💀 ) - DeepWalk - Deep Learning for Graphs. ❗️GPL-3.0
- pyRDF2Vec (🥉 21 · ⭐ 200) - Python Implementation and Extension of RDF2Vec. MIT
- Sematch (🥉 17 · ⭐ 410 · 💀 ) - semantic similarity framework for knowledge graph. Apache-2
- DeepGraph (🥉 16 · ⭐ 270 · 💀 ) - Analyze Data with Pandas-based Networks. Documentation:. BSD-3
- Euler (🥉 15 · ⭐ 2.8K · 💀 ) - A distributed graph deep learning framework. Apache-2
- GraphGym (🥉 15 · ⭐ 1.4K) - Platform for designing and evaluating Graph Neural Networks.. ❗Unlicensed
- GraphSAGE (🥉 14 · ⭐ 3.1K · 💀 ) - Representation learning on large graphs using stochastic.. MIT
- GraphVite (🥉 14 · ⭐ 1.1K · 💀 ) - GraphVite: A General and High-performance Graph Embedding.. Apache-2
- ptgnn (🥉 14 · ⭐ 370 · 💀 ) - A PyTorch Graph Neural Network Library. MIT
Audio Data
Libraries for audio analysis, manipulation, transformation, and extraction, as well as speech recognition and music generation tasks.
speechbrain (🥇 36 · ⭐ 6K) - A PyTorch-based Speech Toolkit. Apache-2

torchaudio (🥇 35 · ⭐ 2.1K) - Data manipulation and transformation for audio signal.. BSD-2

SpeechRecognition (🥈 34 · ⭐ 7.2K) - Speech recognition module for Python, supporting several.. BSD-3
python-soundfile (🥈 29 · ⭐ 550) - SoundFile is an audio library based on libsndfile, CFFI, and.. BSD-3
pyAudioAnalysis (🥈 28 · ⭐ 5.3K · 💤 ) - Python Audio Analysis Library: Feature Extraction,.. Apache-2
audiomentations (🥈 28 · ⭐ 1.4K) - A Python library for audio data augmentation. Inspired by.. MIT
tinytag (🥈 28 · ⭐ 600) - Read audio and music meta data and duration of MP3, OGG, OPUS, MP4, M4A,.. MIT
Show 10 hidden projects...
- DeepSpeech (🥈 34 · ⭐ 22K · 💀 ) - DeepSpeech is an open source embedded (offline, on-.. MPL-2.0
- aubio (🥈 28 · ⭐ 3K · 💀 ) - a library for audio and music analysis. ❗️GPL-3.0
- Essentia (🥈 28 · ⭐ 2.4K) - C++ library for audio and music analysis, description and.. ❗️AGPL-3.0
- Madmom (🥉 25 · ⭐ 1.1K · 💀 ) - Python audio and music signal processing library. BSD-3
- python_speech_features (🥉 24 · ⭐ 2.2K · 💀 ) - This library provides common speech features for ASR.. MIT
- TTS (🥉 22 · ⭐ 7.5K · 💀 ) - Deep learning for Text to Speech (Discussion forum:.. MPL-2.0
- TimeSide (🥉 22 · ⭐ 350) - scalable audio processing framework and server written in Python. ❗️AGPL-3.0
- Dejavu (🥉 21 · ⭐ 6.1K · 💀 ) - Audio fingerprinting and recognition in Python. MIT
- Muda (🥉 17 · ⭐ 220 · 💀 ) - A library for augmenting annotated audio data. ISC
- textlesslib (🥉 9 · ⭐ 410 · 💀 ) - Library for Textless Spoken Language Processing. MIT
Geospatial Data
Libraries to load, process, analyze, and write geographic data as well as libraries for spatial analysis, map visualization, and geocoding.
pydeck (🥇 42 · ⭐ 11K) - WebGL2 powered visualization framework. MIT

- GitHub (👨‍💻 230 · 🔀 2K · 📦 6K · 📋 2.7K - 8% open · ⏱️ 31.05.2023): git clone https://github.com/visgl/deck.gl
- PyPI (📥 1.4M / month · 📦 42 · ⏱️ 04.11.2022): pip install pydeck
- Conda (📥 380K · ⏱️ 04.11.2022): conda install -c conda-forge pydeck
- npm (📥 410K / month · 📦 450 · ⏱️ 31.05.2023): npm install deck.gl
ArcGIS API (🥈 32 · ⭐ 1.6K) - Documentation and samples for ArcGIS API for Python. Apache-2
- GitHub (👨‍💻 88 · 🔀 1K · 📥 8.3K · 📋 600 - 9% open · ⏱️ 01.06.2023): git clone https://github.com/Esri/arcgis-python-api
- PyPI (📥 91K / month · 📦 31 · ⏱️ 27.01.2023): pip install arcgis
- Docker Hub (📥 10K · ⭐ 40 · ⏱️ 17.06.2022): docker pull esridocker/arcgis-api-python-notebook
ipyleaflet (🥉 31 · ⭐ 1.4K) - A Jupyter - Leaflet.js bridge. MIT

- GitHub (👨‍💻 82 · 🔀 350 · 📦 4.5K · 📋 570 - 40% open · ⏱️ 10.02.2023): git clone https://github.com/jupyter-widgets/ipyleaflet
- PyPI (📥 150K / month · 📦 150 · ⏱️ 19.10.2022): pip install ipyleaflet
- Conda (📥 1M · ⏱️ 19.10.2022): conda install -c conda-forge ipyleaflet
- npm (📥 40K / month · 📦 5 · ⏱️ 19.10.2022): npm install jupyter-leaflet
pymap3d (🥉 24 · ⭐ 310) - pure-Python (Numpy optional) 3D coordinate conversions for geospace ecef.. BSD-2
Show 8 hidden projects...
- Geocoder (🥈 32 · ⭐ 1.5K · 💀 ) - Python Geocoder. MIT
- Satpy (🥉 31 · ⭐ 920) - Python package for earth-observing satellite data processing. ❗️GPL-3.0
- Sentinelsat (🥉 29 · ⭐ 880) - Search and download Copernicus Sentinel satellite images. ❗️GPL-3.0
- EarthPy (🥉 26 · ⭐ 440 · 💀 ) - A package built to support working with spatial data using open.. BSD-3
- prettymaps (🥉 24 · ⭐ 9.8K) - A small set of Python functions to draw pretty maps from.. ❗️AGPL-3.0
- gmaps (🥉 23 · ⭐ 760 · 💀 ) - Google maps for Jupyter notebooks. BSD-3
- Mapbox GL (🥉 23 · ⭐ 640 · 💀 ) - Use Mapbox GL JS to visualize data in a Python Jupyter notebook. MIT
- geoplotlib (🥉 21 · ⭐ 990 · 💀 ) - python toolbox for visualizing geographical data and making maps. MIT
Financial Data
Libraries for algorithmic stock/crypto trading, risk analytics, backtesting, technical analysis, and other tasks on financial data.