Welcome to Curated AI Resources! 🚀

Afi edited this page Sep 22, 2022 · 4 revisions

Here you can access a curated list of helpful resources:

Curated list of datasets

Datasets are essential for training AI models, and high-quality data is always in demand. Below you can find public datasets and dataset search engines. Several private datasets exist as well, but they come at a cost.

Curated list of prototyping tools

ML services and open-source code help speed up ML project planning, data pipeline construction, and model development, so an AI feature can be released quickly. The following is a list of recommended resources:

  • Hugging Face: Build, train and deploy state-of-the-art models powered by the reference open source in machine learning. Examples of common models:

    • Natural Language Processing: Transformers, masked word completion with BERT, named entity recognition with ELECTRA, text generation with GPT-2 and GPT-J, question answering with DistilBERT and RoBERTa, summarization with BART, and translation with T5.
    • Computer Vision: Image classification with ViT, Object Detection with DETR, Semantic Segmentation with SegFormer, Panoptic Segmentation with DETR.
    • Audio: Automatic Speech Recognition with Wav2Vec2, Keyword Spotting with Wav2Vec2.
    • Multimodal tasks: Visual Question Answering with ViLT.
    • Others: Knock Knock, a library that, with two additional lines of code, notifies you when your training completes or crashes.

  • Google Vertex AI: Build, deploy, and scale ML models faster, with pre-trained and custom tooling within a unified artificial intelligence platform. Google provides many AI products that speed up prototyping and production, such as AutoML and Dialogflow. Here is the complete list.

  • ZenML: Extensible, open-source MLOps framework to create production-ready machine learning pipelines. Built for data scientists, it has a simple, flexible syntax, is cloud- and tool-agnostic, and has interfaces/abstractions that are catered towards ML workflows.

  • Weights & Biases: A great MLOps platform to build models faster with experiment tracking, dataset versioning, and model management.

  • AWS SageMaker: Build, train, and deploy machine learning (ML) models for any use case with fully managed infrastructure, tools, and workflows.

  • IBM Watson Studio: Build and scale trusted AI on any cloud. Automate the AI lifecycle for MLOps.

  • Neptune.ai: Log, organize, compare, register, and share all your ML model metadata in a single place. Automate and standardize as your modelling team grows.

  • Papers with Code: Categorized list of state-of-the-art machine learning research along with open source code (if available / published by authors on GitHub).
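
To make the Hugging Face entry above more concrete, here is a minimal sketch of how the listed example tasks could be mapped to task identifiers accepted by the `pipeline` function in the `transformers` library. The dictionary keys, the helper names `task_id` and `load_pipeline`, and the English-to-French language pair are assumptions made for illustration; they are not part of the library or of this page.

```python
# Pipeline task identifiers for the example tasks listed under Hugging Face.
# The keys are informal labels chosen for this sketch, not library names.
PIPELINE_TASKS = {
    "masked word completion": "fill-mask",
    "named entity recognition": "token-classification",
    "text generation": "text-generation",
    "question answering": "question-answering",
    "summarization": "summarization",
    # The language pair is an assumption; other pairs use the same pattern.
    "translation": "translation_en_to_fr",
    "image classification": "image-classification",
    "object detection": "object-detection",
    "semantic segmentation": "image-segmentation",
    "panoptic segmentation": "image-segmentation",
    "automatic speech recognition": "automatic-speech-recognition",
    "keyword spotting": "audio-classification",
    "visual question answering": "visual-question-answering",
}


def task_id(label):
    """Return the pipeline task identifier for an informal task label."""
    key = " ".join(label.lower().split())  # normalize case and whitespace
    if key not in PIPELINE_TASKS:
        raise KeyError(f"no pipeline task known for {label!r}")
    return PIPELINE_TASKS[key]


def load_pipeline(label, model=None):
    """Build a transformers pipeline for the given label.

    With model=None the library chooses a default checkpoint for the task.
    """
    # Imported lazily because transformers is a heavy, optional dependency.
    from transformers import pipeline

    return pipeline(task_id(label), model=model)
```

Calling `load_pipeline("summarization")` would download a default checkpoint on first use, so it needs `transformers`, a backend such as PyTorch, and network access. Passing `model=` lets you pin a specific checkpoint from the Hub instead of relying on the default.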