
Use Cases

3D Recognition

Semantic Segmentation

Audio Recognition

Speech to Text

Data Augmentation



Gesture Recognition

Using wearable sensors (phones, watches etc.)


Code repositories

Hyperparameter Tuning

Image Recognition

Face Recognition

Food Recognition

Image Captioning


Person Detection

Semantic Segmentation


Programming and ML

Predict defects

Predict performance

Searching code

Writing code



Crossword question answerers

Database queries

Named entity resolution

Also known as deduplication and record linkage (but not entity recognition which is picking up the names and classifying them in running text)
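
As a rough illustration of record linkage, the sketch below groups name variants that probably refer to the same entity. It assumes a simple character-level similarity from the standard library's `difflib`; the `normalize`, `similarity` and `resolve_entities` helpers, the 0.85 threshold and the sample records are all hypothetical, and real systems typically add blocking and field-aware comparison.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and drop punctuation so trivial variations compare equal."""
    kept = [ch for ch in name.lower() if ch.isalnum() or ch.isspace()]
    return " ".join("".join(kept).split())


def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1] on normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def resolve_entities(records, threshold=0.85):
    """Greedy clustering: add each record to the first cluster whose
    representative (its first member) is similar enough, else start a new one."""
    clusters = []
    for record in records:
        for cluster in clusters:
            if similarity(record, cluster[0]) >= threshold:
                cluster.append(record)
                break
        else:
            clusters.append([record])
    return clusters


# Hypothetical records: two companies written in several ways.
records = ["Acme Corp.", "ACME Corp", "Globex Inc", "acme corp", "Globex, Inc."]
clusters = resolve_entities(records)
```

Greedy clustering depends on record order and on the chosen threshold, so treat it as a baseline rather than a finished deduplication method.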

Reverse dictionaries

Also known as concept finders. They return the name of a concept given a definition or description.
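
A minimal sketch of the idea, assuming a tiny hand-written glossary and plain word overlap (Jaccard similarity) as the matching score. The `GLOSSARY` entries, the stop-word list and the helper names are hypothetical; practical reverse dictionaries usually rely on learned embeddings instead.

```python
STOP_WORDS = {"a", "an", "the", "of", "that", "to", "for", "or", "and", "in", "is"}

# Hypothetical glossary: word -> definition.
GLOSSARY = {
    "thermometer": "an instrument for measuring temperature",
    "barometer": "an instrument for measuring atmospheric pressure",
    "telescope": "an instrument for viewing distant objects",
}


def tokens(text):
    """Lowercased content words of a text, with punctuation removed."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return {word for word in cleaned.split() if word not in STOP_WORDS}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reverse_lookup(description, glossary=GLOSSARY):
    """Return the glossary word whose definition best overlaps the description."""
    query = tokens(description)
    return max(glossary, key=lambda word: jaccard(query, tokens(glossary[word])))


best = reverse_lookup("device that measures the temperature")
```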

Sequence to sequence

Semantic analysis



Text classification

Text to Image

Text to Speech

Personality recognition

  • Mining Facebook Data for Predictive Personality Modeling (Dejan Markovikj, Sonja Gievska, Michal Kosinski, David Stillwell)
  • Personality Traits Recognition on Social Network — Facebook (Firoj Alam, Evgeny A. Stepanov, Giuseppe Riccardi)
  • The Relationship Between Dimensions of Love, Personality, and Relationship Length (Gorkan Ahmetoglu, Viren Swami, Tomas Chamorro-Premuzic)



Transfer Learning


Video recognition

Pose recognition

Object detection

Here are video-specific methods. See also Semantic Segmentation.

Scene Segmentation

Detects when one video (shot/scene/chapter) ends and another begins
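
One classic way to approximate this is to compare intensity histograms of consecutive frames and flag a boundary when they differ sharply. The sketch below assumes each frame is a flat list of 8-bit grayscale values; the `histogram` and `shot_boundaries` helpers, the bin count and the 0.5 threshold are illustrative assumptions, not a method taken from any of the listed projects.

```python
def histogram(frame, bins=8):
    """Normalized histogram of 8-bit grayscale pixel values."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    return [count / len(frame) for count in counts]


def shot_boundaries(frames, threshold=0.5):
    """Indices of frames that start a new shot.

    The distance is half the L1 difference between consecutive histograms,
    so it ranges from 0 (identical) to 1 (no overlap at all).
    """
    if not frames:
        return []
    boundaries = []
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        distance = sum(abs(p - c) for p, c in zip(previous, current)) / 2
        if distance > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


# Hypothetical clip: three dark frames followed by two bright ones.
dark = [10, 20, 15, 12]
bright = [240, 250, 245, 230]
cuts = shot_boundaries([dark, dark, dark, bright, bright])
```

Histogram differences catch hard cuts but tend to miss gradual transitions such as fades, which need a windowed comparison.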

Video Captioning

Video Classification


Multiple Modalities

Open problems

  • Recycled goods (not solved, no dataset)
  • Safety symbols on cardboard boxes (not solved, no dataset)


Amazon SageMaker

  • Distributed Training: You can’t choose the number of workers and parameter servers independently
  • Job Startup Latency: Up to 5 minutes single node
  • Hyperparameter Tuning: In preview, and supports only the built-in algorithms
  • Batch Prediction: Not supported
  • GPU readiness: Bring your own docker image with CUDA installed
  • Auto-scale Online Serving: You need to specify the number of nodes
  • Training Job Monitoring: No monitoring

Apple ARKit

Apple Core ML

iOS framework from Apple to integrate machine learning models into your app.

Apple Create ML

Apple framework used with familiar tools like Swift and macOS playgrounds to create and train custom machine learning models on your Mac.

Apple Natural Language Framework

Firebase ML Kit

Google AutoML


Advantages:

  • lets users train their own custom machine learning models without having to write a single line of code
  • uses Transfer Learning (the more data and customers, the better the results)
  • is fully integrated with other Google Cloud services (Google Cloud Storage to store data, Cloud ML or Vision API to customize the model, etc.)

Limitations:

  • limited to image recognition (2018-Q1)
  • doesn't allow downloading a trained model

Google Datalab

  • Powerful interactive tool created to explore, analyze, transform and visualize data and build machine learning models on Google Cloud Platform. It runs on Google Compute Engine and connects to multiple cloud services easily so you can focus on your data science tasks.
  • Built on Jupyter (formerly IPython), which boasts a thriving ecosystem of modules and a robust knowledge base.
  • Enables analysis of your data on Google BigQuery, Cloud Machine Learning Engine, Google Compute Engine, and Google Cloud Storage using Python, SQL, and JavaScript (for BigQuery user-defined functions).

Google Dataprep

Intelligent data service for visually exploring, cleaning, and preparing structured and unstructured data for analysis. Cloud Dataprep is serverless and works at any scale. Easy data preparation with clicks and no code.

Google ML Engine

  • Samples & Tutorials
  • Samples for usage
  • Distributed Training: Specify number of nodes, types, (workers/PS), associated accelerators, and sizes
  • Job Startup Latency: 90 seconds for single node
  • Hyperparameter Tuning: Grid Search, Random Search, and Bayesian Optimisation
  • Batch Prediction: You can submit a batch prediction job for high throughputs
  • GPU readiness: Out-of-the box, either via scale-tier, or config file
  • Auto-scale Online Serving: Scaled up to your specified maximum number of nodes, down to 0 nodes if no requests for 5 minutes
  • Training Job Monitoring: Full monitoring to the cluster nodes (CPU, Memory, etc.)
  • Automation of ML: AutoML - Vision, NLP, Speech, etc.
  • Specialised Hardware: Tensor Processing Units (TPUs)
  • SQL-supported ML: BQML
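
To make the difference between the grid search and random search strategies named above concrete, here is a small local sketch. It does not call any Google API: `validation_loss` is a hypothetical stand-in for training a model and measuring its validation loss, and the parameter names and ranges are assumptions. Bayesian optimisation is not shown.

```python
import itertools
import math
import random


def validation_loss(learning_rate, depth):
    """Hypothetical objective, lowest at learning_rate=0.1 and depth=4."""
    return (math.log10(learning_rate) + 1) ** 2 + 0.01 * (depth - 4) ** 2


def grid_search(space):
    """Evaluate every combination of the listed values and keep the best."""
    names = list(space)
    candidates = (
        dict(zip(names, values)) for values in itertools.product(*space.values())
    )
    return min(candidates, key=lambda params: validation_loss(**params))


def random_search(trials, seed=0):
    """Sample random configurations; the learning rate is drawn log-uniformly."""
    rng = random.Random(seed)
    best_params, best_loss = None, math.inf
    for _ in range(trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-3, 0),
            "depth": rng.randint(1, 10),
        }
        loss = validation_loss(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


grid_best = grid_search({"learning_rate": [0.001, 0.01, 0.1, 1.0], "depth": [2, 4, 8]})
random_best, random_loss = random_search(trials=50)
```

Grid search cost grows multiplicatively with each added parameter, while random search keeps a fixed budget of trials, which is why it is often preferred when only a few parameters matter.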

Google Natural language

  • entity recognition: extract information about people, places, events, and much more mentioned in text documents, news articles, or blog posts
  • sentiment analysis: understand the overall sentiment expressed in a block of text
  • multilingual support
  • syntax analysis: extract tokens and sentences, identify parts of speech (PoS) and create dependency parse trees for each sentence

Google Deep Learning Virtual Machine

  • VMs with CPU and GPU

Google Mobile Vision

  • Detect Faces (finds facial landmarks such as the eyes, nose, and mouth; doesn't identify a person)
  • Scan barcodes
  • Recognize Text

Google Speech API

  • speech recognition
  • word hints: Can provide context hints for improved accuracy. Especially useful for device and app use cases.
  • noise robustness: No need for signal processing or noise cancellation before calling API; can handle noisy audio from a variety of environments
  • realtime results: can stream text results, returning partial recognition results as they become available. Can also be run on buffered or archived audio files.
  • over 80 languages
  • can also filter inappropriate content in text results

Google Translation API

  • Supports more than 100 languages and thousands of language pairs
  • automatic language detection
  • continuous updates: Translation API is learning from logs analysis and human translation examples. Existing language pairs improve and new language pairs come online at no additional cost

Google Video Intelligence

  • Label Detection - Detect entities within the video, such as "dog", "flower" or "car"
  • Shot Change Detection - Detect scene changes within the video
  • Explicit Content Detection - Detect adult content within a video
  • Video Transcription - Automatically transcribes video content in English

Google Vision API

  • Object recognition: detect broad sets of categories within an image, ranging from modes of transportation to animals
  • Facial sentiment and logos: Analyze facial features to detect emotions: joy, sorrow, anger; detect logos
  • Extract text: detect and extract text within an image, with support of many languages and automatic language identification
  • Detect inappropriate content: detect different types of inappropriate content, from adult to violent content

Experiments Frameworks

Tools to help you configure, organize, log and reproduce experiments

Jupyter Notebook


Lobe

Lobe is an easy-to-use visual tool that lets you build custom deep learning models, quickly train them, and ship them directly in your app without writing any code.

Microsoft Azure Bot Service

Microsoft Azure Machine Learning

Microsoft Cognitive Services

Microsoft Cognitive Toolkit


Syn Bot Oscova


Tableau

  • Data visualization tool created by Tableau Software.
  • Connects to files, relational databases and Big Data sources, and turns data into interactive dashboards.


Turi Create

Apple Python framework that simplifies the development of custom machine learning models. You don't have to be a machine learning expert to add recommendations, object detection, image classification, image similarity or activity classification to your app.

  • Export models to Core ML for use in iOS, macOS, watchOS, and tvOS apps.
  • A Guide to Turi Create from WWDC 2018


Google AIY

  • Vision Kit - Do-it-yourself intelligent camera. Experiment with image recognition using neural networks on Raspberry Pi.
  • Voice Kit - Do-it-yourself intelligent speaker. Experiment with voice recognition and the Google Assistant on Raspberry Pi.




Decision Trees


Advantages:

  • can model nonlinearities
  • are highly interpretable
  • do not require extensive feature preprocessing
  • do not require enormous data sets

Disadvantages:

  • tend to overfit
    • fixed by building a decision forest with boosting
  • unstable/nondeterministic (can generate different results when trained on the same data)
    • fixed by using bootstrap aggregation/bagging (a bagged forest)
  • map directly from the raw input to the label
    • better to use neural nets, which can learn intermediate representations

Hyperparameters:

  • tree depth
  • maximum number of leaf nodes
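
The bagging remedy mentioned above can be sketched with the smallest possible tree, a depth-1 stump on a single feature. Everything here is an illustrative assumption: the `fit_stump`, `fit_bagged` and `predict_bagged` helpers, the toy data set and the choice of 25 trees. A real project would more likely use a library implementation.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a depth-1 tree on (x, label) pairs with labels 0 or 1.

    Returns (threshold, left_label, right_label) with the fewest training errors.
    """
    xs = sorted({x for x, _ in points})
    thresholds = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or [xs[0]]
    best = None
    for threshold in thresholds:
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= threshold else right) != y for x, y in points)
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    return best[1:]


def predict_stump(stump, x):
    threshold, left, right = stump
    return left if x <= threshold else right


def fit_bagged(points, n_trees=25, seed=0):
    """Bootstrap aggregation: each stump sees a resample drawn with replacement."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(points) for _ in points]) for _ in range(n_trees)]


def predict_bagged(stumps, x):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(stump, x) for stump in stumps)
    return votes.most_common(1)[0][0]


# Hypothetical one-feature data set: small x is class 0, large x is class 1.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0), (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
forest = fit_bagged(data)
```

Individual stumps place their threshold differently depending on the resample, but the vote averages that variation out, which is the stabilizing effect bagging is used for.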


Embedding models

Evolutionary Algorithms

Metrics of dataset quality

  • Statistical metrics
    • descriptive statistics: dimensionality, unique subject counts, systematic replicate counts, PDFs and CDFs (probability density and cumulative distribution functions)
    • cohort design
    • power analysis
    • sensitivity analysis
    • multiple testing correction analysis
    • dynamic range sensitivity
  • Numerical analysis metrics
    • number of clusters
    • PCA dimensions
    • MDS space dimensions/distances/curves/surfaces
    • variance between buckets/bags/trees/branches
    • informative/discriminative indices (i.e. how much do the top 10 features differ from one another and from the group)
    • feature engineering differentiators
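
A few of the descriptive statistics above can be computed with the standard library alone. The sketch assumes each sample is a dict with a `subject` key plus numeric features; that layout, the `describe` and `empirical_cdf` helpers and the sample rows are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def describe(rows, subject_key="subject"):
    """Basic shape statistics for a list of sample dicts."""
    features = sorted({key for row in rows for key in row} - {subject_key})
    samples_per_subject = Counter(row[subject_key] for row in rows)
    return {
        "rows": len(rows),
        "dimensionality": len(features),
        "unique_subjects": len(samples_per_subject),
        # Subjects measured more than once, i.e. with replicates.
        "replicated_subjects": sum(1 for n in samples_per_subject.values() if n > 1),
    }


def empirical_cdf(values):
    """Return a function giving the fraction of values less than or equal to x."""
    ordered = sorted(values)

    def cdf(x):
        return bisect_right(ordered, x) / len(ordered)

    return cdf


# Hypothetical data set with one replicated subject.
rows = [
    {"subject": "s1", "height": 170, "weight": 65},
    {"subject": "s1", "height": 171, "weight": 66},
    {"subject": "s2", "height": 160, "weight": 55},
    {"subject": "s3", "height": 180, "weight": 80},
]
summary = describe(rows)
height_cdf = empirical_cdf([row["height"] for row in rows])
```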

Neural Networks

Approaches when our model doesn’t work:

  • Fetch more data
  • Add more layers to Neural Network
  • Try some new approach in Neural Network
  • Train longer (increase the number of iterations)
  • Change batch size
  • Try Regularisation
  • Check Bias Variance trade-off to avoid under and overfitting
  • Use more GPUs for faster computation
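
Which of these approaches to try first usually depends on whether the model underfits or overfits. The helper below is a rough heuristic, not a rule from the source: it compares training and validation error against a target, and the `diagnose` name, both thresholds and the mapping from symptoms to suggestions are assumptions.

```python
def diagnose(train_error, validation_error, target_error=0.05, gap_tolerance=0.05):
    """Suggest next steps from training and validation error rates.

    High training error hints at underfitting (high bias); a large gap between
    validation and training error hints at overfitting (high variance).
    """
    suggestions = []
    if train_error > target_error:
        suggestions += ["add more layers", "train longer", "try a new architecture"]
    if validation_error - train_error > gap_tolerance:
        suggestions += ["fetch more data", "try regularisation"]
    return suggestions


# Hypothetical error rates from three different training runs.
underfit = diagnose(train_error=0.20, validation_error=0.22)
overfit = diagnose(train_error=0.01, validation_error=0.15)
healthy = diagnose(train_error=0.01, validation_error=0.03)
```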

Back-propagation problems:

  • it requires labeled training data, while almost all data is unlabeled
  • the learning time does not scale well, which means it is very slow in networks with multiple hidden layers
  • it can get stuck in poor local optima, so for deep nets the result can be far from optimal
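
The local-optimum problem is easiest to see on a one-dimensional toy loss. The sketch below runs plain gradient descent on a hand-picked non-convex polynomial from two starting points; the function, learning rate and step count are illustrative assumptions and have nothing to do with a real network's loss surface.

```python
def loss(x):
    """Non-convex toy loss with two minima of different depth."""
    return x ** 4 - 3 * x ** 2 + x


def gradient(x):
    """Analytic derivative of the toy loss."""
    return 4 * x ** 3 - 6 * x + 1


def gradient_descent(start, learning_rate=0.01, steps=1000):
    """Follow the negative gradient from a given starting point."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


# Same algorithm, different initialization, different answer.
from_left = gradient_descent(start=-2.0)   # settles near x = -1.30 (deeper minimum)
from_right = gradient_descent(start=2.0)   # settles near x = 1.13 (shallower minimum)
```

Starting from the right, the optimizer stops in the shallower basin even though a better solution exists, because the gradient there carries no information about the other minimum.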

Capsule Networks

Convolutional Neural Networks

Deep Residual Networks

Distributed Neural Networks

Feed-Forward Neural Networks

  • Perceptrons
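
For reference, a perceptron is small enough to write out in full. This sketch trains a single unit with the classic perceptron learning rule on the AND function; the function names, the learning rate of 1 and the 20-epoch budget are assumptions chosen for the example.

```python
def predict(weights, bias, inputs):
    """Step activation: output 1 when the weighted sum is positive."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1):
    """Perceptron rule: nudge weights by the prediction error times the input."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# AND is linearly separable, so the rule is guaranteed to converge on it.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_GATE)
outputs = [predict(weights, bias, inputs) for inputs, _ in AND_GATE]
```

A single perceptron cannot learn functions that are not linearly separable, such as XOR, which is one motivation for multi-layer feed-forward networks.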

Gated Recurrent Neural Networks

Generative Adversarial Networks

Long Short-Term Memory Networks

Recurrent Neural Networks

Symmetrically Connected Networks

Reinforcement Learning


Deep learning

  • Deep Learning: A Critical Appraisal by Gary Marcus, 2018
    • Deep learning thus far is data hungry
    • Deep learning thus far is shallow and has limited capacity for transfer
    • Deep learning thus far has no natural way to deal with hierarchical structure
    • Deep learning thus far has struggled with open-ended inference
    • Deep learning thus far is not sufficiently transparent
    • Deep learning thus far has not been well integrated with prior knowledge
    • Deep learning thus far cannot inherently distinguish causation from correlation
    • Deep learning presumes a largely stable world, in ways that may be problematic
    • Deep learning thus far works well as an approximation, but its answers often cannot be fully trusted
    • Deep learning thus far is difficult to engineer with
  • Software 2.0 by Andrej Karpathy, 2017

Interview preparation


Google oriented courses






  • ScanNet - RGB-D video dataset annotated with 3D camera poses, surface reconstructions, and instance-level semantic segmentations
  • SceneNet - Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth




Research Groups


The Browser of a Data Scientist



A statistician drowned crossing a river that was only three feet deep on average