This repository has been archived by the owner on Nov 15, 2023. It is now read-only.
Steve Martinelli edited this page Sep 8, 2018 · 22 revisions

Welcome to the model-evaluation-workbench wiki!

Short Name

Performance Evaluation of Machine Learning Models

Short Description

This code pattern compares Watson cognitive service models and shows various performance metrics for those models, helping users choose the model that best fits their requirements.

Offering Type

Artificial Intelligence

Introduction

Numerous machine learning models exist, each created to achieve a specific task. This code pattern shows you a way to compare Watson cognitive service models to decide which one performs better on a particular set of data. It gives users a platform to configure models, provide input data, execute the models, and prepare performance evaluation statistics.

Author

By Srikanth Manne, Muralidhar Chavan, Naveen Sharma, Mohammad Arshad

Code

Demo

  • NA

Video

Overview

This code pattern covers performance evaluation and comparison of Watson cognitive service models. The Watson Model Evaluation Workbench application gives users a platform to configure, execute, and test cognitive models, prepare performance evaluation metrics, and calculate performance statistics such as the confusion matrix and ROC curve. Different models perform differently on a given set of data, so retrieving and comparing these performance parameters helps users decide which model to choose for their requirements.
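To illustrate the kind of statistics the workbench reports, here is a minimal sketch (in Python, purely illustrative and not part of the pattern's Java code) of computing a confusion matrix and accuracy from expected versus predicted labels:

```python
from collections import Counter

def confusion_matrix(expected, predicted, labels):
    # counts[(truth, guess)] = number of test records with that pairing
    counts = Counter(zip(expected, predicted))
    # rows = ground-truth label, columns = model prediction
    return [[counts[(t, g)] for g in labels] for t in labels]

expected  = ["pos", "pos", "neg", "neg", "pos"]
predicted = ["pos", "neg", "neg", "pos", "pos"]
matrix = confusion_matrix(expected, predicted, ["pos", "neg"])

# Accuracy is the fraction of records on the matrix diagonal
accuracy = sum(matrix[i][i] for i in range(2)) / len(expected)
```

Repeating this for each configured model yields the per-model summary statistics that the dashboard compares.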

This code pattern uses the Java Liberty runtime, Watson Natural Language Classifier, and IBM Cloud Object Storage.

Flow


  1. The user launches the application.
  2. IBM Cloud authenticates the request and redirects it to the application.
  3. The application parses the input data provided for evaluating the models.
  4. The application invokes an adapter that calls cognitive services such as Natural Language Classifier and Natural Language Understanding.
  5. The adapter parses the cognitive model services configuration.
  6. The adapter connects to the cognitive services.
  7. The adapter gets responses from the cognitive services.
  8. The application compares the expected results with the actual results and performs the evaluation.
  9. The performance results are sent back to the client devices.
  10. The performance analysis is shown on the UI.
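
Steps 5–8 of the flow above can be sketched as follows. This is a hypothetical Python outline, not the pattern's actual Java adapter classes; `classify` stands in for a call to a cognitive service:

```python
def evaluate_model(classify, test_records):
    """classify: callable mapping input text -> predicted label (stands in
    for a cognitive service call, which would be an HTTP request in step 6).
    test_records: list of (text, expected_label) pairs with ground truth."""
    results = []
    for text, expected in test_records:
        actual = classify(text)             # step 7: response from the service
        results.append((expected, actual))  # step 8: expected vs. actual
    correct = sum(1 for e, a in results if e == a)
    return correct / len(results), results

# Toy stand-in model, illustrative only
toy_model = lambda text: "positive" if "good" in text else "negative"
score, pairs = evaluate_model(toy_model, [("good movie", "positive"),
                                          ("bad plot", "negative"),
                                          ("dull but good acting", "negative")])
```

Running `evaluate_model` once per configured model produces the per-model results that steps 9–10 send back and display.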

Included components

  • Java Liberty Runtime - Develop, deploy, and scale Java web apps with ease. IBM WebSphere Liberty Profile is a highly composable, ultra-fast, ultra-light profile of IBM WebSphere Application Server designed for the cloud.
  • Natural Language Classifier - The Natural Language Classifier service applies cognitive computing techniques to return the best matching classes for a sentence or phrase.
  • IBM Cloud Object Storage - An IBM Cloud service that provides an unstructured/structured cloud data store to build and deliver cost-effective apps and services with high reliability and fast speed to market.

Featured technologies

  • Liberty for Java: Develop, deploy, and scale Java web apps with ease. IBM WebSphere Liberty Profile is a highly composable, ultra-fast, ultra-light profile of IBM WebSphere Application Server designed for the cloud.
  • Artificial Intelligence: Intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans.

Blog

In the machine learning world, numerous models are created to achieve specific tasks. With so many models available as open source, how can one decide which model to use? Which model performs better? What are the various performance parameters for different models?
The question, then, is which model is the right fit for your requirement.

This code pattern shows you a way to compare Watson cognitive service models to decide which one performs better on a particular set of data. It gives users a platform to configure models, provide input data, execute the models, and prepare performance evaluation statistics such as the confusion matrix and ROC curve. The workbench presents recommendations, the ROC curve, and summary statistics for all configured models on a single dashboard screen, enabling users to select the best-performing model.

The code pattern demonstrates a way to compare Watson cognitive service models.

The high-level steps involved in comparing the models are:

  • Create the models to be compared, or use available ones.
  • Develop an application that can consume these models.
  • Upload the test data with the ground truth for the supported Watson cognitive service.
  • The application compares the consumed models based on the real data.
  • The application presents recommendations, the ROC curve, and summary statistics for all configured models on a single dashboard screen, enabling the user to select the best-performing model.
  • These statistics can be used for fine-tuning each model's parameters and selecting the best-performing model.

This code pattern can be extended to support other types of machine learning models.

Links