forked from apinf/ml-rest

REST API (and possible UI) for Machine Learning workflows


eMediaCode/ml-rest

 
 


Need

Businesses and governments hold large amounts of data and want to learn about the structures and patterns in it. This can include making predictions that extend from the data.

Use case examples

Time-series analysis

Companies, research organizations, governments, etc. often collect data/observations containing timestamps. It is useful in many cases to find trends or patterns in data over time, including the possibility to forecast future trends. These types of analyses fall under the umbrella of time series analysis.
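As a rough illustration of trend finding and forecasting, the sketch below fits a straight line to a series by least squares and extends it forward. It assumes evenly spaced observations; the function names `fit_linear_trend` and `forecast` are hypothetical and not part of this project.

```python
def fit_linear_trend(values):
    """Least-squares fit of values against positions 0..n-1.

    Returns (slope, intercept). Assumes evenly spaced observations.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(values, steps):
    """Extend the fitted line `steps` positions past the end of the series."""
    slope, intercept = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(steps)]


if __name__ == "__main__":
    monthly_totals = [1.0, 2.0, 3.0, 4.0]
    print(forecast(monthly_totals, steps=2))
```

A straight line is the simplest possible model; real series with seasonality or irregular timestamps would likely need a dedicated time-series method.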

Specific examples include:

Anomaly detection

We can search for values that fall outside the normal range, or variance, in medium or large data sets. These 'abnormal' data may point to problems or unique conditions that need attention. Anomaly detection algorithms can help decision makers quickly find unusual segments of data.

Specific examples include:

  • fraud detection
  • server monitoring and alerting
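One simple way to find values outside the normal range is a z-score test, sketched below with the standard library only. This is an illustrative assumption rather than the method this project uses; `find_anomalies` and its `threshold` parameter are hypothetical names.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs lying more than `threshold`
    population standard deviations away from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


if __name__ == "__main__":
    # Hypothetical server response times in milliseconds.
    response_times = [10, 11, 9, 10, 12, 10, 11, 9, 10, 95]
    print(find_anomalies(response_times, threshold=2.0))
```

In a small sample a single outlier inflates the standard deviation noticeably, which is why the example uses a threshold of 2.0 rather than the more common 3.0.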

Related resources

Situation

There are myriad tools to help people design Machine Learning workflows. However, there does not appear to be a visual programming environment with machine learning primitives.

Goal

Build a general purpose machine learning programming environment that is accessible by a REST API and web user interface.

Roadmap

  • Sketch out REST API using design-first API tool
    • seek feedback on API design from Orange3 developers, APInf team, ML community
  • Create wireframe, and possibly mockups, of User Interface
  • Research/choose framework(s) and libraries to commence development
    • REST framework
    • UI framework (if applicable)
    • Visualization framework (if applicable)
  • Scaffold initial REST API
  • Prototype initial User Interface using UI framework

Design

The design will most likely consist of a REST API and User Interface, developed as separate components.

API

A REST API would make it easy to use Machine Learning algorithms, since users would not have to install or maintain the ML software.

The API might be structured to mirror the Orange3 User Interface. Specifically, the Orange3 UI has the following structure:

  • Data - widgets for data handling (import, export, random data generation, processing etc)
  • Visualize - widgets to represent data in various forms (scatter plot, tree, histogram, etc)
  • Model - widgets to analyze data and pick predictive algorithm(s)
  • Evaluate - widgets to test the strength of chosen predictive algorithm(s)
  • Unsupervised - widgets for selecting unsupervised learning models (these can probably be combined under the Model section of the API)
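To make the structure above more concrete, here is a minimal sketch of how top-level API resources could mirror the Orange3 categories. The resource names, paths, and the `handle` function are all hypothetical, since the API has not been designed yet; Unsupervised is folded into `model`, as suggested above.

```python
import json

# Hypothetical resource groups mirroring the Orange3 widget categories.
RESOURCES = {
    "data": "data handling (import, export, random data generation, processing)",
    "visualize": "representing data in various forms",
    "model": "analyzing data and picking predictive or unsupervised algorithms",
    "evaluate": "testing the strength of chosen predictive algorithms",
}


def handle(method, path):
    """Return (status_code, json_body) for a request to the sketch API."""
    parts = [p for p in path.split("/") if p]
    if method != "GET":
        return 405, json.dumps({"error": "method not allowed"})
    if not parts:
        # The root lists the available resource groups.
        return 200, json.dumps({"resources": sorted(RESOURCES)})
    if len(parts) == 1 and parts[0] in RESOURCES:
        name = parts[0]
        return 200, json.dumps({"name": name, "description": RESOURCES[name]})
    return 404, json.dumps({"error": "not found"})


if __name__ == "__main__":
    print(handle("GET", "/"))
    print(handle("GET", "/model"))
```

A real implementation would sit behind one of the REST frameworks listed under Existing tools; the plain function here only shows the resource layout.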

UI

The user interface for machine learning algorithms will make it easy for people with little programming experience to build machine learning services. The UI should include interfaces for interacting with data, sequencing ML tasks, and accessing output. It might also include basic visualizations to give users insight into data (histogram, etc.).

UI Mockup

UI contains features such as:

  • Upload .csv file with historical data
  • Set the number of traits (hallmarks, i.e. the features of the data set on which predictions are based)
  • Area for displaying the output result
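The upload step could be backed by something like the sketch below, which splits an uploaded CSV into trait columns and a target column. It assumes a header row, numeric values, and that the column immediately after the traits is the value to predict; `load_history` is a hypothetical helper, not existing project code.

```python
import csv
import io


def load_history(csv_text, n_traits):
    """Split CSV text into (trait_names, trait_rows, targets).

    Assumes a header row, numeric cells, and that the column right
    after the first `n_traits` columns holds the value to predict.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    if not 0 < n_traits < len(header):
        raise ValueError("n_traits must leave at least one target column")
    traits, targets = [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        traits.append([float(cell) for cell in row[:n_traits]])
        targets.append(float(row[n_traits]))
    return header[:n_traits], traits, targets


if __name__ == "__main__":
    sample = "temp,humidity,sales\n20,0.5,100\n25,0.4,130\n"
    names, rows, targets = load_history(sample, n_traits=2)
    print(names, rows, targets)
```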

Existing tools

It is worth building on top of existing tools, to make our work more focused. This section outlines relevant tools for building the idea as easily as possible.

REST framework(s)

  • Connexion: a Swagger/OpenAPI-first framework for Python on top of Flask, with automatic endpoint validation and OAuth2 support
  • Eve: a simple REST framework based on Flask
  • Lepo: contract-first REST APIs in Django

Machine Learning Framework(s)

  • scikit-learn: a popular and consistent API for many machine learning algorithms, written in Python

Machine Learning User Interface(s)

Orange3

Orange3: machine learning user interface with drag and drop modelling, visualization, data management and more.

  • based on scikit-learn
  • open issue for REST API design: biolab/orange3#1419
  • may need a web-based UI widget library

While Orange3 has a user interface, it is based on the Qt framework. This design decision means Orange3 is largely limited to desktop usage. It may be desirable to build a web-native user interface, so that end users need nothing more than a web browser to use the software.

Machine Learning REST Interface(s)

General UI widgets

To build out the overall user interface, we can select an existing JS UI framework, such as:

Graph/data flow widgets

Following the conventions in the Orange3 user interface, ML sequences can be modeled as data flows. To facilitate this type of modelling/interaction, we can build on an existing JavaScript UI framework such as the following:

Flow-based programming environments

There are some programming environments that support a flow-based visual workflow. The following examples are open-source and run in a web browser:

  • Node-RED is a browser-based editor that makes it easy to wire together flows using the wide range of nodes in its palette; flows can be deployed to its runtime in a single click.
  • WireCloud is an end-user centred web application mashup platform aimed at allowing end users without programming skills to easily create web applications and dashboards/cockpits.
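Behind a visual editor of this kind, an ML flow can be represented as a directed acyclic graph whose nodes run in dependency order. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+); the node names and the `execution_order` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Each node maps to the set of nodes whose output it consumes.
flow = {
    "load_csv": set(),
    "select_columns": {"load_csv"},
    "train_model": {"select_columns"},
    "evaluate": {"train_model", "select_columns"},
}


def execution_order(graph):
    """Return the nodes in an order that respects every dependency.

    Raises graphlib.CycleError if the flow contains a loop.
    """
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(flow))
    try:
        execution_order({"a": {"b"}, "b": {"a"}})
    except CycleError:
        print("flow contains a cycle")
```

Validating the graph before execution lets the UI reject a wiring mistake, such as a loop, before any ML task runs.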

Visualization

For similar reasons as the user interface, the visualization framework should be based on web standards.

A discussion was opened in the Orange3 repository related to open-source, web-based data visualization frameworks.

Proposals for the data visualization framework include:

  • Altair - Altair is a declarative statistical visualization library for Python, based on the powerful Vega-Lite visualization grammar.

  • Bokeh - Bokeh is a Python interactive visualization library that targets modern web browsers for presentation.

  • Matplotlib D3 (mpld3) - The mpld3 project brings together Matplotlib, the popular Python-based graphing library, and D3js, the popular JavaScript library for creating interactive data visualizations for the web.

Resources
