LIT is a visual tool for understanding ML models with text, image, and tabular data. It interprets behaviour, identifies strengths/weaknesses, and enhances transparency.

NashTech-Labs/LIT-Illuminating-Model-Insights

MLRun and LIT Integration

Introduction

We use MLRun, an open-source MLOps platform, to orchestrate a sentiment analysis pipeline on the Flipkart review dataset. The pipeline uses machine learning to analyze user reviews and ratings and predict whether a review is positive or negative. It covers data ingestion, model training, and deployment as a serverless function. LIT-NLP then fetches the trained model from the MLRun artifact registry and visualizes it.

What is MLRun?

MLRun is an open-source MLOps platform designed to streamline the development, deployment, and management of machine learning models. It offers a unified environment for data scientists and engineers to collaborate, automate, and scale their machine learning workflows.

What is LIT?

The Language Interpretability Tool (LIT) is an open-source visualization platform made by Google for inspecting NLP models; it offers a wide variety of metrics for testing a model. We integrate LIT into the pipeline flow to show plots for our pretrained model.

Getting Started

Prerequisites

  • First, set up two environments: one for MLRun (compatible only with Python 3.9) and one for LIT (we used Python 3.10 for a more complete display of features). Keeping them separate ensures the end-to-end flow of the pipeline isn't hindered.
  • Install MLRun and LIT following their official documentation.
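
Because the two stages run under different interpreters, it can help to fail fast when a script is started from the wrong environment. The following is a minimal sketch, assuming a hypothetical helper named require_python that is not part of this repository:

```python
import sys


def require_python(major, minor, version_info=None):
    """Raise RuntimeError unless the interpreter is exactly major.minor.

    `version_info` defaults to the running interpreter; it is a parameter
    only so the check can be exercised without switching environments.
    """
    current = tuple((version_info or sys.version_info)[:2])
    if current != (major, minor):
        raise RuntimeError(
            f"Expected Python {major}.{minor}, got {current[0]}.{current[1]}"
        )
    return current
```

Under this assumption, main.py could call require_python(3, 9) and lit_app.py could call require_python(3, 10) before doing any other work.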

Installation

  1. Clone this repository to your local machine.
    git clone https://github.com/amanknoldus/mlrun_template/Mlrun-Pipeline
  2. Navigate to the project directory.
  3. Install the required dependencies.
       pip install -r requirements.txt

Usage and Results from each component of pipeline

Data Preparation and Visualization

Before running the pipeline, ensure your dataset is in a format compatible with LIT; this may involve preprocessing or converting the data. The MLRun pipeline handles this step: when triggered, it encodes the dataset into numerical form and produces visualizations of the data, including the output of .describe() and a correlation matrix.

Screenshot from 2024-04-05 14-16-25

Screenshot from 2024-04-05 14-16-57

Screenshot from 2024-04-05 14-19-16
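
The preprocessing step described above can be sketched with pandas. This is an illustration only, not the repository's code: the column names, the sample rows, and the encode_categoricals helper are all assumptions.

```python
import pandas as pd

# Tiny stand-in for the review data; column names and rows are assumptions.
reviews = pd.DataFrame(
    {
        "review": ["great phone", "stopped working", "value for money", "poor battery"],
        "rating": [5, 1, 4, 2],
        "sentiment": ["positive", "negative", "positive", "negative"],
    }
)


def encode_categoricals(df, columns):
    """Return a copy with the given columns replaced by integer codes."""
    encoded = df.copy()
    for column in columns:
        # Categories are sorted, so codes follow alphabetical order.
        encoded[column] = encoded[column].astype("category").cat.codes
    return encoded


encoded = encode_categoricals(reviews, ["sentiment"])

# Summary statistics (count, mean, std, quartiles) for the numeric columns.
summary = encoded.describe()

# Pairwise Pearson correlation between the numeric columns.
correlation = encoded.select_dtypes("number").corr()

print(summary)
print(correlation)
```

The free-text review column is left out of both tables because it is not numeric; in a real pipeline it would go through the model's tokenizer rather than a label encoder.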

Model training

The transformer model is trained on the dataset, and the training logs and metrics are displayed in the MLRun UI.

Screenshot from 2024-04-05 14-29-41

Screenshot from 2024-04-05 14-29-56
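
As a rough sketch of how a metric might reach the MLRun UI, the helper below computes accuracy and records it through the run context's log_result(key, value) method. The repository's training code is not shown here, so the function names and the PrintContext stand-in (used to run the sketch without MLRun) are assumptions.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def log_eval_metrics(context, labels, predictions):
    """Compute accuracy and record it on the run.

    `context` is expected to be the MLRun run context passed to a handler;
    only its `log_result(key, value)` method is used here.
    """
    acc = accuracy(labels, predictions)
    context.log_result("accuracy", acc)
    return acc


class PrintContext:
    """Hypothetical stand-in for the MLRun context, for local runs."""

    def __init__(self):
        self.results = {}

    def log_result(self, key, value):
        self.results[key] = value
        print(f"{key}={value}")


ctx = PrintContext()
log_eval_metrics(ctx, [1, 0, 1, 1], [1, 0, 0, 1])
```

Inside an actual MLRun handler, the context argument supplied by MLRun would take the place of PrintContext.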

Serving Model

Screenshot from 2024-04-05 14-34-45

The pretrained transformer is deployed to Nuclio, which completes the end-to-end pipeline.
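
Once the function is deployed, it can be queried over HTTP. The sketch below builds such a request without sending it. It assumes the model was served with MLRun's V2 model server, which accepts a JSON body of the form {"inputs": [...]} at /v2/models/<name>/infer; the base URL and model name are hypothetical placeholders.

```python
import json
from urllib import request


def build_infer_request(base_url, model_name, texts):
    """Build (but do not send) a POST request for a served model.

    Assumes the V2 model server route /v2/models/<name>/infer.
    """
    url = f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"
    body = json.dumps({"inputs": list(texts)}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical address and model name; use the values of your own deployment.
req = build_infer_request("http://localhost:8080", "sentiment-model", ["Great product"])

# Sending the request would look like this:
# response = request.urlopen(req).read()
```

If the serving function uses a custom handler instead, the route and payload shape will differ.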

Running the Pipeline

To execute the MLRun and LIT integration pipeline, run the following command in the MLRun (Python 3.9) environment:

python3 main.py

Then switch to the Python 3.10 environment and start LIT:

python3 lit_app.py
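
If switching environments by hand is inconvenient, the two commands can be driven from one script by calling each environment's interpreter directly. This is a sketch under assumptions: the build_commands and run_all helpers are not part of this repository, and the interpreter paths in the comment are hypothetical.

```python
import subprocess


def build_commands(mlrun_python, lit_python):
    """Return the two commands in the order they must run."""
    return [
        [mlrun_python, "main.py"],   # MLRun env: ingestion, training, serving
        [lit_python, "lit_app.py"],  # LIT env: starts the LIT interface
    ]


def run_all(mlrun_python, lit_python):
    """Run each stage in turn, stopping at the first failure."""
    for command in build_commands(mlrun_python, lit_python):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)


# Example call with hypothetical interpreter paths:
# run_all("envs/mlrun/bin/python", "envs/lit/bin/python")
```

Because lit_app.py is expected to keep running while the LIT interface is open, run_all will block on the second command until that process exits.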

LIT-NLP interface

Screenshot from 2024-04-05 14-45-33

The model is now visualised in LIT, and its metrics are displayed alongside it.

In summary, main.py initiates the data ingestion, model training, and serving process, and lit_app.py opens the model in LIT.
