We use MLRun, an open-source platform, to orchestrate a sentiment analysis pipeline on the Flipkart review dataset. The pipeline uses machine learning to analyze user reviews and ratings and predicts whether a review is positive or negative. The process involves data ingestion, model training, and deployment as a serverless function. LIT-NLP then fetches the model from the MLRun artifact registry and visualizes it.
MLRun is an open-source MLOps platform designed to streamline the development, deployment, and management of machine learning models. It offers a unified environment for data scientists and engineers to collaborate, automate, and scale their machine learning workflows.
The Language Interpretability Tool (LIT) is an open-source visualization platform from Google for NLP models; it offers a wide variety of metrics for testing a model. We integrate LIT into the pipeline flow to display plots for our pretrained model.
- First, set up two environments: one for MLRun (compatible only with Python 3.9) and one for LIT (we used Python 3.10 for a more elaborate display of features). Keeping them separate ensures the end-to-end pipeline flow isn't hindered.
- Install MLRun and LIT following their official documentation.
- Clone this repository to your local machine.
git clone https://github.com/amanknoldus/mlrun_template/Mlrun-Pipeline
- Navigate to the project directory.
- Install the required dependencies:
pip install -r requirements.txt
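Because the two tools run on different Python versions, it is easy to launch a script from the wrong environment. The following is a minimal sketch of a guard you could call at the top of main.py or lit_app.py. The `REQUIRED` mapping and the `check_python` helper are hypothetical names introduced here for illustration; the version numbers come from the setup notes above.

```python
import sys

# Versions taken from the setup notes above; adjust them if your setup differs.
REQUIRED = {"mlrun": (3, 9), "lit": (3, 10)}


def check_python(tool, version_info=None):
    """Return True if the interpreter's major.minor version matches what `tool` expects.

    `version_info` defaults to the running interpreter; passing a tuple
    makes the check easy to exercise without switching environments.
    """
    current = tuple((version_info or sys.version_info)[:2])
    return current == REQUIRED[tool]
```

For example, main.py could exit early with a clear message when `check_python("mlrun")` returns False, and lit_app.py could do the same with `check_python("lit")`.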
Before running the pipeline, ensure your dataset is in a format compatible with LIT; this may involve preprocessing or converting the data. In this project the MLRun pipeline handles this step: it encodes the dataset into numerical form and produces visualizations of the data, including the output of .describe() and a correlation matrix.
The transformer model is then trained on the dataset; the training logs and metrics are displayed in the MLRun UI.
The trained transformer is then deployed to Nuclio as a serverless function, which completes the end-to-end pipeline.
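To make the encoding step concrete, here is a minimal standard-library sketch of turning ratings into binary sentiment labels and summarizing a numeric column. It is not the repository's preprocessing code: the column names, the sample rows, and the rating threshold of 4 are all assumptions chosen for illustration, and the actual pipeline may encode the data differently.

```python
import csv
import io
import statistics

# Hypothetical sample rows; the real Flipkart dataset's columns may differ.
SAMPLE_CSV = """review,rating
Great phone,5
Stopped working in a week,1
Decent for the price,4
"""

POSITIVE_THRESHOLD = 4  # assumption: ratings of 4 or 5 count as positive


def encode_rows(csv_text):
    """Parse CSV text and attach a numeric label (1 = positive, 0 = negative)."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rating = int(row["rating"])
        rows.append({
            "review": row["review"].strip(),
            "rating": rating,
            "label": int(rating >= POSITIVE_THRESHOLD),
        })
    return rows


def summarize(values):
    """Return a small describe-style summary of a numeric column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


rows = encode_rows(SAMPLE_CSV)
rating_summary = summarize([r["rating"] for r in rows])
```

The resulting `rows` list holds one dictionary per review with a numeric `label`, which is the kind of shape a LIT dataset wrapper can be built from.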
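Once the function is deployed, it can be queried over HTTP. The sketch below assumes the model is served through MLRun's V2 model server, which accepts POST requests at /v2/models/<model-name>/infer with a JSON body containing an "inputs" list. The base URL, the model name, and the idea that the model accepts raw review text are assumptions; check the deployed function's details in the MLRun UI for the real values.

```python
import json
import urllib.request


def build_infer_request(base_url, model_name, reviews):
    """Build (but do not send) a V2-style inference request."""
    url = f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"
    body = json.dumps({"inputs": reviews}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, model_name, reviews, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_infer_request(base_url, model_name, reviews)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `predict` with the endpoint reported for your deployment and a list of review strings should return the server's JSON response; its exact fields depend on how the serving class formats its output.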
To execute the MLRun and LIT integration pipeline, run the following command:
python3 main.py
Then switch to the Python 3.10 environment and start LIT:
python3 lit_app.py
Running main.py initiates the data ingestion, model training, and serving process; lit_app.py then visualizes the model in LIT, along with its metrics.