Addition of cuDF and HoloViews for GPU acceleration #485

Open
wants to merge 17 commits into
base: master

Conversation

AdityaR-Bits

Overview

This implementation of LUX adds the option of using NVIDIA GPUs through RAPIDS cuDF, with HoloViews as the plotting engine. It achieves a 3-10x speedup over the original LUX and avoids browser memory issues on datasets with millions of rows or more (measured on an NVIDIA RTX A3000 Laptop GPU).

HoloViews

HoloViews does not require the creation of a JSON file, which for larger datasets is expensive in both memory and time. It can display orders of magnitude more data points in its plots without becoming slow, which also removes the need to fall back from scatter plots to heatmaps when the number of rows is too high. For simplicity in viewing, this implementation does not rely on the LUX widget to display the charts.
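To give a rough sense of the JSON cost mentioned above, here is a minimal sketch that compares a records-style JSON payload with the same points packed as raw 64-bit floats. The records layout is an assumption about how inline chart data is commonly serialized, not a measurement of what LUX emits, and the helper names are hypothetical.

```python
import json
import random
from array import array


def json_payload_bytes(xs, ys):
    """Size in bytes of the points serialized as a JSON list of {"x", "y"} records."""
    records = [{"x": x, "y": y} for x, y in zip(xs, ys)]
    return len(json.dumps(records).encode("utf-8"))


def binary_payload_bytes(xs, ys):
    """Size in bytes of the same points stored as two packed float64 columns."""
    return len(array("d", xs).tobytes()) + len(array("d", ys).tobytes())


random.seed(0)
n = 100_000  # illustrative size only; the PR targets millions of rows
xs = [random.random() for _ in range(n)]
ys = [random.random() for _ in range(n)]

json_size = json_payload_bytes(xs, ys)
binary_size = binary_payload_bytes(xs, ys)
print(f"JSON records: {json_size / 1e6:.1f} MB")
print(f"Packed float64 columns: {binary_size / 1e6:.1f} MB")
```

The packed columns take 16 bytes per point, while each JSON record repeats the field names and spells out every digit as text, so the gap grows linearly with the row count.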

To Run

To run the cuDF + HoloViews implementation, do the following:

# If backend.set_back is set to "holoviews", cuDF and HoloViews are used; otherwise there is no need to set backend.set_back.
from global_backend import backend
backend.set_back = "holoviews"
import lux
import pandas as pd
if backend.set_back == "holoviews":
    import cudf

To plot the HoloViews curves, run df.maintain_recs() in a separate cell instead of evaluating df.

Example Output

A brief example of the output is shown below.
[image]

Next Steps

This implementation is a proof of concept demonstrating the acceleration that RAPIDS can bring to LUX. It also shows the benefits of adding HoloViews as an additional option for plotting. @exactlyallan, @AjayThorve and I would like to discuss if and how an integration like this might proceed, @dorisjlee?

@exactlyallan

BTW this addresses #478 and shows really good performance potential from adding either a HoloViews option or a cuDF option.

@dorisjlee
Member

Thanks for the contribution @AdityaR-Bits! The cuDF and HoloViews backends are certainly a valuable addition to Lux.
I will be reviewing the PR shortly. In the meantime, could you look into addressing the failing tests? Note that the commit messages need to be formatted according to our commit message guidelines for the linter test to pass.

@exactlyallan

@dorisjlee, @AdityaR-Bits's project has shown that there is value in modularizing Lux into components that separate the data structure, the interestingness calculations, and the visualization frameworks. Are there any plans (or resources) to do so?

@dorisjlee
Member

dorisjlee commented Oct 30, 2022

Hi @exactlyallan, definitely agree that it would be a good refactoring change. We have experimented with separating the LuxDataFrame data structure from the visualization and recommendation modules in a separate branch, but the work has not yet been merged into the main branch.


3 participants