MNMG Logistic Regression (dask-glm wrapper) #3512

daxiongshu · 2021-02-18T05:11:34Z

In this PR, I'll wrap dask-glm models so that they accept dask_cudf DataFrames and behave like other cuml.dask models. dask-glm provides three estimators: LogisticRegression, LinearRegression, and PoissonRegression. MNMG LogisticRegression was requested by @beckernick. @JohnZed for visibility. Thank you all.
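As a rough illustration of the wrapping idea described above, the sketch below converts a frame-like input into plain rows and then delegates to an inner estimator. Both classes are hypothetical stand-ins written for this example; they are not dask-glm or cuML code, and the actual conversion used in this PR may look quite different.

```python
# Hypothetical stand-ins: neither class is part of dask-glm or cuML.
class ArrayOnlyEstimator:
    """Stand-in for an estimator that only understands plain row lists."""

    def fit(self, X, y):
        # "Training" here just memorizes the per-column mean of the inputs.
        n_rows = len(X)
        self.column_means_ = [sum(col) / n_rows for col in zip(*X)]
        return self


class FrameAcceptingWrapper:
    """Converts frame-like input to rows, then delegates to the wrapped estimator."""

    def __init__(self, inner):
        self.inner = inner

    @staticmethod
    def _to_rows(X):
        # Treat a dict of equal-length columns as a "dataframe"; pass anything else through.
        if isinstance(X, dict):
            return [list(row) for row in zip(*X.values())]
        return X

    def fit(self, X, y):
        self.inner.fit(self._to_rows(X), y)
        return self


frame = {"a": [1.0, 3.0], "b": [10.0, 30.0]}
model = FrameAcceptingWrapper(ArrayOnlyEstimator()).fit(frame, [0, 1])
print(model.inner.column_means_)
```

The point of the pattern is that callers keep a single `fit(X, y)` entry point regardless of the input container, while the wrapped estimator only ever sees the format it already supports.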


codecov-io · 2021-03-20T00:08:18Z

Codecov Report

Merging #3512 (2fa2d54) into branch-0.19 (f5d86b9) will increase coverage by 7.96%.
The diff coverage is 0.00%.

@@               Coverage Diff               @@
##           branch-0.19    #3512      +/-   ##
===============================================
+ Coverage        72.91%   80.87%   +7.96%     
===============================================
  Files              214      229      +15     
  Lines            16856    17777     +921     
===============================================
+ Hits             12290    14378    +2088     
+ Misses            4566     3399    -1167
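The figures in the diff above can be cross-checked with a few lines of arithmetic. This is a sketch under one assumption: that percentages are truncated (not rounded) to two decimals. That rule is inferred only because it reproduces the displayed values; it is not taken from Codecov's documentation.

```python
def truncated_percent(hits: int, lines: int) -> float:
    """Coverage in percent, truncated (not rounded) to two decimals."""
    # Integer arithmetic avoids floating-point surprises near a rounding boundary.
    return (hits * 10_000 // lines) / 100


base = truncated_percent(12290, 16856)   # branch-0.19 (f5d86b9)
head = truncated_percent(14378, 17777)   # this PR (2fa2d54)
delta_basis_points = 14378 * 10_000 // 17777 - 12290 * 10_000 // 16856

# Hits and misses should add up to the reported line counts.
assert 12290 + 4566 == 16856
assert 14378 + 3399 == 17777

print(base, head, delta_basis_points / 100)
```

Under that assumption the script yields 72.91, 80.87, and a difference of 7.96 percentage points, matching the report.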

| Flag | Coverage Δ |
|---|---|
| dask | `45.20% <0.00%> (?)` |
| non-dask | `73.16% <0.00%> (+0.25%)` ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| ...thon/cuml/dask/linear_model/logistic_regression.py | `0.00% <0.00%> (ø)` |
| python/cuml/common/numba_utils.py | `0.00% <0.00%> (ø)` |
| python/cuml/neighbors/__init__.py | `100.00% <0.00%> (ø)` |
| python/cuml/model_selection/__init__.py | `100.00% <0.00%> (ø)` |
| python/cuml/internals/global_settings.py | `100.00% <0.00%> (ø)` |
| python/cuml/neighbors/nearest_neighbors_mg.pyx | `98.51% <0.00%> (ø)` |
| python/cuml/decomposition/base_mg.pyx | `100.00% <0.00%> (ø)` |
| python/cuml/linear_model/base_mg.pyx | `100.00% <0.00%> (ø)` |
| python/cuml/cluster/dbscan_mg.pyx | `100.00% <0.00%> (ø)` |
| python/cuml/neighbors/kneighbors_classifier_mg.pyx | `100.00% <0.00%> (ø)` |

... and 78 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Powered by Codecov. Last update f5d86b9...2fa2d54.


daxiongshu · 2021-03-23T12:35:30Z

While I'm fixing the CI errors, I'd like to raise some issues with this PR. It lacks several features of the other cuml.dask.linear_model estimators because it doesn't have an internal model:

1. No get_combined_model() or _set_internal_model.
2. No multi-GPU training with single-GPU inference.
3. No single-GPU training with multi-GPU inference.
4. No pickling of the model.

Features 2) and 4) could be added quite easily, but they would look different from those in the other cuml.dask.linear_model estimators.
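To make the pickling point concrete, here is one way a wrapper without an internal model could support it: persist only the fitted coefficients and drop any live handle. The class and attribute names are hypothetical and chosen for illustration; this is not how cuML's other Dask estimators implement pickling.

```python
import pickle
import threading


class ThinLinearWrapper:
    """Hypothetical wrapper with no internal single-GPU model, only fitted coefficients."""

    def __init__(self, client=None):
        # Stand-in for a live cluster client; a lock is used because it cannot be pickled.
        self.client = client if client is not None else threading.Lock()
        self.coef_ = None
        self.intercept_ = None

    def __getstate__(self):
        # Persist only the fitted parameters and drop the live handle.
        state = self.__dict__.copy()
        state.pop("client")
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Re-attach a fresh handle after loading.
        self.client = threading.Lock()


model = ThinLinearWrapper()
model.coef_, model.intercept_ = [0.5, -1.25], 0.1
restored = pickle.loads(pickle.dumps(model))
print(restored.coef_, restored.intercept_)
```

Because only coefficients travel through the pickle, the restored object can serve predictions without the original cluster, which is roughly what multi-GPU training followed by single-GPU inference would need.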

Currently it subclasses cuml.dask.common.base.BaseEstimator quite trivially: the only thing it uses from BaseEstimator is the check that the client is valid.

This implementation also differs from the other cuml.dask.linear_model estimators in its hyperparameters:

1. No `normalized` hyperparameter; dask-glm always normalizes inputs internally.
2. No `delayed` hyperparameter.

So I have two questions:

1. Are these missing features required by customers? @beckernick
2. Is it necessary to subclass cuml.dask.common.base.BaseEstimator? Given the differences, @JohnZed suggests that we put it into another namespace, say cuml wrappers.

Thank you all. Please let me know.

beckernick · 2021-03-23T14:10:46Z

Does this implementation implicitly support a GPU-based LogisticRegression.decision_function via Dask dispatching?

> Due to the differences, @JohnZed suggests that we put it into another namespace say cuml wrappers.

This sounds like a great idea!
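For readers unfamiliar with the method raised in the question above: following the scikit-learn convention, decision_function for a binary logistic regression returns the signed linear score, and the logistic sigmoid turns that score into a probability. The pure-Python sketch below shows only that arithmetic, with made-up coefficients; it says nothing about whether this wrapper dispatches the computation to the GPU.

```python
import math


def decision_function(rows, coef, intercept):
    """Signed score w.x + b for each row; positive scores favor the positive class."""
    return [sum(w * x for w, x in zip(coef, row)) + intercept for row in rows]


def predict_proba_positive(scores):
    """Logistic sigmoid maps each score to P(y = 1)."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]


# Coefficients and inputs are illustrative values, not taken from the PR.
scores = decision_function([[2.0, 1.0], [0.0, 0.0]], coef=[0.5, -0.25], intercept=0.0)
probs = predict_proba_positive(scores)
print(scores, probs)
```

A score of zero sits exactly on the decision boundary and maps to a probability of 0.5.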


dantegd · 2022-04-26T16:37:39Z

@gpucibot merge

In this PR, I'll wrap `dask-glm` models so that it accepts `dask_cudf Dataframes` and behaves like other `cuml.dask` models. `dask-glm` provides three estimators: `LogisticRegression`, `LinearRegression` and `PoissonRegression`. MNMG `LogisticRegression` is requested by @beckernick. @JohnZed for visibility. Thank you all.

Authors:
- Jiwei Liu (https://github.com/daxiongshu)
- Nick Becker (https://github.com/beckernick)
- Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
- Ray Douglass (https://github.com/raydouglass)
- AJ Schmidt (https://github.com/ajschmidt8)
- Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#3512

daxiongshu added 6 commits July 26, 2020 12:49

Merge pull request #15 from rapidsai/branch-0.15

cf87af4

sync with upstream

Merge pull request #18 from rapidsai/branch-0.17

e3b7848

sync with upstream

Merge pull request #19 from rapidsai/branch-0.17

e6d8ec3

sync with upstream

Merge pull request #20 from rapidsai/branch-0.18

8b1b7c3

Sync with upstream

Merge pull request #22 from rapidsai/branch-0.19

7a51c5a

sync with upstream

Start logistic regression

565eeef

daxiongshu requested a review from a team as a code owner February 18, 2021 05:11

github-actions bot added the Cython / Python Cython or Python issue label Feb 18, 2021

JohnZed changed the title ~~[WIP] MNMG Logistic Regression~~ [WIP] MNMG Logistic Regression (dask-glm wrapper) Feb 25, 2021

daxiongshu added 5 commits February 27, 2021 07:12

Merge pull request #23 from rapidsai/branch-0.19

d7a2f60

sync with upstream

Merge pull request #24 from rapidsai/branch-0.19

0f1e44c

sync with upstream

Merge pull request #25 from rapidsai/branch-0.19

feb6ec0

sync with upstream

basic function works

fe28399

install latest dask glm

2fa2d54

daxiongshu requested a review from a team as a code owner March 19, 2021 20:18

github-actions bot added the conda conda issue label Mar 19, 2021

daxiongshu added 2 commits March 21, 2021 06:29

add test

2674d02

CHANGELOG

05b68a1

daxiongshu added non-breaking Non-breaking change feature request New feature or request labels Mar 21, 2021

daxiongshu added 3 commits March 21, 2021 06:51

Merge pull request #26 from rapidsai/branch-0.19

419db3c

sync with upstream

update version

57ff8e7

install latest dask-glm in ci build.sh

dc583fb

github-actions bot added the gpuCI gpuCI issue label Mar 21, 2021

ajschmidt8 requested changes Mar 22, 2021

CHANGELOG.md (outdated, resolved)

daxiongshu added 2 commits March 22, 2021 22:16

install dask-glm dev in ci & remove CHANGELOG entry

3add8e4

install dask-glm with --no-dep

c791234

dantegd mentioned this pull request Dec 6, 2021

[FEA] dask-glm PR tracker #4427

Closed

daxiongshu added 2 commits December 16, 2021 09:35

Merge pull request #32 from rapidsai/branch-22.02

6349067

sync with upstream

Merge branch 'rapidsai:branch-22.02' into branch-22.02

6c408e5

daxiongshu added 5 commits February 14, 2022 09:20

Merge branch 'rapidsai:branch-22.04' into branch-22.04

3f4b89d

Merge branch 'rapidsai:branch-22.04' into branch-22.04

e305ae3

Merge branch 'rapidsai:branch-22.04' into branch-22.04

d3ee54a

Merge branch 'rapidsai:branch-22.04' into branch-22.04

39311f9

Merge branch 'rapidsai:branch-22.04' into branch-22.04

591ad28

daxiongshu changed the base branch from branch-21.10 to branch-22.06 April 8, 2022 00:10

daxiongshu added 12 commits April 7, 2022 20:22

Merge branch 'rapidsai:branch-22.06' into branch-22.06

c6ae54d

try to resolve conflict

c2fafe5

update cuda11.5.yml

26b5440

replace 22.04 with 22.06 in yamls

3d18ec8

Merge pull request #39 from daxiongshu/update_yaml

c11f569

Update yaml

t push origin fea_dask_glm_wrapperMerge branch 'rapidsai-branch-22.06…

d3ac712

…' into fea_dask_glm_wrapper

fix copy year

f1e341b

pip install sparse

cbf66ea

test three solvers

dfb397e

Merge pull request #40 from daxiongshu/add_test

81de611

test three solvers

Merge branch 'branch-22.06' into fea_dask_glm_wrapper

549082d

Merge branch 'rapidsai:branch-22.06' into fea_dask_glm_wrapper

d3e2925

dantegd approved these changes Apr 25, 2022

rapids-bot bot merged commit 6ca7254 into rapidsai:branch-22.06 Apr 26, 2022

beckernick mentioned this pull request Jun 29, 2022

Experimental sparse/MNMG logistic regression #1480

Closed
