Out of memory issue when call uplift_auc_score #202

Closed
Benliu20221208 opened this issue Dec 8, 2022 · 3 comments

@Benliu20221208

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. Call uplift_auc_score with arrays of length 4215. It is unclear why computing perfect_uplift_curve needs 558 GiB of memory.

Expected behavior

Environment

  • scikit-uplift version (e.g., 0.1.2):
  • scikit-learn version (e.g., 0.22.2):
  • Python version (e.g., 3.7):
  • OS (e.g., Linux):
  • Any other relevant information:

Additional context

2022-12-08T11:00:19.929+11:00 uplift is:
2022-12-08T11:00:20.930+11:00 [[2 1 2 ... 1 1 1] [2 1 2 ... 1 1 1] [0 3 0 ... 3 3 3] ... [2 1 2 ... 1 1 1] [2 1 2 ... 1 1 1] [2 1 2 ... 1 1 1]]
2022-12-08T11:00:20.930+11:00 desc_score_indices...
2022-12-08T11:00:20.930+11:00 [[ 1 3 4 ... 4200 4201 4202] [ 1 3 4 ... 4200 4201 4202] [ 1 3 4 ... 4200 4201 4202] ... [ 0 2 7 ... 4212 4213 4214] [ 1 3 4 ... 4200 4201 4202] [ 1 3 4 ... 4200 4201 4202]]
2022-12-08T11:00:20.930+11:00 Exception during training: Unable to allocate 558. GiB for an array with shape (4215, 4215, 4215) and data type int64

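For reference, the 558 GiB figure in the exception follows directly from the reported shape and dtype. A quick sanity check (a sketch that only redoes the arithmetic from the error message; it says nothing about where in sklift the array is created):

```python
import numpy as np

n = 4215                                  # array length from the report
itemsize = np.dtype(np.int64).itemsize    # 8 bytes per int64 element

total_bytes = n ** 3 * itemsize           # elements in a (n, n, n) array times 8
gib = total_bytes / 2 ** 30               # bytes -> GiB

print(f"{total_bytes} bytes ~ {gib:.1f} GiB")
```

So the allocation size is consistent with a cubic (n, n, n) array, where a 1-D input of length n would normally need only a few kilobytes.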
@maks-sh
Owner

maks-sh commented Dec 8, 2022

Hello! 🖐️

Thanks for the feedback. Could you tell me which version of scikit-uplift you are using?

You can find out the version using the following command:

import sklift

print(sklift.__version__)

@Benliu20221208
Author

Hello, thanks for the response. Here are the library versions, running on Python 3.7.5:
numpy version: 1.19.5
pandas version: 1.1.5
scikit-uplift version: 0.5.1

I did some debugging in the source code, and it looks like perfect_uplift comes back as a matrix instead of a 1-D array. Is it possible that some data can make the code below return an n*n matrix?

    y_true, treatment = np.array(y_true), np.array(treatment)

    cr_num = np.sum((y_true == 1) & (treatment == 0))  # Control Responders
    tn_num = np.sum((y_true == 0) & (treatment == 1))  # Treated Non-Responders

    # express an ideal uplift curve through y_true and treatment
    summand = y_true if cr_num > tn_num else treatment

    perfect_uplift = 2 * (y_true == treatment) + summand
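
One way the snippet above can produce an n*n matrix is NumPy broadcasting: if one input is a column vector of shape (n, 1) and the other is 1-D with shape (n,), the comparison `y_true == treatment` yields shape (n, n). The thread does not say what the actual cause was, so the sketch below is only an assumption. The input shapes are hypothetical, and the data is random stand-in data, not the reporter's.

```python
import numpy as np

n = 5  # small stand-in for the 4215 rows in the report
rng = np.random.default_rng(0)

# Hypothetical inputs: y_true as a column vector (for example taken from a
# one-column 2-D array), treatment as a plain 1-D array.
y_true = rng.integers(0, 2, size=(n, 1))
treatment = rng.integers(0, 2, size=n)

# Same expressions as in the quoted snippet.
cr_num = np.sum((y_true == 1) & (treatment == 0))
tn_num = np.sum((y_true == 0) & (treatment == 1))
summand = y_true if cr_num > tn_num else treatment
perfect_uplift = 2 * (y_true == treatment) + summand
print(perfect_uplift.shape)  # (5, 5): (n, 1) broadcast against (n,)

# Flattening both inputs first keeps everything 1-D.
y_flat, t_flat = np.ravel(y_true), np.ravel(treatment)
summand_flat = y_flat if cr_num > tn_num else t_flat
perfect_uplift_flat = 2 * (y_flat == t_flat) + summand_flat
print(perfect_uplift_flat.shape)  # (5,)
```

If this is what happened, checking `np.shape(y_true)` and `np.shape(treatment)` before calling uplift_auc_score, and flattening with `np.ravel` where needed, would be a cheap thing to try.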

@Benliu20221208
Author

The issue was solved.
