
some formatting #87

Closed
arita37 opened this issue Oct 27, 2021 · 7 comments

Comments

@arita37

arita37 commented Oct 27, 2021

Hi,
Thanks for this.

I would just like to confirm the format of the input.

X: CSR format, x(i,k) = valx. Can valx be a float? Does it need to be binary or a value in [0,1]?

Y: CSR format, y(i,k) = valy. Does it need to be binary (0 or 1)?

Thx

@OctoberChang
Contributor

  • X is the feature matrix; it can be in either dense NPY format or sparse CSR format. Yes, X(i,j)=val can take a floating-point value.
  • Y is the instance-to-label matrix; it should be in sparse CSR format. Yes, its values should be binary, either 0 or 1.
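
As a rough illustration of the two formats described above, here is a minimal sketch using NumPy and SciPy only. The toy values, the `float32` dtype, and the helper `is_binary` are assumptions made for this example and are not part of the PECOS API.

```python
import numpy as np
import scipy.sparse as sp

# Toy data: 3 instances, 4 features, 5 labels (all values are made up).
# X may hold arbitrary floating-point values.
X = sp.csr_matrix(
    np.array(
        [[0.5, 0.0, 1.2, 0.0],
         [0.0, 3.0, 0.0, 0.1],
         [0.7, 0.0, 0.0, 0.0]],
        dtype=np.float32,  # dtype chosen for this sketch (assumption)
    )
)

# Y[i, k] = 1 means instance i is tagged with label k; all other entries are 0.
rows = np.array([0, 0, 1, 2])
cols = np.array([1, 4, 0, 3])
vals = np.ones(len(rows), dtype=np.float32)
Y = sp.csr_matrix((vals, (rows, cols)), shape=(3, 5))


def is_binary(mat):
    """Hypothetical helper: True if every stored entry of a sparse matrix is 0 or 1."""
    return bool(np.isin(mat.data, (0, 1)).all())


# X and Y must describe the same instances, one per row.
assert X.shape[0] == Y.shape[0]
assert is_binary(Y)
```

A check like `is_binary(Y)` before training is a cheap way to catch a label matrix that accidentally carries counts or scores instead of 0/1 indicators.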

@arita37
Author

arita37 commented Oct 29, 2021 via email

@OctoberChang
Contributor

For sparse TF-IDF features, we typically apply row-wise l2-normalization to the feature matrix X.
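
The row-wise l2-normalization mentioned above could be sketched as follows with SciPy alone. The function name `l2_normalize_rows`, the `float32` dtype, and the toy matrix are assumptions for illustration; this is not the maintainers' own preprocessing code.

```python
import numpy as np
import scipy.sparse as sp


def l2_normalize_rows(X):
    """Scale each row of a sparse matrix to unit Euclidean norm.

    All-zero rows are left as zeros instead of dividing by zero.
    """
    X = sp.csr_matrix(X, dtype=np.float32)
    # Squared l2 norm of every row, flattened to a 1-D array.
    sq_norms = np.asarray(X.multiply(X).sum(axis=1)).ravel()
    norms = np.sqrt(sq_norms)
    norms[norms == 0.0] = 1.0  # avoid division by zero for empty rows
    # Left-multiplying by a diagonal matrix rescales each row.
    return (sp.diags(1.0 / norms) @ X).tocsr()


# Toy TF-IDF-like matrix (made-up values); the second row is empty.
X_tfidf = sp.csr_matrix(
    np.array([[3.0, 4.0, 0.0],
              [0.0, 0.0, 0.0],
              [2.0, 0.0, 0.0]])
)
X_normalized = l2_normalize_rows(X_tfidf)
```

If scikit-learn is available, `sklearn.preprocessing.normalize(X, norm="l2", axis=1)` performs the same operation on a CSR matrix.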

@weiliw-amz
Contributor

Closing this issue because there has been no further response for 7 days.
If you feel it is necessary, please re-open this issue for further discussion.

@arita37
Author

arita37 commented Nov 10, 2021 via email

@weiliw-amz weiliw-amz reopened this Nov 12, 2021
@rofuyu
Contributor

rofuyu commented Dec 7, 2021

The scalability of PECOS currently depends on memory capacity and the number of CPUs. If you have a machine that can hold the necessary datasets and models, it should be fine. According to this paper, https://arxiv.org/pdf/2106.12657.pdf, PECOS can be applied to datasets of even larger dimension.

@OctoberChang
Copy link
Contributor

Closing this issue as there is no additional follow-up.
