some formatting #87
Comments
|
Thanks.
Is it preferable to normalize X(i,j) to the [0,1] range (i.e., with the normalization done manually beforehand)?
Thanks
… On Oct 30, 2021, at 2:46, Wei-Cheng Chang ***@***.***> wrote:
X is the feature matrix; it can be in either dense NPY format or sparse CSR format. Yes, X(i,j)=val can take floating-point values.
Y is the instance-to-label matrix and should be in sparse CSR format. Yes, its values should be binary, either 0 or 1.
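As an illustration of the two formats described above, here is a minimal sketch that builds a float-valued feature matrix (dense and CSR) and a binary instance-to-label matrix in CSR format with SciPy. The toy sizes, the variable names, and the choice of `float32` are assumptions made for the example, not requirements stated in this thread.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical toy data: 4 instances, 3 features, 5 labels.
# Feature values are arbitrary floats (float32 is an assumption here).
X_dense = np.array(
    [[0.5, 0.0, 1.2],
     [0.0, 3.4, 0.0],
     [2.0, 0.0, 0.1],
     [0.0, 0.7, 0.0]],
    dtype=np.float32,
)
X_sparse = sp.csr_matrix(X_dense)  # the same features in sparse CSR form

# (instance, label) pairs marking the positive labels of each instance.
rows = np.array([0, 0, 1, 2, 3])
cols = np.array([1, 4, 0, 2, 4])
vals = np.ones(len(rows), dtype=np.float32)  # binary: stored entries are all 1
Y = sp.csr_matrix((vals, (rows, cols)), shape=(4, 5))

# Sanity check: every stored entry of Y is exactly 1 (zeros are implicit).
is_binary = bool(np.all(Y.data == 1.0))
```

The dense matrix could then be written with `np.save` and the CSR matrices with `scipy.sparse.save_npz`, if files on disk are needed.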
|
For sparse TFIDF features, we typically do row-wise l2-normalization for the feature matrix |
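A minimal sketch of row-wise L2 normalization for a sparse feature matrix, using only NumPy and SciPy. The helper name `l2_normalize_rows` and the toy matrix are hypothetical; all-zero rows are left unchanged, which is a design choice of this sketch rather than something specified in the thread.

```python
import numpy as np
import scipy.sparse as sp

def l2_normalize_rows(X):
    """Scale each row of X to unit L2 norm; all-zero rows stay zero."""
    X = sp.csr_matrix(X, dtype=np.float64)
    # Sum of squared entries per row, flattened to a 1-D array.
    sq_sums = np.asarray(X.multiply(X).sum(axis=1)).ravel()
    norms = np.sqrt(sq_sums)
    norms[norms == 0.0] = 1.0  # avoid division by zero for empty rows
    # Left-multiplying by a diagonal matrix rescales each row.
    return (sp.diags(1.0 / norms) @ X).tocsr()

# Hypothetical toy feature matrix with one empty row.
X_tfidf = sp.csr_matrix(np.array([[3.0, 4.0, 0.0],
                                  [0.0, 0.0, 0.0],
                                  [0.0, 5.0, 12.0]]))
X_norm = l2_normalize_rows(X_tfidf)
```

`sklearn.preprocessing.normalize(X, norm="l2", axis=1)` performs the same operation if scikit-learn is available.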
Closing this issue due to no further response for 7 days. |
Thanks for the answer.
Just to confirm, can PECOS handle this case:
X: dense (not sparse), of dimension 5000
Y: sparse, of dimension 20 million
Thanks
… On Nov 10, 2021, at 10:01, Wei Li ***@***.***> wrote:
Closed #87.
|
The scalability of PECOS currently depends on the memory capacity and the number of CPUs. If you have a machine that can hold the necessary datasets and models, it should be fine. In this paper, https://arxiv.org/pdf/2106.12657.pdf, PECOS is applied to datasets of even larger dimension. |
Closing this issue as there has been no additional follow-up. |
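To get a rough feel for whether the input matrices alone fit in memory, a back-of-the-envelope estimate can help. The sketch below is only an approximation under stated assumptions: 4-byte values, 4-byte column indices, and 8-byte row pointers for CSR, plus a hypothetical instance count and average number of labels per instance (neither is given in this thread). It does not account for the memory used by the model or by training.

```python
def dense_bytes(n_rows, n_cols, value_bytes=4):
    """Bytes needed to store a dense n_rows x n_cols matrix."""
    return n_rows * n_cols * value_bytes

def csr_bytes(n_rows, nnz, value_bytes=4, index_bytes=4, indptr_bytes=8):
    """Bytes needed for CSR storage: values, column indices, row pointers."""
    return nnz * (value_bytes + index_bytes) + (n_rows + 1) * indptr_bytes

# Hypothetical workload: the thread gives 5000 features and 20 million
# labels, but not the number of instances or the labels per instance.
n_instances = 1_000_000   # assumption
avg_labels = 5            # assumption
n_features = 5000         # from the question above

x_gib = dense_bytes(n_instances, n_features) / 2**30
y_gib = csr_bytes(n_instances, n_instances * avg_labels) / 2**30
print(f"dense X: {x_gib:.2f} GiB, sparse Y: {y_gib:.3f} GiB")
```

Note that the CSR size of Y depends on the number of nonzeros and rows, not on the number of label columns, so under these assumptions the dense X dominates the input footprint.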
Hi,
Thanks for this.
I would just like to confirm the format of the input:
X: CSR format, x(i,k) = valx. Can valx be a float? Does it need to be binary or a value in [0,1]?
Y: CSR format, y(i,k) = valy. Does it need to be binary (0 or 1)?
Thx