Performance for converting transactions to BitMatrix #1

Closed
tonyabracadabra opened this issue May 1, 2022 · 5 comments

tonyabracadabra commented May 1, 2022

I am not familiar with the BitMatrix used in this library. If I want to pass a list of transactions from Python with PyO3, would there be too much overhead in converting transactions in the form of Vec<Vec<i64>> to a BitMatrix, since I have to set those values one by one?

@tonyabracadabra tonyabracadabra changed the title Is it possible to make it a Python library with PyO3? Example for using BitMatrix May 1, 2022
@tonyabracadabra tonyabracadabra changed the title Example for using BitMatrix Performance for converting transactions to BitMatrix May 1, 2022
@gahag
Owner

gahag commented May 1, 2022

Considering the overhead of itemset mining itself, I would guess that the conversion overhead would be negligible. But I wonder why you are using i64. Is it a requirement of the Python FFI, or are you using values other than 0 and 1?

@tonyabracadabra
Author

> Considering the overhead of itemset mining itself, I would guess that the conversion overhead would be negligible. But I wonder why you are using i64. Is it a requirement for the Python FFI, or are you leveraging other values than 0 and 1?

Ah, the input I intend to pass from Python looks like
[[0, 10, 23], [22, 23, 55, 289], [2, 55, 999, 777, 4], [0, 22]], where each item in the list is a transaction and each number is a feature id.

@gahag
Owner

gahag commented May 2, 2022

Okay, so for each feature id in your transaction, you will have to set bitmatrix[transaction_id][feature_id] to true. Please let me know if this works for you.
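
For readers following along, the conversion described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `DenseBitMatrix` and `transactions_to_matrix` are hypothetical names made up for this sketch and are not the library's `BitMatrix` API, and the column count is assumed to be the largest feature id plus one. The work is proportional to the total number of items across all transactions.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a dense bit matrix (NOT the library's BitMatrix).
/// Each row is packed into 64-bit words.
pub struct DenseBitMatrix {
    rows: usize,
    cols: usize,
    words_per_row: usize,
    words: Vec<u64>,
}

impl DenseBitMatrix {
    /// Creates a rows x cols matrix with every bit cleared.
    pub fn new(rows: usize, cols: usize) -> Self {
        let words_per_row = (cols + 63) / 64;
        DenseBitMatrix {
            rows,
            cols,
            words_per_row,
            words: vec![0; rows * words_per_row],
        }
    }

    /// Sets or clears the bit at (row, col).
    pub fn set(&mut self, row: usize, col: usize, value: bool) {
        assert!(row < self.rows && col < self.cols, "index out of bounds");
        let idx = row * self.words_per_row + col / 64;
        let mask = 1u64 << (col % 64);
        if value {
            self.words[idx] |= mask;
        } else {
            self.words[idx] &= !mask;
        }
    }

    /// Reads the bit at (row, col).
    pub fn get(&self, row: usize, col: usize) -> bool {
        assert!(row < self.rows && col < self.cols, "index out of bounds");
        let idx = row * self.words_per_row + col / 64;
        (self.words[idx] & (1u64 << (col % 64))) != 0
    }
}

/// Converts transactions (lists of feature ids) into a bit matrix:
/// one row per transaction, one column per feature id.
/// Assumption: feature ids are non-negative and fit in usize.
pub fn transactions_to_matrix(transactions: &[Vec<i64>]) -> Result<DenseBitMatrix, String> {
    // First pass: validate ids and find the number of columns needed.
    let mut cols = 0usize;
    for transaction in transactions {
        for &id in transaction {
            let id = usize::try_from(id)
                .map_err(|_| format!("invalid feature id {}", id))?;
            cols = cols.max(id + 1);
        }
    }

    // Second pass: set one bit per (transaction, feature id) pair.
    let mut matrix = DenseBitMatrix::new(transactions.len(), cols);
    for (row, transaction) in transactions.iter().enumerate() {
        for &id in transaction {
            // Safe to cast: every id was validated in the first pass.
            matrix.set(row, id as usize, true);
        }
    }
    Ok(matrix)
}
```

With the example input from this thread, row 0 would have the bits for columns 0, 10 and 23 set. When the data arrives through PyO3, the same loop would presumably run once on the extracted Vec<Vec<i64>> before mining starts.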

@tonyabracadabra
Author

> Okay, so for each feature id in your transaction, you will have to set bitmatrix[transaction_id][feature_id] to true. Please let me know if this works for you.

Thanks! May I ask why you implemented a separate test dataset instead of using BitMatrix? Could you add a version that uses BitMatrix for testing, since I assume that is what will be recommended in production?

@gahag
Owner

gahag commented Jun 19, 2022

The test dataset is for testing purposes, not documentation for client users. Even so, there is already a comment there mentioning that one should use a bitmatrix in a production scenario.

@gahag gahag closed this as completed Jun 19, 2022