Predict/inference in real time #14

Open
Subbui opened this issue Oct 28, 2020 · 4 comments
Labels
Optimization: The code is working fine, but some parts can be optimized regarding speed, memory, etc.

Comments

@Subbui

Subbui commented Oct 28, 2020

Is pickling the best way to persist the output of the fit method for later predictions, or is there a better approach? The file that the fit method generates, which contains the probabilities, is large, so prediction/inference with pickle is slow. Thanks

@Subbui Subbui changed the title Avoid printing probabilities to console Predict/inference in real time Oct 28, 2020
@erdogant
Owner

Dear Subbui,
Can you give an example?

P.S. I fixed the verbosity messages.

@Subbui
Author

Subbui commented Oct 29, 2020

Hi Erdogant, thanks for the prompt response. I've built a network on around 100K records with 15 variables. The file generated by the fit method (which contains all the probabilities) is ~2 GB, and I want to use it for inference in real time. Right now I'm doing this with the pickle library: every time I need to make an inference, I load the file and run the inference method to get the results. But given the size and layout of the file, it takes a few minutes to get the inference results. I'd like to understand whether there is a better/faster way to do it.
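One thing that may help with the workflow described above is to unpickle the file once per process and reuse the object, rather than reloading it for every request. The following is only a minimal sketch of that idea: `load_model`, `predict`, and the `run_inference` callable are hypothetical names for illustration and are not part of any library's API.

```python
import pickle
from functools import lru_cache


@lru_cache(maxsize=1)
def load_model(path):
    """Unpickle the fitted model once; later calls return the cached object."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def predict(path, run_inference, evidence):
    """Run one inference against the cached model.

    `run_inference` is a hypothetical callable (model, evidence) -> result
    standing in for whatever inference method is actually used.
    """
    model = load_model(path)  # disk is hit on the first call only
    return run_inference(model, evidence)
```

With this pattern the multi-minute load is paid once at startup of a long-running process (for example a small service), and each later call only pays for the inference itself.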

I'm also a little concerned about the computational power that is needed. I'm working on reducing the cardinality of the variables and am also trying to eliminate a few variables (which is quite hard, as all the variables are handpicked and important). Please let me know your thoughts, or suggest methods that I could try to deal with this.
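One common way to reduce cardinality without dropping a variable is to merge rarely seen categories into a single catch-all level. This is a rough sketch of that idea in plain Python; the function name, the `min_count` threshold, and the `"OTHER"` label are illustrative assumptions, and whether such merging is acceptable depends on the data.

```python
from collections import Counter


def group_rare_categories(values, min_count, other_label="OTHER"):
    """Replace categories seen fewer than `min_count` times with `other_label`."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


# Toy column: "b" and "c" each occur once, so both are merged.
column = ["a", "a", "a", "b", "c"]
print(group_rare_categories(column, min_count=2))
```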

Thanks,
Subbu

@erdogant
Owner

That's indeed a huge file to store and load! But I'm surprised by the size if it contains only the CPDs and the fitted model. Are you perhaps also storing the data in the pickle?
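A quick way to check what is taking up the space is to measure the pickled size of each top-level entry. The sketch below assumes the pickled object is a dict-like container; that is an assumption for illustration, not something confirmed in this thread, and `pickled_size_by_key` is a hypothetical helper name.

```python
import pickle


def pickled_size_by_key(obj):
    """Return {key: size in bytes} for each top-level entry of a dict-like object."""
    return {
        key: len(pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL))
        for key, value in obj.items()
    }


# Toy stand-in for a loaded result object.
example = {"small": [0], "big": list(range(10_000))}
for key, size in sorted(pickled_size_by_key(example).items(), key=lambda kv: -kv[1]):
    print(key, size)
```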

@Subbui
Author

Subbui commented Oct 29, 2020

No, @erdogant. The pickle file contains just the results of the fit method. I've converted all the numerical variables into bins, and a few of the categorical variables have very high cardinality (more than 100, say). I think that's the reason for the huge output file.
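The link between cardinality and file size can be made concrete: a dense conditional probability table has one entry per combination of the node's states and its parents' states, so its size is the product of those cardinalities. The sketch below assumes dense tables of 8-byte floats, and the cardinalities are made-up illustrations, not the values from this network.

```python
from math import prod


def cpt_entries(node_cardinality, parent_cardinalities):
    """Number of entries in a dense CPT: node states times all parent states."""
    return node_cardinality * prod(parent_cardinalities)


def cpt_megabytes(node_cardinality, parent_cardinalities, bytes_per_value=8):
    """Approximate dense CPT size in MB, assuming 8-byte floats by default."""
    return cpt_entries(node_cardinality, parent_cardinalities) * bytes_per_value / 1e6


print(cpt_entries(100, [100, 10]))     # 100000
print(cpt_megabytes(100, [100, 100]))  # 8.0
```

Because the size is multiplicative, each additional high-cardinality parent multiplies the table size, which is why a handful of variables with more than 100 levels can dominate the output.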

@erdogant erdogant added the Optimization The code is working fine but some parts can be optimized regarding speed, memory etc label Jan 8, 2023
2 participants