Big multi label classifier on db #59

Open
francescoagati opened this issue Mar 1, 2018 · 5 comments

Comments

@francescoagati

Hi,
Can multi-label classifiers trained on big data be written to a SQL DB and streamed during the classification phase? Or must the classifier always work in memory? For a big classifier, is it best to split the data across many classifiers, serialize them, and call them sequentially or in parallel?

@erelsgl
Owner

erelsgl commented Mar 2, 2018

This is an interesting question.
I believe it can be done but requires some additions and adaptations.

@francescoagati
Author

And what about using many serialized classifiers, each trained on a split of the data?

@Berkmann18
Contributor

Since the classifier can be serialized using serialization, I believe it's possible (assuming that the SQL DB in question supports JSON data).

Would you mind expanding on the big classifier bit?
If it's one classifier, then you're probably better off using one serializer (unless splitting it into several JSON documents turns out to perform better).
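
To make the single-serializer idea concrete, here is a minimal sketch in Python using only the standard `sqlite3` and `json` modules. It is not this library's API: the `classifiers` table, the `save_classifier` and `load_classifier` helpers, and the toy `model` dictionary are all hypothetical, and it assumes the classifier's state can be expressed as JSON. Note that a plain TEXT column is enough here, so the database does not strictly need a native JSON type.

```python
import json
import sqlite3


def save_classifier(conn, name, model):
    """Store a JSON-serializable model as text under the given name."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS classifiers ("
        "name TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO classifiers (name, body) VALUES (?, ?)",
        (name, json.dumps(model)),
    )
    conn.commit()


def load_classifier(conn, name):
    """Read the stored text back and deserialize it into a dict."""
    row = conn.execute(
        "SELECT body FROM classifiers WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return json.loads(row[0])


# Hypothetical model state: per-label word weights (illustration only).
conn = sqlite3.connect(":memory:")
model = {
    "labels": {
        "sports": {"goal": 1.5, "match": 0.7},
        "politics": {"vote": 2.0},
    }
}
save_classifier(conn, "news", model)
restored = load_classifier(conn, "news")
```

With this layout the whole model is still deserialized in one piece on load, so it addresses persistence but not memory use.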

@francescoagati
Author

Yes, but the problem is RAM: you always have to load all of the serialized JSON into RAM.
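
One possible way around this, sketched in Python under assumptions of my own rather than as something the library supports: if the multi-label classifier is decomposed into one independent binary model per label, each model can live in its own row and be deserialized one at a time while classifying. The `label_models` table, the linear `weights`/`bias` format, and both helper functions are hypothetical.

```python
import json
import sqlite3


def store_label_models(conn, label_models):
    """Write one row per label, so each label's model can be read separately."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS label_models ("
        "label TEXT PRIMARY KEY, weights TEXT NOT NULL, bias REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO label_models (label, weights, bias) "
        "VALUES (?, ?, ?)",
        [
            (label, json.dumps(m["weights"]), m["bias"])
            for label, m in label_models.items()
        ],
    )
    conn.commit()


def classify_streaming(conn, tokens):
    """Return labels with a positive linear score, reading one model per step."""
    accepted = []
    cursor = conn.execute(
        "SELECT label, weights, bias FROM label_models ORDER BY label"
    )
    for label, weights_json, bias in cursor:
        # Only this label's weights are deserialized at this point.
        weights = json.loads(weights_json)
        score = bias + sum(weights.get(token, 0.0) for token in tokens)
        if score > 0:
            accepted.append(label)
    return accepted


# Hypothetical per-label linear models (illustration only).
conn = sqlite3.connect(":memory:")
store_label_models(conn, {
    "greeting": {"weights": {"hello": 2.0, "hi": 2.0}, "bias": -1.0},
    "farewell": {"weights": {"bye": 2.0, "goodbye": 2.0}, "bias": -1.0},
    "question": {"weights": {"what": 1.5, "how": 1.5, "?": 1.0}, "bias": -1.0},
})
labels = classify_streaming(conn, ["hello", "how", "are", "you", "?"])
```

The trade-off is that every classification reads every row, so memory is exchanged for database I/O; whether that is acceptable depends on the number of labels and the size of each model.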

@Berkmann18
Contributor

Good point.
