Line 530, in the construct_bucket_vb_wc function in utils.py, is too slow on huge datasets. It even freezes when the dataset is larger than 300k objects.
I propose changing the line
forw_corpus = [pad_char_feature] + list(reduce(lambda x, y: x + [pad_char_feature] + y, forw_features)) + [pad_char_feature]
to
forw_corpus = [pad_char_feature]
for forw_feature in forw_features:
    forw_corpus.extend(forw_feature + [pad_char_feature])
This runs considerably faster and does not freeze.
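To illustrate why the change helps, here is a minimal standalone sketch comparing the two approaches. The function names join_with_reduce and join_with_extend, the sample data, and the PAD value are hypothetical and not part of utils.py. The likely cause of the slowdown is that each `x + [pad] + y` inside reduce builds a brand-new list and recopies everything accumulated so far, so the total work grows roughly quadratically with the corpus size, while extend grows a single list in place.

```python
from functools import reduce


def join_with_reduce(features, pad):
    # Original approach: every `x + [pad] + y` allocates a new list and
    # copies the whole accumulated prefix again (roughly quadratic overall).
    return [pad] + list(reduce(lambda x, y: x + [pad] + y, features)) + [pad]


def join_with_extend(features, pad):
    # Proposed approach: one list is grown in place, so each element is
    # copied a constant number of times (roughly linear overall).
    corpus = [pad]
    for feature in features:
        corpus.extend(feature + [pad])
    return corpus


# Hypothetical sample data standing in for forw_features / pad_char_feature.
sample = [[1, 2], [3], [4, 5, 6]]
PAD = 0

# Both approaches produce the same padded corpus on non-empty input.
assert join_with_reduce(sample, PAD) == join_with_extend(sample, PAD)
print(join_with_extend(sample, PAD))  # prints [0, 1, 2, 0, 3, 0, 4, 5, 6, 0]
```

One behavioural difference to be aware of: with an empty list of features, reduce without an initial value raises a TypeError, whereas the loop version simply returns a list containing the single pad element.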
Update utils.py 4f35e0a #71
Thanks, fixed in 4f35e0a. PS: a more up-to-date library is available at https://github.com/LiyuanLucasLiu/Vanilla_NER