Optimize load csv #678
…ication problem with n classes.
…itm for classification models. To be precise, this is a Variance Reduces classification reinforcement learning rule.
…xtract a retina-like representation of the input image.
…ilable afterwards.
Properly use Enum type.
Typo fix in knn_main.cpp
Instead of including: methods/neighbor_search/ns_traversal_info.hpp Include the definition in: core/tree/traversal_info.hpp
Properly use Enum type, in rann and range_search.
Remove duplicated code for traversal info.
Deprecated arma function replaced by new arma constant
…ples. So we'll have to wait until mlpack 2.0.1 to remove it... :(
add cli executable for data_split
Add B_aux according to issue mlpack#642
Marcus thinks this will fix the Windows build... let's find out.
Fix mapping issue
More tests for DatasetInfo
2 : fix bug, cannot parse transpose file with correct result
fddfc18 to 1f562a1
Why are there suddenly so many commits on this branch?
@stereomatchingkiss hmm... did you amend it with the merge commit?
Someone force-pushed to the repo and I am in the process of fixing it. That will fix this problem.
No, I only merged pull request #650 once.
Thanks for the fix. Edit: I have already integrated it into load_impl.hpp, and it passes all of the test cases.
@rcurtin Would it make things easier to deal with if I deleted this branch and opened a new one?
Hi, I used boost::spirit to implement the CSV parser; it is faster and more memory efficient.
Parsing a file with 1 million lines, 39796 KB:
spirit version:
transpose: 2151 msec
non-transpose: 4073 msec
old version:
transpose: 9616 msec
non-transpose: 10131 msec
The non-transpose version is slower; I guess this is because arma::Mat is stored column-major.
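To illustrate the column-major guess: in a column-major matrix, the elements of one column are adjacent in memory, so writing each parsed line into a column produces contiguous writes, while writing it into a row produces writes that are a full column apart. The sketch below uses a hand-rolled buffer; `ColMajor`, `FillLinesAsColumns`, and `FillLinesAsRows` are hypothetical names for illustration, not mlpack or Armadillo code, and it assumes every line has the same number of values and that the transposed load stores each line as a column.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column-major matrix such as arma::Mat<double>:
// element (r, c) lives at index r + c * rows.
struct ColMajor
{
  std::size_t rows, cols;
  std::vector<double> mem;

  ColMajor(std::size_t r, std::size_t c) : rows(r), cols(c), mem(r * c, 0.0) {}

  double& at(std::size_t r, std::size_t c) { return mem[r + c * rows]; }
};

// Each input line becomes one column: consecutive writes are adjacent in memory.
ColMajor FillLinesAsColumns(const std::vector<std::vector<double>>& lines)
{
  ColMajor m(lines.empty() ? 0 : lines[0].size(), lines.size());
  for (std::size_t line = 0; line < lines.size(); ++line)
    for (std::size_t i = 0; i < lines[line].size(); ++i)
      m.at(i, line) = lines[line][i];
  return m;
}

// Each input line becomes one row: consecutive writes are `rows` elements apart.
ColMajor FillLinesAsRows(const std::vector<std::vector<double>>& lines)
{
  ColMajor m(lines.size(), lines.empty() ? 0 : lines[0].size());
  for (std::size_t line = 0; line < lines.size(); ++line)
    for (std::size_t i = 0; i < lines[line].size(); ++i)
      m.at(line, i) = lines[line][i];
  return m;
}
```

Both functions store the same values; only the memory access pattern differs, which is one plausible reason for the gap between the two timings.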
Uploaded for code review; I haven't integrated it into the load function or run the test cases yet.
P.S.: This is single-threaded only. I do not know whether multithreading would make performance better or worse, since DatasetInfo is not a lock-free data structure. If we want to use multiple threads, I think we could read a batch of lines into a vector, create a thread pool with one DatasetInfo per thread, and merge the DatasetInfo objects at the end.
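The multithreaded idea could be sketched roughly as follows. This is only an illustration under several assumptions: `LocalInfo` is a hypothetical stand-in for a per-thread DatasetInfo (a plain token-to-id map, ignoring dimensions), plain `std::thread` workers replace a real thread pool, and the ids assigned after merging may differ from those a sequential pass would assign.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-thread DatasetInfo: token -> id.
using LocalInfo = std::map<std::string, std::size_t>;

// Tokenize lines [begin, end) and record every distinct token in `info`.
// Each worker writes only to its own `info`, so no locking is needed.
void MapChunk(const std::vector<std::string>& lines,
              std::size_t begin, std::size_t end, LocalInfo& info)
{
  for (std::size_t i = begin; i < end; ++i)
  {
    std::istringstream stream(lines[i]);
    std::string token;
    while (std::getline(stream, token, ','))
      info.emplace(token, info.size());  // No-op if the token is already mapped.
  }
}

LocalInfo MapAllLines(const std::vector<std::string>& lines,
                      std::size_t numThreads)
{
  numThreads = std::max<std::size_t>(1, numThreads);
  std::vector<LocalInfo> locals(numThreads);  // One map per worker.
  std::vector<std::thread> pool;
  const std::size_t chunk = (lines.size() + numThreads - 1) / numThreads;

  for (std::size_t t = 0; t < numThreads; ++t)
  {
    const std::size_t begin = std::min(t * chunk, lines.size());
    const std::size_t end = std::min(begin + chunk, lines.size());
    pool.emplace_back(MapChunk, std::cref(lines), begin, end,
                      std::ref(locals[t]));
  }
  for (std::thread& worker : pool)
    worker.join();

  // Merge sequentially after all workers finish; ids are reassigned here.
  LocalInfo merged;
  for (const LocalInfo& local : locals)
    for (const auto& entry : local)
      merged.emplace(entry.first, merged.size());
  return merged;
}
```

Whether this beats the single-threaded parser would need to be measured, since the merge step and thread start-up add their own cost.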