Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented maps in normalize_labels_impl.hpp #1780

Merged
merged 21 commits into from Apr 27, 2019

Conversation

Projects
None yet
3 participants
@jeffin143
Copy link
Contributor

commented Mar 16, 2019

In file src/mlpack/core/data/normalize_labels_impl.hpp, To find whether the labelsIn[i] was present there in mapping vector or not, we were using linear search in mapping vector but instead if we use C++ standard Maps then it would take log(n) time to find the element.
So if there are n elements in lableIn and m labels the time complexity will be nm for linear search but if we use map it would be nlog(m);

jeffin143 added some commits Mar 15, 2019

@mlpack-bot

This comment has been minimized.

Copy link

commented Mar 16, 2019

Thanks for opening your first pull request in this repository! Someone will review it when they have a chance. In the mean time, please be sure that you've handled the following things, to make the review process quicker and easier:

  • All code should follow the style guide
  • Documentation added for any new functionality
  • Tests added for any new functionality
  • Tests that are added follow the testing guide
  • Headers and license information added to the top of any new code files
  • HISTORY.md updated if the changes are big or user-facing
  • All CI checks should be passing

Thank you again for your contributions! 👍

@zoq

This comment has been minimized.

Copy link
Member

commented Mar 17, 2019

@mlpack-jenkins test this please

jeffin143 added some commits Mar 20, 2019

@jeffin143

This comment has been minimized.

Copy link
Contributor Author

commented Mar 23, 2019

@mlpack-jenkins test this please

@jeffin143 jeffin143 force-pushed the jeffin143:master branch from 7a5fbc2 to 33c9075 Mar 23, 2019

@rcurtin
Copy link
Member

left a comment

Hey @jeffin143, you're definitely right that the linear search over labels is not great. I'd generally expect there to be very few labels though, so technically the search should not take very long. However, accelerating this is definitely a nice thing.

If we use std::unordered_map<> (which is actually a hash table), actually the process will take constant O(1) time instead of O(log m), so if you want to change to use that, I think it can be faster. std::map isn't a hash table, actually underneath it uses red-black trees. (I can't remember if the C++ standard forces std::map to be an Rb-tree or if it's just the case that most implementations do that.)

About the boost::serialization bit, if we use unordered_map, I think it is already backported. 👍

Show resolved Hide resolved src/mlpack/core/data/normalize_labels_impl.hpp Outdated
Show resolved Hide resolved src/mlpack/core/data/normalize_labels_impl.hpp Outdated
Show resolved Hide resolved src/mlpack/core/data/normalize_labels_impl.hpp Outdated
Show resolved Hide resolved src/mlpack/core/data/normalize_labels_impl.hpp Outdated

@rcurtin rcurtin removed the s: unanswered label Mar 24, 2019

jeffin143 added some commits Mar 25, 2019

Merge pull request #5 from mlpack/master
Merging the latest merge
@jeffin143

This comment has been minimized.

Copy link
Contributor Author

commented Mar 27, 2019

Hey @jeffin143, you're definitely right that the linear search over labels is not great. I'd generally expect there to be very few labels though, so technically the search should not take very long. However, accelerating this is definitely a nice thing.

If we use std::unordered_map<> (which is actually a hash table), actually the process will take constant O(1) time instead of O(log m), so if you want to change to use that, I think it can be faster. std::map isn't a hash table, actually underneath it uses red-black trees. (I can't remember if the C++ standard forces std::map to be an Rb-tree or if it's just the case that most implementations do that.)

About the boost::serialization bit, if we use unordered_map, I think it is already backported. +1

@rcurtin , added unordered_map instead of map and made all the required changes

@jeffin143

This comment has been minimized.

Copy link
Contributor Author

commented Apr 21, 2019

@mlpack-jenkins test this please

Merge pull request #15 from mlpack/master
Syncing with master
@rcurtin
Copy link
Member

left a comment

Hey @jeffin143, this is a nice improvement. Thanks for doing it. I have a handful of comments (all really simple) and if you can handle them then I think we can merge this. 👍

Show resolved Hide resolved src/mlpack/core/data/normalize_labels_impl.hpp Outdated
Show resolved Hide resolved src/mlpack/core/data/normalize_labels_impl.hpp Outdated
Show resolved Hide resolved src/mlpack/core/data/normalize_labels_impl.hpp Outdated
found = true;
break;
}
labels[i] = labelMap[labelsIn[i]] - 1;

This comment has been minimized.

Copy link
@rcurtin

rcurtin Apr 26, 2019

Member

I don't follow why the - 1 and + 1 throughout; you could adapt the code to just do labels[i] = labelMap[labelsIn[i]] and this would be cleaner. I'd like it if you could do that. 👍

This comment has been minimized.

Copy link
@jeffin143

jeffin143 Apr 26, 2019

Author Contributor

Ahaa, I completely forgot about it to change.

I guess if you take a look at my previous version of code it was if (labelMap[labelsIn[i]] != 0) instead of (labelMap.count(labelsIn[i]) > 0) and thus i was forced to not insert 0, becasue i had to check the presence of an element .

Show resolved Hide resolved src/mlpack/core/data/normalize_labels_impl.hpp Outdated
@rcurtin
Copy link
Member

left a comment

Quick fixes, thanks! 👍 I'll approve once I see that the builds are all passing. Thanks again!

@rcurtin
Copy link
Member

left a comment

AppVeyor still isn't done yet but I'd be astonished if it failed, since Travis built it successfully. Thanks again for the contribution @jeffin143. 👍

@mlpack-bot

mlpack-bot bot approved these changes Apr 27, 2019

Copy link

left a comment

Second approval provided automatically after 24 hours. 👍

@rcurtin rcurtin merged commit a009898 into mlpack:master Apr 27, 2019

6 checks passed

LaTeX Documentation Checks Build finished.
Details
Memory Checks Build finished.
Details
Static Code Analysis Checks Build finished.
Details
Style Checks Build finished.
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@rcurtin

This comment has been minimized.

Copy link
Member

commented Apr 27, 2019

Thanks again! 👍

@jeffin143

This comment has been minimized.

Copy link
Contributor Author

commented Apr 27, 2019

Thanks again! +1

👍 :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.