This repository was archived by the owner on Aug 9, 2024. It is now read-only.

Conversation

halfak (Member) commented Sep 2, 2016

  • Implements FeatureVector
  • Implements meta-Datasources for generating bag-of-words vectors
    • hashing (fast, portable hashing with murmur3)
    • gramming (ngrams and skipgrams)
    • selectors (TF-IDF and raw key-based)
  • Refactors utilities to use JSON and base85-encoded pickle blobs for the cache. See https://phabricator.wikimedia.org/T132580 for the painful deliberations about this strategy.
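
For readers unfamiliar with the cache strategy in the last bullet, here is a minimal sketch of the general idea: values that JSON can represent are stored directly, and anything else is pickled and base85-encoded so it fits in a JSON string. The function names and the tagged `{"json": ...}` / `{"pickle_b85": ...}` layout are assumptions made for illustration, not the format this PR writes.

```python
import base64
import json
import pickle


def encode_blob(obj):
    """Pickle obj and return a base85 ASCII string that fits in a JSON string."""
    return base64.b85encode(pickle.dumps(obj)).decode("ascii")


def decode_blob(text):
    """Reverse encode_blob. Unpickling is unsafe on untrusted input."""
    return pickle.loads(base64.b85decode(text.encode("ascii")))


def dump_cache(cache):
    """Serialize a dict: JSON-native values as-is, everything else as a blob.

    The tagged entry layout is hypothetical, chosen only for this sketch.
    """
    doc = {}
    for key, value in cache.items():
        try:
            json.dumps(value)  # probe: can JSON represent this value?
            doc[key] = {"json": value}
        except TypeError:
            doc[key] = {"pickle_b85": encode_blob(value)}
    return json.dumps(doc)


def load_cache(text):
    """Rebuild the dict written by dump_cache."""
    cache = {}
    for key, entry in json.loads(text).items():
        if "pickle_b85" in entry:
            cache[key] = decode_blob(entry["pickle_b85"])
        else:
            cache[key] = entry["json"]
    return cache
```

One caveat of this sketch: tuples pass the JSON probe and come back as lists, so a real implementation would need to decide which types must always go through pickle.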

codecov-io commented Sep 2, 2016

Current coverage is 86.07% (diff: 53.57%)

Merging #287 into master will decrease coverage by 0.01%

@@             master       #287   diff @@
==========================================
  Files           207        217    +10   
  Lines          6584       6878   +294   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           5668       5920   +252   
- Misses          916        958    +42   
  Partials          0          0          

Powered by Codecov. Last update d39ae45...0d7f2c8

@@ -0,0 +1,283 @@
"""
Member

Why rename this utility?

halfak (Member Author), Sep 2, 2016

It does something different and is no longer specific to "features".

halfak (Member Author) commented Sep 2, 2016

Here's my gramming/hashing speed experiment: https://gist.github.com/halfak/565d1c2153da57c5c6600cb175f20236
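
For anyone reading without the gist open, a rough sketch of what "gramming" and "hashing" mean in this context: grams are built by picking tokens at fixed offsets inside a sliding window, and each gram is hashed into one of `n` buckets to produce a bag-of-words vector. All function names here are hypothetical and are not the API added in this PR. To stay within the standard library, `zlib.crc32` stands in for murmur3; a real run would swap in a murmur3 binding.

```python
import zlib
from collections import Counter


def gram(tokens, offsets):
    """Yield one gram per window position.

    offsets selects tokens inside the window: (0, 1) gives bigrams,
    (0, 2) gives skipgrams that skip one token.
    """
    span = max(offsets)
    for i in range(len(tokens) - span):
        yield tuple(tokens[i + o] for o in offsets)


def hash_gram(gram_, n=2 ** 20):
    """Map a gram to a stable bucket in [0, n).

    Assumption: crc32 is a stand-in for murmur3. Unlike Python's built-in
    hash() on strings, it gives the same value across processes.
    """
    data = "\x00".join(gram_).encode("utf-8")
    return zlib.crc32(data) % n


def bag_of_hashes(tokens, gram_offsets=((0,), (0, 1), (0, 2)), n=2 ** 20):
    """Count hashed unigrams, bigrams, and skipgrams for one token list."""
    counts = Counter()
    for offsets in gram_offsets:
        for g in gram(tokens, offsets):
            counts[hash_gram(g, n)] += 1
    return counts
```

Usage would look like `bag_of_hashes("the quick brown fox".split(), n=1024)`, which returns a sparse mapping from bucket index to count. A smaller `n` saves memory at the cost of more collisions between distinct grams.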

@Ladsgroup Ladsgroup merged commit 0f63427 into master Sep 2, 2016
@Ladsgroup Ladsgroup deleted the feature_vector_real branch April 29, 2017 08:35