make normalization a transformation #69

Closed
piskvorky opened this issue Nov 29, 2011 · 3 comments

@piskvorky
Member

commented Nov 29, 2011

Right now the similarity computations assume cosine similarity and transform all vectors to unit length implicitly, inside the *Similarity classes.

Instead, make normalization an explicit transformation and leave it up to the user to choose which normalization to use, if any. Example: vector = norm_l2[lsi_model[norm_l2[tfidf_model[bow_vector]]]].

The most common cases (L2 norm, L1 norm, identity) should be pre-defined, probably in gensim.matutils.
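
To make the proposal concrete, here is a minimal sketch of what such pre-defined transformations could look like. It is pure Python and does not use gensim; the class name Normalize and the objects norm_l2, norm_l1 and identity are hypothetical, chosen only to mirror the bracket syntax in the example above. Vectors are assumed to be sparse lists of (feature_id, weight) pairs.

```python
import math


class Normalize:
    """Hypothetical transformation that rescales a sparse vector via [] syntax."""

    def __init__(self, norm="l2"):
        if norm not in ("l1", "l2", "identity"):
            raise ValueError("unknown norm: %r" % norm)
        self.norm = norm

    def __getitem__(self, vector):
        # vector is a list of (feature_id, weight) pairs
        if self.norm == "identity":
            return list(vector)
        if self.norm == "l1":
            length = sum(abs(w) for _, w in vector)
        else:
            length = math.sqrt(sum(w * w for _, w in vector))
        if length == 0:
            # leave empty or all-zero vectors untouched
            return list(vector)
        return [(i, w / length) for i, w in vector]


# The three common cases named above, as ready-made objects (hypothetical names).
norm_l2 = Normalize("l2")
norm_l1 = Normalize("l1")
identity = Normalize("identity")

bow_vector = [(0, 3.0), (2, 4.0)]
unit = norm_l2[bow_vector]  # Euclidean length of the result is 1
```

Because each object only implements item access, it can be chained with other transformations in the same way as the models in the example, and passing identity leaves the vector unchanged.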

Also connected is issue #64 (allow custom similarity metrics).

@dsquareindia

Contributor

commented Mar 25, 2016

@piskvorky can I take this up? Should I add a norm parameter to the Similarity constructor, with norm_l2 as the default? As you suggested, I'll implement L2, L1 and identity in matutils.
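
A rough sketch of the constructor idea, to show how the default would behave. This is not gensim's Similarity class: SketchSimilarity, norm_l2 and identity below are hypothetical stand-ins written in pure Python, and the index is a plain list of dicts rather than a sharded on-disk structure.

```python
import math


def norm_l2(vector):
    """Scale a sparse vector of (feature_id, weight) pairs to unit L2 length."""
    length = math.sqrt(sum(w * w for _, w in vector))
    if length == 0:
        return list(vector)
    return [(i, w / length) for i, w in vector]


def identity(vector):
    """Return the vector unchanged (no normalization)."""
    return list(vector)


class SketchSimilarity:
    """Hypothetical index whose normalization is chosen by the caller."""

    def __init__(self, corpus, norm=norm_l2):
        self.norm = norm
        # normalize every indexed document once, up front
        self.index = [dict(norm(doc)) for doc in corpus]

    def __getitem__(self, query):
        # dot product of the normalized query with each indexed document
        q = dict(self.norm(query))
        return [sum(w * doc.get(i, 0.0) for i, w in q.items())
                for doc in self.index]


corpus = [[(0, 2.0)], [(1, 5.0)]]
cosine_scores = SketchSimilarity(corpus)[[(0, 1.0)]]
raw_scores = SketchSimilarity(corpus, norm=identity)[[(0, 1.0)]]
```

With the default, the scores are cosine similarities as before; passing identity gives plain dot products, so existing callers would see no change in behaviour.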

@dsquareindia

Contributor

commented Jun 1, 2016

Can this be closed?

@tmylk tmylk closed this Jun 1, 2016

@piskvorky

Member Author

commented Jun 1, 2016

Thanks a lot @dsquareindia !

Successfully closing a ticket from 2011, that's really rare :D Great job.
