
Text Similarity measures, classified in string, token, knowledge, corpus, combined distances.


sorice/textsim


python-textsim – Text Similarity on Python

Textsim is a Python library for measuring similarity between texts. It integrates several kinds of similarity measures, among them knowledge-based distances, lexical/string distances, corpus-based distances, and syntactic distances. Many of these measures are implemented in other libraries such as NLTK, Gensim, SciPy, and others.
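
As a rough illustration of two of the categories named above (a string distance and a token-based measure), here is a minimal, self-contained sketch in plain Python. It does not use textsim's API, which this README does not show; the function names `levenshtein` and `jaccard` are hypothetical and chosen only for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """String distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def jaccard(a: str, b: str) -> float:
    """Token similarity: shared whitespace-separated tokens divided by
    all distinct tokens, compared case-insensitively."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty texts are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))       # 3
    print(jaccard("the cat sat", "the cat ran"))  # 0.5
```

The string measure works on characters and returns a distance (lower means more similar), while the token measure works on word sets and returns a similarity between 0 and 1 (higher means more similar).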

Quick Example

Features

DIRECTORY LAYOUT

|----> phasis       -- application modules directory
|----> data
|      |----> az001 -- archives database
|       ----> dbAz  -- companies database
|----> images       -- application images directory
|----> spool        -- application spool directory
|----> tmp          -- application temporary files directory
|----> logs         -- application logs directory
|----> AUTHORS      -- author information
|----> LICENSE      -- license file for the Phasis application
|----> COPYING      -- GPL v2 license file
|----> README       -- this file
|----> default.cfg  -- configuration file
|----> phasis.py    -- script that runs the application

Support

Installation

Requirements

Citing TextSim

AUTHOR

Massimo Gerardi
Web: http://www.gnumangu.com
E-Mail: gnumagnu@gmail.com

Copyright

Acknowledgment

Thanks to the Gensim, Phasis, and Polyglot documentation.
