Skip to content
Switch branches/tags

Patient Case Similarity

patient-case-similarity is a Natural Language Processing (NLP) project used to calculate the similarity between two patients using WMD (Word Movers Distance).

The project was done as a part of Smart India Hackathon 2019 at National Institute of Technology Warangal, India. Project depicted was presented at the national level of SIH 2019.

How to use?

The user needs to provide the permission in linux Terminal using chmod +x

After the permission is granted, user has to run the

If you're having trouble running the script, you can type the commands manually in Linux terminal present in file.

Work Flow

The sample data (given to us) is present ./'Sample Dataset'/NER_XML/ The file used in shell script is copied from here to the root directory.

xslt is used for transforming and extracting annotated data into .csv from .xml

The data from .xml file is extracted and stored in ./data/ directory. The extracted data is preprocessed using the and finally the WMD Score is calculated using


NLP Project developed for Grand Finale (National Level, Software Edition) of Smart India Hackathon 2019 for ezDI.





No releases published


No packages published