SemanticApp

SemanticApp is a web application that computes the semantic similarity between two text documents. Semantic similarity measures how much two texts discuss the same topics, based on the meaning behind the words rather than just their presence. The application embeds each document with the pretrained DistilBERT model and scores the pair using the cosine similarity of the embeddings.

Semantic Similarity

Semantic similarity is the degree to which two pieces of text are alike in meaning. It goes beyond simple word matching by taking context into account, so two passages that use different words for the same idea can still score as similar. SemanticApp calculates this score for two uploaded text documents.
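
To illustrate why word matching alone falls short, here is a minimal sketch of a word-overlap (Jaccard) baseline. It is not part of SemanticApp; the function names `tokens` and `jaccard` and the example sentences are made up for illustration.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a: str, b: str) -> float:
    """Word-overlap score: shared words divided by all distinct words."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

# Same meaning, no shared words: the overlap score is zero.
paraphrase = jaccard("The physician treated the patient.",
                     "A doctor cared for someone who was ill.")

# Different meanings, but "the" and "bank" are shared: the score is higher.
unrelated = jaccard("The bank raised interest rates.",
                    "We sat on the river bank.")

print(paraphrase, unrelated)
```

The paraphrase pair scores lower than the unrelated pair, which is the failure that embedding-based similarity is meant to avoid.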

Purpose

The purpose of SemanticApp is to provide users with a tool for assessing how closely related two pieces of text are in terms of content and meaning. This can be useful in various applications such as document comparison, plagiarism detection, and content recommendation.

Technologies Used

SemanticApp utilizes the following technologies:

Cosine Similarity: The cosine of the angle between two non-zero vectors, used here to compare document embeddings.
DistilBERT: A pretrained transformer model for natural language understanding.
Streamlit: A Python library for creating interactive web applications.
Python: The programming language used for building the application.
NumPy: A library for numerical operations in Python.
Vector Embeddings: Numeric representations of text in a high-dimensional space, where texts with similar meaning lie close together.
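
As a rough illustration of the cosine similarity measure listed above, the following sketch computes it with NumPy. The function name `cosine_similarity` is chosen here for illustration and is not necessarily what `app.py` uses.

```python
import numpy as np

def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two non-zero vectors: (u . v) / (|u| |v|)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    norm_u = np.linalg.norm(u)
    norm_v = np.linalg.norm(v)
    if norm_u == 0.0 or norm_v == 0.0:
        # The measure is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for zero vectors")
    return float(np.dot(u, v) / (norm_u * norm_v))

# Vectors pointing in the same direction score close to 1 regardless of length.
print(cosine_similarity([1, 2, 3], [2, 4, 6]))
# Perpendicular vectors score 0.
print(cosine_similarity([1, 0], [0, 1]))
```

The score depends only on direction, not magnitude, so two embeddings of different lengths can still be compared.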

How It Works

Upload Documents: Users upload two text documents (in PDF or DOCX format) through the web interface.
Document Processing: The application reads the content of the documents using specialized functions for DOCX and PDF formats.
Semantic Embeddings: The text content is converted into vector embeddings using the DistilBERT model. These embeddings capture the semantic meaning of the text.
Cosine Similarity: Cosine similarity is calculated between the normalized embeddings of the two documents. This yields a semantic similarity score.
User Feedback: The application displays the content of the documents and the computed similarity score. The score is color-coded based on its magnitude, providing an intuitive understanding of the relationship between the texts.
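
The scoring and feedback steps above can be sketched as follows. This is a sketch under stated assumptions, not the code in `app.py`: it assumes each document embedding is obtained by mean-pooling the per-token vectors that DistilBERT produces (the README does not say how they are pooled), and the color names and thresholds (`high=0.8`, `low=0.5`) are hypothetical.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padding positions (mask == 0).

    Assumption: pooling strategy is not specified in the README.
    """
    emb = np.asarray(token_embeddings, dtype=float)           # (tokens, dim)
    mask = np.asarray(attention_mask, dtype=float)[:, None]   # (tokens, 1)
    return (emb * mask).sum(axis=0) / max(mask.sum(), 1.0)

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    vec = np.asarray(vec, dtype=float)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def similarity_score(emb_a, emb_b) -> float:
    """For unit-length vectors, the dot product equals the cosine similarity."""
    return float(np.dot(l2_normalize(emb_a), l2_normalize(emb_b)))

def score_color(score, high=0.8, low=0.5):
    """Map a score to a display color (hypothetical thresholds)."""
    if score >= high:
        return "green"
    if score >= low:
        return "orange"
    return "red"

# Toy token embeddings standing in for model output; the last row is padding.
doc_a = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
doc_b = mean_pool([[2.0, 3.5], [2.0, 2.5]], [1, 1])
score = similarity_score(doc_a, doc_b)
print(score, score_color(score))
```

Normalizing first means the similarity reduces to a single dot product, which is one reason the step is commonly described in terms of normalized embeddings.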

Author

SemanticApp was developed in February 2024 by Ajbar Alae, who also maintains it. Feel free to reach out with questions, feedback, or contributions at alae1ajbar@gmail.com.

