# Text Similarity Checker

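This notebook builds a small Python tool that measures how similar two pieces of text are.
We'll use TF-IDF (term frequency-inverse document frequency) to convert each text into a numerical vector, and then compute the cosine similarity between the two vectors. For TF-IDF vectors the score ranges from 0 (no terms in common) to 1 (identical term weights).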
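
To make the cosine similarity idea concrete, here is a minimal standard-library sketch that computes it for two small vectors by hand. The function name `cosine` is a hypothetical helper used only for illustration; the rest of this notebook relies on scikit-learn instead.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Assumption for this sketch: treat an all-zero vector as "no similarity"
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Vectors pointing in the same direction score close to 1
print(cosine([1, 2, 0], [2, 4, 0]))
# Vectors with no overlapping components score 0
print(cosine([1, 0], [0, 1]))
```

Only the direction of the vectors matters, not their length, which is why a long text and a short text can still score as highly similar.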


## Step 1: Import Necessary Libraries

We import two things from scikit-learn: `TfidfVectorizer`, which converts texts into vectors, and `cosine_similarity`, which measures how close two vectors are.


In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


## Step 2: Define the Function to Check Similarity

We'll create a function `check_similarity` that takes two texts, converts them into TF-IDF vectors, and calculates their cosine similarity.


In [None]:
def check_similarity(text1, text2):
    """Return the cosine similarity of the TF-IDF vectors of two texts."""
    # Fit TF-IDF on just these two texts; each row of `vectors` is one text
    vectorizer = TfidfVectorizer()
    vectors = vectorizer.fit_transform([text1, text2])
    # cosine_similarity returns a 1x1 array here, so pull out the single value
    similarity_score = cosine_similarity(vectors[0], vectors[1])[0][0]
    return similarity_score
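
To see roughly what happens inside `check_similarity`, the sketch below reimplements the two steps with the standard library only. It approximates the default settings of `TfidfVectorizer` (lowercasing, tokens of two or more word characters, smoothed IDF, L2 normalization); it is an illustration under those assumptions, not scikit-learn's actual code. The names `tokenize`, `tfidf_vectors`, and `tfidf_similarity` are hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: tokens are runs of 2+ word characters, as in the default settings
TOKEN = re.compile(r"\b\w\w+\b")

def tokenize(text):
    return TOKEN.findall(text.lower())

def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF vectors) for a list of texts."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    n = len(docs)
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1, where df = documents containing the term
    idf = {t: math.log((1 + n) / (1 + sum(t in c for c in counts))) + 1 for t in vocab}
    vectors = []
    for c in counts:
        raw = [c[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in raw)) or 1.0  # avoid dividing by zero
        vectors.append([x / norm for x in raw])
    return vocab, vectors

def tfidf_similarity(text1, text2):
    _, (v1, v2) = tfidf_vectors([text1, text2])
    # The vectors are already unit length, so the dot product is the cosine
    return sum(a * b for a, b in zip(v1, v2))

print(tfidf_similarity("machine learning is fun", "machine learning is hard"))
```

In this sketch, two four-word texts that share three words score roughly 0.60 rather than 0.75, because words that appear in only one of the two texts receive a higher IDF weight than the shared ones.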


## Step 3: Define Function for Interpretation

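To make the score easier to read, we map it to a short label. The cutoffs used here (0.5 and 0.75) are rough choices for this demo rather than standard thresholds, so adjust them to suit your texts.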


In [None]:
def interpret_score(score):
    if score >= 0.75:
        return "Highly similar content!"
    elif score >= 0.5:
        return "Moderately similar."
    else:
        return "Not very similar."
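
If you expect to tune the cutoffs, one option is to keep them in a small lookup table instead of an `if`/`elif` chain. The sketch below is a hypothetical variant (`interpret_score_table`, `THRESHOLDS`, and `LABELS` are illustrative names) that is meant to return the same labels as `interpret_score` for the same scores.

```python
from bisect import bisect_right

# Cutoffs in ascending order, with one more label than there are cutoffs
THRESHOLDS = [0.5, 0.75]
LABELS = ["Not very similar.", "Moderately similar.", "Highly similar content!"]

def interpret_score_table(score):
    # bisect_right counts how many cutoffs are <= score,
    # which matches the `>=` comparisons in interpret_score
    return LABELS[bisect_right(THRESHOLDS, score)]

for s in (0.2, 0.5, 0.75):
    print(s, interpret_score_table(s))
```

Adding another band then only requires inserting one cutoff and one label.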


## Step 4: Sample Text and User Input

Let's define a short sample text and ask the user to enter their own text to compare against it.


In [None]:
# Sample text to compare against
sample_text = "Machine learning is transforming technology."

# Get user input
user_text = input("Enter your text: ")


## Step 5: Calculate Similarity and Show Results

Now, let's use our functions to find the similarity score and display the interpretation.


In [None]:
# Calculate similarity
result = check_similarity(sample_text, user_text)

# Display results
print(f"Similarity Score: {result:.2f}")
print(interpret_score(result))


## Summary

- Use TF-IDF to convert texts into numerical vectors.
- Measure similarity with cosine similarity.
- Interpret the similarity score to understand how related the texts are.

This approach is a common starting point for tasks such as plagiarism detection and document clustering. Keep in mind that TF-IDF compares word overlap, so two texts that express the same idea in different words will still score low.
