#### Prototype implementation 

A prototype that matches a candidate's skills and work style, described in a short paragraph, against a job description, and then ranks multiple jobs by cosine similarity.

Implementing cosine similarity using dummy text instead of actual parsed data

In [None]:
# dummy text for testing

candidate_text = """
Experienced Python developer with strong background in backend systems and cloud deployment.
Prefers remote async work. Enjoys working on open-source projects and solving real-world problems.
"""

job_text = """
Looking for a Python backend engineer familiar with cloud technologies (AWS, Docker).
Experience with async programming and working in a distributed team is a plus.
"""

Using a pretrained embedding model to convert text into vectors

In [None]:
from sentence_transformers import SentenceTransformer 

# Load lightweight pretrained model
model = SentenceTransformer('all-MiniLM-L6-v2')  # Fast and good enough

# Generate embeddings
candidate_vector = model.encode(candidate_text)
job_vector = model.encode(job_text)

#### Computing cosine similarity using the NumPy library
Cosine similarity is the dot product of the two vectors divided by the product of their norms, which equals the cosine of the angle between them. The larger this normalized dot product, the more closely the two embeddings point in the same direction and the better the overlap between the texts. This is the main computation used to do the matchmaking.

In [7]:
from numpy import dot
from numpy.linalg import norm

def cosine_similarity(vec1, vec2):
    return dot(vec1, vec2) / (norm(vec1) * norm(vec2))

similarity_score = cosine_similarity(candidate_vector, job_vector)
print(f"Similarity Score: {similarity_score:.4f}")


Similarity Score: 0.8371


The score of 0.8371 can be read informally as an 83.71% match, which is in line with what we would expect given how closely the two dummy texts overlap.
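To make the behavior of the score more concrete, the sketch below applies the same formula to a few hand-picked toy vectors. These vectors are illustrative assumptions, not real embeddings. It shows that the score depends only on direction, not on vector length, and that it ranges from -1 to 1, so the percentage reading is best treated as an informal convenience.

```python
import numpy as np

def cosine_similarity(vec1, vec2):
    # Normalized dot product: cosine of the angle between vec1 and vec2
    return float(np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2)))

# Toy vectors chosen by hand for illustration (not model embeddings)
a = np.array([1.0, 2.0, 3.0])
same_direction = 10 * a                  # same direction, different length
orthogonal = np.array([3.0, 0.0, -1.0])  # dot product with a is 3 + 0 - 3 = 0
opposite = -a                            # exactly the opposite direction

sim_same = cosine_similarity(a, same_direction)
sim_orth = cosine_similarity(a, orthogonal)
sim_opp = cosine_similarity(a, opposite)

print(f"Same direction: {sim_same:.4f}")
print(f"Orthogonal:     {sim_orth:.4f}")
print(f"Opposite:       {sim_opp:.4f}")
```

Scaling a vector leaves the score unchanged, which is why a long and a short text about the same topic can still score highly against each other.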

#### Ranking multiple jobs

The next code cell ranks multiple jobs by their cosine similarity score.
It compares the embedding of ```candidate_text``` (```candidate_vector```) against each job description in the ```job_texts``` list.

In [19]:
job_texts = [
    "Looking for frontend engineer with React and CSS skills.",
    "Python backend developer with experience in cloud systems.",
    "Data analyst with SQL and Excel knowledge."
]

job_vectors = model.encode(job_texts)

# Rank jobs by similarity
scores = [(i, cosine_similarity(candidate_vector, job_vec)) for i, job_vec in enumerate(job_vectors)]
ranked = sorted(scores, key=lambda x: x[1], reverse=True)

# Print top matches
for idx, score in ranked:
    print(f"Job No. {idx} with cosine similarity {score:.4f}. i.e., Match of {score * 100:.2f}% \n Job description: {job_texts[idx]} \n")

Job No. 1 with cosine similarity 0.8230. i.e., Match of 82.30% 
 Job description: Python backend developer with experience in cloud systems. 

Job No. 0 with cosine similarity 0.4192. i.e., Match of 41.92% 
 Job description: Looking for frontend engineer with React and CSS skills. 

Job No. 2 with cosine similarity 0.2711. i.e., Match of 27.11% 
 Job description: Data analyst with SQL and Excel knowledge. 
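With many job descriptions, the per-job loop could be replaced by a single matrix operation. The following is a minimal sketch of that idea, not the method used above: the helper name `rank_by_cosine` and the toy vectors are hypothetical stand-ins, and in the notebook the inputs would be `candidate_vector` and `job_vectors`.

```python
import numpy as np

def rank_by_cosine(query_vec, doc_matrix):
    """Return (indices, scores) ordered from most to least similar.

    Hypothetical helper: query_vec has shape (d,), doc_matrix has shape (n, d).
    """
    query = np.asarray(query_vec, dtype=float)
    docs = np.asarray(doc_matrix, dtype=float)
    # Scale everything to unit length so a plain dot product equals the cosine
    query_unit = query / np.linalg.norm(query)
    docs_unit = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = docs_unit @ query_unit   # one cosine score per row
    order = np.argsort(-scores)       # indices sorted by descending score
    return order, scores[order]

# Toy stand-ins for embeddings, chosen by hand for illustration
toy_candidate = np.array([1.0, 1.0, 0.0])
toy_jobs = np.array([
    [0.0, 0.0, 1.0],  # orthogonal to the candidate
    [2.0, 2.0, 0.0],  # same direction as the candidate
    [1.0, 0.0, 0.0],  # 45 degrees away from the candidate
])

order, scores = rank_by_cosine(toy_candidate, toy_jobs)
for idx, score in zip(order, scores):
    print(f"Job No. {idx} with cosine similarity {score:.4f}")
```

For three jobs the list comprehension is perfectly adequate; the vectorized form mainly pays off when the number of job descriptions grows.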

