# Sentence embeddings and cosine similarity

In [1]:
! pip install -U sentence-transformers




In [20]:
from sentence_transformers import SentenceTransformer
# all-MiniLM-L6-v2 truncates input beyond its maximum sequence length of 256 word pieces
model = SentenceTransformer("all-MiniLM-L6-v2")
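Since anything past the sequence limit is truncated, a long input such as a full job description can lose its tail before it is embedded. One possible workaround is to split the text into overlapping windows and encode each window separately. The sketch below is only an illustration: the helper name `chunk_words` and the window sizes are assumptions, and whitespace-separated words only approximate the model's word-piece count.

```python
def chunk_words(text: str, max_words: int = 150, overlap: int = 30) -> list:
    """Split text into overlapping windows of at most max_words whitespace-separated words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Toy input of 400 words (hypothetical data, not from the notebook)
sample = " ".join(f"w{i}" for i in range(400))
chunks = chunk_words(sample)
print(len(chunks), [len(c.split()) for c in chunks])
```

Each chunk could then be passed to `model.encode`, with the resulting vectors averaged or compared individually; which pooling works best for resume matching is not something this notebook has tested.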

In [None]:
# model = SentenceTransformer("all-mpnet-base-v2")

# model = SentenceTransformer("bert-base-nli-mean-tokens")

# model = SentenceTransformer("BAAI/bge-m3")

# model = SentenceTransformer("all-distilroberta-v1")



In [23]:

# The sentences to encode
sentence1 = ''' Developed end-to-end APIs on the server side. I also built full-fledged features including live chatting through
 sockets using SocketIO, grpc communication between microservices, and ios notifications
 • Tech Stack: Flask-Python, MongoDB, MySQL, Kong, Redis, AWS, Docker, SocketIO, RESTful APIs, Jira, Git,
 Ubuntu-linux '''
sentence2 = ''' 
We are seeking skilled Backend Developers with a minimum of 1 year of development experience to join us as freelancers and contribute to impactful projects.

Key Responsibilities:

Write clean, efficient code for data processing and transformation.
Debug and resolve technical issues.
Evaluate and review code to ensure quality and compliance


Required Qualifications:

1+ year of Backend development experience.
Proficiency in server-side languages (e.g., Python, Java, Node.js).
Experience with database management (SQL, NoSQL).
Understanding of RESTful API design.


'''
sentence3 = '''
Bitbuffs Technologies Pvt. Ltd is looking for Front end developer Designing and implementing user-friendly and visually appealing web interfaces using HTML, CSS, and JavaScript.Building responsive designs that work on a variety of devices and screen sizes.Implementing interactive features such as forms, animations, and modals using JavaScript.Collaborating with back-end developers to integrate with APIs and server-side functionality.Debugging and fixing cross-browser compatibility issues.Optimizing website performance for speed and scalability.Writing clean, well-documented, and maintainable code.Participating in code reviews and contributing to team processes for continuous improvement.Should have excellent problem-solving, communication, and collaboration skills.
'''

sentence4  = '''

 SDEIntern
 CoRider- social ride-sharing app
 May 2023–Sep2023
 Remote
 • Developed end-to-end APIs on the server side. I also built full-fledged features including live chatting through
 sockets using SocketIO, grpc communication between microservices, and ios notifications
 • Tech Stack: Flask-Python, MongoDB, MySQL, Kong, Redis, AWS, Docker, SocketIO, RESTful APIs, Jira, Git,
 Ubuntu-linux

'''
# 2. Calculate embeddings by calling model.encode()
# Note: only the tech-stack line of the resume and the qualifications of the job post
# are encoded here, as inline strings, not the full sentence1/sentence2 variables.
embedding1 = model.encode("Flask-Python, MongoDB, MySQL, Kong, Redis, AWS, Docker, SocketIO, RESTful APIs, Jira, Git,Ubuntu-linux")
embedding2 = model.encode("Proficiency in server-side languages (e.g., Python, Java, Node.js).Experience with database management (SQL, NoSQL).Understanding of RESTful API design.")

print(embedding1.shape)
print(embedding2.shape)


(384,)
(384,)


In [24]:
from sentence_transformers.util import cos_sim

cosine_sim = cos_sim(embedding1, embedding2)
print(cosine_sim)
# Observed: dissimilar content yields similarity values below 2.5/3


tensor([[0.5155]])
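For reference, cosine similarity is the dot product of the two vectors divided by the product of their lengths, so it depends only on the angle between them and ranges from -1 to 1. The plain-Python sketch below shows the arithmetic on toy vectors; the function name `cosine_similarity` and the choice to return 0.0 for a zero vector are assumptions made for this illustration, not the behaviour of `cos_sim`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Parallel vectors point the same way, so the result is approximately 1
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
# Orthogonal vectors share nothing, so the result is 0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```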


# Using cross encoders

In [38]:
from sentence_transformers import CrossEncoder

# 1. Load a pre-trained CrossEncoder model
# Note: this rebinds `model`, replacing the SentenceTransformer loaded earlier
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")


In [41]:
# 2. Predict scores for a pair of sentences
scores = model.predict([
    (sentence4, sentence2),
    (sentence4, sentence3),
])
scores

array([-10.379882,  -9.711462], dtype=float32)
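The scores above are negative and unbounded, so they read as raw relevance scores rather than values on a 0 to 1 scale. If a bounded score is wanted for comparison with cosine similarity, one common option is to pass them through a sigmoid. The sketch below applies that to the two printed values; the helper `sigmoid` is written here for illustration and is not part of the notebook.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping any real number into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# Raw scores copied from the cell output above
raw_scores = [-10.379882, -9.711462]
squashed = [sigmoid(s) for s in raw_scores]
print(squashed)
```

Both values end up very close to 0, and the ordering is unchanged. This model was trained on MS MARCO passage ranking (short query against a passage), which may be why it rates a resume snippet against a job description so low; that explanation is a guess and has not been checked here.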

# Using a Mistral AI model to predict a similarity score


In [None]:
# ! pip install mistralai

In [1]:
my_experience = '''
    I have 2 year of experience in software engineering, machine learning, and AI research, with a strong focus on deep learning, computer vision, and backend development. My expertise spans across multiple frameworks and technologies, including TensorFlow, PyTorch, Python, Flask, MongoDB, MySQL, Docker, AWS, and ReactJS.

Research Experience
As a Student Researcher at Sardar Vallabhbhai Patel Institute of Technology (Oct 2023 - Jul 2024), I worked on Speech Emotion Recognition (SER), leveraging deep learning and signal processing techniques to enhance emotion classification from speech. I explored two key approaches:

Spectrogram-based deep learning model – Used spectrograms of audio data as input to a pre-trained CNN model, extracting deep features for classification.
Acoustic feature-based model – Extracted features such as MFCC, ZCR, and RMS from raw audio and trained a 1D CNN + LSTM model, achieving state-of-the-art results.
This research involved extensive experimentation with TensorFlow, PyTorch, and Librosa, applying advanced feature engineering and deep learning architectures.
Industry Experience
Previously, as an SDE Intern at CoRider (May 2023 - Sep 2023), I contributed to the backend development of a social ride-sharing app, building robust and scalable APIs. My key contributions included:

Developing end-to-end RESTful APIs using Flask and Python.
Implementing real-time chat functionality using SocketIO.
Setting up gRPC-based communication between microservices.
Integrating AWS services, managing MongoDB and MySQL databases, and working with Kong API Gateway.
Deploying services using Docker and handling distributed systems.
Notable Projects
Speech Emotion Recognition – Fine-tuned 1D CNN, 1D CNN + LSTM, and 1D CNN + BiLSTM models for emotion classification from speech data. Employed ensemble learning techniques to further enhance performance.
Exercise Correctly – Built a web app that analyzes exercise poses using PoseNet to extract key points and compare them with ideal poses using cosine similarity. Scraped yoga pose datasets using Scrapy. The frontend was developed using ReactJS and TailwindCSS, while the backend used Flask-RESTful.
Visual Question Answering (VQA) – Fine-tuned the multimodal LLM Moondream on a VQA dataset and deployed it using Hugging Face libraries and Gradio for real-time inference.
Technical Expertise
My skill set includes deep learning, computer vision, natural language processing, backend development, and cloud computing. I am proficient in machine learning techniques, including classification, linear and logistic regression, decision trees, gradient descent, recommender systems, and supervised/unsupervised learning. Additionally, I have hands-on experience in database management, API development, containerization, and distributed computing.

With a strong foundation in AI research and software development, I am passionate about solving real-world problems through cutting-edge machine learning models, scalable architectures, and innovative AI applications.
'''

moderate_match = '''
Zeta Global seeks a visionary backend developer to join our Data Cloud team and spearhead its evolution with Generative AI. By integrating this revolutionary tech, you'll craft next-generation data products, unlocking powerful solutions for specific client challenges. This role offers the chance to master the marketing tech landscape, work on cutting-edge AI, and become a pioneer in its application for marketing.


About the role: 

The expectations from this role are two-fold. 

Backend Developer who can perform data analysis and create outputs for Gen AI related tasks while supporting standard data analysis tasks. 
Gen AI expert who can effectively understand and translate business requirements and provide a Gen AI powered output independently.

The person should be able to: 

Analyse data as needed from the data tables to generate data summaries and insights that are used for Gen AI and non-Gen AI work. 
Collaborate effectively with cross-functional teams (engineering, product, business) to ensure alignment and understanding. 
Create and use AI assistants to solve business problems. 
Comfortable providing and advocating recommendations for a better user experience. 
Support product development teams, enabling them to create and manage APIs that interact with the Gen AI backend data and create a next gen experience for our clients. 
Visualize and create data flow diagrams and materials required for effective coordination with devops teams. 
Manage the deployment of related APIs in Kubernetes or other relevant spaces 
Provide technical guidance to the UI development and data analyst teams on Gen AI best practices. 
Coordinate with business teams to ensure the outputs are aligned with expectations. 
Continuously integrating new developments in the Gen AI space into our solutions and provide product and non-product implementation ideas to fully leverage potential of Gen AI.

The person should have:

Proven experience as a data analyst, with a strong track record of delivering impactful insights and recommendations. 
Strong working knowledge of OpenAI, Gemini or other Gen AI platforms, and prior experience in creating and optimizing Gen AI models. 
Familiarity with API and application deployment, data pipelines and workflow automation. 
High-agency mindset with strong critical thinking skills. 
Strong business acumen to proactively identify what is right for the business. 
Excellent communication and collaboration skills. 

Technical Skills: 

Python
SQL 
AWS Services (Lambda, EKS) 
Apache Airflow 
CICD (Serverless Framework) 
Git 
Jira / Trello

It will be great to have:

Good understanding of marketing/advertising product industry. 
At least 1 Gen AI project in production. 
Strong programming skills in Python or similar languages. 
Prior experience in working as Devops engineer or have worked closely with Devops. 
Strong background in data management.

'''

perfect_match = '''
About the Role
We are looking for a highly skilled AI/ML Engineer with strong expertise in deep learning, machine learning, and backend development. The ideal candidate should have experience in building and fine-tuning AI models, developing scalable backend systems, and deploying applications on cloud platforms. This role involves working on cutting-edge projects in computer vision, NLP, and multimodal AI, while also contributing to backend architecture, API development, and system optimization.

Key Responsibilities
AI & Machine Learning
Develop, fine-tune, and deploy deep learning models using TensorFlow, PyTorch, and Hugging Face libraries.
Work on Speech Emotion Recognition (SER), Visual Question Answering (VQA), and pose estimation applications.
Implement and optimize 1D CNN, LSTM, BiLSTM, and ensemble learning models for various AI tasks.
Utilize Librosa, pandas, NumPy, and Matplotlib for data preprocessing, feature extraction, and visualization.
Conduct hyperparameter tuning, model evaluation, and performance optimization.
Backend Development & System Architecture
Design and develop scalable RESTful APIs using Flask and Flask-RESTful.
Implement real-time features such as WebSockets and SocketIO for chat applications.
Develop gRPC-based microservices for efficient inter-service communication.
Work with MongoDB, MySQL, and Redis for database management and optimization.
Deploy applications using Docker and manage cloud infrastructure on AWS.
Full-Stack & Web Development
Build AI-powered web applications using ReactJS, TailwindCSS, and Gradio.
Implement data scraping solutions using Scrapy for gathering training datasets.
Ensure responsive UI/UX and smooth user interactions for AI-driven applications.
Skills & Qualifications
Must-Have:
Strong experience with TensorFlow, PyTorch, Python, and Scikit-learn.
Proficiency in Flask, REST APIs, MongoDB, MySQL, Redis, and Docker.
Knowledge of machine learning concepts such as classification, regression, decision trees, and gradient descent.
Experience in LLMs, multimodal models, and AI research.
Hands-on experience with Ubuntu/Linux, Git, and Jira for development and collaboration.
Familiarity with cloud services (AWS), containerization, and microservices architecture.
Nice-to-Have:
Experience with Gradio, Kong API Gateway, and PoseNet for real-time applications.
Understanding of retrieval-augmented generation (RAG) and recommender systems.
Strong interest in AI for real-world problem-solving and automation.
Years of experience required: 2

'''

frontend_job = '''

Responsible for front-end requirement analysis
Design develop application and website User Interfaces
Responsible for the integration of web applications and components with the HTML markup
Responsible for the development of web pages, multimedia, GUIs
Working with PSDs
Utilizes wireframes and graphic pre-designs where appropriate
Effectively develops in a clean, well structured, easily maintainable format.
Responsible for meeting expectations and deliverables on time and in high quality.
Bug Fixing / Issues reporting Documentation.
Analyzing opportunities for improvement and its implementation
Knowledge of UI best practices
Practical experience in development of HTML, JavaScript, CSS, Jquery
Solid understanding of navigation and GUI for maximizing usability
Seamless integration of front-end to back-end functionality
Should have working knowledge of using a latest development tools and techniques
Front-end and some back-end development skills
Must have good problem solving and analysis skills
Team-player with strong communication collaboration skills
LOCK Xcellence-IT, and have the KEY to Innovative, Timely, Reliable and Perfect Services.
eXcel with Xcellence-IT!!!
Get on board with us, to get Quality IT Consulting, Solutions Services!!
When Xcellence-IT works... PROFIT follows!!!
PROFIT just happens with Xcellence-IT!!
Candidate should be any graduate/post graduate in Computer Science or related degree program
Excellent communication presentation skills
Atleast 1 years of relevant industry experience is required
 '''

uiux_job = '''
We are seeking talented Graphic Designers specializing in branding to join our team. The ideal candidate will possess a deep understanding of visual communication, user-centered design principles, and the ability to create compelling designs that amplify our technology solutions' impact. In this role, you will drive the creation of captivating branding, promotional materials, and user-friendly application interfaces that resonate with our audience.


Visual Branding: Craft and maintain a cohesive visual brand identity, designing logos, typography, color palettes, and brand assets that resonate with our tech-focused audience.
Promotional Materials: Design captivating marketing collateral, including banners, infographics, and presentations, that effectively communicate our technology's value proposition.
User-Centered Design: Conduct user research and usability testing to inform design decisions, ensuring our technology solutions align with user needs and preferences.
Responsive Design: Create responsive designs that adapt seamlessly to various devices and screen sizes, enhancing user accessibility and engagement.
Visual Storytelling: Craft custom graphics, illustrations, and visual narratives that convey complex technological concepts in an engaging and easily understandable manner.
Innovative Creativity: Bring innovative design ideas to the table, pushing the boundaries of design conventions to create unique and memorable experiences.
Cross-Functional Collaboration: Partner with marketing teams to design visually impactful campaigns and collaborate with developers to ensure design implementation aligns with intended user experiences.
Visual Trends: Stay updated on design trends and emerging technologies to ensure our designs remain current and resonant within the tech industry.
Application Interface Design: Collaborate with development teams to create user interfaces for web and mobile software applications, focusing on intuitive interactions and streamlined user experiences.
Wire framing & Prototyping: Develop wireframes and interactive prototypes to visualize and iterate on application interfaces, ensuring optimal usability.

Requirements

Graphic Design Expertise: Demonstrated proficiency in graphic design with a portfolio showcasing branding, promotional materials.
Design Tools: Proficiency in design tools such as Adobe Creative Suite (Illustrator, Photoshop), Coreldraw, Figma
Degree: A degree in Graphic Design, Visual Communication, UI/UX Design, or a related field is preferred.
Innovative Thinking: Creative mindset that pushes the boundaries of design norms to create visually stunning and impactful designs.
User-Centered Mindset: Ability to empathize with users, translating their needs into intuitive and appleaing design solutions.
Collaboration: Effective communication and collaboration skills to work closely with multidisciplinary teams.
Responsive Design: Knowledge of responsive design principles to ensure seamless experiences across devices.
Advantage: Web and Mobile application interface design and wire framing & prototyping skills. Proficiency in tools like Figma, Sketch, or similar.



 '''

aiml_engineer_intern =''' 

Refonte Learning is seeking enthusiastic individuals for our prestigious Data Science Study and Internship Program. This intensive initiative offers a unique opportunity to collaborate closely with our seasoned data science team on diverse and impactful projects.

Refonte Learning is seeking enthusiastic individuals who are looking to learn Data Science from beginning to advanced level while also working on live projects globally. RIGTIP (Refonte Infini Global Training & Internship Program) is designed as a manner that offers you to work in a flexible work environment but also offers working with people in a global team from Oceania, Asia, Europe, & American continents.

Job Description:
We are excited to offer an internship opportunity for individuals passionate about the convergence of AI, Data Science, DevOps, and Cloud technologies. As an AI, Data Science, DevOps, and Cloud Intern, you will have the unique opportunity to gain hands-on experience in these dynamic and interconnected fields, working on innovative projects and collaborating with experienced professionals.
Responsabilities:
Collaborate with cross-functional teams to develop, deploy, and maintain AI-driven solutions.
Assist in collecting, cleaning, and analyzing data from various sources to derive actionable insights.
Contribute to the design and implementation of scalable data pipelines for model training and deployment.
Support the integration of machine learning models into production systems using DevOps best practices.
Participate in designing and optimizing CI/CD pipelines for continuous integration, testing, and deployment of AI applications on cloud platforms.
Assist in infrastructure provisioning, configuration, and monitoring using cloud services such as AWS, Azure, or Google Cloud.
Work closely with data scientists and engineers to understand business requirements and translate them into technical solutions.
Research and experiment with emerging technologies and tools in AI, Data Science, DevOps, and Cloud domains.
Collaborate on documentation efforts to ensure knowledge sharing and best practices across the team.
Projects You Will Work On:
Multi Cloud AI Infrastructure Configuration, Automation and Deployment
Full Stack AI DevOps & Development
Generative AI model, Large Language Models and Foundations models to transform input to output. NB: Input can be text, images, audios or videos; Output can also be text, images, audios or videos
Finance Fraud Detection: Develop advanced fraud detection algorithms leveraging financial data analysis.
Recommender System: Contribute to personalized recommendation systems, enhancing user experiences across platforms.
Sentiment Analysis: Explore sentiment analysis to extract insights from textual data, shaping user sentiment understanding.
Chatbots: Engage in intelligent chatbot development, revolutionizing customer interactions and support.
Image/Audio Video Classification: Push boundaries with multimedia technology by working on image and audio video classification projects.
Text Analysis: Uncover hidden patterns in textual data through sophisticated text analysis techniques.
Roles & Responsibilities:
Collaborate with our esteemed data science experts to collect, clean, and analyze extensive datasets, honing skills in data preprocessing and visualization.
Contribute to the development of predictive models and algorithms, employing cutting-edge machine learning techniques to solve real-world challenges.
Work closely with team members to design, implement, and evaluate experiments, fostering a collaborative and innovative environment.
Stay updated with the latest industry trends and best practices in data science, applying newfound knowledge to enhance project outcomes.
Qualifications:
Currently pursuing any degree showcasing a strong commitment to continuous learning and professional growth.
Exceptional written and verbal communication skills, vital for effective collaboration and articulation of complex ideas.
Demonstrated ability to work both independently and as part of a cohesive team, highlighting adaptability and strong teamwork capabilities.

'''

In [2]:
from mistralai import Mistral
# from google.colab import userdata
from dotenv import dotenv_values

my_secrets = dotenv_values(".env")


# api_key = userdata.get('MISTRAL_API_KEY')
api_key = my_secrets["MISTRAL_API_KEY"]
model = "mistral-large-latest"

client = Mistral(api_key=api_key)

chat_response = client.chat.complete(
    model = model,
    messages = [
        {
            "role": "user",
            # "content": f"You are a recruiter/ hiring manager looking for the best fit for a job against a candidate resume. The non-negotiables to find the best match are same skills, years of experience required and context of experience that is relevant to job expectations and responsiblities.Calculate the similarity score (0 lowest-1highest) between 2 groups of sentences, one is candidate experience and other is job description. The similarity score should be calculated based on skills matched in the sentence both lexically and semantically. If the must have or mandatory skills or statements are satisfied or are matched then give a high score. Similarly if contect of experience aligns with expectations and responsibilities give a good score. Make sure the matches with high score satisfy the expected years of experience in job listing.If the candidate has no experience or has done work in completely different domain and no skills coincide then give a low score. Give a moderate score if some skills match.Only give the score as output to the query. Avoid being biased over formal education required, focus on skills and experience.Sentence 1: {my_experience} and Sentence 2: {aiml_engineer_intern}"
            "content": f'''
                [INST] [SYS]
                You are an experienced technical recruiter specializing in matching candidates to job descriptions with precision and objectivity. Your task is to analyze the match between a candidate's experience and a job posting.
                [/SYS]

                Calculate a similarity score (0.0-1.0, with two decimal places) between the candidate experience and job description below.

                SCORING CRITERIA:
                1. SKILLS MATCH (50%): Evaluate both lexical and semantic matches between technical skills mentioned
                2. CONTEXT RELEVANCE (30%): Assess how well the candidate's work context aligns with the job responsibilities
                3. EXPERIENCE DURATION (20%): Verify if the candidate meets the required years of experience

                IMPORTANT GUIDELINES:
                - Mandatory/must-have skills must be present for scores above 0.7
                - Ignore formal education requirements; focus exclusively on skills and experience
                - If candidate's domain is completely different with no overlapping skills, score below 0.3
                - For partial skill matches, assign scores between 0.3-0.7 based on relevance
                - Output ONLY the numerical score (e.g., "0.75") with no explanation or reasoning

                CANDIDATE EXPERIENCE:
                {my_experience}

                JOB DESCRIPTION:
                {perfect_match}
                [/INST]
            
            
            '''
        },
    ],
)

print(chat_response.choices[0].message.content)

0.90
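The prompt asks for a bare number, but a chat model can still wrap the score in extra words, so the reply is safer to parse than to cast directly. A minimal sketch of such a parser follows; the name `parse_score` and the decision to return `None` for anything outside 0 to 1 are assumptions for illustration.

```python
import re
from typing import Optional


def parse_score(reply: str) -> Optional[float]:
    """Return the first number found in reply if it lies in [0, 1], else None."""
    match = re.search(r"-?(?:\d*\.\d+|\d+)", reply)
    if match is None:
        return None
    value = float(match.group())
    return value if 0.0 <= value <= 1.0 else None


print(parse_score("0.90"))         # the reply shown above
print(parse_score("Score: 0.75"))  # number wrapped in text
print(parse_score("no score"))     # nothing to parse
```

In the cell above, this would be called as `parse_score(chat_response.choices[0].message.content)`.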


# Better approach for skill matching


In [25]:
from dotenv import load_dotenv
load_dotenv()

True

In [None]:
# TODO: build skill matching on top of the approaches above, with a separate check for non-negotiable (must-have) skills
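One possible shape for this step, offered only as a sketch: assume skills have already been extracted into lists (the extraction itself is not shown), score the overlap, and cap the result when any must-have skill is missing. The function name `skill_match_score`, the 0.8/0.2 weighting, and the example skill lists are all assumptions; the 0.7 cap mirrors the guideline in the prompt above that must-have skills are required for scores above 0.7.

```python
def skill_match_score(candidate_skills, must_have, nice_to_have=(), cap=0.7):
    """Return (score, missing_must_haves) using exact, case-insensitive skill matching."""
    cand = {s.strip().lower() for s in candidate_skills}
    must = {s.strip().lower() for s in must_have}
    nice = {s.strip().lower() for s in nice_to_have}

    missing = sorted(must - cand)
    must_cov = len(must & cand) / len(must) if must else 1.0
    if nice:
        nice_cov = len(nice & cand) / len(nice)
        score = 0.8 * must_cov + 0.2 * nice_cov  # assumed weighting
    else:
        score = must_cov
    if missing:
        score = min(score, cap)  # non-negotiable skills gate the top of the range
    return round(score, 2), missing


# Hypothetical skill lists for illustration
candidate = ["Python", "Flask", "Docker", "AWS", "MongoDB", "MySQL"]
score, missing = skill_match_score(
    candidate,
    must_have=["Python", "Flask", "Docker", "AWS", "Kubernetes"],
    nice_to_have=["MongoDB", "MySQL"],
)
print(score, missing)
```

Exact matching will miss semantic equivalents (for example "MySQL" against a requirement for "SQL"), so the embedding similarity from the first section could be used to decide whether two skill names count as a match.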