## Developing a recommendation model using cosine similarity  

### First, I will build a data frame containing jobs from different sectors. The scraping code is in a separate file in my repository, where you can follow that process. Here I read the CSV files it produced and concatenate them into a single data frame, which the recommendation model is built on.


In [362]:
import pandas as pd
import glob

In [363]:
## First step: concatenate the job postings that I scraped from Indeed

path = r"C:\Users\nikos\Desktop\web_scraping"
all_files = glob.glob(path + "/*.csv")

li = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    li.append(df)

df = pd.concat(li, axis=0, ignore_index=True)

In [364]:
## Let's print the data frame
df.head()

Unnamed: 0,Job Title,Job Url,Company,Location,Summary,Posting Date,Desc
0,COVID-19 Vaccination Programme Admin Support,http://www.indeed.com/rc/clk?jk=b174d86342664758&fccid=c7e13048dcc4f2c8&vjs=3,Bank Partners,London,Providing admin support on the vaccination activity by supporting the immuniser with vaccination delivery and vaccination records.,15 days ago,"Location: Barts Health NHS TrustThink you could deliver the Covid-19 vaccine? Now is your chance to be part of history for the residents of north east London.We have a variety of newly-created paid roles available across both clinical and administrative roles with flexible shift patterns. We need people from all walks of life to help us as we reach this epic milestone in our fight against the virus.As a COVID-19 Vaccination (Admin) Support you will be responsible for providing administrative support within a team of staff working in a mass vaccination site. You will also be responsible for:Accessing and maintaining accurate patient records, adhering to confidentiality as per the site’s policy.Providing admin support on the vaccination activity by supporting the immuniser with vaccination delivery and vaccination records.Recording vaccination consent and marking completion.Ensuring infection and waste control at the vaccination station."
1,Vaccine Admin Support Covid Delivery Programme,http://www.indeed.com/rc/clk?jk=5abe6dee9da1e139&fccid=c076058f5dfd0a7a&vjs=3,Epson and St Helier University Hospitals,South West London,"However, non-EEA candidates may not be appointed to a post if a suitably qualified, experienced and skilled EU/EEA candidate is available to take up the post as…",1 day ago,"Applications from job seekers who require Tier 2 sponsorship to work in the UK are welcome and will be considered alongside all other applications. However, non-EEA candidates may not be appointed to a post if a suitably qualified, experienced and skilled EU/EEA candidate is available to take up the post as the employing body is unlikely, in these circumstances, to satisfy the resident labour market test. UK Visas and Immigration (UKVI) requires employers to complete this test to show that no suitably qualified EEA or EU worker can fill the post. For further information please visit the UKVI website. From 6 April 2017, Tier 2 skilled worker applicants, applying for entry clearance into the UK, must present a criminal record certificate from each country they have resided continuously or cumulatively for 12 months or more in the past 10 years. Adult dependants (over 18 years old) will also be subject to this requirement. Guidance can be found here Criminal Records Checks for Oversea..."
2,Admin Personal Assistant,http://www.indeed.com/rc/clk?jk=aac567d7742e3752&fccid=aa26a8d42d4036bf&vjs=3,Furness Primary School,London,"3 days per week, term time only + 10 days.\nAbility to build relationships with a range of stakeholders and anticipate other’s needs.",30+ days ago,"PART TIME SCHOOL ADMINISTRATOR/PA TO HEADTEACHER3 days per week, term time only + 10 days. Hours: 8 am – 4pm, half an hour lunch.Salary: £21,748 per annum pro rata + LW £1,978 per annum pro rataStart Date: January 2021The Role: In this role you will be providing a comprehensive administrative support to Furness Primary School, including providing support to the Senior Leadership Team. You will collaborate with other members of the school’s admin and finance team to uphold the vision and ethos of the school at all times and provide excellent customer service to a range of stakeholders.KEY RESPONSIBILITIESProvide general clerical and administrative support for the school.Taking minutes of meetings as and when required.Maintain computerised records and management information systems, providing accurate lists or data as required by colleagues or compliance bodies.Ownership of the school’s admin email address, ensuring all correspondence received is actioned/ answered/forwarded as appro..."
3,Administrative Assistant,http://www.indeed.com/rc/clk?jk=bc15fcd81349ac74&fccid=95523754d68d9059&vjs=3,Hakna,London,There may also be a need for them to be an initial point of contact within the team and forward enquiries in line with local procedures.,5 days ago,"Job Description for Administrative Assistant:Administrative assistants play a key role in the provision of a high quality and responsive business support service to teams and frontline staff. They will need to work in a flexible and supportive manner with a range of staff in the team, depending on team requirements. Depending on the team within which the post holder works, they may be required to undertake additional specific responsibilities to assist in the effective delivery of services. There may also be a need for them to be an initial point of contact within the team and forward enquiries in line with local procedures.Role Purpose : To provide an administrative support service to operational and management teams within Adult Social Care that enables the smooth running of day-to-day activities. To work collaboratively with team members in delivering the directorate’s aim to provide personalised services for Adults in the community"
4,Administration Assistant – Facilities,http://www.indeed.com/rc/clk?jk=2545de5d84e6fa26&fccid=c869af706f9f123f&vjs=3,Clarke Willmott,London,Ensuring attention to detail and accuracy when working with multiple documents.\nProviding hospitality for client meetings.,Just posted,"You will be joining two other full-time team members in our central London office and reporting to our Office Manager. Whilst this role sits within Facilities Management, you will be supporting our fee earning teams with complex and process driven administrative tasks. Alongside this, the role will also encompass business support responsibilities and occasional Front of House reception work.On a day to day basis you will be assisting our busy team and your duties will include but not be limited to:Sorting incoming and outgoing post (arranging couriers when required)Carrying out general administration duties including scanning, photocopying, printing, CD copying and document bindingEnsuring attention to detail and accuracy when working with multiple documentsEscorting contractors when requiredMaintaining the office stationery stocks and ordering when necessaryAssisting with meeting and greeting clients on our Front of HouseAnswering and directing calls in a professional mannerBookin..."


In [365]:
df.shape

(182, 7)

In [366]:
## Check for duplicates in the "Job Url" column
a = df["Job Url"].unique()
len(a)

171

In [367]:
## The initial data frame has 182 rows but only 171 unique URLs, so let's drop the duplicates
new_df = df.drop_duplicates(subset=['Job Url'])
new_df.shape

(171, 7)

In [368]:
## Now I will read a CV (a sample CV for an admin position that I found online)
import docx2txt

# read the word file
cv = docx2txt.process("monster-cv-template-admin-assistant.docx")
cv

'Uschi Barker\n\nAddress: Flat 0, Any Road, Any Town, Postcode\nEmail: name@hotmail.com | Telephone: +44 000 000 000 \n\nPERSONAL STATEMENT \n\nEfficient, organised Administrator with over 15 years’ experience and a record of working to very high standards. Proven literacy and numeracy skills – proficient user of MS Office, with a typing speed of 75 WPM. Holds an extensive list of certificates and a positive attitude to professional development. Excels in collaboration; possesses a proven record of inspiring others in different environments. Manages diaries, meetings and events effectively, and is able to resolve difficult customer and admin situations. Seeking a challenging PA or Executive Assistant role in a large company in order to develop and gain further skills\n\n\n\nEMPLOYMENT HISTORY\n\n06/2012 – Present \n\nCompany\n\nCity, Country\n\nSales Workflow Assistant  \n\nDealt with key account needs for major commercial contacts within the business\n\nInduction champion and ‘go-to’ 

In [371]:
## Some text cleaning before inserting the CV into the job advertisements data frame
text = cv.replace("\n", "")
text

'Uschi BarkerAddress: Flat 0, Any Road, Any Town, PostcodeEmail: name@hotmail.com | Telephone: +44 000 000 000 PERSONAL STATEMENT Efficient, organised Administrator with over 15 years’ experience and a record of working to very high standards. Proven literacy and numeracy skills – proficient user of MS Office, with a typing speed of 75 WPM. Holds an extensive list of certificates and a positive attitude to professional development. Excels in collaboration; possesses a proven record of inspiring others in different environments. Manages diaries, meetings and events effectively, and is able to resolve difficult customer and admin situations. Seeking a challenging PA or Executive Assistant role in a large company in order to develop and gain further skillsEMPLOYMENT HISTORY06/2012 – Present CompanyCity, CountrySales Workflow Assistant  Dealt with key account needs for major commercial contacts within the businessInduction champion and ‘go-to’ person within the departmentTrained other memb
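
Replacing `"\n"` with an empty string glues together words that were separated only by a line break (note `BarkerAddress` and `skillsEMPLOYMENT` in the output above), and such fused tokens may fail to match words in the job adverts later on. One possible alternative is to collapse each run of whitespace into a single space. This is only a sketch: `normalise_whitespace` is a hypothetical helper that is not part of the original notebook, and the sample string is a shortened stand-in for the real CV text.

```python
import re


def normalise_whitespace(raw: str) -> str:
    """Collapse every run of whitespace (newlines, tabs, spaces) into one space."""
    return re.sub(r"\s+", " ", raw).strip()


# Shortened, made-up stand-in for the text returned by docx2txt
sample = "Uschi Barker\n\nAddress: Flat 0, Any Road\nEmail: name@hotmail.com"
print(normalise_whitespace(sample))
```

With this approach the name and the word "Address" stay separate tokens instead of becoming `BarkerAddress`.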

### In this step I add my CV to the data frame. My goal is a script that can send me job advertisements similar to my CV, so I make up a URL for the CV. That URL does not exist, but it serves as a key that distinguishes the CV row from the advertisements in the data frame.


In [372]:
new_row = {'Job Title':'Admin', 'Job Url':"http://www.indeed.com/rc/clk?jk=3c9df461c8afddce&fccid=160efb82f2462f14&vjs=1002", 'Desc': text}
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
new_df = pd.concat([new_df, pd.DataFrame([new_row])], ignore_index=True)

In [374]:
## Let's check if our CV is in the data frame
new_df[new_df["Job Url"] == "http://www.indeed.com/rc/clk?jk=3c9df461c8afddce&fccid=160efb82f2462f14&vjs=1002"]

Unnamed: 0,Job Title,Job Url,Company,Location,Summary,Posting Date,Desc
171,Admin,http://www.indeed.com/rc/clk?jk=3c9df461c8afddce&fccid=160efb82f2462f14&vjs=1002,,,,,"Uschi BarkerAddress: Flat 0, Any Road, Any Town, PostcodeEmail: name@hotmail.com | Telephone: +44 000 000 000 PERSONAL STATEMENT Efficient, organised Administrator with over 15 years’ experience and a record of working to very high standards. Proven literacy and numeracy skills – proficient user of MS Office, with a typing speed of 75 WPM. Holds an extensive list of certificates and a positive attitude to professional development. Excels in collaboration; possesses a proven record of inspiring others in different environments. Manages diaries, meetings and events effectively, and is able to resolve difficult customer and admin situations. Seeking a challenging PA or Executive Assistant role in a large company in order to develop and gain further skillsEMPLOYMENT HISTORY06/2012 – Present CompanyCity, CountrySales Workflow Assistant Dealt with key account needs for major commercial contacts within the businessInduction champion and ‘go-to’ person within the departmentTrained other m..."


In [375]:
## Text preprocessing
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

# Download the tokenizer, POS tagger, WordNet and stop-word resources
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')
nltk.download('stopwords')

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english'))

# Penn Treebank POS tags that mark verbs
VERB_CODES = {'VB', 'VBD', 'VBG', 'VBN', 'VBP', 'VBZ'}

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\nikos\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\nikos\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\nikos\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\nikos\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [376]:
## A text-cleaning function: lowercase, tokenize, lemmatize, drop stop words
def preprocess_sentences(text):
    text = text.lower()
    temp_sent = []
    words = nltk.word_tokenize(text)
    tags = nltk.pos_tag(words)

    for word, (_, tag) in zip(words, tags):
        # Lemmatize verbs as verbs; everything else uses the default (noun)
        if tag in VERB_CODES:
            lemmatized = lemmatizer.lemmatize(word, 'v')
        else:
            lemmatized = lemmatizer.lemmatize(word)
        # Keep purely alphabetic tokens that are not stop words
        if lemmatized not in stop_words and lemmatized.isalpha():
            temp_sent.append(lemmatized)

    finalsent = ' '.join(temp_sent)
    # Note: the isalpha() filter above has already dropped every token that
    # contains an apostrophe, so these replacements are effectively no-ops
    finalsent = finalsent.replace("n't", " not")
    finalsent = finalsent.replace("'m", " am")
    finalsent = finalsent.replace("'s", " is")
    finalsent = finalsent.replace("'re", " are")
    finalsent = finalsent.replace("'ll", " will")
    finalsent = finalsent.replace("'ve", " have")
    finalsent = finalsent.replace("'d", " would")

    return finalsent


## Create a new column holding the preprocessed descriptions
new_df["Dexc proc"] = new_df["Desc"].apply(preprocess_sentences)
new_df.head()


Unnamed: 0,Job Title,Job Url,Company,Location,Summary,Posting Date,Desc,Dexc proc
0,COVID-19 Vaccination Programme Admin Support,http://www.indeed.com/rc/clk?jk=b174d86342664758&fccid=c7e13048dcc4f2c8&vjs=3,Bank Partners,London,Providing admin support on the vaccination activity by supporting the immuniser with vaccination delivery and vaccination records.,15 days ago,"Location: Barts Health NHS TrustThink you could deliver the Covid-19 vaccine? Now is your chance to be part of history for the residents of north east London.We have a variety of newly-created paid roles available across both clinical and administrative roles with flexible shift patterns. We need people from all walks of life to help us as we reach this epic milestone in our fight against the virus.As a COVID-19 Vaccination (Admin) Support you will be responsible for providing administrative support within a team of staff working in a mass vaccination site. You will also be responsible for:Accessing and maintaining accurate patient records, adhering to confidentiality as per the site’s policy.Providing admin support on the vaccination activity by supporting the immuniser with vaccination delivery and vaccination records.Recording vaccination consent and marking completion.Ensuring infection and waste control at the vaccination station.",location bart health nh trustthink could deliver vaccine chance part history resident north east variety paid role available across clinical administrative role flexible shift pattern need people walk life help u reach epic milestone fight vaccination admin support responsible provide administrative support within team staff work mass vaccination site also responsible accessing maintain accurate patient record adhere confidentiality per site admin support vaccination activity support immuniser vaccination delivery vaccination vaccination consent mark infection waste control vaccination station
1,Vaccine Admin Support Covid Delivery Programme,http://www.indeed.com/rc/clk?jk=5abe6dee9da1e139&fccid=c076058f5dfd0a7a&vjs=3,Epson and St Helier University Hospitals,South West London,"However, non-EEA candidates may not be appointed to a post if a suitably qualified, experienced and skilled EU/EEA candidate is available to take up the post as…",1 day ago,"Applications from job seekers who require Tier 2 sponsorship to work in the UK are welcome and will be considered alongside all other applications. However, non-EEA candidates may not be appointed to a post if a suitably qualified, experienced and skilled EU/EEA candidate is available to take up the post as the employing body is unlikely, in these circumstances, to satisfy the resident labour market test. UK Visas and Immigration (UKVI) requires employers to complete this test to show that no suitably qualified EEA or EU worker can fill the post. For further information please visit the UKVI website. From 6 April 2017, Tier 2 skilled worker applicants, applying for entry clearance into the UK, must present a criminal record certificate from each country they have resided continuously or cumulatively for 12 months or more in the past 10 years. Adult dependants (over 18 years old) will also be subject to this requirement. Guidance can be found here Criminal Records Checks for Oversea...",application job seeker require tier sponsorship work uk welcome consider alongside application however candidate may appoint post suitably qualify experienced skilled candidate available take post employ body unlikely circumstance satisfy resident labour market test uk visa immigration ukvi require employer complete test show suitably qualify eea eu worker fill post information please visit ukvi website april tier skilled worker applicant apply entry clearance uk must present criminal record certificate country reside continuously cumulatively month past year adult dependant year old also subject requirement guidance find criminal record check overseas post subject rehabilitation offender act exception order necessary submission disclosure make disclosure barring service
2,Admin Personal Assistant,http://www.indeed.com/rc/clk?jk=aac567d7742e3752&fccid=aa26a8d42d4036bf&vjs=3,Furness Primary School,London,"3 days per week, term time only + 10 days.\nAbility to build relationships with a range of stakeholders and anticipate other’s needs.",30+ days ago,"PART TIME SCHOOL ADMINISTRATOR/PA TO HEADTEACHER3 days per week, term time only + 10 days. Hours: 8 am – 4pm, half an hour lunch.Salary: £21,748 per annum pro rata + LW £1,978 per annum pro rataStart Date: January 2021The Role: In this role you will be providing a comprehensive administrative support to Furness Primary School, including providing support to the Senior Leadership Team. You will collaborate with other members of the school’s admin and finance team to uphold the vision and ethos of the school at all times and provide excellent customer service to a range of stakeholders.KEY RESPONSIBILITIESProvide general clerical and administrative support for the school.Taking minutes of meetings as and when required.Maintain computerised records and management information systems, providing accurate lists or data as required by colleagues or compliance bodies.Ownership of the school’s admin email address, ensuring all correspondence received is actioned/ answered/forwarded as appro...",part time school day per week term time day hour half hour per annum pro rata lw per annum pro ratastart date january role role provide comprehensive administrative support furness primary school include provide support senior leadership team collaborate member school admin finance team uphold vision ethos school time provide excellent customer service range responsibilitiesprovide general clerical administrative support minute meeting computerise record management information system provide accurate list data require colleague compliance school admin email address ensure correspondence receive school trip efficiently take responsibility booking coordination trip school diary annual weekly planning organise oversee school record staff absence prepare return work interview sheet readiness external tandem school business manager ensure school health safety db check new school single central record update compliant main school office note job description exhaustive list add specificat...
3,Administrative Assistant,http://www.indeed.com/rc/clk?jk=bc15fcd81349ac74&fccid=95523754d68d9059&vjs=3,Hakna,London,There may also be a need for them to be an initial point of contact within the team and forward enquiries in line with local procedures.,5 days ago,"Job Description for Administrative Assistant:Administrative assistants play a key role in the provision of a high quality and responsive business support service to teams and frontline staff. They will need to work in a flexible and supportive manner with a range of staff in the team, depending on team requirements. Depending on the team within which the post holder works, they may be required to undertake additional specific responsibilities to assist in the effective delivery of services. There may also be a need for them to be an initial point of contact within the team and forward enquiries in line with local procedures.Role Purpose : To provide an administrative support service to operational and management teams within Adult Social Care that enables the smooth running of day-to-day activities. To work collaboratively with team members in delivering the directorate’s aim to provide personalised services for Adults in the community",job description administrative assistant administrative assistant play key role provision high quality responsive business support service team frontline staff need work flexible supportive manner range staff team depend team requirement depend team within post holder work may require undertake additional specific responsibility assist effective delivery service may also need initial point contact within team forward enquiry line local purpose provide administrative support service operational management team within adult social care enable smooth running activity work collaboratively team member deliver directorate aim provide personalised service adult community
4,Administration Assistant – Facilities,http://www.indeed.com/rc/clk?jk=2545de5d84e6fa26&fccid=c869af706f9f123f&vjs=3,Clarke Willmott,London,Ensuring attention to detail and accuracy when working with multiple documents.\nProviding hospitality for client meetings.,Just posted,"You will be joining two other full-time team members in our central London office and reporting to our Office Manager. Whilst this role sits within Facilities Management, you will be supporting our fee earning teams with complex and process driven administrative tasks. Alongside this, the role will also encompass business support responsibilities and occasional Front of House reception work.On a day to day basis you will be assisting our busy team and your duties will include but not be limited to:Sorting incoming and outgoing post (arranging couriers when required)Carrying out general administration duties including scanning, photocopying, printing, CD copying and document bindingEnsuring attention to detail and accuracy when working with multiple documentsEscorting contractors when requiredMaintaining the office stationery stocks and ordering when necessaryAssisting with meeting and greeting clients on our Front of HouseAnswering and directing calls in a professional mannerBookin...",join two team member central london office reporting office manager whilst role sit within facility management support fee earn team complex process driven administrative task alongside role also encompass business support responsibility occasional front house reception day day basis assist busy team duty include limit sort incoming outgo post arrange courier require carry general administration duty include scanning photocopy printing cd copying document bindingensuring attention detail accuracy work multiple documentsescorting contractor requiredmaintaining office stationery stock order necessaryassisting meeting greeting client front houseanswering direct call professional meeting room requiredproviding hospitality client meetingsbooking taxi hotel train wider officeit integral part role comply information security firm policy youyou fantastic attention detail meticulous follow process procedure always health safety forefront proactive flexible work individually part team minima...
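
In `preprocess_sentences`, a token is kept only if `isalpha()` is true, so pieces such as `n't` or `'s` have already been discarded by the time the contraction replacements run on the joined string. If expanding contractions matters, one option is to do it on the raw text before tokenising. The sketch below is an assumption rather than the notebook's method: `CONTRACTIONS` and `expand_contractions` are hypothetical names, and the mapping simply mirrors the suffixes handled in the function above (including its simplification that `'s` always means "is").

```python
# Hypothetical mapping, mirroring the suffixes handled in preprocess_sentences
CONTRACTIONS = {
    "n't": " not",
    "'m": " am",
    "'re": " are",
    "'ll": " will",
    "'ve": " have",
    "'d": " would",
    "'s": " is",
}


def expand_contractions(text: str) -> str:
    """Expand common English contractions in raw, untokenised text."""
    # The scraped adverts use curly apostrophes, so normalise them first
    text = text.replace("\u2019", "'")
    for suffix, expansion in CONTRACTIONS.items():
        text = text.replace(suffix, expansion)
    return text


print(expand_contractions("I don't think we'll go"))
```

Many of the expansions are themselves stop words and would be filtered out afterwards, so the practical effect on the similarity scores is likely to be small.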


In [377]:
## Final Data set for implementing the algorithm
final_data = new_df[["Job Url", "Dexc proc"]]
final_data

Unnamed: 0,Job Url,Dexc proc
0,http://www.indeed.com/rc/clk?jk=b174d86342664758&fccid=c7e13048dcc4f2c8&vjs=3,location bart health nh trustthink could deliver vaccine chance part history resident north east variety paid role available across clinical administrative role flexible shift pattern need people walk life help u reach epic milestone fight vaccination admin support responsible provide administrative support within team staff work mass vaccination site also responsible accessing maintain accurate patient record adhere confidentiality per site admin support vaccination activity support immuniser vaccination delivery vaccination vaccination consent mark infection waste control vaccination station
1,http://www.indeed.com/rc/clk?jk=5abe6dee9da1e139&fccid=c076058f5dfd0a7a&vjs=3,application job seeker require tier sponsorship work uk welcome consider alongside application however candidate may appoint post suitably qualify experienced skilled candidate available take post employ body unlikely circumstance satisfy resident labour market test uk visa immigration ukvi require employer complete test show suitably qualify eea eu worker fill post information please visit ukvi website april tier skilled worker applicant apply entry clearance uk must present criminal record certificate country reside continuously cumulatively month past year adult dependant year old also subject requirement guidance find criminal record check overseas post subject rehabilitation offender act exception order necessary submission disclosure make disclosure barring service
2,http://www.indeed.com/rc/clk?jk=aac567d7742e3752&fccid=aa26a8d42d4036bf&vjs=3,part time school day per week term time day hour half hour per annum pro rata lw per annum pro ratastart date january role role provide comprehensive administrative support furness primary school include provide support senior leadership team collaborate member school admin finance team uphold vision ethos school time provide excellent customer service range responsibilitiesprovide general clerical administrative support minute meeting computerise record management information system provide accurate list data require colleague compliance school admin email address ensure correspondence receive school trip efficiently take responsibility booking coordination trip school diary annual weekly planning organise oversee school record staff absence prepare return work interview sheet readiness external tandem school business manager ensure school health safety db check new school single central record update compliant main school office note job description exhaustive list add specificat...
3,http://www.indeed.com/rc/clk?jk=bc15fcd81349ac74&fccid=95523754d68d9059&vjs=3,job description administrative assistant administrative assistant play key role provision high quality responsive business support service team frontline staff need work flexible supportive manner range staff team depend team requirement depend team within post holder work may require undertake additional specific responsibility assist effective delivery service may also need initial point contact within team forward enquiry line local purpose provide administrative support service operational management team within adult social care enable smooth running activity work collaboratively team member deliver directorate aim provide personalised service adult community
4,http://www.indeed.com/rc/clk?jk=2545de5d84e6fa26&fccid=c869af706f9f123f&vjs=3,join two team member central london office reporting office manager whilst role sit within facility management support fee earn team complex process driven administrative task alongside role also encompass business support responsibility occasional front house reception day day basis assist busy team duty include limit sort incoming outgo post arrange courier require carry general administration duty include scanning photocopy printing cd copying document bindingensuring attention detail accuracy work multiple documentsescorting contractor requiredmaintaining office stationery stock order necessaryassisting meeting greeting client front houseanswering direct call professional meeting room requiredproviding hospitality client meetingsbooking taxi hotel train wider officeit integral part role comply information security firm policy youyou fantastic attention detail meticulous follow process procedure always health safety forefront proactive flexible work individually part team minima...
...,...,...
167,http://www.indeed.com/rc/clk?jk=8b7dd86955a721d1&fccid=9e9fd8a77343c806&vjs=3,exciting opportunity join establish company base reading area order picker warehouse temporary permanent opportunity work night company part friendly hardworking team pride maintain high standard efficiently work environment successful role need warehouse experience computer include order picking stock rotation within warehouse process delivery note use scanner general housekeeping whilst workingthe shift monday interested job role please hesitate submit cv
168,http://www.indeed.com/pagead/clk?mo=r&ad=-6NYlbfkN0AdTLGXAwdJY9smqTrxeiFfNaxzxctNoC5YukC3r5oD4G7MHGjy-B6GcUrQFisDokBlx6I8AHqGkc-htJ0U_E_npsZCaoxuM_Yh_g5WD9nehwk4VbpmaJhLLsQ4L3TUS7OQY8t9MurLocxarQF_nZq1W7WZ9kdTBu85GC-cyN5oulp25Sm2Nyj35pVTleQcWSiEfftlcOr1-mVZMGnyFU3u_IWTIuuAhe7YBynw_VUvC-q0fAB5_eKAklnMiOyCrmCqfDKUNz82S9Ua4xiFb1FYoezc4BESlhKcWTDwSeGuBBN5F-7ys3WJsgI_da4xDCOSXcNCJOJMgjfMMGIAMQrYCbivNrIDmqZ_GmwN9yvixjZF_94rVIii6Im0_cdAWlBC1P5rJ9-P9cZz0sTmEACNcri-3TVv-gpUmrpB3ZI2EblPjpbM34JMNbjdNgnVSGTYdqjVzP2acoe_cTHYStejG5Pr5RK6XVU=&p=12&fvj=1&vjs=3,dutiesmust year builder merchant experience forklift licenceforklift dutiescheck good load vehicle delivery delivery notespicking loading vehicle safely securely load uneven loading count stockkeep yard tidy orderserve customer collect materialsworking hour monday friday alternate saturday day holiday bank holidaysjob type type per yearbenefits company hour shiftexperience forklift counter balance year require builder merchant year require warehouse year require location finchley central station prefer job duty load unload lorry safe efficient mannermove stack materialskeep production line supply empty container packaging material neededgeneral housekeep duty warehouse production area duty shipmentsinspect maintain equipment report fault wear tear
169,http://www.indeed.com/company/White-Van-Gentlemen/jobs/Warehouse-Operative-a31dd63374e696bf?fccid=8f53eca5b22716cb&vjs=3,job descriptionwhite van gentleman white glove removal company base south west london year experience pride provide complete removal service offer bespoke removal service well man van delivery service addition offer storage solution size duration team experienced deliver high standard service every expand bespoke delivery service look welcome new member intel someone cable work warehouse system tablet android apple productsyou require work closely operation team daily need extremely focused great attention tot work team require team keep tidy warehousepallet wrapping general warehouse dutiesmanually unload lorry quality check good inorganize product correct store area bayspick prepare order installation teamsreceiving check stockusing warehouse equipment pallet jackscommunicating daily check client regardingkeeping stock software datepreparing stock short deadlinesrequirementswritten verbal englishexcellent organizational time management skillswork quickly efficiently pay close det...
170,http://www.indeed.com/company/All-Pet-Solutions/jobs/Warehouse-Operative-5464f79039ae6f5c?fccid=620e21da2460816d&vjs=3,pet solution online market leader go sustained period growth enter year huge expansion plan come year perfect opportunity join team exciting offer full time day per day hour sunday saturday rota issue week work within team warehouse operativespicking pack ordersrefilling stock shelvessafely load unload large often heavy container delivery within allocate timeworking fragile product ensure safe packing transportensuring set productivity target metmaintaining high standard health great communicator always look way helpfriendly calm efficient even busiest daysexcited challenge varied parkingbonus schemeemployee discountsreference id aps warehouse operative picker packer replenjob type permanentsalary per hourexperience warehouse year require location uxbridge prefer language english require


### Cosine Similarity:
The job descriptions are transformed into vectors in a geometric space, so the angle between two vectors indicates how close the two descriptions are. Cosine similarity measures that closeness as the cosine of the angle between the vectors. With word-count vectors, which are never negative, the score ranges from 0 (no words in common) to 1 (the same direction).
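
To make the idea concrete, here is a minimal sketch that computes the cosine, cos θ = (A · B) / (|A| |B|), by hand for two tiny texts. It uses raw word counts split on whitespace, which only approximates what `CountVectorizer` does (the real class applies its own tokenisation rules). The helper name `cosine_from_counts` and the two example strings are assumptions made for illustration; they are not part of the notebook.

```python
import math
from collections import Counter

def cosine_from_counts(text_a, text_b):
    """Cosine similarity between two texts using raw word counts."""
    counts_a = Counter(text_a.split())
    counts_b = Counter(text_b.split())
    # Dot product: only words present in both texts contribute
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)

# Hypothetical, already-cleaned texts (not taken from the data set)
job = "warehouse operative load unload lorry"
cv = "warehouse experience load lorry forklift"
score = cosine_from_counts(job, cv)
print(round(score, 3))  # → 0.6 (3 shared words, 5 words in each text)
```

Identical texts score 1 only up to floating-point rounding, which is why a text compared with itself can show a value such as 0.9999999999999986 rather than exactly 1.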

In [379]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity 

count = CountVectorizer()
count_matrix = count.fit_transform(final_data['Dexc proc'])
cosine_sim = cosine_similarity(count_matrix)
print(cosine_sim)

[[1.         0.04126661 0.18151655 ... 0.08260162 0.07620499 0.10476454]
 [0.04126661 1.         0.10875201 ... 0.14913013 0.0880522  0.09126919]
 [0.18151655 0.10875201 1.         ... 0.18950196 0.14201333 0.15494925]
 ...
 [0.08260162 0.14913013 0.18950196 ... 1.         0.26988337 0.12259438]
 [0.07620499 0.0880522  0.14201333 ... 0.26988337 1.         0.07983581]
 [0.10476454 0.09126919 0.15494925 ... 0.12259438 0.07983581 1.        ]]


## Building a recommendation function that returns the top 10 job descriptions most similar to your CV:

In [381]:
## Finding the index of our CV, i.e. the row that holds our text in the description column

user_url = "http://www.indeed.com/rc/clk?jk=3c9df461c8afddce&fccid=160efb82f2462f14&vjs=1002"

def get_index_from_url(url):
    return final_data[final_data["Job Url"] == url].index.values[0]
    

url = get_index_from_url(user_url)
url

171

In [382]:
## Creating a list with the similarity of our CV to every job description
similar_jobs = list(enumerate(cosine_sim[url]))
similar_jobs

[(0, 0.10476454436543672),
 (1, 0.09126918636192564),
 (2, 0.15494925090586797),
 (3, 0.1431971603389681),
 (4, 0.17238473259593615),
 (5, 0.15977729015977948),
 (6, 0.1107739502333261),
 (7, 0.15926144393053876),
 (8, 0.11812651790609877),
 (9, 0.1703727074605764),
 (10, 0.12319152471463941),
 (11, 0.19818626434583295),
 (12, 0.1803141697242315),
 (13, 0.2418728449756597),
 (14, 0.2288626653694555),
 (15, 0.22403305120122177),
 (16, 0.1827676932405473),
 (17, 0.23551119173538854),
 (18, 0.0776566469211916),
 (19, 0.17467497698445048),
 (20, 0.1006421324789799),
 (21, 0.22403305120122177),
 (22, 0.08032711545197997),
 (23, 0.22154790046665213),
 (24, 0.21782118162804753),
 (25, 0.21181345437818172),
 (26, 0.12751534261266764),
 (27, 0.21782118162804753),
 (28, 0.21782118162804753),
 (29, 0.21782118162804753),
 (30, 0.21782118162804753),
 (31, 0.1504277527223641),
 (32, 0.21782118162804753),
 (33, 0.21782118162804753),
 (34, 0.16823082982871956),
 (35, 0.2013488589318756),
 (36, 0.21782

In [383]:
## Sorting the list in descending order of similarity
sorted_similar_jobs = sorted(similar_jobs, key=lambda x: x[1], reverse=True)
sorted_similar_jobs

[(171, 0.9999999999999986),
 (109, 0.26764693139028045),
 (13, 0.2418728449756597),
 (17, 0.23551119173538854),
 (14, 0.2288626653694555),
 (149, 0.22623995411025635),
 (15, 0.22403305120122177),
 (21, 0.22403305120122177),
 (146, 0.22399601788396722),
 (23, 0.22154790046665213),
 (24, 0.21782118162804753),
 (27, 0.21782118162804753),
 (28, 0.21782118162804753),
 (29, 0.21782118162804753),
 (30, 0.21782118162804753),
 (32, 0.21782118162804753),
 (33, 0.21782118162804753),
 (36, 0.21782118162804753),
 (38, 0.21782118162804753),
 (39, 0.21782118162804753),
 (41, 0.21782118162804753),
 (42, 0.21782118162804753),
 (43, 0.21782118162804753),
 (44, 0.21782118162804753),
 (45, 0.21782118162804753),
 (52, 0.21782118162804753),
 (132, 0.21746852605578149),
 (118, 0.21670406454457042),
 (37, 0.21532761076060403),
 (25, 0.21181345437818172),
 (84, 0.21115368309936872),
 (135, 0.20586606098082774),
 (164, 0.204609985676822),
 (35, 0.2013488589318756),
 (11, 0.19818626434583295),
 (154, 0.196182886

In [384]:
## Printing the jobs that best fit your CV
pd.options.display.max_colwidth = 1000

def get_url_from_index(index):
    # Return the "Job Url" entry (as a one-row Series) for the given row index
    return final_data[final_data.index == index]["Job Url"]

# Print the 16 most similar entries; the first one is the CV itself
for job_index, _ in sorted_similar_jobs[:16]:
    print(get_url_from_index(job_index))

171    http://www.indeed.com/rc/clk?jk=3c9df461c8afddce&fccid=160efb82f2462f14&vjs=1002
Name: Job Url, dtype: object
109    http://www.indeed.com/company/CentraNic-Ltd/jobs/Group-Financial-Data-Analyst-59999e06fd4b9009?fccid=004345d9813bd437&vjs=3
Name: Job Url, dtype: object
13    http://www.indeed.com/company/CriterionCapital/jobs/Office-Administration-Assistant-64da8cde37d422d9?fccid=1931435125b82129&vjs=3
Name: Job Url, dtype: object
17    http://www.indeed.com/pagead/clk?mo=r&ad=-6NYlbfkN0BIQv-klv4x57wzcCCXZDuUs4ETBBTY7U4BZbqajjMT5rLx4iIBIgIDjvqIt6UO8LKeIOY33Wnt4_eGGFmqJeUFdqLBu7U5oyAp-J0dXDp4UiTLVL041HcriHxDT6myJ6B1t5jySkfSP0xrQ1MSGJug_oWZSIBng5uU3tgIaZmdrw1f0HFsYk5o_w5zejOWcDSmC6lgzvZJ6vOa82rSFId3FatHT_qfXMi-PufkEZX4WyY6n0oncWV21jlODJXWsKuoJYw7GGwET4yAfy66eZiJmeyDU1xf9Dt8-V27KKcbAybIWSHq7Mjgv8OjIYZvnDEPVvDV5XHAtg10Eimq-WRh4abc6WEORw6KeHRnDyo2JYUI_WLt98BE3GEtIU9J_2zOiS1fykc7VgdRV6Zb9Gbk5l3k44WDTbVg7sH7h6rRjCG67hJvDDgIMKNCQzMTdIY=&p=2&fvj=1&vjs=3
Name: Job Url, dtype: object
14   

In [385]:
## Keep only the top 10 matches (skipping the first entry, which is the CV itself),
## so that their links can be emailed to the person looking for positions similar to their CV
from pandas import DataFrame
df = DataFrame(sorted_similar_jobs[1:11], columns=["index", "similarity"])
df

Unnamed: 0,index,similarity
0,109,0.267647
1,13,0.241873
2,17,0.235511
3,14,0.228863
4,149,0.22624
5,15,0.224033
6,21,0.224033
7,146,0.223996
8,23,0.221548
9,24,0.217821
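
The slice `[1:11]` relies on the CV being the first entry after sorting. As a hedged alternative, the sketch below excludes the CV by its index instead, which stays correct even if another row ties with it at the top. The function name `top_matches` and the sample row are hypothetical; in the notebook it could be called as `top_matches(cosine_sim[url], url)`.

```python
def top_matches(similarities, own_index, n=10):
    """Return the n (index, score) pairs most similar to the CV, best first.

    similarities: one row of the similarity matrix (a sequence of floats)
    own_index:    position of the CV itself, which is excluded explicitly
    """
    candidates = [(i, s) for i, s in enumerate(similarities) if i != own_index]
    # sorted() is stable, so entries with equal scores keep their index order
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)[:n]

# Hypothetical similarity row; position 2 plays the role of the CV
row = [0.10, 0.27, 1.00, 0.24, 0.24, 0.05]
best = top_matches(row, own_index=2, n=3)
print(best)  # → [(1, 0.27), (3, 0.24), (4, 0.24)]
```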


In [386]:
## The data frame above holds each job's index and its degree of similarity to the CV that I uploaded.
## I need the links as plain text in order to create the message, so I look them up by index
## and save them into a text file.

# Collect the URL of each recommended job
text = [final_data["Job Url"].iloc[i] for i in df["index"]]

# Write the links to the file once, separated by blank lines
with open("message.txt", "w") as f:
    f.write("\n\n".join(map(str, text)))

# Read the links back; the file is closed automatically
with open("message.txt", "r") as f:
    body = f.read()

# Print the links, in other words the text from which I will build the messages
print(body)

http://www.indeed.com/company/CentraNic-Ltd/jobs/Group-Financial-Data-Analyst-59999e06fd4b9009?fccid=004345d9813bd437&vjs=3

http://www.indeed.com/company/CriterionCapital/jobs/Office-Administration-Assistant-64da8cde37d422d9?fccid=1931435125b82129&vjs=3

http://www.indeed.com/pagead/clk?mo=r&ad=-6NYlbfkN0BIQv-klv4x57wzcCCXZDuUs4ETBBTY7U4BZbqajjMT5rLx4iIBIgIDjvqIt6UO8LKeIOY33Wnt4_eGGFmqJeUFdqLBu7U5oyAp-J0dXDp4UiTLVL041HcriHxDT6myJ6B1t5jySkfSP0xrQ1MSGJug_oWZSIBng5uU3tgIaZmdrw1f0HFsYk5o_w5zejOWcDSmC6lgzvZJ6vOa82rSFId3FatHT_qfXMi-PufkEZX4WyY6n0oncWV21jlODJXWsKuoJYw7GGwET4yAfy66eZiJmeyDU1xf9Dt8-V27KKcbAybIWSHq7Mjgv8OjIYZvnDEPVvDV5XHAtg10Eimq-WRh4abc6WEORw6KeHRnDyo2JYUI_WLt98BE3GEtIU9J_2zOiS1fykc7VgdRV6Zb9Gbk5l3k44WDTbVg7sH7h6rRjCG67hJvDDgIMKNCQzMTdIY=&p=2&fvj=1&vjs=3

http://www.indeed.com/company/Agina-ltd/jobs/Office-Administrator-d3d925d98ebd3d3d?fccid=33d409a0d113e3d3&vjs=3

http://www.indeed.com/company/Builder-Depot/jobs/Packing-Dispatch-Warehouse-Assistant-4128a5380a31f2fe?fccid=9ee

### The final step is to send the messages to friends or clients using the smtplib module. I also attached an image at the bottom of the email.

In [403]:
## Reading my login code (password) from a text file, so it never appears in the notebook
with open("my_personal_file.txt") as file:
    lines = file.readlines()

In [404]:
import smtplib
import imghdr  # deprecated since Python 3.11 and removed in Python 3.13
from email.message import EmailMessage

Sender_Email = "nikoskalikis@gmail.com"

## You can send the emails in two ways: 1) one email addressed to everyone in your list,
## in which case every recipient can see the other addresses in the list,
## OR 2) a separate email to each person, which is the method that I used.

Receiver_Emails = ["nikoskalikis@gmail.com", "despoina615@hotmail.com"]
Password = lines[1].strip()  # readlines() keeps the trailing newline, so strip it
for receiver in Receiver_Emails:
    try:
        newMessage = EmailMessage()
        newMessage['Subject'] = "Check out some positions that fit your CV"
        newMessage['From'] = Sender_Email
        newMessage['To'] = receiver
        newMessage.set_content("Check the jobs that could be a great fit for you\n\n" + body)
        # Attach the logo image to the email
        with open('logo.png', 'rb') as f:
            image_data = f.read()
            image_type = imghdr.what(f.name)
            image_name = f.name
        newMessage.add_attachment(image_data, maintype='image', subtype=image_type, filename=image_name)
        # Connect to Gmail's SMTP server over SSL on port 465
        with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp:
            smtp.login(Sender_Email, Password)
            smtp.send_message(newMessage)
            print("Successfully Sent email !!!")
    except Exception as error:
        print("Error: unable to send email:", error)

Successfully Sent email !!!
Successfully Sent email !!!
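
Because `imghdr` is no longer available from Python 3.13 onwards, the message could also be built without it. The sketch below is one possible variant, not the code that produced the output above: it guesses the image type from the file extension with the standard `mimetypes` module (whereas `imghdr` inspects the file contents), and it separates building the message from sending it so that the message can be checked without an SMTP connection. The function name `build_message`, the example addresses, and the dummy image bytes are all hypothetical.

```python
import mimetypes
from email.message import EmailMessage

def build_message(sender, recipient, body, image_name=None, image_data=None):
    """Build one email for a single recipient, optionally with an image attached."""
    message = EmailMessage()
    message["Subject"] = "Check out some positions that fit your CV"
    message["From"] = sender
    message["To"] = recipient
    message.set_content("Check the jobs that could be a great fit for you\n\n" + body)
    if image_name is not None and image_data is not None:
        # Guess the MIME type from the file extension, e.g. "image/png"
        mime_type, _ = mimetypes.guess_type(image_name)
        maintype, subtype = (mime_type or "application/octet-stream").split("/", 1)
        message.add_attachment(image_data, maintype=maintype,
                               subtype=subtype, filename=image_name)
    return message

# Placeholder addresses, link, and bytes, used only to show the call
demo = build_message("sender@example.com", "friend@example.com",
                     "http://www.indeed.com/...", "logo.png", b"not a real image")
attachments = list(demo.iter_attachments())
print(demo["To"], len(attachments))
```

Each message returned this way can then be passed to `smtp.send_message(...)` inside the same `smtplib.SMTP_SSL` block used above.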
