## Text Mining CA
#### This text mining assignment focuses on job market mining and follows these general steps:
- Data extraction through a web crawler.
- Text preprocessing.
- Use basic text mining methods to draw conclusions.
- Use advanced methods (word2vec) to obtain further findings.
 
In this part, I mainly wrote the web crawler for the data extraction and also tried to refine the raw text data.

From https://www.mycareersfuture.sg/, about 220 job results (the website is d) for the query "machine learning" are collected in this sample, which contains 12 columns.

<u>BeautifulSoup</u> and <u>selenium</u> are used to crawl the data in the web pages, and the JSON data is fetched through API requests.

This notebook can serve as a guide to the basic steps.

**Attention!!! Download chromedriver for Method 1, and replace the file path in `google_chrome_driver_path` there.**

In [5]:
from selenium import webdriver
import time
from bs4 import BeautifulSoup
import math
import requests
import pandas as pd
import re

### Method 1: Open the root URL "https://www.mycareersfuture.sg" and type the query into the search bar to get the results

In [3]:
# Selenium needs a browser driver; see the two references below.
# https://selenium-python.readthedocs.io/installation.html#drivers
# https://sites.google.com/a/chromium.org/chromedriver/downloads
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

google_chrome_driver_path = './chromedriver'
root_url = 'https://www.mycareersfuture.sg'

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(google_chrome_driver_path))
driver.get(root_url)
# wait for the page to finish loading
time.sleep(2)
# find the search bar and type in "machine learning"
driver.find_element(By.NAME, 'search-text').send_keys('machine learning')
time.sleep(1)
# find and click the search button
driver.find_element(By.ID, 'search-button').click()
time.sleep(1)
# now get the HTML of the search results
html = driver.page_source

### Method 2: Put the query directly into the URL
See a URL such as https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page=0
- Set *search=* to the content you want, such as 'machine%20learning'. "%20" stands for a space under the URL (percent) encoding rule; refer to https://zh.wikipedia.org/wiki/%E7%99%BE%E5%88%86%E5%8F%B7%E7%BC%96%E7%A0%81
- Set *page=* to the page you want to jump to. Watch out for pages with **no result**.
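
Instead of writing "%20" by hand, the query can be percent-encoded with the standard library. The sketch below assumes the URL layout shown above; `build_search_url` and `SEARCH_URL` are hypothetical names used only for illustration.

```python
from urllib.parse import quote

# URL layout copied from the example above; the names are illustrative only.
SEARCH_URL = ('https://www.mycareersfuture.sg/search'
              '?search={}&sortBy=new_posting_date&page={}')

def build_search_url(query, page):
    """Percent-encode the query text and fill in the zero-based page index."""
    return SEARCH_URL.format(quote(query), page)

print(build_search_url('machine learning', 0))
print(build_search_url('C++ developer', 2))
```

`quote` turns the space into "%20" and also escapes characters such as "+", so other queries can be passed in unchanged.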

In [2]:
basic_url = 'https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page={}'
urls = []
# construct 20 pages
for page in range(0, 20):
    query_url = basic_url.format(page)
    urls.append(query_url)
# driver.get(urls[0])
# html = driver.page_source


### Start to extract the contents

#### 1. define the fetch function

In [429]:
def fetch_data(url, head, payload):
    response = requests.get(url, headers=head, params=payload)
    if response.status_code == 200:
        return response.json()
    else:
        return {'info': 'error', 'error_code': response.status_code}

#### 2. get the query result number.

In [5]:
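from selenium.webdriver.chrome.service import Service

# replace this path with the location of your own chromedriver
google_chrome_driver_path = '/Users/alexjzy/Desktop/Py-Projects/text_mining/chromedriver'
driver = webdriver.Chrome(service=Service(google_chrome_driver_path))
query_url = 'https://api.mycareersfuture.sg/jobs?search=machine%20learning&sortBy=new_posting_date'
response = fetch_data(query_url, {}, {})
# 'count' is the total number of hits; dividing by 20 results per page
# gives the number of pages, which is what result_num holds
result_num = math.ceil(response['count']/20)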

#### 3. Since the results are paginated, use the total number of results to build the URL of every page
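
The page arithmetic can be checked in isolation. This is a minimal sketch that assumes 20 results per page, which is the divisor used in this notebook; `page_count` and `page_indices` are hypothetical helper names.

```python
import math

PAGE_SIZE = 20  # assumption: taken from the division by 20 in this notebook

def page_count(total_results, page_size=PAGE_SIZE):
    """Number of pages needed to show total_results items."""
    return math.ceil(total_results / page_size)

def page_indices(total_results, page_size=PAGE_SIZE):
    """Zero-based page indices, matching the page=0, page=1, ... URLs."""
    return list(range(page_count(total_results, page_size)))

print(page_count(205))   # 11
print(page_indices(45))  # [0, 1, 2]
```

Rounding up matters because a partly filled last page still has to be crawled.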


In [6]:
basic_url = 'https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page={}'
urls = []
# construct one URL per result page
for page in range(0, result_num ):
    query_url = basic_url.format(page)
    urls.append(query_url)
print(result_num)
print("urls shape: ", len(urls))

11
urls shape:  11


In [7]:
query_url = 'https://api.mycareersfuture.sg/jobs?search=machine%20learning&sortBy=new_posting_date'
response = fetch_data(query_url, {}, {})
result_num = math.ceil(response['count']/20)

In [8]:
urls # all the result in pages.

['https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page=0',
 'https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page=1',
 'https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page=2',
 'https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page=3',
 'https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page=4',
 'https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page=5',
 'https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page=6',
 'https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page=7',
 'https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page=8',
 'https://www.mycareersfuture.sg/search?search=machine%20learning&sortBy=new_posting_date&page=9',
 'https://

#### 4. Use the uuid in every card to get the detail of the job description
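
The uuid is the last dash-separated piece of the card link. The sketch below uses a made-up `href` (only the trailing uuid is taken from the data in this notebook) together with the job API URL used in this notebook; `uuid_from_href` is a hypothetical name.

```python
# Hypothetical href for illustration; the real link text may differ.
href = '/job/data-engineer-propertyguru-82e462a13cadc477f93d57ad6812d1d1'

API_JOB_URL = 'https://api.mycareersfuture.sg/job/{}'

def uuid_from_href(link):
    """Take the text after the last '-' as the job uuid."""
    return link.split('-')[-1]

job_uuid = uuid_from_href(href)
print(job_uuid)
print(API_JOB_URL.format(job_uuid))
```

This only works while the uuid is the final part of the link; a link with a trailing query string would need extra cleaning first.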

In [9]:
def get_job_description(uuid):
    api_basic = 'https://api.mycareersfuture.sg/job/{}'
    api_jd_url = api_basic.format(uuid)
    job = fetch_data(api_jd_url, {}, {})
    # strip the HTML markup and keep only the plain text;
    # naming the parser avoids BeautifulSoup guessing one
    jd = BeautifulSoup(job['job_description'], 'html.parser').get_text(strip=True)
    jr = BeautifulSoup(str(job['other_requirements']), 'html.parser').get_text(strip=True)
    jsk = [item['skill'] for item in job['skills']]
    sal_max = job['max_monthly_salary']
    sal_min = job['min_monthly_salary']
    return jd, jr, jsk, sal_max, sal_min
    

#### 5. Wrap the card fields and the JSON details into one result.

In [10]:
def get_detail(card):
    def text_or_none(name):
        # text of the first <p name=...> tag in the card, or None if it is missing
        tags = card.find_all("p", {"name": name})
        return tags[0].get_text() if tags else None

    company = card.find("p", {"name": "company"}).get_text()
    job_title = card.find("h1", {"name": "job_title"}).get_text()

    # extract the optional fields
    location = text_or_none("location")
    employment_type = text_or_none("employment_type")
    seniority = text_or_none("seniority")
    category = text_or_none("category")

    # get the job detail and collect the jd and requirements which are the raw text
    job_uuid = card.find("a", href=True)['href'].split('-')[-1]
    job_description, job_requirement, job_skills, salary_max, salary_min = get_job_description(job_uuid)
    return {
        "company": company,
        "job_title": job_title,
        "location": location,
        "employment_type": employment_type,
        "seniority": seniority,
        "category": category,
        "job_description": job_description,
        "job_requirement": job_requirement,
        "job_skills": job_skills,
        "job_uuid": job_uuid,
        "salary_min": salary_min,
        "salary_max": salary_max
    }

#### 6. Iterate over the cards in the card list.

In [11]:
def get_card_info(page_url, res):
    driver.get(page_url)
    time.sleep(2)
    html = driver.page_source
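    soup = BeautifulSoup(html, 'html.parser')
    card_jobs = soup.find("div", {"class": "card-list"})
    if card_jobs is None:
        # no card list on this page (no result, or not loaded yet), so skip it
        return
    cards = card_jobs.find_all("div", {"class": "card relative"})
    for card in cards:
        res.append(get_detail(card))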

#### 7. get the result and convert to dataframe

In [None]:
result = []
for url in urls:
    get_card_info(url, result)
career_res = pd.DataFrame.from_dict(result)


### Start to refine and modify the dataframe

In [None]:
career_res['job_skills'] = career_res.job_skills.apply(lambda x: ', '.join(x))

In [None]:
career_res["job_description"] = \
career_res["job_description"].apply(lambda jd:BeautifulSoup(jd).get_text(strip=True))

career_res["job_requirement"] = \
career_res["job_requirement"].apply(lambda x:BeautifulSoup(str(x)).get_text(strip=True))

In [None]:
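# category can be None when the card has no category, so only split strings
career_res["category"] = career_res.category.apply(lambda x: ','.join(x.split('/ ')) if isinstance(x, str) else x)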

In [None]:
# employment_type can be None as well, so only split strings
career_res["employment_type"] = career_res.employment_type.apply(lambda x: x.split('...')[0] if isinstance(x, str) else x)

In [None]:
career_res.head()

#### Save the data to csv.

In [None]:
career_res.to_csv('mycareersfuture.csv')

# ——————————————————————————————————
### Start to extract the missing job requirements from the job description

In [3]:
data = pd.read_csv('mycareersfuture.csv')

**Fill in the missing job requirements from job descriptions**

In [157]:
def generate_req_from_des(description):
    # prefer the text after an explicit "Requirements" heading
    if 'Requirements' in description:
        return description.split('Requirements')[-1]
    # otherwise fall back to the text after "succeed" (any letter case)
    pattern = re.findall('succeed', description, flags=re.IGNORECASE)
    if pattern:
        return description.split(pattern[0])[-1]
    return None

In [163]:
data.loc[data.job_requirement == 'None', 'job_requirement'] = data[data.job_requirement == 'None']['job_description'].apply(lambda des: generate_req_from_des(des))
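
The fallback logic can be tried on a few strings without the dataframe. This is a self-contained sketch of the same idea; `requirement_from_description` is a hypothetical name and the sample descriptions are made up.

```python
import re

def requirement_from_description(description):
    """Return the text after 'Requirements', else after 'succeed', else None."""
    if 'Requirements' in description:
        return description.split('Requirements')[-1]
    hits = re.findall('succeed', description, flags=re.IGNORECASE)
    if hits:
        return description.split(hits[0])[-1]
    return None

# Made-up descriptions, for illustration only.
samples = [
    'Build data pipelines. Requirements: 3 years of Python.',
    'To Succeed in this role you need strong SQL skills.',
    'Join our team and grow with us.',
]
for text in samples:
    print(repr(requirement_from_description(text)))
```

A description without either keyword yields `None`, so those rows still have no requirement text after the fill.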


**Process long ambiguous words, e.g. turn "ResponsibilitiesWe" into "Responsibilities. We"**

In [378]:
def pruneLongAmbiguousWord(text):
    def _addFullStop(word):
        wg = word.group()
        indice = [i for i, c in enumerate(wg) if c.isupper()][-1]
        pruned = wg[:indice] + '. ' + wg[indice:]
        return pruned
    pattern = '(?P<word>[A-Z]?[a-z]+[A-Z][a-z]+)'
    return re.sub(pattern, _addFullStop, text)

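def trimChar(text):
    # turn runs of bullet characters and non-breaking spaces into a full stop
    # (a '|' inside [...] would be a literal pipe, not an "or")
    text = re.sub(r'[·\xa0]+', '.', text)
    # collapse a doubled full stop
    text = re.sub(r'\.{2}', '.', text)
    # make sure the text ends with a full stop
    if not text.endswith('.'):
        text = text + '.'
    return text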
    
data['job_requirement'] = data['job_requirement'].apply(lambda req: pruneLongAmbiguousWord(str(req)))
data['job_description'] = data['job_description'].apply(lambda des: pruneLongAmbiguousWord(str(des)))

data['job_requirement'] = data['job_requirement'].apply(lambda req: trimChar(str(req)))
data['job_description'] = data['job_description'].apply(lambda des: trimChar(str(des)))
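
To see what the word-splitting step does, here is a self-contained sketch of the same regular expression applied to two made-up strings. `split_fused_words` is a hypothetical name, not a function from this notebook.

```python
import re

# One optional capital, lowercase letters, then a capital inside the same token.
FUSED = re.compile(r'[A-Z]?[a-z]+[A-Z][a-z]+')

def split_fused_words(text):
    """Insert '. ' before the last capital letter of each fused token."""
    def _split(match):
        word = match.group()
        last_upper = max(i for i, c in enumerate(word) if c.isupper())
        return word[:last_upper] + '. ' + word[last_upper:]
    return FUSED.sub(_split, text)

print(split_fused_words('ResponsibilitiesWe build models'))
print(split_fused_words('Experience with PySpark'))
```

The second call shows a side effect: camel-case product names such as "PySpark" are split as well, which is consistent with the separate 'Py' and 'Spark' tokens in the tagged output of this notebook.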

In [380]:
data.to_csv('./processed_data.csv', index=False)

### Mining the specific terms from job requirements such as degree or work experience
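
Before running the extraction on the whole column, it helps to try the idea on single sentences. The sketch below uses simplified patterns that are assumptions for illustration, not the notebook's exact patterns; `extract_degree_and_years` and the sample sentences are hypothetical. It also shows why number words need a group `(...)` rather than a character class `[...]`.

```python
import re

# Simplified patterns, for illustration only.
DEGREE = re.compile(r'Bachelor|Master|PhD|Diploma')
YEARS = re.compile(
    r'\b(\d+|one|two|three|four|five|six|seven|eight|nine|ten)\+? years?\b')

def extract_degree_and_years(requirement):
    """Return (sorted degree names, first year count as text or '')."""
    degrees = sorted(set(DEGREE.findall(requirement)))
    years = YEARS.findall(requirement)
    return degrees, (years[0] if years else '')

s1 = "Bachelor or Master's degree in Computer Science. At least 3+ years of experience."
s2 = 'PhD preferred; eight years in research.'
print(extract_degree_and_years(s1))
print(extract_degree_and_years(s2))

# A character class is a set of single characters, so it accepts 'onto';
# a group of alternatives only accepts the listed words.
print(bool(re.fullmatch(r'[one|two]+', 'onto')))
print(bool(re.fullmatch(r'(?:one|two)+', 'onto')))
```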

In [76]:
# 'Masters?' covers both spellings; the short forms BS/MS/BA get word
# boundaries so that they do not match inside longer words
pattern_degree = re.compile(r'Bachelor|Masters?|PhD|Doctor|Ph\.D|Diploma|\b(?:BS|MS|BA)\b|master|bachelor|phd|PHD')
# number words need a group (...|...); inside [...] they are only a set of characters
number_words = r'\b(?:one|two|three|four|five|six|seven|eight|nine|ten)'
pattern_years = re.compile(r'\d+\+? years? [^.]*[.;]|' + number_words + r' years? [^.]*[.;]')

pattern_years_precise = re.compile(r'\d+\+? years?\b|' + number_words + r' years?\b')
degree_info = data.job_requirement.apply(lambda req: set(re.findall(pattern_degree, req)))
exp_info = data.job_requirement.apply(lambda req: re.findall(pattern_years, req))
# only the first matched sentence is used to read off the number of years
precise = exp_info.apply(lambda exp: re.findall(pattern_years_precise, exp[0]) if exp else [])
data['job_experience'] = precise.apply(lambda y: y[0].split(' ')[0] if y else '')
data['job_experience'] = data['job_experience'].apply(lambda year: year.replace('+', ''))

def trimDegree(degree):
    # map each matched spelling to one canonical degree name
    canonical = {
        'Bachelor': ['Bachelor', 'BS', 'BA', 'bachelor'],
        'Master': ['Master', 'Masters', 'MS', 'master', 'Msc'],
        'PhD': ['PhD', 'Ph.D', 'Doctor', 'phd', 'PHD'],
        'Diploma': ['Diploma'],
    }
    found = {name for name, spellings in canonical.items()
             for item in degree if item in spellings}
    # sort so that the joined string is the same on every run
    return ', '.join(sorted(found))
            
data['job_degree'] = degree_info.apply(lambda x: trimDegree(x))





Unnamed: 0,category,company,employment_type,job_description,job_requirement,job_skills,job_title,job_uuid,location,salary_max,...,job_requirement_nn,job_description_nn,job_requirement_nnp,job_description_nnp,job_requirement_vb,job_description_vb,latitude,longtitude,job_experience,job_degree
0,Engineering ...,PROPERTYGURU PTE. LTD.,Permanent,Our websites attract more than 100 million mon...,Bachelor’s degree in IT or relevant field. Alt...,"Data Analysis, SQL, Microsoft Excel, Microsoft...",Data Engineer,82e462a13cadc477f93d57ad6812d1d1,Central,7000.0,...,"['Bachelor', '’', 'degree', 'IT', 'field', 'qu...","['websites', 'page-views', 'click-stream', 'be...","['Bachelor', '’', 'IT', 'SQL', 'Azkaban', 'Air...","['Property', 'Guru', 'Southeast', 'Asia.Our', ...","['s', 'be', 'working', 'Working', 'write', 'is...","['result', 'has', 'is', 'empowered', 'build', ...",1.300213,103.837286,2,Bachelor
1,Engineering ...,PROPERTYGURU PTE. LTD.,Permanent,Our websites attract more than 100 million mon...,Bachelor’s degree in IT or relevant field. Alt...,"Data Analysis, SQL, Microsoft Excel, Microsoft...",Data Engineer,763708ca5a581db4389a766ef71654a0,Central,9000.0,...,"['Bachelor', '’', 'degree', 'IT', 'field', 'qu...","['websites', 'page-views', 'click-stream', 'be...","['Bachelor', '’', 'IT', 'SQL', 'Azkaban', 'Air...","['Property', 'Guru', 'Southeast', 'Asia.Our', ...","['s', 'be', 'working', 'Working', 'write', 'is...","['result', 'has', 'is', 'empowered', 'build', ...",1.300213,103.837286,2,Bachelor
2,"Accounting ,Auditing ,Taxation",ERNST & YOUNG ADVISORY PTE. LTD.,Permanent,Join Fraud Investigation & Dispute Services (F...,To qualify for the role you must have. Strong ...,"Accounting, Microsoft Excel, Microsoft Word, G...",Fraud Investigation & Dispute Services (FIDS) ...,c59d6037f441a5b3b43ca4daff84806a,Central,16000.0,...,"['role', 'record', 'degree', 'field', 'years',...","['Join', 'Fraud', 'Investigation', 'Dispute', ...","['Life', 'Science', 'Ability', 'English', 'EY'...","['Join', 'Fraud', 'Investigation', 'Dispute', ...","['qualify', 'have', 'is', 'encouraged', 'apply...","['help', 'work', 'be', 'aligned', 'works', 'ex...",1.280895,103.851677,eight,
3,"Sciences ,Laboratory ,R&D",A*STAR RESEARCH ENTITIES,Contract,"The Agency for Science, Technology and Researc...",Bachelor or Master's Degree in physical scienc...,"Matlab, Algorithms, C++, Machine Learning, C, ...",Research Engineer / Senior Research Engineer (...,cec1905d6444838c2d7108c1a04049a3,West,5000.0,...,"['Bachelor', 'Master', 'Degree', 'sciences', '...","['Agency', 'Science', 'Technology', 'Research'...","['Bachelor', 'Master', 'Degree', 'A*STAR']","['Agency', 'Science', 'Technology', 'Research'...","['building', 'work', 'been', 'seek', 'develop'...","['is', 'fosters', 'drive', 'transform', '.For'...",1.285407,103.850568,5,"Bachelor, Master"
4,"Sciences ,Laboratory ,R&D",A*STAR RESEARCH ENTITIES,Contract,"Specialize in applying data analytics, machine...",PhD in Power System operation and analysis or ...,"R&D, Molecular Biology, Biotechnology, Lifesci...","Scientist (Power System Analytics), EPGC",d2da04bb4033006cb580fc80e752fa9e,West,11800.0,...,"['PhD', 'Power', 'System', 'operation', 'analy...","['Specialize', 'data', 'analytics', 'machine',...","['PhD', 'Power', 'System', 'Computer', 'Knowle...","['Specialize', 'PLC', 'Microgrid', 'EPGC']","['related', 'programming', 'work', 'Willing', ...","['applying', 'analysis.Specialize', 'modeling'...",1.285407,103.850568,,"Master, PhD"
5,Engineering ...,Company Undisclosed,Permanent,About The Role. As IAG focuses on creating an ...,These are the skills and experience we are loo...,"Analytics, Data Analysis, Analysis, Statistica...",Data Scientist,972f2d0cd68c45fbf7d50920fc229036,Central,12000.0,...,"['skills', 'experience', 'Experience', 'manipu...","['Role', 'IAG', 'organisation', 'Data', 'Scien...","['SQL', 'Python', 'Gigabytes', 'Py', 'Spark', ...","['Role', 'IAG', 'Data', 'Scientist', 'IAG', 'D...","['are', 'are', 'looking', 'extracting', 'disco...","['focuses', 'creating', 'is', 'is', 'deliverin...",1.279468,103.853750,,"Master, PhD"
6,"Public ,Civil Service",Smart Nation and Digital Government Office,Full Time,The Smart Nation and Digital Government Office...,We are in search of motivated individuals who ...,"Human Resources, Employee Relations, Recruitin...",Manager / Senior Manager (Talent and Manpower),cb3cb0fd4276956c30baab4084802b3a,,0.0,...,"['search', 'individuals', 'share', 'traits', '...","['Smart', 'Nation', 'Digital', 'Government', '...","['Value', 'Singapore', 'Strong', 'Excellent', ...","['Smart', 'Nation', 'Digital', 'Government', '...","['are', 'following', 'take', 'going', 'provide...","['prioritizes', 'raises', 'builds', 'promotes'...",,,2,
7,Information Technology,SCHELLDEN GLOBAL PTE. LTD.,Full Time,You will be responsible for end to end develop...,What you have done: Commercial software engin...,"Engineering, Project Management, Testing, Soft...",senior big data engineer,6b9679d4563c270b58f3a9a89638564a,Islandwide,7000.0,...,"['Commercial', 'software', 'engineering', '.Yo...","['end', 'development', 'Data', 'Analytics', 'u...","['Commercial', 'Java', 'Python', 'Py', 'Spark'...","['Data', 'Analytics', 'Data', 'Lake', 'Dev', '...","['have', 'done', 'have', 'debugging', 'have', ...","['be', 'end', 'be', 'be', 'work', '’', '\uf0b7...",1.275857,103.845955,3,Master
8,Banking and Finance,Company Undisclosed,Full Time,". Lead the definition of activities, scope, an...",Knowledge in at least 1 of the following data ...,"Management, Leadership, Strategy, Strategic Pl...","Vice President, Portfolio & Regulatory Managem...",e11a65ce7a59ca9f6a19bcf76c62ac68,Central,10000.0,...,"['Knowledge', 'data', 'science', 'domains', 'T...","['definition', 'activities', 'scope', 'timelin...","['Knowledge', 'Text', 'Mining', '/', 'NLPGraph...","['Create', 'Guide', 'Manage', 'Hive', 'Data', ...","['Passionate', 'asking', 'answering', 'be', 'c...","['Lead', 'support', 'following', 'discover', '...",1.285407,103.850568,,Bachelor
9,Banking and Finance,Company Undisclosed,Full Time,". Lead the definition of activities, scope, an...",Knowledge in at least 1 of the following data ...,"Management, Leadership, Strategy, Strategic Pl...","Vice President, Portfolio & Regulatory Managem...",dda2dea97c28a907fe2b7ee0346d4e77,Central,10000.0,...,"['Knowledge', 'data', 'science', 'domains', 'T...","['definition', 'activities', 'scope', 'timelin...","['Knowledge', 'Text', 'Mining', '/', 'NLPGraph...","['Create', 'Guide', 'Manage', 'Hive', 'Data', ...","['Passionate', 'asking', 'answering', 'be', 'c...","['Lead', 'support', 'following', 'discover', '...",1.285407,103.850568,,Bachelor


In [78]:
import os
import nltk
from nltk import word_tokenize
from nltk import pos_tag, pos_tag_sents
from nltk.tag.stanford import StanfordPOSTagger, StanfordNERTagger

### Use Stanford NLP to get the POS tags and extract the nouns (NN) and verbs (VB)

In [79]:
java_path = '/usr/bin/java'
os.environ['JAVAHOME'] = java_path

pos_model_path = './stanford-postagger-full-2018-02-27/models/english-bidirectional-distsim.tagger'
pos_jar_path = './stanford-postagger-full-2018-02-27/stanford-postagger.jar'

st_pos=StanfordPOSTagger(pos_model_path, pos_jar_path)

def pos_tag_filter(sent, tag_part):
    # tag the sentence and keep the words whose POS tag contains tag_part
    words_tag = st_pos.tag(word_tokenize(sent))
    return [w for (w, t) in words_tag if tag_part in t]

def pos_tag_nn(sent):
    # 'NN' also matches NNS, NNP and NNPS
    return pos_tag_filter(sent, 'NN')

def pos_tag_nnp(sent):
    return pos_tag_filter(sent, 'NNP')

def pos_tag_vb(sent):
    return pos_tag_filter(sent, 'VB')
    
data['job_requirement_nn'] = data['job_requirement'].apply(lambda req: pos_tag_nn(req))
data['job_description_nn'] = data['job_description'].apply(lambda des: pos_tag_nn(des))

data['job_requirement_nnp'] = data['job_requirement'].apply(lambda req: pos_tag_nnp(req))
data['job_description_nnp'] = data['job_description'].apply(lambda des: pos_tag_nnp(des))

data['job_requirement_vb'] = data['job_requirement'].apply(lambda req: pos_tag_vb(req))
data['job_description_vb'] = data['job_description'].apply(lambda des: pos_tag_vb(des))
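
The tag filter itself does not need the Stanford tagger to be understood. The sketch below applies the same substring test to hand-written (word, tag) pairs, which are hypothetical stand-ins for tagger output; `words_with_tag` is an illustrative name.

```python
# Hand-written (word, tag) pairs standing in for tagger output.
tagged = [
    ('Bachelor', 'NNP'), ('degree', 'NN'), ('in', 'IN'), ('IT', 'NNP'),
    ('is', 'VBZ'), ('required', 'VBN'), ('skills', 'NNS'),
]

def words_with_tag(tagged_words, tag_part):
    """Keep the words whose tag contains tag_part."""
    return [word for word, tag in tagged_words if tag_part in tag]

print(words_with_tag(tagged, 'NN'))   # all noun tags: NN, NNS, NNP
print(words_with_tag(tagged, 'NNP'))  # proper nouns only
print(words_with_tag(tagged, 'VB'))   # all verb forms
```

Because the test is a substring match, the 'NN' columns are a superset of the 'NNP' columns.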

In [94]:
data['job_requirement_nn'] = data['job_requirement_nn'].apply(lambda x: ' '.join(x))
data['job_description_nn'] = data['job_description_nn'].apply(lambda x: ' '.join(x))

data['job_requirement_nnp'] = data['job_requirement_nnp'].apply(lambda x: ' '.join(x))
data['job_description_nnp'] = data['job_description_nnp'].apply(lambda x: ' '.join(x))

data['job_requirement_vb'] = data['job_requirement_vb'].apply(lambda x: ' '.join(x))
data['job_description_vb'] = data['job_description_vb'].apply(lambda x: ' '.join(x))

In [440]:
data = pd.read_csv('./processed_data.csv')
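def get_more_detail(df):
    api_basic = 'https://api.mycareersfuture.sg/job/{}'
    # fetch every job once and reuse the response for both coordinates
    details = df['job_uuid'].apply(lambda x: fetch_data(api_basic.format(x), {}, {}))
    # .get returns None instead of raising when a response has no coordinates
    df['latitude'] = details.apply(lambda d: d.get('lat'))
    df['longtitude'] = details.apply(lambda d: d.get('lng'))
    return df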
    
    
data = get_more_detail(data)


In [95]:
data.to_csv('./processed_data.csv', index=False)

In [96]:
data

Unnamed: 0,category,company,employment_type,job_description,job_requirement,job_skills,job_title,job_uuid,location,salary_max,...,job_requirement_nn,job_description_nn,job_requirement_nnp,job_description_nnp,job_requirement_vb,job_description_vb,latitude,longtitude,job_experience,job_degree
0,Engineering ...,PROPERTYGURU PTE. LTD.,Permanent,Our websites attract more than 100 million mon...,Bachelor’s degree in IT or relevant field. Alt...,"Data Analysis, SQL, Microsoft Excel, Microsoft...",Data Engineer,82e462a13cadc477f93d57ad6812d1d1,Central,7000.0,...,Bachelor s degree IT field qualifications expe...,websites page-views click-stream behaviour dat...,IT Azkaban Airflow Python C++ Java Go Scala Ka...,Property Guru Southeast Asia.Our Data Science ...,be working Working authoring write is is be is...,attract result has is empowered build using be...,1.300213,103.837286,2,Bachelor
1,Engineering ...,PROPERTYGURU PTE. LTD.,Permanent,Our websites attract more than 100 million mon...,Bachelor’s degree in IT or relevant field. Alt...,"Data Analysis, SQL, Microsoft Excel, Microsoft...",Data Engineer,763708ca5a581db4389a766ef71654a0,Central,9000.0,...,Bachelor s degree IT field qualifications expe...,websites page-views click-stream behaviour dat...,IT Azkaban Airflow Python C++ Java Go Scala Ka...,Property Guru Southeast Asia.Our Data Science ...,be working Working authoring write is is be is...,attract result has is empowered build using be...,1.300213,103.837286,2,Bachelor
2,"Accounting ,Auditing ,Taxation",ERNST & YOUNG ADVISORY PTE. LTD.,Permanent,Join Fraud Investigation & Dispute Services (F...,To qualify for the role you must have. Strong ...,"Accounting, Microsoft Excel, Microsoft Word, G...",Fraud Investigation & Dispute Services (FIDS) ...,c59d6037f441a5b3b43ca4daff84806a,Central,16000.0,...,role record degree field years experience audi...,Fraud Investigation Dispute Services Assurance...,Life Science English Plus Opportunities,Fraud Investigation Dispute Services Assurance...,qualify have is encouraged apply run managing ...,Join help work be aligned works experience tai...,1.280895,103.851677,eight,
3,"Sciences ,Laboratory ,R&D",A*STAR RESEARCH ENTITIES,Contract,"The Agency for Science, Technology and Researc...",Bachelor or Master's Degree in physical scienc...,"Matlab, Algorithms, C++, Machine Learning, C, ...",Research Engineer / Senior Research Engineer (...,cec1905d6444838c2d7108c1a04049a3,West,5000.0,...,Bachelor Master Degree sciences track record s...,Agency Science Technology Research ( A*STAR ) ...,Technical,Agency Science Technology Research ( A*STAR ) ...,building been specialise interpret seek develo...,is fosters drive transform .For please advance...,1.285407,103.850568,5,"Bachelor, Master"
4,"Sciences ,Laboratory ,R&D",A*STAR RESEARCH ENTITIES,Contract,"Specialize in applying data analytics, machine...",PhD in Power System operation and analysis or ...,"R&D, Molecular Biology, Biotechnology, Lifesci...","Scientist (Power System Analytics), EPGC",d2da04bb4033006cb580fc80e752fa9e,West,11800.0,...,PhD Power System operation analysis Computer S...,data analytics machine learning data mining te...,Power System C/C++ Python Jurong Island,,related work work are ) are include based be a...,Specialize applying be prepare,1.285407,103.850568,,"Master, PhD"
5,Engineering ...,Company Undisclosed,Permanent,About The Role. As IAG focuses on creating an ...,These are the skills and experience we are loo...,"Analytics, Data Analysis, Analysis, Statistica...",Data Scientist,972f2d0cd68c45fbf7d50920fc229036,Central,12000.0,...,skills experience Experience value datasets ma...,Role IAG organisation Data Scientist role data...,SQL Python Py Spark Scala AWS GCP Advanced Ten...,IAG Data Scientist IAG Data Sciences IAG ( Bus...,are are looking manipulating processing extrac...,focuses creating is is delivering be built con...,1.279468,103.853750,,"Master, PhD"
6,"Public ,Civil Service",Smart Nation and Digital Government Office,Full Time,The Smart Nation and Digital Government Office...,We are in search of motivated individuals who ...,"Human Resources, Employee Relations, Recruitin...",Manager / Senior Manager (Talent and Manpower),cb3cb0fd4276956c30baab4084802b3a,,0.0,...,search individuals traits experience Value int...,Smart Nation Digital Government Office ( SNDGO...,Value Singapore Smart Nation Analytical date.S...,Smart Nation Digital Government Office ( SNDGO...,are share take going provide shaping conceptua...,plans prioritizes raises builds promotes takes...,,,2,
7,Information Technology,SCHELLDEN GLOBAL PTE. LTD.,Full Time,You will be responsible for end to end develop...,What you have done: Commercial software engin...,"Engineering, Project Management, Testing, Soft...",senior big data engineer,6b9679d4563c270b58f3a9a89638564a,Islandwide,7000.0,..., Commercial software engineering .You years s...,end development Data Analytics cases company D..., Commercial Java Python ( Py skills. Big Had...,Data Lake Dev Do Data Machine Learningplatform...,have done have debugging have working have 've...,be end use be be work examine analyze Pushing ...,1.275857,103.845955,3,Master
8,Banking and Finance,Company Undisclosed,Full Time,". Lead the definition of activities, scope, an...",Knowledge in at least 1 of the following data ...,"Management, Leadership, Strategy, Strategic Pl...","Vice President, Portfolio & Regulatory Managem...",e11a65ce7a59ca9f6a19bcf76c62ac68,Central,10000.0,...,Knowledge data science domains Mining NLPGraph...,definition activities scope timelines data sci...,Mining NLPGraph Network Analysis Deep Learning...,Guide Work Hive Data Work Data Scala Java API,following Text applied asking answering distri...,support following discover interpret document ...,1.285407,103.850568,,Bachelor
9,Banking and Finance,Company Undisclosed,Full Time,". Lead the definition of activities, scope, an...",Knowledge in at least 1 of the following data ...,"Management, Leadership, Strategy, Strategic Pl...","Vice President, Portfolio & Regulatory Managem...",dda2dea97c28a907fe2b7ee0346d4e77,Central,10000.0,...,Knowledge data science domains Mining NLPGraph...,definition activities scope timelines data sci...,Mining NLPGraph Network Analysis Deep Learning...,Guide Work Hive Data Work Data Scala Java API,following Text applied asking answering distri...,support following discover interpret document ...,1.285407,103.850568,,Bachelor
