# Data Understanding and Data Cleansing
In this section, I will explore the dataset and clean it so that it can be analyzed effectively. The dataset contains the first 999 job listings on LinkedIn for the keyword "Data Science" with the location set to "Indonesia".


In [3]:
from datetime import datetime
import time
import pandas as pd
import numpy as np

In [5]:
# Change the file path to where you put the excel file to load
output_file_path = 'C:\\Users\\JUWITA\\Downloads\\DataScience2.xlsx'
df = pd.read_excel(output_file_path)

In [7]:
df.sample(10)

Unnamed: 0,ID,Date,Company,Title,Location,Description,Level,Type,Function,Industry
111,,2022-11-21,Qoala,Software Engineer (Back End) - Platform Engine...,"Jakarta, Jakarta, Indonesia",We are looking for a Backend Software Engineer...,Not Applicable,Full-time,Engineering,"Technology, Information and Internet"
604,,2022-11-24,Boga Group,IT Frontend Developer,"Tangerang, Banten, Indonesia",Work as part of a team on an existing React Na...,Mid-Senior level,Full-time,Engineering and Information Technology,Food and Beverage Services
267,,2022-11-24,Mitramas Infosys Global,Software Engineer,"Jakarta, Jakarta, Indonesia","Here in Mitramas Infosys Global, you will be e...",Entry level,Full-time,Engineering and Information Technology,IT Services and IT Consulting
211,,2022-12-01,Ukirama,Junior Software Engineer,"Jakarta, Jakarta, Indonesia",Job Descriptions\n\nDesigning and writing high...,Entry level,Full-time,Engineering and Information Technology,IT Services and IT Consulting
922,,2022-11-28,Fintopia Indonesia,Android Developer,"Jakarta, Jakarta, Indonesia",Job Description We are looking for an Android ...,Associate,Full-time,Information Technology and Engineering,Financial Services
391,,2022-11-03,PT Niaga Expert Teknologi,Backend Developer (Python) - Remote,"Bandung, West Java, Indonesia",Minimum Qualifications and Experience Experien...,Associate,Full-time,Information Technology and Engineering,IT Services and IT Consulting
228,,2022-12-07,Phintraco Group,Software Engineer,"Jakarta, Indonesia",Qualification:\n\n1. Bachelor degree in Comput...,Associate,Full-time,Engineering and Information Technology,IT Services and IT Consulting
491,,2022-12-21,PT. Treetan Nusantara Network,Frontend Developer Intern,"Jakarta, Jakarta, Indonesia",Responsibilites\n\nCollaborate with backend in...,Entry level,Internship,Information Technology and Engineering,IT Services and IT Consulting
433,,2022-11-29,KeDA Tech,Associate Front End Programmer,"West Jakarta, Jakarta, Indonesia",We are looking for a highly motivated individu...,Entry level,Full-time,Engineering and Information Technology,Software Development
721,,2022-11-28,PT Bank Digital BCA (BCA Digital),Business Intelligence Analyst,Jakarta Metropolitan Area,Role Description:\n\nAs a Business Intelligenc...,Mid-Senior level,Full-time,"Information Technology, Research, and Analyst",Banking and Financial Services


## Splitting the Location Field

In this section, I will clean the **Location** field and split it into separate **City** and **Province** columns, based on the formats described below. Every listing is in Indonesia, so the **Country** part is implied.

### Location Field Formats

The **Location** field can be formatted in three ways:

1. **Single Field** (metropolitan area):
   - For locations like `Jakarta Metropolitan Area`, `Greater Jakarta`, or `Greater Semarang`, the location is a single area name with no province or country. The city is taken from the area name and the country is implied.
   
   Example: 
   - `Jakarta metropolitan area` --> [City: Jakarta, Country: Indonesia]
   - `Greater Jakarta` --> [City: Jakarta, Country: Indonesia]

2. **Two Fields** (Province, Country):
   - For locations like `Bali, Indonesia`, the location contains the province and country.
   
   Example:
   - `Bali, Indonesia` --> [Province: Bali, Country: Indonesia]

3. **Three Fields** (City, Province, Country):
   - For locations like `Badung, Bali, Indonesia`, the location contains the city, province, and country.
   
   Example:
   - `Badung, Bali, Indonesia` --> [City: Badung, Province: Bali, Country: Indonesia]

### Process to Split Location

The approach to split the **Location** field will be as follows:

- If the location contains **one field** (an area name), we will treat it as the city and assume the country is `Indonesia`. A location that is just `Indonesia` has neither a city nor a province.
- If the location contains **two fields** (province and country), we will split the fields into `[Province], [Country]`.
- If the location contains **three fields** (city, province, and country), we will split the fields into `[City], [Province], [Country]`.

### Plan

1. Identify how many parts the location contains by splitting it on commas.
2. Split the location accordingly:
   - If there is **one part**, assign it to the **City** and assume the country is `Indonesia`.
   - If there are **two parts**, assign them to **Province** and **Country**.
   - If there are **three parts**, assign them to **City**, **Province**, and **Country**.
   
3. Create new columns for **City** and **Province**. The country is `Indonesia` for every listing, so no separate **Country** column is added.

### Next Steps

We will now perform this operation on the `Location` field and update the dataset accordingly.



In [10]:
City = []
Province = []
for row in df['Location']:
    # Split on commas and strip the space that follows each comma
    x = [part.strip() for part in row.split(",")]
    n_element = len(x)  # number of fields in the location
    if n_element == 3:
        City.append(x[0])  # first field is the city
        Province.append(x[1])  # second field is the province
    elif n_element == 2:
        City.append(np.nan)  # no city given
        Province.append(x[0])  # first field is the province
    else:
        if x[0] == 'Indonesia':
            # Country only: neither city nor province is known
            City.append(np.nan)
            Province.append(np.nan)
        else:
            # Single area name, e.g. 'Jakarta Metropolitan Area'
            City.append(x[0])
            Province.append(np.nan)
print('done')

done
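
The loop above keeps single-field area names such as `Jakarta Metropolitan Area` unchanged in **City**, whereas the examples earlier map them to `Jakarta`. The following is a minimal sketch of one way to add that normalization. The helper name `split_location`, the regular expression, and the sample rows are assumptions made for illustration and are not part of the original notebook.

```python
import re

import numpy as np
import pandas as pd


def split_location(location):
    """Return (city, province, country) for one Location string."""
    parts = [part.strip() for part in str(location).split(",")]
    if len(parts) == 3:
        return parts[0], parts[1], parts[2]
    if len(parts) == 2:
        return np.nan, parts[0], parts[1]
    # Single field: either the country alone or an area name
    if parts[0] == "Indonesia":
        return np.nan, np.nan, "Indonesia"
    # Assumption: drop a leading "Greater" and a trailing "Metropolitan Area"
    city = re.sub(r"^greater\s+|\s+metropolitan\s+area$", "",
                  parts[0], flags=re.IGNORECASE)
    return city, np.nan, "Indonesia"


# Hypothetical sample covering each location format
sample = pd.DataFrame({"Location": [
    "Badung, Bali, Indonesia",
    "Bali, Indonesia",
    "Jakarta Metropolitan Area",
    "Greater Jakarta",
    "Indonesia",
]})
parts = pd.DataFrame(
    sample["Location"].map(split_location).tolist(),
    columns=["City", "Province", "Country"],
    index=sample.index,
)
print(parts)
```

Returning a tuple per row keeps the three formats in one place, and building `parts` with the same index makes it straightforward to join the result back onto the original frame.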


In [12]:
# Insert the new columns right after Location. Column positions are 0-based and Location is at position 4.
df.insert(5, 'City', City)
df.insert(6, 'Province', Province)
print('Done')

Done


In [14]:
df.tail(10)

Unnamed: 0,ID,Date,Company,Title,Location,City,Province,Description,Level,Type,Function,Industry
989,,2022-12-23,Champion Campus,Developer,"Yogyakarta, Yogyakarta, Indonesia",Yogyakarta,Yogyakarta,Champion lahir dan berkembang dalam industry p...,Full-time,,,
990,,2022-11-29,Accusentry,Software Engineer,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,AccuSentry is an international technology comp...,Associate,Full-time,Information Technology and Engineering,Computer Hardware Manufacturing
991,,2022-12-05,AppCake,HTML5 Developer,Indonesia,,,AppCake is an innovative product company that ...,Entry level,Full-time,Engineering and Information Technology,Software Development
992,,2022-12-19,NTT Ltd.,Software Applications Development Engineer,"Gambir, Jakarta, Indonesia",Gambir,Jakarta,"Want to be a part of our team?\n\nnalyzes, des...",Entry level,Full-time,Engineering and Information Technology,IT Services and IT Consulting
993,,2022-11-27,SIGMATECH,Golang Developer,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Requirements\n\nExceptional CAN-DO ATTITUDE an...,Entry level,Full-time,Engineering and Information Technology,IT Services and IT Consulting
994,,2022-11-15,Humata Indonesia,Full-Stack Developer,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Job Requirements\n\nMust possess a Bachelor De...,Associate,Full-time,Information Technology and Engineering,Human Resources Services
995,,2022-11-29,PT Oktagon Global Utama,Golang Developer,"Tangerang, Banten, Indonesia",Tangerang,Banten,Company Description: Perusahaan kami hadir men...,Entry level,Full-time,Engineering and Information Technology,IT Services and IT Consulting
996,,2022-12-09,Xapiens Teknologi Indonesia,Odoo Developer (Technical),"Tangerang Selatan, Banten, Indonesia",Tangerang Selatan,Banten,Xapiens (part of Indika Energy Group) is a sta...,Mid-Senior level,Contract,Engineering and Information Technology,IT Services and IT Consulting
997,,2022-11-25,NashTa Group,DevOps Engineer,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Qualification\n\n\n\n\nBachelor Degree in IT M...,Full-time,,,
998,,2022-11-28,PT Kita Bisa Teknologi,Internship Web Developer,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Candidate: High School/Vocational High School ...,Not Applicable,Internship,Information Technology and Engineering,IT Services and IT Consulting


## Fixing Content of Job Level and Job Type Fields

### Issue Description

During the data scraping process, the content of the **Type** field (Employment Type) sometimes gets shifted into the **Level** field (Seniority Level). This happens when a job post does not list a Seniority Level but does list an Employment Type. The mismatch occurs because the data is extracted with XPath expressions that rely on the order of the HTML elements.

When the first element (Level) is missing, the content of the second element (Type) is automatically shifted into the first field.

### Observations

- When **Level** contains values such as `Full-time`, `Part-time`, `Internship`, `Temporary`, or `Contract`, these values are actually meant for the **Type** field.
- We can safely assume that for such cases, the job does not have a Seniority Level listed, and the value can be shifted back into the **Type** field.

### Field Definitions

- **Level**: The seniority level expected for a job, such as:
  - `Entry level`
  - `Mid-Senior level`
  - `Associate`
  - `Director`
  - `Executive`
  
- **Type**: The employment type of the job, such as:
  - `Full-time`
  - `Part-time`
  - `Internship`
  - `Temporary`
  - `Contract`
  
- **Function**: The department or job function this role falls into (e.g., `Engineering`, `Marketing`).
- **Industry**: The field of the company in general (e.g., `Technology`, `Healthcare`).

### Solution

1. **Identify Misplaced Data**:
   - Look for rows where the **Level** field contains values that belong to the **Type** field (`Full-time`, `Part-time`, etc.).

2. **Correct the Fields**:
   - Shift the content from **Level** to **Type** for these rows.
   - Set the **Level** field to `None` or `NaN` to indicate that the seniority level is not provided.

3. **Validate the Data**:
   - Ensure that the content in **Level** aligns with seniority levels.
   - Verify that the content in **Type** corresponds to employment types.

4. **Update the Dataset**:
   - Save the corrected dataset for further analysis.



In [39]:
fix_list = ['Full-time', 'Part-time', 'Internship', 'Temporary', 'Contract']  # employment types that may appear in Level
new_level = []
new_type = []
length = len(df['Level'])
df_level = df['Level']
df_type = df['Type']
for row in range(length):
    # iloc selects a row by its integer position
    if str(df_level.iloc[row]) in fix_list:
        new_level.append(np.nan)  # no seniority level was listed
        new_type.append(df_level.iloc[row])  # move the misplaced value to Type
    else:
        new_level.append(df_level.iloc[row])
        new_type.append(df_type.iloc[row])
# Overwrite the Level and Type columns with the corrected values
df['Level'] = new_level
df['Type'] = new_type
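
The same correction can also be written without an explicit loop, and the validation step from the solution outline can be expressed as two assertions. The sketch below runs on a small hypothetical sample rather than the real dataset; the frame `sample` and its values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

type_values = ["Full-time", "Part-time", "Internship", "Temporary", "Contract"]

# Hypothetical sample: in the second row the employment type landed in Level
sample = pd.DataFrame({
    "Level": ["Entry level", "Full-time", "Associate"],
    "Type": ["Full-time", np.nan, "Contract"],
})

# Rows where Level actually holds an employment type
misplaced = sample["Level"].isin(type_values)

# Move the value to Type, then blank out Level for those rows
sample.loc[misplaced, "Type"] = sample.loc[misplaced, "Level"]
sample.loc[misplaced, "Level"] = np.nan

# Validation: Level holds no employment types, and every
# non-missing Type is one of the known employment types
assert not sample["Level"].isin(type_values).any()
assert sample["Type"].dropna().isin(type_values).all()
print(sample)
```

A boolean mask with `isin` touches only the affected rows, so rows that were already correct keep their original values.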

In [42]:
df.head(10)

Unnamed: 0,ID,Date,Company,Title,Location,City,Province,Description,Level,Type,Function,Industry
0,,2022-12-21,Telkom Indonesia,Data Science,"Jakarta, Indonesia",,Jakarta,Telkom Indonesia is looking for professional t...,Mid-Senior level,Contract,"Analyst, Science, and Engineering",IT Services and IT Consulting and Software Dev...
1,,2022-12-06,PT Smartfren Telecom Tbk,Data Scientist,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Work with stakeholders throughout the organiza...,Entry level,Full-time,Engineering and Information Technology,Telecommunications
2,,2022-11-30,Indodana,Data Scientist,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Job Description\n\n\n\n\nAnalyze data to bring...,Associate,Full-time,Engineering and Information Technology,Financial Services
3,,2022-12-13,PT. XL Axiata Tbk,Data Scientist,"Jakarta, Indonesia",,Jakarta,Job Purpose: Discover information hidden in va...,Associate,Full-time,Analyst and Information Technology,Telecommunications
4,,2022-12-19,Amartha,Data Scientist,"South Jakarta City, Jakarta, Indonesia",South Jakarta City,Jakarta,About The Role\n\nAmartha is one of the bigges...,Not Applicable,Full-time,Engineering and Information Technology,"Technology, Information and Internet"
5,,2022-11-21,Moladin,Machine Learning Engineer,"South Jakarta City, Jakarta, Indonesia",South Jakarta City,Jakarta,Design and implement cloud solutions and setup...,Not Applicable,Full-time,Engineering,"Technology, Information and Internet"
6,,2022-12-14,Kargo Technologies,Data Scientist,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,About The Challenges\n\nWe believe data scienc...,Not Applicable,Full-time,Engineering and Information Technology,"Technology, Information and Internet"
7,,2022-12-09,Allianz Indonesia,Data Scientist,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,19915 | IT & Tech Engineering | Professional |...,Not Applicable,Full-time,Engineering and Information Technology,Insurance and Financial Services
8,,2022-10-12,ilmuOne Data,Jr Data Scientist,"Jakarta, Indonesia",,Jakarta,We provide C-suite executives and start-up Fou...,Entry level,Full-time,Engineering and Information Technology,Software Development
9,,2022-12-23,Koltiva,Data Scientist,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Job Description\n\nData Analysis takes the cen...,Entry level,Full-time,Engineering and Information Technology,IT Services and IT Consulting


## Categorizing and Removing Unrelated Jobs

Although the listings were collected with the keyword **"Data Science"**, not all of them are directly related to Data Science. Some roles, such as general Software Developer or Python Programmer, do not belong to the Data Science domain. Conversely, Data Science roles may appear under diverse job titles, so the categorization has to account for several title variants.

In [45]:
df['Title'].value_counts()

Title
Software Engineer                  45
DevOps Engineer                    43
Data Engineer                      42
Data Scientist                     28
Backend Engineer                   21
                                   ..
Junior Researcher                   1
Fullstack Instructor Internship     1
Campaign Data Analytics             1
AWS Engineer Trainee                1
Internship Web Developer            1
Name: count, Length: 521, dtype: int64

To simplify filtering and analysis, I will categorize job titles into more general categories. Specifically, I aim to:

- Identify roles directly related to **Data Science**.
- Mark unrelated or ambiguous roles as **Others**.

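### Categorization Rules

I will categorize the job titles by checking whether the lowercased title contains one of the keywords below. The categories are checked in the order listed, and the first match wins.

#### Data Science
- "data science"
- "data scientist"
- "machine learning"
- "artificial intelligence"
- "ai/ml"
- "ml"
- "ai engineer"

#### Front-End Engineer
- "front-end"
- "front end"
- "frontend"

#### Back-End Engineer
- "back end"
- "back-end"
- "backend developer"

#### Software Engineer
- "software"
- "programmer"
- "full stack"
- "application"
- "developer"

#### Others
- Any job title that does not match one of the categories above will be categorized as **Others**.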

In [51]:
df['Title'] = df['Title'].str.lower()  # lowercase all titles so that matching is case-insensitive
keyword_list = ['data science', 'data scientist', 'machine learning', 'artificial intelligence', 'ai/ml', 'ml', 'ai engineer']
swe_list = ['software', 'software engineer', 'programmer', 'full stack', 'application', 'developer']
fe_list = ['front-end', 'front end', 'frontend', 'frontend developer']
be_list = ['back end', 'back-end', 'backend developer']
length = len(df['Title'])
df_title = df['Title']
title_category = []
for row in range(length):
    title = str(df_title.iloc[row])
    # Substring match: the first category with a matching keyword wins
    if any(element in title for element in keyword_list):
        title_category.append('Data Science')
    elif any(element in title for element in fe_list):
        title_category.append('Front-End Engineer')
    elif any(element in title for element in be_list):
        title_category.append('Back-End Engineer')
    elif any(element in title for element in swe_list):
        title_category.append('Software Engineer')
    else:
        title_category.append('Others')

# Add the category as a new column at the end of the DataFrame
df['Title Category'] = title_category
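
One caveat of plain substring matching is that the short keyword `ml` is also found inside longer words: `'ml' in 'html5 developer'` is true, so a title like the `HTML5 Developer` listing seen earlier would be labeled Data Science. A possible refinement, shown here only as a sketch, is to match keywords on word boundaries with a regular expression. The helper `is_data_science` and the sample titles other than `html5 developer` are hypothetical, and switching to this approach would change the category counts reported below.

```python
import re

keyword_list = ['data science', 'data scientist', 'machine learning',
                'artificial intelligence', 'ai/ml', 'ml', 'ai engineer']

# \b marks a word boundary, so 'ml' only matches as a separate word
pattern = re.compile(
    r'\b(?:' + '|'.join(re.escape(k) for k in keyword_list) + r')\b'
)


def is_data_science(title):
    """Return True if the title contains a keyword as a whole word or phrase."""
    return bool(pattern.search(str(title).lower()))


for title in ['html5 developer', 'ml engineer', 'ai/ml researcher',
              'senior data scientist']:
    substring_hit = any(k in title for k in keyword_list)
    print(f'{title!r}: substring={substring_hit}, '
          f'word-boundary={is_data_science(title)}')
```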

In [53]:
df.sample(10)

Unnamed: 0,ID,Date,Company,Title,Location,City,Province,Description,Level,Type,Function,Industry,Title Category
856,,2022-12-17,Pensieve,frontend developer,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,"As a Frontend Developer, you are expected to b...",Associate,Full-time,Information Technology and Engineering,IT Services and IT Consulting,Front-End Engineer
25,,2022-12-16,Blibli.com,senior data scientist,"Jakarta, Indonesia",,Jakarta,"Job Descriptions\n\nIdentify, analyze, and int...",Mid-Senior level,Full-time,Engineering and Information Technology,"Technology, Information and Internet",Data Science
86,,2022-12-19,PT. Intikom Berlian Mustika,python developer,Jakarta Metropolitan Area,Jakarta Metropolitan Area,,Intikom is hiring ...\n\n\n\n\nWFO Only\n\nJak...,Associate,Contract,Engineering and Information Technology,IT Services and IT Consulting,Software Engineer
477,,2022-12-01,PT Lion Super Indo,full stack developer,"Jakarta, Indonesia",,Jakarta,REQUIREMENTS\n\nBachelor Degree from Informati...,Mid-Senior level,Full-time,Information Technology,Retail,Software Engineer
886,,2022-12-19,ASTRO,data analyst,"Jakarta, Indonesia",,Jakarta,About Astro\n\nHello there!\n\nASTRO is Indone...,Associate,Full-time,Information Technology and Research,"Technology, Information and Internet",Others
539,,2022-12-06,Farmacare.id,backend developer,"Denpasar, Bali, Indonesia",Denpasar,Bali,Farmacare makes pharmacies' operational manage...,Entry level,Full-time,Engineering and Information Technology,"Technology, Information and Internet",Back-End Engineer
496,,2022-11-22,"Indocyber Global Teknologi, PT",software engineer,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Development of new software and tools accordin...,Entry level,Full-time,Engineering and Information Technology,IT Services and IT Consulting,Software Engineer
721,,2022-11-28,PT Bank Digital BCA (BCA Digital),business intelligence analyst,Jakarta Metropolitan Area,Jakarta Metropolitan Area,,Role Description:\n\nAs a Business Intelligenc...,Mid-Senior level,Full-time,"Information Technology, Research, and Analyst",Banking and Financial Services,Others
847,,2022-09-12,Mekari,software engineer (mekari talenta),"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Mekari is Indonesia's no. 1 Software-as-a-Serv...,Entry level,Full-time,Engineering and Information Technology,IT Services and IT Consulting,Software Engineer
334,,2022-12-14,"QuantumBlack, AI by McKinsey",data engineer - quantumblack,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Qualifications\n\nUniversity degree or pursuin...,Entry level,Full-time,"Consulting, Engineering, and Information Techn...",IT Services and IT Consulting and Software Dev...,Others


In [55]:
# Count the total records by title category
df['Title Category'].value_counts()

Title Category
Software Engineer     380
Others                339
Back-End Engineer      96
Data Science           93
Front-End Engineer     91
Name: count, dtype: int64

In [57]:
# Keep only the job listings whose title category is Data Science
df_ds = df[df['Title Category']=='Data Science']

In [59]:
# Keep only the job listings whose title category is Software Engineer
# This subset is only used to spot-check the categorization
df_swe = df[df['Title Category']=='Software Engineer']

In [61]:
len(df_ds)

93

In [63]:
df_swe.sample(10)

Unnamed: 0,ID,Date,Company,Title,Location,City,Province,Description,Level,Type,Function,Industry,Title Category
842,,2022-11-29,Mandiri Sekuritas,mobile application developer,Jakarta Metropolitan Area,Jakarta Metropolitan Area,,"Mandiri Sekuritas, one of the largest stock br...",,Full-time,,,Software Engineer
315,,2022-12-15,Unit4,associate software engineer,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Company Description\n\n\n\n\nWe are in Busines...,Associate,Full-time,Engineering,Software Development,Software Engineer
933,,2022-12-02,YOYO Holdings Pte. Ltd.,lead software engineer (work from home),Indonesia,,,🖥️ Position\n\nLead Software Engineer (Work-Fr...,Director,Full-time,Engineering and Information Technology,"Technology, Information and Internet",Software Engineer
64,,2022-12-09,PT Bank Central Asia Tbk (BCA),application developer (fresh graduate),"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Calling all tech-savvy individuals to join us ...,Entry level,Full-time,"Information Technology, Analyst, and Strategy/...","IT Services and IT Consulting, Computer and Ne...",Software Engineer
998,,2022-11-28,PT Kita Bisa Teknologi,internship web developer,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Candidate: High School/Vocational High School ...,Not Applicable,Internship,Information Technology and Engineering,IT Services and IT Consulting,Software Engineer
499,,2022-12-12,SIGMATECH,react js developer,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Min 35 years old.\n\nMin Diploma.\n\nAt least ...,Entry level,Contract,Engineering and Information Technology,IT Services and IT Consulting,Software Engineer
644,,2022-11-29,Jagoan Hosting Indonesia,software engineer,"Malang, East Java, Indonesia",Malang,East Java,"\n\n\nMelakukan perancangan, pengembangan, dan...",,Full-time,,,Software Engineer
419,,2022-12-23,Tech Mahindra,oracle developer,"Jakarta, Indonesia",,Jakarta,"Experience in Oracle Developer (Form, Report a...",Mid-Senior level,Full-time,Information Technology,IT Services and IT Consulting,Software Engineer
611,,2022-12-01,PointStar,javascript developer,"Jakarta, Indonesia",,Jakarta,Hi Javascript Enthusiast!\n\n \n\nPointStar is...,,Full-time,,,Software Engineer
497,,2022-11-26,PT Bank Jago Tbk,application security engineer,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,About The Job\n\nThis job is part of our colla...,Entry level,Full-time,Information Technology,"Technology, Information and Internet",Software Engineer


## Selecting Important Columns for Data Analysis

To streamline the dataset and focus on relevant information for analysis, I will select only the following important columns:

- **date**: The date when the job was posted.
- **company**: The name of the hiring company.
- **title**: The name of the job role.
- **location**: The full location of the job posting.
- **city**: The city extracted from the location.
- **province**: The province extracted from the location.
- **level**: The seniority level expected for the role.
- **type**: The employment type (e.g., Full-time, Part-time).
- **function**: The department or job function.
- **industry**: The field or industry of the company.
- **title_category**: The categorized job title used for filtering (e.g., Data Science, Software Engineer, Others).


In [65]:
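# Drop the columns that are not needed for the analysis
df.drop(columns=['ID', 'Description'], inplace=True)
# df_clean is another name for the same DataFrame, not a copy,
# so renaming its columns also renames the columns of df
df_clean = df
df_clean.columns = ['date', 'company', 'title', 'location', 'city', 'province', 'level', 'type', 'function', 'industry', 'title_category']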
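
Assigning a list to `columns` depends on the column order being exactly as expected. As an alternative sketch, the drop and rename can be chained so that the original frame is left untouched and each new name is derived from the old one. The frame `df_demo` and its values are hypothetical stand-ins with only a few of the real columns.

```python
import pandas as pd

# Hypothetical two-row frame with a subset of the original column names
df_demo = pd.DataFrame({
    'ID': [None, None],
    'Date': ['2022-12-21', '2022-12-06'],
    'Title': ['data science', 'data scientist'],
    'Description': ['...', '...'],
    'Title Category': ['Data Science', 'Data Science'],
})

# drop() and rename() both return new DataFrames, so df_demo is unchanged
df_demo_clean = (
    df_demo
    .drop(columns=['ID', 'Description'])
    .rename(columns=lambda name: name.lower().replace(' ', '_'))
)

print(list(df_demo.columns))
print(list(df_demo_clean.columns))
```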

In [71]:
df_clean.head(10)

Unnamed: 0,date,company,title,location,city,province,level,type,function,industry,title_category
0,2022-12-21,Telkom Indonesia,data science,"Jakarta, Indonesia",,Jakarta,Mid-Senior level,Contract,"Analyst, Science, and Engineering",IT Services and IT Consulting and Software Dev...,Data Science
1,2022-12-06,PT Smartfren Telecom Tbk,data scientist,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Entry level,Full-time,Engineering and Information Technology,Telecommunications,Data Science
2,2022-11-30,Indodana,data scientist,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Associate,Full-time,Engineering and Information Technology,Financial Services,Data Science
3,2022-12-13,PT. XL Axiata Tbk,data scientist,"Jakarta, Indonesia",,Jakarta,Associate,Full-time,Analyst and Information Technology,Telecommunications,Data Science
4,2022-12-19,Amartha,data scientist,"South Jakarta City, Jakarta, Indonesia",South Jakarta City,Jakarta,Not Applicable,Full-time,Engineering and Information Technology,"Technology, Information and Internet",Data Science
5,2022-11-21,Moladin,machine learning engineer,"South Jakarta City, Jakarta, Indonesia",South Jakarta City,Jakarta,Not Applicable,Full-time,Engineering,"Technology, Information and Internet",Data Science
6,2022-12-14,Kargo Technologies,data scientist,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Not Applicable,Full-time,Engineering and Information Technology,"Technology, Information and Internet",Data Science
7,2022-12-09,Allianz Indonesia,data scientist,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Not Applicable,Full-time,Engineering and Information Technology,Insurance and Financial Services,Data Science
8,2022-10-12,ilmuOne Data,jr data scientist,"Jakarta, Indonesia",,Jakarta,Entry level,Full-time,Engineering and Information Technology,Software Development,Data Science
9,2022-12-23,Koltiva,data scientist,"Jakarta, Jakarta, Indonesia",Jakarta,Jakarta,Entry level,Full-time,Engineering and Information Technology,IT Services and IT Consulting,Data Science


## Exporting the Data to Excel and CSV

In [76]:
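output_file_path_excel = 'C:\\Users\\JUWITA\\Downloads\\DataScienceClean.xlsx'
output_file_path_csv = 'C:\\Users\\JUWITA\\Downloads\\DataScienceClean.csv'
# Excel: Data Science listings only. df_ds was sliced before the columns were
# dropped and renamed, so it still contains the original columns.
df_ds.to_excel(output_file_path_excel, index=False)
# CSV: all listings with the cleaned columns, using '$' as the separator
df.to_csv(output_file_path_csv, sep='$', index=False)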
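
Because `df_ds` was created before the column cleanup, exporting a Data Science subset with the cleaned columns requires filtering the cleaned frame again. The sketch below shows that pattern on a hypothetical frame and reads the file back to confirm that a `$`-separated CSV round-trips. It writes to a temporary directory, and the frame `df_demo` and the file name `demo_clean.csv` are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical cleaned frame with a company name that contains a comma
df_demo = pd.DataFrame({
    'company': ['Indocyber Global Teknologi, PT', 'Koltiva'],
    'title': ['software engineer', 'data scientist'],
    'title_category': ['Software Engineer', 'Data Science'],
})

# Filter after cleaning so the subset has the cleaned column names
ds_only = df_demo[df_demo['title_category'] == 'Data Science']

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'demo_clean.csv')
    ds_only.to_csv(csv_path, sep='$', index=False)
    # The same separator must be passed when reading the file back
    restored = pd.read_csv(csv_path, sep='$')

print(restored)
```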