**In this note book I am going to clean the dataframe for all the jobs as well as will perform some basic exploration on data**
> Our dataframe have 6098 jobs, some of them are not the job profile we are looking to analyse (like data entry operator) which may be there because of keyword search in the search engine while scrapping.
> Need to explore what kind of data we have & convert/clean/tranform it to help us in further analysis

In [95]:
%matplotlib inline
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import _pickle as cPickle
pd.set_option('display.width', 500)
pd.set_option('display.max_columns', 100)
pd.set_option('display.notebook_repr_html', True)
sns.set_style("whitegrid")
sns.set_context("poster")

In [96]:
# First read the data-frame :
with open('data/complete_job_profiles.pkl', 'rb') as f:
    df = cPickle.load(f)

*Data is not publicly available. You can scrap it by following the procedure as described in the [Naukri_data_scrapper] notebook. Or else you can write to me at [ripunjoygohain79@gmail.com](mailto:ripunjoygohain79@gmail.com)*

In [97]:
print("Shape of data is "+ str(df.shape))
print("Our data looks like this :")
df.head(3)

Shape of data is (6098, 21)
Our data looks like this :


Unnamed: 0,Actual_Job_Link,Company_Name,Design_Role,Doctorate,Experience,Functional_Area,Industry,Job_Application,Job_Link,Job_Post,Job_Title,Job_View,Key_Skills,Location,Num_Openings,PG,Role_Category,Salary,SalaryI,Skill_Experience,UG
0,https://www.naukri.com/job-listings-Data-Scien...,niki.ai,Data Analyst,Doctorate Not Required,3 - 6 yrs,Analytics & Business Intelligence,IT-Software / Software Services,Less than 10,https://www.naukri.com/job-listings-Data-Scien...,Posted Just Now,Data Scientist - Perl/python,Less than 10,"Machine Learning,Python,Data Analysis,Statisti...",Bengaluru,,"M.Tech - Any Specialization, MS/M.Sc(Science) ...",Analytics & BI,Not Disclosed by Recruiter,Not Disclosed by Recruiter,Qualifications and Skills :1. B.tech/MS or equ...,B.Tech/B.E. - Any Specialization
1,https://www.naukri.com/job-listings-Data-Scien...,Brillio Technologies Pvt. Ltd,Research Scientist,,2 - 5 yrs,"Medical , Healthcare , R&D , ...",IT-Software / Software Services,14,https://www.naukri.com/job-listings-Data-Scien...,Posted 1 day ago,Data Scientist,166,"Analytics,Data analysis,Python,Visualization,A...",Bengaluru,,Post Graduation Not Required,R&D,Not Disclosed by Recruiter,Not Disclosed by Recruiter,,Any Graduate - Any Specialization
2,https://www.naukri.com/job-listings-Senior-Dat...,Knorex India,System Analyst,Doctorate Not Required,3 - 6 yrs,IT Software - System Programming,IT-Software / Software Services,Less than 10,https://www.naukri.com/job-listings-Senior-Dat...,Posted Just Now,"Senior Data Scientist, Data Scientist,",Less than 10,"python,machine learning,r,algorithms,java,mark...",Pune(Hadapsar),Openings: 1,"M.Tech - Computers, Post Graduation Not Required",Programming & Design,"INR 4,25,000 - 9,25,000 P.A.","4,25,000 - 9,25,000 P.A.",Please refer to the Job description above,B.Tech/B.E. - Computers


In [98]:
# Checking the data types
df.dtypes

Actual_Job_Link     object
Company_Name        object
Design_Role         object
Doctorate           object
Experience          object
Functional_Area     object
Industry            object
Job_Application     object
Job_Link            object
Job_Post            object
Job_Title           object
Job_View            object
Key_Skills          object
Location            object
Num_Openings        object
PG                  object
Role_Category       object
Salary              object
SalaryI             object
Skill_Experience    object
UG                  object
dtype: object

**Observations on data types as well as the columns**
1. Job_Link & Actual_Job_Link are almost same, need to remove Job_Link and change the type to string(str)
2. Instead of NAN, put NA in all column rows
3. Experience : split into min_exp & max_exp, change the type to integer
4. Job_Application : parse the integer value only
5. Job_Post : parse integer value only, it is in days. No integer value means today, assign value 0 for it
6. Need to explore Location
7. Num_Openings : get the integer value, change NAN to NA
8. Doctorate, PG, UG : need to explore
9. Job_View : parse the integer value only
10. Remove SalaryI column
11. Salary: split into min & max, parse the integer value

> **Using the following command for every column, while checking the general format of the column**
```python
# dropna=false gives the count of NaN values
df["Column_Name"].value_counts(dropna=False)
```

In [99]:
# Delete Job_Links & SalaryI
del_col = ["Job_Link", "SalaryI"]
df.drop(del_col, axis=1, inplace=True)

In [100]:
# Experience : split
#df["min_experience"], df["max_experience"] = zip(*df["Experience"].map(lambda x : str(x).split(" ")))
df["min_experience"], df["max_experience"] = df["Experience"].str.split("-",1).str

# Keep only numbers, becasue there are rows with "Not Mentioned"
df["min_experience"] = df["min_experience"].str.extract("(\d+)", expand = False).astype(float)
# Extract only the number, get rid of yrs from max_experience
# Nan is also there, we can't convert it to int without "fillna" to 0 :
# Not Mentioned doesn't means fresher or 0 years of experience, so we are retaining NaN value
df["max_experience"] = df["max_experience"].str.extract("(\d+)", expand = False).astype(float)

# Remove the experience column
df.drop("Experience", axis=1, inplace=True)

In [101]:
# Job_Application : few values are like Less than 10, 1500+ etc.
# Get only the numeric value
df["Job_Application"] = df["Job_Application"].str.extract("(\d+)", expand = False).astype(int)

In [102]:
# Job_Post : how many days before the job is posted, for O days it is written as few hours ago, or just now
# While extracting numeric value, these will become NaN, we will replace it with 0
df["Job_Post"] = df["Job_Post"].str.extract("(\d+)", expand = False).fillna(0).astype(int)

In [103]:
# Job_View : few values are like Less than 10, 1500+ etc.
# Get only the numeric value
df["Job_View"] = df["Job_View"].str.extract("(\d+)", expand = False).astype(int)

In [104]:
# Num Openings : in most of the job ad, it is NaN, so converting it to float so that NaN doesn't means 0
df["Num_Openings"] = df["Num_Openings"].str.extract("(\d+)", expand = False).astype(float)

In [170]:
# Salary : split
df["min_salary"], df["max_salary"] = df["Salary"].str.split("-",1).str

# Keep only numbers, becasue there are rows with "Not Disclosed"
# Remove commas from the min_salary amount
df["min_salary"] = df["min_salary"].str.replace(",", "")
df["min_salary"] = df["min_salary"].str.extract("(\d+)", expand = False).astype(float)
# Extract only the number, get rid of INR
# Nan is also there, we can't convert it to int without "fillna" to 0 :
# Not Disclosed doesn't means 0 salary, so we are retaining NaN value
df["max_salary"] = df["max_salary"].str.replace(",", "")
df["max_salary"] = df["max_salary"].str.extract("(\d+)", expand = False).astype(float)

# Again have some error in data: annual salary less than 10000, we will replace as NaN
df["min_salary"] = df["min_salary"].apply(lambda x: np.nan if x < 100000 else x).astype(float)
df["max_salary"] = df["max_salary"].apply(lambda x: np.nan if x < 100000 else x).astype(float)
# Remove the Salary column
df.drop("Salary", axis=1, inplace=True)

In [190]:
#df.drop(df["max_experience"], inplace=True, axis=1)
print(df.shape)
df.head(3)

(6098, 21)


Unnamed: 0,Actual_Job_Link,Company_Name,Design_Role,Doctorate,Functional_Area,Industry,Job_Application,Job_Post,Job_Title,Job_View,Key_Skills,Location,Num_Openings,PG,Role_Category,Skill_Experience,UG,min_experience,max_experience,min_salary,max_salary
0,https://www.naukri.com/job-listings-Data-Scien...,niki.ai,Data Analyst,Doctorate Not Required,Analytics & Business Intelligence,IT-Software / Software Services,10,0,Data Scientist - Perl/python,10,"Machine Learning,Python,Data Analysis,Statisti...",Bengaluru,,"M.Tech - Any Specialization, MS/M.Sc(Science) ...",Analytics & BI,Qualifications and Skills :1. B.tech/MS or equ...,B.Tech/B.E. - Any Specialization,3.0,6.0,,
1,https://www.naukri.com/job-listings-Data-Scien...,Brillio Technologies Pvt. Ltd,Research Scientist,,"Medical , Healthcare , R&D , ...",IT-Software / Software Services,14,1,Data Scientist,166,"Analytics,Data analysis,Python,Visualization,A...",Bengaluru,,Post Graduation Not Required,R&D,,Any Graduate - Any Specialization,2.0,5.0,,
2,https://www.naukri.com/job-listings-Senior-Dat...,Knorex India,System Analyst,Doctorate Not Required,IT Software - System Programming,IT-Software / Software Services,10,0,"Senior Data Scientist, Data Scientist,",10,"python,machine learning,r,algorithms,java,mark...",Pune(Hadapsar),1.0,"M.Tech - Computers, Post Graduation Not Required",Programming & Design,Please refer to the Job description above,B.Tech/B.E. - Computers,3.0,6.0,425000.0,925000.0


In [207]:
# Job_Title : creating a new data-frame
df_new = df
# job title like data entry operator/ business system analyst/ mis executive etc are out of our scope of analysis
# so we are going to discard those job titles
pattern = "data scientist|business analyst|data analyst|scientist|machine learning|big data|data science"

# changing the text to lower case
df_new["Job_Title_lower"] = df_new["Job_Title"].str.lower()
# writing a new column with true and false based on if the pattern is exists or not
df_new["flag"] = df_new["Job_Title_lower"].str.contains(pattern)

# filtering the rows where flag is True
df_filtered = df_new[df_new["flag"]==True]

# to handle SettingWithCopyWarning : explicitly saying that it is not a copy, so that changes in the copy don't reflects
# in the original
df_filtered.is_copy = None
# Remove the flag column
df_filtered.drop("flag", axis=1, inplace=True)

print("After filtering jobs, the shape is " + str(df_filtered.shape))
df_filtered.head(3)

After filtering jobs, the shape is (3484, 22)


Unnamed: 0,Actual_Job_Link,Company_Name,Design_Role,Doctorate,Functional_Area,Industry,Job_Application,Job_Post,Job_Title,Job_View,Key_Skills,Location,Num_Openings,PG,Role_Category,Skill_Experience,UG,min_experience,max_experience,min_salary,max_salary,Job_Title_lower
0,https://www.naukri.com/job-listings-Data-Scien...,niki.ai,Data Analyst,Doctorate Not Required,Analytics & Business Intelligence,IT-Software / Software Services,10,0,Data Scientist - Perl/python,10,"Machine Learning,Python,Data Analysis,Statisti...",Bengaluru,,"M.Tech - Any Specialization, MS/M.Sc(Science) ...",Analytics & BI,Qualifications and Skills :1. B.tech/MS or equ...,B.Tech/B.E. - Any Specialization,3.0,6.0,,,data scientist - perl/python
1,https://www.naukri.com/job-listings-Data-Scien...,Brillio Technologies Pvt. Ltd,Research Scientist,,"Medical , Healthcare , R&D , ...",IT-Software / Software Services,14,1,Data Scientist,166,"Analytics,Data analysis,Python,Visualization,A...",Bengaluru,,Post Graduation Not Required,R&D,,Any Graduate - Any Specialization,2.0,5.0,,,data scientist
2,https://www.naukri.com/job-listings-Senior-Dat...,Knorex India,System Analyst,Doctorate Not Required,IT Software - System Programming,IT-Software / Software Services,10,0,"Senior Data Scientist, Data Scientist,",10,"python,machine learning,r,algorithms,java,mark...",Pune(Hadapsar),1.0,"M.Tech - Computers, Post Graduation Not Required",Programming & Design,Please refer to the Job description above,B.Tech/B.E. - Computers,3.0,6.0,425000.0,925000.0,"senior data scientist, data scientist,"


In [208]:
# delete dataframe
del df_new

In [226]:
# Mention if the job position is junior, mid_level, senior
df_filtered["Position_Rank"] = pd.np.where(df_filtered["Job_Title_lower"].str.contains("senior|sr|sr.|lead|head|chief"), "senior",
                                          pd.np.where(df_filtered["Job_Title_lower"].str.contains("junior|jr|jr.|assistant"), "junior", "mid_level"))

# Job_Title contains other informations also, need to parse only the job_title
df_filtered["Actual_Job_Title"] = pd.np.where(df_filtered["Job_Title_lower"].str.contains("data science|data scientist|scientist"), "data scientist",
                                              pd.np.where(df_filtered["Job_Title_lower"].str.contains("machine learning"), "machine learning",
                                                         pd.np.where(df_filtered["Job_Title_lower"].str.contains("business analyst|business analysts"), "business analyst",
                                                                     pd.np.where(df_filtered["Job_Title_lower"].str.contains("data analyst|analyst"), "data analyst",
                                                                                pd.np.where(df_filtered["Job_Title_lower"].str.contains("big data"), "big data engineer", "others")))))

In [231]:
df_filtered.head(3)

Unnamed: 0,Actual_Job_Link,Company_Name,Design_Role,Doctorate,Functional_Area,Industry,Job_Application,Job_Post,Job_Title,Job_View,Key_Skills,Location,Num_Openings,PG,Role_Category,Skill_Experience,UG,min_experience,max_experience,min_salary,max_salary,Job_Title_lower,Position_Rank,Actual_Job_Title
0,https://www.naukri.com/job-listings-Data-Scien...,niki.ai,Data Analyst,Doctorate Not Required,Analytics & Business Intelligence,IT-Software / Software Services,10,0,Data Scientist - Perl/python,10,"Machine Learning,Python,Data Analysis,Statisti...",Bengaluru,,"M.Tech - Any Specialization, MS/M.Sc(Science) ...",Analytics & BI,Qualifications and Skills :1. B.tech/MS or equ...,B.Tech/B.E. - Any Specialization,3.0,6.0,,,data scientist - perl/python,mid_level,data scientist
1,https://www.naukri.com/job-listings-Data-Scien...,Brillio Technologies Pvt. Ltd,Research Scientist,,"Medical , Healthcare , R&D , ...",IT-Software / Software Services,14,1,Data Scientist,166,"Analytics,Data analysis,Python,Visualization,A...",Bengaluru,,Post Graduation Not Required,R&D,,Any Graduate - Any Specialization,2.0,5.0,,,data scientist,mid_level,data scientist
2,https://www.naukri.com/job-listings-Senior-Dat...,Knorex India,System Analyst,Doctorate Not Required,IT Software - System Programming,IT-Software / Software Services,10,0,"Senior Data Scientist, Data Scientist,",10,"python,machine learning,r,algorithms,java,mark...",Pune(Hadapsar),1.0,"M.Tech - Computers, Post Graduation Not Required",Programming & Design,Please refer to the Job description above,B.Tech/B.E. - Computers,3.0,6.0,425000.0,925000.0,"senior data scientist, data scientist,",senior,data scientist


In [267]:
# Company Name: Need to remove "premium hiring for"
# change to lower case
df_filtered["Company_Name_lower"] = df_filtered["Company_Name"].str.lower()

# removing hiring for sentence
df_filtered["Company"]=pd.np.where(df_filtered["Company_Name_lower"].str.contains("hiring for"),df_filtered["Company_Name_lower"].str.extract("for (.*)", expand=False),df_filtered["Company_Name_lower"])
# remove startups
df_filtered["Company"] = df_filtered["Company"].str.replace(" - startup", "")

# replace jpmorgan with jp morgan
df_filtered["Company"] = df_filtered["Company"].str.replace("jpmorgan", "jp morgan")
# huquo consulting pvt. ltd. with huquo
df_filtered["Company"] = df_filtered["Company"].str.replace("huquo consulting pvt. ltd.", "huquo")
# premium with job consultancy as well as confidential with job consultancy
df_filtered["Company"] = df_filtered["Company"].str.replace("premium", "job consultancy")
df_filtered["Company"] = df_filtered["Company"].str.replace("confidential", "job consultancy")

In [273]:
# Doctorate : Required or not : Yes & No
# where required it is mentioned as Ph.D or Any Doctorate or Other Doctorate
# where not required it is as NaN or Doctorate Not Required
df_filtered["Doc_require"] = pd.np.where(df_filtered["Doctorate"].str.contains("Ph.D|Any|Other"), "Yes", "No")

In [276]:
# PG : Required or not
df_filtered["PG_require"] = pd.np.where(df_filtered["PG"].str.contains("Not Required|NaN"), "No", "Yes")

In [279]:
# UG : Required or not
df_filtered["UG_require"] = pd.np.where(df_filtered["UG"].str.contains("Not Required|NaN"), "No", "Yes")

In [289]:
# Location : Clean the location column
df_filtered["Location_clean"] = df_filtered["Location"]
# Bangalore to Bengaluru
df_filtered["loc_Bengaluru"] = pd.np.where(df_filtered["Location"].str.contains("Bengaluru|Bangalore"), "Bengaluru", "")
df_filtered["loc_Delhi"] = pd.np.where(df_filtered["Location"].str.contains("Delhi|Gurgaon|Gurugram|Noida"), "Delhi NCR", "")
df_filtered["loc_Mumbai"] = pd.np.where(df_filtered["Location"].str.contains("Mumbai|Bombay"), "Mumbai", "")
df_filtered["loc_Hyd"] = pd.np.where(df_filtered["Location"].str.contains("Hyderabad"), "Hyderabad", "")
df_filtered["loc_Pune"] = pd.np.where(df_filtered["Location"].str.contains("Pune"), "Pune", "")
df_filtered["loc_Kol"] = pd.np.where(df_filtered["Location"].str.contains("Kolkata|Calcutta"), "Kolkata", "")
df_filtered["loc_Chennai"] = pd.np.where(df_filtered["Location"].str.contains("Chennai"), "Chennai", "")

In [None]:
# Concat Job Locations
df_filtered["Locations"] = df_filtered[[""]]

In [301]:
df_filtered["loc_Bengaluru"].value_counts(dropna=False)

             2515
Bengaluru     969
Name: loc_Bengaluru, dtype: int64

In [269]:
df_filtered.dtypes

Actual_Job_Link        object
Company_Name           object
Design_Role            object
Doctorate              object
Functional_Area        object
Industry               object
Job_Application         int32
Job_Post                int32
Job_Title              object
Job_View                int32
Key_Skills             object
Location               object
Num_Openings          float64
PG                     object
Role_Category          object
Skill_Experience       object
UG                     object
min_experience        float64
max_experience        float64
min_salary            float64
max_salary            float64
Job_Title_lower        object
Position_Rank          object
Actual_Job_Title       object
Company_Name_lower     object
Company                object
dtype: object

In [302]:
df_filtered.to_csv("basic_clean3.csv", encoding="utf-8")

In [291]:
df_filtered

Unnamed: 0,Actual_Job_Link,Company_Name,Design_Role,Doctorate,Functional_Area,Industry,Job_Application,Job_Post,Job_Title,Job_View,Key_Skills,Location,Num_Openings,PG,Role_Category,Skill_Experience,UG,min_experience,max_experience,min_salary,max_salary,Job_Title_lower,Position_Rank,Actual_Job_Title,Company_Name_lower,Company,Doc_require,PG_require,UG_require,Location_clean,loc_Bengaluru,loc_Delhi,loc_Mumbai,loc_Hyd,loc_Pune,loc_Kol,loc_Chennai
0,https://www.naukri.com/job-listings-Data-Scien...,niki.ai,Data Analyst,Doctorate Not Required,Analytics & Business Intelligence,IT-Software / Software Services,10,0,Data Scientist - Perl/python,10,"Machine Learning,Python,Data Analysis,Statisti...",Bengaluru,,"M.Tech - Any Specialization, MS/M.Sc(Science) ...",Analytics & BI,Qualifications and Skills :1. B.tech/MS or equ...,B.Tech/B.E. - Any Specialization,3.0,6.0,,,data scientist - perl/python,mid_level,data scientist,niki.ai,niki.ai,No,Yes,Yes,Bengaluru,Bengaluru,,,,,,
1,https://www.naukri.com/job-listings-Data-Scien...,Brillio Technologies Pvt. Ltd,Research Scientist,,"Medical , Healthcare , R&D , ...",IT-Software / Software Services,14,1,Data Scientist,166,"Analytics,Data analysis,Python,Visualization,A...",Bengaluru,,Post Graduation Not Required,R&D,,Any Graduate - Any Specialization,2.0,5.0,,,data scientist,mid_level,data scientist,brillio technologies pvt. ltd,brillio technologies pvt. ltd,Yes,No,Yes,Bengaluru,Bengaluru,,,,,,
2,https://www.naukri.com/job-listings-Senior-Dat...,Knorex India,System Analyst,Doctorate Not Required,IT Software - System Programming,IT-Software / Software Services,10,0,"Senior Data Scientist, Data Scientist,",10,"python,machine learning,r,algorithms,java,mark...",Pune(Hadapsar),1.0,"M.Tech - Computers, Post Graduation Not Required",Programming & Design,Please refer to the Job description above,B.Tech/B.E. - Computers,3.0,6.0,425000.0,925000.0,"senior data scientist, data scientist,",senior,data scientist,knorex india,knorex india,No,No,Yes,Pune(Hadapsar),,,,,Pune,,
3,https://www.naukri.com/job-listings-Data-Scien...,GoPaisa Netventures Pvt Ltd,Database Architect/Designer,Doctorate Not Required,"IT Software - eCommerce , Internet Techn...",Internet / Ecommerce,10,2,Data Scientist,108,"mysql,solr,git,web technologies,performance tu...","New Delhi, Jasola",,"M.Tech - Any Specialization, MCA - Computers",Programming & Design,"Qualification\t-B-Tech , MCA , M-Tech",B.Tech/B.E. - Any Specialization,3.0,5.0,,,data scientist,mid_level,data scientist,gopaisa netventures pvt ltd,gopaisa netventures pvt ltd,No,Yes,Yes,"New Delhi, Jasola",,Delhi NCR,,,,,
4,https://www.naukri.com/job-listings-Data-Engin...,IDS Infotech Ltd.,Software Developer,,"IT Software - Application Programming , ...",IT-Software / Software Services,10,0,Data Engineer/analyst/scientist (Big Data),10,"Big Data Engineer,Big Data Developer,Big Data ...",Chandigarh,1.0,"M.Tech - Computers, MCA - Computers",Programming & Design,Please refer to the Job description above,"B.Tech/B.E. - Computers, BCA - Computers",2.0,4.0,400000.0,800000.0,data engineer/analyst/scientist (big data),mid_level,data scientist,ids infotech ltd.,ids infotech ltd.,Yes,Yes,Yes,Chandigarh,,,,,,,
5,https://www.naukri.com/job-listings-Data-Scien...,Inkredo - Startup,Database Architect/Designer,Doctorate Not Required,"IT Software - DBA , Datawarehousing",Internet / Ecommerce,11,1,Data Scientist,68,"software solutions,data scientist,NLP,data sci...",Gurgaon,,Any Postgraduate - Any Specialization,Programming & Design,"Experience: Experience in NLP, network analys...","Any Graduate - Any Specialization, B.Tech/B.E....",1.0,3.0,,,data scientist,mid_level,data scientist,inkredo - startup,inkredo,No,Yes,Yes,Gurgaon,,Delhi NCR,,,,,
6,https://www.naukri.com/job-listings-Data-Scien...,xtLytics,DBA,Doctorate Not Required,"IT Software - DBA , Datawarehousing",IT-Software / Software Services,36,5,Data Scientist - SQL/ Hive/ Pig,227,"Hive,Data Science,Machine Learning,R,SQL,Data ...","Delhi NCR, Noida",,Any Postgraduate - Any Specialization,Admin/Maintenance/Security/Datawarehousing,Education- A Bachelor's degree in a quantitati...,"B.Tech/B.E. - Any Specialization, B.Sc - Any S...",5.0,10.0,,,data scientist - sql/ hive/ pig,mid_level,data scientist,xtlytics,xtlytics,No,Yes,Yes,"Delhi NCR, Noida",,Delhi NCR,,,,,
7,https://www.naukri.com/job-listings-Tacit-DDC-...,Shell India Markets Private Limited,Data Analyst,Ph.D,Analytics & Business Intelligence,IT-Software / Software Services,10,1,Tacit- DDC- Lead Data Scientist,120,"data science,data analytics,agile,machine lear...",Bengaluru,1.0,"MBA/PGDM - Any Specialization, M.Tech - Any Sp...",Analytics & BI,Requirements:\tSkills & Requirements: Bache...,"B.Tech/B.E. - Computers, B.Sc - Computers, Mat...",12.0,14.0,,,tacit- ddc- lead data scientist,senior,data scientist,shell india markets private limited,shell india markets private limited,Yes,Yes,Yes,Bengaluru,Bengaluru,,,,,,
8,https://www.naukri.com/job-listings-Data-Scien...,Kafal Software,Data Analyst,Doctorate Not Required,Analytics & Business Intelligence,IT-Software / Software Services,29,5,Data Scientist - R/ Statistical Modelling - II...,248,"Data Analysis,Data Mining,R,Statistical Modeli...",Delhi NCR,,Any Postgraduate - Any Specialization,Analytics & BI,IITs and NITs preferred,B.Tech/B.E. - Any Specialization,4.0,9.0,,,data scientist - r/ statistical modelling - ii...,mid_level,data scientist,kafal software,kafal software,No,Yes,Yes,Delhi NCR,,Delhi NCR,,,,,
9,https://www.naukri.com/job-listings-Data-Scien...,FedEX Express Transportation & Supply ChainSer...,Other,Doctorate Not Required,Analytics & Business Intelligence,KPO / Research / Analytics,954,0,Data Scientist - Digital Intelligence COE,2500,"hadoop,hive,data analysis,oozie,flume,sql,sqoo...",Mumbai,,Post Graduation Not Required,Other,Please refer to the Job description above,"B.Tech/B.E. - Any Specialization, BCA - Comput...",2.0,7.0,,,data scientist - digital intelligence coe,mid_level,data scientist,fedex express transportation & supply chainser...,fedex express transportation & supply chainser...,No,No,Yes,Mumbai,,,Mumbai,,,,
