This is an exploratory data analysis (EDA) project on the "Data Science Salaries in 2023" dataset. Its purpose is to identify current trends and patterns in data science salaries. The dataset was obtained from Kaggle and contains information such as job title, experience level, employment type, location, and salary.
The dataset is available on Kaggle under the title "Data Science Salaries in 2023". It consists of a CSV file with over 10,000 rows and the following 11 columns:
- work_year: The year the salary was paid
- experience_level: EN (Entry-level/Junior), MI (Mid-level/Intermediate), SE (Senior-level/Expert), EX (Executive-level/Director)
- job_title: The role worked in during the year
- salary: The total gross salary amount paid
- salary_currency: The currency of the salary paid as an ISO 4217 currency code
- salary_in_usd: The salary in USD
- employee_residence: The employee's primary country of residence during the work year
- remote_ratio: 0 (no remote work, less than 20%), 50 (partially remote), 100 (fully remote, more than 80%)
- company_location: The country of the employer's main office or contracting branch
- company_size: The average number of people who worked for the company during the year: S (fewer than 50 employees), M (50-250 employees), L (more than 250 employees)
- employment_type: The type of employment for the role: PT (Part-time), FT (Full-time), CT (Contract), FL (Freelance)
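The coded columns above can be expanded into readable labels when the CSV is loaded. The following is a minimal sketch using only the Python standard library. The function names, the `Small`/`Medium`/`Large` labels, and the single sample row are assumptions made for illustration; they are not taken from the notebook or the real data.

```python
import csv
import io

# Code-to-label mappings, taken from the column descriptions above.
EXPERIENCE_LEVEL = {
    "EN": "Entry-level",
    "MI": "Mid-level",
    "SE": "Senior-level",
    "EX": "Executive-level",
}
EMPLOYMENT_TYPE = {
    "PT": "Part-time",
    "FT": "Full-time",
    "CT": "Contract",
    "FL": "Freelance",
}
REMOTE_RATIO = {
    "0": "No remote work",
    "50": "Partially remote",
    "100": "Fully remote",
}
# Labels for company_size are an assumption; the dataset only gives S, M, L.
COMPANY_SIZE = {"S": "Small", "M": "Medium", "L": "Large"}

EXPECTED_COLUMNS = {
    "work_year", "experience_level", "employment_type", "job_title",
    "salary", "salary_currency", "salary_in_usd", "employee_residence",
    "remote_ratio", "company_location", "company_size",
}


def decode_row(row):
    """Return a copy of the row with coded columns replaced by labels.

    Unknown codes are kept unchanged rather than dropped.
    """
    decoded = dict(row)
    for column, mapping in (
        ("experience_level", EXPERIENCE_LEVEL),
        ("employment_type", EMPLOYMENT_TYPE),
        ("remote_ratio", REMOTE_RATIO),
        ("company_size", COMPANY_SIZE),
    ):
        decoded[column] = mapping.get(row[column], row[column])
    return decoded


def load_rows(handle):
    """Read CSV rows from a file-like object and decode the coded columns."""
    reader = csv.DictReader(handle)
    missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return [decode_row(row) for row in reader]


# Hypothetical one-row sample standing in for the real CSV file.
SAMPLE_CSV = (
    "work_year,experience_level,employment_type,job_title,salary,"
    "salary_currency,salary_in_usd,employee_residence,remote_ratio,"
    "company_location,company_size\n"
    "2023,SE,FT,Data Scientist,120000,USD,120000,US,100,US,M\n"
)

rows = load_rows(io.StringIO(SAMPLE_CSV))
print(rows[0]["experience_level"], "|", rows[0]["remote_ratio"])
```

To use it on the real file, pass an open file handle instead of the `io.StringIO` sample, for example `load_rows(open(path, newline=""))`, where `path` points to the downloaded CSV.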
The project applies exploratory data analysis to the dataset in three steps: data cleaning, data visualization, and statistical analysis. Together, these steps surface the patterns and trends in data science salaries.
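As a rough illustration of the cleaning and statistical-analysis steps, the sketch below drops unusable rows and computes the median salary per group. It uses only the standard library, and the same steps map directly onto a dataframe library. The helper names, the cleaning rules, and the sample records are assumptions; the notebook may clean the data differently. Note in particular that dropping exact duplicate rows is a judgment call, since two respondents can legitimately report identical values.

```python
from collections import defaultdict
from statistics import median


def clean(records):
    """Drop exact duplicate rows and rows with an unusable salary_in_usd.

    Returns new dicts in which salary_in_usd is converted to float.
    """
    seen = set()
    cleaned = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        try:
            salary = float(rec["salary_in_usd"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric salary
        cleaned.append({**rec, "salary_in_usd": salary})
    return cleaned


def median_salary_by(records, column):
    """Group cleaned records by a column and return the median salary per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[column]].append(rec["salary_in_usd"])
    return {value: median(salaries) for value, salaries in groups.items()}


# Hypothetical records for illustration only; not real dataset values.
sample = [
    {"experience_level": "EN", "salary_in_usd": "60000"},
    {"experience_level": "EN", "salary_in_usd": "80000"},
    {"experience_level": "SE", "salary_in_usd": "150000"},
    {"experience_level": "SE", "salary_in_usd": "150000"},  # exact duplicate
    {"experience_level": "SE", "salary_in_usd": ""},        # missing salary
]

cleaned = clean(sample)
summary = median_salary_by(cleaned, "experience_level")
print(summary)
```

The median is used here instead of the mean because salary distributions tend to be skewed by a small number of very high values.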
The results of the analysis are presented in the Jupyter Notebook file Data_Science_Salaries_EDA.ipynb. The findings cover the distribution of salaries by job title, experience level, and location. The analysis also examines the correlations between features and how they relate to salary.
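For the numeric columns, one common way to quantify such a relationship is the Pearson correlation coefficient. The sketch below computes it by hand between `remote_ratio` and `salary_in_usd`. Whether the notebook uses this particular measure is an assumption, and the sample values are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values chosen to be perfectly linear; not real dataset values.
remote_ratio = [0, 50, 100, 100]
salary_in_usd = [100000, 110000, 120000, 120000]

r = pearson(remote_ratio, salary_in_usd)
print(round(r, 3))
```

A coefficient close to 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship. Correlation alone does not show that one feature causes a change in salary.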
In conclusion, this project summarizes the current trends and patterns of salaries in the data science industry. Individuals and companies can use the findings to make informed decisions about salaries and job offers.