## Summary ##
This project demonstrates these skills and abilities:
* Using Python for data analysis with the pandas and numpy libraries
* Cleaning Data
    * Checking for missing values
    * Checking for Outliers
    * Checking for duplicates
* Transforming Data / Data Manipulation
* Making Data Visualizations in Tableau
* Dashboarding in Tableau

You can check out the completed dashboard on my Tableau Public account [here](https://public.tableau.com/views/Data_Professional_Salary_Summary/DataJobSalariesDSHBRD?:language=en-US&:sid=&:redirect=auth&:display_count=n&:origin=viz_share_link).

## Introduction ##

#### In this notebook we'll be cleaning and exploring a dataset of data-related jobs such as Data Analyst, Data Engineer, Business Intelligence Analyst, and Data Manager. The goal here is to clean and transform the dataset before we begin making our dashboard in Tableau.
#### Each row describes a job that someone holds or has held. Our columns are: ####
* work_year : Year of recording
* experience_level: Executive(EX), Senior(SE), Mid Level(MI), Entry Level(EN)
* employment_type: Full-Time(FT), Part-Time(PT), Contract(CT), Freelance(FL)
* job_title: Title of the position
* salary: Yearly wage
* salary_currency: Currency of Salary (EX: USD, EUR, JPY, etc.)
* salary_in_usd: Salary converted to USD
* employee_residence: The country the employee lives in
* remote_ratio: How remote the position is, as a percentage from 0 to 100
* company_location: The country where the company resides
* company_size: Large(L), Medium(M), Small(S)

#### My intention with this dataset is to clean the data wherever it's needed and then create a dashboard in Tableau that summarizes some of the dataset's main points. With the dashboard I want to answer exploratory questions like "What's the average pay for x position?", "What's the average experience level for x position?", and/or "From 2020-2023 has the data professional market changed?" ####

#### Here we'll do simple cleaning for missing values, any typos in categorical columns, data type checking, identifying duplicates and finding any other issues in the data that may potentially disrupt our visualizations. ####

In [89]:
# import python libraries
import pandas as pd
import numpy as np

In [90]:
# Load dataset and view the structure
df = pd.read_csv('/kaggle/input/data-science-salaries-2023/ds_salaries.csv')
df.head()

Unnamed: 0,work_year,experience_level,employment_type,job_title,salary,salary_currency,salary_in_usd,employee_residence,remote_ratio,company_location,company_size
0,2023,SE,FT,Principal Data Scientist,80000,EUR,85847,ES,100,ES,L
1,2023,MI,CT,ML Engineer,30000,USD,30000,US,100,US,S
2,2023,MI,CT,ML Engineer,25500,USD,25500,US,100,US,S
3,2023,SE,FT,Data Scientist,175000,USD,175000,CA,100,CA,M
4,2023,SE,FT,Data Scientist,120000,USD,120000,CA,100,CA,M


## Initial Exploration ##

In [91]:
# View descriptive statistics for numeric columns in the dataset
df.describe().map("{0:.2f}".format)

Unnamed: 0,work_year,salary,salary_in_usd,remote_ratio
count,3755.0,3755.0,3755.0,3755.0
mean,2022.37,190695.57,137570.39,46.27
std,0.69,671676.5,63055.63,48.59
min,2020.0,6000.0,5132.0,0.0
25%,2022.0,100000.0,95000.0,0.0
50%,2022.0,138000.0,135000.0,0.0
75%,2023.0,180000.0,175000.0,100.0
max,2023.0,30400000.0,450000.0,100.0
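
#### The table above shows salary topping out at 30,400,000 while salary_in_usd tops out at 450,000, which suggests the very large salary values are local-currency amounts rather than entry errors. One common way to flag candidates for a closer look is the 1.5 x IQR rule. The sketch below runs it on made-up numbers (the `salary_in_usd` series here is hypothetical, not the dataset column). ####

```python
import pandas as pd

# Hypothetical salaries in USD; the last value is deliberately extreme
salary_in_usd = pd.Series([95000, 120000, 135000, 150000, 175000, 900000])

# Tukey's rule: flag values more than 1.5 * IQR outside the middle 50% of the data
q1, q3 = salary_in_usd.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep only the values that fall outside the two fences
outliers = salary_in_usd[(salary_in_usd < lower) | (salary_in_usd > upper)]
print(outliers.tolist())
```

#### Flagged values are only candidates. A high executive salary can be perfectly valid, so it is worth looking at the rows before dropping anything. ####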


In [92]:
# View descriptive statistics for string objects in the dataset
df.describe(include=object)

Unnamed: 0,experience_level,employment_type,job_title,salary_currency,employee_residence,company_location,company_size
count,3755,3755,3755,3755,3755,3755,3755
unique,4,4,93,20,78,72,3
top,SE,FT,Data Engineer,USD,US,US,M
freq,2516,3718,1040,3224,3004,3040,3153


In [93]:
# View the datatypes for each column in our data
df.dtypes

work_year              int64
experience_level      object
employment_type       object
job_title             object
salary                 int64
salary_currency       object
salary_in_usd          int64
employee_residence    object
remote_ratio           int64
company_location      object
company_size          object
dtype: object

#### There are no missing values in the dataset (every column has a count of 3755), which is good, and each column has the data type we'd expect. There might not be any cleaning that we need to do, but let's go through the process anyway.
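
#### A more direct way to confirm this is to count the missing values in each column. Here is a minimal sketch on a small made-up frame (`toy` and its values are illustrative only). On the real data the same two calls would be run on `df`. ####

```python
import pandas as pd

# Hypothetical toy frame standing in for the salary data; one title is deliberately missing
toy = pd.DataFrame({
    "job_title": ["Data Analyst", "Data Engineer", None],
    "salary_in_usd": [90000, 120000, 135000],
})

# Count missing values per column; all zeros would mean nothing needs imputing or dropping
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Single yes/no answer for the whole frame
has_missing = bool(toy.isna().any().any())
print(has_missing)
```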

#### Let's take a look at the categorical values to see if there are any typos in them. We want to check specifically that all categories are grouped exactly how they should be. ####
#### For example, if the ['employment_type'] column contained both 'FT' and 'Ft' for Full Time, we'd want to change the odd one out to match the rest of the values. ####
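
#### If variants like that did show up, stripping whitespace and normalizing case would be one way to collapse them. The sketch below assumes a hypothetical column with mixed spellings. The real column turns out to be clean, so none of this is needed for our data. ####

```python
import pandas as pd

# Hypothetical column with inconsistent casing and stray whitespace
employment_type = pd.Series(["FT", "Ft", "ft ", "PT", "CT", "FT"])

# Raw counts treat every spelling as its own category
raw_counts = employment_type.value_counts()

# Normalizing whitespace and case collapses the variants into one label
cleaned = employment_type.str.strip().str.upper()
clean_counts = cleaned.value_counts()

print(raw_counts)
print(clean_counts)
```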

In [94]:
# Viewing the distribution and values for categorical columns
print(df['experience_level'].value_counts())
print("________________________________________________")
print(df['employment_type'].value_counts())
print("________________________________________________")
print(df['salary_currency'].value_counts())
print("________________________________________________")
print(df['company_size'].value_counts())

experience_level
SE    2516
MI     805
EN     320
EX     114
Name: count, dtype: int64
________________________________________________
employment_type
FT    3718
PT      17
CT      10
FL      10
Name: count, dtype: int64
________________________________________________
salary_currency
USD    3224
EUR     236
GBP     161
INR      60
CAD      25
AUD       9
SGD       6
BRL       6
PLN       5
CHF       4
HUF       3
DKK       3
JPY       3
TRY       3
THB       2
ILS       1
HKD       1
CZK       1
MXN       1
CLP       1
Name: count, dtype: int64
________________________________________________
company_size
M    3153
L     454
S     148
Name: count, dtype: int64


#### The categorical values look good! The three remaining columns have many more distinct values, so it will be easier to spot any typos in our visualization tool. If we spot a typo then we'll come back and fix it here, but for now I think we can move past checking for typos in our data.

#### Next, let's check for duplicates. It's possible that we have employees working under exactly the same circumstances, but if we see multiple duplicates of the same row then I think we can safely assume that the duplicates were a mistake. ####

In [95]:
# Print the shape (rows, columns) of the subset of rows that are fully duplicated
print(df[df.duplicated(keep=False)].shape, "\n\n")

# View completely duplicated rows
df[df.duplicated(keep=False)].sort_values(by=['job_title','salary']).head(10)

(1715, 11) 




Unnamed: 0,work_year,experience_level,employment_type,job_title,salary,salary_currency,salary_in_usd,employee_residence,remote_ratio,company_location,company_size
2920,2022,SE,FT,Analytics Engineer,63000,USD,63000,US,0,US,M
2945,2022,SE,FT,Analytics Engineer,63000,USD,63000,US,0,US,M
2697,2022,SE,FT,Analytics Engineer,110000,USD,110000,US,100,US,M
2738,2022,SE,FT,Analytics Engineer,110000,USD,110000,US,100,US,M
2696,2022,SE,FT,Analytics Engineer,130000,USD,130000,US,100,US,M
2737,2022,SE,FT,Analytics Engineer,130000,USD,130000,US,100,US,M
2324,2022,SE,FT,Analytics Engineer,138750,USD,138750,US,100,US,M
2326,2022,SE,FT,Analytics Engineer,138750,USD,138750,US,100,US,M
120,2023,SE,FT,Analytics Engineer,143860,USD,143860,US,0,US,M
402,2023,SE,FT,Analytics Engineer,143860,USD,143860,US,0,US,M


#### This is interesting: 1,715 of the 3,755 rows (about 46%) belong to a group of identical rows, so almost half of the dataset looks duplicated. As mentioned previously, it is possible for the same position at the same company to pay the same amount. We'll assume the data is correct and that each row represents a different individual. ####
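
#### One thing to keep in mind when reading the 1715 figure: `keep=False` marks every member of a duplicated group, including the first occurrence. The number of rows that `drop_duplicates()` would actually remove is smaller. The sketch below shows the difference on a made-up frame (`toy` is hypothetical). ####

```python
import pandas as pd

# Hypothetical rows: the first three are identical, the last is unique
toy = pd.DataFrame({
    "job_title": ["Analytics Engineer"] * 3 + ["Data Analyst"],
    "salary_in_usd": [63000, 63000, 63000, 90000],
})

# keep=False flags every member of a duplicated group (what the cell above counts)
rows_in_duplicate_groups = int(toy.duplicated(keep=False).sum())

# keep='first' flags only the repeats, i.e. the rows drop_duplicates() would remove
extra_copies = int(toy.duplicated(keep="first").sum())

print(rows_in_duplicate_groups, extra_copies)
```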

#### Aside from the "duplicates" we saw, this dataset looks well cleaned. We're going to start double-checking and customizing a few columns to enhance the visualizations we'll make in Tableau. If we don't do it here we'd have to do it in Tableau. ####

## Editing Categorical Columns ##

In [96]:
# Increase the max number of rows that are displayed.
pd.set_option('display.max_rows',100)

In [97]:
# Since we don't have too many different job titles we'll view them all
# And decide how to fix/group the ones with similar names/functions
df.job_title.value_counts().sort_values()

job_title
Finance Data Analyst                           1
Principal Data Architect                       1
Head of Machine Learning                       1
Cloud Data Architect                           1
Data DevOps Engineer                           1
BI Data Engineer                               1
Staff Data Scientist                           1
Deep Learning Researcher                       1
Staff Data Analyst                             1
Marketing Data Engineer                        1
Power BI Developer                             1
Compliance Data Analyst                        1
Data Science Tech Lead                         1
Data Management Specialist                     1
Principal Machine Learning Engineer            1
Azure Data Engineer                            1
Manager Data Management                        1
Product Data Scientist                         1
Data Scientist Lead                            2
Data Analytics Consultant                      2
Marketing 

#### The ['job_title'] column can use some editing because it has values that are very similar to each other. One example is "Financial Data Analyst" and "Finance Data Analyst". Another is "ML Engineer" and "Machine Learning Engineer". We'll go through the ['job_title'] column now and group similar job titles together. ####

In [98]:
# Changing the similarly named job titles to match the majority spelling.
# The replacement is limited to job_title so no other column can be touched by accident.
df['job_title'] = df['job_title'].replace({
    'AI Programmer': 'AI Developer', 'BI Analyst': 'BI Data Analyst',
    'Cloud Database Engineer': 'Cloud Data Engineer',
    'Computer Vision Engineer': 'Computer Vision Software Engineer',
    'Finance Data Analyst': 'Financial Data Analyst',
    'ML Engineer': 'Machine Learning Engineer', 'Power BI Developer': 'BI Developer'})
df.job_title.nunique()

86

#### We know that the previous cell ran correctly because each of the 7 job titles we replaced was merged into a title that already existed, so the number of unique titles drops from 93 to 86 (93 - 7 = 86).
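
#### The same check can be made explicit in code: compare the unique counts before and after, and confirm that none of the old spellings are left. This is a sketch on a hypothetical subset of titles (`titles` and `title_fixes` are illustrative names that mirror two entries from the cell above). ####

```python
import pandas as pd

# Hypothetical subset of titles; the mapping mirrors two entries from the replace cell
titles = pd.Series(["ML Engineer", "Machine Learning Engineer",
                    "Finance Data Analyst", "Financial Data Analyst", "Data Engineer"])
title_fixes = {"ML Engineer": "Machine Learning Engineer",
               "Finance Data Analyst": "Financial Data Analyst"}

before = titles.nunique()
fixed = titles.replace(title_fixes)
after = fixed.nunique()

# Every old spelling should be gone, and the unique count should drop by one per merged title
leftovers = fixed[fixed.isin(list(title_fixes.keys()))]
print(before, after, len(leftovers))
```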

#### We're also going to change the values in the ['experience_level'] and ['employment_type'] columns from abbreviations to their full words/phrases. Ex: EN = Entry Level, MI = Middle/Intermediate, FT = Full Time, CT = Contract, etc.

In [99]:
# View the distribution of employment type and experience level.
# We're going to change the abbreviations to their full words
print(df.employment_type.value_counts())
print("___________________________________________")
print(df.experience_level.value_counts())

employment_type
FT    3718
PT      17
CT      10
FL      10
Name: count, dtype: int64
___________________________________________
experience_level
SE    2516
MI     805
EN     320
EX     114
Name: count, dtype: int64


In [100]:
# Replacing the abbreviations in employment type and experience level
df.replace({'employment_type':{'FT':'Full Time', 'PT':'Part Time', 'CT':'Contract', 'FL':'Freelance'},
            'experience_level':{'SE':'Senior', 'MI':'Intermediate', 'EN':'Entry', 'EX':'Executive'}}, inplace=True)

print(df.employment_type.value_counts())
print("___________________________________________")
print(df.experience_level.value_counts())

employment_type
Full Time    3718
Part Time      17
Contract       10
Freelance      10
Name: count, dtype: int64
___________________________________________
experience_level
Senior          2516
Intermediate     805
Entry            320
Executive        114
Name: count, dtype: int64


In [101]:
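# Print the country codes and company sizes before the change
print(df.employee_residence.value_counts().index)
print(df.company_location.value_counts().index)
print(df.company_size.value_counts().index)
# Dictionary mapping every country code in the dataset to a full name.
# NOTE: in ISO 3166-1 alpha-2, 'SK' is Slovakia (South Korea is 'KR') and 'GB' is the
# United Kingdom. The labels below, including the 'New Zeland' spelling, are left as
# written because continent_map uses these exact strings as keys.
country_map = {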
    'US': 'United States', 'GB': 'Great Britain', 'CA': 'Canada', 'ES': 'Spain',
    'IN': 'India', 'DE': 'Germany', 'FR': 'France', 'PT': 'Portugal', 'BR': 'Brazil',
    'GR': 'Greece', 'NL': 'Netherlands', 'AU': 'Australia', 'MX': 'Mexico',
    'IT': 'Italy', 'PK': 'Pakistan', 'JP': 'Japan', 'IE': 'Ireland', 'NG': 'Nigeria',
    'AT': 'Austria', 'AR': 'Argentina', 'PL': 'Poland', 'PR': 'Puerto Rico', 
    'TR': 'Turkey', 'BE': 'Belgium', 'SG': 'Singapore', 'RU': 'Russia', 'LV': 'Latvia', 
    'UA': 'Ukraine', 'CO': 'Colombia', 'CH': 'Switzerland', 'SI': 'Slovenia', 'BO': 'Bolivia', 
    'DK': 'Denmark', 'HR': 'Croatia', 'HU': 'Hungary', 'RO': 'Romania', 'TH': 'Thailand', 
    'AE': 'United Arab Emirates', 'VN': 'Vietnam', 'HK': 'Hong Kong', 
    'UZ': 'Uzbekistan', 'PH': 'Philippines', 'CF': 'Central African Republic', 
    'CL': 'Chile', 'FI': 'Finland', 'CZ': 'Czech Republic', 'SE': 'Sweden',
    'AS': 'American Samoa', 'LT': 'Lithuania', 'GH': 'Ghana', 'KE': 'Kenya', 
    'DZ': 'Algeria', 'NZ': 'New Zeland', 'JE': 'Jersey', 'MY': 'Malaysia', 
    'MD': 'Moldova', 'IQ': 'Iraq', 'BG': 'Bulgaria', 'LU': 'Luxembourg', 'RS': 'Serbia', 
    'HN': 'Honduras', 'EE': 'Estonia', 'TN': 'Tunisia', 'CR': 'Costa Rica', 'ID': 'Indonesia', 
    'EG': 'Egypt', 'DO': 'Dominican Republic', 'CN': 'China', 'SK': 'South Korea', 
    'IR': 'Iran', 'MA': 'Morocco', 'IL': 'Israel', 'MK': 'North Macedonia', 'BA': 'Bosnia', 
    'AM': 'Armenia', 'CY': 'Cyprus', 'KW': 'Kuwait', 'MT': 'Malta', 'BS': 'The Bahamas',
    'AL': 'Albania'
}

# Replacing the abbreviations for employee_residence, company_location, and company_size
df.replace({'employee_residence': country_map,
           'company_location': country_map,
            'company_size': {'S':'Small', 'M':'Medium', 'L':'Large'}
           }, inplace=True)
print("\n____________________After_Change_____________________________")
print(df.employee_residence.value_counts().index)
print(df.company_location.value_counts().index)
print(df.company_size.value_counts().index)

Index(['US', 'GB', 'CA', 'ES', 'IN', 'DE', 'FR', 'PT', 'BR', 'GR', 'NL', 'AU',
       'MX', 'IT', 'PK', 'JP', 'IE', 'NG', 'AT', 'AR', 'PL', 'PR', 'TR', 'BE',
       'SG', 'RU', 'LV', 'UA', 'CO', 'CH', 'SI', 'BO', 'DK', 'HR', 'HU', 'RO',
       'TH', 'AE', 'VN', 'HK', 'UZ', 'PH', 'CF', 'CL', 'FI', 'CZ', 'SE', 'AS',
       'LT', 'GH', 'KE', 'DZ', 'NZ', 'JE', 'MY', 'MD', 'IQ', 'BG', 'LU', 'RS',
       'HN', 'EE', 'TN', 'CR', 'ID', 'EG', 'DO', 'CN', 'SK', 'IR', 'MA', 'IL',
       'MK', 'BA', 'AM', 'CY', 'KW', 'MT'],
      dtype='object', name='employee_residence')
Index(['US', 'GB', 'CA', 'ES', 'IN', 'DE', 'FR', 'BR', 'AU', 'GR', 'PT', 'NL',
       'MX', 'IE', 'SG', 'AT', 'JP', 'TR', 'CH', 'NG', 'PL', 'PK', 'LV', 'DK',
       'IT', 'PR', 'SI', 'BE', 'CO', 'UA', 'HR', 'TH', 'RU', 'AR', 'CZ', 'AE',
       'FI', 'AS', 'LU', 'HU', 'ID', 'LT', 'RO', 'SE', 'KE', 'EE', 'CF', 'IL',
       'GH', 'EG', 'MD', 'CL', 'NZ', 'CN', 'IQ', 'DZ', 'HK', 'HN', 'MY', 'AL',
       'MA', 'PH', 'BO', 'VN', 'AM', '

In [102]:
df.head()

Unnamed: 0,work_year,experience_level,employment_type,job_title,salary,salary_currency,salary_in_usd,employee_residence,remote_ratio,company_location,company_size
0,2023,Senior,Full Time,Principal Data Scientist,80000,EUR,85847,Spain,100,Spain,Large
1,2023,Intermediate,Contract,Machine Learning Engineer,30000,USD,30000,United States,100,United States,Small
2,2023,Intermediate,Contract,Machine Learning Engineer,25500,USD,25500,United States,100,United States,Small
3,2023,Senior,Full Time,Data Scientist,175000,USD,175000,Canada,100,Canada,Medium
4,2023,Senior,Full Time,Data Scientist,120000,USD,120000,Canada,100,Canada,Medium


####  These were minor issues that came down to preference and what I thought would make understanding/reading visualizations easier. ####

## Adding a Continent Column ##

#### We're going to create a new Continent column, kept as a separate series and exported on its own, that matches each company's country to the correct continent. ####

In [103]:
# Start from a copy of company_location; the country names are swapped for continents below
continent = df['company_location'].copy()
continent

0               Spain
1       United States
2       United States
3              Canada
4              Canada
            ...      
3750    United States
3751    United States
3752    United States
3753    United States
3754            India
Name: company_location, Length: 3755, dtype: object

In [104]:
continent_map = {'United States': 'North America', 'Canada': 'North America', 'Mexico': 'North America', 'Puerto Rico': 'North America',
    'Honduras': 'North America', 'Costa Rica': 'North America',
    'Brazil': 'South America', 'Argentina': 'South America', 'Colombia': 'South America', 'Chile': 'South America', 'Bolivia': 'South America',
    'Great Britain': 'Europe', 'Germany': 'Europe', 'France': 'Europe', 'Greece': 'Europe',
    'Portugal': 'Europe', 'Netherlands': 'Europe', 'Ireland': 'Europe', 'Austria': 'Europe', 'Switzerland': 'Europe',
    'Poland': 'Europe', 'Latvia': 'Europe', 'Denmark': 'Europe', 'Italy': 'Europe', 'Slovenia': 'Europe', 'Belgium': 'Europe',
    'Ukraine': 'Europe', 'Croatia': 'Europe', 'Czech Republic': 'Europe', 'Finland': 'Europe', 'Luxembourg': 'Europe',
    'Hungary': 'Europe', 'Lithuania': 'Europe', 'Romania': 'Europe', 'Sweden': 'Europe', 'Estonia': 'Europe', 'Malta': 'Europe',
    'Russia': 'Europe', 'Turkey': 'Europe',
    'India': 'Asia', 'Japan': 'Asia', 'Singapore': 'Asia', 'Thailand': 'Asia', 'Indonesia': 'Asia',
    'China': 'Asia', 'Pakistan': 'Asia', 'South Korea': 'Asia', 'Vietnam': 'Asia', 'Philippines': 'Asia',
    'Malaysia': 'Asia', 'Hong Kong': 'Asia', 'Iran': 'Asia', 'Iraq': 'Asia', 'Armenia': 'Asia',
    'United Arab Emirates': 'Asia', 'Israel': 'Asia',
    'Egypt': 'Africa', 'Nigeria': 'Africa', 'Kenya': 'Africa', 'Ghana': 'Africa', 'Algeria': 'Africa',
    'Morocco': 'Africa', 'Central African Republic': 'Africa',
    'New Zeland': 'Oceania', 'Australia': 'Oceania', 'American Samoa': 'Oceania',
    'The Bahamas': 'North America', 'Moldova': 'Europe', 'Spain': 'Europe', 'Bosnia': 'Europe', 
    'North Macedonia': 'Europe', 'Albania': 'Europe'}

continent.replace(continent_map, inplace=True)
continent.value_counts()

company_location
North America    3144
Europe            462
Asia               93
South America      24
Oceania            18
Africa             14
Name: count, dtype: int64

In [105]:
continent

0              Europe
1       North America
2       North America
3       North America
4       North America
            ...      
3750    North America
3751    North America
3752    North America
3753    North America
3754             Asia
Name: company_location, Length: 3755, dtype: object

In [106]:
continent.to_csv('continent_col.csv')
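
#### One caveat with `replace()`: a country that is missing from the lookup is silently left as a country name in the continent series. An alternative is `map()`, which returns NaN for anything it can't find, so gaps are easy to spot. The sketch below uses a hypothetical miniature frame and lookup (`toy`, `toy_continent_map`, and the 'Atlantis' row are made up for illustration). ####

```python
import pandas as pd

# Hypothetical miniature versions of the frame and the country-to-continent lookup
toy = pd.DataFrame({"company_location": ["Spain", "United States", "Canada", "Atlantis"]})
toy_continent_map = {"Spain": "Europe", "United States": "North America", "Canada": "North America"}

# map() returns NaN for countries missing from the lookup
toy["continent"] = toy["company_location"].map(toy_continent_map)

# List the countries that still need an entry in the lookup
unmapped = toy.loc[toy["continent"].isna(), "company_location"].unique().tolist()

print(toy)
print("Unmapped countries:", unmapped)
```

#### In our case the six continent counts above add up to 3755, so every company location was mapped. ####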

## Aggregating Job Titles ##

#### I want to narrow down the professions in the dataset even further than before. ####

#### I'm going to categorize the professions as mentioned previously: if a profession fits into the category of Data Analyst, Data Scientist, Data Engineer, Machine Learning Engineer, or Data Management, it will be grouped here and exported, and I'll make a second dashboard for those 5 professions. ####

#### A couple of examples: BI Analyst, Business Data Analyst, BI Developer, Financial Data Analyst, Insight Analyst, and anything similar in the dataset will be labeled as Data Analyst in the new dataframe. I'll do the same thing for the other 4 professions. ####

#### We are generalizing many different data roles but my main focus for this project is to make a nice dashboard. ####

In [107]:
df['job_title'].value_counts()

job_title
Data Engineer                               1040
Data Scientist                               840
Data Analyst                                 612
Machine Learning Engineer                    323
Analytics Engineer                           103
Data Architect                               101
Research Scientist                            82
Applied Scientist                             58
Data Science Manager                          58
Research Engineer                             37
Data Manager                                  29
Machine Learning Scientist                    26
BI Data Analyst                               24
Data Science Consultant                       24
Computer Vision Software Engineer             23
Data Analytics Manager                        22
AI Scientist                                  16
Business Data Analyst                         15
BI Developer                                  14
Data Specialist                               14
AI Develop

In [108]:
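# Group titles by keyword. The rules run in order, so a title matched by an earlier rule
# (e.g. 'Data Science Manager' -> 'Data Manager/Lead') is not relabeled by a later one.
df['job_title'] = df['job_title'].apply(lambda x: 'Data Manager/Lead' if any(keyword in x for keyword in ['Management', 'Manager', 'Head', 'Lead']) else x)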
df['job_title'] = df['job_title'].apply(lambda x: 'Machine Learning Developer' if any(keyword in x for keyword in ['Machine Learning', 'ML', 'AI']) else x)
df['job_title'] = df['job_title'].apply(lambda x: 'Data Analyst' if any(keyword in x for keyword in ['Analyst', 'BI', 'Data Analytics']) else x)
df['job_title'] = df['job_title'].apply(lambda x: 'Data Scientist' if any(keyword in x for keyword in ['Data Science', 'Scientist']) else x)
df['job_title'] = df['job_title'].apply(lambda x: 'Data Engineer' if any(keyword in x for keyword in ['Engineer', 'Architect', 'ETL']) else x)

df['job_title'].value_counts()

job_title
Data Engineer                    1379
Data Scientist                   1040
Data Analyst                      704
Machine Learning Developer        435
Data Manager/Lead                 172
Data Specialist                    14
3D Computer Vision Researcher       4
Data Modeler                        2
Data Strategist                     2
Autonomous Vehicle Technician       2
Deep Learning Researcher            1
Name: count, dtype: int64
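
#### The grouping above depends on the order of the rules. The sketch below shows the same idea as a single function where the first matching rule wins. The titles are a hypothetical sample, `rules` and `group_title` are illustrative names, and the Machine Learning rule is left out to keep it short. ####

```python
import pandas as pd

# Hypothetical titles chosen to show why rule order matters
titles = pd.Series(["Data Science Manager", "BI Data Engineer", "Research Scientist", "Data Modeler"])

# (label, keywords) pairs checked top to bottom; the first match wins
rules = [
    ("Data Manager/Lead", ["Management", "Manager", "Head", "Lead"]),
    ("Data Analyst", ["Analyst", "BI", "Data Analytics"]),
    ("Data Scientist", ["Data Science", "Scientist"]),
    ("Data Engineer", ["Engineer", "Architect", "ETL"]),
]

def group_title(title: str) -> str:
    """Return the label of the first rule whose keyword occurs in the title."""
    for label, keywords in rules:
        if any(keyword in title for keyword in keywords):
            return label
    return title  # no rule matched: keep the original title

grouped = titles.map(group_title)
print(grouped.tolist())
```

#### Note that 'BI Data Engineer' lands in Data Analyst because the 'BI' keyword is checked before 'Engineer'. Swapping the rule order would change the result. ####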

#### Data Specialist and Data Strategist are very broad and could fit in multiple categories, so they will be left out along with the other small groups. We're going to export a CSV file containing only the top 5 job titles. ####

In [109]:
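# Keep only the rows whose title is one of the 5 most common groups
top5_titles = df['job_title'].value_counts().nlargest(5)
# reset_index() keeps the original row labels in an 'index' column
filtered_df = df[df['job_title'].isin(top5_titles.index)].reset_index()
filtered_df['job_title'].value_counts()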

job_title
Data Engineer                 1379
Data Scientist                1040
Data Analyst                   704
Machine Learning Developer     435
Data Manager/Lead              172
Name: count, dtype: int64

In [110]:
filtered_df.to_csv('Top5_DataProfessionals.csv')

In [111]:
filtered_df.head()

Unnamed: 0,index,work_year,experience_level,employment_type,job_title,salary,salary_currency,salary_in_usd,employee_residence,remote_ratio,company_location,company_size
0,0,2023,Senior,Full Time,Data Scientist,80000,EUR,85847,Spain,100,Spain,Large
1,1,2023,Intermediate,Contract,Machine Learning Developer,30000,USD,30000,United States,100,United States,Small
2,2,2023,Intermediate,Contract,Machine Learning Developer,25500,USD,25500,United States,100,United States,Small
3,3,2023,Senior,Full Time,Data Scientist,175000,USD,175000,Canada,100,Canada,Medium
4,4,2023,Senior,Full Time,Data Scientist,120000,USD,120000,Canada,100,Canada,Medium


## Creating a Second Continent Column ##

#### We could map the old continent column indices to our filtered_df's indices but instead we are going to make a whole new continent series. We'll make a new continent series with the same method we used before, export it, and then create a relationship between it and the filtered_df in Tableau. ####
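
#### For reference, the index-mapping alternative mentioned above could look roughly like this. It is a sketch on hypothetical stand-ins (`continent_full`, `toy`, and `kept` are made-up names), not the approach used in this notebook. ####

```python
import pandas as pd

# Hypothetical stand-ins: a continent series on the full frame's index, and a small frame
continent_full = pd.Series(["Europe", "North America", "Asia", "North America"], name="continent")
toy = pd.DataFrame({"job_title": ["Data Scientist", "Data Specialist", "Data Analyst", "Data Engineer"]})

# Filter first and keep the original index so it still lines up with continent_full
kept = toy[toy["job_title"] != "Data Specialist"]

# .loc with the surviving index labels picks the matching continent for each kept row
kept_continent = continent_full.loc[kept.index]
print(kept_continent.tolist())
```

#### In filtered_df the original row labels are stored in the 'index' column created by the reset, so those values would play the role of `kept.index` here. ####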

In [112]:
# Same approach as before, this time on the filtered dataframe
continent2 = filtered_df['company_location'].copy()
continent2.replace(continent_map, inplace=True)
continent2.value_counts()

company_location
North America    3124
Europe            460
Asia               92
South America      24
Oceania            17
Africa             13
Name: count, dtype: int64

In [113]:
continent2.to_csv('filtered_continent2.csv', index=False)

#### This project is complete. In it I've demonstrated skills and abilities including:
* Ability to use Python for data analysis with the pandas library
* Cleaning Data
    * Checking for missing values
    * Checking for Outliers
    * Checking for duplicates
* Transforming Data / Data Manipulation
* Making Data Visualizations in Tableau
* Dashboarding
* Summarizing

You can check out the completed dashboard on Tableau [here](https://public.tableau.com/views/Data_Professional_Salary_Summary/DataJobSalariesDSHBRD?:language=en-US&:sid=&:redirect=auth&:display_count=n&:origin=viz_share_link).