# Koodiklinikka Salary Survey 2024 - Data Download

## Introduction

This notebook downloads and saves the **2024 Finnish Tech Salary Survey** data from [Koodiklinikka](https://koodiklinikka.fi/), a Finnish developer community.

The survey collects salary information from tech professionals working in Finland, including:
- Monthly and annual salaries
- Job roles and experience levels
- Company types and locations
- Remote work arrangements
- Age, gender, and education

**Data Source:** https://koodiklinikka.github.io/palkkakysely/2024

In [9]:
import pandas as pd
from pathlib import Path
import os

# Suppress all warnings globally (this hides every warning, not only NaN-related ones)
import warnings
warnings.filterwarnings("ignore")

# Detect if running on Kaggle
IS_KAGGLE = 'KAGGLE_KERNEL_RUN_TYPE' in os.environ

# Set output directory based on environment
if IS_KAGGLE:
    RAW_DIR = Path("/kaggle/working")
    print("Running on Kaggle - output will be saved to /kaggle/working")
else:
    RAW_DIR = Path("raw")
    # Create directory if it doesn't exist
    RAW_DIR.mkdir(exist_ok=True)
    print(f"Running locally - raw data will be saved to '{RAW_DIR}/'")


Running locally - raw data will be saved to 'raw/'
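The setup cell silences every warning for the whole session. A narrower alternative is to suppress warnings only around a specific call. The sketch below assumes a hypothetical helper named `run_quietly` (not part of the original notebook) built on the standard library's `warnings.catch_warnings`:

```python
import warnings


def run_quietly(func, *args, category=Warning, **kwargs):
    """Call func with warnings of `category` suppressed for this call only.

    `run_quietly` is a hypothetical helper name used for illustration.
    """
    with warnings.catch_warnings():
        # The filter is active only inside this block; the previous
        # warning filters are restored when the block exits.
        warnings.simplefilter("ignore", category=category)
        return func(*args, **kwargs)
```

With this approach, warnings raised elsewhere in the notebook stay visible, which can help when a later pandas call reports something unexpected.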


## Download the data

In [10]:
# Data URLs - English and Finnish versions (both are tab-separated TSV files)
url_en = "https://koodiklinikka.github.io/palkkakysely/2024/raw-en.tsv"
url_fi = "https://koodiklinikka.github.io/palkkakysely/2024/raw-fi.tsv"

# Download both datasets
df_en = pd.read_csv(url_en, sep='\t')
df_fi = pd.read_csv(url_fi, sep='\t')

print(f"English version: {len(df_en)} rows and {len(df_en.columns)} columns")
print(f"Finnish version: {len(df_fi)} rows and {len(df_fi.columns)} columns")

print("\n--- English version preview ---")
df_en.head()


English version: 52 rows and 26 columns
Finnish version: 683 rows and 26 columns

--- English version preview ---


Unnamed: 0,Timestamp,Employee or entrepreneur,Have you switched from employment to entrepreneurship or vice versa after 1.10.2023?,Age,Gender,Relevant work experience from the industry (in years),Education,Change in income from last year (in %),How many years have you worked as an entrepreneur in this industry?,What services do you offer?,...,What kind of a company you work in?,Full time / part time,How much of your work time you spend in company office? (in %),Role / title,"Monthly salary (gross, in EUR)","Yearly income (incl. bonuses, etc; in EUR)",Free description of your compensation model,Is your salary competitive?,What was left unasked that you want to answer to?,Feedback of the survey
0,07/10/2024 12:42:34,Employee,No,36-40,Male,16.0,Masters degree,0.0,,,...,Consulting,100.0,0.0,Lead software developer,8400.0,0.0,Based on hours working for client,No,,
1,07/10/2024 12:47:37,Employee,No,31-35,Male,8.0,"MSc, Applied Mathematics",2.0,,,...,A company where software is support role (for ...,100.0,70.0,C++/CUDA algorithm engineer,5300.0,75000.0,,Maybe,,
2,07/10/2024 21:12:16,Entrepreneur,,26-30,,7.0,,,,,...,,,,,,,,,,
3,10/10/2024 18:07:34,Entrepreneur,No,31-35,M,10.0,DI,-6.0,3.0,"backend, frontend, full-stack,devops,architecture",...,,,,,,,,,,
4,11/10/2024 12:33:36,Entrepreneur,No,46-50,Male,25.0,cs@university (not completed),0.0,13.0,Web development,...,,,,,,,,,,
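The `Unnamed: 0` column in the preview appears to be a row index stored in the file's first, unnamed column. If that assumption holds, passing `index_col=0` to `pd.read_csv` would read it as the index instead of a data column. A minimal sketch on a small inline sample (hypothetical data shaped like the survey file, not the real download):

```python
import io

import pandas as pd

# Hypothetical two-row sample mimicking the survey TSV layout:
# the first header cell is empty, so pandas labels it "Unnamed: 0".
sample_tsv = (
    "\tTimestamp\tAge\n"
    "0\t07/10/2024 12:42:34\t36-40\n"
    "1\t07/10/2024 12:47:37\t31-35\n"
)

# Default read: the unnamed first column is kept as a data column.
df_default = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

# With index_col=0: the first column becomes the row index.
df_indexed = pd.read_csv(io.StringIO(sample_tsv), sep="\t", index_col=0)

print(list(df_default.columns))
print(list(df_indexed.columns))
```

Applied to the real files, this would reduce the reported column count by one, so the notebook keeps the default read to preserve the raw data unchanged.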


In [11]:
print("\n--- Finnish version preview ---")
df_fi.head()



--- Finnish version preview ---


Unnamed: 0,Timestamp,Oletko palkansaaja vai laskuttaja?,Oletko siirtynyt palkansaajasta laskuttajaksi tai päinvastoin 1.10.2023 jälkeen?,Ikä,Sukupuoli,Työkokemus alalta (vuosina),Koulutustaustasi,Tulojen muutos viime vuodesta (%),Montako vuotta olet tehnyt laskuttavaa työtä alalla?,Mitä palveluja tarjoat?,...,Millaisessa yrityksessä työskentelet?,Työaika,Kuinka suuren osan ajasta teet lähityönä toimistolla?,Rooli / titteli,"Kuukausipalkka (brutto, euroina)","Vuositulot (sis. bonukset, osingot yms, euroina)",Vapaa kuvaus kokonaiskompensaatiomallista,Onko palkkasi nykyroolissasi mielestäsi kilpailukykyinen?,Vapaa sana,Palautetta kyselystä ja ideoita ensi vuoden kyselyyn
0,07/10/2024 10:04:36,Palkansaaja,En,46-50,mies,27.0,ylioppilas,3.0,,,...,Konsulttitalossa,100.0,80.0,Senior Fullstack Developer,7250.0,96000.0,Palkka + Bonus + Osingot,Kyllä,,
1,07/10/2024 10:05:39,Palkansaaja,En,46-50,,25.0,,3.0,,,...,"Yrityksessä, jossa softa on tukeva toiminto (e...",37.5,50.0,,8750.0,110000.0,Palkka + vuosibonus,Kyllä,,
2,07/10/2024 10:06:12,Palkansaaja,En,31-35,mies,11.0,tietotekniikan kandidaatti,2.0,,,...,"Yrityksessä, jossa softa on tukeva toiminto (e...",100.0,10.0,Site Reliability Engineer,5100.0,65200.0,,Kyllä,,
3,07/10/2024 10:07:01,Palkansaaja,En,36-40,Mies,12.0,Ylioppilas,0.0,,,...,Konsulttitalossa,100.0,5.0,Fullstack developer,5850.0,105000.0,Palkka+osinko,Kyllä,,
4,07/10/2024 10:07:20,Palkansaaja,En,41-45,mies,15.0,DI,,,,...,"Tuotetalossa, jonka core-bisnes on softa",100.0,30.0,Lead,6800.0,81600.0,Rahapalkka,Kyllä,,
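Both previews show 26 columns whose headers look like translations of each other in the same order. Assuming the two files really do share column order (this is not verified here), a positional mapping from Finnish to English headers could be sketched as follows. The helper name `header_mapping` is hypothetical:

```python
import pandas as pd


def header_mapping(df_fi, df_en):
    """Map Finnish column names to English ones by position.

    Assumes both frames list their columns in the same order.
    """
    if len(df_fi.columns) != len(df_en.columns):
        raise ValueError("Column counts differ; positional mapping is unsafe")
    return dict(zip(df_fi.columns, df_en.columns))


# Tiny illustrative frames using headers seen in the previews.
fi_sample = pd.DataFrame({"Ikä": ["46-50"], "Sukupuoli": ["mies"]})
en_sample = pd.DataFrame({"Age": ["36-40"], "Gender": ["Male"]})

mapping = header_mapping(fi_sample, en_sample)
renamed = fi_sample.rename(columns=mapping)
print(list(renamed.columns))
```

Renaming only changes the headers. Cell values such as `mies` or `Palkansaaja` stay in Finnish and would need separate handling.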


## Save the data

In [12]:
# Save both versions to CSV (raw data)
output_path_en = RAW_DIR / "koodiklinikka_salary_survey_2024_en.csv"
output_path_fi = RAW_DIR / "koodiklinikka_salary_survey_2024_fi.csv"

df_en.to_csv(output_path_en, index=False)
df_fi.to_csv(output_path_fi, index=False)

print(f"English data saved to: {output_path_en.absolute()}")
print(f"Finnish data saved to: {output_path_fi.absolute()}")


English data saved to: /Users/elar.saks/Desktop/finnish-tech-salary-survey/data/raw/koodiklinikka_salary_survey_2024_en.csv
Finnish data saved to: /Users/elar.saks/Desktop/finnish-tech-salary-survey/data/raw/koodiklinikka_salary_survey_2024_fi.csv
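A quick way to gain confidence in the saved files is to read them back and compare their shape and headers with the in-memory data. The sketch below uses a hypothetical helper, `save_and_verify`, and a small sample frame written to a temporary directory rather than the real survey files:

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_and_verify(df, path):
    """Write df to CSV without the index, then check the round trip.

    Returns True when the reloaded file has the same shape and headers.
    """
    df.to_csv(path, index=False)
    reloaded = pd.read_csv(path)
    return reloaded.shape == df.shape and list(reloaded.columns) == list(df.columns)


# Illustrative sample; the header with a comma is quoted on write.
sample = pd.DataFrame(
    {
        "Age": ["36-40", "31-35"],
        "Monthly salary (gross, in EUR)": [8400.0, 5300.0],
    }
)

with tempfile.TemporaryDirectory() as tmp:
    ok = save_and_verify(sample, Path(tmp) / "sample.csv")

print(ok)
```

This checks structure only. It does not compare individual values, which can differ in dtype after a CSV round trip.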
