# High-Level Results Analysis for Qualtrics Python Skill Level Survey

This document provides a high-level analysis of the results from the Qualtrics Python Skill Level Survey. The survey was designed to assess participants' Python skill levels by placing each respondent into one of two categories: "Beginner" or "Advanced". The analysis summarizes the findings and identifies which questions about Python programming errors and debugging best differentiate between these two skill levels.

## Step 0: Install Required Libraries

Before running the analysis, install the libraries listed in the `requirements.txt` file. You can do this by running the following command in your terminal:

```bash
pip install -r requirements.txt
```

In [1]:
%pip install -r requirements.txt

Collecting pandas==2.2.3 (from -r requirements.txt (line 1))
  Using cached pandas-2.2.3-cp312-cp312-macosx_11_0_arm64.whl.metadata (89 kB)
Collecting numpy>=1.26.0 (from pandas==2.2.3->-r requirements.txt (line 1))
  Downloading numpy-2.2.6-cp312-cp312-macosx_14_0_arm64.whl.metadata (62 kB)
Collecting pytz>=2020.1 (from pandas==2.2.3->-r requirements.txt (line 1))
  Downloading pytz-2025.2-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas==2.2.3->-r requirements.txt (line 1))
  Downloading tzdata-2025.2-py2.py3-none-any.whl.metadata (1.4 kB)
Using cached pandas-2.2.3-cp312-cp312-macosx_11_0_arm64.whl (11.4 MB)
Downloading numpy-2.2.6-cp312-cp312-macosx_14_0_arm64.whl (5.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.1/5.1 MB 61.3 MB/s eta 0:00:00
Downloading pytz-2025.2-py2.py3-none-any.whl (509 kB)
Downloading tzdata-2025.2-py2.py3-none-any.whl (347 kB)
Installing collected packages: pytz, tzdata, numpy, pa
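
Once the install finishes, it can be worth confirming which versions actually landed in the active environment. The following is a minimal sketch that uses only the standard library. The helper name `installed_version` is hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Packages named in the pip output above
for name in ("pandas", "numpy", "pytz", "tzdata"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If a package is reported as not installed, the notebook kernel is probably using a different environment from the one where `pip` ran.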

## Step 1: Load the Data

Before any analysis can be performed, the survey data must be loaded from a CSV file.

In [3]:
import pandas as pd

# Path to the Qualtrics CSV file (defaults to the notebook's directory; adjust as needed)
file_path = "./qualtrics_responses_labels_simplified.csv"

# 1. Load data (this file is expected to be semicolon-delimited; change `sep` if your export uses commas)
df = pd.read_csv(file_path, sep=';')
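
If you are unsure which delimiter a given export uses, the standard library's `csv.Sniffer` can guess it from a small text sample. The sketch below is illustrative only: the helper `sniff_delimiter` is hypothetical, and the sample rows are made up (only the column names come from this notebook).

```python
import csv


def sniff_delimiter(sample_text, candidates=";,\t"):
    """Guess the field delimiter from a text sample, limited to `candidates`."""
    return csv.Sniffer().sniff(sample_text, delimiters=candidates).delimiter


# Synthetic sample using column names from this notebook (values are made up)
sample = "Finished;Duration (in seconds);Q2.2\n1;300;2\n0;120;5\n"
print(repr(sniff_delimiter(sample)))
```

To apply this to the real file, read the first few kilobytes of `file_path` as text, pass them to the helper, and use the result as the `sep` argument of `pd.read_csv`.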

## Step 2: Keep Only Completed Surveys

The dataset may contain entries from respondents who did not complete the survey within the allowed two-week window. This step filters out those incomplete responses so that only fully completed surveys are analyzed.

In [None]:
# 2. Keep only respondents who finished the survey
completed = df[df["Finished"] == 1]
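
The filter above assumes that `Finished` is stored as the number 1. A label-style export might instead carry text such as `True`/`False`. Whether that applies to this file is an assumption, so the following is only a defensive sketch on synthetic data. The helper `finished_mask` is hypothetical.

```python
import pandas as pd


def finished_mask(flag_column):
    """Return a boolean mask that is True where a response counts as finished.

    Accepts 1/0, True/False, or their text forms; anything else
    (including missing values) is treated as not finished.
    """
    normalized = flag_column.astype(str).str.strip().str.lower()
    return normalized.isin({"1", "1.0", "true"})


# Synthetic data for illustration only
demo = pd.DataFrame({"Finished": [1, 0, "True", "False", None]})
completed_demo = demo[finished_mask(demo["Finished"])]
print(completed_demo)
```

Treating unknown values as unfinished is a conservative design choice: a malformed row is excluded from the analysis rather than silently counted as complete.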

## Step 3: Analyze the Data

In [None]:
# 3. Calculate the required metrics
metrics = {
    "Completion rate (%)": round(df["Finished"].mean() * 100, 2),
    "Average duration (sec)": completed["Duration (in seconds)"].mean(),
    "Average Python experience (years)": completed["Q2.2"].mean(),
    "Average general experience (years)": completed["Q2.3"].mean(),
    # Pearson correlation between years of Python experience and general experience
    "Correlation (Python vs general years)": completed[["Q2.2", "Q2.3"]].corr().iloc[0, 1],
    "Average estimated correct answers": completed["Q11.1"].mean(),
}

# 4. Present results in a tidy table
summary_df = pd.DataFrame(list(metrics.items()), columns=["Metric", "Value"])

try:
    # `ace_tools` is not a standard package and may be missing locally;
    # if it is available, show an interactive table
    from ace_tools import display_dataframe_to_user
    display_dataframe_to_user("Survey summary metrics", summary_df)
except ImportError:
    # Fallback to a simple printout
    print(summary_df.to_string(index=False))
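
Two of the metrics above rest on small identities: the mean of a 0/1 `Finished` flag equals the share of finished responses, and `DataFrame.corr()` computes Pearson's r by default. The sketch below checks both on synthetic numbers (these are not survey results). The helper `pearson_r` is hypothetical and written out only to make the formula visible.

```python
import math

import pandas as pd

# Synthetic values for illustration only; these are not survey results
demo = pd.DataFrame({
    "Finished": [1, 1, 0, 1],
    "Q2.2": [1.0, 2.0, 3.0, 4.0],
    "Q2.3": [2.0, 4.0, 5.0, 9.0],
})

# The mean of a 0/1 flag is the fraction of ones, i.e. the completion rate
completion_rate = round(demo["Finished"].mean() * 100, 2)


def pearson_r(xs, ys):
    """Pearson's r computed directly from its definition."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


manual_r = pearson_r(demo["Q2.2"].tolist(), demo["Q2.3"].tolist())
pandas_r = demo[["Q2.2", "Q2.3"]].corr().iloc[0, 1]
print(completion_rate, manual_r, pandas_r)
```

Note that `DataFrame.corr()` excludes missing values pairwise, so blank answers in `Q2.2` or `Q2.3` reduce the number of respondents the correlation is based on.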