# EXCEED Python Qualtrics Survey - Detailed Analysis

This notebook provides a detailed analysis of the EXCEED Python Qualtrics survey data. It covers data loading, cleaning, and several analyses of the survey responses. The ultimate goal is to identify the subset of questions that best differentiates Python _beginner_/_novice_ users from _advanced_/_expert_ users.

The Qualtrics survey is composed of the blocks listed below, and each participant must answer every question shown to them. For blocks 3 to 10, each respondent is given a random subset of 2 out of the 7 available questions, so each respondent answers 16 survey-related questions in total (8 blocks × 2 questions). The blocks are as follows:
1. **Consent Form**: Participants agree to take part in the survey. If they do not agree, they are redirected to the end of the survey.
2. **Self-assessment**:
    1. **Python Experience**: Participants self-assess their Python experience using Dreyfus levels.
    2. **Python Programming YoE**: Participants indicate their years of experience with Python programming.
    3. **General Programming YoE**: Participants indicate their years of experience with programming in general.
3. **General Programming Error Understanding**: 7 questions
4. **Python-Specific General Error Understanding**: 7 questions
5. **Code Reading / Understanding**: 7 questions
6. **Error Message Comprehension**: 7 questions
7. **Error Resolution**: 7 questions
8. **Error Message Comprehension**: 7 questions
9. **Natural Language Scenarios**: 7 questions
10. **Miscellaneous Questions - Various Complexity & Scope**: 7 questions
11. **Self-Assessment of Results**: Participants self-assess the number of questions they answered correctly in the survey (0 to 16).
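
The per-respondent question count follows from the block structure: 8 question blocks, each contributing 2 of its 7 questions. The sketch below simulates that draw for one respondent. The block labels, the question numbering, and the `sample_questions` helper are hypothetical and only illustrate the arithmetic; they do not reproduce how Qualtrics performs the randomization.

```python
import random

# Hypothetical labels for question blocks 3 to 10; not the survey's real identifiers.
QUESTION_BLOCKS = [f"block_{i}" for i in range(3, 11)]
QUESTIONS_PER_BLOCK = 7  # questions available in each block
SAMPLED_PER_BLOCK = 2    # questions shown to each respondent per block


def sample_questions(seed=None):
    """Draw 2 of the 7 question numbers in every block for one simulated respondent."""
    rng = random.Random(seed)
    return {
        block: sorted(rng.sample(range(1, QUESTIONS_PER_BLOCK + 1), SAMPLED_PER_BLOCK))
        for block in QUESTION_BLOCKS
    }


assignment = sample_questions(seed=0)
total_questions = sum(len(questions) for questions in assignment.values())
print(total_questions)  # 8 blocks * 2 questions = 16
```

Because different respondents see different questions, any per-question statistic is computed over only the respondents who were shown that question.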

## Step 0: Install Required Libraries

Before running the analysis, install the libraries listed in the `requirements.txt` file. You can do this by running the following command in your terminal:

```bash
pip install -r requirements.txt
```

In [None]:
%pip install -r requirements.txt

## Step 1: Load the data

For this step, we need to load 2 CSV files:
1. `survey_results.csv`: Contains the survey responses and some other metadata
2. `survey_answers.csv`: Contains the correct answers for the survey questions

By default, both files are expected in the same directory as this notebook. If your files are stored elsewhere, change the file paths accordingly.

In [None]:
import pandas as pd

file_path_survey_results = "./survey_results.csv"
file_path_survey_answers = "./survey_answers.csv"

# Step 1. Load data (files are semicolon-delimited)
df_results = pd.read_csv(file_path_survey_results, sep=';')
df_answers = pd.read_csv(file_path_survey_answers, sep=';')
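
The loading code assumes both files use `;` as the separator. If an export might use a different delimiter, one way to check before loading is the standard library's `csv.Sniffer`. The sketch below is a minimal illustration: the `detect_delimiter` helper and the sample text are hypothetical and not taken from the real survey export.

```python
import csv


def detect_delimiter(sample: str, candidates: str = ";,\t") -> str:
    """Guess the delimiter of CSV text, restricted to the candidate characters."""
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter


# Hypothetical sample rows; the real column names may differ.
sample = "ResponseId;Q1;Q2\nR_001;A;C\nR_002;B;D\n"
delimiter = detect_delimiter(sample)
print(repr(delimiter))  # ';'
```

For a real file, the sample could be the first few kilobytes of the file's text, and the detected value could then be passed as `sep` to `pd.read_csv`. `csv.Sniffer` is a heuristic, so it is best treated as a sanity check rather than a guarantee.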