# FORMATTING PRACTICES FOR EXERCISES IN CP2024

This document establishes formatting practices that make automatic processing of exercises with scripts fast and easy.

# Exercise Notebooks

## Markdown cells


Markdown cells are assumed to always be shown to students. To add comments (which are not visible when the cell is rendered) to a markdown cell, add:

[comment]: <> (An actual comment which can't be seen unless editing)

```[comment]: <> (This is a comment, it will not be included)```
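
A script that processes markdown cells may want to drop these comment lines before showing the text elsewhere. The following is a minimal sketch, assuming each comment sits on its own line in exactly the form shown above; the function name `strip_comments` is hypothetical and not part of any existing script.

```python
import re

# Matches a whole line of the form: [comment]: <> (any text)
COMMENT_LINE = re.compile(r"^\[comment\]: <> \(.*\)\s*$")


def strip_comments(markdown_source: str) -> str:
    """Return the markdown source without [comment]: <> (...) lines."""
    kept = [
        line
        for line in markdown_source.splitlines()
        if not COMMENT_LINE.match(line)
    ]
    return "\n".join(kept)


cell = (
    "Solve the task below.\n"
    "[comment]: <> (Remember to update this next year)\n"
    "Good luck!"
)
print(strip_comments(cell))
```

Lines where the comment syntax is wrapped in backticks do not start with `[comment]`, so they are kept as visible text.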

To format text as generic code (on its own lines), you can use:
```
# Here is a string with multiple spaces
my_string = "Hello   World"
print(my_string)
```
To format text as Python code, you can use:
```python
# Here is a string with multiple spaces
my_string = "Hello   World"
print(my_string)
```
Note that fenced code blocks preserve multiple spaces. Inline code can be shown with e.g. `"i am code"`, although inline code collapses multiple spaces: `my_string = "Hello   World"`.

## Code cells

All code cells should begin with a type specifier on the first line. This line should be one of the following:

| Specifier   |  Desc.  |
| ----------- | ------- |
| ##REPL      |  Read-eval-print loop: the code in these cells is evaluated so that students can see the output. If the result is not assigned to a variable, the value is shown.|
| ##SOLUTION  |  A hidden code cell used to define solutions. If a solution cell comes before a REPL cell in the same exercise, the variables it defines should be accessible in the REPL cell, e.g. for unit test displays. |
| ##SNIPPET   |  A code cell which is meant to be copy-pasted by students  |
| ##CODEBOX   |  A code cell which is NOT meant to be copy-pasted by students |
| ##DOCUMENTATION | A code cell which contains documentation for a function, so that the students can see it. |
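
As an illustration of how a script could read the specifier, here is a minimal sketch. It assumes the specifier is exactly the first line of the cell source and matches one of the names in the table; the function name `split_specifier` is hypothetical.

```python
VALID_SPECIFIERS = {"REPL", "SOLUTION", "SNIPPET", "CODEBOX", "DOCUMENTATION"}


def split_specifier(cell_source: str) -> tuple[str, str]:
    """Split a code cell into (specifier name, remaining code)."""
    first_line, _, body = cell_source.partition("\n")
    first_line = first_line.strip()
    if not first_line.startswith("##"):
        raise ValueError("code cell does not begin with a type specifier")
    name = first_line[2:]
    if name not in VALID_SPECIFIERS:
        raise ValueError(f"unknown type specifier: {first_line}")
    return name, body


name, body = split_specifier("##REPL\nx = 1\nx")
print(name)  # REPL
print(body)
```

Raising an error on a missing or unknown specifier makes formatting mistakes visible early instead of silently treating the cell as some default type.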

# Quiz JSON File


## Homebrewed questions


The quiz JSON file is supposed to contain around 40 weekly-quiz questions. It stores a list of dictionaries, where each dictionary is a question in the following format:
```json
{
    "id": 7,
    "status": "success",
    "question": "What is stored in the variable `my_string` when the code is run?",
    "code": "my_string = 'ababa_aba'.replace('aba', 'ccc')",
    "options_are_code": true,
    "options": {
        "a": "'cccba_ccc'",
        "b": "'ccc_ccc'",
        "c": "'abccc_ccc'",
        "d": "'ababa_ccc'"
    },
    "subject": "string_methods_replace",
    "type": ""
}
```
| Field       |  Desc.  |
| ----------- | ------- |
| id          |  A unique question id.  |
| status      |  If the question has been processed, the result of the processing is written in this field. It is one of `"missing"` (there is no question), `"success"`, `"error_extract"` (an error occurred while extracting the fields from the dictionary), or `"error_format"` (an error occurred while formatting the question as markdown).  |
| question       |  The question for the quiz (in markdown).|
| code        |  Python code which is displayed for students after the question. |
| options_are_code |  If true, then options will be displayed with code-like formatting when formatting as markdown.|
| options       |  The answer options, stored as a dictionary. Usually keyed by `"a"`, `"b"`, `"c"`, `"d"`. |
| subject       |  A string describing which subject the question is about. |
| type        |  An optional string to categorize questions. E.g. if half the questions have `type=="at_home"` then this can be used to extract a specific set of questions.  |
| gt        |  Key of the correct option (the ground truth). If the question does not contain a `gt` key, it is assumed that `gt="a"`. |
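
The sketch below shows how the `gt` default and the `type` field could be used when reading the file. It parses an inline JSON string instead of a real file to stay self-contained. The second question (id 8) and the helper names `ground_truth` and `select_by_type` are made up for illustration.

```python
import json

# Two questions in the format above; the second one is a made-up example.
RAW = """
[
  {
    "id": 7,
    "question": "What is stored in the variable `my_string` when the code is run?",
    "code": "my_string = 'ababa_aba'.replace('aba', 'ccc')",
    "options_are_code": true,
    "options": {"a": "'cccba_ccc'", "b": "'ccc_ccc'", "c": "'abccc_ccc'", "d": "'ababa_ccc'"},
    "subject": "string_methods_replace",
    "type": ""
  },
  {
    "id": 8,
    "question": "What does the code print?",
    "code": "print(len('abc'))",
    "options_are_code": true,
    "options": {"a": "2", "b": "3", "c": "4", "d": "'abc'"},
    "subject": "string_len",
    "type": "at_home",
    "gt": "b"
  }
]
"""

questions = json.loads(RAW)


def ground_truth(question: dict) -> str:
    """Return the key of the correct option, defaulting to "a"."""
    return question.get("gt", "a")


def select_by_type(all_questions: list, type_name: str) -> list:
    """Return the questions whose "type" field equals type_name."""
    return [q for q in all_questions if q.get("type", "") == type_name]


print(ground_truth(questions[0]))  # a
print(ground_truth(questions[1]))  # b
print([q["id"] for q in select_by_type(questions, "at_home")])  # [8]
```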

## Prompted questions

To create questions with little effort, you can prompt ChatGPT for questions. There are two options for doing this. The first is to familiarize ChatGPT with the format and then have it produce questions following that format. An example prompt for each option is shown below.

The second option is to prompt ChatGPT to generate questions without telling it the format. The script in `utils.py` has functionality to automatically find the question, code, and options in the raw copied response.

| Prompt 1    |  Prompt 2   |
| ----------- | ----------- |
| Hi, Take a look at the following json data. It represents the data structure used to store python programming exercises. Option a) is always set as the correct answer. Things in the "code" field are formatted as python code, the question is formatted as markdown. options are formatted as code if "options_are_code" is set as true and otherwise markdown. <br/> {"id": 7,<br/>"status": "",<br/>"question": "What is stored in the variable my_string after the code is run?",<br/>"code": "my_string = 'ababa_aba'.replace('aba', 'ccc')",<br/>"options_are_code": true,<br/>"options": {<br/>"a": "'cccba_ccc'",<br/>"b": "'ccc_ccc'",<br/>"c": "'abccc_ccc'",<br/>"d": "'ababa_ccc'"<br/>},<br/>"subject": "",<br/>"type": ""} <br/> Please make a list of 10 new exercises. The subject should be strings. The exercises should not use any imports, and the questions should be simple and short. Questions should be multiple choice with options a-d. Most of the questions should be directly related to e.g. a line of code (with a few questions simply natural language about the subject). Make the questions require basic understanding of the code, but do not make questions which could obviously be answered e.g. based only on what name a method has. | Make 20 multiple choice questions about python programming on the subject: strings. The exercises should not use any imports, and the questions should be simple and short. Questions should be multiple choice with options a-d. Most of the questions should be directly related to e.g. a line of code (with a few questions simply natural language about the subject). Make the questions require basic understanding of the code, but do not make questions which could obviously be answered e.g. based only on what name a method has. <br/><br/> Please make the questions varied in format and difficulty. Some should be a couple of lines of code, some only 1. Make the order of options such that A is always the correct answer. If the multiple choice option is a string, format with single quotation marks, like 'some string'. |

If the second method is used, the copied text usually needs a small piece of code before it can be pasted into the JSON file. The cell below replaces double quotes with single quotes, and its output shows line breaks as `\n`:

```python
s = """What is the output of the following code?

python
Copy code
s = "python"
print(s[2:5])
a) 'tho'
b) 'pyth'
c) 'pyt'
d) 'ytho'"""
s.replace("\"","'")
```

Output:

```
"What is the output of the following code?\n\npython\nCopy code\ns = 'python'\nprint(s[2:5])\na) 'tho'\nb) 'pyth'\nc) 'pyt'\nd) 'ytho'"
```

The cell output (excluding the quotation marks `"` at the first and last character) can be copied into a question in the field `chatgpt_text`. With this method, none of the normal question fields (`question`, `code`, `options`) need to be filled in, because the script in `utils.py` extracts them automatically.
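
To give an idea of what such an extraction involves, here is a minimal sketch. It is not the code in `utils.py`: the function name `extract_fields` is hypothetical, and it assumes the copied text has the layout of the example above, with a `Copy code` line before the code and options written as `a)` to `d)` at the start of a line.

```python
import re

# An option line looks like: a) 'tho'
OPTION_LINE = re.compile(r"^([a-d])\)\s*(.*)$")


def extract_fields(chatgpt_text: str) -> dict:
    """Split raw copied text into question, code and options."""
    options = {}
    body = []
    for line in chatgpt_text.split("\n"):
        match = OPTION_LINE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
        else:
            body.append(line)

    if "Copy code" in body:
        # Everything before the marker is the question, the rest is code.
        marker = body.index("Copy code")
        question_lines = body[:marker]
        code_lines = body[marker + 1:]
        # Drop the language label that precedes "Copy code".
        if question_lines and question_lines[-1].strip() == "python":
            question_lines = question_lines[:-1]
    else:
        question_lines, code_lines = body, []

    return {
        "question": "\n".join(question_lines).strip(),
        "code": "\n".join(code_lines).strip(),
        "options": options,
    }


text = (
    "What is the output of the following code?\n\npython\nCopy code\n"
    "s = 'python'\nprint(s[2:5])\na) 'tho'\nb) 'pyth'\nc) 'pyt'\nd) 'ytho'"
)
fields = extract_fields(text)
print(fields["question"])
print(fields["code"])
print(fields["options"])
```

A sketch like this breaks if a line of code happens to start with `a)` to `d)`, so real input would need more careful handling.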