## 5.1 Problem definition

The first stage of the process is to read the informal problem description and
translate it into a succinct but precise problem definition.
The first question to ask yourself is:

- What type of problem is it? Is it asking to
  - check a property of the input? (decision problem)
  - assign a category to each input? (classification problem)
  - find one or more items satisfying some criteria? (search problem)

Remember that a search problem may not be stated as such.
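
As a rough illustration of the three problem types, here is a minimal sketch with one small function per type. The function names and the problems themselves are hypothetical examples chosen for this sketch. Note that `largest` is a search problem even though its description never uses the word 'search'.

```python
def is_even(number: int) -> bool:
    """Decision problem: check a property of the input."""
    return number % 2 == 0


def sign(number: int) -> str:
    """Classification problem: assign one of three categories to the input."""
    if number < 0:
        return "negative"
    if number == 0:
        return "zero"
    return "positive"


def largest(numbers: list) -> int:
    """Search problem: find the item that no other item exceeds.

    Assumes that numbers is not empty.
    """
    result = numbers[0]
    for item in numbers:
        if item > result:
            result = item
    return result
```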

M269 problems are operations on some data, so the next question is:

- Is the operation a function in the mathematical sense,
  i.e. does it leave its inputs unmodified?

If it is, fill in this template:

**Function**: the name of the function\
**Inputs**: the name and type of each input variable\
**Preconditions**: any conditions on the inputs\
**Output**: the name and type of the output variable\
**Postconditions**: how the output relates to the inputs

If it isn't, fill in this one instead:

**Operation**: the name of the operation\
**Inputs/Outputs**: the name and type of each variable that is modified\
**Inputs**: the name and type of each input variable\
**Preconditions**: any conditions on the inputs\
**Output**: the name and type of the output variable\
**Postconditions**: how the outputs relate to the inputs

In the second template, use pre-*x* and post-*x* in the postconditions to refer
to the value of input/output variable *x* before and after the operation,
respectively.
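
To see how the two templates differ, here is a minimal sketch that fills in each one as a docstring and pairs it with a possible implementation. Both problems (`occurrences` and `remove_last`) are hypothetical examples, and writing the template in a docstring is just one way to keep the definition next to the code.

```python
from collections.abc import Sequence


def occurrences(items: Sequence, target: object) -> int:
    """Function: occurrences
    Inputs: items, a sequence; target, an object
    Preconditions: true (no conditions on the inputs)
    Output: count, an integer
    Postconditions: count is the number of items equal to target
    """
    count = 0
    for item in items:
        if item == target:
            count = count + 1
    return count


def remove_last(items: list) -> object:
    """Operation: remove last
    Inputs/Outputs: items, a list
    Inputs: none
    Preconditions: items is not empty
    Output: removed, an object
    Postconditions: removed is the last item of pre-items;
        post-items is pre-items without its last item
    """
    removed = items.pop()  # modifies items in place
    return removed
```

The first definition is a function because `items` and `target` are left untouched. The second isn't, because `items` is shorter after the call, so its postconditions need pre-items and post-items to relate the two states.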

Ask yourself the following questions to fill out the templates.

### 5.1.1 Problem and output names

The problem name states what the function or operation does or produces.
The names of the problem and output variable are often similar or related.

- What does the function/operation do? What does it produce?
- If it's a decision problem, what yes/no question does it ask of the input?
- If it's a search problem, what is being looked for?

### 5.1.2 Inputs and outputs

M269 problems usually have at most one input/output variable,
which must be a mutable sequence, and exactly one output-only variable.

- In the problem description, what is given (input), what is asked for (output)
  and what changes (input/output)?
- Thinking backwards, what data is needed to compute the function's output?
- If it's a decision problem then the output is a Boolean, unless the description
  says otherwise. What does it represent when it's true?
- If it's a search problem, what kind of input sequence is searched?
- If it's a classification problem, what are the possible categories?
- For each variable, what is its type?

If the value is ... | then use type ...
-|-
a number  | integer or real number
a logical value  | Boolean
text  |  string
any sequence  |  sequence
an immutable sequence  | tuple
a mutable sequence  |  list
any value  |  object
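
If you implement the problem in Python, the types in the table could be mapped to Python types as in the sketch below. The mapping and the names `PYTHON_TYPES` and `has_type` are assumptions made for illustration. In particular, using `float` for real numbers and the abstract class `Sequence` for any sequence is one possible choice.

```python
from collections.abc import Sequence

# Hypothetical lookup table: type in the problem definition -> Python type.
PYTHON_TYPES = {
    "integer": int,
    "real number": float,
    "Boolean": bool,
    "string": str,
    "sequence": Sequence,
    "tuple": tuple,
    "list": list,
    "object": object,
}


def has_type(value: object, type_name: str) -> bool:
    """Return True if value is an instance of the Python type for type_name."""
    return isinstance(value, PYTHON_TYPES[type_name])
```

For example, `has_type((1, 2), "sequence")` is true but `has_type((1, 2), "list")` is false, because a tuple is a sequence but not a mutable one.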

### 5.1.3 Preconditions

Not all of the following questions apply to every problem,
but they help you avoid missing some typical preconditions.

- If the input is an integer:
  - What are its smallest and largest values?
  - Can it be zero? Can it only be positive, or negative, or odd?
  - Does it have a unit?
- If the input is a sequence:
  - Can it be empty? Does it have a minimum or maximum length?
  - Is it sorted? By which criterion? Is the order ascending or descending?
  - Must the items be pairwise comparable?
    This may be necessary for search problems, e.g. to find the largest item.
  - Are items unique or can there be duplicates?
  - What are the preconditions on the items? For example:
      - If the input is a list of integers, must they be positive?
      - If the input is a string, can it only contain certain characters?
- Do the preconditions include output-only variables? If so, something's wrong
  because preconditions only restrict the inputs.
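
Once the preconditions are written down, they can often be turned into an executable check. The sketch below does this for a hypothetical problem whose single input must be a non-empty list of positive integers, in ascending order and without duplicates. The function name and the choice of preconditions are assumptions made for illustration.

```python
def satisfies_preconditions(items: list) -> bool:
    """Check the assumed preconditions of a hypothetical problem.

    Assumed preconditions: items is a non-empty list of positive integers,
    in ascending order and without duplicates.
    """
    if len(items) == 0:
        return False
    for item in items:
        if not isinstance(item, int) or item <= 0:
            return False
    # Strictly ascending order covers both 'sorted' and 'no duplicates'.
    for index in range(1, len(items)):
        if items[index - 1] >= items[index]:
            return False
    return True
```

Such a check only looks at the input, never at the output, in line with the last question above.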

### 5.1.4 Postconditions

- If it's a decision problem, under which conditions is the output true?
- If it's a classification problem then, for each category, what are the conditions
  for the input to belong to that category?
- If it's a search problem, what are the search criteria?
- If an input is a sequence:
  - What happens if it's empty?
  - What happens if it has odd length? What if it has even length?
- If the output is a sequence:
  - Can it be empty? Does it have a minimum or maximum length?
  - Does it have to be a subsequence of an input sequence?
  - Does it have to be sorted?
  - What are the postconditions for its items?
- Does every input occur in the postconditions?
  If not, the postconditions are still incomplete or
  the input can be removed, as it's not necessary to compute the output.
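
Postconditions state what a correct output looks like, not how to compute it, so they can be checked independently of any algorithm. The sketch below assumes a hypothetical sorting problem with two postconditions: the output has the same items as the input, and the output is in ascending order. The function name is made up for this example, and `Counter` requires the items to be hashable.

```python
from collections import Counter


def satisfies_postconditions(items: list, output: list) -> bool:
    """Check the assumed postconditions of a hypothetical sorting problem."""
    # Postcondition 1: output has the same items as the input, equally often.
    same_items = Counter(items) == Counter(output)
    # Postcondition 2: each item is at most as large as the next one.
    ascending = all(
        output[index - 1] <= output[index] for index in range(1, len(output))
    )
    return same_items and ascending
```

Both postconditions are needed: without the first, the empty list would be a valid output for every input; without the second, the unsorted input itself would be accepted.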

### 5.1.5 Test table

Each test is a problem instance and its expected output.
Depending on the pre- and postconditions, some of the following questions may not apply,
e.g. there may not be a largest input or output, or the input sequence can't have
items of different types.

- Does the test table have one column for the test case description,
  one column per input and one column for the output?
- Do the inputs of each test satisfy the preconditions?
  If not, you have to revise the test or the preconditions.
- Do the outputs of each test satisfy the postconditions?
  If not, you have to revise the test or the postconditions.
- Are there tests for the smallest and for the largest possible inputs?
- Are there tests for the smallest and for the largest possible output?
- For an input sequence, are there tests both for even and odd lengths,
  for different types of items, for sorted and unsorted sequences,
  and for sequences with no, some or all items of the same value?
- For a classification problem, is there a test for each category boundary?
- For a decision problem, are there tests with true and false outputs?
- For a search problem, are there tests for no, one, multiple and all items
  matching the search criteria?
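
A test table can be written directly as data, with one tuple per row. The sketch below does this for a hypothetical decision problem, `is_ascending`, and uses a small helper, `failed_tests`, to run every row. Both names and the chosen tests are assumptions made for illustration, not a prescribed testing tool.

```python
def is_ascending(items: list) -> bool:
    """Decision problem: are the items in ascending order?"""
    for index in range(1, len(items)):
        if items[index - 1] > items[index]:
            return False
    return True


# Columns: test case description, the single input, the expected output.
TEST_TABLE = [
    ("empty list (smallest input)", [], True),
    ("one item", [5], True),
    ("even length, ascending", [1, 2], True),
    ("odd length, not ascending", [1, 3, 2], False),
    ("all items the same", [4, 4, 4], True),
    ("descending", [3, 2, 1, 0], False),
]


def failed_tests(function, test_table: list) -> list:
    """Return the descriptions of the tests for which function fails."""
    failures = []
    for description, argument, expected in test_table:
        if function(argument) != expected:
            failures.append(description)
    return failures


print(failed_tests(is_ascending, TEST_TABLE))
```

The table includes true and false outputs, even and odd lengths, and the smallest possible input. There is no test for the largest input because a list has no maximum length under these preconditions.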
