# 1. Understand the Problem

## Introduction

In this lesson, we'll go through a procedure for breaking down programming problems.  This procedure is particularly valuable for interview questions, which oftentimes consist of logic as well as programming problems.  However, the techniques discussed here also apply to programming problems in general.

## The Problem

> **Problem**: Write a function that, given a list of strings, determines whether any of the strings are duplicated.

### 1. *Rephrase the question*

The first step to making sure you understand the problem at hand is to simply repeat it.  Ideally, you rephrase the problem a little, to make sure that you are capturing the important components.
 
**Why this is critical** As a developer, you will be given an assignment and then your work will be checked later.  You can waste a lot of time by working on the wrong problem, or by not doing the upfront work of considering the scope of what you are being asked to solve.  Hiring managers know this.  They have likely experienced firsthand developers who build things that were neither needed nor asked for.  Finding developers who take the time to understand the problem before solving it is critical.

**Now, what questions would you ask about this problem?**

Take a minute, and think about it. 

There does seem to be some ambiguity with this problem, so let's move forward with this procedure.

1. *Repeat the question:* "Ok, so you'll provide me with a list of strings, and I should return the list of duplicate strings."

> Notice the above has us rephrasing the problem with a focus on two different components: what are the inputs, and what is the output.  In the original question, the output was fairly vague.  But the input is perhaps vague as well -- will the list only ever contain strings, or could it also contain numbers?

It turns out the return value should be in the form of a dictionary.  We will be given a list of strings and will return a dictionary where each repeated string is a key, and its value is the list of indices at which that string appears.

```python
{'foobar': [2, 5], 'foobaz': [1, 3]}
```
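
To make the format concrete, here is a small sketch of one input that would map to the dictionary above. The list `words` and its filler entries are made up for illustration; they are not part of the original problem.

```python
# Hypothetical input, invented for illustration only.
words = ["qux", "foobaz", "foobar", "foobaz", "quux", "foobar"]

# The return value described above: each repeated string maps to
# the indices at which it appears in the input list.
expected = {"foobar": [2, 5], "foobaz": [1, 3]}

# Check that every listed index really holds the key's string.
for word, indices in expected.items():
    found = [i for i, w in enumerate(words) if w == word]
    assert found == indices
```

Strings that appear only once, such as `"qux"`, are left out of the dictionary entirely.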

### 2. *Determine the scope* 

#### A. Ask for the context: find out why

Now just repeating the question is a good first step.  But the problem may have even more requirements that we didn't consider.  

**Why this is critical** Software engineers often receive their work from product managers.  And the product managers may forget to mention some requirements.  For example, how much data are we working with?  Or maybe the issue of duplicate strings really isn't a problem at all.  Or maybe it's a symptom of a larger issue that we should be solving.

Here are a couple of good ways to ask why:

* Is there a specific use case I should be considering?
* Can you tell me a little bit more about how this feature may be used?

> **Interviewer**: We are trying to identify duplicate phone numbers -- so we can check whether we have users who are registering twice, or providing a fake phone number. 


#### B. Are there edge cases?

Edge cases are situations that occur when the data is a bit unexpected.  For example, when checking whether a phone number is unique, it seems like we should format the string first to remove any hyphens, or perhaps standardize country codes.

Here are a couple of typical edge cases to consider:

> * Negative numbers
> * An empty input
> * Capital letters vs lowercase letters

Notice that asking about the use case gave us some hints about some edge cases.
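
As a rough sketch of what handling the formatting edge case could look like, the helper below strips a phone number down to its digits before any comparison. The name `normalize_phone` and its `default_country_code` parameter are hypothetical, and the rule for dropping a leading country code is an assumption about US-style ten-digit numbers, not something the interviewer specified.

```python
import re


def normalize_phone(raw, default_country_code="1"):
    """Hypothetical helper: reduce a phone number string to its digits."""
    # Drop everything that is not a digit: hyphens, spaces, parentheses, "+".
    digits = re.sub(r"\D", "", raw)
    # Assumption: an 11-digit number that starts with the default country
    # code refers to the same phone as the 10-digit number without it.
    if len(digits) == 11 and digits.startswith(default_country_code):
        digits = digits[len(default_country_code):]
    return digits


print(normalize_phone("212-444-3321"))
print(normalize_phone("+1 (212) 444-3321"))
```

Under these assumptions both calls print `2124443321`, so the two spellings would be treated as the same number. In an interview, it is worth confirming rules like this one before relying on them.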

> **Why this is critical** Even if this does not change the problem at all, it still makes the interviewer aware that we are trying to cast a wide net, and consider the scope of the problem.  We also give the interviewer an opportunity to direct us toward certain areas.  If the interviewer wants us to skip over this phase and get straight to solving the problem, the interviewer will point us in that direction.

### 3. *Give an example of the problem and solution*

Ok, so now that we have a good understanding of the problem, it's time to make it concrete.  Provide an example of the input and the output. 

> If the interviewer provided us with an example, we would have used the provided example (as the interviewer may have chosen it for a reason).  Here, instead we come up with our own.

When choosing an example, it's a good idea to choose something complicated enough to avoid edge cases (like a single letter or zero letters), yet easy enough that our brains can solve it.  Generally, choosing three to six elements in a collection is fine.  Edge cases can be left to solve later on.

So with that in mind, let's choose the following:

```python
numbers = ["2124443321", "2158861321",
           "8564659988", "3121100845",
           "8564659988", "2124443321"]
```

And we can write the following function.

```python
def find_repeat(numbers):
    pass
```

Which should produce the following output.

```python
{'2124443321': [0, 5], '8564659988': [2, 4]}
```
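
For reference, here is one possible way the stub could eventually be filled in. This is only a minimal sketch, assuming the inputs are already clean strings (no hyphens or country codes); the use of `defaultdict` is one choice among several, not the lesson's prescribed solution.

```python
from collections import defaultdict


def find_repeat(numbers):
    """Sketch: map each repeated string to the indices where it appears."""
    positions = defaultdict(list)
    # Record every index at which each string occurs.
    for index, number in enumerate(numbers):
        positions[number].append(index)
    # Keep only the strings that occur more than once.
    return {number: indices
            for number, indices in positions.items()
            if len(indices) > 1}


numbers = ["2124443321", "2158861321",
           "8564659988", "3121100845",
           "8564659988", "2124443321"]

print(find_repeat(numbers))
# {'2124443321': [0, 5], '8564659988': [2, 4]}
```

Having the example written down first means we can check a candidate solution against it right away, before moving on to the edge cases.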

### Summary 



In this lesson, we discussed techniques for understanding a problem.  We described a procedure that involves the following steps:

1. Rephrase/repeat the problem

Here, we are making sure that we absorbed the initial description, and are not leaving out critical details.  Rephrasing the problem oftentimes provides more of a check on this than simply repeating the problem, because it forces us to filter for relevant details.

2. Determine the scope

A. What's the use case?

By asking about the use case, we may learn some context that will allow us to better understand the problem, which could make the problem more or less complicated to solve.  

B. Consider edge cases

Edge cases are those scenarios that may be atypical -- often in terms of the data we receive.  Typical edge cases are empty inputs, negative numbers, poorly formatted data, or capital letters.  

3. Give an example

Once we understand the problem, the next step is to provide an example input and solution. 

We want to choose an example that applies to the use case, but avoids edge cases and is easy enough that our brains can solve it.  Generally, choosing three to six elements in a collection is fine.