# Ensuring Correct Code

## Introduction

We've seen a lot of code so far, and you'll be writing more of it as well. How do we solve problems using code? Here's one methodology:-

1. Write some code
2. Code works
3. You're done!

If you can do the above, congratulations, you can skip everything below and pack up and go home! If you suspect it's not as straightforward though... that's what we're covering in this notebook.

## Problem-solving with Code

Let's try thinking about our problem in this way:-

1. What do I have? *Input*
2. What do I want to get? *Output*
3. How do I get from 1. to 2.?
4. For each step in 3., repeat all these steps

As a concrete example, let's consider a phone-book application. We'll think about the problem as above for one iteration:-

1. What do I have?
  - Some phone numbers and names

2. What do I want to get?
  - If I have a name, I should be able to find the phone number. If I have a phone number, I should be able to find the name.

3. How do I get from 1. to 2.?
  - My phone numbers and names must be stored and linked
  - I must be able to search through all numbers
  - I must be able to search through all names

4. For each step in 3., repeat all these steps

## Breaking it down further

After one iteration it is clear we're not done. We've identified three steps (answers to question 3), and each of them needs to be looked at separately. Let's take the first one, which says "My phone numbers and names must be stored and linked". Let's assume the phone numbers and names are given as lists, as below:-

In [None]:
names = ['Dong Ah Wei', 'Shi Rong Hai', 'Qiang Lin Guo',
         'An Dong Hai', 'Gang An Wen', 'Tu Jun Wu',
         'Huang Ping Qiu', 'Zhou Zheng Huang', 'Shui Yi Qing',
         'Bai Tai Qing', 'Chang Min Jin', 'Bo Jian Hung',
         'Shi  Zheng  Cheng', 'Heng Hai Hua', 'Jin Zan Rong',
         'Guo Da Dong', 'Yong Li Hua', 'Min Wen Ling',
         'Lin Su Jing', 'Zan Tu Jiang', 'Rong Ming Xue']
numbers = ['012-503 5290', '012-718 9095', '017-462 4563',
           '016-868 2837', '013-743 8070', '017-360 0399',
           '019-981 5621', '013-802 8090', '017-152 4110',
           '012-716 1836', '012-448 7629', '019-466 7718',
           '017-954 7798', '016-665 9924', '017-379 1819',
           '013-410 3699', '017-772 8574', '013-102 9685',
           '013-839 4659', '016-339 5731', '019-953 6751']
# All names and numbers are randomly generated and totally fictional.

Right now, we have to make a choice. We could combine both lists into a list of lists. We could use a NumPy 2D array. We could use a pandas DataFrame.

There is normally no one right answer. Also, the choice you make can be changed later on with a bit of work. For now, just choose any one of the above, and write code to convert the 2 lists into your chosen format.
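
As a point of comparison, the first option (a list of lists) needs no extra libraries. The following is a minimal sketch, assuming a shortened three-entry copy of the lists above; the names `sample_names`, `sample_numbers` and `phone_book` are illustrative and not part of the original notebook.

```python
# Shortened copies of the lists above, so this cell runs on its own.
sample_names = ['Dong Ah Wei', 'Shi Rong Hai', 'Qiang Lin Guo']
sample_numbers = ['012-503 5290', '012-718 9095', '017-462 4563']

# zip() pairs the items position by position;
# each pair becomes one [name, number] row.
phone_book = [[name, number] for name, number in zip(sample_names, sample_numbers)]

# Each row keeps a name linked to its number.
for row in phone_book:
    print(row)
```

Here `phone_book[1][0]` is the second name and `phone_book[1][1]` is the number linked to it.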

In [None]:
# Empty cell to be filled in by student

import numpy as np

# Convert the 2 lists into a 2D NumPy array
my_data_2D_numpy_array = np.array(list(zip(names, numbers)))

# Display
print(my_data_2D_numpy_array)

[['Dong Ah Wei' '012-503 5290']
 ['Shi Rong Hai' '012-718 9095']
 ['Qiang Lin Guo' '017-462 4563']
 ['An Dong Hai' '016-868 2837']
 ['Gang An Wen' '013-743 8070']
 ['Tu Jun Wu' '017-360 0399']
 ['Huang Ping Qiu' '019-981 5621']
 ['Zhou Zheng Huang' '013-802 8090']
 ['Shui Yi Qing' '017-152 4110']
 ['Bai Tai Qing' '012-716 1836']
 ['Chang Min Jin' '012-448 7629']
 ['Bo Jian Hung' '019-466 7718']
 ['Shi  Zheng  Cheng' '017-954 7798']
 ['Heng Hai Hua' '016-665 9924']
 ['Jin Zan Rong' '017-379 1819']
 ['Guo Da Dong' '013-410 3699']
 ['Yong Li Hua' '017-772 8574']
 ['Min Wen Ling' '013-102 9685']
 ['Lin Su Jing' '013-839 4659']
 ['Zan Tu Jiang' '016-339 5731']
 ['Rong Ming Xue' '019-953 6751']]


In [None]:
import pandas as pd

# Convert the 2 lists into a DataFrame
my_data_dataframe = pd.DataFrame({'Name': names, 'Number': numbers})

# Display
print(my_data_dataframe)

                 Name        Number
0         Dong Ah Wei  012-503 5290
1        Shi Rong Hai  012-718 9095
2       Qiang Lin Guo  017-462 4563
3         An Dong Hai  016-868 2837
4         Gang An Wen  013-743 8070
5           Tu Jun Wu  017-360 0399
6      Huang Ping Qiu  019-981 5621
7    Zhou Zheng Huang  013-802 8090
8        Shui Yi Qing  017-152 4110
9        Bai Tai Qing  012-716 1836
10      Chang Min Jin  012-448 7629
11       Bo Jian Hung  019-466 7718
12  Shi  Zheng  Cheng  017-954 7798
13       Heng Hai Hua  016-665 9924
14       Jin Zan Rong  017-379 1819
15        Guo Da Dong  013-410 3699
16        Yong Li Hua  017-772 8574
17       Min Wen Ling  013-102 9685
18        Lin Su Jing  013-839 4659
19       Zan Tu Jiang  016-339 5731
20      Rong Ming Xue  019-953 6751


## Using functions for some tasks

Functions are useful for abstraction: they let us represent a block of work as a single named call instead of multiple lines of code. For example, the conversion you just did could be put in a function (you should try this if you have some time).
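
To illustrate the idea, here is a rough sketch of the conversion wrapped in a function. It assumes the list-of-lists format and a hypothetical function name, `combine_lists`; the same pattern would apply to a NumPy array or a DataFrame.

```python
def combine_lists(names, numbers):
    """Link each name to the number at the same position."""
    # zip() would silently drop entries if the lists differed in length,
    # so refuse mismatched lists up front.
    if len(names) != len(numbers):
        raise ValueError("names and numbers must have the same length")
    return [[name, number] for name, number in zip(names, numbers)]


# The whole conversion is now a single, reusable line.
small_book = combine_lists(['Dong Ah Wei', 'Shi Rong Hai'],
                           ['012-503 5290', '012-718 9095'])
print(small_book)
```

The length check is a design choice for this sketch, not a requirement of the exercise: it turns a quiet data-loss bug into an immediate error.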

For now, let's write a function to solve our second step, which says "I must be able to search through all numbers". Running through our questions, the first question (what do I have?) is easy, but the second question (what do I want to get?) needs defining.

In summary, I expect that I should be able to run something like:-

    name = my_function(a_phone_number)
    
If I provide a phone number to the function, the function should return the name linked to that number. This seems like a really good reason to write a function. Perhaps we can call it `search_for_number`.

In [None]:
def search_for_number():
    pass

The cell above defines a function, but it doesn't do anything! You should decide what input(s) you want the function to have. You should then write some code to find the matching name for the number passed to the function. Remember to `return` the answer at the end. Remember as well that the function should in the end behave as in the example above!

In [None]:
# Empty cell to be filled in by student
def search_for_number_array(phone_number):
    # Each entry is a [name, number] row of the 2D array
    for entry in my_data_2D_numpy_array:
        if entry[1] == phone_number:
            return entry[0]
    return "Number not found"


In [None]:
def search_for_number_dataframe(phone_number):
    # Build a boolean mask that is True where 'Number' equals phone_number,
    # then use it to keep only the matching rows.
    result = my_data_dataframe[my_data_dataframe['Number'] == phone_number]
    # Show the matching rows (an empty DataFrame if there is no match).
    print(result)
    # result.empty is True when no rows matched.
    if not result.empty:
        # iloc[0] selects the first matching row; ['Name'] reads its 'Name' value.
        return result.iloc[0]['Name']
    # No matching phone number was found.
    return "Number not found"

We need to test whether our functions do what we want.

In [None]:
# Empty cell to be filled in by student
print(search_for_number_array('016-339 5731'))
print(search_for_number_array('019-953 6751'))
print(search_for_number_array('000-000 0000'))  # What should happen here?

Zan Tu Jiang
Rong Ming Xue
Number not found


In [None]:
# Empty cell to be filled in by student
print(search_for_number_dataframe('016-339 5731'))
print(search_for_number_dataframe('019-953 6751'))
print(search_for_number_dataframe('000-000 0000'))  # What should happen here?

            Name        Number
19  Zan Tu Jiang  016-339 5731
Zan Tu Jiang
             Name        Number
20  Rong Ming Xue  019-953 6751
Rong Ming Xue
Empty DataFrame
Columns: [Name, Number]
Index: []
Number not found
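
Reading printed output by eye works for three calls, but it does not scale. As a sketch of one alternative, `assert` statements can state the expected answer and fail loudly when it is wrong. The example assumes a two-entry list-of-lists phone book (`demo_book`) and a hypothetical `find_name` function so that it runs without the cells above; the same checks could be pointed at `search_for_number_array` or `search_for_number_dataframe`.

```python
# A tiny stand-in phone book (assumption: list-of-lists format).
demo_book = [['Zan Tu Jiang', '016-339 5731'],
             ['Rong Ming Xue', '019-953 6751']]


def find_name(phone_number):
    """Return the name linked to phone_number, or "Number not found"."""
    for name, number in demo_book:
        if number == phone_number:
            return name
    return "Number not found"


# Each assert raises AssertionError if its condition is False.
assert find_name('016-339 5731') == 'Zan Tu Jiang'
assert find_name('019-953 6751') == 'Rong Ming Xue'
assert find_name('000-000 0000') == "Number not found"
print("All checks passed")
```

If any expectation is wrong, the cell stops at the failing `assert`, which points directly at the case that needs attention.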
