# Python Prerequisite Test

By: Pacmann AI

---

**Instructions**:
- Make a Copy of this notebook.
- Answer all questions.
- Submit your answer in **.ipynb** format named with `your_name-python.ipynb`
- Example: `budi_xaskha-python.ipynb`
- **Submission Link**: [SUBMIT YOUR ANSWER](https://forms.gle/3VGV3t59rsPhbpZn9)

**Note**:
- There are 10 questions to answer.
- You have to answer the question according to the descriptions. **Please read the description and docstring carefully**.
- There are several test cases you can use to **validate your answer**.
- <font color='red'>**YOU MUST ANSWER THE QUESTIONS WITHOUT USING ANY LIBRARY**</font>
- **Do not change** the function / class name.
- **Do not use** the `input()` function.
- <font color='red'>**Hard deadline**: 1 week before the first main course (not a prerequisite course) starts</font>

# 1. Find the Structure of Data
---

Given a 2D list of data, your task is to create a function that returns the number of rows & columns from that list.

---
Input example-1:

```python
data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
        [10, 11, 12]]

res = check_structure(data)
print(res)
```

Output example-1:
```
[4, 3]
```

---
Input example-2:

```python
data = [[4, 3, 2, 10],
        [8, 2, 1, 3]]

res = check_structure(data)
print(res)
```

Output example-2:
```
[2, 4]
```

---
Input example-3:

```python
data = [[0.5, 0.3, 0.1]]

res = check_structure(data)
print(res)
```

Output example-3:
```
[1, 3]
```

---
Write the function here

In [None]:
# DO NOT CHANGE THE NAME & INPUT OF THE FUNCTION
def check_structure(data):
    '''
    Function to check the data structures

    Parameters
    ----------
    data : list
        The 2D sample data

    Returns
    --------
    data_shape : list
        The shape of data with format [nrows, ncols]
        nrows = number of rows
        ncols = number of columns
    '''
    ###
    ### YOUR CODE HERE
    ###
    try:
      # Validate that data is a list
      if not isinstance(data, list):
        raise TypeError("Whoops, Matrix type error. Should be list")

      # Validate that data is a non-empty 2D list
      if not data or not all(isinstance(row, list) for row in data):
        raise TypeError('Mismatch input size. The data is not in 2D')

      # Validate that no row is empty
      if not all(row for row in data):
        raise ValueError("Whoops, Matrix should have value")

      # Validate that every row has the same number of columns
      ncols = len(data[0])
      if not all(len(row) == ncols for row in data):
        raise ValueError("Whoops, Matrix should have same column")

      # Validate that every value is a number, not a str
      for row in data:
        if not all(isinstance(elemen, (int, float)) for elemen in row):
          raise ValueError("Whoops, Matrix should be numerical")

      nrows = len(data)
      shape_of_matrix = [nrows, ncols]
      return shape_of_matrix

    except (TypeError, ValueError) as err:
      return err




Check function

In [None]:
# DO NOT CHANGE ANYTHING IN THIS CELL
# Just run after you finish the function
data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
        [10, 11, 12]]

data_shape = check_structure(data)
print(data_shape)

[4, 3]


# 2. Create a Weighted Average Function
---

Given a list of numbers (`data`) and a list of weights (`w`), we want to find the weighted average.

To calculate the weighted average, use the formula below.

$$
\begin{align*}
\text{weighted average} &= \sum_{i=1}^{n} w_{i} \cdot \text{data}_{i} \\
\text{weighted average} &= w_{1} \cdot \text{data}_{1} +  w_{2} \cdot \text{data}_{2} + \cdots + w_{n} \cdot \text{data}_{n}\\
\end{align*}
$$
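
The formula above can be sketched as a loop over value/weight pairs. This is only an illustration: `weighted_avg_sketch` is a hypothetical name (the graded function is `calc_weighted_avg`), and the sketch assumes both lists have the same length and contain only numbers.

```python
# Hypothetical helper for illustration; not the graded function.
def weighted_avg_sketch(data, w):
    # Pair each value with its weight and accumulate the products.
    total = 0.0
    for value, weight in zip(data, w):
        total += value * weight
    return total


print(weighted_avg_sketch([2, 4], [0.5, 0.5]))  # 3.0
```

Because the result is a sum of floating-point products, some inputs may produce tiny rounding differences from the hand-computed value.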

---
Input example-1:

```python
data = [10, 20, 30, 40, 50]
w = [0.10, 0.20, 0.25, 0.3, 0.15]

avg = calc_weighted_avg(data, w)
print(avg)
```

Output example-1:
```
32.0
```

---
Input example-2:

```python
data = [-2, -1, 0, 1, 2]
w = [0.2, 0.2, 0.2, 0.2, 0.2]

avg = calc_weighted_avg(data, w)
print(avg)
```

Output example-2:
```
0.0
```

---
Input example-3:

```python
data = [12, 13.5, 9.8, 10.3]
w = [0.10, 0.20, 0.30, 0.40]

avg = calc_weighted_avg(data, w)
print(avg)
```

Output example-3:
```
10.96
```

---
Write function

In [None]:
# DO NOT CHANGE THE NAME & INPUT OF THE FUNCTION
def calc_weighted_avg(data, w):
    '''
    Function to calculate the weighted average of a list

    Parameters
    ----------
    data : list
        The sample data

    w : list
        The sample weights

    Returns
    -------
    avg : float
        The weighted average
    '''
    ###
    ### YOUR CODE HERE
    ###
    ### Assume data and w are paired by index: w[i] is the weight of data[i]

    try:

      # Validate that data and w are lists
      if not isinstance(data, list) or not isinstance(w, list):
        raise TypeError("Whoops, data type error. Should be list")

      # Validate that data and w have the same length
      if not len(data) == len(w):
        raise ValueError("Whoops, data and weight must have the same length")

      # Validate that every data value is a number, not a str
      for elemen in data:
        if not isinstance(elemen, (int, float)):
          raise ValueError("Whoops, Data value should be numerical")

      # Validate that every weight is a number, not a str
      for elemen in w:
        if not isinstance(elemen, (int, float)):
          raise ValueError("Whoops, Weight value should be numerical")

      # Accumulate the weighted sum
      average = 0
      for idx, item in enumerate(data):
        average += item * w[idx]

      return average

    except (TypeError, ValueError) as err:
      return err


Check function

In [None]:
# DO NOT CHANGE ANYTHING IN THIS CELL
# Just run after you finish the function
data = [10, 20, 30, 40, 50]
w = [0.10, 0.20, 0.25, 0.3, 0.15]

avg = calc_weighted_avg(data, w)
print(avg)

32.0


# 3. Which Toko is More Profitable?
---

You are a data analyst working with the finance team. You are given three lists of data: `revenues`, `costs`, and `toko_ID` (*toko* is Indonesian for "store").

Your manager asks which `toko_ID`s are profitable. A `toko` is profitable if its revenue is higher than its cost.

Create a function that returns the profitable `toko_ID`s from the given `revenues` and `costs` data.
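
The profitability rule can be sketched as a filter over the three lists walked in parallel. `profitable_sketch` is a hypothetical name used only here, and the sketch assumes the lists are aligned by index and equally long.

```python
# Hypothetical helper for illustration; not the graded function.
def profitable_sketch(ids, revenues, costs):
    # Keep an ID only when its revenue is strictly higher than its cost.
    return [i for i, r, c in zip(ids, revenues, costs) if r > c]


print(profitable_sketch(['p', 'q', 'r'], [80, 90, 30], [70, 100, 20]))  # ['p', 'r']
```

Note the strict comparison: a store whose revenue equals its cost is not counted as profitable.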

---
Input example-1:

```python
toko_ID = ['A001', 'B002', 'C003', 'D004']
revenues = [80000, 120000, 57000, 450000]
costs = [90000, 110000, 57000, 420000]

toko_profit = profitable_toko(toko_ID, revenues, costs)
print(toko_profit)
```

Output example-1:
```
['B002', 'D004']
```

---
Input example-2:

```python
toko_ID = ['p', 'q', 'r']
revenues = [80, 90, 30]
costs = [70, 100, 20]

toko_profit = profitable_toko(toko_ID, revenues, costs)
print(toko_profit)
```

Output example-2:
```
['p', 'r']
```

---
Input example-3:

```python
toko_ID = ['a', 'b', 'c']
revenues = [80, 80, 70]
costs = [80, 80, 90]

toko_profit = profitable_toko(toko_ID, revenues, costs)
print(toko_profit)
```

Output example-3:
```
[]
```

---
Write a function

In [None]:
# DO NOT CHANGE THE NAME & INPUT OF THE FUNCTION
def profitable_toko(toko_ID, revenues, costs):
    '''
    Function to return which toko ID is profitable

    Parameters
    ----------
    toko_ID : list
        The ID of toko

    revenues : list
        The list of revenue from the corresponding toko_ID

    costs : list
        The list of cost from the corresponding toko_ID

    Returns
    -------
    toko_profit : list
        The ID of toko that is profitable
    '''
    ###
    ### Your Code Here
    ###
    ### Assume the lists are paired by index: revenues[i] and costs[i] belong to toko_ID[i]

    try:
      # Validate that toko_ID, revenues, and costs are lists
      if not isinstance(toko_ID, list) or not isinstance(revenues, list) or not isinstance(costs, list):
        raise TypeError("Whoops, data type error. Should be list")

      # Validate that all three lists have the same length
      if not len(toko_ID) == len(revenues) == len(costs):
        raise ValueError("Whoops, toko_ID, revenues, and costs must have the same length")

      # Validate that every toko_ID is a str
      for elemen in toko_ID:
        if not isinstance(elemen, str):
          raise ValueError("Whoops, toko_ID value should be String")

      # Validate that every revenue is a number, not a str
      for elemen in revenues:
        if not isinstance(elemen, (int, float)):
          raise ValueError("Whoops, revenues value should be numerical")

      # Validate that every cost is a number, not a str
      for elemen in costs:
        if not isinstance(elemen, (int, float)):
          raise ValueError("Whoops, costs value should be numerical")

      # Collect the IDs whose revenue exceeds their cost
      toko_profit = []
      for idx, _ in enumerate(revenues):
        if revenues[idx] > costs[idx]:
          toko_profit.append(toko_ID[idx])

      return toko_profit
    except (TypeError, ValueError) as err:
      return err

Check function

In [None]:
# DO NOT CHANGE ANYTHING IN THIS CELL
# Just run after you finish the function
toko_ID = ['A001', 'B002', 'C003', 'D004']
revenues = [80000, 120000, 57000, 450000]
costs = [90000, 110000, 57000, 420000]

toko_profit = profitable_toko(toko_ID, revenues, costs)
print(toko_profit)

['B002', 'D004']


# 4. Clean the Phone Number
---

You are on a marketing team that has to transfer e-money as a gift to your customers.

You noticed that the phone number data is inconsistent, so you have to clean it.

A valid phone number is defined as follows:
- It starts with `62`, e.g. `62xxxxxxxxxxx`
- It has exactly 11 digits, excluding the `62` prefix

Clean each phone number first. If a phone number is still invalid after cleaning, replace it with `'Invalid number'`.
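
One possible reading of these rules, sketched for a single number: keep only the digits, drop a leading `62` or a leading `0`, then require exactly 11 digits. `normalize_phone_sketch` is a hypothetical name, and treating any leading `62` as the country code is an assumption of this sketch, not something the task states.

```python
# Hypothetical helper for illustration; not the graded function.
def normalize_phone_sketch(raw):
    # Keep digits only, which drops '+', '-', and spaces.
    digits = ''.join(ch for ch in str(raw) if ch.isdigit())

    # Assumption: a leading '62' is the country code, a leading '0' is the trunk prefix.
    if digits.startswith('62'):
        digits = digits[2:]
    elif digits.startswith('0'):
        digits = digits[1:]

    # A valid number has exactly 11 digits after the prefix is removed.
    if len(digits) == 11:
        return int('62' + digits)
    return 'Invalid number'


print(normalize_phone_sketch('+6282-456-654-456'))  # 6282456654456
print(normalize_phone_sketch('14045'))              # Invalid number
```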

---
Input example-1:

```python
phone_lists = [
    '82123321123',
    '082321123321',
    '+6282-456-654-456',
    '+62 82 789 987 789',
    '14045',
    '82145-451-145',
    '829102394821'
]

phone_clean = clean_phone_number(phone_lists)
print(phone_clean)
```

Output example-1:
```
[6282123321123, 6282321123321, 6282456654456, 6282789987789, 'Invalid number', 6282145451145, 'Invalid number']
```

---
Input example-2:

```python
phone_lists = [
    '82432234432',
    '+62 82 32',
    '14032',
    '082 234 432 234'
]

phone_clean = clean_phone_number(phone_lists)
print(phone_clean)
```

Output example-2:
```
[6282432234432, 'Invalid number', 'Invalid number', 6282234432234]
```

---
Write function

In [None]:
# DO NOT CHANGE THE NAME & INPUT OF THE FUNCTION
def clean_phone_number(phone_list):
    '''
    Function to clean the phone number

    Parameters
    ----------
    phone_list : list
        The raw sample of phone data

    Returns
    -------
    phone_clean : list
        The clean sample of phone data
    '''
    ###
    ### YOUR CODE HERE
    ###
    try:
      prefix = '62'
      phone_clean = []
      for raw in phone_list:
        # Convert to string and strip the separators (dashes and spaces)
        item = str(raw).replace('-', '').replace(' ', '')

        if len(item) == 11 and item[0] != '0':
          # Local number without the leading 0: add the prefix
          phone_clean.append(int(prefix + item))
        elif len(item) == 12 and item[0] == '0':
          # Local number with a leading 0: replace the 0 with the prefix
          phone_clean.append(int(prefix + item[1:]))
        elif item[0:3] == '+62' and len(item[3:]) == 11:
          # Already has the +62 prefix: drop the '+'
          phone_clean.append(int(item[1:]))
        else:
          phone_clean.append('Invalid number')

      return phone_clean
    except Exception as err:
      return err

Check function

In [None]:
# DO NOT CHANGE ANYTHING IN THIS CELL
# Just run after you finish the function
phone_list = [
    '82123321123',
    '082321123321',
    '+6282-456-654-456',
    '+62 82 789 987 789',
    '14045',
    '82145-451-145',
    '829102394821'
]

phone_clean = clean_phone_number(phone_list)
print(phone_clean)

[6282123321123, 6282321123321, 6282456654456, 6282789987789, 'Invalid number', 6282145451145, 'Invalid number']


# 5. Find the Nearest Tourism Object
---

You are developing a web application that recommends good places to stay during the holiday season.

A feature loved by most customers is **finding** the nearest tourism object from a position (which could be their hotel).

Given that each object/place has location coordinates (x, y), you can calculate the distance between two objects by

$$
\text{dist}(A, B)
=
\sqrt{
    (A_{x} - B_{x})^{2}
    +
    (A_{y} - B_{y})^{2}
}
$$

Given a list of tourism object names (`tourism_name`), their coordinates (`tourism_coor`), and the customer's current coordinates (`current_coor`), return the nearest tourism object as a `dict`.

Example of calculating distance between two objects

- Say we want to calculate the distance between `Taman C` and `Danau D`
  - `Taman C` location coordinate: `[46.67, 40.44]`
  - `Danau D` location coordinate: `[21.83, 1.94]`
- So the distance is

$$
\begin{align*}
\text{dist}(C, D)
&=
\sqrt{
    (C_{x} - D_{x})^{2} + (C_{y} - D_{y})^{2}
} \\
&=
\sqrt{
    (46.67 - 21.83)^{2} + (40.44-1.94)^{2}
} \\
&=
\sqrt{
    (24.84)^{2} + (38.50)^{2}
} \\
&\approx
\sqrt{
    2099.28
} \\
\text{dist}(C, D)
&\approx 45.82
\end{align*}
$$
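
The worked example above can be checked with a few lines of Python. `dist_sketch` is a hypothetical name (the graded helper is `calc_dist`), and the square root is taken with `** 0.5` because the test forbids libraries.

```python
# Hypothetical helper for illustration; not the graded function.
def dist_sketch(a, b):
    # Euclidean distance between two [x, y] coordinates.
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


taman_c = [46.67, 40.44]
danau_d = [21.83, 1.94]
print(round(dist_sketch(taman_c, danau_d), 2))  # 45.82
```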

---
Input example-1:

```python
current_coor = [-2.21, 3.15]
tourism_coor = [
    [-34.93, -31.23],
    [-77.90, 79.90],
    [46.67, 40.44],
    [21.83, 1.94],
    [41.77, -63.44],
    [-1.10, -47.22],
    [68.81, 64.65],
    [-21.23, 22.03],
    [68.30, -69.73],
    [12.82, 30.75],
]
tourism_name = [
    'Pantai A',
    'Jembatan B',
    'Taman C',
    'Danau D',
    'Perpustakaan E',
    'Mall F',
    'Monumen G',
    'Taman Hutan H',
    'Air terjun I',
    'Gunung J'
]

nearest_object = find_nearest(current_coor, tourism_coor, tourism_name)
print(nearest_object)
```

Output example-1:
```
{'object': 'Danau D', 'dist': 24.07043206924213}
```

---
Input example-2:

```python
current_coor = [0, 0]
tourism_coor = [
    [0, 1],
    [2, 0],
    [0, 3],
    [-1, -1],
    [-2, -1],
    [-1, -3]
]
tourism_name = [
    'object a',
    'object b',
    'object c',
    'object d',
    'object e',
    'object f'
]

nearest_object = find_nearest(current_coor, tourism_coor, tourism_name)
print(nearest_object)
```

Output example-2:
```
{'object': 'object a', 'dist': 1.0}
```

---
Write function

In [21]:
# DO NOT CHANGE THE NAME & INPUT OF THE FUNCTION
# Function to find distance between 2 object
def calc_dist(A, B):
    '''
    Function to calculate distance between two objects

    Parameters
    ----------
    A : list
        The coordinates of object A

    B : list
        The coordinates of object B

    Returns
    -------
    dist : float
        The distance between A & B
    '''
    ###
    ### YOUR CODE HERE
    ###
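    try:
      # ASSUME COORDINATES ALWAYS HAVE 2 AXES (X, Y)
      # NOTE: this implementation returns a list of distances, one per
      # coordinate in B (a single-element list when B is one coordinate)
      distances = []
      l_list = len(A)
      nested = any(isinstance(row, list) for row in B)
      if not nested:
          # B is a single coordinate [x, y]
          distance = ((A[0] - B[0])**2 + (A[1] - B[1])**2)**(1/2)
          distances.append(distance)
          return distances

      # B is a list of coordinates; skip entries whose dimension differs from A
      for dist in B:
        if len(dist) == l_list:
          distance = ((A[0] - dist[0])**2 + (A[1] - dist[1])**2)**(1/2)
          distances.append(distance)
      return distances

    except Exception as err:
      return err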

In [22]:
# DO NOT CHANGE THE NAME & INPUT OF THE FUNCTION
def find_nearest(current_coor, tourism_coor, tourism_name):
    '''
    Function to find nearest tourism object near current coordinates

    Parameters
    ----------
    current_coor : list
        The guest current coordinate

    tourism_coor : list
        The tourism object coordinates

    tourism_name : list
        The tourism object name

    Returns
    -------
    nearest_object : dict
        The dictionary of nearest tourism object
    '''
    ###
    ### YOUR CODE HERE
    ###
    try:
      distance = calc_dist(current_coor, tourism_coor)
      if distance:
        dist  = distance[0]
        index = 0
        # Get the minimum distance
        for idx in range(1,len(distance)):
          if distance[idx] < dist:
            dist = distance[idx]
            index  = idx


        # Convert distance and tourism name to dictionary
        json_data = {
            'object' : tourism_name[index],
            'dist': dist
        }
        return json_data
      return {'object' : None,'dist': 0.0}
    except Exception as err:
      return err

Check function

In [None]:
# DO NOT CHANGE ANYTHING IN THIS CELL
# Just run after you finish the function
tourism_name = [
    'Pantai A',
    'Jembatan B',
    'Taman C',
    'Danau D',
    'Perpustakaan E',
    'Mall F',
    'Monumen G',
    'Taman Hutan H',
    'Air terjun I',
    'Gunung J'
]

tourism_coor = [
    [-34.93, -31.23],
    [-77.90, 79.90],
    [46.67, 40.44],
    [21.83, 1.94],
    [41.77, -63.44],
    [-1.10, -47.22],
    [68.81, 64.65],
    [-21.23, 22.03],
    [68.30, -69.73],
    [12.82, 30.75],
]

current_coor = [-2.21, 3.15]

nearest_object = find_nearest(current_coor, tourism_coor, tourism_name)
print(nearest_object)

{'object': 'Danau D', 'dist': 24.07043206924213}


# 6. Find Size of People in Some Groups
---

You are on a research team. The product team asked you to help with their research agenda, namely running a Focus Group Discussion (FGD).

The product team needs to distribute the users equally across the groups. Please help the product team.

Your task is to divide $n$ people as equally as possible into $k$ groups.
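
One way to think about the split: every group gets `n // k` people, and the `n % k` people left over go one each to the first groups. The sketch below uses the hypothetical name `distribute_sketch` and assumes `n >= 0` and `k > 0`.

```python
# Hypothetical helper for illustration; not the graded function.
def distribute_sketch(n, k):
    # base = size every group gets; extra = number of groups that get one more.
    base, extra = divmod(n, k)
    return [base + 1] * extra + [base] * (k - extra)


print(distribute_sketch(14, 3))  # [5, 5, 4]
print(distribute_sketch(40, 7))  # [6, 6, 6, 6, 6, 5, 5]
```

Placing the larger groups first matches the ordering shown in the examples.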

---
Input example-1:

```python
n = 14
k = 3

group_size = distribute_user(n, k)
print(group_size)
```

Output example-1:
```
[5, 5, 4]
```

---
Input example-2:

```python
n = 40
k = 7

group_size = distribute_user(n, k)
print(group_size)
```

Output example-2:
```
[6, 6, 6, 6, 6, 5, 5]
```

---
Input example-3:

```python
n = 20
k = 2

group_size = distribute_user(n, k)
print(group_size)
```

Output example-3:
```
[10, 10]
```

---
Write function

In [None]:
# DO NOT CHANGE THE NAME & INPUT OF THE FUNCTION
def distribute_user(n, k):
    '''
    Function to distribute n users as equally as possible into k groups

    Parameters
    ----------
    n : int
        The number of people to distribute

    k : int
        The number of groups

    Returns
    -------
    group_size : list
        The list of group sizes (integers), with length k
    '''
    ###
    ### YOUR CODE
    ###
    try:
      # Validate that n and k are integers
      if not isinstance(n, int) or not isinstance(k, int):
        raise TypeError("Whoops, data should be Integer")

      # Validate that n and k are positive
      if n <= 0 or k <= 0:
        raise ValueError("Whoops, n and k should be positive")

      # Hand out people one at a time, cycling through the k groups
      data = [0] * k
      for idx in range(n):
        data[idx % k] += 1
      return data
    except (TypeError, ValueError) as err:
      return err

Check function

In [None]:
# DO NOT CHANGE ANYTHING IN THIS CELL
# Just run after you finish the function
n = 14
k = 3

group_size = distribute_user(n, k)
print(group_size)

[5, 5, 4]


# 7. Summarize Data with OOP
---

Write a class that can summarize the given data.

- Name the class `Data`
- Initialize the object with no input
- `Data` has several attributes:
  - `data` --> the input data
  - `size` --> the size of the input data
- `Data` has several methods:
  - `read_data(data)` --> reads the input data
  - `find_total()` --> returns the total sum of the input data
  - `find_average()` --> returns the average of the input data

---
Input example-1:

```python
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

data_obj = Data()
data_obj.read_data(data)
print(data_obj.data)
print(data_obj.size)
print(data_obj.find_total())
print(data_obj.find_average())
```

Output example-1:
```
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
10
45
4.5
```

---
Input example-2:

```python
data = [-4, -3, -2, -1, 0, 1, 2, 3, 4, 5]

data_obj = Data()
data_obj.read_data(data)
print(data_obj.data)
print(data_obj.size)
print(data_obj.find_total())
print(data_obj.find_average())
```

Output example-2:
```
[-4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
10
5
0.5
```

---
Write class

In [None]:
# DO NOT CHANGE THE NAME & INPUT OF THE CLASS
class Data:
    ###
    ### YOUR CODE
    ###
    def __init__(self):
      self.data = []
      self.size = 0

    def read_data(self, data):
      try:
        self.__validation(data)
        self.data = data
        # Store the size as a plain attribute (a method named `size`
        # would be shadowed by this attribute after the first call)
        self.size = len(data)
      except Exception as err:
        return err

    def find_total(self):
      try:
        self.__validation(self.data)
        total = 0
        for item in self.data:
          total += item
        return total
      except Exception as err:
        return err

    def find_average(self):
      try:
        self.__validation(self.data)
        avg = self.find_total() / self.size
        return avg
      except Exception as err:
        return err

    def __validation(self, data):
        # Validate that data is a list
        if not isinstance(data, list):
          raise TypeError("Whoops, data should be List")


Check function

In [None]:
# DO NOT CHANGE ANYTHING IN THIS CELL
# Just run after you finish the function
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

try:
    data_obj = Data()
    data_obj.read_data(data)
    print(data_obj.data)
    print(data_obj.size)
    print(data_obj.find_total())
    print(data_obj.find_average())
except Exception as e:
    print('There is something wrong')

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
10
45
4.5


# 8. User with Double Promo
---

You are on a marketing team. Your team has 2 types of promo, `promo A` and `promo B`, and each user may get **only one of them**.

You suspect that some users have both promos. Given `user_ID`, `promo_A_status`, and `promo_B_status`, create a function that returns the `user_ID`s that have double promo.
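
The check can be sketched as a filter that keeps a user only when both status flags are `1`. `double_promo_sketch` is a hypothetical name, and the sketch assumes the three lists are aligned by index.

```python
# Hypothetical helper for illustration; not the graded function.
def double_promo_sketch(user_ids, promo_a, promo_b):
    # A user has double promo when both flags equal 1.
    return [uid for uid, a, b in zip(user_ids, promo_a, promo_b) if a == 1 and b == 1]


print(double_promo_sketch(['a4a', '23b', 'f4c', '5d6'], [0, 0, 1, 0], [0, 0, 1, 1]))  # ['f4c']
```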

---
Input example-1:

```python
user_ID = ['01', '02', '03', '04', '05', '06', '07']
promo_A_status = [1, 0, 0, 1, 1, 0, 1]
promo_B_status = [0, 0, 1, 1, 0, 1, 1]

double_ID = find_double_promo(user_ID, promo_A_status, promo_B_status)
print(double_ID)
```

Output example-1:
```
['04', '07']
```

---
Input example-2:

```python
user_ID = ['a', 'b', 'c', 'd', 'e']
promo_A_status = [1, 1, 1, 1, 1]
promo_B_status = [0, 1, 1, 1, 1]

double_ID = find_double_promo(user_ID, promo_A_status, promo_B_status)
print(double_ID)
```

Output example-2:
```
['b', 'c', 'd', 'e']
```

---
Input example-3:

```python
user_ID = ['a4a', '23b', 'f4c', '5d6']
promo_A_status = [0, 0, 1, 0]
promo_B_status = [0, 0, 1, 1]

double_ID = find_double_promo(user_ID, promo_A_status, promo_B_status)
print(double_ID)
```

Output example-3:
```
['f4c']
```

---
Write function

In [None]:
# DO NOT CHANGE THE NAME & INPUT OF THE FUNCTION
def find_double_promo(user_ID, promo_A_status, promo_B_status):
    '''
    Find user ID that has double promo

    Parameters
    ----------
    user_ID: list
        List of user ID

    promo_A_status: list
        List of user ID that get promo A.
        1 = get promo
        0 = did not get promo

    promo_B_status: list
        List of user ID that get promo B.
        1 = get promo
        0 = did not get promo

    Returns
    -------
    double_ID: list
        List of user that get double promo.
        If None, return []
    '''
    ###
    ### YOUR CODE HERE
    ###
    try:
      # Check that every input is a list
      if not isinstance(user_ID, list) or not isinstance(promo_A_status, list) or not isinstance(promo_B_status, list):
        raise TypeError('Mismatch input type, must be list')

      # Check that all lists have the same length
      if not (len(user_ID) == len(promo_A_status) == len(promo_B_status)):
        raise ValueError('Mismatch data column length')

      user_get_promo = []

      # A user has double promo when both statuses are 1
      for idx in range(len(user_ID)):
        if promo_A_status[idx] == 1 and promo_B_status[idx] == 1:
          user_get_promo.append(user_ID[idx])
      return user_get_promo

    except Exception as err:
        return err

Check function

In [None]:
# DO NOT CHANGE ANYTHING IN THIS CELL
# Just run after you finish the function
user_ID = ['01', '02', '03', '04', '05', '06', '07']
promo_A_status = [1, 0, 0, 1, 1, 0, 1]
promo_B_status = [0, 0, 1, 1, 0, 1, 1]

double_ID = find_double_promo(user_ID, promo_A_status, promo_B_status)
print(double_ID)

['04', '07']


# 9. Find Duplicate Person in a Research
---

The marketing and sales teams want to make sure that there are no duplicate people in the data.

Given a list of people IDs and their names, find the IDs and names of people who have similar names.
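
The task does not define "similar" precisely. Judging from the examples, a reasonable assumption is that two names match when they are equal after ignoring letter case and surrounding whitespace. The sketch below shows that normalization and a count per normalized name; all variable names are illustrative.

```python
# Illustrative data; "similar" is assumed to mean equal after strip() and lower().
names = ['aa cahya', 'AA cahYa', 'bb durian', 'AA CAHYA ']

# Normalize each name, then count how often each normalized form occurs.
normalized = [name.strip().lower() for name in names]
counts = {}
for key in normalized:
    counts[key] = counts.get(key, 0) + 1

print(counts)  # {'aa cahya': 3, 'bb durian': 1}
```

Any name whose normalized form has a count above 1 belongs to a duplicate group.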

---
Input example-1:

```python
people_ID = ['01', '02', '03', '04', '05', '06', '07']
people_name = [
    'Budi santoso',
    'Pramono Setiadi',
    'Rijal',
    'Dedi setiawan',
    'rijal',
    'Alesha Nur',
    'Dedi Setiawan'
]

people_duplicate = find_duplicates(people_ID, people_name)
print(people_duplicate)
```

Output example-1:
```
[['03', 'Rijal'], ['04', 'Dedi setiawan'], ['05', 'rijal'], ['07', 'Dedi Setiawan']]
```

---
Input example-2:

```python
people_ID = ['1e', 'd2', '3b', 'a4', 'q5']
people_name = [
    'aa cahya',
    'AA cahYa',
    'bb durian',
    'cc maANGGa',
    'AA CAHYA ',
]

people_duplicate = find_duplicates(people_ID, people_name)
print(people_duplicate)
```

Output example-2:
```
[['1e', 'aa cahya'], ['d2', 'AA cahYa'], ['q5', 'AA CAHYA ']]
```

---
Write function

In [None]:
# DO NOT CHANGE THE NAME & INPUT OF THE FUNCTION
def find_duplicates(people_ID, people_name):
    '''
    Function to find duplicate person

    Parameters
    ----------
    people_ID : List
        list of people ID

    people_name : list
        list of people name

    Returns
    -------
    people_duplicate : list
        List of duplicate people
    '''
    ###
    ### YOUR CODE HERE
    ###

    ### ASSUME each people ID maps to the people name at the same index
    try:
      # Validate that people_ID and people_name are lists
      if not isinstance(people_ID, list) or not isinstance(people_name, list):
        raise TypeError("Whoops, data type error. Should be list")

      # Validate that people_ID and people_name have the same length
      if not len(people_ID) == len(people_name):
        raise ValueError("Whoops, people_ID and people_name must have the same length")

      # Validate that the data is not empty
      if not people_ID or not people_name:
        raise ValueError("Whoops, must not be empty")

      # Count how many times each normalized name occurs
      name_counts = {}
      for name in people_name:
          normalized_name = name.strip().lower()
          name_counts[normalized_name] = name_counts.get(normalized_name, 0) + 1

      # Keep every [ID, name] pair whose normalized name occurs more than once,
      # in the original input order
      duplicates = []
      for person_id, name in zip(people_ID, people_name):
          if name_counts[name.strip().lower()] > 1:
              duplicates.append([person_id, name])

      return duplicates

    except (TypeError, ValueError) as err:
      return err

Check function

In [None]:
# DO NOT CHANGE ANYTHING IN THIS CELL
# Just run after you finish the function
people_ID = ['01', '02', '03', '04', '05', '06', '07']
people_name = [
    'Budi santoso',
    'Pramono Setiadi',
    'Rijal',
    'Dedi setiawan',
    'rijal',
    'Alesha Nur',
    'Dedi Setiawan'
]

people_duplicate = find_duplicates(people_ID, people_name)
print(people_duplicate)

[['03', 'Rijal'], ['04', 'Dedi setiawan'], ['05', 'rijal'], ['07', 'Dedi Setiawan']]


# 10. Time to Transport Logistic
---

You are a data analyst at a transport logistics company. You need to recommend a set of routes to the logistics drivers so that they can deliver the goods efficiently.

To keep it simple, create a function that calculates the time a driver needs to follow a given route from an initial position and come back again.

We already have a look-up table that shows the duration needed to move between two locations.

**Durations Table**

<center>

||A|B|C|D|E|
|:-:|:-:|:-:|:-:|:-:|:-:|
|A| 0 | 3 | 5 | 10 | 4 |
|B| 3 | 0 | 6 | 8 | 9 |
|C| 5 | 6 | 0 | 7 | 2 |
|D| 10 | 8 | 7 | 0 | 1 |
|E| 4 | 9 | 2 | 1 | 0 |

</center>

Say you want to know the duration from city `B` to city `E`.

**First**, `B` is the starting point, so go to row `B`

||A|B|C|D|E|
|:-:|:-:|:-:|:-:|:-:|:-:|
|B| 3 | 0 | 6 | 8 | 9 |

<br>

**Then**, `E` is the destination, so from the selected row, go to column `E`

||A|B|C|D|E|
|:-:|:-:|:-:|:-:|:-:|:-:|
|B|  |  |  |  | 9 |

<br>

**Finally**, the duration needed to go from `B` to `E` is 9

To calculate it easily in Python, we turn the duration table into this list

```python
duration_table = [
    [0, 3, 5, 10, 4],
    [3, 0, 6, 8, 9],
    [5, 6, 0, 7, 2],
    [10, 8, 7, 0, 1],
    [4, 9, 2, 1, 0]
]
```

Say you want to measure the duration from city `B` to `E`. The duration is `duration_table[1][4]`, which equals `duration_table[4][1]` because the table is symmetric.
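
The indices `1` and `4` come from each city's position in the alphabet. A minimal sketch of that mapping, assuming cities are single uppercase letters starting at `A` (`city_index_sketch` is a hypothetical name):

```python
duration_table = [
    [0, 3, 5, 10, 4],
    [3, 0, 6, 8, 9],
    [5, 6, 0, 7, 2],
    [10, 8, 7, 0, 1],
    [4, 9, 2, 1, 0]
]


# Hypothetical helper for illustration: 'A' -> 0, 'B' -> 1, and so on.
def city_index_sketch(city):
    return ord(city) - ord('A')


print(duration_table[city_index_sketch('B')][city_index_sketch('E')])  # 9
```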

In [None]:
# Run this code
# Don't change anything
duration_table = [
    [0, 3, 5, 10, 4],
    [3, 0, 6, 8, 9],
    [5, 6, 0, 7, 2],
    [10, 8, 7, 0, 1],
    [4, 9, 2, 1, 0]
]

duration_table[1][4], duration_table[4][1]

(9, 9)

We usually calculate the duration needed for a whole route, e.g. `ABCDEBE`.

These are the steps to calculate the total duration needed to follow the route (`ABCDEBE`) and get back to the original position (in this case, `A`).

**First**, extract all the pairs of consecutive locations in the route `ABCDEBE`, plus the final leg back to the origin

|Route|Time|
|:-:|:-:|
|`A -> B`||
|`B -> C`||
|`C -> D`||
|`D -> E`||
|`E -> B`||
|`B -> E`||
|`E -> A`||

<br>

**Next**, obtain each leg's duration from the durations table

|Route|Time|
|:-:|:-:|
|`A -> B`|3|
|`B -> C`|6|
|`C -> D`|7|
|`D -> E`|1|
|`E -> B`|9|
|`B -> E`|9|
|`E -> A`|4|

<br>

**Finally**, add all the time

$$
\begin{align*}
\text{total time} &= 3 + 6 + 7 + 1 + 9 + 9 + 4 \\
\text{total time} &= 39 \\
\end{align*}
$$
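
The three steps above can be sketched by appending the origin to the route and summing over consecutive pairs. `route_duration_sketch` is a hypothetical name (the graded function is `calculate_duration`), and the sketch assumes every city in the route is an uppercase letter covered by the table.

```python
# Hypothetical helper for illustration; not the graded function.
def route_duration_sketch(route, table):
    # Close the loop by returning to the starting city.
    closed = route + route[0]
    total = 0
    # Walk over consecutive pairs: (A, B), (B, C), ..., (last, first).
    for start, end in zip(closed, closed[1:]):
        total += table[ord(start) - ord('A')][ord(end) - ord('A')]
    return total


table = [
    [0, 3, 5, 10, 4],
    [3, 0, 6, 8, 9],
    [5, 6, 0, 7, 2],
    [10, 8, 7, 0, 1],
    [4, 9, 2, 1, 0]
]
print(route_duration_sketch('ABCDEBE', table))  # 39
```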


---
Input example-1:

```python
duration_table = [
    [0, 3, 5, 10, 4],
    [3, 0, 6, 8, 9],
    [5, 6, 0, 7, 2],
    [10, 8, 7, 0, 1],
    [4, 9, 2, 1, 0]
]
route = 'ABCDEBE'

total_time = calculate_duration(route, duration_table)
print(total_time)
```

Output example-1:
```
39
```

---
Input example-2:

```python
duration_table = [
    [0, 1, 2, 3],
    [1, 0, 4, 5],
    [2, 4, 0, 6],
    [3, 5, 6, 0]
]
route = 'ABCDCDBAB'

total_time = calculate_duration(route, duration_table)
print(total_time)
```

Output example-2:
```
31
```

---
Write function

In [None]:
# DO NOT CHANGE THE NAME & INPUT OF THE FUNCTION
def calculate_duration(route, duration_table):
    '''
    Function to calculate travel duration from a given route and back to the origin

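    Parameters
    ----------
    route : str
        The route, as a string of city letters (e.g. 'ABCDEBE')

    duration_table : list
        The 2D list of durations between each pair of cities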

    Returns
    -------
    total_time : int
        The travel duration
    '''
    ###
    ### YOUR CODE HERE
    ###

    try:
      # Validate that duration_table is a list
      if not isinstance(duration_table, list):
        raise TypeError("Whoops, Matrix type error. Should be list")

      # Validate that duration_table is a non-empty 2D list
      if not duration_table or not all(isinstance(row, list) for row in duration_table):
        raise TypeError('Mismatch input size. The data is not in 2D')

      # Validate that no row is empty
      if not all(row for row in duration_table):
        raise ValueError("Whoops, Matrix should have value")

      # Validate that every row has the same number of columns
      ncols = len(duration_table[0])
      if not all(len(row) == ncols for row in duration_table):
        raise ValueError("Whoops, Matrix should have same column")

      # Validate that the table is square (same number of rows and columns)
      if not ncols == len(duration_table):
        raise ValueError("Whoops, Matrix should be square")

      # Validate that every value is a number, not a str
      for row in duration_table:
        if not all(isinstance(elemen, (int, float)) for elemen in row):
          raise ValueError("Whoops, Matrix should be numerical")

      # Validate that route is not empty
      if not route:
        raise ValueError("Whoops, Please insert route")

      # Map each city letter ('A', 'B', ...) to its row/column index
      dict_idx = {}
      for idx_item in range(ncols):
          dict_idx[chr(ord('A') + idx_item)] = idx_item

      # Sum each leg of the route, then the final leg back to the origin
      distance = 0
      last_idx = len(route) - 1
      for idx, item in enumerate(route):
          from_step = duration_table[dict_idx[item]]
          if idx != last_idx:
              distance += from_step[dict_idx[route[idx+1]]]
          else:
              distance += from_step[dict_idx[route[0]]]
      return distance
    except (TypeError, ValueError) as err:
      return err

Check function

In [None]:
# DO NOT CHANGE ANYTHING IN THIS CELL
# Just run after you finish the function
route = 'ABCDEBE'
duration_table = [
    [0, 3, 5, 10, 4],
    [3, 0, 6, 8, 9],
    [5, 6, 0, 7, 2],
    [10, 8, 7, 0, 1],
    [4, 9, 2, 1, 0]
]


total_time = calculate_duration(route, duration_table)
print(total_time)

39
