# Homework 5
## CSCI E-82A

## Background

Robots are becoming commonplace in many business settings. In retail, for example, robots are used in warehouse management. Amazon is widely known as a leader in this area. However, according to a recent Wall Street Journal article, the UK online grocery retailer Ocado is rumored to have even more sophisticated order fulfilment robotics.

https://www.wsj.com/articles/how-robots-and-drones-will-change-retail-forever-1539604800   

Not surprisingly, companies like Amazon and Ocado do not disclose much information on their robotics. Nonetheless, we can be sure that sensor fusion is a significant problem. For example, sensor fusion is a significant issue with self-driving cars. See, for example:

https://arxiv.org/ftp/arxiv/papers/0709/0709.1099.pdf  

These types of complex robots require years, even decades, to develop and perfect. In reality, the processes used in such complex robots are decomposable into a large number of **simple single tasks**. The complete system then operates by integrating the many single tasks into a **multi-task** environment. To avoid this complexity (and finish the course in less than 10 years:) you will address a simple single task problem. 


## Scenario  

Bob's Orchards is a premium seller of apples and pears. Bob's customers pay a substantial premium for superior fruit. To satisfy these customers, Bob's must ensure that the fruit delivered is correctly packed and perfectly ripe. 

Like many legacy industries requiring specialized human skills, Bob's is facing a talent problem. An expert human inspector will only pass fruit at the perfect ripeness, maximizing customer satisfaction and utility.   

However, many of the human inspectors who expertly check each piece of fruit shipped for ripeness are approaching retirement age. Management's attempts to recruit younger people to apprentice as fruit inspectors have been, well, fruitless (oh, sorry!:). Therefore, it has become imperative to find some type of automated system which can reduce the workload on the diminishing number of human inspectors. To address this problem, Bob's is deploying technology from Robots R Us.

The first robotic system to be deployed at Bob's uses a sensor array to determine if the fruit being shipped is at the correct ripeness. There are two sensors: a color vision system that examines the fruit to determine if it is ripe, and a smell sensor that determines if the fruit is not ripe enough or overripe. If either sensor indicates the fruit is bad, it is not shipped. In addition, some customers may reject even perfect fruit for no apparent reason, whereas others seem perfectly happy with less than perfect fruit.   

The probability distributions, decisions and utilities of this system can be summarized as follows:
1. The unconditional probability distribution of the ripeness of the fruit being packed is known.
2. A conditional distribution for the visual color sensor reading conditioned on fruit quality. 
3. A conditional distribution for the smell sensor reading conditioned on fruit quality. 
4. A conditional probability distribution of shipping the fruit conditioned on the readings of both sensors. This can be thought of as classifying the fruit as good or bad based on the sensor results. There is a decision variable associated with this CPD. The fruit is only shipped if both sensors read positive.  
5. A conditional probability distribution of a customer accepting an order, or not, conditioned on the classification of the fruit for shipment and the actual fruit quality. 
6. There is a utility of the customer accepting or rejecting an order.  

## Instructions

You have been hired as a consultant to determine the optimal decision process for the ripeness testing robot. To perform this analysis you will do the following steps:

1. Draw an influence diagram for the fruit inspection task. 
2. Use a combination of pgmpy and Python with numpy to compute and compare the utility of robot aided inspection.
3. Compare the utility of robot aided inspection to the current manual inspection process. 

### Influence Diagram

Once you have completed your influence diagram, display it here by replacing the Influence.jpg file. 

<img src="Influence.jpg" alt="Drawing" style="width:600px; height:400px"/>
<center> Your influence diagram must go here </center>


> **Note:** You can save your figure as a file titled Influence.jpg in the same directory as your notebook. The figure should then be visible in the notebook and in the .html you will download. 


### Define the DAG   

Before you proceed, execute the code in the cell below:

In [None]:
from pgmpy.models import BayesianModel
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination
import numpy as np

As a next step, you will define the CPDs for the DAG.   

**Fruit Quality**    

The fruit quality arriving from the orchard is characterized by the unconditional probability distribution:     

| Fruit Quality | Bad | Good |
|----|----|----|
|Probability | 0.3 | 0.7 |     

**Color Sensor**    

The color sensor determines fruit quality given the actual quality of the fruit. The sensor is not completely reliable. It will detect good fruit as bad and vice versa. The vendor has tuned the sensor to improve accuracy of bad fruit detection at the expense of good fruit detection accuracy. The CPD is as follows: 

| sensor reading | bad fruit | good fruit |
|----|----|----|
| Sensed as Bad | 0.9 | 0.20 |
| Sensed as Good | 0.1 | 0.80 |
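
As a minimal sketch of how this table can be held in numpy, the block below stores the CPD with one column per actual fruit quality and computes the marginal probability of each sensor reading. The variable names and the state order (index 0 = bad, index 1 = good) are assumptions made for illustration, not part of the assignment.

```python
import numpy as np

# Assumed state order throughout: index 0 = bad, index 1 = good.
prior_quality = np.array([0.3, 0.7])      # P(quality) from the Fruit Quality table

# Rows: sensor reading (bad, good); columns: actual quality (bad, good).
color_cpd = np.array([[0.9, 0.2],
                      [0.1, 0.8]])        # P(color reading | quality)

# Marginal of the reading: sum over quality of P(reading | quality) * P(quality).
color_marginal = color_cpd @ prior_quality
print(color_marginal)                     # approximately [0.41, 0.59]
```

The same row and column layout is the one pgmpy's `TabularCPD` expects for its `values` argument, so an array like this can usually be reused when defining the model.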

**Smell Sensor**    

The smell sensor determines fruit quality given the actual quality of the fruit. As with the color sensor, this sensor is not completely reliable. It will detect good fruit as bad and vice versa. The vendor has tuned the sensor to improve accuracy of bad fruit detection at the expense of good fruit detection accuracy. Overall, this sensor is less reliable than the color sensor. The CPD is as follows: 

| sensor reading | bad fruit | good fruit |
|----|----|----|
| Sensed as Bad | 0.8 | 0.30 |
| Sensed as Good | 0.2 | 0.70 |

**Fruit Classification**

Bob's Fruit is quite particular about the quality of fruit shipped. An order will not be shipped unless both sensors agree that the fruit is good.  

You must determine the values of this CPD and the evidence representing the decision process. This variable is conditioned on the two sensor CPDs. Keep in mind that as long as the probabilities in each column of the CPD add to 1.0, this is a valid distribution. 
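
The column rule above can be checked mechanically before a table is added to the model. The helper below is a hypothetical sketch (the name `is_valid_cpd` and the example tables are not part of the assignment); it tests that every entry is a probability and that each column sums to 1.0.

```python
import numpy as np

def is_valid_cpd(table, tol=1e-9):
    """Return True if all entries lie in [0, 1] and every column sums to 1."""
    table = np.asarray(table, dtype=float)
    in_range = np.all((table >= 0.0) & (table <= 1.0))
    columns_sum_to_one = np.allclose(table.sum(axis=0), 1.0, atol=tol)
    return bool(in_range and columns_sum_to_one)

# Hypothetical tables used only to exercise the check.
print(is_valid_cpd([[0.9, 0.2], [0.1, 0.8]]))   # every column sums to 1.0
print(is_valid_cpd([[0.9, 0.2], [0.2, 0.8]]))   # first column sums to 1.1
```

A check like this catches transposed or mistyped tables early, before the model-level check is run.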

**Customer Satisfaction**

The customer satisfaction is conditional on the actual fruit quality and the sensor classification of the shipment. Some customers will reject good shipments, whereas some customers will accept a bad shipment. This CPD is:

| Conditional Variables | Bad sensors - Bad fruit | Bad sensors - Good fruit | Good sensors - Bad fruit | Good sensors - Good fruit |  
|----|----|----|----|----|    
| Not satisfied | 0.8 | 0.1 | 0.8 | 0.1 |    
| Satisfied | 0.2 | 0.9 | 0.2 | 0.9 |  

Define these CPDs in the cell below. 

In the cell below do the following:

1. Define the DAG model.
2. Add the CPDs to the model.
3. Check the model.

### Inference and Utility Analysis

Now, you will define the utility function of customer satisfaction. In the cell below define an array for the utility function as shown in the table:

|  | Satisfied | Not Satisfied |
|----|----|----|
|Utility | 20 | -40 |

As a first step in this analysis, you will create baseline utility figures so that you can compare these to other utilities. Compute the utility for these cases:

1. The fruit is shipped without inspection. The quality of the fruit will be determined by what comes from the orchard. 
2. The fruit is 100% manually inspected by the expert human inspector, so that fruit of perfect quality is shipped, keeping in mind that customers may reject perfectly good fruit. 

> **Hint:** You may wish to do these calculations using numpy, rather than pgmpy. 
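
One possible numpy approach to the two baseline cases is sketched below. It assumes the state order index 0 = bad / not satisfied and index 1 = good / satisfied, and it uses the fact that the customer satisfaction table has identical columns for both sensor classifications, so only the actual fruit quality matters. The function and variable names are illustrative.

```python
import numpy as np

# Assumed state order: index 0 = bad / not satisfied, index 1 = good / satisfied.
utility = np.array([-40.0, 20.0])         # U(not satisfied), U(satisfied)
prior_quality = np.array([0.3, 0.7])      # P(bad), P(good) from the orchard

# P(satisfaction | quality): rows are (not satisfied, satisfied),
# columns are (bad fruit, good fruit).
satisfaction_given_quality = np.array([[0.8, 0.1],
                                       [0.2, 0.9]])

def expected_utility(quality_dist):
    """Expected utility when shipped fruit follows quality_dist."""
    satisfaction_dist = satisfaction_given_quality @ quality_dist
    return float(utility @ satisfaction_dist)

# Case 1: no inspection, so shipped fruit follows the orchard distribution.
no_inspection = expected_utility(prior_quality)
# Case 2: perfect inspection, so only good fruit is shipped.
perfect_inspection = expected_utility(np.array([0.0, 1.0]))
print(no_inspection, perfect_inspection)
```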

Does inspection of the fruit significantly improve customer satisfaction? 

ANS:  

In the cell below define a function to compute the utility given a VariableElimination object, a query variable, the utility function, and evidence dictionary. 
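
A function of this kind might look like the sketch below. It assumes a pgmpy version in which `query` returns a single factor object with a `values` array (older releases returned a dictionary instead), and the name `query_utility` is hypothetical.

```python
import numpy as np

def query_utility(inference, query_var, utility, evidence=None):
    """Expected utility of query_var given evidence.

    inference: an object with a pgmpy-style query(variables=..., evidence=...)
               method, such as a VariableElimination instance (assumed API).
    utility:   one utility value per state of query_var, in the same order
               as the states appear in that variable's CPD.
    """
    posterior = inference.query(variables=[query_var], evidence=evidence)
    probabilities = np.asarray(posterior.values, dtype=float)
    return float(np.dot(probabilities, np.asarray(utility, dtype=float)))
```

Note that the utility array must follow the state order of the query variable. If state 0 is "not satisfied", as in the customer satisfaction table, the array would be `[-40, 20]` rather than the left-to-right order of the utility table.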

In the cell below create a VariableElimination object using your model as an argument.

Now, you are ready to do a query on your model and to compute utility of the robotic fruit inspection. In the cell below do the following:

1. Compute and print the results of a query on the customer satisfaction variable, with the decision variable set as evidence so that only fruit determined good by both sensors is shipped. 
2. Compute and print the utility for customer satisfaction with the same evidence, that is, only fruit determined good by both sensors is shipped. 
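
The pgmpy result can be cross-checked by hand with Bayes' rule. The sketch below is one such check under stated assumptions: the sensor errors are independent given fruit quality, the state order is index 0 = bad and index 1 = good, and the first data column of the smell sensor table refers to bad fruit, as in the color sensor table. Variable names are illustrative.

```python
import numpy as np

prior_quality = np.array([0.3, 0.7])      # P(bad), P(good)

# P(sensor reads good | quality), taken from the "Sensed as Good" rows.
color_reads_good = np.array([0.1, 0.8])
smell_reads_good = np.array([0.2, 0.7])   # assumes column order (bad, good)

# Posterior over quality given that both sensors read good.
joint = prior_quality * color_reads_good * smell_reads_good
posterior_quality = joint / joint.sum()

# P(satisfied | quality) from the customer satisfaction table.
satisfied_given_quality = np.array([0.2, 0.9])
p_satisfied = float(satisfied_given_quality @ posterior_quality)

# Expected utility with U(satisfied) = 20 and U(not satisfied) = -40.
utility_shipped = 20.0 * p_satisfied - 40.0 * (1.0 - p_satisfied)
print(posterior_quality, p_satisfied, utility_shipped)
```

If the numbers from your model differ noticeably from this hand calculation, the state ordering of one of the CPDs is a likely place to look.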

Examine the marginal distribution of the query variable. Is this distribution (with inspection done by sensors) close to the distribution of customer satisfaction with perfect fruit inspection (done by expert human inspectors)?    

Next, compare the utility value of sensor inspection to the one found for perfect fruit inspection. Are the values similar? 

ANS 1:     

ANS 2: 

The foregoing analysis assumes the cost of either automatic or human fruit inspection is zero. It is more realistic to consider the costs of the inspection when computing and comparing the total utility for the two scenarios. 

Assume the following utility functions for human and automated (sensor) inspection:

|  | No Inspection | Inspection |
|----|----|----|
|Human Inspection Utility | 0 | -5 |   

|  | No Inspection | Inspection |
|----|----|----|
|Sensor Inspection Utility | 0 | -1 |

Using these utility functions compute and compare the total utility for both scenarios and answer these questions. 

- What is the total utility for human inspected fruit?
- What is the total utility for sensor inspected fruit?
- Which process produces higher utility? 
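
Since utilities are additive here, the total is the expected customer satisfaction utility plus the (negative) inspection utility. The sketch below illustrates this for human inspection only, assuming perfect inspection so that P(satisfied) = 0.9; the helper name `total_utility` is hypothetical.

```python
def total_utility(satisfaction_utility, inspection_utility):
    """Total utility as the sum of the satisfaction and inspection utilities."""
    return satisfaction_utility + inspection_utility

# Perfect human inspection ships only good fruit, so P(satisfied) = 0.9.
human_satisfaction_utility = 0.9 * 20 + 0.1 * -40
human_total = total_utility(human_satisfaction_utility, -5)
print(human_total)
```

The sensor case follows the same pattern, with the expected utility from your model query and an inspection utility of -1.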

ANS 1:    
ANS 2:    
ANS 3:   

### Single Sensor 

Your foregoing analysis is based on a scenario with two sensors with **independent errors**. Now, you will investigate the value of using a single sensor versus using multiple sensors.   

The color vision sensor is known to be more accurate, so the scenario is to use only this sensor. The DAG no longer needs a CPD for the smell sensor or a CPD for the classification of fruit quality, which integrated the output of the two sensors. Fruit determined to be good by the single sensor will be shipped. The decision variable (ship or not) is associated only with the single sensor CPD.     

In the cell below you will do the following:

1. Create a new CPD for customer satisfaction. The probability distribution is no different. However, this CPD is conditional on the color vision sensor CPD and actual fruit quality.  
2. Define the edges of your DAG model with only the three remaining CPDs. Notice two CPDs are unchanged.   
3. Add the CPDs to your new DAG model. 
4. Check your model!
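
The structure described in the steps above can be written down as an edge list before any CPDs are attached. The sketch below uses hypothetical node names and the standard library's `graphlib` (Python 3.9 or later) to confirm that the edges form a DAG; it is not a substitute for the pgmpy model check.

```python
from graphlib import TopologicalSorter

# Hypothetical node names; each edge is (parent, child).
edges = [
    ("Quality", "Color"),          # sensor reading depends on actual quality
    ("Quality", "Satisfaction"),   # satisfaction depends on actual quality
    ("Color", "Satisfaction"),     # and on the single sensor's classification
]

# Map every node to the set of its parents.
parents = {}
for parent, child in edges:
    parents.setdefault(parent, set())
    parents.setdefault(child, set()).add(parent)

# static_order() raises graphlib.CycleError if the edges contain a cycle.
order = list(TopologicalSorter(parents).static_order())
print(order)
```

An edge list in this (parent, child) form is also what the pgmpy model constructor accepts.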

Now, you are ready to do a query on your new model and to compute utility of the single sensor inspection. In the cell below do the following:

1. Compute and print the results of a query on the customer satisfaction variable, with the decision variable set as evidence so that only fruit determined good by the color vision sensor is shipped. 
2. Compute and print the utility for customer satisfaction with the same evidence, that is, only fruit determined good by the single sensor is shipped. 

Examine the marginal distribution of the query variable. Is this distribution close to the value of customer satisfaction for fruit inspected by the two sensor array?    

Next, compare the utility value to the one found for fruit inspected by the two sensor array. Are the values significantly different? Which method is superior? 

ANS 1:    

ANS 2:   

As with the first comparison of utilities, the cost of inspection should be considered. However, the cost difference may not be that great, since many of the fixtures and computer systems required for single sensor inspection are the same as for two sensor inspection. The utility function for single sensor inspection is shown below. 

|  | No Inspection | Inspection |
|----|----|----|
|Sensor Inspection Utility | 0 | -0.8 |

Using this utility function, compute the total utility of single sensor inspection and compare it to the total utility for multi-sensor inspection. 

- What is the total utility for single sensor inspected fruit?
- Which process produces higher total utility?   
- In understanding this difference how important do you think it is that the errors of the two sensors are independent? 

ANS 1:   
ANS 2:   
ANS 3:   