## Tests of Goodness of Fit (Determine Whether a Population Being Sampled Has a Specific Probability Distribution)
### Multinomial Probability Distribution
Consider the market share study being conducted by Scott Marketing Research. Over the past year, market shares for a certain product have stabilized at 30% for company A, 50% for company B, and 20% for company C. Since each customer is classified as buying from one of these companies, we have a multinomial probability distribution with three possible outcomes. The probability for each of the three outcomes is as follows.

$p_A$: probability a customer purchases the company A product 

$p_B$: probability a customer purchases the company B product 

$p_C$: probability a customer purchases the company C product

Using the historical market shares, we have a multinomial probability distribution with $p_A$ = .30, $p_B$ = .50, and $p_C$ = .20.
Company C plans to introduce a “new and improved” product to replace its current entry in the market. Company C has retained Scott Marketing Research to determine whether the new product will change the market shares for the three companies. Specifically, the study will introduce a sample of customers to the new company C product and then ask each customer to indicate a preference for the company A product, the company B product, or the new company C product. Based on the sample data, the following hypothesis test can be used to determine whether the new company C product is likely to change the historical market shares for the three companies.

$H_0$: $p_A$ = .30, $p_B$ = .50, and $p_C$ = .20

$H_a$: The population proportions are not $p_A$ = .30, $p_B$ = .50, and $p_C$ = .20

Let us assume that the market research firm has used a consumer panel of 200 customers. Each customer was asked to specify a purchase preference among the three alternatives: company A’s product, company B’s product, and company C's new product.
#### The Observed Frequencies

| Company A's Product | Company B's Product | Company C's Product|
|:-------------:|:-------------:|:-------------:|
| 48            | 98          |   54         |

#### The Expected Frequencies

| Company A's Product | Company B's Product | Company C's Product|
|:-------------:|:-------------:|:-------------:|
| 200(.3) = 60            | 200(.5) = 100         |   200(.2) = 40         |
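
The expected frequencies in the table can be reproduced by multiplying the sample size by each hypothesized proportion. The following is a minimal sketch; the names `n`, `p0`, and `expected_counts` are illustrative and not part of the original notebook. It also checks the usual rule of thumb that every expected frequency should be at least 5 for the chi-square approximation to be reasonable.

```python
# Sample size and hypothesized proportions under H0 (values from the text above)
n = 200
p0 = {"A": 0.30, "B": 0.50, "C": 0.20}

# Expected frequency for each category: n * p
expected_counts = {company: n * p for company, p in p0.items()}

# Rule of thumb: the chi-square approximation needs every expected count >= 5
all_large_enough = all(count >= 5 for count in expected_counts.values())

for company, count in expected_counts.items():
    print(f"Company {company}: expected frequency = {count:.0f}")
print("All expected frequencies >= 5:", all_large_enough)
```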


In [1]:
from scipy.stats import chi2

In [2]:
observed = [48, 98, 54]
expected = [60, 100, 40]
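
The goodness-of-fit test statistic compares the observed and expected frequencies category by category. Written out for this example, with $f_i$ and $e_i$ as assumed notation for the observed and expected frequency of category $i$ and $k = 3$ categories:

$$
\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}
= \frac{(48-60)^2}{60} + \frac{(98-100)^2}{100} + \frac{(54-40)^2}{40}
= 2.40 + 0.04 + 4.90 = 7.34
$$

When $H_0$ is true, this statistic approximately follows a chi-square distribution with $k - 1 = 2$ degrees of freedom.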

In [3]:
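# Chi-square statistic: sum of (observed - expected)^2 / expected over the categories
chi_square = sum((o - e)**2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # k - 1 = 2 degrees of freedom
crit = chi2.ppf(0.95, df)  # critical value at the 0.05 significance level
p_value = 1 - chi2.cdf(chi_square, df)  # upper-tail p-value
print(chi_square, crit, p_value)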

7.34 5.99146454711 0.0254764699467
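
As a cross-check, the same test can be run with `scipy.stats.chisquare`, which takes the observed and expected frequencies and returns the statistic and p-value directly. This is a sketch, not the method used in the original notebook; the significance level `alpha = 0.05` is an assumption that matches the critical value computed above.

```python
from scipy.stats import chisquare

observed = [48, 98, 54]
expected = [60, 100, 40]
alpha = 0.05  # assumed significance level

# chisquare uses k - 1 degrees of freedom by default (ddof=0)
statistic, p_value = chisquare(f_obs=observed, f_exp=expected)

# Reject H0 when the p-value falls below the significance level
reject_h0 = p_value < alpha
print(f"chi-square = {statistic:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Because the test statistic 7.34 exceeds the critical value 5.99 (equivalently, the p-value of about .025 is below .05), $H_0$ is rejected at the .05 level: the sample suggests that the new company C product changes the historical market shares.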


<img src="Fig12-5.bmp">