# Market Entry Analysis for Hawkeye Liquor Distributors

Created by: Anish Puthuraya, Akash Reddy, Sonal Kaur, Wenxuan Yan, Alexander Heger

__See [here](https://prod-useast-a.online.tableau.com/t/ba775team2cohorta/views/A02-Market-Entry-Analysis-For-Hawkeye-Liquors/Dashboard1?:origin=card_share_link&:embed=n) for the Tableau dashboard (Puthuraya et al.).__

<img src="https://i.ibb.co/zXYVCcq/Dashboard-1.png" width="800" alt="Tableau dashboard for the Hawkeye Liquors market entry analysis" />

<hr />

## Table of Contents

#### I. Introduction 

#### II. Data Cleaning

#### III. Data Exploration

#### IV. Data Scoping

#### V. Conclusions

#### VI. References


<hr>

## I. Introduction

Starting a business is hard.  That's why start-ups gather insights about a market before they open their doors.  These insights could include data on demand, competition, pricing, target demographics, and more.  In this report, we identify the highest-demand segments, as measured by revenue, for liquor sales in Iowa on behalf of a fictional liquor distributor named Hawkeye Liquors.  Iowa, with its growing population, high per capita liquor sales, and median age of 35.6 (Eathington), is a good candidate for new liquor sales.

The State of Iowa is one of 18 U.S. states that directly control the sale and distribution of alcoholic beverages (Iowa Alcoholic Beverages Division).  This adds a complication: liquor distributors need to understand both the demand for their products and the process of selling to the Alcoholic Beverages Division.

#### Objective
Our goal with this dataset is to perform a multi-level market analysis for Hawkeye Liquor Distributors (fictional) using SQL, with a focus on facilitating the following business decisions:

1. __Location within Iowa__ - We will analyze which regions of Iowa have the highest/lowest sales, and make a recommendation on which areas yield the highest revenue.

2. __Type of Liquor__ - Based on the regional sales of different types of liquor, we will make an overall recommendation on the type(s) of liquor Hawkeye should distribute to the State of Iowa.

3. __Pricing__ - For our recommended type(s) of liquor, we will advise Hawkeye on a target price for distribution to the State of Iowa.

4. __Distribution Channel Mix__ - Finally, we will determine which type(s) of store (grocery, wholesale, convenience, etc.) will yield the highest sales for Hawkeye's liquor products.

#### Report Summary
Our analysis shows that __the highest-revenue market for liquor in the State of Iowa is Polk County, for Canadian Whiskies priced in the $5-10 range and sold through convenience, grocery, and liquor stores__.  Each of these market characteristics is the highest-revenue value for county, type of liquor, price, and distribution channel in a public dataset from the Iowa Department of Commerce containing over 10 years of Iowa liquor sales information.

#### Data Brief 
The dataset contains every wholesale order of liquor by all grocery stores, liquor stores, convenience stores, etc., with details about the store and location, the exact liquor brand and size, and the number of bottles ordered since January 1, 2012.

The original dataset comprises 24 fields (all nullable) and 24,229,431 records and occupies 6.4 GB of storage as of October 3, 2022.

__Data sourced from the Iowa Department of Commerce via Google BigQuery.  See [here](https://data.iowa.gov/Sales-Distribution/Iowa-Liquor-Sales/m3tr-qhgy/data).__

In order to stop the ingest of new data into the table for our analysis, we froze the raw dataset in a table called `ba775-a02-fall22.main.sales` in our BigQuery project on October 3, 2022.  Our analysis is based on this extract of the original dataset cited above.

In [1]:
%%bigquery
SELECT *
FROM `bigquery-public-data.iowa_liquor_sales.sales`
LIMIT 5;

| | invoice_and_item_number | date | store_number | store_name | address | city | zip_code | store_location | county_number | county | ... | item_number | item_description | pack | bottle_volume_ml | state_bottle_cost | state_bottle_retail | bottles_sold | sale_dollars | volume_sold_liters | volume_sold_gallons |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | INV-29610300001 | 2020-08-20 | 4677 | Costco Wholesale #1111 / Coralville | 2900 Heartland Dr | Coralville | 52241.0 | POINT (-91.61494100000002 41.698028) | 52 | JOHNSON | ... | 15187 | WhistlePig 6 Year PiggyBack | 6 | 750 | 25.0 | 37.5 | 30 | 1125.0 | 22.5 | 5.94 |
| 1 | INV-32504000164 | 2020-12-08 | 5662 | Riverside Liquor 2 / Davenport | 1528 W Locust | Davenport | 52804.0 | POINT (-90.59739400000001 41.53826) | 82 | SCOTT | ... | 31277 | Midwest Gin | 12 | 1000 | 17.0 | 25.5 | 4 | 102.0 | 4.0 | 1.05 |
| 2 | S08706400005 | 2012-11-01 | 4129 | Cyclone Liquors | 626 LINCOLN WAY | AMES | 50010.0 | POINT (-93.618911 42.022854) | 85 | Story | ... | 27102 | Templeton Rye | 6 | 750 | 18.08 | 27.13 | 30 | 813.9 | 22.5 | 5.94 |
| 3 | S31637000033 | 2016-04-06 | 4599 | Quik Trip #500 / Hubbell DM | 3700 HUBBELL AVE | DES MOINES | 50317.0 | POINT (-93.54490600000001 41.629026) | 77 | Polk | ... | 56195 | Paul Masson Peach Grande Amber Brandy | 24 | 375 | 3.22 | 4.83 | 9 | 43.47 | 3.38 | 0.89 |
| 4 | S22893200038 | 2014-12-10 | 4251 | Aj's Liquor / Ames | 4518 MORTENSON RD STE 109 | AMES | 50014.0 | | 85 | Story | ... | 82607 | Dekuyper Sour Apple | 12 | 1000 | 7.62 | 11.43 | 4 | 45.72 | 4.0 | 1.06 |


#### Original Data Schema

`invoice_and_item_number` - STRING - Concatenated invoice and line number associated with the liquor order. This provides a unique identifier for the individual liquor products included in the store order.

`date` - DATE - Date of order.

`store_number` - STRING - Unique number assigned to the store who ordered the liquor.

`store_name` - STRING - Name of store who ordered the liquor.

`address` - STRING - Address of store who ordered the liquor.

`city` - STRING - City where the store who ordered the liquor is located.

`zip_code` - STRING - Zip code where the store who ordered the liquor is located.

`store_location` - STRING - Location of store who ordered the liquor. The Address, City, State and Zip Code are geocoded to provide geographic coordinates. Accuracy of geocoding is dependent on how well the address is interpreted and the completeness of the reference data used.

`county_number` - STRING - Iowa county number for the county where store who ordered the liquor is located.

`county` - STRING - County where the store who ordered the liquor is located.

`category` - STRING - Category code associated with the liquor ordered.

`category_name` - STRING - Category of the liquor ordered.

`vendor_number` - STRING - The vendor number of the company for the brand of liquor ordered.

`vendor_name` - STRING - The vendor name of the company for the brand of liquor ordered.

`item_number` - STRING - Item number for the individual liquor product ordered.

`item_description` - STRING - Description of the individual liquor product ordered.

`pack` - INTEGER - The number of bottles in a case for the liquor ordered.

`bottle_volume_ml` - INTEGER - Volume of each liquor bottle ordered in milliliters.

`state_bottle_cost` - FLOAT - The amount that Alcoholic Beverages Division paid for each bottle of liquor ordered.

`state_bottle_retail` - FLOAT - The amount the store paid for each bottle of liquor ordered.

`bottles_sold` - INTEGER - The number of bottles of liquor ordered by the store.

`sale_dollars` - FLOAT - Total cost of liquor order (number of bottles multiplied by the state bottle retail).

`volume_sold_liters` - FLOAT - Total volume of liquor ordered in liters. (i.e. (Bottle Volume (ml) x Bottles Sold)/1,000)

`volume_sold_gallons` - FLOAT - Total volume of liquor ordered in gallons. (i.e. (Bottle Volume (ml) x Bottles Sold)/3785.411784)
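
The three derived fields at the end of the schema can be checked by hand. The short sketch below recomputes them for the first sample row shown earlier (30 bottles of 750 ml at a state bottle retail of $37.50). The function name `derived_fields` is ours for illustration and is not part of the dataset or the original analysis.

```python
ML_PER_GALLON = 3785.411784  # conversion factor quoted in the schema description


def derived_fields(bottles_sold, state_bottle_retail, bottle_volume_ml):
    """Recompute sale_dollars, volume_sold_liters and volume_sold_gallons."""
    total_ml = bottle_volume_ml * bottles_sold
    return {
        "sale_dollars": round(bottles_sold * state_bottle_retail, 2),
        "volume_sold_liters": round(total_ml / 1000, 2),
        "volume_sold_gallons": round(total_ml / ML_PER_GALLON, 2),
    }


# First sample row: WhistlePig 6 Year PiggyBack
row = derived_fields(bottles_sold=30, state_bottle_retail=37.5, bottle_volume_ml=750)
print(row)
```

For this row the recomputed values agree with the stored `sale_dollars`, `volume_sold_liters`, and `volume_sold_gallons`.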

<hr />

## II. Data Cleaning 

### How did you narrow down the data set for the purpose of your analysis?

We dropped the following columns based on relevancy to the questions that we are trying to answer:


In [None]:
# %%bigquery
# CREATE OR REPLACE TABLE `ba775-a02-fall22.main.sales_columns_dropped`
# AS
#     (SELECT * EXCEPT
#         (invoice_and_item_number, 
#          store_number, 
#          item_number, 
#          address,
#          city,
#          county_number, 
#          zip_code,
#          category,
#          vendor_number, 
#          sale_dollars,
#          volume_sold_liters, 
#          volume_sold_gallons)
#     FROM `ba775-a02-fall22.main.sales`);

#### Data problems related to Decision 1 (Location within Iowa):

__Q__: Is there missing information for location?  How will you address it?

__A__: No records are missing `county` information; therefore we will scope our analysis around information at the county-level.

__Q__: Were there other formatting or missing data issues?  What was their impact?

__A1__: Some `zip_code` records were stored as FLOAT causing multiple DISTINCT records for the same store.  Our overall analysis is not impacted by this error though, considering we are focused on county-level sales.

__A2__: Many county names appeared in a mix of uppercase and lowercase across their occurrences in the data. We converted all county names to uppercase to resolve this.

In [None]:
# %%bigquery 
# UPDATE `ba775-a02-fall22.main.sales_columns_dropped`
# SET county = UPPER(county)
# WHERE county != UPPER(county);

__A3__: Four county names were misspelled in the original data, each appearing alongside its correct spelling.

In [3]:
%%bigquery 
SELECT DISTINCT(county)
FROM `ba775-a02-fall22.main.sales`
WHERE (
 county LIKE 'BUENA VIST%'
 OR county LIKE 'BUENA VISTA%'
 OR county LIKE 'POTTAWATTA%'
 OR county LIKE 'POTTAWATTAMIE%'
 OR county LIKE 'O\'BRIEN%'
 OR county LIKE 'OBRIEN%'
 OR county LIKE 'CERRO GORD%'
 OR county LIKE 'CERRO GORDO%'
);

| county |
| --- |
| CERRO GORDO |
| BUENA VIST |
| O'BRIEN |
| POTTAWATTAMIE |
| CERRO GORD |
| POTTAWATTA |
| BUENA VISTA |
| OBRIEN |


__A4:__ Additionally, two records were present from El Paso County, which is not in Iowa.
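
The notebook does not show the step that merged the misspelled names or removed the El Paso records, so the following is only a sketch of one way to do it, not the authors' query. The mapping is taken from the query output above; the helper name `clean_county` and the assumed spelling `EL PASO` are ours.

```python
# Misspelled -> correct spellings, taken from the query output above.
COUNTY_FIXES = {
    "BUENA VIST": "BUENA VISTA",
    "CERRO GORD": "CERRO GORDO",
    "POTTAWATTA": "POTTAWATTAMIE",
    "OBRIEN": "O'BRIEN",
}
# Assumption: how El Paso County is spelled in the data once uppercased.
NON_IOWA = {"EL PASO"}


def clean_county(raw):
    """Return the normalized county name, or None if the record should be dropped."""
    if raw is None:
        return None
    name = raw.strip().upper()
    name = COUNTY_FIXES.get(name, name)
    return None if name in NON_IOWA else name


samples = ["Polk", "POLK", "Buena Vist", "OBRIEN", "El Paso"]
print([clean_county(s) for s in samples])
```

Applying the uppercase step first means the mapping only needs one entry per misspelling.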

#### Data problems related to Decision 2 (Type of Liquor): 

__Q__: Is there missing information for type of liquor?  How will you address it?

__A__: 24,644 records for `category_name` are NULL.  However, considering this is 0.1% of the dataset, we will exclude these null records.

In [4]:
%%bigquery  
SELECT COUNT(*) 
FROM `ba775-a02-fall22.main.sales`
WHERE category_name IS NULL;

| f0_ |
| --- |
| 24644 |


__Q__: Were there other formatting or missing data issues?  What was their impact? 

__A1__: `COUNT(DISTINCT category)` returns 167 different category codes, 111 of which end in ".0", as if a previous schema for the dataset stored this field as FLOAT.  This informed our decision to use `category_name` as the primary field for determining type of liquor.  

In [6]:
%%bigquery 
SELECT 
  COUNT(DISTINCT category)
FROM `ba775-a02-fall22.main.sales`
WHERE category LIKE "%.0";

| f0_ |
| --- |
| 111 |


__A2__: Additionally, some `category_name` values are duplicated with different capitalization, which we fix using the query below.

In [8]:
# %%bigquery 
# UPDATE `ba775-a02-fall22.main.sales_columns_dropped`
# SET category_name = UPPER(category_name)
# WHERE category_name != UPPER(category_name)

#### Data problems related to Decision 3 (Pricing): 

There are no data problems impacting our analysis related to pricing.

#### Data problems related to Decision 4 (Distribution Channel Mix):

__Q__: Is there missing information on stores?  How will you address it?

__A__: No information is missing from `store_name`; however, no clear categorization of stores exists in the original dataset.  We address this in Section IV, Data Scoping.

<hr>

## III. Data Exploration

#### Findings related to Decision 1 (Location within Iowa):

__Q__: How is location represented in the data schema?

__A__: The original dataset includes location information in the `address`, `city`, `zip_code`, `store_location`, and `county` fields, which are all stored as STRING.  For the purposes of our analysis we will use the information from `county`.

__Q__: How many counties in the dataset had liquor distributed to them?

__A__: 99 counties

In [7]:
%%bigquery
SELECT 
  COUNT(DISTINCT UPPER(county)) AS counties
FROM `ba775-a02-fall22.main.sales_columns_dropped`

| counties |
| --- |
| 99 |


__Q__: Which of those counties had the highest number of liquor sales?

__A__: Polk County

In [8]:
%%bigquery
SELECT 
    county, 
    COUNT(*) AS value_occurrences
FROM `ba775-a02-fall22.main.sales_columns_dropped`
GROUP BY county
ORDER BY value_occurrences DESC
LIMIT 1;

| county | value_occurrences |
| --- | --- |
| POLK | 4442375 |


#### Findings related to Decision 2 (Type of Liquor):

__Q__: How is type of liquor represented in the data schema?

__A__: Type of liquor information is stored in the `category` and `category_name` fields, which are stored as STRING.  We will use `category_name` for our analysis (see Data Cleaning for details).

__Q__: How many types of liquor were sold in the original dataset?

__A__: 107, once singular and plural spellings of the same category are merged

In [9]:
%%bigquery
SELECT 
  DISTINCT singular_name,
FROM 
  (SELECT CASE WHEN category_name LIKE '%BRANDY'
    THEN REPLACE(category_name, 'BRANDY', 'BRANDIES')
    WHEN category_name LIKE '%VODKA'
    THEN REPLACE(category_name, 'VODKA', 'VODKAS')
    WHEN category_name LIKE '%WHISKEY'
    THEN REPLACE(category_name, 'WHISKEY', 'WHISKIES')
    WHEN category_name LIKE '%GIN'
    THEN REPLACE(category_name, 'GIN', 'GINS')
    WHEN category_name LIKE '%LIQUEUR'
    THEN REPLACE(category_name, 'LIQUEUR', 'LIQUEURS')
    ELSE category_name END AS singular_name
  FROM `ba775-a02-fall22.main.sales_columns_dropped`)
WHERE singular_name IS NOT NULL;

| | singular_name |
| --- | --- |
| 0 | STRAIGHT BOURBON WHISKIES |
| 1 | MISC. IMPORTED CORDIALS & LIQUEURS |
| 2 | AMERICAN FLAVORED VODKAS |
| 3 | IMPORTED CORDIALS & LIQUEURS |
| 4 | IMPORTED VODKAS |
| ... | ... |
| 102 | DELISTED ITEMS |
| 103 | DELISTED / SPECIAL ORDER ITEMS |
| 104 | HOLIDAY VAP |
| 105 | IMPORTED VODKA - CHERRY |

107 rows returned (output truncated).
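
To make the effect of the `CASE` expression in the query above easier to follow, here is a rough Python equivalent. It is a sketch only: the function name `pluralize_category` is ours, and the input strings in the usage lines are illustrative rather than confirmed raw values.

```python
# Singular ending -> plural form, mirroring the CASE branches in the query.
PLURALS = [
    ("BRANDY", "BRANDIES"),
    ("VODKA", "VODKAS"),
    ("WHISKEY", "WHISKIES"),
    ("GIN", "GINS"),
    ("LIQUEUR", "LIQUEURS"),
]


def pluralize_category(name):
    """Apply the first matching rule; names that match no rule pass through."""
    if name is None:
        return None  # NULL stays NULL, as in the ELSE branch
    for singular, plural in PLURALS:
        if name.endswith(singular):  # same test as LIKE '%<singular>'
            return name.replace(singular, plural)  # REPLACE swaps every occurrence
    return name


print(pluralize_category("IMPORTED VODKA"))
print(pluralize_category("IMPORTED VODKA - CHERRY"))
```

Because the rules only match names that end in the singular word, a name such as `IMPORTED VODKA - CHERRY` is left unchanged, which is consistent with it appearing as its own row in the output.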


__Q__: Which type of liquor sold the most in the original dataset?

__A__: Canadian Whiskies

In [11]:
%%bigquery
SELECT 
  UPPER(category_name) AS category_name,
  COUNT(*) AS value_occurrences
FROM `ba775-a02-fall22.main.sales_columns_dropped`
GROUP BY category_name
ORDER BY value_occurrences DESC
LIMIT 1;

| category_name | value_occurrences |
| --- | --- |
| CANADIAN WHISKIES | 2308400 |


#### Findings related to Decision 3 (Pricing):

__Q__: How is pricing information represented in the dataset?

__A__: The target field for our analysis is `state_bottle_cost` (FLOAT), since this is the price the Iowa Alcoholic Beverages Division pays per bottle.  The `state_bottle_retail` (FLOAT) field contains the price each store paid to the state for each bottle.

#### Findings related to Decision 4 (Distribution Channel Mix):

__Q__: How is store information represented in the dataset?

__A__: Stores are identified by `store_name` and `store_number` fields as STRING in the original dataset.  In order to glean information regarding distribution channel mix, we will need to further categorize stores by their types (see Section IV).

<hr />

## IV. Data Scoping

### Generating Revenue Column

We calculate revenue for each record by multiplying the `pack` and `state_bottle_cost` columns, i.e. the price the state pays for one case of the product ordered.  This gives us a fundamental business metric by which we can recommend the highest revenue decisions for Hawkeye Liquor Distributors.


In [None]:
# %%bigquery
# CREATE OR REPLACE TABLE ba775-a02-fall22.main.sales_revenue
# AS 
# SELECT
#   * , 
#   (state_bottle_cost*pack) AS revenue
# FROM ba775-a02-fall22.main.sales_columns_dropped;
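
As a quick illustration of the revenue definition used in this report, the sketch below applies it to two of the sample rows shown in the Data Brief. The function name `case_revenue` is ours and is not part of the original notebook.

```python
def case_revenue(pack, state_bottle_cost):
    """Revenue as defined in this report: bottles per case times cost per bottle."""
    return round(pack * state_bottle_cost, 2)


# pack and state_bottle_cost taken from two sample rows in the Data Brief
print(case_revenue(6, 25.0))   # WhistlePig 6 Year PiggyBack
print(case_revenue(24, 3.22))  # Paul Masson Peach Grande Amber Brandy
```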

### Creating Bucketing for Prices

We create a new table, `price_buckets`, which assigns each distinct `state_bottle_cost` to one of 8 price buckets whose boundaries are multiples of $5. This lets us make a focused pricing recommendation for Hawkeye.

In [None]:
# %%bigquery
# CREATE TABLE ba775-a02-fall22.main.price_buckets AS
# SELECT
# DISTINCT state_bottle_cost,
# CASE
#     WHEN state_bottle_cost > 75 THEN '$75+'
#     WHEN state_bottle_cost BETWEEN 50 AND 75 THEN '$50-75'
#     WHEN state_bottle_cost BETWEEN 25 AND 50 THEN '$25-50'
#     WHEN state_bottle_cost BETWEEN 20 AND 25 THEN '$20-25'
#     WHEN state_bottle_cost BETWEEN 15 AND 20 THEN '$15-20'
#     WHEN state_bottle_cost BETWEEN 10 AND 15 THEN '$10-15'
#     WHEN state_bottle_cost BETWEEN 5 AND 10 THEN '$5-10'
#     WHEN state_bottle_cost BETWEEN 0 AND 5 THEN '$0-5'
# END AS price_bucket
# FROM `ba775-a02-fall22.main.sales_columns_dropped`
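
Because `BETWEEN` is inclusive on both ends, adjacent buckets in the query above share a boundary value, and the order of the `WHEN` branches decides which bucket gets it. The Python sketch below mirrors that logic so the boundary behaviour can be checked directly; the function name `price_bucket` is ours.

```python
def price_bucket(cost):
    """Mirror the CASE expression: the first matching branch wins."""
    if cost is None:
        return None
    if cost > 75:
        return "$75+"
    # Each shared boundary is claimed by whichever branch is tested first,
    # which is the higher bucket (except 75 itself, handled above).
    ranges = [(50, 75), (25, 50), (20, 25), (15, 20), (10, 15), (5, 10), (0, 5)]
    for low, high in ranges:
        if low <= cost <= high:
            return f"${low}-{high}"
    return None  # negative costs match no branch, as in the SQL


for cost in (4.99, 5, 9.99, 10, 75, 75.01):
    print(cost, price_bucket(cost))
```

In effect the `$5-10` bucket covers costs from $5.00 up to, but not including, $10.00.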

### Creating Mapping for Distribution Channels

No clear categorization of stores exists in the original dataset, so we *__manually__* assigned a `store_category` to all 2,424 stores in a new table called `store_to_channel_mapping`, which we later JOIN with our sales data.  

Each member of the group took a fifth of the records, researched each store, and assigned it to a category based on their findings. The categories are 'Convenience', 'Grocery', 'Hospitality', 'Liquor', and 'Wholesale'.  This categorization informs our recommendation to Hawkeye Liquor Distributors on how to balance its approach to selling across store channels.

In [12]:
%%bigquery
SELECT *
FROM `ba775-a02-fall22.main.store_to_channel_mapping`
LIMIT 100;

| | store_category | store_name |
| --- | --- | --- |
| 0 | Liquor | "Double ""D"" Liquor Store" |
| 1 | Liquor | 1st Stop Beverage Shop |
| 2 | Liquor | 218 Fuel Express & Chubby's Liquor |
| 3 | Liquor | 7 Rayos Liquor Store |
| 4 | Liquor | 7Star Liquor & Tobacco Outlet |
| ... | ... | ... |
| 95 | Liquor | East End Liquor / Des Moines |
| 96 | Liquor | East End Liquor & Tobacco |
| 97 | Liquor | East Side Liquor & Grocery |
| 98 | Liquor | East Side Liquor & Grocery / Marshalltown |

100 rows returned (output truncated).


<hr />

## V. Conclusions

### Location within Iowa

We saw earlier that Polk County has the highest number of liquor sales in Iowa. Ranking counties by revenue, __Polk__ remains the leader with about __$467 Million__ in revenue. Therefore, we recommend that Hawkeye Liquor Distributors base its operations in Polk County.

<img src="https://i.ibb.co/hM8dVwg/Screenshot-2022-10-06-at-1-09-16-AM.jpg" width= 500 alt="Alt text that describes the graphic" title="Title text" />


In [2]:
%%bigquery
SELECT 
    UPPER(county) AS county, 
    SUM(revenue) AS revenue
FROM `ba775-a02-fall22.main.sales_columns_dropped`
GROUP BY UPPER(county)
ORDER BY revenue DESC
LIMIT 10;

| county | revenue |
| --- | --- |
| POLK | 466664300.0 |
| LINN | 203611800.0 |
| SCOTT | 150185000.0 |
| BLACK HAWK | 137633300.0 |
| JOHNSON | 128676300.0 |
| STORY | 80938320.0 |
| POTTAWATTAMIE | 80097790.0 |
| WOODBURY | 77565050.0 |
| DUBUQUE | 72006670.0 |
| CERRO GORDO | 53759280.0 |


### Type of Liquor

__Canadian Whiskies__ have the highest sales in Polk County, with __$38 Million__ in total revenue.  Therefore, Hawkeye Liquors should distribute Canadian Whiskies if it seeks to target the highest-demand liquor product in the Polk County, Iowa market.

<img src="https://i.ibb.co/WyphWWh/Screenshot-2022-10-06-at-1-06-36-AM.jpg" width= 500 alt="Alt text that describes the graphic" title="Title text" />

In [14]:
%%bigquery
SELECT 
    category_name,
    SUM(revenue) AS revenue
FROM
    `ba775-a02-fall22.main.sales_columns_dropped`
WHERE 
    UPPER(county) = 'POLK'
GROUP BY category_name
ORDER BY revenue DESC
LIMIT 10;

| category_name | revenue |
| --- | --- |
| CANADIAN WHISKIES | 38062440.0 |
| STRAIGHT BOURBON WHISKIES | 34266730.0 |
| AMERICAN VODKAS | 28831130.0 |
| SPICED RUM | 19097560.0 |
| TENNESSEE WHISKIES | 18646930.0 |
| WHISKEY LIQUEUR | 17414130.0 |
| 100% AGAVE TEQUILA | 17219170.0 |
| SINGLE MALT SCOTCH | 15288360.0 |
| IMPORTED BRANDIES | 15121640.0 |
| VODKA 80 PROOF | 15087480.0 |


### Pricing

Joining the sales data with the `price_buckets` table and grouping by price bucket, we find that the highest revenue price range is the following:

##### \$5-10

Therefore, Hawkeye Liquors should __focus its sales__ in this price bracket for its Canadian Whiskies to generate maximum revenue.

<img src="https://i.ibb.co/kK3nsgK/Screenshot-2022-10-06-at-1-22-14-AM.jpg" width= 500 alt="Alt text that describes the graphic" title="Title text" />

In [15]:
%%bigquery
SELECT
   price_bucket, SUM(revenue) AS revenue
FROM
   (
   SELECT *
   FROM `ba775-a02-fall22.main.sales_columns_dropped`
   WHERE UPPER(county) = 'POLK' AND category_name = 'CANADIAN WHISKIES'
   )
LEFT JOIN  
   ba775-a02-fall22.main.price_buckets
USING(state_bottle_cost)
GROUP BY
   price_bucket
ORDER BY
   revenue DESC;

| price_bucket | revenue |
| --- | --- |
| $5-10 | 16166699.14 |
| $15-20 | 9339448.02 |
| $0-5 | 4977742.84 |
| $10-15 | 4194265.82 |
| $25-50 | 2384647.56 |
| $20-25 | 594067.98 |
| $50-75 | 377285.64 |
| $75+ | 28285.64 |


### Distribution Channel Mix

Joining the sales data with the `price_buckets` and `store_to_channel_mapping` tables and grouping by `store_category`, we observe that the highest revenue generating distribution channels were the following:

##### Convenience
##### Grocery
##### Liquor

Together these account for __95.5%__ of the total revenue from Canadian Whiskies sales (priced between $5-10) in Polk County, and are therefore the 3 distribution channels on which Hawkeye should focus.

<img src="https://i.ibb.co/v1WNQBy/Screenshot-2022-10-06-at-1-26-23-AM.jpg" width= 500 alt="Alt text that describes the graphic" title="Title text" />

In [20]:
%%bigquery
SELECT 
    store_category, SUM(REVENUE) revenue
FROM
    (SELECT *
    FROM ba775-a02-fall22.main.sales_columns_dropped
    LEFT JOIN  
    ba775-a02-fall22.main.price_buckets
    USING(state_bottle_cost))
INNER JOIN
    ba775-a02-fall22.main.store_to_channel_mapping
USING(store_name)
WHERE
    category_name = 'CANADIAN WHISKIES'
    AND UPPER(county) = 'POLK'
    AND price_bucket = '$5-10'
GROUP BY
    store_category
ORDER BY
    revenue DESC;

| store_category | revenue |
| --- | --- |
| Convenience | 6969194.22 |
| Grocery | 4559402.06 |
| Liquor | 2916802.22 |
| Wholesale | 650566.44 |
| Hospitality | 23762.28 |
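
The 95.5% figure quoted in this section can be reproduced from the query output. The sketch below copies the five revenue totals and computes the share held by the top three channels; the helper name `top_share` is ours.

```python
# Revenue by store category, copied from the query output above.
channel_revenue = {
    "Convenience": 6969194.22,
    "Grocery": 4559402.06,
    "Liquor": 2916802.22,
    "Wholesale": 650566.44,
    "Hospitality": 23762.28,
}


def top_share(revenue_by_channel, n):
    """Percentage of total revenue contributed by the n largest channels."""
    ranked = sorted(revenue_by_channel.values(), reverse=True)
    return 100 * sum(ranked[:n]) / sum(ranked)


print(f"{top_share(channel_revenue, 3):.1f}%")
```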


#### Closing Thoughts

Polk County is the fastest growing, most highly paid region in Iowa (Eathington), which makes it the premier location for Hawkeye Liquors to begin selling its product.  Polk County also has the highest liquor sales revenue in the state, which makes the argument more convincing still. While Hawkeye will meet competition in the more urban setting of Polk County, the region's high demand offers it an opportunity to succeed.

We recommend that Hawkeye focus its efforts on Canadian whiskies, since these are the most frequently sold liquors in Iowa and the highest-revenue liquors in Polk County.  However, Hawkeye will need to ensure its product remains competitive with customers and the state.

Considering that Hawkeye will sell directly to the State of Iowa, we recommend a target price in the $5-10 range, since this range yielded the highest revenue to distributors.  This price point also aligns with the consumer base of the convenience-driven distribution channels across Iowa.  

<hr />

## VI. References

Eathington, Liesl. (2019). Retail Trade Analysis Fiscal Year 2019. *Iowa State University Department of Economics*. https://www.icip.iastate.edu/sites/default/files/retail/retail_19153.pdf

Iowa Alcoholic Beverages Division. (2022). Iowa ABD. *Iowa.gov*. https://abd.iowa.gov/iowa-abd#:~:text=Breadcrumb&text=Since%20the%20repeal%20of%20prohibition,of%20alcohol%20and%20tobacco%20products

Iowa Department of Commerce. (2022). Iowa Liquor Sales. *Iowa Data*. https://data.iowa.gov/Sales-Distribution/Iowa-Liquor-Sales/m3tr-qhgy/data

Puthuraya, Anish et al. (2022). Market Entry Analysis: Hawkeye Liquors. *Boston University Questrom School of Business*. https://prod-useast-a.online.tableau.com/t/ba775team2cohorta/views/A02-Market-Entry-Analysis-For-Hawkeye-Liquors/Dashboard1?:origin=card_share_link&:embed=n

