![](imgs/kodolamaczlogo.png)

# Big Data Processing with Apache Spark

Notebook author: Jakub Nowacki.

## Window functions in Spark

Window functions were added to Spark in version 1.4, yet they remain one of its lesser-known features; see the [Databricks blog post](https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html) that introduces them. This notebook walks through the most important window functions available in Spark.

## Downloading the data

The dataset for this exercise is sample sales data published by IBM, available [here](https://www.ibm.com/communities/analytics/watson-analytics-blog/sales-products-sample-data/). The code below downloads the data automatically.

In [1]:
import findspark
findspark.init()

In [3]:
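# Workaround for the CERTIFICATE_VERIFY_FAILED exception when downloading the dataset.
# Note: this turns off HTTPS certificate verification for the whole process.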
import requests
requests.packages.urllib3.disable_warnings()
import ssl

try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    # Legacy Python that doesn't verify HTTPS certificates by default
    pass
else:
    # Handle target environment that doesn't support HTTPS verification
    ssl._create_default_https_context = _create_unverified_https_context

In [4]:
import os
import urllib.request

# Create the target directory if it does not exist yet
os.makedirs('data', exist_ok=True)

url = 'https://community.watsonanalytics.com/wp-content/uploads/2015/08/WA_Sales_Products_2012-14.csv'
csv_file = 'data/WA_Sales_Products_2012-14.csv'

# Download the CSV; the trailing semicolon suppresses the notebook output
urllib.request.urlretrieve(url, csv_file);

We initialize the Spark session as usual.

In [5]:
import pyspark
import pyspark.sql.functions as func

spark = pyspark.sql.SparkSession.builder\
    .appName('why_sql')\
    .getOrCreate()

The data is in CSV format, so we can use the built-in CSV reader; the dataset is fairly small and of good quality, so we can afford to let Spark infer the schema automatically.

In [6]:
sales = spark.read.csv(csv_file, header=True, inferSchema=True)

# A temp view is only needed for SQL string queries, not for the DataFrame API.
# createOrReplaceTempView keeps the cell safe to re-run.
sales.createOrReplaceTempView('sales')

sales.printSchema()
sales.show()

root
 |-- Retailer country: string (nullable = true)
 |-- Order method type: string (nullable = true)
 |-- Retailer type: string (nullable = true)
 |-- Product line: string (nullable = true)
 |-- Product type: string (nullable = true)
 |-- Product: string (nullable = true)
 |-- Year: integer (nullable = true)
 |-- Quarter: string (nullable = true)
 |-- Revenue: double (nullable = true)
 |-- Quantity: integer (nullable = true)
 |-- Gross margin: double (nullable = true)

+----------------+-----------------+-------------+--------------------+--------------------+--------------------+----+-------+---------+--------+------------+
|Retailer country|Order method type|Retailer type|        Product line|        Product type|             Product|Year|Quarter|  Revenue|Quantity|Gross margin|
+----------------+-----------------+-------------+--------------------+--------------------+--------------------+----+-------+---------+--------+------------+
|   United States|              Fax|Outdoors Sho

## What are window functions?

Let's try to find the products with the highest revenue (`Revenue`).

In [7]:
sales.select('Product', 'Product line', 'Product type', 'Year', 'Revenue')\
    .orderBy(func.desc('Revenue'))\
    .limit(10)\
    .show()

+--------------------+--------------------+-------------+----+----------+
|             Product|        Product line| Product type|Year|   Revenue|
+--------------------+--------------------+-------------+----+----------+
|Hailstorm Titaniu...|      Golf Equipment|        Woods|2014|1635687.96|
|        Star Gazer 2|   Camping Equipment|        Tents|2014| 1486717.1|
|           Star Lite|   Camping Equipment|        Tents|2013|1415141.91|
|Hailstorm Titaniu...|      Golf Equipment|        Woods|2014| 1388659.5|
|        Star Gazer 2|   Camping Equipment|        Tents|2013| 1335112.9|
|        Star Gazer 2|   Camping Equipment|        Tents|2013| 1311874.3|
|                Zone|Personal Accessories|      Eyewear|2013|1230450.95|
|Hailstorm Titaniu...|      Golf Equipment|        Woods|2013|1226669.53|
|           Star Lite|   Camping Equipment|        Tents|2012|1210413.68|
|  Hibernator Extreme|   Camping Equipment|Sleeping Bags|2013|1199043.84|
+--------------------+----------------

However, there is more to the data, depending on, for example, the product line, product type, or year.

### Exercise

1. Find how many unique product lines (`Product line`), product types (`Product type`), and report years (`Year`) there are, and what they are.
1. Group by these columns and look at the revenue ranges.

Because quite a few descriptive variables affect revenue, such a global sort is not very informative. Grouping, for example by year, is more useful. So how do we group the data to get the 3 highest-earning products in each year? We can use the `rank` window function.

In [20]:
rows = sales.select('Product line').distinct()
rows.count(), rows.show(truncate=False)

+------------------------+
|Product line            |
+------------------------+
|Camping Equipment       |
|Golf Equipment          |
|Mountaineering Equipment|
|Outdoor Protection      |
|Personal Accessories    |
+------------------------+



(5, None)

In [23]:
rows = sales.select('Product type').distinct()
rows.count(), rows.show(rows.count())

+--------------------+
|        Product type|
+--------------------+
|             Eyewear|
|          Navigation|
|Climbing Accessories|
|               Tents|
|             Putters|
|               Tools|
|               Woods|
|   Insect Repellents|
|           First Aid|
|               Packs|
|          Binoculars|
|            Lanterns|
|        Cooking Gear|
|              Safety|
|             Watches|
|                Rope|
|               Irons|
|       Sleeping Bags|
|           Sunscreen|
|              Knives|
|    Golf Accessories|
+--------------------+



(21, None)

In [25]:
rows = sales.select('Year').distinct()
rows.count(), rows.toPandas()

(3,    Year
 0  2013
 1  2014
 2  2012)

In [32]:
sales.groupBy('Product line').agg(func.count('Product line'), func.max('Revenue'), func.min('Revenue')).show()

+--------------------+-------------------+------------+------------+
|        Product line|count(Product line)|max(Revenue)|min(Revenue)|
+--------------------+-------------------+------------+------------+
|   Camping Equipment|              24866|   1486717.1|         0.0|
|      Golf Equipment|               7764|  1635687.96|         0.0|
|Mountaineering Eq...|               7943|    725496.0|         0.0|
|  Outdoor Protection|               8620|   160956.39|         0.0|
|Personal Accessories|              39282|  1230450.95|         0.0|
+--------------------+-------------------+------------+------------+



In [38]:
w = pyspark.sql.Window\
    .partitionBy('Year')\
    .orderBy(func.desc('Revenue'))

year_with_rank = sales.withColumn('rank', func.rank().over(w))\
    .select('Product', 'Product line', 'Product type', 'Year', 'Revenue', 'Rank')

year_with_rank.show()

+--------------------+--------------------+-------------+----+----------+----+
|             Product|        Product line| Product type|Year|   Revenue|Rank|
+--------------------+--------------------+-------------+----+----------+----+
|           Star Lite|   Camping Equipment|        Tents|2013|1415141.91|   1|
|        Star Gazer 2|   Camping Equipment|        Tents|2013| 1335112.9|   2|
|        Star Gazer 2|   Camping Equipment|        Tents|2013| 1311874.3|   3|
|                Zone|Personal Accessories|      Eyewear|2013|1230450.95|   4|
|Hailstorm Titaniu...|      Golf Equipment|        Woods|2013|1226669.53|   5|
|  Hibernator Extreme|   Camping Equipment|Sleeping Bags|2013|1199043.84|   6|
|                Zone|Personal Accessories|      Eyewear|2013| 1118965.3|   7|
|           Star Lite|   Camping Equipment|        Tents|2013|1052750.28|   8|
|            Infinity|Personal Accessories|      Watches|2013|  973804.4|   9|
|        Star Gazer 2|   Camping Equipment|        T

In [34]:
year_with_rank.where('Rank <= 3').show()

+--------------------+--------------------+------------+----+----------+----+
|             Product|        Product line|Product type|Year|   Revenue|Rank|
+--------------------+--------------------+------------+----+----------+----+
|           Star Lite|   Camping Equipment|       Tents|2013|1415141.91|   1|
|        Star Gazer 2|   Camping Equipment|       Tents|2013| 1335112.9|   2|
|        Star Gazer 2|   Camping Equipment|       Tents|2013| 1311874.3|   3|
|Hailstorm Titaniu...|      Golf Equipment|       Woods|2014|1635687.96|   1|
|        Star Gazer 2|   Camping Equipment|       Tents|2014| 1486717.1|   2|
|Hailstorm Titaniu...|      Golf Equipment|       Woods|2014| 1388659.5|   3|
|           Star Lite|   Camping Equipment|       Tents|2012|1210413.68|   1|
|                Zone|Personal Accessories|     Eyewear|2012| 1042285.0|   2|
|                Zone|Personal Accessories|     Eyewear|2012| 1009957.9|   3|
+--------------------+--------------------+------------+----+---
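
To make the semantics of `rank` over a partitioned, ordered window more concrete, here is a minimal plain-Python emulation. It is only a sketch of the idea, not how Spark computes it: the rows in `toy_sales` and the helper `rank_within_partitions` are made up for illustration and are not part of the dataset or of the PySpark API.

```python
from collections import defaultdict

# Toy rows (made up for illustration): (product, year, revenue).
toy_sales = [
    ("Tent A", 2012, 500.0),
    ("Tent B", 2012, 300.0),
    ("Stove", 2012, 300.0),
    ("Lamp", 2012, 100.0),
    ("Tent A", 2013, 700.0),
    ("Stove", 2013, 200.0),
]


def rank_within_partitions(rows, top_n):
    """Emulate RANK() OVER (PARTITION BY year ORDER BY revenue DESC)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[1]].append(row)  # partition by year

    result = []
    for year in sorted(partitions):
        ordered = sorted(partitions[year], key=lambda r: r[2], reverse=True)
        rank = 0
        previous_revenue = None
        for position, row in enumerate(ordered, start=1):
            # Tied revenues share a rank; the next distinct value takes
            # its position in the ordering, which leaves a gap.
            if row[2] != previous_revenue:
                rank = position
                previous_revenue = row[2]
            if rank <= top_n:
                result.append(row + (rank,))
    return result


top_two = rank_within_partitions(toy_sales, top_n=2)
for row in top_two:
    print(row)
```

Because ties share a rank, a filter such as `rank <= 2` can return more than two rows per partition; in this toy data, 2012 yields three rows.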

### Exercise

1. Apply the same `rank` function to a different column.
1. Apply the `rank` function over several columns.
1. Drop the `rank` column.
1. ★ How much more did these products earn than the average for the given year?

The same can be written in SQL, as shown below. The `RANK` function is, incidentally, part of the ANSI SQL 1999 standard.

In [39]:
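# Filter on the window-function result in an outer query; HAVING without
# GROUP BY is interpreted differently across Spark versions.
query = """
SELECT *
FROM (
    SELECT
        Product,
        `Product line`,
        `Product type`,
        Year,
        Revenue,
        RANK() OVER(PARTITION BY Year ORDER BY Revenue DESC) AS Rank
    FROM sales
) ranked
WHERE Rank <= 3
"""
spark.sql(query).show()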

+--------------------+--------------------+------------+----+----------+----+
|             Product|        Product line|Product type|Year|   Revenue|Rank|
+--------------------+--------------------+------------+----+----------+----+
|           Star Lite|   Camping Equipment|       Tents|2013|1415141.91|   1|
|        Star Gazer 2|   Camping Equipment|       Tents|2013| 1335112.9|   2|
|        Star Gazer 2|   Camping Equipment|       Tents|2013| 1311874.3|   3|
|Hailstorm Titaniu...|      Golf Equipment|       Woods|2014|1635687.96|   1|
|        Star Gazer 2|   Camping Equipment|       Tents|2014| 1486717.1|   2|
|Hailstorm Titaniu...|      Golf Equipment|       Woods|2014| 1388659.5|   3|
|           Star Lite|   Camping Equipment|       Tents|2012|1210413.68|   1|
|                Zone|Personal Accessories|     Eyewear|2012| 1042285.0|   2|
|                Zone|Personal Accessories|     Eyewear|2012| 1009957.9|   3|
+--------------------+--------------------+------------+----+---
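
Since `RANK() OVER (...)` is standard SQL, the same kind of query can be tried without Spark. The sketch below uses SQLite through Python's built-in `sqlite3` module instead of Spark SQL, and assumes a SQLite build with window-function support (version 3.25 or newer). The table contents and the alias `rnk` are made up for illustration.

```python
import sqlite3

# Toy rows (made up for illustration): (product, year, revenue).
rows = [
    ("Tent A", 2012, 500.0),
    ("Tent B", 2012, 300.0),
    ("Stove", 2012, 300.0),
    ("Lamp", 2012, 100.0),
    ("Tent A", 2013, 700.0),
    ("Stove", 2013, 200.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, year INTEGER, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Rank inside a subquery, then filter on the rank in the outer query.
query = """
SELECT product, year, revenue, rnk
FROM (
    SELECT product, year, revenue,
           RANK() OVER (PARTITION BY year ORDER BY revenue DESC) AS rnk
    FROM sales
) AS ranked
WHERE rnk <= 2
ORDER BY year, rnk, product
"""
top2 = conn.execute(query).fetchall()
conn.close()

for row in top2:
    print(row)
```

Computing the rank in a subquery and filtering in the outer `WHERE` clause avoids relying on how a particular engine treats `HAVING` without `GROUP BY`.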

In [41]:
w = pyspark.sql.Window\
    .partitionBy('Product')\
    .orderBy(func.desc('Revenue'))

product_with_rank = sales.withColumn('rank', func.rank().over(w))\
    .select('Product', 'Product line', 'Product type', 'Year', 'Revenue', 'Rank')

product_with_rank.where('Rank <= 3').show()

+--------------------+--------------------+--------------------+----+---------+----+
|             Product|        Product line|        Product type|Year|  Revenue|Rank|
+--------------------+--------------------+--------------------+----+---------+----+
|             Fairway|Personal Accessories|             Eyewear|2013|466774.75|   1|
|             Fairway|Personal Accessories|             Eyewear|2013| 442932.9|   2|
|             Fairway|Personal Accessories|             Eyewear|2013|426678.75|   3|
|Firefly Rechargea...|Mountaineering Eq...|Climbing Accessories|2014|179899.04|   1|
|Firefly Rechargea...|Mountaineering Eq...|Climbing Accessories|2014|166948.48|   2|
|Firefly Rechargea...|Mountaineering Eq...|Climbing Accessories|2013|132374.32|   3|
|TrailChef Deluxe ...|   Camping Equipment|        Cooking Gear|2014|429703.01|   1|
|TrailChef Deluxe ...|   Camping Equipment|        Cooking Gear|2013|314852.65|   2|
|TrailChef Deluxe ...|   Camping Equipment|        Cooking Gear|2

In [45]:
w = pyspark.sql.Window\
    .partitionBy(['Product', 'Product line'])\
    .orderBy(func.desc('Revenue'))

multi_with_rank = sales.withColumn('rank', func.rank().over(w))\
    .select('Product', 'Product line', 'Product type', 'Year', 'Revenue', 'Rank')

multi_with_rank.where('Rank <= 2').show()

+-------------------+--------------------+--------------------+----+---------+----+
|            Product|        Product line|        Product type|Year|  Revenue|Rank|
+-------------------+--------------------+--------------------+----+---------+----+
|          Seeker 50|Personal Accessories|          Binoculars|2014| 153134.8|   1|
|          Seeker 50|Personal Accessories|          Binoculars|2014| 139885.2|   2|
|        Trail Scout|Personal Accessories|          Navigation|2013| 170335.0|   1|
|        Trail Scout|Personal Accessories|          Navigation|2012| 114954.0|   2|
|       Trail Master|Personal Accessories|          Navigation|2012|  91750.0|   1|
|       Trail Master|Personal Accessories|          Navigation|2012|  74095.0|   2|
|  Granite Chalk Bag|Mountaineering Eq...|Climbing Accessories|2014| 66817.08|   1|
|  Granite Chalk Bag|Mountaineering Eq...|Climbing Accessories|2014| 61670.16|   2|
|         Sun Shield|  Outdoor Protection|           Sunscreen|2012| 38131.0

In [50]:
w = pyspark.sql.Window\
    .partitionBy('Product', 'Product line')\
    .orderBy(func.desc('Revenue'))

multi_with_rank = sales.withColumn('rank', func.rank().over(w))\
    .select('Product', 'Product line', 'Product type', 'Year', 'Revenue', 'Rank')

multi_with_rank.drop('Rank').show()

+--------------------+--------------------+--------------------+----+---------+
|             Product|        Product line|        Product type|Year|  Revenue|
+--------------------+--------------------+--------------------+----+---------+
|TrailChef Deluxe ...|   Camping Equipment|        Cooking Gear|2012| 59628.66|
|TrailChef Double ...|   Camping Equipment|        Cooking Gear|2012| 35950.32|
|           Star Dome|   Camping Equipment|               Tents|2012| 89940.48|
|        Star Gazer 2|   Camping Equipment|               Tents|2012|165883.41|
|     Hibernator Lite|   Camping Equipment|       Sleeping Bags|2012| 119822.2|
|  Hibernator Extreme|   Camping Equipment|       Sleeping Bags|2012| 87728.96|
| Hibernator Camp Cot|   Camping Equipment|       Sleeping Bags|2012| 41837.46|
|        Firefly Lite|   Camping Equipment|            Lanterns|2012|  8268.41|
|     Firefly Extreme|   Camping Equipment|            Lanterns|2012|   9393.3|
|     EverGlow Single|   Camping Equipme

In [57]:
w = pyspark.sql.Window.partitionBy('Year')

sales.withColumn('Avg Rev by Year', func.avg('Revenue').over(w)) \
    .select('Year', 'Avg Rev by Year', 'Revenue', (func.col('Revenue') - func.col('Avg Rev by Year')).alias('Diff')) \
    .show()

+----+-----------------+---------+-------------------+
|Year|  Avg Rev by Year|  Revenue|               Diff|
+----+-----------------+---------+-------------------+
|2013|45298.46170547772| 19418.52| -25879.94170547772|
|2013|45298.46170547772| 42304.32|-2994.1417054777194|
|2013|45298.46170547772| 52266.32|  6967.858294522281|
|2013|45298.46170547772|  5211.64| -40086.82170547772|
|2013|45298.46170547772| 51714.46|   6415.99829452228|
|2013|45298.46170547772|112037.76|  66739.29829452228|
|2013|45298.46170547772| 22786.56|-22511.901705477718|
|2013|45298.46170547772| 12559.96| -32738.50170547772|
|2013|45298.46170547772| 13101.88| -32196.58170547772|
|2013|45298.46170547772|  3494.05|-41804.411705477716|
|2013|45298.46170547772| 12967.05| -32331.41170547772|
|2013|45298.46170547772|  32832.0|-12466.461705477719|
|2013|45298.46170547772| 57900.38| 12601.918294522278|
|2013|45298.46170547772|  43093.9|-2204.5617054777176|
|2013|45298.46170547772|  22847.5| -22450.96170547772|
|2013|4529

The `dense_rank` function gives a similar result; unlike `rank`, it leaves no gaps in the ranking after ties.

In [60]:
w = pyspark.sql.Window\
    .partitionBy('Product type')\
    .orderBy('Year')

with_dense_rank = sales.withColumn('Dense rank', func.dense_rank().over(w))\
    .select('Product', 'Product line', 'Product type', 'Year', 'Revenue', 'Dense rank')

with_dense_rank.show()

+------------+--------------------+------------+----+--------+----------+
|     Product|        Product line|Product type|Year| Revenue|Dense rank|
+------------+--------------------+------------+----+--------+----------+
|   Polar Sun|Personal Accessories|     Eyewear|2012| 7015.34|         1|
|   Polar Ice|Personal Accessories|     Eyewear|2012|  3825.8|         1|
|       Capri|Personal Accessories|     Eyewear|2012| 10838.9|         1|
|     Cat Eye|Personal Accessories|     Eyewear|2012| 4428.85|         1|
|       Dante|Personal Accessories|     Eyewear|2012| 9759.75|         1|
|     Fairway|Personal Accessories|     Eyewear|2012| 8241.35|         1|
|     Inferno|Personal Accessories|     Eyewear|2012| 12935.0|         1|
|     Maximus|Personal Accessories|     Eyewear|2012|  9325.0|         1|
|      Trendi|Personal Accessories|     Eyewear|2012|  9104.3|         1|
|        Zone|Personal Accessories|     Eyewear|2012|  4574.3|         1|
|   Polar Sun|Personal Accessories|   

In [61]:
with_dense_rank.groupBy('year')\
    .agg(func.first('Dense rank'))\
    .orderBy('Year')\
    .show()

+----+------------------------+
|year|first(Dense rank, false)|
+----+------------------------+
|2012|                       1|
|2013|                       2|
|2014|                       3|
+----+------------------------+



### Exercise

1. See what happens when you replace `dense_rank` with `rank`.

If you only need consecutive row numbers rather than ranks, you can use the `row_number` function.

In [62]:
w = pyspark.sql.Window\
    .partitionBy('Product type')\
    .orderBy('Year')

sales.withColumn('Row', func.row_number().over(w))\
    .select('Product', 'Product line', 'Product type', 'Year', 'Revenue', 'Row')\
    .show()

+------------+--------------------+------------+----+--------+---+
|     Product|        Product line|Product type|Year| Revenue|Row|
+------------+--------------------+------------+----+--------+---+
|   Polar Sun|Personal Accessories|     Eyewear|2012| 7015.34|  1|
|   Polar Ice|Personal Accessories|     Eyewear|2012|  3825.8|  2|
|       Capri|Personal Accessories|     Eyewear|2012| 10838.9|  3|
|     Cat Eye|Personal Accessories|     Eyewear|2012| 4428.85|  4|
|       Dante|Personal Accessories|     Eyewear|2012| 9759.75|  5|
|     Fairway|Personal Accessories|     Eyewear|2012| 8241.35|  6|
|     Inferno|Personal Accessories|     Eyewear|2012| 12935.0|  7|
|     Maximus|Personal Accessories|     Eyewear|2012|  9325.0|  8|
|      Trendi|Personal Accessories|     Eyewear|2012|  9104.3|  9|
|        Zone|Personal Accessories|     Eyewear|2012|  4574.3| 10|
|   Polar Sun|Personal Accessories|     Eyewear|2012| 25787.7| 11|
|   Polar Ice|Personal Accessories|     Eyewear|2012| 20835.1|
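
The difference between `row_number`, `rank`, and `dense_rank` is easiest to see side by side on values that contain ties. The following plain-Python sketch emulates the three numbering schemes on an already sorted list; the helper `number_rows` and the sample list are hypothetical and only illustrate the semantics, they are not PySpark code.

```python
def number_rows(ordered_values):
    """Return (value, row_number, rank, dense_rank) for pre-sorted values."""
    out = []
    rank = 0
    dense = 0
    previous = object()  # sentinel that is not equal to any real value
    for row_number, value in enumerate(ordered_values, start=1):
        if value != previous:
            rank = row_number  # rank jumps to the current position after a tie
            dense += 1         # dense_rank increases by one, leaving no gaps
            previous = value
        out.append((value, row_number, rank, dense))
    return out


# Sample ordering column with ties, e.g. years within one partition.
years = [2012, 2012, 2012, 2013, 2013, 2014]
numbered = number_rows(years)

for value, row_number, rank, dense_rank in numbered:
    print(value, row_number, rank, dense_rank)
```

With a window ordered by `Year`, every row of the same year is a tie, which is why `dense_rank` maps the three years to 1, 2, 3 while `row_number` keeps counting within the partition.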

Window functions can also be numeric aggregates. For example, we can compute the average revenue for each product type and show it next to the rank.

In [63]:
w = pyspark.sql.Window\
    .partitionBy('Product type')\
    .orderBy('Revenue')

w2 = pyspark.sql.Window\
    .partitionBy('Product type')
    
sales.withColumn('Rank', func.rank().over(w))\
    .withColumn('Rev avg', func.avg('Revenue').over(w2))\
    .select('Product', 'Product line', 'Product type', 'Year', 'Revenue', 'Rev avg', 'Rank')\
    .where('Rank <= 3')\
    .show()

+--------------------+--------------------+--------------------+----+-------+------------------+----+
|             Product|        Product line|        Product type|Year|Revenue|           Rev avg|Rank|
+--------------------+--------------------+--------------------+----+-------+------------------+----+
|             Fairway|Personal Accessories|             Eyewear|2014| 233.75| 46668.50330365328|   1|
|             Maximus|Personal Accessories|             Eyewear|2012|  240.0| 46668.50330365328|   2|
|               Retro|Personal Accessories|             Eyewear|2014|  250.6| 46668.50330365328|   3|
|         Trail Scout|Personal Accessories|          Navigation|2012|  238.0| 32606.59504915291|   1|
|         Trail Scout|Personal Accessories|          Navigation|2012|  238.0| 32606.59504915291|   1|
|           Sky Pilot|Personal Accessories|          Navigation|2014|  358.0| 32606.59504915291|   3|
|           Sky Pilot|Personal Accessories|          Navigation|2014|  358.0| 3260
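
An aggregate applied over an unordered window, such as `avg` over a window that only has `partitionBy`, attaches the partition-wide value to every row instead of collapsing the rows as `groupBy` does. Here is a rough plain-Python sketch of that behaviour; the rows and the helper `with_partition_avg` are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Toy rows (made up for illustration): (year, revenue).
toy = [
    (2012, 100.0),
    (2012, 300.0),
    (2013, 100.0),
    (2013, 200.0),
    (2013, 600.0),
]


def with_partition_avg(rows):
    """Emulate AVG(revenue) OVER (PARTITION BY year) plus a difference column."""
    by_year = defaultdict(list)
    for year, revenue in rows:
        by_year[year].append(revenue)
    averages = {year: mean(values) for year, values in by_year.items()}
    # Every input row is kept; the partition average is repeated on each one.
    return [
        (year, revenue, averages[year], revenue - averages[year])
        for year, revenue in rows
    ]


result = with_partition_avg(toy)
for row in result:
    print(row)
```

The number of output rows equals the number of input rows, which is what makes it possible to compare each row with its group average in a single pass.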

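You can also define sliding windows. For example, we can find the maximum value in a sliding window of size 3 that spans one row before and one row after the current row. Note that the window below has no `orderBy`, so the order of rows within each partition is not guaranteed.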

In [64]:
w = pyspark.sql.Window\
    .partitionBy('Product')\
    .rowsBetween(-1, 1)

sales.select('Product', 
             'Revenue', 
             func.max('Revenue').over(w).alias('Max rev window'))\
    .show()

+-------+---------+--------------+
|Product|  Revenue|Max rev window|
+-------+---------+--------------+
|Fairway|  8241.35|      36209.55|
|Fairway| 36209.55|      36209.55|
|Fairway|  11646.7|      36209.55|
|Fairway| 29237.65|       42919.5|
|Fairway|  42919.5|     130874.25|
|Fairway|130874.25|     130874.25|
|Fairway| 47251.75|     209136.85|
|Fairway|209136.85|     209136.85|
|Fairway| 11545.95|     209136.85|
|Fairway| 21903.05|      21903.05|
|Fairway|  13339.3|       60853.0|
|Fairway|  60853.0|       60853.0|
|Fairway| 27907.75|       60853.0|
|Fairway| 12916.15|      27907.75|
|Fairway|  6186.05|      15092.35|
|Fairway| 15092.35|      55795.35|
|Fairway| 55795.35|      55795.35|
|Fairway|   8019.7|      99641.75|
|Fairway| 99641.75|      99641.75|
|Fairway|   8100.3|      99641.75|
+-------+---------+--------------+
only showing top 20 rows
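
The frame `rowsBetween(-1, 1)` can be read as "the previous row, the current row, and the next row", clipped at the partition boundaries. The sketch below emulates that in plain Python for a single partition; `sliding_max` is a hypothetical helper, and the input values are the first six `Fairway` revenues from the output above.

```python
def sliding_max(values, before=1, after=1):
    """Emulate MAX(x) OVER (ROWS BETWEEN before PRECEDING AND after FOLLOWING)."""
    result = []
    for i in range(len(values)):
        start = max(0, i - before)               # clip at the partition start
        stop = min(len(values), i + after + 1)   # clip at the partition end
        result.append(max(values[start:stop]))
    return result


# First six Fairway revenues, in the order shown in the output above.
revenues = [8241.35, 36209.55, 11646.7, 29237.65, 42919.5, 130874.25]
window_max = sliding_max(revenues)

for revenue, maximum in zip(revenues, window_max):
    print(revenue, maximum)
```

The first and last rows see only two values because the frame is cut off at the edge of the partition.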



This can also be written in SQL notation.

In [None]:
query = """
SELECT
    Product,
    Revenue,
    MAX(Revenue) OVER(
        PARTITION BY Product
        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
    ) AS `Max rev window`
FROM sales
"""
spark.sql(query).show()

### Exercise

1. Show the 5 best-selling products for each retailer type (`Retailer type`).
1. Compute how much less each product earned than the maximum revenue in its product line in the given year.
1. ★ What is the difference in revenue between the average for a given year and the averages for its quarters?
1. ★ Compute a moving average over a window of size 5, ordered by revenue.

In [74]:
w = pyspark.sql.Window\
    .partitionBy('Retailer type') \
    .orderBy(func.desc('Revenue'), func.desc('Quantity'))
    
quantity_with_rank = sales.withColumn('Rank', func.rank().over(w))\
    .select('Product', 'Product line', 'Product type', 'Year', 'Retailer type', 'Revenue', 'Quantity', 'Rank')

quantity_with_rank.where('`Rank` <= 5').show()

+--------------------+--------------------+-------------+----+----------------+----------+--------+----+
|             Product|        Product line| Product type|Year|   Retailer type|   Revenue|Quantity|Rank|
+--------------------+--------------------+-------------+----+----------------+----------+--------+----+
|        Star Gazer 2|   Camping Equipment|        Tents|2013| Warehouse Store| 1335112.9|    2413|   1|
|           Star Lite|   Camping Equipment|        Tents|2012| Warehouse Store|1210413.68|    3479|   2|
|           Star Lite|   Camping Equipment|        Tents|2013| Warehouse Store|1052750.28|    2994|   3|
|        Star Gazer 2|   Camping Equipment|        Tents|2012| Warehouse Store| 944385.75|    1725|   4|
|           Star Lite|   Camping Equipment|        Tents|2014| Warehouse Store| 850221.71|    2353|   5|
|Canyon Mule Journ...|   Camping Equipment|        Packs|2013|Department Store| 726873.21|    2074|   1|
|Canyon Mule Journ...|   Camping Equipment|        Pack