# Processing mixed beverage data
This Jupyter Notebook uses curl to download [Mixed Beverage Gross Receipts](https://comptroller.texas.gov/taxes/mixed-beverage/receipts.php) files from the Texas Comptroller's [data center](https://comptroller.texas.gov/transparency/open-data/search-datasets/), and then uses a Python library called [agate](http://agate.readthedocs.io/) to clean and process that data for [stories similar to this one](http://www.mystatesman.com/business/austin-alcohol-sales-percent-february/Oo2txZUkuDlqBl0rU9O1lJ/) on monthly alcohol sales.

This first part uses bash (the command-line shell on macOS) and curl to download the file we need.

- [Top sales statewide](#Top-sales-statewide)
- [Austin sales](#Austin-sales-and-sums)
- [Central Texas cities](#More-Central-Texas-cities)



### Downloading the file

In [1]:
%%bash
## downloads the mixedbev file
## You have to set this URL based on the data center
## curl -O https://comptroller.texas.gov/auto-data/odc/MIXEDBEV_04_2017.CSV


There is supposedly a way to call a file from a [remote url](http://agate-remote.readthedocs.io/en/0.2.0/) into agate, but I use bash above to curl the file and store it locally instead.

Next, we'll use a bash command to peek at the data, which we know is a mess:

In [2]:
%%bash
head -n 5 MIXEDBEV_04_2017.CSV

"MB821424    ","ABI-HAUS                      ","959 N 2ND ST                  ","ABILENE             ","TX","79601","221","          ","2017/02", 000000632.34
"MB638028    ","ABILENE BEEHIVE INC           ","442 CEDAR ST STE A            ","ABILENE             ","TX","79601","221","          ","2017/03", 000002472.16
"MB543114    ","ABILENE BOWLING LANES INC     ","279 RUIDOSA AVE               ","ABILENE             ","TX","79605","221","          ","2017/03", 000000295.67
"MB933130    ","ABILENE CABARET LLC           ","1918 BUTTERNUT ST             ","ABILENE             ","TX","79602","221","          ","2017/03", 000000806.34
"N 037863    ","ABILENE COUNTRY CLUB          ","4039 S TREADAWAY BLVD         ","ABILENE             ","TX","79602","221","          ","2017/03", 000001583.54


Now that we have our file and know what it looks like, we'll use Python and the agate library to clean and analyze it. You'll need to make sure that you have agate installed, preferably in a virtual environment like Conda, as described in the [ReadMe](README.md).

In [3]:
# imports the libraries we will use
import agate
from decimal import Decimal
import re



In [4]:
# this suppresses the timezone warning
# Might comment out during development so other warnings
# are not suppressed
import warnings
warnings.filterwarnings('ignore')

### Study variables

This is where you set which file you are working with, and which month you want to study, etc.

First, we'll list the files in our directory that we have downloaded so far so we can get the filename:

In [5]:
ls

MIXEDBEV_02_2017.CSV             README.md
MIXEDBEV_03_2017.CSV             counties.csv
MIXEDBEV_04_2015.CSV             data-raw/
MIXEDBEV_04_2016.CSV             headers.txt
MIXEDBEV_04_2017.CSV             mixbev-env.txt
Mixed beverages agate-new.ipynb  mixbev-pip.txt
Mixed beverages agate.ipynb


Then we set some values based on those.

- The **`file`** is the name of the file we want to process
- The **`tax_rate`** is the value we need for this file to get the Gross Receipts (vs the Tax Reported, which is just the tax amount the establishment paid). The comptroller [has information on the tax](https://comptroller.texas.gov/taxes/mixed-beverage/receipts.php), but this [old record layout](https://github.com/utdata/cli-tools/blob/master/data/mixbevtax/OLD-MIXEDBEVTAX-LAYOUT.txt) best describes the math.
- The **`month_studied`** is the YYYY/MM designation for the month before the file release. The file released in February has mostly records from January, but can also have any other month, so we set here the specific month we want. Note there is a check later on that counts the number of records by month, which is worth reviewing.
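
As a quick sanity check on the math, the conversion can be sketched with Python's `decimal` module. The helper name `gross_receipts` is made up for this illustration and is not used elsewhere in the notebook. The two sample tax values are taken from the Tax/Receipts peek printed further down.

```python
from decimal import Decimal

# Percentage rate used to back out gross receipts from the tax paid
tax_rate = Decimal('6.7')

def gross_receipts(report_tax, rate=tax_rate):
    """Hypothetical helper: convert tax paid into gross receipts, rounded to cents."""
    receipts = (Decimal(report_tax) / rate) * 100
    return receipts.quantize(Decimal('0.01'))

print(gross_receipts('2389.62'))  # 35665.97
print(gross_receipts('2093.75'))  # 31250.00
```

Passing the tax as a string keeps the value exact; building a `Decimal` from a float would carry over binary rounding noise.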

In [6]:
# this is our source file, which may have been downloaded above
file = 'MIXEDBEV_04_2016.CSV'

# Sets the tax rate to convert Report Tax to Gross Receipts
# It's 6.7 since January 1, 2014
tax_rate = Decimal('6.7')

# setting the month_studied var.
# This should be checked in the table below that counts records by month
month_studied = '2016/03'


### Import the file
There are a couple of things we have to set to import a file. Remember we had to do the same when we did this manually in Tableau.
- Set the column header names
- Set the ZIP and County codes as text, so we preserve leading zeros like '001'.
- Set the encoding type of the file. (Tableau inferred it, but we have to specify it here.) Common types are 'utf-8' and 'iso-8859-1' (also known as 'latin1'). Just try them until one works.
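
One way to take some of the guesswork out of the encoding is to try a few candidates in order and keep the first one that decodes cleanly. This is only a sketch: `guess_encoding` is a hypothetical helper, not part of agate, and the sample trade name is invented. Because 'iso-8859-1' accepts any byte sequence, it belongs last as the fallback.

```python
def guess_encoding(raw_bytes, candidates=('utf-8', 'iso-8859-1')):
    """Hypothetical helper: return the first candidate that decodes the bytes."""
    for encoding in candidates:
        try:
            raw_bytes.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    raise ValueError('none of the candidate encodings worked')

# A made-up trade name with an accented character, encoded two ways
sample_utf8 = 'JALAPEÑO GRILL'.encode('utf-8')
sample_latin = 'JALAPEÑO GRILL'.encode('iso-8859-1')

print(guess_encoding(sample_utf8))   # utf-8
print(guess_encoding(sample_latin))  # iso-8859-1
```

With the real file, the bytes could come from `open(file, 'rb').read()`, and the returned name could be passed as the `encoding` argument when importing.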

In [7]:
# sets the column names of the original data set.
column_names = [
    'TABC Permit Number',
    'Trade Name',
    'Location Address',
    'Location City',
    'Location State',
    'Location Zip Code',
    'Location County Code',
    'Blank',
    'Report Period',
    'Report Tax'
]
# Helps us import some text fields that may be considered numbers in error.
specified_types = {
    'Location Zip Code': agate.Text(),
    'Location County Code': agate.Text()
}

# this imports the file specified above, along with the proper types
mixbev_raw = agate.Table.from_csv(file, column_names, encoding='iso-8859-1', column_types=specified_types)

# prints table fields so we can check those data types
print(mixbev_raw)

| column               | data_type |
| -------------------- | --------- |
| TABC Permit Number   | Text      |
| Trade Name           | Text      |
| Location Address     | Text      |
| Location City        | Text      |
| Location State       | Text      |
| Location Zip Code    | Text      |
| Location County Code | Text      |
| Blank                | Boolean   |
| Report Period        | Text      |
| Report Tax           | Number    |



### Clean up text fields and compute gross receipts

In [8]:
# mixbev_trim creates a new interim table with results of compute function
# that takes the four columns that need trimming and strips them of white space,
# adding them to the end of the table with new names.
# The last computation does the math to create the Gross Receipts based on the tax_rate set above

mixbev_trim = mixbev_raw.compute([
    ('Permit', agate.Formula(agate.Text(), lambda r: r['TABC Permit Number'].strip())),
    ('Name', agate.Formula(agate.Text(), lambda r: r['Trade Name'].strip())),
    ('Address', agate.Formula(agate.Text(), lambda r: r['Location Address'].strip())),
    ('City', agate.Formula(agate.Text(), lambda r: r['Location City'].strip())),
    ('Receipts_compute', agate.Formula(agate.Number(), lambda r: (r['Report Tax'] / tax_rate) * 100))
])

# the Receipts_compute computation above returns as a decimal number,
# so this function rounds those numbers.
# I might refactor this later so I can use it elsewhere.
def round_receipt(row):
    return row['Receipts_compute'].quantize(Decimal('0.01'))

# This compute method uses the round_receipt function above,
# putting the results into a new table.
mixbev_round = mixbev_trim.compute([
    ('Receipts', agate.Formula(agate.Number(), round_receipt))
])

## shows the new columns added to the interim table
print(mixbev_round)

| column               | data_type |
| -------------------- | --------- |
| TABC Permit Number   | Text      |
| Trade Name           | Text      |
| Location Address     | Text      |
| Location City        | Text      |
| Location State       | Text      |
| Location Zip Code    | Text      |
| Location County Code | Text      |
| Blank                | Boolean   |
| Report Period        | Text      |
| Report Tax           | Number    |
| Permit               | Text      |
| Name                 | Text      |
| Address              | Text      |
| City                 | Text      |
| Receipts_compute     | Number    |
| Receipts             | Number    |



In [9]:
# creates new table, selecting just the columns we need
# then renames some of them for ease later.
mixbev_cleaned = mixbev_round.select([
    'Permit',
    'Name',
    'Address',
    'City',
    'Location State',
    'Location Zip Code',
    'Location County Code',
    'Report Period',
    'Report Tax',
    'Receipts'
]).rename(column_names = {
    'Location State': 'State',
    'Location Zip Code': 'Zip',
    'Location County Code': 'CountyCode',
    'Report Period': 'Period',
    'Report Tax': 'Tax'
})

## these are now the columns present in our new, cleaned table
print(mixbev_cleaned)

| column     | data_type |
| ---------- | --------- |
| Permit     | Text      |
| Name       | Text      |
| Address    | Text      |
| City       | Text      |
| State      | Text      |
| Zip        | Text      |
| CountyCode | Text      |
| Period     | Text      |
| Tax        | Number    |
| Receipts   | Number    |



In [10]:
# and this peeks at a couple of columns of the data (Tax and Receipts)
# to make sure they make sense and the math is right
# During development, I did send this to_csv and made sure columns were trimmed, etc
mixbev_cleaned.select(['Tax','Receipts']).limit(5).print_table()

|      Tax |  Receipts |
| -------- | --------- |
| 2,389.62 | 35,665.97 |
|   297.07 |  4,433.88 |
| 1,533.02 | 22,880.90 |
| 2,093.75 | 31,250.00 |
| 2,615.07 | 39,030.90 |


### Create establishment column

We do this to make sure each group is a single establishment, instead of lumping together trade names that appear at several different addresses, like 'CHILI'S GRILL & BAR'.
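
To see why the combined key matters, here is a small plain-Python sketch (not agate) with invented rows: two locations share a trade name, so grouping on the name alone would merge them. The addresses are made up for illustration.

```python
# Made-up rows: the same trade name at two different addresses
rows = [
    {'Name': "CHILI'S GRILL & BAR", 'Address': '100 MAIN ST'},
    {'Name': "CHILI'S GRILL & BAR", 'Address': '200 OAK AVE'},
]

# Grouping on Name alone collapses both locations into one key
by_name = {row['Name'] for row in rows}

# Name plus Address keeps the two locations apart
by_establishment = {'%(Name)s %(Address)s' % row for row in rows}

print(len(by_name))           # 1
print(len(by_establishment))  # 2
```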

In [11]:
# Concatenates the name and address
mixbev_cleaned_est = mixbev_cleaned.compute([
    ('Establishment', agate.Formula(agate.Text(), lambda row: '%(Name)s %(Address)s' % row))
])

# Prints columns so you see it is there
print(mixbev_cleaned_est)

| column        | data_type |
| ------------- | --------- |
| Permit        | Text      |
| Name          | Text      |
| Address       | Text      |
| City          | Text      |
| State         | Text      |
| Zip           | Text      |
| CountyCode    | Text      |
| Period        | Text      |
| Tax           | Number    |
| Receipts      | Number    |
| Establishment | Text      |



In [12]:
# selects and prints Establishment to check what it looks like
mixbev_establishment = mixbev_cleaned_est.select('Establishment')
mixbev_establishment.limit(5).print_table(max_column_width=80)

| Establishment                              |
| ------------------------------------------ |
| ABILENE BEEHIVE INC 442 CEDAR ST STE A     |
| ABILENE BOWLING LANES INC 279 RUIDOSA AVE  |
| ABILENE CABARET LLC 1918 BUTTERNUT ST      |
| ABILENE COUNTRY CLUB 4039 S TREADAWAY BLVD |
| ABILENE SEAFOOD TAVERN 1882 S CLACK ST     |


### Import and merge counties lookup table
We do this to get county names. I got this list from the comptroller.
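
The join is effectively a lookup of each record's county code in the counties list. A rough plain-Python equivalent (a sketch of the idea, not agate's implementation) shows why the codes have to stay as text: a code parsed as the number 1 would never match the key '001'. The establishment rows are invented.

```python
# Lookup keyed by the three-character county code, kept as text
county_by_code = {'001': 'Anderson', '002': 'Andrews'}

# Made-up establishment rows carrying only the county code
records = [
    {'Name': 'EXAMPLE BAR', 'CountyCode': '001'},
    {'Name': 'SAMPLE GRILL', 'CountyCode': '002'},
]

# Every record is kept; County is None when no code matches
joined = [dict(row, County=county_by_code.get(row['CountyCode'])) for row in records]

for row in joined:
    print(row['Name'], row['County'])

# A code parsed as a number loses its leading zeros and finds no match
print(county_by_code.get(str(int('001'))))  # None
```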

In [13]:
# importing counties.csv, ensuring that the 'code' column is text
counties = agate.Table.from_csv('counties.csv', column_types={'code': agate.Text()})

# peek at the column names
print(counties)

| column | data_type |
| ------ | --------- |
| id     | Number    |
| county | Text      |
| code   | Text      |



In [14]:
# peek at the data
counties.limit(5).print_table()

| id | county   | code |
| -- | -------- | ---- |
|  1 | Anderson | 001  |
|  2 | Andrews  | 002  |
|  3 | Angelina | 003  |
|  4 | Aransas  | 004  |
|  5 | Archer   | 005  |


In [15]:
# joins the counties table to the mixed bev cleaned data with establishments
mixbev_joined = mixbev_cleaned_est.join(counties, 'CountyCode', 'code')

# check that the merge was successful
print(mixbev_joined)

| column        | data_type |
| ------------- | --------- |
| Permit        | Text      |
| Name          | Text      |
| Address       | Text      |
| City          | Text      |
| State         | Text      |
| Zip           | Text      |
| CountyCode    | Text      |
| Period        | Text      |
| Tax           | Number    |
| Receipts      | Number    |
| Establishment | Text      |
| id            | Number    |
| county        | Text      |



In [16]:
# get just the columns we need and rename county
# THIS is the finished, cleaned mixbev table
mixbev = mixbev_joined.select([
    'Permit',
    'Name',
    'Address',
    'Establishment',
    'City',
    'State',
    'Zip',
    'county',
    'Period',
    'Tax',
    'Receipts'
]).rename(column_names = {
    'county': 'County'
})

# peek at the column names
print(mixbev)

| column        | data_type |
| ------------- | --------- |
| Permit        | Text      |
| Name          | Text      |
| Address       | Text      |
| Establishment | Text      |
| City          | Text      |
| State         | Text      |
| Zip           | Text      |
| County        | Text      |
| Period        | Text      |
| Tax           | Number    |
| Receipts      | Number    |



In [17]:
# peek at the table
mixbev.limit(5).print_table()

| Permit   | Name                 | Address              | Establishment        | City    | State | ... |
| -------- | -------------------- | -------------------- | -------------------- | ------- | ----- | --- |
| MB638028 | ABILENE BEEHIVE INC  | 442 CEDAR ST STE A   | ABILENE BEEHIVE I... | ABILENE | TX    | ... |
| MB543114 | ABILENE BOWLING L... | 279 RUIDOSA AVE      | ABILENE BOWLING L... | ABILENE | TX    | ... |
| MB933130 | ABILENE CABARET LLC  | 1918 BUTTERNUT ST    | ABILENE CABARET L... | ABILENE | TX    | ... |
| N 037863 | ABILENE COUNTRY CLUB | 4039 S TREADAWAY ... | ABILENE COUNTRY C... | ABILENE | TX    | ... |
| MB200506 | ABILENE SEAFOOD T... | 1882 S CLACK ST      | ABILENE SEAFOOD T... | ABILENE | TX    | ... |


### Looking at dates of the records

Here we are looking at the entire mixbev table to see what range of dates we have. This way we can make sure we are analyzing the correct month based on this data. (More than one month can be present, but it will be predominantly the previous month.)

To explain what we are doing here, as it is not very intuitive in agate:
- Use group_by to create a TableSet grouped by the Period field.
- Aggregate that TableSet into a table that counts the number of records for each Period.
- Sort that table by count in descending order to put the dominant month at the top.
- Print the full sorted table.
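
If the agate steps feel opaque, the same count-and-sort can be sketched in plain Python with `collections.Counter`. This illustrates the idea only; it is not how agate works internally, and the period values are made up.

```python
from collections import Counter

# Made-up Period values, one per record
periods = ['2016/03', '2016/03', '2016/02', '2016/03', '2016/01', '2016/02']

# Count the records in each period (the group-and-count step)
period_counts = Counter(periods)

# most_common() sorts by count, largest first (the sort step)
for period, count in period_counts.most_common():
    print(period, count)
```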

In [18]:
# this is the group_by
by_period = mixbev.group_by('Period')

# Then aggregate that group by count of records in Period
period_totals = by_period.aggregate([
    ('count', agate.Count())
])

# Take those results and sort them
period_totals_sorted = period_totals.order_by('count', reverse=True)

# prints the table of period and number of records
period_totals_sorted.print_table(max_rows=None)


| Period  |  count |
| ------- | ------ |
| 2016/03 | 13,786 |
| 2016/02 |  1,394 |
| 2016/01 |    160 |
| 2016/04 |     50 |
| 2015/12 |     45 |
| 2015/11 |     26 |
| 2015/10 |     20 |
| 2015/09 |     12 |
| 2015/08 |     10 |
| 2015/04 |     10 |
| 2015/03 |      9 |
| 2014/12 |      8 |
| 2015/02 |      8 |
| 2015/07 |      8 |
| 2015/06 |      7 |
| 2014/11 |      7 |
| 2015/01 |      7 |
| 2014/01 |      6 |
| 2014/02 |      6 |
| 2014/03 |      6 |
| 2014/04 |      6 |
| 2014/05 |      6 |
| 2014/06 |      6 |
| 2014/10 |      6 |
| 2015/05 |      6 |
| 2014/07 |      5 |
| 2014/08 |      5 |
| 2014/09 |      5 |
| 2013/11 |      4 |
| 2012/09 |      3 |
| 2013/12 |      3 |
| 2012/03 |      2 |
| 2012/04 |      2 |
| 2012/05 |      2 |
| 2012/06 |      2 |
| 2012/07 |      2 |
| 2012/08 |      2 |
| 2012/10 |      2 |
| 2012/11 |      2 |
| 2012/12 |      2 |
| 2013/01 |      2 |
| 2013/02 |      2 |
| 2013/03 |      2 |
| 2013/04 |      2 |
| 2013/05 |      2 |
| 2013/06 |  

In [19]:
# Pivot the mixbev table by Period. By default it gives a Count of the records.
# We then order the table by Count in descending order
by_period = mixbev.pivot('Period').order_by('Count', reverse=True)

# prints the table of period and number of records
by_period.print_table(max_rows=None)

| Period  |  Count |
| ------- | ------ |
| 2016/03 | 13,786 |
| 2016/02 |  1,394 |
| 2016/01 |    160 |
| 2016/04 |     50 |
| 2015/12 |     45 |
| 2015/11 |     26 |
| 2015/10 |     20 |
| 2015/09 |     12 |
| 2015/08 |     10 |
| 2015/04 |     10 |
| 2015/03 |      9 |
| 2014/12 |      8 |
| 2015/02 |      8 |
| 2015/07 |      8 |
| 2015/06 |      7 |
| 2014/11 |      7 |
| 2015/01 |      7 |
| 2014/01 |      6 |
| 2014/02 |      6 |
| 2014/03 |      6 |
| 2014/04 |      6 |
| 2014/05 |      6 |
| 2014/06 |      6 |
| 2014/10 |      6 |
| 2015/05 |      6 |
| 2014/07 |      5 |
| 2014/08 |      5 |
| 2014/09 |      5 |
| 2013/11 |      4 |
| 2012/09 |      3 |
| 2013/12 |      3 |
| 2012/03 |      2 |
| 2012/04 |      2 |
| 2012/05 |      2 |
| 2012/06 |      2 |
| 2012/07 |      2 |
| 2012/08 |      2 |
| 2012/10 |      2 |
| 2012/11 |      2 |
| 2012/12 |      2 |
| 2013/01 |      2 |
| 2013/02 |      2 |
| 2013/03 |      2 |
| 2013/04 |      2 |
| 2013/05 |      2 |
| 2013/06 |  

The top value in the table above is typically the month before the reporting date. This also shows how many records are filed for OTHER months. We want to make sure the top month value is the one set as the **month_studied** variable near the top of this notebook.

So, now we can filter the data to our specific month, which we will use for the rest of the analysis:

In [20]:
## filters the records to our month_studied
mixbev_month = mixbev.where(lambda row: row['Period'] == month_studied)

## The number of records in our month
len(mixbev_month)

13786

## Top sales statewide

Because we want to group our results by more than one field and perform more than one aggregation, we'll do this a little differently. We'll use group_by to create a grouped table, then perform aggregations on that new table to compute the Tax and Receipts sums.
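
Conceptually, the grouped aggregation builds one bucket per unique combination of the grouping fields and keeps a running sum for each column. Here is a plain-Python sketch of that idea, using invented rows and `collections.defaultdict` rather than agate:

```python
from collections import defaultdict
from decimal import Decimal

# Made-up rows; the first two share the same Establishment, County and City
rows = [
    {'Establishment': 'EXAMPLE BAR 1 MAIN ST', 'County': 'Travis', 'City': 'AUSTIN',
     'Tax': Decimal('67.00'), 'Receipts': Decimal('1000.00')},
    {'Establishment': 'EXAMPLE BAR 1 MAIN ST', 'County': 'Travis', 'City': 'AUSTIN',
     'Tax': Decimal('33.50'), 'Receipts': Decimal('500.00')},
    {'Establishment': 'SAMPLE GRILL 2 OAK AVE', 'County': 'Hays', 'City': 'BUDA',
     'Tax': Decimal('13.40'), 'Receipts': Decimal('200.00')},
]

# One running total per (Establishment, County, City) combination
totals = defaultdict(lambda: {'Tax_sum': Decimal('0'), 'Sales_sum': Decimal('0')})
for row in rows:
    key = (row['Establishment'], row['County'], row['City'])
    totals[key]['Tax_sum'] += row['Tax']
    totals[key]['Sales_sum'] += row['Receipts']

# Sort by sales, largest first
ranked = sorted(totals.items(), key=lambda item: item[1]['Sales_sum'], reverse=True)
for key, sums in ranked:
    print(key, sums['Tax_sum'], sums['Sales_sum'])
```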

In [21]:
# groups the data based on Establishment, County and City
mixbev_grouped = mixbev_month.group_by('Establishment').group_by('County').group_by('City')

# computes the sales based on the grouping
state_summary = mixbev_grouped.aggregate([
    ('Tax_sum', agate.Sum('Tax')),
    ('Sales_sum', agate.Sum('Receipts'))
])

# sorts the results by most sold. We could probably chain it above if we wanted to.
state_summary_sorted = state_summary.order_by('Sales_sum', reverse=True)

# prints the top 10 results
state_summary_sorted.limit(10).print_table(max_column_width=40)

| Establishment                            | County  | City               |    Tax_sum |    Sales_sum |
| ---------------------------------------- | ------- | ------------------ | ---------- | ------------ |
| THREE NRG PARK 2000 SOUTH LOOP W         | Harris  | HOUSTON            | 233,266.86 | 3,481,594.93 |
| ARAMARK SPORTS AND ENTERTAINME 211 AT... | Bexar   | SAN ANTONIO        | 133,631.36 | 1,994,497.91 |
| GAYLORD TEXAN 1501 GAYLORD TRL           | Tarrant | GRAPEVINE          | 107,012.19 | 1,597,196.87 |
| HOSPITALITY INTERNATIONAL, INC 23808 ... | Bexar   | SAN ANTONIO        |  79,657.37 | 1,188,915.97 |
| LEVY RESTAURANTS 2500 VICTORY AVE        | Dallas  | DALLAS             |  73,245.27 | 1,093,212.99 |
| WLS BEVERAGE CO 110 E 2ND ST             | Travis  | AUSTIN             |  61,818.82 |   922,668.96 |
| SALC, INC. 2201 N STEMMONS FWY FL 1      | Dallas  | DALLAS             |  61,641.40 |   920,020.90 |
| HAPPIEST HOUR, LLC 2616 OLIVE ST         | Dallas  | DALLAS   

## Overall statewide sum

In [22]:
# summing sales statewide for month
mixbev_month.aggregate(agate.Sum('Receipts'))

Decimal('546361388.25')

## Location sums function

Because we want to get the top sellers in a bunch of cities and counties, we create a function so we don't have to repeat the code. This function allows us to pass in a city or county name to filter the monthly receipts table and then sum the Tax and Receipts columns. The result can then be acted on to print or aggregate.

In [23]:
# function to group sales by a specific location
# City or County passed in should be ALL CAPS
# Location_type can be 'City' or 'County'

def location_sum(location_type, location):
    # Filters the data to the specified location (city or county)
    location_filtered = mixbev_month.where(lambda row: row[location_type].upper() == location)

    # groups the data based on Establishment and location
    location_grouped = location_filtered.group_by('Establishment').group_by(location_type)
    # computes the sales based on the grouping
    location_summary = location_grouped.aggregate([
        ('Tax_sum', agate.Sum('Tax')),
        ('Receipts_sum', agate.Sum('Receipts'))
    ])
    
    # sorts the results by most sold
    location_summary_sorted = location_summary.order_by('Receipts_sum', reverse=True)

    # returns the sorted table so the caller can limit or print it
    return location_summary_sorted


## Austin sales and sums

With this, we reference the location_sum function above, passing the type of location (City) and the name of the city (AUSTIN). We then limit the result of that function to the first five records and print them, chaining those two methods together.

In [24]:
# uses the location_sum function to filter
austin = location_sum('City', 'AUSTIN')

# print the resulting table
austin.limit(5).print_table(max_column_width=60)

| Establishment                                  | City   |   Tax_sum | Receipts_sum |
| ---------------------------------------------- | ------ | --------- | ------------ |
| WLS BEVERAGE CO 110 E 2ND ST                   | AUSTIN | 61,818.82 |   922,668.96 |
| 400 BAR/CUCARACHA/CHUPACABRA/J 400 E 6TH ST    | AUSTIN | 46,656.99 |   696,372.99 |
| THE DOGWOOD DOMAIN 11420 ROCK ROSE AVE STE 700 | AUSTIN | 36,806.78 |   549,354.93 |
| THE BLIND PIG PUB 317 E 6TH ST                 | AUSTIN | 35,440.65 |   528,964.93 |
| W HOTEL AUSTIN 200 LAVACA ST                   | AUSTIN | 34,213.34 |   510,646.87 |


### String methods together to print a chart
So far, we've been printing tables, but Agate can also print charts. The **`print_bars`** method creates a simple, text-based bar chart.

In [25]:
# We'll use the same function, but instead of creating a new table,
# we'll just string on the limit and print bars methods
# print_bars needs two arguments: the label column and then the value column to make the chart from
location_sum('City', 'AUSTIN').limit(10).print_bars('Establishment', 'Receipts_sum', width=90)

Establishment                                      Receipts_sum
WLS BEVERAGE CO 110 E 2ND ST                         922,668.96 ▓░░░░░░░░░░░░░░░░░░░░░░░  
400 BAR/CUCARACHA/CHUPACABRA/J 400 E 6TH ST          696,372.99 ▓░░░░░░░░░░░░░░░░░        
THE DOGWOOD DOMAIN 11420 ROCK ROSE AVE STE 700       549,354.93 ▓░░░░░░░░░░░░░░           
THE BLIND PIG PUB 317 E 6TH ST                       528,964.93 ▓░░░░░░░░░░░░░            
W HOTEL AUSTIN 200 LAVACA ST                         510,646.87 ▓░░░░░░░░░░░░░            
SAN JACINTO BEVERAGE CORPORATI 98 SAN JACINTO BLVD   509,055.97 ▓░░░░░░░░░░░░░            
STUBB'S BAR-B-Q 801 RED RIVER ST                     447,926.87 ▓░░░░░░░░░░░              
SALC. INC.(HILTON HOTEL) 500 E 4TH ST                425,320.00 ▓░░░░░░░░░░░              
DH BEVERAGE LLC 604 BRAZOS ST                        423,596.87 ▓░░░░░░░░░░░              
TOP GOLF 2700 ESPERANZA XING                         417,740.00 ▓░░░░░░░░░░               
                          

### Total sales Austin

In [26]:
# Austin total sales as a city
# This sums the already-grouped table, but it works
location_sum('City', 'AUSTIN').aggregate(agate.Sum('Receipts_sum'))

Decimal('68206672.31')

## More Central Texas cities

In [27]:
location_sum('City', 'BASTROP').limit(5).print_table(max_column_width=60)

| Establishment                              | City    |  Tax_sum | Receipts_sum |
| ------------------------------------------ | ------- | -------- | ------------ |
| OLD TOWN RESTURANT AND BAR/PIN 931 MAIN ST | BASTROP | 4,142.81 |    61,832.99 |
| CHILI'S GRILL & BAR 734 HIGHWAY 71 W       | BASTROP | 2,889.64 |    43,128.96 |
| NEIGHBOR'S 601 CHESTNUT ST UNIT C          | BASTROP | 2,345.80 |    35,011.94 |
| LA HACIENDA RESTAURANT 1800 WALNUT ST      | BASTROP | 1,817.37 |    27,124.93 |
| VERANDA 910 MAIN ST                        | BASTROP | 1,494.10 |    22,300.00 |


In [28]:
location_sum('City', 'BEE CAVE').limit(3).print_table(max_column_width=60)

| Establishment                                       | City     |  Tax_sum | Receipts_sum |
| --------------------------------------------------- | -------- | -------- | ------------ |
| WOODY TAVERN AND GRILL, INC. 12801 SHOPS PKWY # 100 | BEE CAVE | 5,726.49 |    85,470.00 |
| MAUDIE'S HILL COUNTRY, LLC 12506 SHOPS PKWY         | BEE CAVE | 4,835.59 |    72,172.99 |
| CAFE BLUE 12800 HILL COUNTRY BLVD STE               | BEE CAVE | 4,782.66 |    71,382.99 |


In [29]:
location_sum('City', 'BUDA').limit(3).print_table(max_column_width=60)

| Establishment                  | City |  Tax_sum | Receipts_sum |
| ------------------------------ | ---- | -------- | ------------ |
| WILLIE'S JOINT 824 MAIN ST     | BUDA | 3,291.10 |    49,120.90 |
| CLEVELAND'S 100 N MAIN ST      | BUDA | 3,153.42 |    47,065.97 |
| PINBALLZ KINGDOM 15201 S IH 35 | BUDA | 2,979.49 |    44,470.00 |


In [30]:
location_sum('City', 'CEDAR PARK').limit(3).print_table(max_column_width=60)

| Establishment                                               | City       |  Tax_sum | Receipts_sum |
| ----------------------------------------------------------- | ---------- | -------- | ------------ |
| LUPE TORTILLA MEXICAN RESTAURA 4501 183A TOLL RD STE B      | CEDAR PARK | 6,865.75 |   102,473.88 |
| SHOOTERS BILLIARDS & SPORTS BA 601 E WHITESTONE BLVD BLDG 5 | CEDAR PARK | 5,626.05 |    83,970.90 |
| WILD WEST 401 E WHITESTONE BLVD STE B1                      | CEDAR PARK | 5,509.00 |    82,223.88 |


In [31]:
location_sum('City', 'DRIPPING SPRINGS').limit(3).print_table(max_column_width=60)

| Establishment                                       | City             |  Tax_sum | Receipts_sum |
| --------------------------------------------------- | ---------------- | -------- | ------------ |
| TRUDY'S FOUR STAR 13059 FOUR STAR BLVD              | DRIPPING SPRINGS | 4,518.27 |    67,436.87 |
| FLORES MEXICAN RESTAURANT 2440 E HIGHWAY 290 BLDG D | DRIPPING SPRINGS | 3,680.51 |    54,932.99 |
| PROOF & COOPER 18800 HAMILTON POOL RD               | DRIPPING SPRINGS | 2,522.68 |    37,651.94 |


In [32]:
location_sum('City', 'GEORGETOWN').limit(3).print_table(max_column_width=60)

| Establishment                      | City       |  Tax_sum | Receipts_sum |
| ---------------------------------- | ---------- | -------- | ------------ |
| EL MONUMENTO 205 W 2ND ST          | GEORGETOWN | 5,911.41 |    88,230.00 |
| HARDTAILS 1515 N IH 35             | GEORGETOWN | 4,839.14 |    72,225.97 |
| DOS SALSAS CAFE INC 1104 S MAIN ST | GEORGETOWN | 4,515.80 |    67,400.00 |


In [33]:
location_sum('City', 'KYLE').limit(3).print_table(max_column_width=60)

| Establishment                                       | City |  Tax_sum | Receipts_sum |
| --------------------------------------------------- | ---- | -------- | ------------ |
| CASA GARCIA'S MEXICAN RESTAURA 5401 FM 1626 STE 300 | KYLE | 4,934.34 |    73,646.87 |
| EVO ENTERTAINMENT CENTER 3200 KYLE XING             | KYLE | 3,966.66 |    59,203.88 |
| DOWN SOUTH RAILHOUSE 107 E CENTER ST                | KYLE | 3,880.57 |    57,918.96 |


In [34]:
location_sum('City', 'LAGO VISTA').limit(3).print_table(max_column_width=60)

| Establishment                                 | City       |  Tax_sum | Receipts_sum |
| --------------------------------------------- | ---------- | -------- | ------------ |
| COPPERHEAD GRILL 6115 LOHMANS FORD RD         | LAGO VISTA | 1,221.54 |    18,231.94 |
| NATURE'S POINT LTD 18206 LAKESHORE POINT BLVD | LAGO VISTA |   548.66 |     8,188.96 |
| NATURES POINT, LTD. 20552 HIGHLAND LAKE DR    | LAGO VISTA |     0.00 |         0.00 |


In [35]:
location_sum('City', 'LAKEWAY').limit(3).print_table(max_column_width=60)

| Establishment                                          | City    |  Tax_sum | Receipts_sum |
| ------------------------------------------------------ | ------- | -------- | ------------ |
| THE GROVE WINE BAR AND KITCHEN 3001 RANCH ROAD 620 S   | LAKEWAY | 7,177.17 |   107,121.94 |
| HIGH 5 ENTERTAINMENT 1502 RANCH ROAD 620 S             | LAKEWAY | 3,818.79 |    56,996.87 |
| FLORES MEXICAN RESTAURANT 2127 LOHMANS CROSSING RD STE | LAKEWAY | 3,796.95 |    56,670.90 |


In [36]:
location_sum('City', 'LEANDER').limit(3).print_table(max_column_width=60)

| Establishment                                        | City    |  Tax_sum | Receipts_sum |
| ---------------------------------------------------- | ------- | -------- | ------------ |
| BROOKLYN HEIGHTS PIZZERIA 3550 LAKELINE BLVD STE 135 | LEANDER | 4,052.69 |    60,487.91 |
| JARDIN DEL REY 703 S HIGHWAY 183                     | LEANDER | 2,717.78 |    40,563.88 |
| TAPATIA JALISCO #3 LLC 651 N US 183                  | LEANDER |   797.83 |    11,907.91 |


In [37]:
location_sum('City', 'LIBERTY HILL').limit(3).print_table(max_column_width=60)

| Establishment                                   | City         |  Tax_sum | Receipts_sum |
| ----------------------------------------------- | ------------ | -------- | ------------ |
| JARDIN CORONA 15395 W STATE HIGHWAY 29          | LIBERTY HILL | 2,336.29 |    34,870.00 |
| MARGARITA'S RESTAURANT 10280 W STATE HIGHWAY 29 | LIBERTY HILL | 1,191.79 |    17,787.91 |
| FIRE OAK DISTILLERY 4600 COUNTY ROAD 207        | LIBERTY HILL |     0.00 |         0.00 |


In [38]:
location_sum('City', 'PFLUGERVILLE').limit(3).print_table(max_column_width=60)

| Establishment                             | City         |  Tax_sum | Receipts_sum |
| ----------------------------------------- | ------------ | -------- | ------------ |
| WAGNOR BROTHERS 15505 INTERSTATE 35 STE C | PFLUGERVILLE | 7,013.09 |   104,672.99 |
| MAVERICKS 1700 GRAND AVENUE PKWY STE 2    | PFLUGERVILLE | 6,594.94 |    98,431.94 |
| HANOVER'S DRAUGHT HAUS 108 E MAIN ST      | PFLUGERVILLE | 4,891.46 |    73,006.87 |


In [39]:
location_sum('City', 'ROUND ROCK').limit(5).print_table(max_column_width=60)

| Establishment                                           | City       |   Tax_sum | Receipts_sum |
| ------------------------------------------------------- | ---------- | --------- | ------------ |
| RICK'S CABARET 3105 S INTERSTATE 35                     | ROUND ROCK | 12,413.29 |   185,272.99 |
| CHUY'S ROUND ROCK 2320 N I H 35                         | ROUND ROCK | 11,027.06 |   164,582.99 |
| THIRD BASE ROUND ROCK, LLC 3107 S INTERSTATE 35 STE 810 | ROUND ROCK | 10,677.58 |   159,366.87 |
| FAST EDDIE'S NEIGHBORHOOD BILL 100 PARKER DR            | ROUND ROCK |  9,946.41 |   148,453.88 |
| JACK ALLEN'S KITCHEN 2500 HOPPE TRL                     | ROUND ROCK |  9,771.95 |   145,850.00 |


In [40]:
location_sum('City', 'SAN MARCOS').limit(5).print_table(max_column_width=60)

| Establishment                                   | City       |   Tax_sum | Receipts_sum |
| ----------------------------------------------- | ---------- | --------- | ------------ |
| ZELICKS 336 W HOPKINS ST                        | SAN MARCOS | 10,309.55 |   153,873.88 |
| THE TAP ROOM & THE PORCH ON HO 129 E HOPKINS ST | SAN MARCOS |  7,167.72 |   106,980.90 |
| PLUCKERS WING BAR 105 N INTERSTATE 35           | SAN MARCOS |  7,061.19 |   105,390.90 |
| SEAN PATRICK'S 202 E SAN ANTONIO ST             | SAN MARCOS |  6,795.47 |   101,424.93 |
| CHIMY'S SAN MARCOS 217 E HOPKINS ST             | SAN MARCOS |  6,752.19 |   100,778.96 |


In [41]:
location_sum('City', 'SPICEWOOD').limit(3).print_table(max_column_width=60)

| Establishment                                      | City      |  Tax_sum | Receipts_sum |
| -------------------------------------------------- | --------- | -------- | ------------ |
| ANGEL'S ICEHOUSE 21815 W HWY 71                    | SPICEWOOD | 3,777.12 |    56,374.93 |
| POODIES HILLTOP ROADHOUSE 22308 STATE HIGHWAY 71 W | SPICEWOOD | 3,229.86 |    48,206.87 |
| APIS RESTAURANT 23526 STATE HIGHWAY 71 W           | SPICEWOOD | 1,522.64 |    22,725.97 |


In [42]:
location_sum('City', 'SUNSET VALLEY').limit(3).print_table(max_column_width=60)

| Establishment                                        | City          |  Tax_sum | Receipts_sum |
| ---------------------------------------------------- | ------------- | -------- | ------------ |
| DOC'S BACKYARD 5207 BRODIE LN # 100                  | SUNSET VALLEY | 5,830.13 |    87,016.87 |
| BJ'S RESTAURANT AND BREWHOUSE 5207 BRODIE LN STE 300 | SUNSET VALLEY | 4,680.01 |    69,850.90 |
| LONGHORN STEAKHOUSE #5423 4809 W HIGHWAY 290         | SUNSET VALLEY | 1,213.90 |    18,117.91 |


In [43]:
location_sum('City', 'WEST LAKE HILLS').limit(3).print_table(max_column_width=60)

| Establishment                                               | City            |  Tax_sum | Receipts_sum |
| ----------------------------------------------------------- | --------------- | -------- | ------------ |
| LUPE TORTILLA MEXICAN RESTAURA 701 S CAPITAL OF TEXAS HWY S | WEST LAKE HILLS | 6,273.54 |    93,634.93 |
| CHIPOTLE CHIPOTLE MEXICAN GRIL 3300 BEE CAVES RD STE 670    | WEST LAKE HILLS |    35.77 |       533.88 |
| LION & ROSE 701 S CAPITAL OF TEXAS HWY                      | WEST LAKE HILLS |     0.00 |         0.00 |


## Sales by county

Here we pass 'County' as the location type, followed by a county name in caps, to get the establishments with the highest sales in that county.

In [44]:
location_sum('County', 'BASTROP').limit(5).print_table(max_column_width=80)

| Establishment                                     | County  |   Tax_sum | Receipts_sum |
| ------------------------------------------------- | ------- | --------- | ------------ |
| LOST PINES BEVERAGE LLC 575 HYATT LOST PINES ROAD | Bastrop | 32,356.57 |   482,933.88 |
| OLD TOWN RESTURANT AND BAR/PIN 931 MAIN ST        | Bastrop |  4,142.81 |    61,832.99 |
| CHILI'S GRILL & BAR 734 HIGHWAY 71 W              | Bastrop |  2,889.64 |    43,128.96 |
| NEIGHBOR'S 601 CHESTNUT ST UNIT C                 | Bastrop |  2,345.80 |    35,011.94 |
| REGULATOR'S SPORTS BAR & GRILL 202 S AVENUE C     | Bastrop |  1,979.91 |    29,550.90 |


In [45]:
location_sum('County', 'CALDWELL').limit(3).print_table(max_column_width=80)

| Establishment                                             | County   |  Tax_sum | Receipts_sum |
| --------------------------------------------------------- | -------- | -------- | ------------ |
| GUADALAJARA MEXICAN RESTAURANT 1710 S COLORADO ST STE 110 | Caldwell | 1,218.06 |    18,180.00 |
| BRMJ, LLC 831 S COLORADO ST                               | Caldwell |   519.11 |     7,747.91 |
| MR TACO 1132 E PIERCE ST                                  | Caldwell |   395.16 |     5,897.91 |


In [46]:
location_sum('County', 'HAYS').limit(3).print_table(max_column_width=80)

| Establishment                                   | County |   Tax_sum | Receipts_sum |
| ----------------------------------------------- | ------ | --------- | ------------ |
| ZELICKS 336 W HOPKINS ST                        | Hays   | 10,309.55 |   153,873.88 |
| THE TAP ROOM & THE PORCH ON HO 129 E HOPKINS ST | Hays   |  7,167.72 |   106,980.90 |
| PLUCKERS WING BAR 105 N INTERSTATE 35           | Hays   |  7,061.19 |   105,390.90 |


In [47]:
location_sum('County', 'TRAVIS').limit(3).print_table(max_column_width=80)

| Establishment                                  | County |   Tax_sum | Receipts_sum |
| ---------------------------------------------- | ------ | --------- | ------------ |
| WLS BEVERAGE CO 110 E 2ND ST                   | Travis | 61,818.82 |   922,668.96 |
| 400 BAR/CUCARACHA/CHUPACABRA/J 400 E 6TH ST    | Travis | 46,656.99 |   696,372.99 |
| THE DOGWOOD DOMAIN 11420 ROCK ROSE AVE STE 700 | Travis | 36,806.78 |   549,354.93 |


In [48]:
location_sum('County', 'WILLIAMSON').limit(3).print_table(max_column_width=80)

| Establishment                                      | County     |   Tax_sum | Receipts_sum |
| -------------------------------------------------- | ---------- | --------- | ------------ |
| CHUY'S ROUND ROCK 2320 N I H 35                    | Williamson | 11,027.06 |   164,582.99 |
| ALAMO DRAFTHOUSE CINEMA 14028 N HIGHWAY 183 BLDG F | Williamson | 10,858.69 |   162,070.00 |
| JACK ALLEN'S KITCHEN 2500 HOPPE TRL                | Williamson |  9,771.95 |   145,850.00 |
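
For readers who want to see the steps behind a call like `location_sum('County', 'BASTROP')`, here is a plain-Python sketch of the filter, group, sum and sort pattern. It is not the notebook's `location_sum` definition, and it does not use agate. The function name `top_sales_by_location`, the sample rows, the case-insensitive match and the sort on the tax total are all assumptions made for illustration.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical sample rows; the real rows come from the Comptroller's CSV.
sample_rows = [
    {"Establishment": "BAR A 1 MAIN ST", "County": "Bastrop",
     "Tax": Decimal("10.00"), "Receipts": Decimal("150.00")},
    {"Establishment": "BAR A 1 MAIN ST", "County": "Bastrop",
     "Tax": Decimal("5.00"), "Receipts": Decimal("75.00")},
    {"Establishment": "GRILL B 2 OAK ST", "County": "Bastrop",
     "Tax": Decimal("20.00"), "Receipts": Decimal("300.00")},
    {"Establishment": "CAFE C 3 ELM ST", "County": "Hays",
     "Tax": Decimal("99.00"), "Receipts": Decimal("1500.00")},
]


def top_sales_by_location(rows, location_type, location, limit=5):
    """Sum Tax and Receipts per establishment for one city or county."""
    totals = defaultdict(
        lambda: {"Tax_sum": Decimal("0"), "Receipts_sum": Decimal("0")}
    )
    for row in rows:
        # Assumption: match case-insensitively so 'BASTROP' finds 'Bastrop'.
        if row[location_type].upper() != location.upper():
            continue
        entry = totals[row["Establishment"]]
        entry["Tax_sum"] += row["Tax"]
        entry["Receipts_sum"] += row["Receipts"]
    # Assumption: rank by the tax total, largest first.
    ranked = sorted(
        totals.items(), key=lambda item: item[1]["Tax_sum"], reverse=True
    )
    return ranked[:limit]


top_bastrop = top_sales_by_location(sample_rows, "County", "BASTROP", limit=2)
for establishment, sums in top_bastrop:
    print(establishment, sums["Tax_sum"], sums["Receipts_sum"])
```

Summing per establishment before sorting matters when a location has more than one row for the same establishment, as with the two `BAR A` rows in the sample.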


## Sales by ZIP Code
For this, we'll do a simple pivot on Zip, but instead of using the default Count aggregation, we'll pass in an aggregation that sums the Tax values for each Zip. We then use order_by in reverse to get the top values.

In [49]:
# top ZIP codes by Tax value
zip_tax = mixbev_month.pivot('Zip', aggregation=agate.Sum('Tax')).order_by('Sum', reverse=True)
zip_tax.limit(5).print_table()

| Zip   |          Sum |
| ----- | ------------ |
| 78701 | 2,040,719.27 |
| 78205 |   761,269.84 |
| 75201 |   685,529.78 |
| 77002 |   549,587.31 |
| 77006 |   533,970.55 |


In [50]:
# top ZIP codes by gross receipts
zip_receipts = mixbev_month.pivot('Zip', aggregation=agate.Sum('Receipts')).order_by('Sum', reverse=True)
zip_receipts.limit(5).print_table()

| Zip   |           Sum |
| ----- | ------------- |
| 78701 | 30,458,497.06 |
| 78205 | 11,362,236.70 |
| 75201 | 10,231,787.96 |
| 77002 |  8,202,795.88 |
| 77006 |  7,969,709.90 |
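
To make the pivot-with-sum idea concrete, here is a rough plain-Python equivalent of what the two cells above compute: group rows by ZIP code, add up one numeric column, and sort the totals from largest to smallest. The `records` values and the helper name `sum_by_key` are made up for illustration and are not part of agate or of the notebook's data.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical (Zip, Tax) pairs standing in for rows of mixbev_month.
records = [
    ("78701", Decimal("100.50")),
    ("78205", Decimal("40.25")),
    ("78701", Decimal("60.00")),
    ("75201", Decimal("55.00")),
]


def sum_by_key(pairs):
    """Group (key, value) pairs by key, sum the values, sort descending."""
    totals = defaultdict(Decimal)  # Decimal() starts each key at 0
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


zip_totals = sum_by_key(records)
for zip_code, total in zip_totals:
    print(zip_code, total)
```

The sketch uses `Decimal` rather than `float` so that currency totals do not pick up binary rounding error.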
