## State of Washington - Licensee - The Clone Zone

* UBI: 603347650

We'll use the [`cannapy`](https://github.com/CannabisData/cannapy) library to access the portal data. `cannapy` aims to provide an abstract interface for accessing and working with *Cannabis* data from around the world. It uses [xmunoz](https://github.com/xmunoz)'s [`sodapy`](https://github.com/xmunoz/sodapy) client to access Socrata-based open data portals and can return the data as [Pandas DataFrames](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html).

### Dataset: Licensed Businesses

* Canonical Dataset ID: **bhbp-x4eb**
* Detail screen on the WSLCB Portal: https://data.lcb.wa.gov/Licensing/Licensed-Businesses/u3zh-ri66
* Detail screen on Socrata's Open Data Foundry: https://dev.socrata.com/foundry/data.lcb.wa.gov/bhbp-x4eb
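
Because `cannapy` builds on `sodapy`, the same dataset can also be queried with `sodapy` directly. The following is a minimal sketch and not part of `cannapy`: the helper names `ubi_filter` and `fetch_licensee_rows` are hypothetical, and it assumes the `ubi` field is stored as text so that SoQL's `starts_with` function applies. Filtering on the server avoids downloading the full dataset when only one licensee is of interest.

```python
def ubi_filter(licensee_ubi):
    """Build a SoQL $where clause for rows whose `ubi` starts with a nine-digit UBI."""
    if not (licensee_ubi.isdigit() and len(licensee_ubi) == 9):
        raise ValueError('expected a nine-digit UBI, got {!r}'.format(licensee_ubi))
    return "starts_with(ubi, '{}')".format(licensee_ubi)


def fetch_licensee_rows(licensee_ubi, app_token=None, dataset_id='bhbp-x4eb'):
    """Return matching rows as a list of dicts (hypothetical helper; needs network access)."""
    # Imported lazily so that ubi_filter() can be used without sodapy installed
    from sodapy import Socrata

    client = Socrata('data.lcb.wa.gov', app_token)
    try:
        # Assumption: the API field is named `ubi`, as in the DataFrame column
        return client.get(dataset_id, where=ubi_filter(licensee_ubi))
    finally:
        client.close()


print(ubi_filter('603347650'))
```

Calling `fetch_licensee_rows('603347650')` would then return only the rows for this licensee, provided the assumptions above hold for the portal.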

In [1]:
import time
import cannapy.us.wa.wslcb.portal as wslcb
import pandas as pd

In [2]:
# Specify your own Socrata App Token if you plan to experiment
app_token = 'XaB9MBqc81C3KT4Vps6Wh5LZt'

# Instantiate a cannapy interface to the WSLCB open data portal
portal = wslcb.WSLCBPortal(app_token)

# We'll be using the Licensed Businesses dataset
dataset_id = 'bhbp-x4eb'

# And we're looking for data on a particular licensee
licensee_ubi = '603347650'

In [3]:
# Check when the dataset was last updated
last_updated = portal.dataset_last_updated(dataset_id)
print('Last updated: {}'.format(time.strftime('%c', last_updated)))

Last updated: Wed Jun 27 14:02:12 2018
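
The `time.strftime` call above suggests that `dataset_last_updated` returns a `time.struct_time`. Assuming that is the case, the sketch below converts such a value into a `datetime` so that update times can be printed in ISO 8601 form and compared. The helper names `to_datetime` and `days_between` are hypothetical, and the sample values are the two "Last updated" timestamps reported in this notebook.

```python
import datetime
import time


def to_datetime(struct):
    """Convert a time.struct_time to a naive datetime (no time zone is attached)."""
    return datetime.datetime(*struct[:6])


def days_between(earlier, later):
    """Whole days elapsed between two struct_time values."""
    return (to_datetime(later) - to_datetime(earlier)).days


fmt = '%Y-%m-%d %H:%M:%S'
licenses_updated = time.strptime('2018-06-27 14:02:12', fmt)
visits_updated = time.strptime('2019-02-25 13:34:36', fmt)

print(to_datetime(licenses_updated).isoformat())       # 2018-06-27T14:02:12
print(days_between(licenses_updated, visits_updated))  # 242
```

A gap of this size is worth keeping in mind when joining datasets that were refreshed at different times.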


In [4]:
# Retrieve the dataset preloaded into a Pandas DataFrame
licenses = portal.get_dataframe(dataset_id)

In [5]:
# The UBI column uniquely identifies each licensee, but obscures ownership of multiple licenses by the same entity.
# Let's break that column apart into its constituent parts:
# Unified Business Identifier (UBI): first nine digits
# Business ID Number: next three digits
# Location Number: last four digits
df_v2 = licenses.rename(columns={'ubi': 'ubi_source'})
df_v2['ubi'] = df_v2.ubi_source.str[0:9]
df_v2['ubi_business_id'] = df_v2.ubi_source.str[9:12]
df_v2['ubi_location'] = df_v2.ubi_source.str[12:]
licensee_licenses = df_v2.loc[df_v2['ubi'] == licensee_ubi]
licensee_licenses

| | license | type | createdate | active | organization | address | address_line_2 | city | state | zip | county | dayphone | ubi_source | ubi | ubi_business_id | ubi_location |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 352 | 412598 | MARIJUANA PRODUCER TIER 2/MARIJUANA PROCESSOR | 20180301 | ACTIVE (ISSUED) | THE CLONE ZONE | 17835 59TH AVE NE BLDG 8 STE B | | ARLINGTON | WA | 982236429 | SNOHOMISH | 4252394473 | 6033476500010000 | 603347650 | 1 | 0 |
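
The slicing rule described in the comments above (nine digits, then three, then four) can also be written as a small standalone function. This is a minimal sketch: `UbiParts` and `split_ubi` are hypothetical names, and the length check assumes every source value is exactly 16 digits, as the value in this notebook is. Keeping the parts as strings preserves leading zeros such as `001`.

```python
from collections import namedtuple

# Hypothetical container for the three components of a 16-digit source value
UbiParts = namedtuple('UbiParts', ['ubi', 'business_id', 'location'])


def split_ubi(ubi_source):
    """Split a 16-digit string into UBI (9), Business ID Number (3), and Location Number (4)."""
    if len(ubi_source) != 16 or not ubi_source.isdigit():
        raise ValueError('expected 16 digits, got {!r}'.format(ubi_source))
    return UbiParts(ubi_source[0:9], ubi_source[9:12], ubi_source[12:16])


parts = split_ubi('6033476500010000')
print(parts)  # UbiParts(ubi='603347650', business_id='001', location='0000')
```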


### Dataset: Enforcement Visits

* Canonical Dataset ID: **w7wg-8m52**
* Detail screen on the WSLCB Portal: https://data.lcb.wa.gov/dataset/Enforcement-Visits-Dataset/jizx-thwg
* Detail screen on Socrata's Open Data Foundry: https://dev.socrata.com/foundry/data.lcb.wa.gov/w7wg-8m52

In [7]:
# Let's see how many enforcement visits the licensee has hosted
dataset_id = 'w7wg-8m52'

# Select the licensee's license number by position rather than by a hardcoded row label
# (this assumes the licensee holds a single license, as the result above shows)
licensee_license_number = licensee_licenses['license'].iloc[0]

# Check when the dataset was last updated
last_updated = portal.dataset_last_updated(dataset_id)
print('Last updated: {}'.format(time.strftime('%c', last_updated)))

Last updated: Mon Feb 25 13:34:36 2019


In [8]:
# Retrieve the dataset preloaded into a Pandas DataFrame
enforcement_visits = portal.get_dataframe(dataset_id)

# Suppress the chained assignment warning: https://stackoverflow.com/a/20627316/7622699
pd.options.mode.chained_assignment = None

# Pull aside the enforcement visits for the selected licensee, sorted by 'date'
licensee_enforcement_visits = enforcement_visits.loc[
    enforcement_visits['license_number'] == licensee_license_number
].sort_values(by='date')

licensee_enforcement_visits

| | date | license_number | city_name | county_name | activity |
|---|---|---|---|---|---|
| 21298 | 2015-09-15T00:00:00.000 | 412598 | ARLINGTON | SNOHOMISH | Marijuana Premises Check |
| 20888 | 2015-10-21T00:00:00.000 | 412598 | ARLINGTON | SNOHOMISH | Marijuana Premises Check |
| 20824 | 2015-10-23T00:00:00.000 | 412598 | ARLINGTON | SNOHOMISH | Marijuana Premises Check |
| 20701 | 2015-10-30T00:00:00.000 | 412598 | ARLINGTON | SNOHOMISH | Marijuana Premises Check |
| 20270 | 2015-12-10T00:00:00.000 | 412598 | ARLINGTON | SNOHOMISH | Marijuana Premises Check |
| 20086 | 2015-12-29T00:00:00.000 | 412598 | ARLINGTON | SNOHOMISH | Marijuana Premises Check |
| 19359 | 2016-03-02T00:00:00.000 | 412598 | ARLINGTON | SNOHOMISH | Marijuana Premises Check |
| 19215 | 2016-03-09T00:00:00.000 | 412598 | ARLINGTON | SNOHOMISH | Marijuana Premises Check |
| 16949 | 2016-07-25T00:00:00.000 | 412598 | ARLINGTON | SNOHOMISH | Marijuana Premises Check |
| 16934 | 2016-07-26T00:00:00.000 | 412598 | ARLINGTON | SNOHOMISH | Marijuana Premises Check |
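
The `date` values above are ISO 8601 strings, so sorting them as text already yields chronological order. For date arithmetic or grouping, it helps to convert them to real timestamps first. The sketch below rebuilds only the `date` column from the output above and counts visits per calendar year; the variable names are illustrative.

```python
import pandas as pd

# The ten visit dates listed above, as the portal returns them (strings)
visits = pd.DataFrame({'date': [
    '2015-09-15T00:00:00.000', '2015-10-21T00:00:00.000',
    '2015-10-23T00:00:00.000', '2015-10-30T00:00:00.000',
    '2015-12-10T00:00:00.000', '2015-12-29T00:00:00.000',
    '2016-03-02T00:00:00.000', '2016-03-09T00:00:00.000',
    '2016-07-25T00:00:00.000', '2016-07-26T00:00:00.000',
]})

# Parse the strings into datetime64 values
visits['date'] = pd.to_datetime(visits['date'])

# Count visits per calendar year
visits_per_year = visits.groupby(visits['date'].dt.year).size()
print(visits_per_year)
```

For this licensee the result is six visits in 2015 and four in 2016.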


### Dataset: Violations

* Canonical Dataset ID: **dgm4-3cm6**
* Detail screen on the WSLCB Portal: https://data.lcb.wa.gov/dataset/Violations-Dataset/dx3i-tzh2
* Detail screen on Socrata's Open Data Foundry: https://dev.socrata.com/foundry/data.lcb.wa.gov/dgm4-3cm6

In [9]:
# Let's pull up all of the licensee's violations
dataset_id = 'dgm4-3cm6'

# Check when the dataset was last updated
last_updated = portal.dataset_last_updated(dataset_id)
print('Last updated: {}'.format(time.strftime('%c', last_updated)))

Last updated: Mon Feb 25 13:33:58 2019


In [10]:
# Retrieve the dataset preloaded into a Pandas DataFrame
violations = portal.get_dataframe(dataset_id)

# Pull aside the violations by the selected licensee, sorted by 'visit_date'
licensee_violations = violations.loc[
    violations['license_number'] == licensee_license_number
].sort_values(by='visit_date')

licensee_violations

| | visit_date | license_number | county_name | city_name | case | violation_code | wac_code | penalty_type |
|---|---|---|---|---|---|---|---|---|
| 2324 | 2015-10-21T00:00:00.000 | 412598 | SNOHOMISH | ARLINGTON | 7G5294A | | 314.55.083(3) | AVN |
| 2325 | 2015-10-21T00:00:00.000 | 412598 | SNOHOMISH | ARLINGTON | 7G5294A | | 314.55.020 | Written Warning |
| 2326 | 2015-10-21T00:00:00.000 | 412598 | SNOHOMISH | ARLINGTON | 7G5294A | | 314.55.083(4) | AVN |
| 2315 | 2015-10-23T00:00:00.000 | 412598 | SNOHOMISH | ARLINGTON | 3G5296A | | 314.55.083(3) | AVN |
| 2211 | 2015-12-10T00:00:00.000 | 412598 | SNOHOMISH | ARLINGTON | 7G5344A | | 314.55.083(1) | Written Warning |
| 2212 | 2015-12-10T00:00:00.000 | 412598 | SNOHOMISH | ARLINGTON | 7G5344A | | 314.55.097 | Written Warning |
| 2214 | 2015-12-10T00:00:00.000 | 412598 | SNOHOMISH | ARLINGTON | 7G5344A | | 314.55.083(4) | AVN |
| 2084 | 2016-03-02T00:00:00.000 | 412598 | SNOHOMISH | ARLINGTON | 7G6062A | | 314.55.083(4) | Written Warning |
| 2088 | 2016-03-02T00:00:00.000 | 412598 | SNOHOMISH | ARLINGTON | 7G6062A | | 314.55.097 | Written Warning |
| 1780 | 2016-07-25T00:00:00.000 | 412598 | SNOHOMISH | ARLINGTON | 7B6207A | | 314.55.085 | AVN |
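
With both result sets in hand, one natural follow-up is to summarize the violations and relate them to the enforcement visits. The sketch below is illustrative only: it re-enters the values shown in this notebook by hand, with dates shortened to `YYYY-MM-DD`, and matches the two datasets on the visit date alone, which assumes that a violation's `visit_date` corresponds to a visit's `date`.

```python
import pandas as pd

# (visit_date, wac_code, penalty_type) for the ten violations listed above
violations = pd.DataFrame(
    [
        ('2015-10-21', '314.55.083(3)', 'AVN'),
        ('2015-10-21', '314.55.020', 'Written Warning'),
        ('2015-10-21', '314.55.083(4)', 'AVN'),
        ('2015-10-23', '314.55.083(3)', 'AVN'),
        ('2015-12-10', '314.55.083(1)', 'Written Warning'),
        ('2015-12-10', '314.55.097', 'Written Warning'),
        ('2015-12-10', '314.55.083(4)', 'AVN'),
        ('2016-03-02', '314.55.083(4)', 'Written Warning'),
        ('2016-03-02', '314.55.097', 'Written Warning'),
        ('2016-07-25', '314.55.085', 'AVN'),
    ],
    columns=['visit_date', 'wac_code', 'penalty_type'],
)

# Dates of the ten enforcement visits from the Enforcement Visits dataset
visit_dates = pd.Series([
    '2015-09-15', '2015-10-21', '2015-10-23', '2015-10-30', '2015-12-10',
    '2015-12-29', '2016-03-02', '2016-03-09', '2016-07-25', '2016-07-26',
])

# How many violations ended in each penalty type
by_penalty = violations['penalty_type'].value_counts()

# Which visits share a date with at least one recorded violation
has_violation = visit_dates.isin(violations['visit_date'])

print(by_penalty)
print('{} of {} visits have a violation on the same date'.format(
    int(has_violation.sum()), len(visit_dates)))
```

On this data, the penalty types split evenly (five `AVN`, five `Written Warning`), and five of the ten visits share a date with a recorded violation.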
