# Imports 

In [1]:
import pandas as pd
import re 
import numpy as np
## import the datetime class directly; a separate `import datetime`
## would be shadowed by this line anyway
from datetime import datetime

## a couple recordlinkage packages
import fuzzywuzzy
import recordlinkage

## nltk for string distance
import nltk

## jarowinkler
from pyjarowinkler import distance

## repeated printouts
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# 0. Load and view the head of the two datasets

Located in `public_data` in our class repo:

- `sd_forfuzzy.csv`: subsample of San Diego businesses from tax certificate data; same data we used in exact match activity with NAICS codes
- `ppploans_forfuzzy.csv`: sample of businesses in San Diego that received federal Paycheck Protection Program (PPP) loans for help weathering COVID-19

In [2]:
## code to load the two datasets and view head
## NOTE: if following along and you moved this notebook to another
## directory, you'll need to change the pathname

sd = pd.read_csv("../../../public_data/sd_forfuzzy.csv")
sd.head()
sd.shape


ppp = pd.read_csv("../../../public_data/ppploans_forfuzzy.csv")
ppp.head()
ppp.shape

Unnamed: 0,dba_name,business_owner_name,naics_code,address_no,address_pd,address_road,address_sfx,address_city,address_zip,zip_6dig
0,KLEINFELDER CONSTRUCTION SERVICES,KLEINFELDER CONSTRUCTION SERVICES INC,54161,550.0,W,C,ST,SAN DIEGO,92101-3532,92101
1,KLEINFELDER INC,KLEINFELDER INC,541615,770.0,,01ST,AVE,SAN DIEGO,92101-6171,92101
2,PN SHUTTLE SERVICE,NICHOLAS C WATSON & PAUL M BAK-SKLENER,4855,,,,,SAN DIEGO,92176-1038,92176
3,DENTAL ARTICULATING PAPER CO,KLEIN STEVEN H,422,9285.0,,DOWDY,DR,SAN DIEGO,92126-6381,92126
4,COLORS INTERIOR DESIGN,CHIEN-HO SUN,54141,17303.0,,CARRANZA,DR,SAN DIEGO,92127-1326,92127


(408, 10)

Unnamed: 0,BorrowerName,BorrowerAddress,BorrowerCity,BorrowerZip,FranchiseName,NAICSCode,BorrowerZip_6dig,Race,Ethnicity
0,EPSILON SYSTEMS SOLUTIONS INC,9242 LIGHTWAVE AVE Ste 100,SAN DIEGO,92123-6402,,336611.0,92123,Unanswered,Unknown/NotStated
1,YMCA OF SAN DIEGO COUNTY,3708 Ruffin Rd,San Diego,92123-1812,,813410.0,92123,Unanswered,Unknown/NotStated
2,"CERCA TROVA RESTAURANT GROUP HOLDINGS, INC.",7676 HAZARD CENTER DR,SAN DIEGO,92108-4501,Outback Steakhouse,722511.0,92108,Unanswered,Unknown/NotStated
3,RETAIL SERVICES WIS CORPORATION,9265 SKY PARK CT STE 100,SAN DIEGO,92123-4375,,561499.0,92123,White,Not Hispanic or Latino
4,"THE KLEINFELDER GROUP, INC.",550 West C Street,SAN DIEGO,92101,,541330.0,92101,Unanswered,Unknown/NotStated


(5580, 9)

In [3]:
## try exact matching using sd as the left hand side
## data and business name
test_exact = pd.merge(sd, 
                     ppp,
                     how = "left",
                     left_on = ["dba_name", "zip_6dig"],
                     right_on = ["BorrowerName",
                                "BorrowerZip_6dig"],
                     suffixes = ["_sd", "_ppp"],
                     indicator = "sd_match_status")
test_exact.sd_match_status.value_counts()

## see only two real matches (duran freight duplicated 
## across non-cap and cap address)
test_exact.loc[test_exact.sd_match_status == "both",
              ['dba_name', 'BorrowerName',
              'zip_6dig', 'BorrowerZip_6dig',
              'BorrowerAddress']]

left_only     406
both            3
right_only      0
Name: sd_match_status, dtype: int64

Unnamed: 0,dba_name,BorrowerName,zip_6dig,BorrowerZip_6dig,BorrowerAddress
13,AFFIRMED HOUSING GROUP INC,AFFIRMED HOUSING GROUP INC,92128,92128.0,13520 EVENING CREEK DR N Suite 160
402,DURAN FREIGHT CORPORATION,DURAN FREIGHT CORPORATION,92154,92154.0,7295 SIEMPRE VIVA RD
403,DURAN FREIGHT CORPORATION,DURAN FREIGHT CORPORATION,92154,92154.0,7295 Siempre Viva Rd


# 1. More manual approach to fuzzy matching

In these cells, we'll review what's going on "under the hood" in fuzzy matching packages. We'll use an example of two PPP loan recipient businesses:

- THE KLEINFELDER GROUP, INC.
- DURAN FREIGHT CORPORATION

## 1.1 Write regex patterns to find possible matches

In these cells, we:
    
- Define a regex pattern to characterize variations of each of the PPP business names
- Use list comprehension and `re.match` (covered in regex lecture) to find candidate businesses in the San Diego data for matches

In [4]:
klein_patt = r".*(\s+)?KLEINFELDER\s+.*"
klein_possible = [biz for biz in sd.dba_name
                 if re.match(klein_patt, biz) is not None]
klein_possible

['KLEINFELDER CONSTRUCTION SERVICES', 'KLEINFELDER INC']

In [5]:
duran_patt = r".*(\s+)?DURAN\s+.*"
duran_possible = [biz for biz in sd.dba_name
                 if re.match(duran_patt, biz) is not None]
duran_possible

['DURAN FREIGHT CORPORATION']

## 1.2 Calculate string similarity in business names

For Kleinfelder Group in the PPP data, we see two possible matches: Kleinfelder Inc and Kleinfelder Construction Services. Still focusing on the business name field, the first approach is to calculate the string similarity between the name as spelled in the PPP loan data and the name as spelled in the San Diego tax data.

**General approach**: minimize the distance between the strings

**Specifics**: there are many string similarity/distance metrics. Here, we'll focus on a couple:

1. Edit distance (aka Levenshtein): the number of deletions, substitutions, and insertions required to transform string A into string B
2. Jaccard distance: transforms each string into a set of unique characters; the "Jaccard similarity" is the size of the intersection of set A and set B divided by the size of their union; distance is 1 - similarity
3. Jaro-Winkler distance: the Jaro part of the algorithm broadly measures the number of characters the two strings have in common; the Winkler part makes similarities at the beginning of the string count more than similarities at the end

For more discussion, see:

- Discussion of edit versus jaccard: https://python.gotrained.com/nltk-edit-distance-jaccard-distance/
- Discussion of `fuzzywuzzy` package for string similarity: https://towardsdatascience.com/fuzzy-string-matching-in-python-68f240d910fe
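
To make the first two metrics concrete, here is a minimal pure-Python sketch of edit distance and Jaccard distance applied to the cleaned Kleinfelder name. The functions `edit_distance` and `jaccard_distance` are illustrative stand-ins written for this note, not the `nltk` implementations, though on this pair they should agree with the `nltk` values in the output tables further down.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution (or match)
        prev = curr
    return prev[-1]


def jaccard_distance(a: str, b: str) -> float:
    """1 - |A & B| / |A | B| on the sets of unique characters."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:  # both strings empty: treat as identical
        return 0.0
    return 1 - len(set_a & set_b) / len(union)


focal = "KLEINFELDER GROUP INC"
other = "KLEINFELDER INC"
print(edit_distance(focal, other))               # 6 (delete "GROUP ")
print(round(jaccard_distance(focal, other), 6))  # 0.285714 (10 shared of 14 unique)
```

Because Jaccard distance only looks at which characters appear, not their order, unrelated names built from similar letters can score as "close"; that is why it ranks several non-Kleinfelder businesses ahead of KLEINFELDER INC in the output below.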

In [7]:
## first, let's process the biz name
## and remove everything that's not (^)
## words or spaces and also remove the "the"
## at the beginning of the string
focal_ppp_raw = "THE KLEINFELDER GROUP, INC."
focal_ppp_cleaner = re.sub(r"^THE\s+",
                           "",
                    re.sub(r"[^\w\s]", "", focal_ppp_raw))
focal_ppp_cleaner

'KLEINFELDER GROUP INC'

In [8]:
### look at a few different distance metrics
sd['dist_focal_edit'] = [nltk.edit_distance(focal_ppp_cleaner, other_name)
                     for other_name in sd.dba_name]

sd[['dba_name', 'dist_focal_edit']].sort_values(by = 'dist_focal_edit')

sd['dist_focal_jacc'] = [nltk.jaccard_distance(set(focal_ppp_cleaner), set(other_name))
                     for other_name in sd.dba_name]

sd[['dba_name', 'dist_focal_jacc']].sort_values(by = 'dist_focal_jacc')


Unnamed: 0,dba_name,dist_focal_edit
1,KLEINFELDER INC,6
10,IDS GROUP INC,9
211,ONLINE AUTO GROUP INC,9
282,ALPINE FENCE INC,12
99,HENKELS & MCCOY INC,12
...,...,...
248,SAN DIEGO SPORTS MEDICINE & FAMILY HEALTH CNTR,39
299,SAN DIEGO PSYCHOANALYTIC SOCIETY AND INSTITUTE,40
348,UNIVERSAL CONFERENCE MANAGEMENT SYSTEMS & SUPPORT,41
382,SAN YSIDRO HEALTH MOUNTAIN HEALTH FAMILY MEDICINE,41


Unnamed: 0,dba_name,dist_focal_jacc
168,DEMSKI FINANCIAL GROUP,0.176471
3,DENTAL ARTICULATING PAPER CO,0.250000
287,GUDINO ELECTRIC,0.266667
241,DR YING LIU OD PC,0.266667
1,KLEINFELDER INC,0.285714
...,...,...
181,HAIRSTYLIST,0.842105
285,SWEETS BY NANA,0.850000
164,NEXTX,0.875000
88,STICHIC,0.882353


In [9]:
## jaro is similarity score so 1 - that
sd['dist_focal_jaro'] = [1-distance.get_jaro_distance(focal_ppp_cleaner, other_name,
                                        winkler = True, scaling = 0.1)
                     for other_name in sd.dba_name]

sd[['dba_name', 'dist_focal_jaro']].sort_values(by = 'dist_focal_jaro')

Unnamed: 0,dba_name,dist_focal_jaro
1,KLEINFELDER INC,0.06
0,KLEINFELDER CONSTRUCTION SERVICES,0.14
184,KB ENTERPRISES LLC,0.27
81,KAIZEN BUILT INC,0.29
41,DELTA GROUP ELECTRONICS INC,0.30
...,...,...
315,HOTZ COSMETICS,0.66
256,BRADY HOLLY,0.66
341,PHO CALI,0.72
243,CHAU TRAN,0.73
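
For intuition about where these scores come from, the following is a from-scratch sketch of the textbook Jaro and Jaro-Winkler similarity formulas. It is an assumption-laden illustration rather than the `pyjarowinkler` source: the function names are made up for this note, and the package may differ in details such as rounding.

```python
def jaro_similarity(a: str, b: str) -> float:
    """Textbook Jaro similarity: 1.0 is identical, 0.0 is no characters in common."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # characters only "match" if they are within this many positions of each other
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    matched_a = [False] * len(a)
    matched_b = [False] * len(b)
    matches = 0
    for i, char_a in enumerate(a):
        lo, hi = max(0, i - window), min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not matched_b[j] and char_a == b[j]:
                matched_a[i] = matched_b[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # transpositions: matched characters that appear in a different order
    seq_a = [c for c, m in zip(a, matched_a) if m]
    seq_b = [c for c, m in zip(b, matched_b) if m]
    transpositions = sum(x != y for x, y in zip(seq_a, seq_b)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(a: str, b: str, scaling: float = 0.1) -> float:
    """Boost the Jaro score for a shared prefix of up to 4 characters."""
    jaro = jaro_similarity(a, b)
    prefix = 0
    for char_a, char_b in zip(a[:4], b[:4]):
        if char_a != char_b:
            break
        prefix += 1
    return jaro + prefix * scaling * (1 - jaro)


sim = jaro_winkler_similarity("KLEINFELDER GROUP INC", "KLEINFELDER INC")
print(round(1 - sim, 2))  # 0.06, in line with the first row of the table above
```

The shared `KLEI` prefix is what pushes both Kleinfelder businesses to the top of the ranking, which is the behavior the Winkler adjustment is designed to produce.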


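## 1.2 Use the zip code field to try to rule out false positive matches

"Blocking" on zip code means requiring an exact match on that field (`zip_6dig` in the San Diego data, `BorrowerZip_6dig` in the PPP data) before we compare names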

In [10]:
## pull the focal PPP row (name, address, zip, NAICS) as a one-row
## dataframe; the next cell uses .iloc[0] to get the zip as a
## single value rather than a series
focal_ppp_zip = ppp.loc[ppp.BorrowerName == 
                "THE KLEINFELDER GROUP, INC.",
                ["BorrowerName",
                 "BorrowerAddress",
                 "BorrowerZip_6dig", "NAICSCode"]].copy()
focal_ppp_zip

Unnamed: 0,BorrowerName,BorrowerAddress,BorrowerZip_6dig,NAICSCode
4,"THE KLEINFELDER GROUP, INC.",550 West C Street,92101,541330.0


In [11]:
## create True/False flag for whether the zip is the same as the focal biz
## (the comparison already returns booleans, so np.where isn't needed)
sd['is_match_zip'] = (sd.zip_6dig == 
                    focal_ppp_zip.BorrowerZip_6dig.iloc[0])

sd.loc[(sd.is_match_zip) &
      (sd.dba_name.isin(klein_possible)),
      ['dba_name'] + [col for col in sd.columns if "address" in col] + 
      ["zip_6dig", "naics_code"]]

Unnamed: 0,dba_name,address_no,address_pd,address_road,address_sfx,address_city,address_zip,zip_6dig,naics_code
0,KLEINFELDER CONSTRUCTION SERVICES,550,W,C,ST,SAN DIEGO,92101-3532,92101,54161
1,KLEINFELDER INC,770,,01ST,AVE,SAN DIEGO,92101-6171,92101,541615


## 1.3 Construct a match score summarizing these two fields (zip code and name similarity)

Record linkage methods have different ways for aggregating across fields

Here, we're going with a simple one of:

- Need to match the zip code of the focal Kleinfelder group directly
- Within those, find the average of the Jaro-Winkler and Jaccard string distance measures (we're excluding edit distance from that average since it's on a different scale)

Whichever candidate has the lowest average of the two distances is considered the best match
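
The scoring rule above can be sketched in plain Python before applying it to the full dataframe. The three real rows use the Jaccard and Jaro-Winkler distances computed earlier for zip 92101; the fourth record is hypothetical and is included only to show the zip requirement filtering out a name that would otherwise score well.

```python
# (jaccard distance, jaro-winkler distance, same zip as focal business?)
candidates = {
    "KLEINFELDER INC": (0.285714, 0.06, True),
    "KLEINFELDER CONSTRUCTION SERVICES": (0.294118, 0.14, True),
    "WESTERN FINANCIAL CORP": (0.444444, 0.33, True),
    # HYPOTHETICAL record, not in the data: similar name, different zip
    "KLEINFELDER GROUP": (0.10, 0.02, False),
}

# step 1: require an exact zip match
# step 2: average the two distances that share a 0-1 scale
mean_dist = {name: (jacc + jaro) / 2
             for name, (jacc, jaro, same_zip) in candidates.items()
             if same_zip}

# step 3: the lowest average distance is the best match
best_match = min(mean_dist, key=mean_dist.get)
print(best_match)                       # KLEINFELDER INC
print(round(mean_dist[best_match], 6))  # 0.172857
```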

In [13]:
## get string distance column names
string_dist_fields = [col for col in sd.columns if "dist_" in col and 
                     "edit" not in col]


## take the row mean (axis = 1) across those columns
mean_distances = sd[string_dist_fields].mean(axis = 1)

## assign that as a new column
sd['mean_string_dist'] = mean_distances

## sort from lowest to highest string
## distance among zip matches
sd.loc[sd.is_match_zip].sort_values(by = 
                    "mean_string_dist").head(3)


['dist_focal_jacc', 'dist_focal_jaro']

Unnamed: 0,dba_name,business_owner_name,naics_code,address_no,address_pd,address_road,address_sfx,address_city,address_zip,zip_6dig,dist_focal_edit,dist_focal_jacc,dist_focal_jaro,is_match_zip,mean_string_dist
1,KLEINFELDER INC,KLEINFELDER INC,541615,770,,01ST,AVE,SAN DIEGO,92101-6171,92101,6,0.285714,0.06,True,0.172857
0,KLEINFELDER CONSTRUCTION SERVICES,KLEINFELDER CONSTRUCTION SERVICES INC,54161,550,W,C,ST,SAN DIEGO,92101-3532,92101,17,0.294118,0.14,True,0.217059
252,WESTERN FINANCIAL CORP,WESTERN FINANCIAL CORP,52393,600,,B,ST,SAN DIEGO,92101-4508,92101,21,0.444444,0.33,True,0.387222


### Preview of activity step 1: clean addresses in each of the datasets

The previous example shows that the address can help adjudicate between candidate matches.

When we break into groups, you'll
    
- Paste together the address_no, address_pd, address_road, address_sfx fields in the SD active biz to create a field similar to `BorrowerAddress` in the PPP loan data 
- When doing so, make sure to pay attention to the following issues that might cause failures to clean / match:
    - NaN in inputs to the address fields in the San Diego data; if you paste the string literally, these will show up as NaN; better to convert to "" or whitespace
    - Capitalization: make sure to standardize the capitalization so that the addresses in both datasets are either all lowercase or all uppercase
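
As a starting point for that activity, here is one possible sketch of the pasting logic using only the standard library. `paste_address` and `is_missing` are hypothetical helper names, and dropping the trailing `.0` from the street number is an assumption based on how `address_no` appears in the head of the San Diego data.

```python
import math


def is_missing(value) -> bool:
    """True for None or a float NaN (how pandas represents missing values)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def paste_address(number, prefix, road, suffix) -> str:
    """Join the non-missing address parts into one uppercase string."""
    parts = []
    for value in (number, prefix, road, suffix):
        if is_missing(value):
            continue  # skip instead of pasting the literal text "nan"
        if isinstance(value, float) and value.is_integer():
            value = int(value)  # 550.0 -> 550
        parts.append(str(value).strip().upper())
    return " ".join(parts)


nan = float("nan")
print(paste_address(550.0, "W", "C", "ST"))       # 550 W C ST
print(paste_address(770.0, nan, "01ST", "AVE"))   # 770 01ST AVE
print(repr(paste_address(nan, nan, nan, nan)))    # ''
```

In the notebook, a function like this could be applied row-wise to the four address columns. Note that even after standardizing case, `550 W C ST` and `550 WEST C STREET` still differ, so the address comparison would itself need to be fuzzy.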


# 2. That was a lot of steps. How can we use a package to automate a bit?

Google "fuzzy matching" or "probabilistic record linkage" packages in Python

Here, we'll focus on 

- recordlinkage. Documentation: https://recordlinkage.readthedocs.io/en/latest/notebooks/link_two_dataframes.html


## 2.1 Clean potential join fields (here: focus on BorrowerName in PPP; dba_name in SD)

In [14]:
## clean name similarly to how we did before
ppp['bizname_4match'] = [re.sub(r"[^\w\s]", "", one_n) 
                         for one_n in ppp.BorrowerName]
ppp.loc[ppp.bizname_4match != ppp.BorrowerName,
       ['BorrowerName', 'bizname_4match']].head()

sd['bizname_4match'] = [re.sub(r"[^\w\s]", "", one_n) 
                        for one_n in sd.dba_name]
sd.loc[sd.bizname_4match != sd.dba_name,
       ['dba_name', 'bizname_4match']].head()

Unnamed: 0,BorrowerName,bizname_4match
2,"CERCA TROVA RESTAURANT GROUP HOLDINGS, INC.",CERCA TROVA RESTAURANT GROUP HOLDINGS INC
4,"THE KLEINFELDER GROUP, INC.",THE KLEINFELDER GROUP INC
5,"TRADEMARK CONSTRUCTION CO., INC.",TRADEMARK CONSTRUCTION CO INC
6,"EVANS HOTELS, LLC",EVANS HOTELS LLC
8,A.O. REED & CO,AO REED CO


Unnamed: 0,dba_name,bizname_4match
25,O'REILLY AUTO ENTERPRISES #2714,OREILLY AUTO ENTERPRISES 2714
33,RALLY'S HAMBURGERS,RALLYS HAMBURGERS
39,FILKEY & ASSOCIATES INC,FILKEY ASSOCIATES INC
42,BEST LIFE CHIROPRACTIC - DR GERALD PALMES PC,BEST LIFE CHIROPRACTIC DR GERALD PALMES PC
47,GOLDEN POPPY PRESCHOOL & INFANT CENTER,GOLDEN POPPY PRESCHOOL INFANT CENTER


## 2.2: for ease of use, standardize colnames for the fields we'll use

In this practice exercise, we'll use:

- Fuzzy match on business name
- Exact match on 6-digit zip code

We only need to standardize the name of the exact match field, but are here just standardizing all for ease of use

In [16]:
## define rename dictionary for sd and rename; rename returns a new
## dataframe (inplace = False), which we assign back to sd
newcols_sd = {'zip_6dig': 'zip_4match'}
sd = sd.rename(columns = newcols_sd, inplace = False)

## same for ppp data
newcols_ppp = {'BorrowerZip_6dig': 'zip_4match'}
ppp = ppp.rename(columns = newcols_ppp, inplace = False)


## 2.3: initialize the match object and tell it if anything to "block on" or exact match

Here, we're blocking on 6-digit zip
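
Conceptually, blocking restricts the candidate pairs to records that share a value of the blocking field, instead of comparing every San Diego record against every PPP record. A rough pure-Python sketch of that idea follows; `block_pairs` is a hypothetical helper written for illustration and is not how `recordlinkage` implements it internally. The toy indices and zips come from rows shown in this notebook, except for the one marked hypothetical.

```python
from collections import defaultdict


def block_pairs(left_zips: dict, right_zips: dict) -> list:
    """Return (left_index, right_index) pairs whose zip codes are equal."""
    # group the right-hand records by zip so each lookup is cheap
    right_by_zip = defaultdict(list)
    for right_idx, zip_code in right_zips.items():
        right_by_zip[zip_code].append(right_idx)
    # pair each left-hand record only with right-hand records in its zip
    return [(left_idx, right_idx)
            for left_idx, zip_code in left_zips.items()
            for right_idx in right_by_zip.get(zip_code, [])]


toy_sd = {0: 92101, 1: 92101, 242: 92124}   # index -> zip in the SD data
toy_ppp = {4: 92101, 1081: 92124, 2351: 92124,
           9999: 92037}                     # 9999 is a HYPOTHETICAL row

pairs = block_pairs(toy_sd, toy_ppp)
print(pairs)  # [(0, 4), (1, 4), (242, 1081), (242, 2351)]
print(len(pairs), "pairs instead of", len(toy_sd) * len(toy_ppp))
```

The payoff grows with the data: without blocking, 408 San Diego rows against 5,580 PPP rows would mean more than two million name comparisons.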

In [18]:
## initialize indexer
my_recordmatcher = recordlinkage.Index()
print(type(my_recordmatcher))

## tell it what to block on (skip if not blocking on anything)
my_recordmatcher.block("zip_4match")


<class 'recordlinkage.api.Index'>


<Index>

## 2.4: create candidate links based on that blocking variable

In [19]:
## then, feed the record matcher the two datasets (must have that blocking variable)
## this will create candidate_links that are exact matches on those
candidate_links_zip = my_recordmatcher.index(sd, ppp)
candidate_links_zip

print(type(candidate_links_zip))



MultiIndex([(  0,    4),
            (  0,    9),
            (  0,   13),
            (  0,   21),
            (  0,   30),
            (  0,   50),
            (  0,   61),
            (  0,   67),
            (  0,   80),
            (  0,   84),
            ...
            (242, 1081),
            (242, 2351),
            (242, 4180),
            (242, 4181),
            (242, 4879),
            (242, 4951),
            (242, 5401),
            (242, 5494),
            (242, 5496),
            (368, 5078)],
           length=75007)

<class 'pandas.core.indexes.multi.MultiIndex'>


In [21]:
## see that it's a MultiIndex of tuples: the first element in each tuple is the
## index in the first df we feed it; the second is the index in the second df

## print example of links
sd.loc[sd.index == 242,
        [col for col in sd.columns if "4match" in col]]
ppp.loc[ppp.index.isin([1081, 2351]),
        [col for col in ppp.columns if "4match" in col]]

Unnamed: 0,zip_4match,bizname_4match
242,92124,KARENS KRITTERS


Unnamed: 0,zip_4match,bizname_4match
1081,92124,NEWBREAK CHURCH
2351,92124,ARMARANA ENTERPRISES INCORPORATED


## 2.5- initialize Compare class and define fuzzy fields and threshold for each

Note in documentation about diff string compare methods:

This class is used to compare string values. The implemented algorithms are: 'jaro', 'jarowinkler', 'levenshtein', 'damerau_levenshtein', 'qgram' or 'cosine'. In case of agreement, the similarity is 1 and in case of complete disagreement it is 0. The Python Record Linkage Toolkit uses the jellyfish package for the Jaro, Jaro-Winkler, Levenshtein and Damerau-Levenshtein algorithms.

In [23]:
compare = recordlinkage.Compare()
print(type(compare))

thres_bizname = 0.65
compare.string('bizname_4match', 'bizname_4match', 
               method='jaro', threshold=thres_bizname)


<class 'recordlinkage.api.Compare'>


<Compare>

## 2.6- using the compare Class and the candidate links, compute comparisons

In [24]:
## use compare class to compute
## feed it (1) candidate links based on zip code blocking
## and string comparison, (2) raw datasets (order matters)
compare_vectors = compare.compute(candidate_links_zip, sd, ppp)
compare_vectors
print(type(compare_vectors))

## convert to a dataframe- the leftmost index is sd 
## since that's the first/left data; the right index is
## ppp since that's the second/right data; see that 
## most are non-matches
compare_vectors_df = pd.DataFrame(compare_vectors.reset_index())
compare_vectors_df.columns = ["index_sd", "index_ppp", "name_match"]
compare_vectors_df.sample(n = 5)


Unnamed: 0,Unnamed: 1,0
0,4,1.0
0,9,0.0
0,13,0.0
0,21,0.0
0,30,0.0
...,...,...
242,4951,0.0
242,5401,0.0
242,5494,0.0
242,5496,0.0


<class 'pandas.core.frame.DataFrame'>


Unnamed: 0,index_sd,index_ppp,name_match
53939,345,906,0.0
7338,245,750,0.0
46701,52,103,0.0
29932,185,3404,0.0
43322,225,4769,0.0


In [26]:

## get the ppp row index of the Kleinfelder group business
index_klein = ppp.index[ppp.bizname_4match == "THE KLEINFELDER GROUP INC"]

## using the dataframe version of compare, look for
## sd data indices of matches
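## parentheses matter: & binds more tightly than ==, so each
## condition needs to be wrapped before combining
poss_klein = compare_vectors_df[
                (compare_vectors_df.index_ppp.isin(index_klein)) &
                (compare_vectors_df.name_match == 1)]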
poss_klein

## print results
sd.loc[sd.index.isin(poss_klein.index_sd),
      ['bizname_4match', 'zip_4match']]


Unnamed: 0,index_sd,index_ppp,name_match
0,0,4,1.0
804,1,4,1.0


Unnamed: 0,bizname_4match,zip_4match
0,KLEINFELDER CONSTRUCTION SERVICES,92101
1,KLEINFELDER INC,92101


## 2.7. decide what counts as a true match

Three general approaches:

- Threshold based: look at the raw scores and determine what scores are above a threshold
- Unsupervised: something that clusters the pairs into "likely match" or "likely not match" but where we're not feeding it "labels" corresponding to true matches
- Supervised: we have some gold-standard label dataset that has an indicator for whether records are true matches; we train a model on those true matches and generalize to new cases

See here for many classifiers: https://recordlinkage.readthedocs.io/en/latest/ref-classifiers.html

Here, we're using an unsupervised approach: the k-means clustering algorithm

Another option is an EM-based classifier, initialized with `ecm = recordlinkage.ECMClassifier()`, but there is not enough data here to fit it
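
For comparison, the threshold-based approach from the list above is simple enough to sketch in a few lines of plain Python. The similarity scores here are illustrative only: they are 1 minus the Jaro-Winkler distances from section 1 for four San Diego names, and the 0.85 cutoff is an arbitrary assumption rather than a recommended value.

```python
# similarity to the focal PPP business (1 - distance from the earlier table)
similarity = {
    "KLEINFELDER INC": 0.94,
    "KLEINFELDER CONSTRUCTION SERVICES": 0.86,
    "KB ENTERPRISES LLC": 0.73,
    "CHAU TRAN": 0.27,
}

THRESHOLD = 0.85  # assumed cutoff; in practice chosen by inspecting the scores


def classify_by_threshold(scores: dict, threshold: float) -> list:
    """Keep the candidates whose similarity meets or exceeds the threshold."""
    return [name for name, score in scores.items() if score >= threshold]


likely_matches = classify_by_threshold(similarity, THRESHOLD)
print(likely_matches)
```

Raising the threshold trades false positives for false negatives, which is the same trade-off the unsupervised and supervised classifiers make implicitly.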



In [29]:
## initialize classifier
kmeans = recordlinkage.KMeansClassifier()
kmeans_results = kmeans.fit_predict(compare_vectors)
print(type(kmeans_results))
kmeans_results


<class 'pandas.core.indexes.multi.MultiIndex'>


MultiIndex([(  0,    4),
            (  0,  135),
            (  0,  218),
            (  0,  380),
            (  0,  413),
            (  0,  422),
            (  0,  477),
            (  0,  478),
            (  0,  497),
            (  0,  616),
            ...
            (371, 3679),
            (371, 3726),
            (371, 3962),
            (371, 4062),
            (371, 4470),
            (371, 4492),
            (371, 4493),
            (371, 5446),
            (371, 5489),
            (371, 5534)],
           length=3684)

## 2.8:  extract pairs using indices and summarize

In [30]:
## since sd was our left hand side data, they're 
## the first index in the tuple- extract
indices_sd = [x[0] for x in kmeans_results]

## since ppp loans were our right hand side data, they're
## the second index in the tuple - extract
indices_ppp = [x[1] for x in kmeans_results]

## create dataframe
df_matchpairs = pd.DataFrame({'sd_indices': indices_sd,
                'ppp_indices': indices_ppp})

df_matchpairs

## add indices as col to orig data
sd['index_4merge'] = sd.index
ppp['index_4merge'] = ppp.index

## then, join matches

### first, i'm joining the sd info
df_matchpairs_wsd = pd.merge(df_matchpairs,
                            sd,
                            how = "left",
                            left_on = "sd_indices",
                            right_on = "index_4merge")

## then, i'm joining the ppp info and adding a suffix to distinguish the vars
df_matchpairs_wboth = pd.merge(df_matchpairs_wsd,
                              ppp,
                              how = "left",
                              left_on = "ppp_indices",
                              right_on = "index_4merge",
                              suffixes= ["_sd", "_ppp"])

df_matchpairs_wboth[['bizname_4match_sd', 'bizname_4match_ppp',
                     'zip_4match_sd', 'zip_4match_ppp',
                     'BorrowerAddress'] + 
                   [col for col in sd.columns if 
                   "address" in col]].head()

## see some true and false positives; would want to use business address

Unnamed: 0,sd_indices,ppp_indices
0,0,4
1,0,135
2,0,218
3,0,380
4,0,413
...,...,...
3679,371,4492
3680,371,4493
3681,371,5446
3682,371,5489


Unnamed: 0,bizname_4match_sd,bizname_4match_ppp,zip_4match_sd,zip_4match_ppp,BorrowerAddress,address_no,address_pd,address_road,address_sfx,address_city,address_zip
0,KLEINFELDER CONSTRUCTION SERVICES,THE KLEINFELDER GROUP INC,92101,92101,550 West C Street,550,W,C,ST,SAN DIEGO,92101-3532
1,KLEINFELDER CONSTRUCTION SERVICES,KLINEDINST PC,92101,92101,"501 West Broadway, Suite 600",550,W,C,ST,SAN DIEGO,92101-3532
2,KLEINFELDER CONSTRUCTION SERVICES,SAN DIEGO CONVENTION CENTER CORPORATION,92101,92101,111 W Harbor Dr,550,W,C,ST,SAN DIEGO,92101-3532
3,KLEINFELDER CONSTRUCTION SERVICES,LATE MORNINGS INC,92101,92101,1909 India St,550,W,C,ST,SAN DIEGO,92101-3532
4,KLEINFELDER CONSTRUCTION SERVICES,RED DOOR INTERACTIVE INC,92101,92101,350 10th Ave Suite 100,550,W,C,ST,SAN DIEGO,92101-3532
