# Strategy in Finding a Business Location

To find the best business location, start by analyzing your target market, researching potential areas, and considering factors like accessibility, infrastructure, and zoning regulations.
Here's a more detailed strategy:  

##  1. Define Your Needs and Goals: 
- **Target Market:** Identify your ideal customer base and where they are located
    - For a definition, see [Investopedia's target market entry](https://www.investopedia.com/terms/t/target-market.asp#:~:text=Demographic%3A%20These%20are%20the%20main,in%20the%20era%20of%20globalization.)
- **Business Type:** Determine if your business requires 
    - high foot traffic
    - proximity to suppliers
    - access to specific infrastructure (e.g., highways, rail yards)
- **Budget:** Establish a realistic budget for 
    - rent
    - utilities
    - other location-related costs

### *Target Market Examples*

Each dictionary entry can include the following fields (not every profile has all of them):
- **Description**: A brief overview of the business type.
- **Demographics**: Key demographic information relevant to the target market.
- **Location**: Geographical considerations, if applicable.
- **Interests**: Common interests or behaviors of the target audience.
- **Platforms**: Recommended platforms for reaching the target audience.
- **Source**: The credible source from which the information was derived.
- **URL**: A direct link to the source for further reading.

These detailed profiles can assist small businesses in tailoring their marketing strategies to effectively reach their desired audiences.

In [None]:
target_markets_examples = {
    "home_services": {
        "description": "Businesses offering home maintenance and improvement services.",
        "demographics": {
            "age_range": "35-65",
            "gender": "50% women, 50% men",
            "homeownership": "86% homeowners",
            "median_income": 108000
        },
        "location": "Suburban areas",
        "interests": ["home improvement", "DIY projects", "family activities"],
        "platforms": ["Google Ads", "Facebook", "Instagram"],
        "source": "LocaliQ",
        "url": "https://localiq.com/industries/home-services/"
    },
    "healthcare": {
        "description": "Medical professionals and clinics providing healthcare services.",
        "demographics": {
            "age_range": "25-65",
            "gender": "56% women, 44% men",
            "household_income": 91000,
            "has_children": True
        },
        "platforms": ["Facebook", "Instagram"],
        "interests": ["quality healthcare", "family dental care"],
        "source": "LocaliQ",
        "url": "https://localiq.com/industries/healthcare/"
    },
    "luxury_electronics": {
        "description": "Retailers specializing in high-end electronics and gadgets.",
        "demographics": {
            "age_range": "16-35",
            "income_level": "High disposable income"
        },
        "platforms": ["Instagram", "TikTok"],
        "interests": ["technology", "social media", "fashion"],
        "source": "Indeed",
        "url": "https://www.indeed.com/career-advice/career-development/target-market-example"
    },
    "health_and_wellness": {
        "description": "Stores offering natural health products and wellness items.",
        "demographics": {
            "age_range": "25-55",
            "gender": "Predominantly women",
            "education": "Well-educated"
        },
        "interests": ["healthy lifestyle", "natural products", "family well-being"],
        "source": "Indeed",
        "url": "https://www.indeed.com/career-advice/career-development/target-market-example"
    },
    "organic_farm_shop": {
        "description": "Farm shops selling organic produce and sustainable products.",
        "demographics": {
            "age_range": "35-55",
            "income_level": "Medium to high"
        },
        "family_status": "Families",
        "location": "Rural areas near towns",
        "interests": ["organic food", "sustainable living", "home delivery services"],
        "source": "Indeed",
        "url": "https://www.indeed.com/career-advice/career-development/target-market-example"
    },
    "cycling_cafe": {
        "description": "Cafés catering to cyclists with quick meals and refreshments.",
        "demographics": {
            "age_range": "20-55",
            "income_level": "Moderate"
        },
        "interests": ["cycling", "outdoor activities", "quick meals"],
        "location": "Near popular cycling routes",
        "source": "Indeed",
        "url": "https://www.indeed.com/career-advice/career-development/target-market-example"
    },
    "hiking_supply_shop": {
        "description": "Stores selling hiking and running gear and accessories.",
        "demographics": {
            "age_range": "18-30",
            "gender": "Both men and women"
        },
        "interests": ["hiking", "running", "outdoor adventures"],
        "values": ["Safety", "Quality gear", "Style"],
        "platforms": ["Instagram", "Facebook"],
        "source": "Indeed",
        "url": "https://www.indeed.com/career-advice/career-development/target-market-example"
    },
    "financial_advisory": {
        "description": "Services offering financial planning and advisory.",
        "demographics": {
            "age_range": "35-50",
            "income_level": "Medium to high"
        },
        "interests": ["investment planning", "retirement strategies", "financial security"],
        "source": "Investopedia",
        "url": "https://www.investopedia.com/articles/financialcareers/07/idea-clients.asp"
    }
}
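
Because the profiles above do not all carry the same keys, any lookup has to tolerate missing fields. The sketch below shows one way to query such a dictionary. The two-entry `sample_markets` dictionary and both helper functions are hypothetical, introduced only for illustration.

```python
from typing import Any, Dict, List

# Hypothetical sample shaped like `target_markets_examples`, trimmed to two entries.
sample_markets: Dict[str, Dict[str, Any]] = {
    "home_services": {
        "location": "Suburban areas",
        "platforms": ["Google Ads", "Facebook", "Instagram"],
    },
    "financial_advisory": {
        # No "location" or "platforms" key, like several profiles above.
        "interests": ["investment planning"],
    },
}


def markets_on_platform(markets: Dict[str, Dict[str, Any]], platform: str) -> List[str]:
    """Return the market keys whose profile lists `platform`."""
    return [
        name
        for name, profile in markets.items()
        # .get() with a default avoids a KeyError when the key is absent.
        if platform in profile.get("platforms", [])
    ]


def missing_fields(markets, required=("location", "platforms")):
    """Map each market key to the fields from `required` that its profile lacks."""
    return {
        name: [field for field in required if field not in profile]
        for name, profile in markets.items()
    }


print(markets_on_platform(sample_markets, "Facebook"))
print(missing_fields(sample_markets))
```

The same two helpers should work unchanged on the full `target_markets_examples` dictionary, since they only rely on top-level keys.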


## 2. Research Potential Locations:
- **Demographics:** Research the demographics of potential areas to ensure they align with your target market
- **Competition:** Analyze the competitive landscape in each area to understand the level of competition
- **Infrastructure:** Assess the availability of essential infrastructure, such as 
    - transportation
    - utilities
    - internet access
- **Zoning Laws:** Understand local zoning regulations and restrictions to ensure compliance
- **Local Resources:** Explore local government census data or tools like WIGeoLocation for further research  
- **Accessibility:** Evaluate the accessibility of the location for both customers and employees
- **Traffic:** Consider traffic patterns and parking availability, especially if foot traffic is important 

### Market Analysis
Example: Laundromat in Vandergrift, PA

- **Demographic Analysis:** Use local census data to understand the demographics of your area. 
    - Look for age distributions, income levels, and household sizes that match the typical profile of your customer (in this example, a laundromat user).
- **Psychographic Information:** Go beyond basic demographics to explore customer lifestyles, values, and habits. 

For instance, are there large groups of environmentally conscious consumers in your area who might appreciate eco-friendly laundry solutions?

In [None]:
# Import modules
import matplotlib.pyplot as plt
import pandas as pd
from census import Census
from us import states
import requests
from typing import Dict, Optional
import json
from pathlib import Path
import zipcodes
import addfips
from sklearn.preprocessing import MinMaxScaler


# Created modules
from ipython_config import CENSUS_KEY


In [None]:
# NAICS code for Coin-Operated Laundries and Drycleaners (used for all years)
NAICS = '812310'

# Vandergrift, PA
ZIPCODE = '15690'

state = states.PA
STATEFIPS = state.fips
STATENAME = state.name

result = zipcodes.matching(ZIPCODE)[0]
COUNTYNAME = result['county']

af = addfips.AddFIPS()

# Get FIPS code for a single county
COUNTYFIP = af.get_county_fips(county=COUNTYNAME, state=STATENAME)
COUNTYCODE = COUNTYFIP[2:]

print(f"""
NAICS: {NAICS}
ZIPCODE: {ZIPCODE}
STATEFIPS: {STATEFIPS}
STATENAME: {STATENAME}
COUNTYNAME: {COUNTYNAME} 
COUNTYFIP: {COUNTYFIP}  
COUNTYCODE: {COUNTYCODE}
""")

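The cell above slices `COUNTYFIP[2:]` because a county FIPS code is five digits: a two-digit state code followed by a three-digit county code. A minimal sketch of that split with basic validation follows. The helper name `split_county_fips` is hypothetical, and `42129` is the code for Westmoreland County, PA.

```python
from typing import Tuple


def split_county_fips(county_fips: str) -> Tuple[str, str]:
    """Split a 5-digit county FIPS code into (state_code, county_code)."""
    if len(county_fips) != 5 or not county_fips.isdigit():
        raise ValueError(f"expected a 5-digit FIPS code, got {county_fips!r}")
    # First two digits identify the state, the last three the county.
    return county_fips[:2], county_fips[2:]


state_part, county_part = split_county_fips("42129")
print(state_part, county_part)
```

Keeping the codes as strings matters here, since converting to integers would drop leading zeros (for example, state code `01`).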
## Locate & Analyze Customers and Market with **Census Business Builder**
- [Video How-To](https://www.census.gov/data/academy/data-gems/2023/locate-analyze-customers-market-with-cbb.html)
- [Census Business Builder](https://cbb.census.gov/cbb/)

In [None]:
for key, value in result.items():
    print(f"{key.replace('_', '').upper()}: {value}")

## APIs For Small Business Statistics
- Nonemployer Statistics
- County Business Patterns
- Economic Census

## Annual Business Survey (ABS) Data

In [None]:
# Nonemployer Statistics

nonemp_params = {
    'variables' : "",
    'geography': f"county:{COUNTYCODE}&in=state:{STATEFIPS}",
    'api_key': CENSUS_KEY,
    'dataset_base': 'nonemp',
    'year': ''
}

In [None]:
# County Business Patterns

cbp_params = {
    'variables' : "",
    'geography': f"county:{COUNTYCODE}&in=state:{STATEFIPS}",
    'api_key': CENSUS_KEY,
    'dataset_base': 'cbp',
    'year': ''
}


In [None]:

# Economic Census

ecnbasic_params = {
    'variables' : "",
    'geography': f"county:{COUNTYCODE}&in=state:{STATEFIPS}",
    'api_key': CENSUS_KEY,
    'dataset_base': 'ecnbasic',
    'year': ''
}

In [None]:
# ACS 1-Year Estimates

acs1_params = {
    'variables' : "",
    'geography': f"county:{COUNTYCODE}&in=state:{STATEFIPS}",
    'api_key': CENSUS_KEY,
    'dataset_base': 'acs/acs1',
    'year': ''
}

In [None]:
# ACS 5-Year Estimates

acs5_params = {
    'variables' : "",
    'geography': f"tract:*&in=state:{STATEFIPS}&in=county:{COUNTYCODE}",
    'api_key': CENSUS_KEY,
    'dataset_base': 'acs/acs5',
    'year': ''
}
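
Each parameter dictionary above maps onto one Census API request URL, following the same pattern the notebook's `get_data` function uses. The sketch below shows that mapping in isolation, without sending a request. The function name `build_census_url` and the example year and variables are assumptions chosen for illustration.

```python
def build_census_url(year, dataset_base, variables, geography):
    """Assemble a Census API URL from the fields of a parameter dictionary."""
    base_url = "https://api.census.gov/data"
    # `geography` already carries the `&in=` clauses, so it is appended as-is.
    return f"{base_url}/{year}/{dataset_base}?get={variables}&for={geography}"


# Hypothetical request: total population for every tract in one county.
url = build_census_url(
    year=2022,
    dataset_base="acs/acs5",
    variables="NAME,B01003_001E",
    geography="tract:*&in=state:42&in=county:129",
)
print(url)
```

Printing the URL before attaching the API key, as `get_data` does, keeps the key out of notebook output.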

In [None]:
import pandas as pd
from IPython.display import display


def get_data(abs_params):
    """
    Get data from the Census Bureau API.

    Parameters:
    - abs_params: dictionary with the keys
        - 'year': survey year
        - 'dataset_base': dataset path, e.g. 'acs/acs5'
        - 'variables': comma-separated string of variable names
        - 'geography': value of the `for=` clause, including any `&in=` parts
        - 'api_key': your Census Bureau API key (may be empty)
    """
    base_url = "https://api.census.gov/data"
    year = abs_params['year']
    dataset = abs_params['dataset_base']
    variables = abs_params['variables']
    geography = abs_params['geography']
    api_key = abs_params['api_key']

    # Construct the full URL (printed before the key is appended so the key stays private)
    url = f"{base_url}/{year}/{dataset}?get={variables}&for={geography}"
    print(url)

    if api_key:
        url += f"&key={api_key}"

    response = requests.get(url)

    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"Failed to retrieve data for year {year}. Status code: {response.status_code}")

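def get_all_business_code_table(df, naics_code, naics_colum):
    """Collect the rows for `naics_code` and each of its parent NAICS levels."""
    full_list = []
    # NAICS codes nest by prefix: 2-digit sector down to the full code itself
    for i in range(2, len(naics_code) + 1):

        naics_temp = naics_code[:i]
        naics_table = df[(df[naics_colum] == naics_temp)]

        if naics_table.shape[0]:
            full_list.append(naics_table)

    # pd.concat raises ValueError on an empty list, so return an empty frame instead
    if not full_list:
        return df.iloc[0:0]
    return pd.concat(full_list)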
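
`get_all_business_code_table` relies on the fact that NAICS codes are hierarchical: each shorter prefix of a six-digit code names a broader industry group. The sketch below lists those prefixes for the laundromat code used in this notebook. The helper name `naics_prefixes` is hypothetical.

```python
from typing import List


def naics_prefixes(naics_code: str, min_len: int = 2) -> List[str]:
    """Return every NAICS level from the 2-digit sector down to the full code."""
    return [naics_code[:i] for i in range(min_len, len(naics_code) + 1)]


print(naics_prefixes("812310"))
# ['81', '812', '8123', '81231', '812310']
```

Matching against all of these levels is useful because county-level datasets sometimes publish figures only for the broader groups.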

In [None]:
# nonemp_all_years = []
# for i in range(2017, 2026):

#     print(f'begin year {i}')

#     naics_year = 2022 if i >= 2022 else 2017

#     nonemp_params['year'] = i
#     nonemp_params['variables'] = f'NAME,COUNTY,NAICS{naics_year},NAICS{naics_year}_LABEL,NESTAB,NRCPTOT,NRCPTOT_N,RCPSZES,YEAR'

#     try:
#         nonemp = pd.DataFrame(get_data(nonemp_params))
#         nonemp.columns = nonemp.iloc[0]
#         nonemp = nonemp.iloc[1:]

#         naics_nonemp = get_all_business_code_table(nonemp, NAICS, f'NAICS{naics_year}')
        
#         nonemp_all_years.append(naics_nonemp)
        
#     except:
#         pass

#     print(f'finished year {i}')

In [None]:
# nonemp_all_years = pd.concat(nonemp_all_years)
# display(nonemp_all_years)

In [None]:
# cbp_all_years = []
# for i in range(2017, 2026):

#     print(f'begin year {i}')

#     naics_year = 2022 if i > 2022 else 2017

#     cbp_params['year'] = i
#     cbp_params['variables'] = f"NAME,CBSA,CD,COUNTY,CSA,EMP,EMP_N,EMPSZES_LABEL,ESTAB,NAICS{naics_year},NAICS{naics_year}_LABEL,PAYANN,PAYANN_N,PAYQTR1,PAYQTR1_N,YEAR"

#     try:
#         cbp = pd.DataFrame(get_data(cbp_params))
#         cbp.columns = cbp.iloc[0]
#         cbp = cbp.iloc[1:]

#         naics_cbp = get_all_business_code_table(cbp, NAICS, f'NAICS{naics_year}')
        
#         cbp_all_years.append(naics_cbp)
#     except:
#         pass
    
#     print(f'finished year {i}')

In [None]:
# cbp_all_years = pd.concat(cbp_all_years)
# display(cbp_all_years)

In [None]:
# ecnbasic_all_years = []
# for i in range(2017, 2026):

#     print(f'begin year {i}')

#     naics_year = 2022 if i >= 2022 else 2017

#     ecnbasic_params['year'] = i
#     ecnbasic_params['variables'] = f"NAME,CBSA,CD,COUNTY,CSA,EMP,EMP_N,EMPSZES_LABEL,ESTAB,NAICS{naics_year},NAICS{naics_year}_LABEL,PAYANN,PAYANN_N,PAYQTR1,PAYQTR1_N,YEAR,ZIPCODE"
#     try:
#         ecnbasic = pd.DataFrame(get_data(ecnbasic_params))
#         ecnbasic.columns = ecnbasic.iloc[0]
#         ecnbasic = ecnbasic.iloc[1:]

#         naics_ecnbasic = get_all_business_code_table(ecnbasic, NAICS, f'NAICS{naics_year}')
        
#         ecnbasic_all_years.append(naics_ecnbasic)
#     except:
#         pass

#     print(f'finished year {i}')

In [None]:
# ecnbasic_all_years = pd.concat(ecnbasic_all_years)
# display(ecnbasic_all_years)

In [None]:
# acs1_all_years = []
# for i in range(2017, 2026):

#     print(f'begin year {i}')

#     acs1_params['year'] = i
#     acs1_params['variables'] = "NAME,B19013_001E" # NAME = geography name, B19013_001E = median household income
#     try:
#         acs1 = pd.DataFrame(get_data(acs1_params))
#         acs1.columns = acs1.iloc[0]
#         acs1 = acs1.iloc[1:]     
#         acs1['year'] = i   
#         acs1_all_years.append(acs1)
#     except:
#         pass

#     print(f'finished year {i}')

# acs1_all_years = pd.concat(acs1_all_years)

In [None]:
# acs1_all_years.rename(columns={
#     'B19013_001E': 'median household income'
# }, inplace=True)

# acs1_all_years["median household income"] = pd.to_numeric(acs1_all_years["median household income"])

# # ESTIMATE: Use a flat 20% tax rate (can vary, ideally you'd have better tax data)
# acs1_all_years["Estimated_Taxes"] = acs1_all_years["median household income"] * 0.20
# acs1_all_years["Disposable_Income"] = acs1_all_years["B19013_001E"] - acs1_all_years["Estimated_Taxes"]

# display(acs1_all_years)

In [None]:
acs5_all_years = []
for i in range(2017, 2026):

    print(f'begin year {i}')
    
    acs5_params['year'] = i
    acs5_params['variables'] = "NAME,B01003_001E,B19013_001E,B01002_001E,B15003_022E,B15003_023E,B15003_024E,B15003_025E,B01001_011E,B01001_012E,B01001_035E,B01001_036E,B19083_001E,C02003_008E,C02003_004E,C02003_003E,C02003_007E,C02003_006E,C02003_005E,C17002_002E,C17002_003E"
    try:
        acs5 = pd.DataFrame(get_data(acs5_params))
        acs5.columns = acs5.iloc[0]
        acs5 = acs5.iloc[1:]
        acs5['year'] = i
        acs5_all_years.append(acs5)
    except Exception as err:
        # Years that are not published yet (or any failed request) are skipped
        print(f'skipping year {i}: {err}')

    print(f'finished year {i}')

# ignore_index gives every row a unique label across years
acs5_all_years = pd.concat(acs5_all_years, ignore_index=True)
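
The loop above depends on the shape of a Census API response: a list of rows in which the first row is the header and every value arrives as a string. The sketch below reproduces that handling on a hand-written response so it runs without a network call. The tract names and counts in `sample_response` are made up, and `response_to_frame` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical response in the Census API layout: header row first, all values as strings.
sample_response = [
    ["NAME", "B01003_001E", "state", "county", "tract"],
    ["Example Tract 1", "3500", "42", "129", "000100"],
    ["Example Tract 2", "2100", "42", "129", "000200"],
]


def response_to_frame(rows, year):
    """Turn one API response into a DataFrame tagged with its survey year."""
    header, *records = rows
    frame = pd.DataFrame(records, columns=header)
    frame["year"] = year
    return frame


frames = [response_to_frame(sample_response, year) for year in (2021, 2022)]

# ignore_index=True gives each row a unique label across years.
combined = pd.concat(frames, ignore_index=True)

# Values arrive as strings, so numeric columns need an explicit conversion.
combined["B01003_001E"] = pd.to_numeric(combined["B01003_001E"])
print(combined)
```

Building the frame from `records` with `columns=header` avoids the separate steps of promoting the first row and slicing it off.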

In [None]:
# import requests

# def download_census_variables(url):
#     try:
#         # Send GET request
#         response = requests.get(url)
        
#         # Check if the request was successful
#         response.raise_for_status()
        
#         # Save the JSON content to a file
#         with open('data/acs5_variables.json', 'wb') as file:
#             file.write(response.content)

#         print("Successfully downloaded acs5_variables.json")
        
#     except requests.exceptions.HTTPError as http_err:
#         print(f'HTTP Error occurred: {http_err}')
#     except Exception as err:
#         print(f'Other error occurred: {err}')

# # Download the variables file
# url = "https://api.census.gov/data/2023/acs/acs5/variables.json"
# download_census_variables(url)

In [None]:
import json

# Open and read the JSON file
with open('data/acs5_variables.json', 'r') as file:
    data = json.load(file)

# Build a DataFrame with one row per ACS variable
acs5_variable = pd.DataFrame(data['variables']).T.reset_index()

In [None]:
acs5_variable.head(10)

In [None]:
poverty_vars = ['C17002_002E', 'C17002_003E', 'B01003_001E']
for _, series in acs5_variable[acs5_variable['index'].isin(poverty_vars)][['index', 'label', 'concept']].iterrows():
    print(f"'{series['index']}': '{series['label']} {series['concept']}',")

In [None]:
# Show the shape of the dataframe
print('Shape: ', acs5_all_years.shape)

# Check column data types for census data
# Source: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dtypes.html
print("Column data types for census data:\n{}".format(acs5_all_years.dtypes))

# The API returns every value as a string, so cast the poverty inputs to float
poverty_cols = ['C17002_002E', 'C17002_003E', 'B01003_001E']
acs5_all_years[poverty_cols] = acs5_all_years[poverty_cols].astype(float)

# Get poverty rate and store values in new column
# (C17002_002E and C17002_003E count people with income below the poverty level)
acs5_all_years["Poverty_Rate"] = (acs5_all_years["C17002_002E"] + acs5_all_years["C17002_003E"]) / acs5_all_years["B01003_001E"] * 100

# Show dataframe
display(acs5_all_years[['NAME', 'Poverty_Rate']])
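
The poverty-rate calculation above can be checked on a tiny table. The sketch below uses two made-up tracts and also guards against tracts with zero population, which would otherwise produce a division by zero. The guard is an addition for illustration, not part of the cell above.

```python
import pandas as pd

# Hypothetical tracts; values are strings, as they would be in an API response.
toy = pd.DataFrame({
    "NAME": ["Tract A", "Tract B"],
    "C17002_002E": ["100", "50"],   # income-to-poverty ratio under 0.50
    "C17002_003E": ["150", "30"],   # income-to-poverty ratio 0.50 to 0.99
    "B01003_001E": ["2000", "0"],   # total population
})

cols = ["C17002_002E", "C17002_003E", "B01003_001E"]
toy[cols] = toy[cols].apply(pd.to_numeric, errors="coerce")

# Replace a zero population with NaN so the rate is missing rather than infinite.
population = toy["B01003_001E"].where(toy["B01003_001E"] > 0)
toy["Poverty_Rate"] = (toy["C17002_002E"] + toy["C17002_003E"]) / population * 100

print(toy[["NAME", "Poverty_Rate"]])
```

Tract A comes out at 250 / 2000 = 12.5 percent, while Tract B is left as missing.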

In [None]:
# for row in acs5_variable[acs5_variable['label'].str.contains('one race')][['index','label']].iterrows():
#     id, series = row
#     print(f"'{series['index']}': '{series['label']}',")

In [None]:
# Rename columns for clarity
acs5_all_years.rename(columns={
    'NAME': 'County',
    'B01003_001E': 'Population',
    'B19013_001E': 'Median_Income',
    'B01002_001E': 'Median_Age',
    'B15003_022E': 'Bachelors',
    'B15003_023E': 'Masters',
    'B15003_024E': 'Professional',
    'B15003_025E': 'Doctorate',
    'B01001_011E': 'M_25_29', 
    'B01001_012E': 'M_30_34',
    'B01001_035E': 'F_25_29', 
    'B01001_036E': 'F_30_34',
    'B19083_001E': 'Gini_Index',
    'C02003_008E': 'Other',
    'C02003_004E': 'Black',
    'C02003_003E': 'White',
    'C02003_007E': 'Native_Hawaiian_and_Other_Pacific_Islander',
    'C02003_006E': 'Asian',
    'C02003_005E': 'American_Indian_and_Alaska_Native',
}, inplace=True)

# Convert numerical columns
acs5_all_years['Population'] = pd.to_numeric(acs5_all_years['Population'])
acs5_all_years['Median_Income'] = pd.to_numeric(acs5_all_years['Median_Income'])
acs5_all_years['Gini_Index'] = pd.to_numeric(acs5_all_years['Gini_Index'])
acs5_all_years['Edu_High'] = acs5_all_years['Bachelors'].astype(int) + acs5_all_years['Masters'].astype(int) + acs5_all_years['Professional'].astype(int) + acs5_all_years['Doctorate'].astype(int)
acs5_all_years['Edu_Rate'] = (acs5_all_years['Edu_High'] / acs5_all_years['Population']) * 100

acs5_all_years['Median_Age'] = pd.to_numeric(acs5_all_years['Median_Age'])

# Example: % of people aged 25-34
acs5_all_years['Age_25_34'] = acs5_all_years['M_25_29'].astype(int) + acs5_all_years['M_30_34'].astype(int) + acs5_all_years['F_25_29'].astype(int) + acs5_all_years['F_30_34'].astype(int)
acs5_all_years['Age_25_34_Pct'] = (acs5_all_years['Age_25_34'] / acs5_all_years['Population']) * 100

# Simpson's Diversity Index example:
ethnic_cols = ['Other', 'Black', 'Native_Hawaiian_and_Other_Pacific_Islander', 'Asian', 'White', 'American_Indian_and_Alaska_Native']
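# The race counts arrive as strings, so convert them before dividing
acs5_all_years[ethnic_cols] = acs5_all_years[ethnic_cols].apply(pd.to_numeric)
ethnic_shares = acs5_all_years[ethnic_cols].div(acs5_all_years['Population'], axis=0)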
acs5_all_years['Diversity_Index'] = 1 - (ethnic_shares ** 2).sum(axis=1)

# Normalize scores
scaler = MinMaxScaler()
features = ['Population', 'Median_Income', 'Edu_Rate', 'Diversity_Index', 'Age_25_34_Pct']
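# Reuse the original index so the scaled rows line up with acs5_all_years
acs5_all_years_scaled = pd.DataFrame(scaler.fit_transform(acs5_all_years[features]), columns=features, index=acs5_all_years.index)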

# Weighted score
acs5_all_years['Score'] = (0.25 * acs5_all_years_scaled['Population'] +
               0.25 * acs5_all_years_scaled['Median_Income'] +
               0.20 * acs5_all_years_scaled['Edu_Rate'] +
               0.15 * acs5_all_years_scaled['Diversity_Index'] +
               0.15 * acs5_all_years_scaled['Age_25_34_Pct'])

# Sort by Median Income (for example)
top_counties = acs5_all_years[acs5_all_years['year'] == 2023].sort_values(by='Median_Income', ascending=False).head(10)
display(top_counties[['County', 'Population', 'Median_Income', 'Gini_Index', 'Edu_High', 'Edu_Rate', 'Median_Age', 'Age_25_34_Pct', 'Score', 'year']])
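
The scoring cell combines two ideas: Simpson's Diversity Index (one minus the sum of squared group shares) and a weighted sum of min-max scaled features. The sketch below works both through on small made-up numbers, using plain pandas arithmetic in place of `MinMaxScaler`. The tract names, values, and equal weights are assumptions for illustration.

```python
import pandas as pd


def simpson_diversity(counts):
    """Return 1 - sum(share^2) for a list of group counts."""
    total = sum(counts)
    return 1 - sum((count / total) ** 2 for count in counts)


print(simpson_diversity([50, 50]))   # two equal groups
print(simpson_diversity([100, 0]))   # a single group

# Hypothetical tracts with two features to score.
toy = pd.DataFrame(
    {"Population": [1000, 3000, 5000], "Median_Income": [40000, 60000, 45000]},
    index=["Tract A", "Tract B", "Tract C"],
)
weights = pd.Series({"Population": 0.5, "Median_Income": 0.5})

# Min-max scaling: (x - min) / (max - min), applied column by column.
scaled = (toy - toy.min()) / (toy.max() - toy.min())

# Multiply each column by its weight, then add across columns.
toy["Score"] = scaled.mul(weights, axis="columns").sum(axis=1)

print(toy.sort_values("Score", ascending=False))
```

Because `scaled` keeps the index of `toy`, the scores line up with the right rows when they are assigned back.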


In [None]:
# Population growth = (New - Old) / Old * 100
# acs5_all_years['Pop_Growth_Rate'] = ((acs5_all_years['Population_2021'] - acs5_all_years['Population_2010']) / acs5_all_years['Population_2010']) * 100

In [None]:
# Helps identify competition or opportunity.
# acs5_all_years['Business_Density'] = acs5_all_years['Number_of_Businesses'] / acs5_all_years['Population'] * 1000

In [None]:
# acs5_all_years['Unemployment_Rate'] = (acs5_all_years['Unemployed'] / acs5_all_years['Labor_Force']) * 100
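
The three commented-out formulas above (population growth, business density, and unemployment rate) reference columns that this notebook does not fetch. The sketch below expresses the same formulas as small functions and applies them to made-up numbers. The function names and inputs are hypothetical.

```python
def growth_rate(old: float, new: float) -> float:
    """Percentage change from `old` to `new`: (new - old) / old * 100."""
    return (new - old) / old * 100


def business_density(businesses: float, population: float) -> float:
    """Businesses per 1,000 residents."""
    return businesses / population * 1000


def unemployment_rate(unemployed: float, labor_force: float) -> float:
    """Unemployed people as a percentage of the labor force."""
    return unemployed / labor_force * 100


# Hypothetical area: population grew from 10,000 to 11,000.
print(growth_rate(10_000, 11_000))
print(business_density(25, 10_000))
print(unemployment_rate(300, 6_000))
```

A low business density for a given industry can signal either an opening or weak demand, so it is best read alongside the demographic figures.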

### Optional Next Steps:
- Use geopandas to map this data geographically.
- Add business data using sources like Yelp API or Google Places API for competitor analysis.
- Use clustering (sklearn) to group similar counties/tracts.

## 3. Evaluate and Compare:  
- **Visit Potential Locations:** Conduct site visits to assess the suitability of each location firsthand
- **Cost Analysis:** Compare the costs of different locations, including
    - rent
    - utilities
    - taxes
- **Pros and Cons:** Create a list of pros and cons for each potential location to help make a decision
- **Long-Term Growth Potential:** Consider the long-term growth potential of the area and its ability to support your business  
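
The cost comparison above can be sketched as a small calculation that puts monthly and annual figures on the same yearly basis. The `LocationCost` class, the location names, and all dollar amounts below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LocationCost:
    name: str
    monthly_rent: float
    monthly_utilities: float
    annual_taxes: float

    def annual_total(self) -> float:
        """Yearly cost: twelve months of rent and utilities plus annual taxes."""
        return 12 * (self.monthly_rent + self.monthly_utilities) + self.annual_taxes


# Hypothetical candidate locations.
candidates = [
    LocationCost("Main Street storefront", 2500, 400, 3000),
    LocationCost("Strip mall unit", 1800, 550, 2400),
]

for location in candidates:
    print(f"{location.name}: {location.annual_total():,.0f} per year")

cheapest = min(candidates, key=LocationCost.annual_total)
print("Lowest annual cost:", cheapest.name)
```

Cost is only one input, so a location with a higher annual total may still win once the pros and cons and growth potential are weighed.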

## 4. Consider Legal and Institutional Factors: 
- **Business Licenses:** Research the necessary business licenses and permits required for your business type and location. 
- **Local Regulations:** Understand any local regulations or restrictions that may affect your business operations.
- **Government Incentives:** Explore any local or state government incentives or programs that may be available for businesses in specific areas.