# Step 1: Explore the Homeless Services Data

Goal: Understand what data we have before building the search system.

In [1]:
import json
import pandas as pd

# Load the data
with open('../data/homeless_services_hackathon.json', 'r') as f:
    services = json.load(f)

print(f"Total services: {len(services)}")

Total services: 1719


In [2]:
# Look at one service record
services[0]

{'url': 'https://211.my.site.com/a1jDo0000029VC5IAM',
 'organization': 'Urban Street Angels',
 'service_name': 'Interim Shelter Bed Program (ISB)',
 'address': '1404 5TH AVE\nSAN DIEGO, CA\xa092101',
 'has_location': True,
 'ada_accessible': 'ADA Accessible',
 'main_phone': '(619) 415-6616',
 'website': 'http://www.urbanstreetangels.org',
 'email': 'info@urbanstreetangels.org',
 'latitude': '32.7200729',
 'longitude': '-117.1605922',
 'intake_hours_of_operation': '24 hours a day, 7 days a week',
 'specific_hours': '',
 'intake_procedure': 'Referral required',
 'documents_required': 'No Documents Required',
 'intake_notes': 'Must be referred through the Coordinated Entry System (CES); self-referrals are not accepted.',
 'description': 'Offers an interim shelter program that provides shelter for homeless youth who need a bed and food in a trauma-informed environment. \n\nOffers the following once enlisted: \n•\tFood\n•\tHygiene supplies\n•\tClothing \n•\tTransitional housing\n•\tJob sear

In [3]:
# Convert to DataFrame for easier exploration
df = pd.DataFrame(services)
df.head()

Unnamed: 0,url,organization,service_name,address,has_location,ada_accessible,main_phone,website,email,latitude,...,insurance_plans_accepted,service_rules_and_guidelines,capacity_limitations,wait_list,helpful_tips,areas_of_focus,target_populations,unique_keywords,category,types
0,https://211.my.site.com/a1jDo0000029VC5IAM,Urban Street Angels,Interim Shelter Bed Program (ISB),"1404 5TH AVE\nSAN DIEGO, CA 92101",True,ADA Accessible,(619) 415-6616,http://www.urbanstreetangels.org,info@urbanstreetangels.org,32.7200729,...,,,52 beds,Waitlist varies based in daily occupancy,,"[Youth Shelters, Personal, Grooming Supplies, ...",Homeless Youth/Runaway/Youth Shelter Residents/,"[homeless, who, a, urban, therapy, an, placeme...",[Housing and Shelter],"[Case Management & Coordination, TAY Services,..."
1,https://211.my.site.com/a1jRl00000D0Y5RIAV,"Youth and Family Services, YMCA of San Diego C...","TAY Overnight Lodging, Escondido","150 LA TERRAZA BLVD\nESCONDIDO, CA 92025",True,ADA Accessible,(760) 908-9373,https://www.ymcasd.org/community-support/ymca-...,taysupports@ymcasd.org,33.1165369,...,,,5 beds,,,"[Youth Shelters, Runaway, Youth Shelters, Publ...",Transition Age Youth/Young Adults/Homeless Wom...,"[runaway, a, services,, resources., ymca, show...",[Housing and Shelter],"[Home Accessibility & Improvement, Case Manage..."
2,https://211.my.site.com/a1jRl00000D0bzNIAR,"Youth and Family Services, YMCA of San Diego C...","TAY Overnight Lodging, Oceanside","205 BARNES ST\nOCEANSIDE, CA 92054",True,ADA Accessible,(760) 908-9647,https://www.ymcasd.org/community-support/ymca-...,taysupports@ymcasd.org,33.2004425,...,,,5 beds,,,"[Runaway, Youth Shelters, Youth Shelters, Publ...",Transition Age Youth/Homeless Youth/Young Adul...,"[runaway, a, services,, resources., ymca, show...",[Housing and Shelter],"[Home Accessibility & Improvement, Case Manage..."
3,https://211.my.site.com/a1j41000000fB49AAE,WomanHaven A Center For Family,Domestic Violence Shelters,,False,Not ADA Accessible,(760) 353-6922,,info@womanhaven.org,,...,,Shelter is available for up to 45 days in a sa...,,,,"[Domestic Violence Shelters, Counseling for Ch...",Families/Friends of Abused Women/Men/,"[a, crisis, counseling,, services,, counseling...",[Housing and Shelter],"[Family Services, Mental Health Services, Emer..."
4,https://211.my.site.com/a1j41000001PvdrAAC,Interfaith Community Services,Carlsbad Rental Assistance,"5731 PALMER WAY STE A\nCARLSBAD, CA 92010",True,ADA Accessible,(760) 448-5696,https://www.surveymonkey.com/r/ICSCbadRA,laborconnections@interfaithservices.org,33.1380145,...,,Must complete online screening first. This can...,Call for information,2-3 months for rental assitance. Must call for...,Clients are recommended to arrive early to the...,"[Rent Payment Assistance, Utility Service Paym...",Low Income/Evicted People/At Risk for Homeless...,"[enrolled, interfaith, late, for, in, this, on...",[Housing and Shelter],"[Housing Financial Assistance, Housing Search ..."
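The record in cell 2 stores `latitude` and `longitude` as strings, and row 3 in the preview has `has_location` False with no coordinates. A minimal sketch of converting the coordinates to numbers, assuming missing values appear as empty strings or `None` (the `sample` frame below is made up and stands in for `df`):

```python
import pandas as pd

# Hypothetical rows standing in for df; the empty/None values are an assumption
# about how missing coordinates are stored.
sample = pd.DataFrame({
    "latitude": ["32.7200729", "", None],
    "longitude": ["-117.1605922", "", None],
    "has_location": [True, False, False],
})

# errors="coerce" turns anything that is not a number into NaN instead of raising.
coords = sample[["latitude", "longitude"]].apply(pd.to_numeric, errors="coerce")

# Count rows that end up without usable coordinates.
missing_coords = coords.isna().any(axis=1).sum()
print(f"Rows without coordinates: {missing_coords}")
```

Comparing `missing_coords` with the number of rows where `has_location` is False would show whether the flag and the coordinates agree.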


In [4]:
# What columns do we have?
print("Columns:")
for col in df.columns:
    print(f"  - {col}")

Columns:
  - url
  - organization
  - service_name
  - address
  - has_location
  - ada_accessible
  - main_phone
  - website
  - email
  - latitude
  - longitude
  - intake_hours_of_operation
  - specific_hours
  - intake_procedure
  - documents_required
  - intake_notes
  - description
  - eligibility
  - area_served
  - types_of_fees
  - fee_amount
  - insurance_type_accepted
  - language_other_than_english
  - insurance_plans_accepted
  - service_rules_and_guidelines
  - capacity_limitations
  - wait_list
  - helpful_tips
  - areas_of_focus
  - target_populations
  - unique_keywords
  - category
  - types


In [5]:
from collections import Counter

# What types of services exist?
# Each row's 'types' value is a list; skip rows where it is missing.
all_types = []
for types_list in df['types']:
    if isinstance(types_list, list):
        all_types.extend(types_list)

type_counts = Counter(all_types)
print("Service Types (top 15):")
for service_type, count in type_counts.most_common(15):
    print(f"  {service_type}: {count}")

Service Types (top 15):
  Mental Health Services: 844
  Food & Basic Needs Assistance: 640
  Family Services: 474
  Emergency Shelter & Crisis Intervention: 464
  Disability Services: 350
  Case Management & Coordination: 305
  Substance Abuse Disorder: 289
  Senior Services: 252
  Homelessness Prevention & Diversion: 199
  Transitional & Supportive Housing: 182
  Legal Assistance & Tenant Advocacy: 154
  Housing Search & Navigation: 124
  Domestic Violence Support: 102
  TAY Services: 94
  Housing Education & Counseling: 86


In [6]:
# What categories exist?
all_categories = []
for cat_list in df['category']:
    if isinstance(cat_list, list):
        all_categories.extend(cat_list)

print("Categories:")
print(Counter(all_categories))

Categories:
Counter({'Mental Health and Substance Use Disorder Services': 782, 'Food': 543, 'Housing and Shelter': 394})
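The three counts sum to 782 + 543 + 394 = 1719, the same as the total number of services, which is consistent with each service having exactly one category. The sum alone does not prove it, so a direct check is to count how many categories each row holds. A sketch on made-up rows (`sample` is hypothetical and stands in for `df`):

```python
import pandas as pd

# Hypothetical rows; the third one deliberately has two categories.
sample = pd.DataFrame({
    "category": [
        ["Food"],
        ["Housing and Shelter"],
        ["Food", "Housing and Shelter"],
    ]
})

# Number of categories per service; non-list values count as zero.
per_service = sample["category"].map(lambda c: len(c) if isinstance(c, list) else 0)

# How many services have 0, 1, 2, ... categories.
distribution = per_service.value_counts().sort_index()
print(distribution)
```

If every service in the real data has one category, `distribution` would contain a single entry for 1.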


## Next Steps

With the data explored, the next notebooks will:
1. Create embeddings for each service (notebook 02)
2. Build a query pipeline that combines search with an LLM (notebook 03)
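Before step 1, each record has to be turned into a single piece of text. One possible approach, sketched under assumptions: `build_embedding_text` is a hypothetical helper, and the choice of fields is illustrative rather than what notebook 02 actually uses. It also collapses the newlines and non-breaking spaces seen in `address` and `description`.

```python
def build_embedding_text(service, fields=("organization", "service_name", "description", "eligibility")):
    """Join selected fields of one service record into a single labeled text block.

    The default field list is an assumption made for illustration.
    """
    parts = []
    for field in fields:
        value = service.get(field)
        # List-valued fields (e.g. 'types') are flattened to a comma-separated string.
        if isinstance(value, list):
            value = ", ".join(str(item) for item in value)
        if isinstance(value, str) and value.strip():
            # split() with no arguments also splits on '\n', '\t', and '\xa0'.
            cleaned = " ".join(value.split())
            parts.append(f"{field}: {cleaned}")
    return "\n".join(parts)

# Values copied from the first record shown in cell 2.
record = {
    "organization": "Urban Street Angels",
    "address": "1404 5TH AVE\nSAN DIEGO, CA\xa092101",
    "specific_hours": "",
}
text = build_embedding_text(record, fields=("organization", "address", "specific_hours"))
print(text)
```

Blank fields such as `specific_hours` are skipped, so they add no empty labels to the text.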