# LightBox API - Purpose of Search Endpoint

The Search endpoint returns addresses that match the full-text string passed in the `text` parameter. Each result includes a representative point for the address and references to related parcels. If there is no exact match, the endpoint returns the best possible match. The `$ref` value within each `parcels` object for an address can be used to retrieve information about that parcel.
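
As a rough illustration of the parcel references described above, the sketch below collects the `$ref` values from one address result. It assumes that `parcels` is a list of objects stored directly on each address result; the helper name `parcel_refs` and the placeholder values in `sample_match` are hypothetical, and the actual format of a `$ref` value should be checked against a real response.

```python
from typing import Any, Dict, List


def parcel_refs(address_match: Dict[str, Any]) -> List[str]:
    """Collect the '$ref' value of every parcel object on one address result.

    Assumption: 'parcels' is a list of dicts stored directly on the result.
    """
    return [
        parcel["$ref"]
        for parcel in address_match.get("parcels", [])
        if "$ref" in parcel
    ]


# Hypothetical, hand-written result fragment; the values are placeholders,
# not real LightBox references.
sample_match = {
    "parcels": [
        {"$ref": "<parcel-reference-1>"},
        {"$ref": "<parcel-reference-2>"},
    ]
}

print(parcel_refs(sample_match))
```

Each returned reference could then be requested separately to look up the parcel, using the same `x-api-key` header as the search call.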

## Batch Geocoding Addresses with Search

This batch geocoder splits the address list into batches and processes them sequentially: it sends one request per address to the LightBox API and waits for each response before sending the next. This approach is straightforward but does not use concurrent processing techniques such as multi-threading or asynchronous requests.
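
For comparison, a concurrent variant might look like the sketch below. It is not part of this notebook: `geocode_concurrently` and `max_workers` are hypothetical names, and the sketch assumes that the API plan allows several requests in flight at once, which is not covered here. It uses `ThreadPoolExecutor.map` from the Python standard library, which returns results in input order.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

T = TypeVar("T")


def geocode_concurrently(
    geocode_one: Callable[[str], T],
    addresses: List[str],
    max_workers: int = 4,
) -> List[T]:
    """Apply `geocode_one` to every address using a small thread pool.

    Results come back in the same order as `addresses`, because
    ThreadPoolExecutor.map preserves input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(geocode_one, addresses))
```

With the functions defined later in this notebook, a call could look like `geocode_concurrently(lambda a: geocode_address(lightbox_api_key, a), addresses)`. Keep `max_workers` small unless you know the rate limits that apply to your key.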

This notebook provides a step-by-step guide to batch geocoding addresses using the LightBox API. The process involves several key steps, each detailed in subsequent sections:

1. **Setup**
   - Importing necessary Python libraries.
   - Defining global configurations and API keys.

2. **Function Definitions**
   - `geocode_address`: Function to geocode a single address.
   - `read_addresses_from_csv`: Function to read and format addresses from a CSV file.
   - `batch_geocode_addresses`: Function to process addresses in batches and geocode them.

3. **API Key**
   - Enter your API Key for Authorization.

4. **Reading Input Data**
   - Reading and formatting addresses from a user-specified CSV file.

5. **Batch Geocoding Process**
   - Executing the batch geocoding process using the defined functions.
   - Handling different scenarios and errors during the geocoding process.

6. **Saving Results**
   - Saving the geocoded data to a CSV file.
   - Format and content of the output data.

Additional Materials:
[LightBox Developer Portal](https://developer.lightboxre.com/)

### 1. Import the required Python packages

In [7]:
import requests
import pandas as pd
from typing import Dict, List

### 2. Define the functions

In [8]:
# Function to geocode a single address using the LightBox API.
def geocode_address(lightbox_api_key: str, address: str) -> requests.Response:
    """
    Geocodes the provided address using the LightBox API.

    Args:
        lightbox_api_key (str): The API key for accessing the LightBox API.
        address (str): The address string for matching.

    Returns:
        requests.Response: The raw HTTP response. Check `status_code`, then call
            `.json()` to get the geocoded address information.
    """
    # API endpoint configuration
    BASE_URL = "https://api.lightboxre.com/v1"
    ENDPOINT = "/addresses/search"
    URL = BASE_URL + ENDPOINT

    # Setting up request parameters and headers
    params = {'text': address}
    headers = {'x-api-key': lightbox_api_key}

    # Sending request to the LightBox API
    response = requests.get(URL, params=params, headers=headers)

    return response

# Function to read addresses from a CSV file and format them.
def read_addresses_from_csv(file_path: str) -> List[str]:
    """
    Reads addresses from a CSV file and formats them into 'Address, City State Zip Code'.
    
    Args:
        file_path (str): Path to the CSV file.
    
    Returns:
        List[str]: A list of formatted address strings.
    """
    # Read every column as text so ZIP codes keep their leading zeros (e.g. 02134)
    df = pd.read_csv(file_path, dtype=str)

    # Concatenating address components into a single address string per row
    formatted_addresses = df.apply(
        lambda row: f"{row['Address']}, {row['City']} {row['State']} {row['Zip Code']}", 
        axis=1
    )
    return formatted_addresses.tolist()

# Function to batch process addresses for geocoding.
def batch_geocode_addresses(api_key: str, addresses: List[str], batch_size: int = 200) -> pd.DataFrame:
    """
    Batch processes a list of addresses for geocoding.

    Args:
        api_key (str): API key for the geocoding service.
        addresses (List[str]): List of addresses to geocode.
        batch_size (int): Number of addresses to process in each batch.
    
    Returns:
        pd.DataFrame: DataFrame containing original addresses and expanded geocoded data.
    """
    batched_addresses = [addresses[i:i + batch_size] for i in range(0, len(addresses), batch_size)]
    all_results = []

    for batch in batched_addresses:
        for address in batch:
            result = geocode_address(api_key, address)
            if result.status_code == 200:
                data = result.json()
                # Extracting data from the first match
                if data['addresses']:
                    first_match = data['addresses'][0]
                    latitude = first_match['location']['representativePoint']['latitude']
                    longitude = first_match['location']['representativePoint']['longitude']
                    confidence_score = first_match['$metadata']['geocode']['confidence']['score']
                    precision_code = first_match['$metadata']['geocode']['precisionCode']  # Extracting precision code
                    all_results.append({
                        "address": address, 
                        "latitude": latitude, 
                        "longitude": longitude, 
                        "confidence_score": confidence_score,
                        "precision_code": precision_code  # Adding precision code to the DataFrame
                    })
                else:
                    all_results.append({
                        "address": address, 
                        "latitude": "No match", 
                        "longitude": "No match", 
                        "confidence_score": "No match",
                        "precision_code": "No match"
                    })
            else:
                all_results.append({
                    "address": address, 
                    "latitude": "Failed",
                    "longitude": f"Status Code: {result.status_code}",
                    "confidence_score": "Failed",
                    "precision_code": "Failed"
                })
                print(f"Failed to geocode address '{address}', Status Code: {result.status_code}")

    return pd.DataFrame(all_results)
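
The nested key lookups in `batch_geocode_addresses` raise a `KeyError` if a match lacks one of the expected fields. A more defensive extraction step could look like the sketch below. The helper name `extract_first_match` is hypothetical, the field names are the ones used above, and the values in `sample` are made up for illustration.

```python
from typing import Any, Dict, Optional


def extract_first_match(data: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the geocode fields of the first match, or None if there is no match.

    Missing nested keys produce None values instead of raising KeyError.
    """
    matches = data.get("addresses") or []
    if not matches:
        return None

    first = matches[0]
    point = first.get("location", {}).get("representativePoint", {})
    geocode = first.get("$metadata", {}).get("geocode", {})
    return {
        "latitude": point.get("latitude"),
        "longitude": point.get("longitude"),
        "confidence_score": geocode.get("confidence", {}).get("score"),
        "precision_code": geocode.get("precisionCode"),
    }


# Made-up payload that mirrors the field names used in this notebook.
sample = {
    "addresses": [
        {
            "location": {"representativePoint": {"latitude": 1.5, "longitude": -2.5}},
            "$metadata": {"geocode": {"confidence": {"score": 0.9}}},
        }
    ]
}
row = extract_first_match(sample)
```

Here `row["precision_code"]` is `None` because the sample omits `precisionCode`, while the other three fields are filled in.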

### 3. Create variable(s) that will be used to authenticate your calls.
Get your key from the [LightBox Developer Portal](https://developer.lightboxre.com/).

In [9]:
lightbox_api_key = '<YOUR_API_KEY>'
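
If you would rather not paste the key into the notebook, one option is to read it from an environment variable. The sketch below assumes a variable named `LIGHTBOX_API_KEY`; that name and the helper `load_api_key` are illustrative choices, not something the LightBox API requires.

```python
import os


def load_api_key(env_var: str = "LIGHTBOX_API_KEY") -> str:
    """Read the API key from an environment variable and fail early if it is unset.

    The default variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your LightBox API key."
        )
    return key
```

The assignment above would then become `lightbox_api_key = load_api_key()`.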

### 4. Reading input data.
- The user specifies the location and name of the input CSV file of addresses.
    - If the file is in the notebook's working directory, the file name alone is enough, for example `input_file_name.csv`.
    - This script assumes that the input CSV file has a header row with the columns `Address`, `City`, `State`, and `Zip Code`.
- The user specifies the location and name of the output CSV file.
    - If the file should be written to the notebook's working directory, the file name alone is enough, for example `output_file_name.csv`.
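
Because the script depends on those four headers, it can help to check them before geocoding. The sketch below is one possible check using the standard `csv` module; `find_missing_columns` and `REQUIRED_COLUMNS` are hypothetical names introduced for this example.

```python
import csv
from typing import Iterable, List

# Headers that read_addresses_from_csv expects in the input file.
REQUIRED_COLUMNS = ["Address", "City", "State", "Zip Code"]


def find_missing_columns(csv_lines: Iterable[str]) -> List[str]:
    """Return the required headers that are absent from the first CSV row.

    `csv_lines` can be an open text file or any iterable of lines.
    """
    header = next(csv.reader(csv_lines), [])
    return [column for column in REQUIRED_COLUMNS if column not in header]


# In-memory sample that lacks the 'Zip Code' column.
sample_lines = ["Address,City,State\n", "1 Main St,Springfield,IL\n"]
print(find_missing_columns(sample_lines))  # prints ['Zip Code']
```

For a real file, pass an open handle, for example `with open(input_file_path, newline="", encoding="utf-8-sig") as handle: missing = find_missing_columns(handle)`, and stop if `missing` is not empty.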

In [10]:
input_file_path = 'input.csv' # User inputs the file name
output_file_path = 'output.csv'  # User inputs the output file name

# Reading and processing addresses
print("Reading addresses from CSV file...")
addresses = read_addresses_from_csv(input_file_path)

Reading addresses from CSV file...

### 5. Batch Geocoding Process

In [11]:
print("Starting batch geocoding...")
geocoded_data = batch_geocode_addresses(lightbox_api_key, addresses)
print("Batch geocoding completed.")

Starting batch geocoding...
Batch geocoding completed.

### 6. Saving Results

In [12]:
# Saving geocoded data to output file
geocoded_data.to_csv(output_file_path, index=False)
print(f"Geocoded data saved to '{output_file_path}'.")

Geocoded data saved to 'output.csv'.
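
Before relying on the output file, it may be worth counting how many addresses matched. The sketch below relies on the marker strings that `batch_geocode_addresses` writes into the `latitude` column (`"No match"` and `"Failed"`); the helper name `summarize_outcomes` is hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def summarize_outcomes(latitudes: Iterable[Any]) -> Dict[str, int]:
    """Count matched, unmatched, and failed rows from the 'latitude' column."""

    def classify(value: Any) -> str:
        if value == "Failed":
            return "failed"
        if value == "No match":
            return "no_match"
        return "matched"

    return dict(Counter(classify(value) for value in latitudes))
```

With the DataFrame produced above, the call would be `summarize_outcomes(geocoded_data["latitude"])`.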
