# Process metadata (row-col)

## Overview of the Code

## Purpose
The code matches each customer's latitude and longitude to the closest value in two separate reference datasets, one for latitude and one for longitude, and records both the matched value and its index. This maps customer locations to predefined grid points, which is useful in geographic or environmental data processing.

## Input
1. **`ref_lat_df`**: A DataFrame of reference latitude values in a `lat` column, loaded from `configs/LAT_fromcloudimage.csv`.
2. **`ref_lng_df`**: A DataFrame of reference longitude values in a `lng` column, loaded from `configs/LON_fromcloudimage.csv`.
3. **`customers_metadata_df`**: A DataFrame of customer locations with `lat` and `lng` columns, loaded from `configs/customers_metadata.csv`.

## Output
1. **`processed_customers_metadata_df`**: A copy of the customer DataFrame with four added columns: `row_idx_TIFF` and `col_idx_TIFF` (indices of the closest reference latitude and longitude) and `lat_TIFF` and `lng_TIFF` (the matched reference values).
2. The resulting DataFrame is saved to `configs/processed_customers_metadata.csv`.

## Process
1. **Finding Closest Latitude and Longitude**:
   - For each user latitude, find the index of the closest reference latitude.
   - For each user longitude, find the index of the closest reference longitude.
2. **Appending Closest Matches**:
   - Append the closest latitude and longitude values and their indices to the user DataFrame.
3. **Saving the Result**:
   - Save the augmented user DataFrame to a CSV file.
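
The three steps above can be sketched end to end on a small example. This is a minimal illustration, not the notebook's own code: the reference grids, the customer names, and the helper `nearest_index` are hypothetical, and the result is written to an in-memory buffer rather than to a file.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical reference grids; the real ones are loaded from the CSV files.
ref_lat = pd.Series([14.0, 13.5, 13.0, 12.5], name="lat")
ref_lng = pd.Series([99.5, 100.0, 100.5, 101.0], name="lng")

# Hypothetical customer locations.
users = pd.DataFrame(
    {"customer": ["site_a", "site_b"], "lat": [13.4, 12.6], "lng": [100.6, 99.9]}
)


def nearest_index(value, reference):
    """Return the position of the reference entry closest to value."""
    return int(np.abs(reference.to_numpy() - value).argmin())


# Step 1: find the closest reference index for each coordinate.
out = users.copy()
out["row_idx_TIFF"] = out["lat"].map(lambda v: nearest_index(v, ref_lat))
out["col_idx_TIFF"] = out["lng"].map(lambda v: nearest_index(v, ref_lng))

# Step 2: append the matched reference values by looking up those indices.
out["lat_TIFF"] = ref_lat.to_numpy()[out["row_idx_TIFF"].to_numpy()]
out["lng_TIFF"] = ref_lng.to_numpy()[out["col_idx_TIFF"].to_numpy()]

# Step 3: save the result (here to a buffer instead of a CSV file on disk).
buffer = io.StringIO()
out.to_csv(buffer, index=False)
```

Looking up the matched value from the index found in step 1 avoids searching the reference data a second time.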

## Functions
1. **`find_closest_lat`**: Takes the reference latitude DataFrame and one customer row, and returns the positional index of the closest reference latitude.
2. **`find_closest_lng`**: Takes the reference longitude DataFrame and one customer row, and returns the positional index of the closest reference longitude.
3. **`find_closest_match`**: Takes a single value and a reference Series, and returns the reference value closest to it.
4. **`gen_plant_metadata`**: Returns a copy of the customer DataFrame with the closest reference indices and values added as new columns.
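
The functions listed above handle one customer at a time. As an alternative sketch, the same nearest-value search can be done for all customers at once with NumPy broadcasting. The function `nearest_indices` and the sample arrays are hypothetical and are not part of the notebook.

```python
import numpy as np


def nearest_indices(values, reference):
    """For each entry in values, return the position of the closest reference entry."""
    values = np.asarray(values, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Pairwise absolute distances, shape (len(values), len(reference)).
    distances = np.abs(values[:, None] - reference[None, :])
    # For each row (one user value), pick the column with the smallest distance.
    return distances.argmin(axis=1)


ref_lat = np.array([14.0, 13.5, 13.0, 12.5])  # hypothetical reference grid
user_lat = np.array([13.4, 12.6, 13.74])      # hypothetical user latitudes

rows = nearest_indices(user_lat, ref_lat)
closest = ref_lat[rows]
```

The distance matrix holds one entry per pair of user value and reference value, so this approach trades memory for speed and suits cases where that product stays small.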


In [1]:
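import numpy as np
import pandas as pd


def find_closest_lat(ref_lat_df, user_row):
    """Return the positional index of the reference latitude closest to the user's."""
    ref_lat = ref_lat_df['lat']   # all reference latitudes (Series)
    user_lat = user_row['lat']    # one user's latitude (scalar)
    return np.abs(ref_lat - user_lat).argmin()


def find_closest_lng(ref_lng_df, user_row):
    """Return the positional index of the reference longitude closest to the user's."""
    ref_lng = ref_lng_df['lng']   # all reference longitudes (Series)
    user_lng = user_row['lng']    # one user's longitude (scalar)
    return np.abs(ref_lng - user_lng).argmin()


def find_closest_match(val_to_find, ref_series):
    """Return the value in ref_series that is closest to val_to_find."""
    return ref_series.iloc[(ref_series - val_to_find).abs().argmin()]


def gen_plant_metadata(user_df, ref_lat_df, ref_lng_df):
    """Return a copy of user_df with the closest reference indices and values added."""
    user_out_df = user_df.copy()
    # Indices of the closest reference latitude (row) and longitude (column).
    user_out_df['row_idx_TIFF'] = user_out_df.apply(lambda row: find_closest_lat(ref_lat_df, row), axis=1)
    user_out_df['col_idx_TIFF'] = user_out_df.apply(lambda row: find_closest_lng(ref_lng_df, row), axis=1)

    # The matched reference values themselves.
    user_out_df['lat_TIFF'] = user_out_df['lat'].apply(lambda x: find_closest_match(x, ref_lat_df['lat']))
    user_out_df['lng_TIFF'] = user_out_df['lng'].apply(lambda x: find_closest_match(x, ref_lng_df['lng']))
    return user_out_df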

In [4]:
ref_lat_df = pd.read_csv("configs/LAT_fromcloudimage.csv")
ref_lng_df = pd.read_csv("configs/LON_fromcloudimage.csv")

In [8]:
customers_metadata_df = pd.read_csv("configs/customers_metadata.csv")
customers_metadata_df

Unnamed: 0,customer,lat,lng
0,ee_building,13.736774,100.532122
1,FN_huahin,12.66683,99.952905


In [9]:
processed_customers_metadata_df = gen_plant_metadata(customers_metadata_df, ref_lat_df, ref_lng_df)
processed_customers_metadata_df

Unnamed: 0,customer,lat,lng,row_idx_TIFF,col_idx_TIFF,lat_TIFF,lng_TIFF
0,ee_building,13.736774,100.532122,847,864,13.736502,100.5299
1,FN_huahin,12.66683,99.952905,908,832,12.662679,99.95509


In [10]:
processed_customers_metadata_df.to_csv("configs/processed_customers_metadata.csv", index=False)
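
One way to sanity-check the saved result is to measure how far each customer coordinate is from its matched reference value. The sketch below uses the two output rows shown above, embedded as text instead of read from the CSV file. The column names `lat_offset_deg` and `lng_offset_deg` are hypothetical additions.

```python
import io

import pandas as pd

# The two processed rows from the output above, embedded for illustration.
csv_text = """customer,lat,lng,row_idx_TIFF,col_idx_TIFF,lat_TIFF,lng_TIFF
ee_building,13.736774,100.532122,847,864,13.736502,100.5299
FN_huahin,12.66683,99.952905,908,832,12.662679,99.95509
"""
df = pd.read_csv(io.StringIO(csv_text))

# Absolute distance, in degrees, between each customer and its matched grid value.
df["lat_offset_deg"] = (df["lat"] - df["lat_TIFF"]).abs()
df["lng_offset_deg"] = (df["lng"] - df["lng_TIFF"]).abs()

# Largest offset across both coordinates and all customers.
max_offset = df[["lat_offset_deg", "lng_offset_deg"]].to_numpy().max()
print(df[["customer", "lat_offset_deg", "lng_offset_deg"]])
```

An offset much larger than the spacing of the reference grid would suggest that a customer lies outside the area the grid covers.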