<a href="https://colab.research.google.com/github/kelsdoerksen/giga-connectivity/blob/main/GeoCLIP_Extraction_For_Connectivity.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## C02 - Use GeoCLIP embeddings

A simple example of how to obtain pretrained GeoCLIP embeddings. Read the paper here: [https://arxiv.org/abs/2309.16020](https://arxiv.org/abs/2309.16020). First install the geoclip package (see [https://github.com/VicenteVivan/geo-clip](https://github.com/VicenteVivan/geo-clip)).

In [None]:
!pip install geoclip

In [None]:
from geoclip import LocationEncoder
import torch
import torch.nn as nn
import pandas as pd
import ast

Load the pretrained model directly.

In [None]:
model = LocationEncoder()

Obtain GeoCLIP location embeddings.
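
The cells below build the coordinates in (lon, lat) order and flip them to (lat, lon) just before encoding, which is the order shown in the GeoCLIP README. Because column order is easy to get wrong, a small range check can help. The following is a minimal pure-Python sketch; `to_lat_lon` is a hypothetical helper that is not part of the notebook or of geoclip, and the coordinates are illustrative values only.

```python
def to_lat_lon(lon_lat_pairs):
    """Validate (lon, lat) pairs and return them reordered as (lat, lon)."""
    reordered = []
    for lon, lat in lon_lat_pairs:
        if not -180.0 <= lon <= 180.0:
            raise ValueError(f"longitude out of range: {lon}")
        if not -90.0 <= lat <= 90.0:
            raise ValueError(f"latitude out of range: {lat}")
        reordered.append((lat, lon))
    return reordered


# Illustrative values only (roughly Gaborone and Maun, Botswana), given as (lon, lat).
pairs = [(25.92, -24.65), (23.42, -19.98)]
lat_lon = to_lat_lon(pairs)
```

Note that a range check only catches values that are impossible for their column. A swapped pair whose values both lie within ±90 passes silently, so it is still worth spot-checking a few rows against a map.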

In [None]:
# Get [lon, lat] of schools as a float64 tensor to extract embeddings for

def get_coords(df):
  """
  Return the school locations in df as a 2D float64 tensor
  of shape (n_schools, 2), with columns in the order lon, lat.
  """

  # Select both columns at once rather than looping over df.loc[i],
  # which only works when df has a default 0..n-1 index
  lon_lat = df[['lon', 'lat']].to_numpy(dtype='float64')
  locations = torch.tensor(lon_lat)

  return locations

In [None]:
# Load the school locations that embeddings will be extracted for
aoi = 'BWA'
split = 'Testing'
aoi_df = pd.read_csv('{}Data_uncorrelated_fixed.csv'.format(split))

In [None]:
# Get coordinates for the area of interest
coords = get_coords(aoi_df)

In [None]:
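# Switch to inference mode and disable gradient tracking
model.eval()
with torch.no_grad():
  # get_coords returns (lon, lat); flip(1) reorders each row to (lat, lon),
  # the order used in the GeoCLIP README, and float() casts to float32
  emb = model(coords.flip(1).float()).detach().cpu()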

In [None]:
emb.shape

In [None]:
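# Reset the index so rows align positionally with the embeddings in pd.concat
identifying_info_df = aoi_df[['giga_id_school', 'connectivity', 'lat', 'lon']].reset_index(drop=True)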
emb_df = pd.DataFrame(emb.numpy())
emb_df_labelled = pd.concat([identifying_info_df, emb_df], axis=1)
emb_df_labelled['data_split'] = split

In [None]:
# Export the labelled embeddings to CSV
emb_df_labelled.to_csv('{}_GeoCLIP_embeddings_{}.csv'.format(aoi, split))
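
When the exported file is read back, the embedding columns need to be separated from the identifying columns. Since `emb_df` is built without column names, its headers are the integers 0, 1, 2, and so on, and `to_csv` with default settings also writes the row index as an unnamed first column. The sketch below uses only the standard library; the sample text is hypothetical and merely mimics that layout, with three embedding columns for brevity and made-up school IDs and values.

```python
import csv
import io

# Hypothetical sample mimicking the exported layout: unnamed index column,
# identifying columns, integer-named embedding columns, then data_split.
sample = (
    ",giga_id_school,connectivity,lat,lon,0,1,2,data_split\n"
    "0,school-a,1,-24.65,25.92,0.1,-0.2,0.3,Testing\n"
    "1,school-b,0,-19.98,23.42,0.4,0.5,-0.6,Testing\n"
)

reader = csv.DictReader(io.StringIO(sample))

# Embedding columns are the ones whose header is a plain integer;
# the unnamed index column has an empty header and is skipped.
emb_cols = [name for name in reader.fieldnames if name.isdigit()]

records = []
for row in reader:
    vector = [float(row[col]) for col in emb_cols]
    # connectivity is kept as raw text because its encoding is not assumed here
    records.append((row["giga_id_school"], row["connectivity"], vector))
```

To read a real export, replace `io.StringIO(sample)` with an open file handle for the CSV written above.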