## Examples of querying different distance metrics
For each distance modality (geographic, genetic, typological), we define a separate class for querying the distance between two languages.
Input languages are specified by their Glottocodes.
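Because every query below takes Glottocodes as input, it can help to check their shape before calling a tool. The following is a minimal sketch, not part of `src.querying`: the helper name `is_glottocode` is hypothetical. It assumes the usual Glottocode shape of four lowercase alphanumeric characters followed by four digits, and it only checks the format, not whether the code exists in Glottolog or in the data files.

```python
import re

# Assumed Glottocode shape: four lowercase alphanumeric characters + four digits.
_GLOTTOCODE_RE = re.compile(r"[a-z0-9]{4}[0-9]{4}")


def is_glottocode(code: str) -> bool:
    """Return True if `code` has the shape of a Glottocode.

    Hypothetical helper: this is a format check only and does not verify
    that the code is a real Glottolog entry.
    """
    return _GLOTTOCODE_RE.fullmatch(code) is not None


for candidate in ["stan1293", "stan1290", "English"]:
    print(candidate, is_glottocode(candidate))
```

A check like this catches the common mistake of passing a language name or an ISO 639-3 code instead of a Glottocode.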

#### Querying speaker distribution (geographic) distance

In [1]:
from src.querying import SpeakerGeographicDistanceTool
# We supply a path to the per-country speaker distribution data
geographic_distance = SpeakerGeographicDistanceTool('data/country_speaker_subset.csv')
lang1 = 'stan1293' # English
lang2 = 'stan1290' # French
print(f'The geographic distance between {lang1} and {lang2} is: {geographic_distance.query_distance(lang1, lang2)}')

The geographic distance between stan1293 and stan1290 is: 0.2342381764445797


#### Querying hyperbolic (genetic) distance

In [2]:
from src.querying import HyperbolicGeneticDistanceTool
# We supply a path to the pre-computed distance matrix
genetic_distance = HyperbolicGeneticDistanceTool('data/genetic_distance_matrix.csv')
lang1 = 'stan1293' # English
lang2 = 'stan1290' # French
print(f'The genetic distance between {lang1} and {lang2} is: {genetic_distance.query_distance(lang1, lang2)}')

The genetic distance between stan1293 and stan1290 is: 0.6123247923358647


#### Querying latent islands (typological) distance

In [5]:
from src.querying import IslandsTypologicalDistanceTool
# We supply a path to the URIEL+ data and a path to the pre-computed islands
typological_distance = IslandsTypologicalDistanceTool('data/URIELPlus_Union_SoftImpute.csv', 'data/islands.pkl')
lang1 = 'stan1293' # English
lang2 = 'stan1290' # French
print(f'The typological distance between {lang1} and {lang2} is: {typological_distance.query_distance(lang1, lang2)}')

The typological distance between stan1293 and stan1290 is: 0.16578026419506764
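All three tools above are queried through the same `query_distance(lang1, lang2)` call, so they can be collected behind one helper. The sketch below is an illustration under that assumption: `DistanceTool`, `query_all`, and `_ConstantTool` are hypothetical names, and the stand-in tools return placeholder values rather than real distances so that the block runs without the data files.

```python
from typing import Dict, Protocol


class DistanceTool(Protocol):
    """Anything exposing query_distance(lang1, lang2), like the tools above."""

    def query_distance(self, lang1: str, lang2: str) -> float: ...


def query_all(tools: Dict[str, DistanceTool], lang1: str, lang2: str) -> Dict[str, float]:
    """Query every tool for the same language pair and collect the results by name."""
    return {name: tool.query_distance(lang1, lang2) for name, tool in tools.items()}


class _ConstantTool:
    """Stand-in for a real tool; always returns a fixed placeholder value."""

    def __init__(self, value: float) -> None:
        self.value = value

    def query_distance(self, lang1: str, lang2: str) -> float:
        return self.value


# Placeholder values only; these are not real distances.
tools = {
    "geographic": _ConstantTool(0.1),
    "genetic": _ConstantTool(0.2),
    "typological": _ConstantTool(0.3),
}

for name, distance in query_all(tools, "stan1293", "stan1290").items():
    print(f"The {name} distance between stan1293 and stan1290 is: {distance}")
```

In the notebook, the stand-ins would be replaced by the `geographic_distance`, `genetic_distance`, and `typological_distance` objects created in the cells above.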
