# Fetching data using `SqlFetcher`
Translate IDs using a SQL database. This notebook assumes that the ***Prepare for `SqlFetcher` demo*** step from the [PickleFetcher](../pickle-translation/PickleFetcher.ipynb) demo notebook has been completed.

In [1]:
import sys

import rics

import id_translation

# Print relevant versions
print(f"{id_translation.__version__=}")
print(f"{sys.version=}")
rics.configure_stuff(id_translation_level="DEBUG")

id_translation.__version__='1.0.1.dev1'
sys.version='3.14.0 (main, Oct  7 2025, 16:05:28) [GCC 13.3.0]'
👻 Configured some stuff just the way I like it!


## Create translator from config
Click [here](config.toml) to see the file.

In [2]:
from id_translation import Translator

translator = Translator.from_config("config.toml")
translator

Translator(online=True: fetcher=SqlFetcher('sqlite:///demo.db'))

In [3]:
ENGINE = translator.fetcher.engine

## Load database
Write the IMDb data to the database using the `SqlFetcher` engine.

In [4]:
from data import load_imdb

for source in ["name.basics", "title.basics"]:
    df = load_imdb(source)[0]
    df.to_sql(source.replace(".", "_"), ENGINE, if_exists="replace")

2025-12-03T23:27:41.196 [rics.utility.misc.get_local_or_remote:INFO] Local processed file path: '/home/dev/.id-translation/notebooks/cache/clean_and_fix_ids/name.basics.tsv.pkl'.
2025-12-03T23:27:42.476 [rics.utility.misc.get_local_or_remote:INFO] Local processed file path: '/home/dev/.id-translation/notebooks/cache/clean_and_fix_ids/title.basics.tsv.pkl'.
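The loading cell replaces the dots in the IMDb source names with underscores before writing each table, presumably because an unquoted `name.basics` would be read as a schema-qualified name in SQL. As a rough way to see which tables that naming rule produces, the sketch below uses only the standard-library `sqlite3` module against an in-memory database. The single-column schema and the in-memory connection are assumptions for illustration; the notebook itself writes full DataFrames to `demo.db` through the fetcher's engine.

```python
import sqlite3

# In-memory stand-in for demo.db, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Same renaming rule as the loading cell: "name.basics" -> "name_basics".
sources = ["name.basics", "title.basics"]
table_names = [source.replace(".", "_") for source in sources]

for table in table_names:
    # Hypothetical one-column schema; the real tables have many more columns.
    conn.execute(f'CREATE TABLE "{table}" (id TEXT PRIMARY KEY)')

# sqlite_master is SQLite's built-in catalog of schema objects.
rows = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
found_tables = [name for (name,) in rows]
print(found_tables)

conn.close()
```

The same `sqlite_master` query can be pointed at `demo.db` to confirm that both tables exist before creating the translator cache.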


## Make some data to translate

In [5]:
import pandas as pd


def first_title(seed=5, n=1000):
    df = pd.read_sql("SELECT * FROM name_basics;", ENGINE).sample(n, random_state=seed)
    df["firstTitle"] = df["knownForTitles"].str.split(",").str[0]
    return df[["nconst", "firstTitle"]]
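The `firstTitle` column is built by splitting the comma-separated `knownForTitles` string and keeping the first element. A plain-Python sketch of that per-row step is shown here; the helper name `first_known_title` and the example IDs are made up for illustration and do not come from the database.

```python
def first_known_title(known_for_titles: str) -> str:
    """Return the first ID of a comma-separated string.

    Mirrors what .str.split(",").str[0] does for one non-missing value.
    """
    return known_for_titles.split(",")[0]


# Made-up input; not an actual row from name_basics.
example = "tt0000001,tt0000002,tt0000003"
first = first_known_title(example)
print(first)

# A value without commas is returned unchanged.
single = first_known_title("tt0000009")
print(single)
```

Unlike this helper, the pandas `.str` accessor propagates missing values instead of raising, so the notebook version also works for rows where `knownForTitles` is null.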

In [6]:
translator.go_offline()

2025-12-03T23:27:42.939 [id_translation.fetching:DEBUG] Metadata for 'sqlite:///demo.db' created in 3 ms.
2025-12-03T23:27:42.940 [id_translation.fetching:INFO] Finished initialization of 'SqlFetcher' in 4 ms: SqlFetcher('sqlite:///demo.db', sources=['name_basics', 'title_basics'])
2025-12-03T23:27:42.941 [id_translation.Translator:DEBUG] Begin going offline with 2 sources provided by: SqlFetcher('sqlite:///demo.db', sources=['name_basics', 'title_basics'])
2025-12-03T23:27:42.942 [id_translation.fetching:DEBUG] Begin fetching all IDs for placeholders=('id', 'name', 'original_name', 'from', 'to') for 2/2: ['name_basics', 'title_basics'].
2025-12-03T23:27:42.943 [id_translation.fetching:DEBUG] Begin mapping of wanted placeholders={'name', 'id', 'from', 'to'} to actual placeholders={'knownForTitles', 'int_id_nconst', 'nconst', 'primaryProfession', 'birthYear', 'primaryName', 'deathYear', 'index'} for source='name_basics'.
2025-12-03T23:27:42.946 [id_translation.fetching.map:DEBUG] Comput

Translator(online=False: cache=TranslationMap('name_basics': 202609 IDs, 'title_basics': 67334 IDs))

## Get the name and the "first" appearance for actors
"First" here means the first entry in the IMDb `knownForTitles` list. I have no idea how the titles in that list are ordered.

In [7]:
df = first_title()
df.head()

Unnamed: 0,nconst,firstTitle
119545,nm0852487,tt0020182
162954,nm1707684,tt1369730
15831,nm0102912,tt0076843
99772,nm0706681,tt0072992
29547,nm0201371,tt0025323


## Translate

In [8]:
translator.translate(df).head(5)

2025-12-03T23:27:43.892 [id_translation.dio:DEBUG] Using rank-0 (priority=1999) implementation 'id_translation.dio.integration.pandas.PandasIO' for translatable of type='pandas.DataFrame'.
2025-12-03T23:27:43.893 [id_translation.Translator:DEBUG] Begin translation of 'DataFrame'-type data. Names to translate: Derive based on type.
2025-12-03T23:27:43.894 [id_translation.Translator:DEBUG] Name extraction complete. Found names=['nconst', 'firstTitle'] for 'DataFrame'-type data.
2025-12-03T23:27:43.895 [id_translation.Translator.map:DEBUG] Begin name-to-source mapping of names=['nconst', 'firstTitle'] in 'DataFrame' against sources=['name_basics', 'title_basics'].
2025-12-03T23:27:43.896 [id_translation.Translator.map:DEBUG] Computed 2x2 match scores in context=None in 28 μs:
candidates  name_basics  title_basics
values                               
nconst              inf          -inf
firstTitle         -inf           inf
2025-12-03T23:27:43.897 [id_translation.Translator.map:INFO] Fi

Unnamed: 0,nconst,firstTitle
119545,nm0852487:Jack Taylor *1896†1932,tt0020182 not translated; default name=Title unknown
162954,nm1707684:Blaze Aleksoski *1933†2015,tt1369730 not translated; default name=Title unknown
15831,nm0102912:Saax Bradbury *1943†1976,tt0076843 not translated; default name=Title unknown
99772,nm0706681:Poul Rahbek *1911†1987,tt0072992 not translated; default name=Title unknown
29547,nm0201371:Jean Darling *1922†2015,tt0025323 not translated; default name=Title unknown
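The translated cells above seem to follow two shapes: `id:name *from†to` when an ID is found, and a fallback text with a default name when it is not. The sketch below reproduces those two shapes in plain Python to make the pattern explicit. It is not the library's implementation: the `render` function, the `lookup` dictionary, and the hard-coded templates are hypothetical, and the actual format strings are defined in `config.toml`, which is not shown here.

```python
# Hypothetical lookup table holding one record taken from the output above.
lookup = {
    "nm0852487": {"name": "Jack Taylor", "from": 1896, "to": 1932},
}


def render(id_, records, default_name="Title unknown"):
    """Render an ID in one of the two shapes seen in the output (assumed templates)."""
    record = records.get(id_)
    if record is None:
        # Fallback shape used when the ID has no matching record.
        return f"{id_} not translated; default name={default_name}"
    return f"{id_}:{record['name']} *{record['from']}†{record['to']}"


known = render("nm0852487", lookup)
unknown = render("tt0020182", lookup)
print(known)
print(unknown)
```

In the notebook, every sampled `firstTitle` ID ends up in the fallback shape, while the `nconst` IDs are translated.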
