# Places

Analysis of places in the individual biographies.

This notebook collects all the places mentioned in the individual biographies and uses wikidata to extract associated information about them, namely current names, coordinates, and adminstrative information (present day).

Relevant to this notebook is the list of attribute types that receive place names as values.

The list is the following:

- 'nascimento'
- 'jesuita-entrada'
- 'partida'
- 'chegada'
- 'estadia'
- 'estadia-x'
- 'jesuita-votos-local'
- 'jesuita-ordenacao-padre'
- 'morte'




In [18]:
from timelink.notebooks import TimelinkNotebook

tlnb = TimelinkNotebook()
tlnb.print_info()



Timelink version: 1.1.26
Project name: dehergne
Project home: /Users/jrc/mhk-home/sources/dehergne
Database type: sqlite
Database name: dehergne
Kleio image: timelinkserver/kleio-server
Kleio server token: ACbGE...
Kleio server URL: http://127.0.0.1:8089
Kleio server home: /Users/jrc/mhk-home/sources/dehergne
Kleio server container: competent_chebyshev
Kleio version requested: latest
Kleio server version: 12.8.593 (2025-03-16 21:55:53)
SQLite directory: /Users/jrc/mhk-home/sources/dehergne/database/sqlite
Database version: 6ccf1ef385a6
Call print_info(show_token=True) to show the Kleio Server token
Call print_info(show_password=True) to show the Postgres password
TimelinkNotebook(project_name=dehergne, project_home=/Users/jrc/mhk-home/sources/dehergne, db_type=sqlite, db_name=dehergne, kleio_image=timelinkserver/kleio-server, kleio_version=latest, postgres_image=postgres, postgres_version=latest)


Define attributes types that have locations as values.

In [19]:
attributes_with_place_names = ['nascimento','jesuita-entrada','partida','chegada','estadia','estadia-x',
                'jesuita-votos-local','jesuita-ordenacao-padre','morte']

attributes_with_wikidata = [f"{attr}@wikidata" for attr in attributes_with_place_names]
attributes_with_wikidata

['nascimento@wikidata',
 'jesuita-entrada@wikidata',
 'partida@wikidata',
 'chegada@wikidata',
 'estadia@wikidata',
 'estadia-x@wikidata',
 'jesuita-votos-local@wikidata',
 'jesuita-ordenacao-padre@wikidata',
 'morte@wikidata']

Collect all the individuals with attributes related to locations. Make note of the dates of the attributes and infer dates when missing, based on last known date for a previous location attribute. Extract wikidata ids for the locations if available.



In [20]:
import pandas as pd
from timelink.pandas import entities_with_attribute
from timelink.kleio.utilities import format_timelink_date
from dehergne_util import get_linked_entity_id

# show 500 rows
pd.set_option('display.max_rows', 1550)

places_of_stay = entities_with_attribute(
    entity_type='person',
    show_elements=['name','groupname','the_line','the_order','extra_info'],
    the_type=attributes_with_place_names,
    column_name='place',
    db=tlnb.db,
)

# fillna with "?"
places_of_stay['place'] = places_of_stay['place'].fillna('?')

# remove rows where place == '?'
places_of_stay = places_of_stay[places_of_stay['place'] != '?']

# this sequence replaces missing dates with the value of the previous date
# followed by ">". This allows to use the information that although the date
# is unkown it has happened after a certain date
# we need the id in a column to group by it (it is currently in the index)
# TODO: maybe an option in entities_with_attribute and compute intervals using
#       post quem and ante quem dates. See https://github.com/time-link/timelink-kleio/issues/1

places_of_stay['id_col'] = places_of_stay.index

# We try to infer the date of stay from the previous date known
# create a copy of the date column to replace the '0' values with NaN
places_of_stay['place.date_inferred'] = places_of_stay['place.date'].replace('0', pd.NA)

# order by id and line to have the missing date values filled with the previous date
places_of_stay = places_of_stay.sort_values(by=['id_col', 'place.line'])
# use ffill to fill the missing values with the previous value
places_of_stay['place.date_inferred'] = places_of_stay.groupby(['id_col'])['place.date_inferred'].fillna(method='ffill')
# create a column to flag the inferred dates
places_of_stay['place.date_is_inferred'] = places_of_stay['place.date_inferred'] != places_of_stay['place.date']
# reset index
places_of_stay.reset_index(inplace=True)
# if data_is_inferred set date_inferred to date_inferrred+">"
places_of_stay.loc[places_of_stay['place.date_is_inferred'], 'place.date_inferred'] = places_of_stay['place.date_inferred'] + '>'
# set missing value to ''
places_of_stay['place.date_inferred'].fillna('', inplace=True)
# restore index
places_of_stay.index = places_of_stay['id_col']

# create a new column named "wikidata_id" with the wikidata id extracted from place.comment
places_of_stay['wikidata_id'] = places_of_stay['place.comment'].apply(get_linked_entity_id, linked_data_provider='wikidata', if_missing='no wikidata')
# create a new column with the formatted date
places_of_stay['place.date_inferred.formatted'] = places_of_stay['place.date_inferred'].apply(format_timelink_date)

# places_of_stay.info()
# show results
show_only=10
cols=['place','place.attr_id','wikidata_id','place.date_inferred','place.date_inferred.formatted','groupname','name','place.type','place.comment','place.original','place.date_is_inferred','place.line','place.extra_info']
places_of_stay[places_of_stay.groupname=='n'][cols].sort_values(by=['place','place.date_inferred']).head(show_only)



Unnamed: 0_level_0,place,place.attr_id,wikidata_id,place.date_inferred,place.date_inferred.formatted,groupname,name,place.type,place.comment,place.original,place.date_is_inferred,place.line,place.extra_info
id_col,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1
deh-johannes-ciermans,"'s-Hertogenbosch, Holanda",deh-johannes-ciermans-att1059-277,Q2766547,16020407,1602-04-07,n,Johannes Ciermans,nascimento,@wikidata:Q2766547,Bois-le-duc,False,1507,"{'the_type': {'kleio_element_name': 'tipo', 'k..."
deh-charles-francois-xavier-de-brevedent,Abissínia,deh-charles-francois-xavier-de-brevedent-att10...,Q207521,16980610,1698-06-10,n,Charles-François Xavier de Brévedent,partida,@wikidata:Q207521,,False,1358,"{'the_value': {'kleio_element_name': 'valor', ..."
deh-estevao-coelho,"Abrantes, diocese da Guarda",deh-estevao-coelho-att1198-277,Q331191,1586,1586,n,Estêvão Coelho,nascimento,@wikidata:Q331191,Abrantès de Guarda?,False,1678,"{'the_type': {'kleio_element_name': 'tipo', 'k..."
deh-antonio-sedeno,"Acapulco, México",deh-antonio-sedeno-att429-192,Q81398,15810329,1581-03-29,n,Antonio Sedeño,estadia,@wikidata:Q81398,,False,687,"{'the_value': {'kleio_element_name': 'valor', ..."
deh-antoine-rene-de-brasle,"Acapulco, México",deh-antoine-rene-de-brasle-att990-232,Q81398,<1707,<1707,n,Antoine-René de Brasle,estadia,a caminho de Manila @wikidata:Q81398,,False,1297,"{'the_value': {'kleio_element_name': 'valor', ..."
deh-antonio-de-magalhaes,Afeganistão,deh-antonio-de-magalhaes-att93-238,Q889,17000000,1700,n,António de Magalhães,estadia,@wikidata:Q889,,False,145,"{'the_date': {'kleio_element_name': 'date', 'k..."
deh-pierre-albier,Agen,deh-pierre-albier-att214-251,Q6625,16540712,1654-07-12,n,Pierre Albier,jesuita-votos-local,@wikidata:Q6625,,False,294,"{'class': {'kleio_element_name': 'class', 'kle..."
deh-philippe-avril,Agen,deh-philippe-avril-att1017-251,Q6625,16910000,1691,n,Philippe Avril,estadia,@wikidata:Q6625,,False,1420,"{'class': {'kleio_element_name': 'class', 'kle..."
deh-bento-de-gois,Agra,deh-bento-de-gois-att464-180,Q42941,16021029,1602-10-29,n,Bento de Góis,estadia,@wikidata:Q42941,,False,614,"{'the_value': {'kleio_element_name': 'valor', ..."
deh-antonio-de-andrade,Agra,deh-antonio-de-andrade-att729-251,Q42941,16240330,1624-03-30,n,António de Andrade,estadia,@wikidata:Q42941,,False,1041,"{'class': {'kleio_element_name': 'class', 'kle..."


### Save to Excel

In [21]:
places_of_stay[cols].sort_values(by=['place','place.date_inferred']).to_excel(f"../inferences/locations_names.xlsx", sheet_name='wikidata', index=True)
# another version without wikidata_id for unbiased LLM reasoning
places_of_stay[['place','place.date_inferred.formatted', 'name','place.type','place.original']].sort_values(by=['place','place.date_inferred.formatted']).to_excel(f"../inferences/locations_names_places_dates_only.xlsx", sheet_name='wikidata', index=True)

### Locations with no wikidata id

In [15]:
places_of_stay[
    (places_of_stay['wikidata_id'] == 'no wikidata') &
    (~places_of_stay['place'].str.startswith('['))
][cols].sort_values(by=['place', 'place.date_inferred'])

Unnamed: 0_level_0,place,wikidata_id,place.date_inferred,place.date_inferred.formatted,groupname,name,place.type,place.comment,place.original,place.date_is_inferred,place.line,place.extra_info
id_col,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1
deh-jose-de-lima-ref2,Aguiar,no wikidata,17391115,1739-11-15,referido,José de Lima,nascimento,"Existem vários, Aguiar da Beira (Guarda), em B...",,False,1153,"{'the_date': {'kleio_element_name': 'date', 'k..."
deh-giovanni-francesco-de-ferrariis,Anking (Ngan-k'ing fou),no wikidata,>16710908,>1671-09-08,n,Giovanni Francesco De Ferrariis,morte,,,False,61,"{'the_date': {'kleio_element_name': 'date', 'k..."
deh-miguel-vieira,Aquilon,no wikidata,,,n,Miguel Vieira,estadia,ILOC,,True,1301,"{'the_value': {'kleio_element_name': 'valor', ..."
deh-simao-da-silveira,Aquilon,no wikidata,,,n,Simão da Silveira,estadia,"? na Índia, era cura e vigário? ILOC",,True,1406,"{'the_value': {'kleio_element_name': 'valor', ..."
deh-manuel-jose-ref2,Ascitano,no wikidata,17650000,1765,referido,Emmanuel Josephus,morte,"? ILOC (Asciano, Siena?)",,False,225,"{'the_value': {'kleio_element_name': 'valor', ..."
deh-claude-francois-loppin,Bellegarde,no wikidata,17070913,1707-09-13,n,Claude-François Loppin,nascimento,existem vários,,False,1549,"{'the_date': {'kleio_element_name': 'date', 'k..."
deh-sebastiao-fernandes,"Besteiros, diocese de Viseu",no wikidata,1573,1573,n,Sebastião Fernandes,nascimento,,,False,430,"{'the_value': {'kleio_element_name': 'valor', ..."
deh-victor-agricola-poisson,"Billom, diocese de Clermont @wikidata:Q246257",no wikidata,17300302,1730-03-02,n,Victor-Agricola Poisson,nascimento,,,False,1645,"{'the_value': {'kleio_element_name': 'valor', ..."
deh-joaquim-lobo,"Cadaval, diocese de Coimbra",no wikidata,17320224,1732-02-24,n,Joaquim Lobo,nascimento,ILOC,,False,1240,"{'the_date': {'kleio_element_name': 'date', 'k..."
deh-miguel-do-amaral,"Carapito ou Mangualde, diocese de Viseu",no wikidata,16571213,1657-12-13,n,Miguel do Amaral,nascimento,@ILOC,,False,836,"{'the_value': {'kleio_element_name': 'valor', ..."


#### Export to Excel all unlocated places

In [6]:
places_of_stay[places_of_stay['wikidata_id']=='no wikidata'][cols].sort_values(by=['place','place.date_inferred']).to_excel(f"../inferences/locations_names_no_wikidata.xlsx", sheet_name='no_wikidata', index=True)

Export the list of wikidata ids for later processing by wikidata-linked-data notebook.

In [7]:
# save to csv wikidata column, except when value is "No wikidata"
places_of_stay[places_of_stay['wikidata_id'] != 'no wikidata']['wikidata_id'].drop_duplicates().to_csv(f"../inferences/wikidata-references/locations_names_wikidata.csv", index=False)

#### Only specific place of entry, unkown places

In [8]:
from timelink.pandas import entities_with_attribute

# Choose the place of entry
place_of_entry = 'Coimbra'

# get the ids of the places of entry
missionaries_from_place = entities_with_attribute(
    entity_type='person',
    show_elements=['name','groupname','the_line','the_order','extra_info'],
    the_type=['jesuita-entrada'],
    the_value=place_of_entry,
    db=tlnb.db,
)
print(f"Number of missionaries from {place_of_entry}: {len(missionaries_from_place)}")
place_of_entry_ids = missionaries_from_place.index
places_of_stay_specific = places_of_stay.loc[place_of_entry_ids]
places_of_stay_specific[places_of_stay_specific['wikidata_id'] == 'No wikidata'][cols].sort_values(by=['place','place.date_inferred'])

Number of missionaries from Coimbra: 62


Unnamed: 0_level_0,place,wikidata_id,place.date_inferred,place.date_inferred.formatted,groupname,name,place.type,place.comment,place.original,place.date_is_inferred,place.line,place.extra_info
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1


In [9]:
places_of_stay_specific[places_of_stay_specific['wikidata_id'] == 'No wikidata'][cols].sort_values(by=['place','place.date_inferred']).to_excel(f"../inferences/jesuita_entrada_{place_of_entry}_places_of_stay_no_wikidata.xlsx", sheet_name='no_wikidata', index=True)

### Frequency of places / Frequência dos lugares

In [10]:
places_of_stay.info()

<class 'pandas.core.frame.DataFrame'>
Index: 7177 entries, deh-abraham-le-royer to simao-rodrigues-ref1
Data columns (total 23 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   id                      7177 non-null   object
 1   name                    7177 non-null   object
 2   groupname               7177 non-null   object
 3   the_line                7177 non-null   int64 
 4   the_order               7177 non-null   int64 
 5   extra_info              7177 non-null   object
 6   place.attr_id           7177 non-null   object
 7   place.type              7177 non-null   object
 8   place                   7177 non-null   object
 9   place.date              7177 non-null   object
 10  place.line              7177 non-null   int64 
 11  place.level             7177 non-null   int64 
 12  place.obs               7177 non-null   object
 13  place.extra_info        7177 non-null   object
 14  place.comment           70

In [10]:
col = 'place' # subtotal by this column

# Use pandas groupby and specify unique value count for id
df_totals = places_of_stay.groupby(col).agg({'id':'nunique',
                                                'place.date_inferred':'min',
                                                'place.date':'max'})
df_totals.info()
df_totals.sort_values('id',ascending= False).head(130)

<class 'pandas.core.frame.DataFrame'>
Index: 1569 entries, 's-Hertogenbosch, Holanda to Žilina, Eslováquia
Data columns (total 3 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   id                   1569 non-null   int64 
 1   place.date_inferred  1569 non-null   object
 2   place.date           1569 non-null   object
dtypes: int64(1), object(2)
memory usage: 49.0+ KB


Unnamed: 0_level_0,id,place.date_inferred,place.date
place,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Macau,495,,>1652
Pequim,255,,>17410115
Goa,209,,>1713
Cantão,200,,>16590123
China,186,,18050114
Lisboa,173,,17810300
Roma,110,,>16851112
Coimbra,89,,<16230324
Nanquim,68,,1780
Paris,66,,17780000


In [12]:
# export totals to excel
df_totals.sort_values('id',ascending= False).to_excel(f"../inferences/locations_totals.xlsx", sheet_name='totals', index=True)

### Who was at a specific place


List those who stayed at a specific place



In [13]:
place="Coimbra"

In [14]:
cols = ['place','name','place.type','place.date_inferred','place.obs']
show_only=100
places_of_stay[places_of_stay.place==place][cols].sort_values(by=['place.date_inferred']).head(show_only)


Unnamed: 0_level_0,place,name,place.type,place.date_inferred,place.obs
id_col,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
deh-nicolas-trigault,Coimbra,Nicolas Trigault,estadia-x,,
deh-bartolomeo-tedeschi,Coimbra,Bartolomeo Tedeschi,estadia-x,,
deh-belchior-miguel-carneiro-leitao,Coimbra,Belchior Miguel Carneiro Leitão,nascimento,1519,
deh-pedro-martins,Coimbra,Pedro Martins,nascimento,15420000,
deh-pedro-de-alcacova,Coimbra,Pedro de Alcáçova,jesuita-entrada,15420000,saiu a primeira vez
deh-belchior-nunes-barreto,Coimbra,Belchior Nunes Barreto,jesuita-entrada,15430311,
deh-belchior-miguel-carneiro-leitao,Coimbra,Belchior Miguel Carneiro Leitão,jesuita-entrada,15430425,
deh-francisco-perez,Coimbra,Francisco Pérez,jesuita-entrada,15440125,
deh-tiburcio-de-quadros,Coimbra,Tibúrcio de Quadros,jesuita-entrada,15440418,
deh-goncalo-alvares,Coimbra,Gonçalo Álvares,jesuita-entrada,15490101,


#### Export to markdown

In [15]:
places_of_stay.itertuples(index=True)

<map at 0x160266da0>

In [16]:
import os
from pandas import DataFrame
from timelink.pandas import group_attributes as person_attributes

# create if it does not exist directory ../inferences/{place_of_entry}/markdown"
directory = f"../inferences/{place_of_entry}/markdown"
if not os.path.exists(directory):
    os.makedirs(directory)


for id, name in places_of_stay[places_of_stay.place==place][['name']].sort_values('name').itertuples(index=True):
    with tlnb.db.session() as session:
        p = tlnb.db.get_person(id, session=session)
        session.add(p)
        pk:str = p.to_kleio()
        pk = f"""## {name}
^[In Edit mode, select title (including the ##), right click and "Extract selection"]

---
### {name} @ timelink
```
{pk}
```

---
Select title, right click on selected title, choose extract note
"""
        fname = f"{directory}/{id}.md"
        with open(fname, 'w', encoding='utf-8') as f:
            f.write(pk)

