> First-time use: follow the instructions in the `README.md` file in this directory.



# Recipes

Recipes for using Timelink with notebooks

[summary]


---
## Setup

### Basic setup

Create a `TimelinkNotebook` instance.

This will provide an interface to the 
various `Timelink` functions.



In [1]:
from timelink.notebooks import TimelinkNotebook

tlnb = TimelinkNotebook()




### Changing the default values

When called with no parameters, `TimelinkNotebook`
assumes a certain directory layout around the notebook
and derives its configuration from that layout.

The default layout:

* **project-directory**
    * **database** (database-related files)
    * **notebooks**  _(directory with the current notebook)_ 
    * **kleio**  (source files in kleio format)
    * **identifications** (files with identification information)

Based on this layout `TimelinkNotebook` assumes a few default values:

* **database name**
    * the name of the *project directory* is used as the **database name**, sanitized to produce a valid database name
      (hyphens and spaces are replaced with `_`).
* **database type**
    * the database type is assumed to be **sqlite**, and a **sqlite** database is created in
      *database/sqlite*
    * the `db_type` parameter can be used to set the database type to `postgres`.
* **Kleio server home**
    * the working directory of the _Kleio server_ is assumed to be the **project directory**
    * the _Kleio server_ is an application
      that processes transcriptions of historical sources in _Kleio_ notation and generates data in a format
      that can be imported into the database.
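
As a rough illustration of how such defaults could be derived from the layout, the sketch below computes the project home, the database name and the SQLite directory from the notebook directory. The helper `sketch_defaults` and the keys it returns are hypothetical and are not part of Timelink; only the rules stated above (parent directory as project home, hyphens and spaces replaced with `_`, SQLite files under `database/sqlite`) come from this document.

```python
import re
from pathlib import PurePosixPath


def sketch_defaults(notebook_dir: str) -> dict:
    """Hypothetical helper (not Timelink's API): derive defaults from the layout."""
    # The notebook lives in <project-directory>/notebooks
    project_home = PurePosixPath(notebook_dir).parent
    # Project directory name, with hyphens and spaces replaced by "_"
    db_name = re.sub(r"[- ]", "_", project_home.name)
    return {
        "project_home": str(project_home),
        "db_type": "sqlite",
        "db_name": db_name,
        "sqlite_dir": str(project_home / "database" / "sqlite"),
    }


# Invented example path, for illustration only
defaults = sketch_defaults("/home/user/my project-2024/notebooks")
print(defaults["db_name"])
print(defaults["sqlite_dir"])
```
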

These default values can be changed with parameters when creating the `TimelinkNotebook`  instance:

```python
tlnb = TimelinkNotebook(
    project_name="my-project",
    project_home="~/projects/my-project",
    db_type="postgres",
    db_name="my_project_db",
    kleio_image="timelinkserver/kleio-server",
    kleio_version="12.0.0",
    postgres_image="postgres",
    postgres_version="latest",
    sqlite_dir="~/databases",
    sql_echo=True
)
```

### Examining the current configuration

In [2]:

tlnb.print_info()


Timelink version: 1.1.25
Project name: dehergne
Project home: /Users/jrc/mhk-home/sources/dehergne
Database type: sqlite
Database name: dehergne
Kleio image: timelinkserver/kleio-server
Kleio server token: Ww6W5...
Kleio server URL: http://127.0.0.1:8090
Kleio server home: /Users/jrc/mhk-home/sources/dehergne
Kleio server container: mystifying_beaver
Kleio version requested: latest
Kleio server version: 12.8.593 (2025-03-16 21:55:53)
SQLite directory: /Users/jrc/mhk-home/sources/dehergne/database/sqlite
Database version: 6ccf1ef385a6
Call print_info(show_token=True) to show the Kleio Server token
Call print_info(show_password=True) to show the Postgres password
TimelinkNotebook(project_name=dehergne, project_home=/Users/jrc/mhk-home/sources/dehergne, db_type=sqlite, db_name=dehergne, kleio_image=timelinkserver/kleio-server, kleio_version=latest, postgres_image=postgres, postgres_version=latest)


### Listing available databases

In [3]:
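# descriptive names; `pd` is avoided because it is the usual pandas alias
postgres_dbs = tlnb.get_postgres_databases()
print(f"Postgres databases: {postgres_dbs}")
sqlite_dbs = tlnb.get_sqlite_databases()
print(f"Sqlite databases: {sqlite_dbs}")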


Postgres databases: ['tests_users', 'test_project']
Sqlite databases: ['../database/sqlite/dehergne.sqlite']


### Database status (table row count)

In [4]:
tlnb.table_row_count_df()

Unnamed: 0,table,count
0,acts,29
1,alembic_version,1
2,aregisters,1
3,attributes,26742
4,blinks,200
5,class_attributes,70
6,classes,14
7,entities,32950
8,geoentities,359
9,goods,0


## Dealing with kleio files

### List Kleio files available

In [5]:
kleio_files = tlnb.get_kleio_files()
kleio_files.head()

Unnamed: 0,path,name,modified,status,translated,errors,warnings,import_status,import_errors,import_warnings,import_error_rpt,import_warning_rpt,imported,rpt_url,xml_url
0,sources/dehergne-0-abrev.cli,dehergne-0-abrev.cli,2025-05-25 09:43:47.199569+00:00,V,2025-04-14 05:11:00+00:00,0,0,I,0,0,No errors,No warnings,2025-05-10 03:50:53.702537,/rest/reports/sources/dehergne-0-abrev.rpt,/rest/exports/sources/dehergne-0-abrev.xml
1,sources/dehergne-a.cli,dehergne-a.cli,2025-05-25 09:43:47.200956+00:00,V,2025-04-14 05:11:00+00:00,0,0,I,0,0,No errors,No warnings,2025-05-10 03:51:01.352077,/rest/reports/sources/dehergne-a.rpt,/rest/exports/sources/dehergne-a.xml
2,sources/dehergne-b.cli,dehergne-b.cli,2025-05-25 09:43:47.206771+00:00,V,2025-04-14 08:09:00+00:00,0,0,I,0,0,No errors,No warnings,2025-05-10 03:51:10.480262,/rest/reports/sources/dehergne-b.rpt,/rest/exports/sources/dehergne-b.xml
3,sources/dehergne-c.cli,dehergne-c.cli,2025-05-25 09:43:47.210742+00:00,V,2025-04-14 08:03:00+00:00,0,0,I,0,0,No errors,No warnings,2025-05-10 03:51:23.126101,/rest/reports/sources/dehergne-c.rpt,/rest/exports/sources/dehergne-c.xml
4,sources/dehergne-d.cli,dehergne-d.cli,2025-05-25 09:43:47.216062+00:00,V,2025-04-14 05:11:00+00:00,0,0,I,0,0,No errors,No warnings,2025-05-10 03:51:28.685167,/rest/reports/sources/dehergne-d.rpt,/rest/exports/sources/dehergne-d.xml


Show only translation and import status

In [6]:
kleio_files[["name","status","import_status"]]

Unnamed: 0,name,status,import_status
0,dehergne-0-abrev.cli,V,I
1,dehergne-a.cli,V,I
2,dehergne-b.cli,V,I
3,dehergne-c.cli,V,I
4,dehergne-d.cli,V,I
5,dehergne-e.cli,V,I
6,dehergne-f.cli,V,U
7,dehergne-g.cli,V,I
8,dehergne-h.cli,V,I
9,dehergne-i.cli,V,I


Show translation and import errors

In [7]:
kleio_files[["name","errors","warnings","import_errors","import_warnings"]]

Unnamed: 0,name,errors,warnings,import_errors,import_warnings
0,dehergne-0-abrev.cli,0,0,0,0
1,dehergne-a.cli,0,0,0,0
2,dehergne-b.cli,0,0,0,0
3,dehergne-c.cli,0,0,0,0
4,dehergne-d.cli,0,0,0,0
5,dehergne-e.cli,0,0,0,0
6,dehergne-f.cli,0,0,0,0
7,dehergne-g.cli,0,0,0,0
8,dehergne-h.cli,0,0,0,0
9,dehergne-i.cli,0,0,0,0


### Translation and Import reports

Translation reports.

Pass a dataframe with Kleio files and a row number (zero-based position in the dataframe) to get the translation report of that file.

In [8]:
rpt=tlnb.get_translation_report(kleio_files,rows=1)
print(rpt)

KleioTranslator - server version 12.7 - build 579 2025-01-29 17:45:15
14-4-2025 5-11

Processing data file dehergne-a.cli
-------------------------------------------
Generic Act translation module with geoentities (XML).
     Joaquim Ramos de Carvalho (joaquim@uc.pt) 
** New document: kleio
kleio translation started
Structure: gacto2.str
Prefix: 
Autorel: 
Translation count: 242
Obs: 
** Processing source fonte$dehergne-a
21: lista$dehergne-notices-a
*** End of File

Line 602 "SAME AS" TO EXTERNAL REFERENCE EXPORTED (deh-belchior-miguel-carneiro-leitao) CHECK IF IT EXISTS BEFORE IMPORTING THIS FILE.
Line 802 "SAME AS" TO EXTERNAL REFERENCE EXPORTED (deh-andre-palmeiro) CHECK IF IT EXISTS BEFORE IMPORTING THIS FILE.
Line 807 "SAME AS" TO EXTERNAL REFERENCE EXPORTED (deh-antonio-francisco-cardim) CHECK IF IT EXISTS BEFORE IMPORTING THIS FILE.
Line 966 "SAME AS" TO EXTERNAL REFERENCE EXPORTED (deh-jean-regis-lieou) CHECK IF IT EXISTS BEFORE IMPORTING THIS FILE.
Line 1028 "SAME AS" TO EXTE

Or pass the report URL of the file (column `rpt_url`)

In [9]:
file = kleio_files.iloc[0].rpt_url
print(file)
rpt=tlnb.get_translation_report(file)
print(rpt)

/rest/reports/sources/dehergne-0-abrev.rpt
KleioTranslator - server version 12.7 - build 579 2025-01-29 17:45:15
14-4-2025 5-11

Processing data file dehergne-0-abrev.cli
-------------------------------------------
Generic Act translation module with geoentities (XML).
     Joaquim Ramos de Carvalho (joaquim@uc.pt) 
** New document: kleio
kleio translation started
Structure: 
Prefix: 
Autorel: 
Translation count: 12
Obs: 
** Processing source fonte$dehergne-0-abrev
*** End of File


Structure file: /kleio-home/structures/sources.str
Structure processing report: /kleio-home/structures/sources.srpt
Structure in JSON: /kleio-home/structures/sources.str.json

Kleio file: /kleio-home/sources/dehergne-0-abrev.cli
Original file: /kleio-home/sources/dehergne-0-abrev.org
Previous version: /kleio-home/sources/dehergne-0-abrev.old
Temp file with ids: /kleio-home/sources/dehergne-0-abrev.ids
** - /kleio-home/sources/dehergne-0-abrev.cli-renamed to- /kleio-home/sources/dehergne-0-abrev.old
0  error

Import report

In [10]:
rpt = tlnb.get_import_rpt(kleio_files,rows=1)
print(rpt)

dehergne-a.cli
No errors





# Update database

Updates the translations of the sources and imports into the database those translated without errors.

In [11]:
import logging
logging.basicConfig(level=logging.INFO)

tlnb.update_from_sources()

  element: KElement = group.get_element_by_name_or_class(cattr.colclass)


Show imported files status

In [12]:
imported_files_df = tlnb.get_import_status()
imported_files_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 29 entries, 0 to 28
Data columns (total 25 columns):
 #   Column              Non-Null Count  Dtype              
---  ------              --------------  -----              
 0   path                29 non-null     object             
 1   name                29 non-null     object             
 2   size                29 non-null     int64              
 3   directory           29 non-null     object             
 4   modified            29 non-null     datetime64[ns, UTC]
 5   modified_iso        29 non-null     datetime64[ns, UTC]
 6   modified_string     29 non-null     object             
 7   qtime               29 non-null     datetime64[ns, UTC]
 8   qtime_string        29 non-null     object             
 9   source_url          29 non-null     object             
 10  status              29 non-null     object             
 11  translated          29 non-null     datetime64[ns, UTC]
 12  translated_string   29 non-null     ob

Check the import status of the translated files. The status codes are:

```python
I = "I"  # imported
E = "E"  # imported with errors
W = "W"  # imported with warnings, no errors
N = "N"  # not imported
U = "U"  # translation updated, needs reimport
```
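
As a small sketch of how these codes might be used, the example below picks the files that still need to be imported. The mapping and the helper `needs_import` are hypothetical; treating only `N` and `U` as pending is an assumption based on the descriptions of the codes, and the two file names and statuses are taken from the status table shown earlier.

```python
# Descriptions of the import status codes listed above
IMPORT_STATUS = {
    "I": "imported",
    "E": "imported with errors",
    "W": "imported with warnings, no errors",
    "N": "not imported",
    "U": "translation updated, needs reimport",
}


def needs_import(status: str) -> bool:
    """Hypothetical rule: never imported (N) or translation updated (U)."""
    return status in {"N", "U"}


files = [("dehergne-e.cli", "I"), ("dehergne-f.cli", "U")]
pending = [name for name, status in files if needs_import(status)]
for name, status in files:
    print(name, "->", IMPORT_STATUS[status])
print(pending)
```
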

In [13]:
imported_files_df[["import_status","import_errors","import_warnings","name","imported","path"]].sort_values("name")

Unnamed: 0,import_status,import_errors,import_warnings,name,imported,path
0,I,0,0,dehergne-0-abrev.cli,2025-05-10 03:50:53.702537,sources/dehergne-0-abrev.cli
1,I,0,0,dehergne-a.cli,2025-05-10 03:51:01.352077,sources/dehergne-a.cli
2,I,0,0,dehergne-b.cli,2025-05-10 03:51:10.480262,sources/dehergne-b.cli
3,I,0,0,dehergne-c.cli,2025-05-10 03:51:23.126101,sources/dehergne-c.cli
4,I,0,0,dehergne-d.cli,2025-05-10 03:51:28.685167,sources/dehergne-d.cli
5,I,0,0,dehergne-e.cli,2025-05-10 03:51:30.219256,sources/dehergne-e.cli
6,I,0,0,dehergne-f.cli,2025-05-26 08:13:26.720670,sources/dehergne-f.cli
7,I,0,0,dehergne-g.cli,2025-05-10 03:51:46.453827,sources/dehergne-g.cli
8,I,0,0,dehergne-h.cli,2025-05-10 03:51:49.364195,sources/dehergne-h.cli
9,I,0,0,dehergne-i.cli,2025-05-10 03:51:51.033646,sources/dehergne-i.cli


## Todo

Provide translation and import as single functions that take a dataframe of files:
* `TimelinkNotebook.translate([files_df, rows=List[int], status="T"])`
* `TimelinkNotebook.import(files_df, rows=List[int])`



# Getting data

### Search for people

#### Search persons by name

In [14]:
from timelink.api.models import Person
from sqlalchemy import select

show_only=10

with tlnb.db.session() as session:
    stmt = select(Person).where(Person.name.like('% Valignano%'))
    print(stmt)
    persons = session.execute(stmt).scalars().all()
    print()
    for person in persons[:show_only]:
        print(person.id, person.name,person.sex)


SELECT persons.id, entities.id AS id_1, entities.class, entities.inside, entities.the_source, entities.the_order, entities.the_level, entities.the_line, entities.groupname, entities.extra_info, entities.updated, entities.indexed, persons.name, persons.sex, persons.obs 
FROM entities JOIN persons ON entities.id = persons.id 
WHERE persons.name LIKE :name_1

deh-luis-cerqueira-ref1 Alessandro Valignano m
deh-pedro-martins-ref2 Alessandro Valignano m
deh-gil-martinez-de-la-mata-ref1 Alessandro Valignano m
deh-lourenco-mexia-ref1 Alessandro Valignano m
deh-alessandro-valignano Alessandro Valignano m
deh-francesco-pasio-ref1 Alessandro Valignano m
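
The pattern `'% Valignano%'` only matches names in which `Valignano` is preceded by a space. This can be checked in isolation with the standard-library `sqlite3` module; the `persons` table below is a simplified, hypothetical stand-in with invented rows, not the Timelink schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table; the real schema has more columns
conn.execute("CREATE TABLE persons (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO persons VALUES (?, ?)",
    [
        ("p1", "Alessandro Valignano"),
        ("p2", "Valignano"),
        ("p3", "Duarte de Sande"),
    ],
)

# "%" matches any sequence of characters; the space before "Valignano"
# means the surname must be preceded by a space
rows = conn.execute(
    "SELECT id, name FROM persons WHERE name LIKE ? ORDER BY id",
    ("% Valignano%",),
).fetchall()
print(rows)
conn.close()
```

A name consisting only of `Valignano` is not returned; dropping the space from the pattern (`'%Valignano%'`) would include it.
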


### Search other Entities

#### Get the entity classes in the database

In [15]:
from sqlalchemy import select, func
from timelink.api.models import Entity

models = tlnb.db.get_models_ids()
models

['entity',
 'attribute',
 'relation',
 'act',
 'source',
 'aregister',
 'person',
 'good',
 'object',
 'geoentity',
 'rgeoentity',
 'robject',
 'rperson',
 'rentity',
 'class']

#### Get columns of an entity type

In [16]:
table = tlnb.db.get_table("entity")
print(table.name)
list(table.columns)

entities


[Column('id', String(), table=<entities>, primary_key=True, nullable=False),
 Column('class', String(), table=<entities>, nullable=False),
 Column('inside', String(), ForeignKey('entities.id'), table=<entities>),
 Column('the_source', String(), table=<entities>),
 Column('the_order', Integer(), table=<entities>),
 Column('the_level', Integer(), table=<entities>),
 Column('the_line', Integer(), table=<entities>),
 Column('groupname', String(), table=<entities>),
 Column('extra_info', JSON(), table=<entities>),
 Column('updated', DateTime(), table=<entities>, default=CallableColumnDefault(<function datetime.utcnow at 0x1070afac0>)),
 Column('indexed', DateTime(), table=<entities>)]

#### Search any entity type

> TODO: improve this recipe.

In [17]:
from timelink.api.models import Entity
from sqlalchemy import select, func

Geoentity = tlnb.db.get_model("geoentity")
stmt = select(Geoentity).where(Geoentity.name.like('H%'))
print(stmt)
with tlnb.db.session() as session:

    result = session.execute(stmt).scalars().all()
    for row in result[:4]:
        print()
        print(row.the_type,row.name,row.id,row.obs)
        print(row.to_kleio())

SELECT geoentities.id, entities.id AS id_1, entities.class, entities.inside, entities.the_source, entities.the_order, entities.the_level, entities.the_line, entities.groupname, entities.extra_info, entities.updated, entities.indexed, geoentities.name, geoentities.the_type, geoentities.obs 
FROM entities JOIN geoentities ON entities.id = geoentities.id 
WHERE geoentities.name LIKE :name_1

geo2 Hangchou deh-r1644-hangchou 
geo2$Hangchou#Hang-tcheou, hoje: Hangzhou, 杭州, @wikidata:Q4970/geo2
  atr$activa/sim/1611
  atr$residencia-missao/Jesuíta/1611
  atr$geoentity:name@wikidata/"https://www.wikidata.org/wiki/Q4970"#Hang-tcheou, hoje: Hangzhou, 杭州, @wikidata:Q4970%Q4970/1644

geo2 Huchow deh-r1644-huchow 
geo2$Huchow#Hou-tcheou, hoje: Huzhou, 湖州, @wikidata:Q42664/geo2
  atr$geoentity:name@wikidata/"https://www.wikidata.org/wiki/Q42664"#Hou-tcheou, hoje: Huzhou, 湖州, @wikidata:Q42664%Q42664/1644

geo3 Hungtang deh-r1644-hungtang 
geo3$Hungtang/geo3
  atr$activa/sim/1636
  atr$geoentity:type

#### Get person by id

Show a single person or entity in Kleio notation

In [18]:
from timelink.api.models.person import Person

person_id = 'deh-duarte-de-sande'  # avoid shadowing the built-in `id`
with tlnb.db.session() as session:
    # get Person with id
    p = session.get(Person, person_id)
    print(p)

n$Duarte de Sande/m/id=deh-duarte-de-sande
  ls$nacionalidade/Portugal/
  ls$jesuita-estatuto/Padre/
  ls$nome/Edoardo de Sande/
  ls$nome-chines/Mong San-Tö Ning-Houan/
  ls$estadia@wikidata/"https://www.wikidata.org/wiki/Q1949022"%Baçaim/
  ls$estadia/Baçaim#@wikidata:Q1949022/
  ls$jesuita-cargo/Reitor em Baçaim%recteur/
  ls$tarefa/Retoca o latim do «De missione legatorum japonensium» de Valignano/
  ls$dehergne@archive/"https://archive.org/details/bhsi37/page/276/mode/1up"%741/
  ls$dehergne/741#@archive:276//obs="""
            Sande, Duarte (Edoardo) de (port.) P. 741
            Mong San-Tô Nlng-Houan (Pf.).
            N. 1547 (et non 1531) avant le 25 oct.« Vimarani », à Guimarães près de Braga (D'Elia I, 222)
            -E. juin 1562, Lisbonne à 15 ans et demi {Lus. 43, 181).
            Emb. comme supérieur des jésuites le 24 mars 1578 sur le S. Luis pour les Indes (W 196).,
            Baçaim, où recteur., quitte Goa le 1er mai 1585 et arr.
            Macao 31 juillet 15

### Show other types of entities by id in Kleio notation

In [19]:
from timelink.api.models import Entity

with tlnb.db.session() as session:
    # get Entity with id
    ent = tlnb.db.get_entity(id="deh-r1644-chekiang", session=session)
    print(ent.to_kleio())

geo1$Chekiang#Tche-kiang, hoje:Zhejiang, 浙江, @wikidata:Q16967 @dehergne:396/geo1
  atr$geoentity:name@wikidata/"https://www.wikidata.org/wiki/Q16967"#Tche-kiang, hoje:Zhejiang, 浙江, @wikidata:Q16967 @dehergne:396%Q16967/1644
  atr$geoentity:name@dehergne/"https://archive.org/details/bhsi37/page/n396/mode/1up"#Tche-kiang, hoje:Zhejiang, 浙江, @wikidata:Q16967 @dehergne:396%396/1644




### Get a DataFrame from attributes


#### Example: faculty, entry date, exit date and degree of people born in Coimbra

In [33]:
from timelink.pandas import entities_with_attribute


# Get the list of people with a certain value in a specific attribute
df = entities_with_attribute(
                    entity_type='person',
                    the_type='estadia',
                    the_value='Coimbra',
                    # name_like='%Aboim%',
                    more_attributes=['uc.entrada','uc.saida'],
                    db=tlnb.db,
                    sql_echo=False)
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 5 entries, deh-afonso-aires-ref1 to deh-jose-montanha-ii
Data columns (total 12 columns):
 #   Column                Non-Null Count  Dtype 
---  ------                --------------  ----- 
 0   estadia.attr_id       5 non-null      object
 1   estadia.type          5 non-null      object
 2   estadia               5 non-null      object
 3   estadia.date          5 non-null      object
 4   estadia.line          5 non-null      int64 
 5   estadia.level         5 non-null      int64 
 6   estadia.obs           5 non-null      object
 7   estadia.extra_info    5 non-null      object
 8   estadia.comment       5 non-null      object
 9   estadia.date.comment  1 non-null      object
 10  uc.entrada            0 non-null      object
 11  uc.saida              0 non-null      object
dtypes: int64(2), object(10)
memory usage: 520.0+ bytes


In [None]:
df.head(5)

id,name,sex,naturalidade,naturalidade.date,naturalidade.obs,faculdade,faculdade.date,faculdade.obs,uc.entrada,uc.entrada.date,uc.entrada.obs,uc.saida,uc.saida.date,uc.saida.obs
140349,António de Aboim,m,Coimbra,1566-12-20,,Cânones,1566-12-20,,1566-12-20,1566-12-20,,1574-07-24,1574-07-24,
140367,Manuel de Vargas de Aboim,m,Coimbra,20200211,,Cânones,20200211,,0000-00-00,20200211,,0000-00-00,20200211,


#### Get attributes of other entities

In [None]:
from timelink.pandas import entities_with_attribute


# Get the list of geoentities with a certain value in a specific attribute
df = entities_with_attribute(
                    entity_type='geoentity',
                    more_info=['name'],
                    the_type='activa',
                    the_value='sim',
                    more_cols=['residencia-missao'],
                    db=tlnb.db,
                    sql_echo=False)
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 165 entries, deh-r1644-anhai to deh-r1644-yunnan
Data columns (total 7 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   name                    165 non-null    object
 1   activa                  165 non-null    object
 2   activa.date             165 non-null    object
 3   activa.obs              4 non-null      object
 4   residencia-missao       38 non-null     object
 5   residencia-missao.date  38 non-null     object
 6   residencia-missao.obs   4 non-null      object
dtypes: object(7)
memory usage: 10.3+ KB


In [None]:
df.head(30)

id,name,activa,activa.date,activa.obs,residencia-missao,residencia-missao.date,residencia-missao.obs
deh-r1644-anhai,Anhai,sim,1634,,,,
deh-r1644-bankao,Bankao,sim,1636,,,,
deh-r1644-cantao,Cantão,sim,1555,,Jesuíta,1580.0,
deh-r1644-cantao,Cantão,sim,1555,,Franciscanos,0.0,
deh-r1644-cantao,Cantão,sim,1555,,Dominicanos,0.0,
deh-r1644-chala,Chala,sim,1610,,,,
deh-r1644-changchow-fou,Changchow,sim,1643,,,,
deh-r1644-changshu,Changshu,sim,1623,,Jesuíta,1635.0,"R 1635 «Cham Xo», cf AHSI 28 (1951) 311-312"
deh-r1644-chengting,Chengting,sim,1621,,,,
deh-r1644-chinkiang,Chinkiang,sim,1611,,,,




### Remove empty columns

In [None]:
df.dropna(how='all', axis=1, inplace=True)
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 165 entries, deh-r1644-anhai to deh-r1644-yunnan
Data columns (total 7 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   name                    165 non-null    object
 1   activa                  165 non-null    object
 2   activa.date             165 non-null    object
 3   activa.obs              4 non-null      object
 4   residencia-missao       38 non-null     object
 5   residencia-missao.date  38 non-null     object
 6   residencia-missao.obs   4 non-null      object
dtypes: object(7)
memory usage: 10.3+ KB


In [None]:
df.head(5)

id,name,activa,activa.date,activa.obs,residencia-missao,residencia-missao.date,residencia-missao.obs
deh-r1644-anhai,Anhai,sim,1634,,,,
deh-r1644-bankao,Bankao,sim,1636,,,,
deh-r1644-cantao,Cantão,sim,1555,,Jesuíta,1580.0,
deh-r1644-cantao,Cantão,sim,1555,,Franciscanos,0.0,
deh-r1644-cantao,Cantão,sim,1555,,Dominicanos,0.0,


In [None]:
ids = df.index.unique()
for entity_id in ids[:5]:  # avoid shadowing the built-in `id`
    ent = tlnb.db.get_entity(entity_id)
    print(ent.to_kleio())
    print()

geo3$deh-r1644-anhai/type=geoentity
  rel$function-in-act/geo3/deh-chre-1644/16440000
  atr$activa/sim/1634

geo3$deh-r1644-bankao/type=geoentity
  rel$function-in-act/geo3/deh-chre-1644/16440000
  atr$activa/sim/1636

geo2$deh-r1644-cantao/type=geoentity
  rel$function-in-act/geo2/deh-chre-1644/16440000
  atr$activa/sim/1555
  atr$residencia-missao/Jesuíta/1580
  atr$residencia-missao/Franciscanos/0000
  atr$residencia-missao/Dominicanos/0000
  geo3$deh-r1644-tungkun/type=geoentity
    rel$function-in-act/geo3/deh-chre-1644/16440000
  geo3$deh-r1644-quon-yao/type=geoentity
    rel$function-in-act/geo3/deh-chre-1644/16440000
    atr$activa/sim/1621
  geo3$deh-r1644-lampacao/type=geoentity
    rel$function-in-act/geo3/deh-chre-1644/16440000
    atr$activa/sim/1535

geo3$deh-r1644-chala/type=geoentity
  rel$function-in-act/geo3/deh-chre-1644/16440000
  atr$activa/sim/1610

geo2$deh-r1644-changchow-fou/type=geoentity
  rel$function-in-act/geo2/deh-chre-1644/16440000
  atr$activa/sim/1643



## Counting


### Attribute types

In [39]:
import pandas as pd

from sqlalchemy import func
from sqlalchemy import select


pd.set_option('display.max_rows', 500)

attr_table = tlnb.db.get_table('attributes')
tlnb.db.describe('attributes', show=True)
print()
stmt = select(
    attr_table.c.the_type,
    func.count().label('count'),
    func.count(func.distinct(attr_table.c.the_value)).label('distinct_value')
    ).group_by('the_type')
print(stmt)
print()

with tlnb.db.session() as session:
    # nml2 = session.query(Attribute.the_type,func.count().label('tot')).group_by(Attribute.the_type).all()
    nml = session.execute(stmt)
    attribute_df = pd.DataFrame(nml)

attribute_df

attributes (model_table)
id                   entities             VARCHAR    
class                entities             VARCHAR    
inside               entities             VARCHAR    {ForeignKey('entities.id')}
the_source           entities             VARCHAR    
the_order            entities             INTEGER    
the_level            entities             INTEGER    
the_line             entities             INTEGER    
groupname            entities             VARCHAR    
extra_info           entities             JSON       
updated              entities             DATETIME   
indexed              entities             DATETIME   
id                   attributes           VARCHAR    {ForeignKey('entities.id')}
entity               attributes           VARCHAR    {ForeignKey('entities.id')}
the_type             attributes           VARCHAR    
the_value            attributes           VARCHAR    
the_date             attributes           VARCHAR    
obs                  attribute

Unnamed: 0,the_type,count,distinct_value
0,activa,286,3
1,alternative-name,4,4
2,alternative-name@wikidata,3,3
3,baptizado,28,28
4,baptizado@wikidata,26,26
5,bibliografia,9,9
6,cargo,356,225
7,chegada,487,71
8,chegada@wikidata,420,61
9,dehergne,1464,1211



### Count attributes from an existing dataframe



In [35]:
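# create a column with the index values, which are the id numbers
df['id'] = df.index.values

col = 'jesuita-entrada'  # subtotal by this column

# `df` must contain the columns `col` and 'embarque'; otherwise a KeyError is raised.
# Named aggregation is used because a dict cannot hold the key 'embarque' twice
# (the second entry would silently replace the first).
df_totals = df.groupby(col).agg(
    id=('id', 'nunique'),
    embarque_min=('embarque', 'min'),
    embarque_max=('embarque', 'max'),
)

df_totals.sort_values('id', ascending=False).head(30)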

KeyError: 'jesuita-entrada'
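
The `KeyError` above indicates that the dataframe in memory has no `jesuita-entrada` column. As a self-contained sketch of the intended subtotal, the example below groups a small invented dataframe by one column and computes the number of distinct ids together with the earliest and latest value of another column, using pandas named aggregation. The column names mirror the cell above; the data are made up.

```python
import pandas as pd

# Invented data, indexed by person id as in the dataframes above
demo = pd.DataFrame(
    {
        "jesuita-entrada": ["Coimbra", "Coimbra", "Goa", "Coimbra"],
        "embarque": ["1578", "1585", "1602", "1585"],
    },
    index=pd.Index(["p1", "p2", "p3", "p2"], name="id"),
)

totals = (
    demo.reset_index()  # turn the index into an "id" column
    .groupby("jesuita-entrada")
    .agg(
        persons=("id", "nunique"),          # distinct ids per group
        first_embarque=("embarque", "min"),  # earliest value
        last_embarque=("embarque", "max"),   # latest value
    )
    .sort_values("persons", ascending=False)
)
print(totals)
```
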


### Counting directly in the database

When the attribute has many values and it is not necessary
to have all the people in memory, the count can be done directly in the database.
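
What a database-side count amounts to can be sketched with the standard-library `sqlite3` module. The table below is a simplified, hypothetical stand-in for the `attributes` table (only `entity`, `the_type`, `the_value` and `the_date` are kept) with invented rows; it does not claim to show how `attribute_values` is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attributes "
    "(entity TEXT, the_type TEXT, the_value TEXT, the_date TEXT)"
)
conn.executemany(
    "INSERT INTO attributes VALUES (?, ?, ?, ?)",
    [
        ("p1", "grau", "Bacharel", "1554-07-19"),
        ("p2", "grau", "Bacharel", "1600-01-01"),
        ("p2", "grau", "Doutor", "1610-05-02"),
        ("p3", "naturalidade", "Coimbra", "1600-01-01"),
    ],
)

# Count, earliest and latest date per value, computed by the database
rows = conn.execute(
    """
    SELECT the_value, COUNT(*) AS n, MIN(the_date), MAX(the_date)
    FROM attributes
    WHERE the_type = ?
    GROUP BY the_value
    ORDER BY n DESC, the_value
    """,
    ("grau",),
).fetchall()
for row in rows:
    print(row)
conn.close()
```

Only the aggregated rows travel back to the notebook, which is the point of counting in the database.
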




In [None]:
from timelink.pandas import attribute_values

df_totals = attribute_values('grau',db=tlnb.db)


In [None]:
df_totals.head(10)


value,count,date_min,date_max
Bacharel,170,1554-07-19,1912-08-03
Formatura,153,1574-07-24,1905-06-19
Licenciado,33,1574-06-03,1886-02-27
Bacharel em Artes,25,1574-03-14,1766-07-19
Doutor,11,1560-12-22,1887-11-27
Licenciado em Artes,5,1574-05-15,1738-06-17
Mestre,2,1710-10-05,1768-10-23


#### Filter by dates

Use a date filter to leave out cross-references with a zero date.

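> ERROR: this recipe is flagged as not working; the output shown below is the same as without the date filter.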

In [None]:
df_totals = attribute_values('grau',dates_between=('1535','1919'),db=tlnb.db)

In [None]:
df_totals.head(10)

value,count,date_min,date_max
Bacharel,170,1554-07-19,1912-08-03
Formatura,153,1574-07-24,1905-06-19
Licenciado,33,1574-06-03,1886-02-27
Bacharel em Artes,25,1574-03-14,1766-07-19
Doutor,11,1560-12-22,1887-11-27
Licenciado em Artes,5,1574-05-15,1738-06-17
Mestre,2,1710-10-05,1768-10-23
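
The idea of leaving out zero-dated cross-references can also be sketched in plain Python. The helper `in_year_range` is hypothetical and only compares the leading four-digit year of each date string; it makes no claim about how `dates_between` is evaluated by `attribute_values`. The sample dates are invented.

```python
def in_year_range(date: str, start: str, end: str) -> bool:
    """Hypothetical helper: True if the leading year of `date` is in [start, end]."""
    year = date[:4]
    # Zero dates such as "0" or "0000-00-00" never fall inside a real range
    return year.isdigit() and len(year) == 4 and start <= year <= end


dates = ["1566-12-20", "0000-00-00", "0", "1912-08-03", "1920-01-01"]
kept = [d for d in dates if in_year_range(d, "1535", "1919")]
print(kept)
```
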


## View records


### View a person


#### Person attributes in a dataframe, one line per attribute

In [3]:
import pandas as pd
from timelink.pandas import group_attributes as person_attributes

pd.set_option('display.max_rows',1000)

person_id = 'deh-duarte-de-sande'  # avoid shadowing the built-in `id`
pdf = person_attributes([person_id], db=tlnb.db)  # note: the id is passed in a list
pdf.info()
pdf[['the_date','the_type','the_value','attr_obs']].sort_values(['the_date','the_type'])

<class 'pandas.core.frame.DataFrame'>
Index: 44 entries, deh-duarte-de-sande to deh-duarte-de-sande
Data columns (total 4 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   the_type   44 non-null     object
 1   the_value  44 non-null     object
 2   the_date   44 non-null     object
 3   attr_obs   44 non-null     object
dtypes: object(4)
memory usage: 1.7+ KB


id,the_date,the_type,the_value,attr_obs
deh-duarte-de-sande,0,dehergne,741,"""""""\n Sande, Duarte (Edoardo) de (p..."
deh-duarte-de-sande,0,dehergne@archive,https://archive.org/details/bhsi37/page/276/mo...,"extra_info: {""value"": {""original"": ""741""}}"
deh-duarte-de-sande,0,estadia,Baçaim,"extra_info: {""value"": {""comment"": ""@wikidata:Q..."
deh-duarte-de-sande,0,estadia@wikidata,https://www.wikidata.org/wiki/Q1949022,"extra_info: {""value"": {""original"": ""Ba\u00e7ai..."
deh-duarte-de-sande,0,jesuita-cargo,Reitor em Baçaim,"extra_info: {""value"": {""original"": ""recteur""}}"
deh-duarte-de-sande,0,jesuita-estatuto,Padre,
deh-duarte-de-sande,0,nacionalidade,Portugal,
deh-duarte-de-sande,0,nome,Edoardo de Sande,
deh-duarte-de-sande,0,nome-chines,Mong San-Tö Ning-Houan,
deh-duarte-de-sande,0,tarefa,Retoca o latim do «De missione legatorum japon...,


#### Person attributes in a dataframe, attributes in columns

In [30]:
# Get the list of people with a certain value in a specific attribute

df = entities_with_attribute(
                    entity_type='person',
                    the_type='uc.entrada',  # we need a base attribute
                    more_info=['name','sex'],
                    more_cols=['instituta','matricula.faculdade','matricula.ano'],
                    filter_by=[140349],
                    db=tlnb.db,
                    sql_echo=False)
view_cols = ['name','matricula.ano.date','matricula.ano','matricula.ano.obs']
df[view_cols].sort_values('matricula.ano.date')

id,name,matricula.ano.date,matricula.ano,matricula.ano.obs
140349,António de Aboim,1571-07-20,Cânones.1571,20.07.1571
140349,António de Aboim,1573-10-07,Cânones.1573,07.10.1573
140349,António de Aboim,1574-07-24,Cânones.1574,24.07.1574
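
The two layouts in this section hold the same information: one row per attribute (long form) or one column per attribute type (wide form). A plain-Python sketch of the conversion follows, using a few of the values listed earlier for `deh-duarte-de-sande`; the function name `to_wide` is made up for this example and is not part of Timelink.

```python
from collections import defaultdict

# (id, the_type, the_value) rows, one per attribute (long form)
long_rows = [
    ("deh-duarte-de-sande", "nacionalidade", "Portugal"),
    ("deh-duarte-de-sande", "jesuita-estatuto", "Padre"),
    ("deh-duarte-de-sande", "nome", "Edoardo de Sande"),
]


def to_wide(rows):
    """Group attribute rows by entity id; each attribute type becomes a key."""
    wide = defaultdict(lambda: defaultdict(list))
    for entity_id, the_type, the_value in rows:
        # A type can occur more than once, so values are kept in a list
        wide[entity_id][the_type].append(the_value)
    return {eid: dict(attrs) for eid, attrs in wide.items()}


wide = to_wide(long_rows)
print(wide["deh-duarte-de-sande"]["nome"])
```

Keeping a list per attribute type reflects the fact that an entity can have several values of the same type, which is also why an id can appear in more than one row of the dataframes above.
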


#### Examine potential duplicates

In [15]:
from timelink.pandas import display_group_attributes
pd.set_option('display.max_rows',250)

no_show=['código-de-referência','data-do-registo','url','faculdade.ano','naturalidade.ano',
         'matricula-faculdade.ano','nome-apelido','nome-primeiro','nome-geografico.ano',
         'grau.ano','matricula-outra.ano','nome-geografico','instituta.ano']

dup_ids = ['234295','234710',]  # Alexandre Metelo de

display_group_attributes(dup_ids,
                             header_cols=['uc-entrada','naturalidade','faculdade','nome-pai'],
                             exclude_attributes=no_show,
                             sort_attributes=['date','type','value'],
                             cmap_name='Pastel1')

Unnamed: 0,id,uc-entrada,naturalidade,faculdade,nome-pai
0,234295,1704-11-07,Marialva,Cânones,
1,234710,1705-10-24,Marialva,Matemática,Manuel Cardoso Metelo


Unnamed: 0,date,id,type,value,attr_obs
0,1704-11-07,234295,faculdade,Cânones,Cânones
1,1704-11-07,234295,instituta,1704-11-07,07.11.1704 1704-11-07
2,1704-11-07,234295,naturalidade,Marialva,
3,1704-11-07,234295,nome,Alexandre Metelo de Sousa,
4,1704-11-07,234295,uc-entrada,1704-11-07,
5,1704-11-07,234295,uc-entrada.ano,1704,
6,1705-10-24,234710,faculdade,Matemática,Matemática
7,1705-10-24,234295,matricula-faculdade,Cânones,24.10.1705
8,1705-10-24,234710,matricula-faculdade,Matemática,24.10.1705
9,1705-10-24,234710,naturalidade,Marialva,


#### Kleio notation

See [Kleio notation](README_kleio.md)

#### Kleio notation directly from database

See [Kleio notation](README_kleio.md)

In [33]:
from timelink.mhk.models.person import Person

with tlnb.db.session() as session:

    p: Person = session.query(Person).order_by(Person.id).first()
    k = p.to_kleio()  # Kleio notation of the person, as a string
    print(k)


n$António Pinto Abadeço/m/id=140337/obs="""
      """

                  Id: 140337
                  Código de referência: PT/AUC/ELU/UC-AUC/B/001-001/A/000001

                  Nome        : António Pinto Abadeço
                  Data inicial: 1705-10-01
                  Data final  : 1710-10-01
                  Filiação: António Pinto

                  Naturalidade: Abrantes
                  Faculdade: Cânones

                  Matrícula(s): 01.10.1705
                  01.10.1706
                  01.01.1707
                  01.10.1708
                  01.10.1709
                  01.10.1710
                  Instituta:
                  Bacharel: 28.06.1709
                  Formatura:
                  Licenciado:
                  Doutor:

                  Outras informações:
              """
  """
  rel$function-in-act/n/António Pinto Abadeço/auc-alumni-A-140337-140771/20200211
  atr$código-de-referência/""PT/AUC/ELU/UC-AUC/B/001-001/A/000001""/2021-05-17
  atr$data-