> First time use: follow instructions in the README.md file in this directory.



# Recipes

Recipes for using Timelink with notebooks

[summary]


---
## Setup

### Basic setup

Create a `TimelinkNotebook` instance.

This will provide an interface to the 
various `Timelink` functions.



In [4]:
from timelink.notebooks import TimelinkNotebook

tlnb = TimelinkNotebook(db_type='postgres', db_name="tests")

### Changing the default values

When called with no parameters `TimelinkNotebook` 
assumes a certain directory layout containing the notebook
and from that layout autoconfigures various parameters.

The default layout:

* **project-directory**
    * **database** (database-related files)
    * **notebooks**  _(directory with the current notebook)_ 
    * **kleio**  (source files in kleio format)
    * **identifications** (files with identification information)

Based on this layout `TimelinkNotebook` assumes a few default values:

* **database name**
    * the name of the *project directory* is used as the **database name**, sanitized to produce a valid database name
      (hyphens and spaces are replaced with `_`).
* **database type**
    * the database type is assumed to be **sqlite**, and a **sqlite** database is created in
      *database/sqlite*
    * the `db_type` parameter can be used to set the database type to `postgres`.
* **Kleio server home**
    * the working directory of the _Kleio server_ is assumed to be the **project directory**
    * the _Kleio server_ is an application
      that processes transcriptions of historical sources in _Kleio_ notation and generates data in a format
      that can be imported into the database.
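
The defaults described above can be sketched in a few lines. This is only an illustration of the rules stated in the list, not Timelink's actual code: the function names `sanitize_db_name` and `default_settings` are hypothetical, the example path is made up, and the sanitizing rule implements only what the text states (hyphens and spaces become `_`).

```python
import re
from pathlib import PurePosixPath


def sanitize_db_name(project_dir_name: str) -> str:
    """Replace hyphens and spaces with underscores (hypothetical helper)."""
    return re.sub(r"[- ]", "_", project_dir_name)


def default_settings(notebook_dir: str) -> dict:
    """Derive defaults from the layout: the notebook lives in <project>/notebooks."""
    project_home = PurePosixPath(notebook_dir).parent
    return {
        "project_home": str(project_home),
        "db_name": sanitize_db_name(project_home.name),
        "db_type": "sqlite",
        "sqlite_dir": str(project_home / "database" / "sqlite"),
        "kleio_home": str(project_home),
    }


settings = default_settings("/home/user/my project-1/notebooks")
print(settings["db_name"])      # my_project_1
print(settings["sqlite_dir"])   # /home/user/my project-1/database/sqlite
```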

These default values can be changed with parameters when creating the `TimelinkNotebook`  instance:

```python
tlnb = TimelinkNotebook(
    project_name="my-project",
    project_home="~/projects/my-project",
    db_type="postgres",
    db_name="my_project_db",
    kleio_image="timelinkserver/kleio-server",
    kleio_version="12.0.0",
    postgres_image="postgres",
    postgres_version="latest",
    sqlite_dir="~/databases",
    sql_echo=True
)
```

#### Examining the current configuration

In [16]:

tlnb.print_info()


Project name: tutorial
Project home: /Users/jrc/develop/timelink-py/tests/timelink-home/projects/tutorial
Database type: postgres
Database name: tests
Kleio image: timelinkserver/kleio-server
Kleio version: latest
Kleio server token: t1jpCKgvoQYRk0mCvLmgm3L24ZP9yFrG
Kleio server URL: http://127.0.0.1:8088
Kleio server home: /Users/jrc/develop/timelink-py/tests/timelink-home/projects/tutorial
Postgres image: postgres
Postgres version: latest
Postgres user: postgres
Postgres password: IxqhakeloR
TimelinkNotebook(project_name=tutorial, project_home=/Users/jrc/develop/timelink-py/tests/timelink-home/projects/tutorial, db_type=postgres, db_name=tests, kleio_image=timelinkserver/kleio-server, kleio_version=latest, postgres_image=postgres, postgres_version=latest)


#### Listing available databases

In [17]:
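# avoid the name `pd`, which is commonly used as the pandas alias
postgres_dbs = tlnb.get_postgres_databases()
print(f"Postgres databases: {postgres_dbs}")
sqlite_dbs = tlnb.get_sqlite_databases()
print(f"Sqlite databases: {sqlite_dbs}")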


Postgres databases: ['tutorial', 'tests']
Sqlite databases: ['../database/sqlite/tutorial.sqlite']


### Database status (table row count)

In [18]:
tlnb.table_row_count_df()

| | table | count |
|---|---|---|
| 0 | entities | 19884 |
| 1 | syspar | 0 |
| 2 | syslog | 0 |
| 3 | kleiofiles | 9 |
| 4 | attributes | 15632 |
| 5 | relations | 2296 |
| 6 | acts | 68 |
| 7 | sources | 7 |
| 8 | persons | 1298 |
| 9 | objects | 204 |


## Dealing with kleio files

### List Kleio files available

In [19]:
kleio_files = tlnb.get_kleio_files()
kleio_files.head()

| | path | name | modified | status | translated | errors | warnings | import_status | import_errors | import_warnings | import_error_rpt | import_warning_rpt | imported | rpt_url | xml_url |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | kleio/auc-alunos.cli | auc-alunos.cli | 2024-02-11 08:59:51.335971+00:00 | V | 2024-02-11 08:59:00+00:00 | 0 | 0 | N | 0 | 0 | 0 | 0 | 0 | /rest/reports/kleio/auc-alunos.rpt | /rest/exports/kleio/auc-alunos.xml |
| 1 | kleio/b1685.cli | b1685.cli | 2024-02-11 08:58:43.856369+00:00 | W | 2024-02-11 08:58:00+00:00 | 0 | 1 | U | 0 | 0 | 0 | 0 | 0 | /rest/reports/kleio/b1685.rpt | /rest/exports/kleio/b1685.xml |
| 2 | kleio/dehergne-a.cli | dehergne-a.cli | 2024-02-11 08:58:46.638628+00:00 | V | 2024-02-11 08:58:00+00:00 | 0 | 0 | U | 0 | 0 | 0 | 0 | 0 | /rest/reports/kleio/dehergne-a.rpt | /rest/exports/kleio/dehergne-a.xml |
| 3 | kleio/dehergne-locations-1644.cli | dehergne-locations-1644.cli | 2024-02-11 08:58:46.691770+00:00 | W | 2024-02-11 08:58:00+00:00 | 0 | 1 | U | 0 | 0 | 0 | 0 | 0 | /rest/reports/kleio/dehergne-locations-1644.rpt | /rest/exports/kleio/dehergne-locations-1644.xml |


Show only translation and import status

In [20]:
kleio_files[["name","status","import_status"]]

| | name | status | import_status |
|---|---|---|---|
| 0 | auc-alunos.cli | V | N |
| 1 | b1685.cli | W | U |
| 2 | dehergne-a.cli | V | U |
| 3 | dehergne-locations-1644.cli | W | U |


Show translation and import errors

In [21]:
kleio_files[["name","errors","warnings","import_errors","import_warnings"]]

| | name | errors | warnings | import_errors | import_warnings |
|---|---|---|---|---|---|
| 0 | auc-alunos.cli | 0 | 0 | 0 | 0 |
| 1 | b1685.cli | 0 | 1 | 0 | 0 |
| 2 | dehergne-a.cli | 0 | 0 | 0 | 0 |
| 3 | dehergne-locations-1644.cli | 0 | 1 | 0 | 0 |


### Translation and Import reports

Translation reports.

Pass a dataframe with Kleio files and a row number to get the translation report

In [22]:
rpt=tlnb.get_translation_report(kleio_files,rows=1)
print(rpt)

KleioTranslator - server version 12.4 - build 567 2024-02-07 13:02:03
11-2-2024 8-58

Processing data file b1685.cli
-------------------------------------------
Generic Act translation module with geoentities (XML).
     Joaquim Ramos de Carvalho (joaquim@uc.pt) 
** New document: kleio
kleio translation started
Structure: gacto2.str
Prefix: 
Autorel: 
Translation count: 38
Obs: 
** Processing source fonte$baptismos 1685

Near lines: 4       bap$b1685.1/8/7/1685/?/manuel cordeiro

6: bap$b1685.1
22: bap$b1685.2
40: bap$b1685.3
59: bap$b1685.4
70: bap$b1685.5
79: bap$b1685.6
98: bap$b1685.7
113: bap$b1685.8
129: bap$b1685.9
144: bap$b1685.10
156: bap$b1685.11
171: bap$b1685.12
183: bap$b1685.13
201: bap$b1685.14
219: bap$b1685.15
235: bap$b1685.16
250: bap$b1685.17
266: bap$b1685.18
284: bap$b1685.19
301: bap$b1685.20
318: bap$b1685.21
334: bap$b1685.22
351: bap$b1685.23
368: bap$b1685.24
381: bap$b1685.25
396: bap$b1685.26
416: bap$b1685.27
431: bap$b1685.27b
452: bap$b1685.27c
466: bap

Or pass the report URL of a file (the `rpt_url` column)

In [23]:
kleio_files.iloc[0]

path                                kleio/auc-alunos.cli
name                                      auc-alunos.cli
modified                2024-02-11 08:59:51.335971+00:00
status                                                 V
translated                     2024-02-11 08:59:00+00:00
errors                                                 0
import_status                                          N
import_errors                                          0
import_error_rpt                                       0
imported                                               0
rpt_url               /rest/reports/kleio/auc-alunos.rpt
xml_url               /rest/exports/kleio/auc-alunos.xml
Name: 0, dtype: object

In [24]:
file = kleio_files.iloc[0].rpt_url

print(file)
rpt=tlnb.get_translation_report(file)
print(rpt)

/rest/reports/kleio/auc-alunos.rpt
KleioTranslator - server version 12.4 - build 567 2024-02-07 13:02:03
11-2-2024 8-58

Processing data file auc-alunos.cli
-------------------------------------------
Generic Act translation module with geoentities (XML).
     Joaquim Ramos de Carvalho (joaquim@uc.pt) 
** New document: kleio
kleio translation started
Structure: gacto2.str
Prefix: 
Autorel: 
Translation count: 7
Obs: 
** Processing source fonte$auc-alunos
30: lista$auc-alumni-A-140337-140771
*** End of File


Structure file: /usr/local/timelink/clio/src/gacto2.str
Structure processing report: /usr/local/timelink/clio/src/gacto2.srpt
Structure in JSON: /usr/local/timelink/clio/src/gacto2.str.json

Kleio file: /kleio-home/kleio/auc-alunos.cli
Original file: /kleio-home/kleio/auc-alunos.org
Previous version: /kleio-home/kleio/auc-alunos.old
Temp file with ids: /kleio-home/kleio/auc-alunos.ids
** - /kleio-home/kleio/auc-alunos.cli-renamed to- /kleio-home/kleio/auc-alunos.old
0  errors. 
Tra

Import report

In [30]:
rpt = tlnb.get_import_rpt(kleio_files,rows=2)
print(rpt)

ERROR: dehergne-a.cli 396 storing rel$class=relation/date=0/groupname=rel/id=deh-michel-alfonso-chen-ref4-rel8-175/destination=deh-guillaume-van-der-beken-irmao/level=5/line=396/undef=Jean/order=477/type=parentesco/value=Irmão
 (psycopg2.errors.ForeignKeyViolation) insert or update on table "relations" violates foreign key constraint "relations_destination_fkey"
DETAIL:  Key (destination)=(deh-guillaume-van-der-beken-irmao) is not present in table "entities".

[SQL: INSERT INTO relations (id, origin, destination, the_type, the_value, the_date, obs) VALUES (%(id)s, %(origin)s, %(destination)s, %(the_type)s, %(the_value)s, %(the_date)s, %(obs)s)]
[parameters: {'id': 'deh-michel-alfonso-chen-ref4-rel8-175', 'origin': 'deh-michel-alfonso-chen-ref4', 'destination': 'deh-guillaume-van-der-beken-irmao', 'the_type': 'parentesco', 'the_value': 'Irmão', 'the_date': '0', 'obs': None}]
(Background on this error at: https://sqlalche.me/e/20/gkpj)

ERROR: dehergne-a.cli 541 storing relation$class=re
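
The first error in the report above is a foreign key violation: the relation points to a destination id that is not present in `entities`. The following sketch reproduces the same kind of failure with the standard library `sqlite3` module instead of Postgres; the two-table schema and the ids are simplified assumptions and not the Timelink schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE relations ("
    " id TEXT PRIMARY KEY,"
    " origin TEXT REFERENCES entities(id),"
    " destination TEXT REFERENCES entities(id))"
)
conn.execute("INSERT INTO entities VALUES ('person-a')")

try:
    # 'person-b' was never stored in entities, so the insert is rejected
    conn.execute("INSERT INTO relations VALUES ('rel-1', 'person-a', 'person-b')")
    outcome = "stored"
except sqlite3.IntegrityError as exc:
    outcome = f"rejected: {exc}"

print(outcome)
```

Inserting the destination entity before the relation makes the same statement succeed.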


# Update database

Updates the translations of the sources and imports into the database those translated without errors.

In [26]:
import logging
logging.basicConfig(level=logging.INFO)

tlnb.update_from_sources()

INFO:root:Importing kleio/auc-alunos.cli
INFO:root:Importing kleio/dehergne-a.cli


Storing 11 postponed relations


INFO:root:Importing kleio/b1685.cli


Storing 2 postponed relations


INFO:root:Importing kleio/dehergne-locations-1644.cli


Show imported files status

In [28]:
imported_files_df = tlnb.get_import_status()
imported_files_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 25 columns):
 #   Column              Non-Null Count  Dtype              
---  ------              --------------  -----              
 0   path                4 non-null      object             
 1   name                4 non-null      object             
 2   size                4 non-null      int64              
 3   directory           4 non-null      object             
 4   modified            4 non-null      datetime64[ns, UTC]
 5   modified_iso        4 non-null      datetime64[ns, UTC]
 6   modified_string     4 non-null      object             
 7   qtime               4 non-null      datetime64[ns, UTC]
 8   qtime_string        4 non-null      object             
 9   source_url          4 non-null      object             
 10  status              4 non-null      object             
 11  translated          4 non-null      datetime64[ns, UTC]
 12  translated_string   4 non-null      obje

Check the import status of the translated files. The `import_status` column uses the following codes:
```python
I = "I"  # imported
E = "E"  # imported with errors
W = "W"  # imported with warnings, no errors
N = "N"  # not imported
U = "U"  # translation updated, needs to be reimported
```

In [35]:
imported_files_df[["import_status","import_errors","import_warnings","name","imported","path"]].sort_values("name")

| | import_status | import_errors | import_warnings | name | imported | path |
|---|---|---|---|---|---|---|
| 0 | I | 0 | 0 | auc-alunos.cli | 2024-02-12 08:26:20.556994 | kleio/auc-alunos.cli |
| 1 | I | 0 | 0 | b1685.cli | 2024-02-12 08:27:27.521256 | kleio/b1685.cli |
| 2 | E | 4 | 4 | dehergne-a.cli | 2024-02-12 08:27:05.313744 | kleio/dehergne-a.cli |
| 3 | I | 0 | 0 | dehergne-locations-1644.cli | 2024-02-12 08:27:48.311174 | kleio/dehergne-locations-1644.cli |
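
Using the status codes listed above, a short filter can pick the files that need attention. The rows below are copied from the output above; the helper name `needs_attention` and the choice of which codes count as needing attention are assumptions made for this sketch.

```python
# Import status codes, as listed in this section
IMPORT_STATUS = {
    "I": "imported",
    "E": "imported with error",
    "W": "imported with warnings no errors",
    "N": "not imported",
    "U": "translation updated need to reimport",
}

# (name, import_status) pairs copied from the table above
files = [
    ("auc-alunos.cli", "I"),
    ("b1685.cli", "I"),
    ("dehergne-a.cli", "E"),
    ("dehergne-locations-1644.cli", "I"),
]


def needs_attention(status: str) -> bool:
    """Assumption: anything not imported cleanly (E, N, U) should be reviewed."""
    return status in {"E", "N", "U"}


to_review = [name for name, status in files if needs_attention(status)]
for name, status in files:
    print(f"{name}: {IMPORT_STATUS[status]}")
print(to_review)  # ['dehergne-a.cli']
```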


## Todo

Do the translation and the import from a dataframe of files with a single function call:
* TimelinkNotebook.translate([files_df,rows=List[int],status="T"])
* TimelinkNotebook.import(files_df, rows=List[int])



# Getting data

### Search for people

#### Search persons by name

In [3]:
from timelink.api.models import Person
from sqlalchemy import select

show_only=30

with tlnb.db.session() as session:
    stmt = select(Person).where(Person.name.like('%vasconcelos')).order_by('name')
    print(stmt)
    persons = session.execute(stmt).scalars().all()
    print()
    for person in persons[:show_only]:
        print(person.id, person.name,person.sex)
        print(person.to_kleio())

SELECT persons.id, entities.id AS id_1, entities.class, entities.inside, entities.the_order, entities.the_level, entities.the_line, entities.groupname, entities.updated, entities.indexed, persons.name, persons.sex, persons.obs 
FROM entities JOIN persons ON entities.id = persons.id 
WHERE persons.name LIKE :name_1 ORDER BY persons.name

b1685.6-per4 joao da costa de vasconcelos m
pad$joao da costa de vasconcelos/m/id=b1685.6-per4
  rel$function-in-act/pad/b1685.6/16850827
  ls$residencia/soure/16850827
b1685.29-per4 joao da costa de vasconcelos m
pad$joao da costa de vasconcelos/m/id=b1685.29-per4
  rel$function-in-act/pad/b1685.29/16851213
b1685.27b-per3 joao da costa de vasconcelos m
pad$joao da costa de vasconcelos/m/id=b1685.27b-per3
  rel$function-in-act/pad/b1685.27b/16851125


In [45]:
[person.id for person in persons]

['b1685.6-per4', 'b1685.29-per4', 'b1685.27b-per3']
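
The pattern `'%vasconcelos'` in the query above matches names that end with `vasconcelos`, because `%` stands for any sequence of characters. A minimal sketch with the standard library `sqlite3` module shows the difference between a leading and a trailing `%`; the table and the second and third names are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (name TEXT)")
conn.executemany(
    "INSERT INTO persons VALUES (?)",
    [
        ("joao da costa de vasconcelos",),   # from the output above
        ("vasconcelos pereira",),            # hypothetical
        ("maria de vasconcelos e sousa",),   # hypothetical
    ],
)


def names_like(pattern: str) -> list:
    rows = conn.execute(
        "SELECT name FROM persons WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [name for (name,) in rows]


ends_with = names_like("%vasconcelos")     # only names ending in 'vasconcelos'
starts_with = names_like("vasconcelos%")   # only names starting with 'vasconcelos'
contains = names_like("%vasconcelos%")     # 'vasconcelos' anywhere in the name
print(ends_with, starts_with, len(contains))
```

Note that `LIKE` is case-sensitive in Postgres, while in SQLite it is case-insensitive for ASCII characters by default, so the same pattern can return different rows on the two database types.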

### Search other Entities

#### Get the entity classes in the database

In [37]:
from sqlalchemy import select, func
from timelink.api.models import Entity

models = tlnb.db.get_models_ids()
models

['attribute',
 'relation',
 'act',
 'source',
 'person',
 'good',
 'acusacoes',
 'caso',
 'object',
 'class',
 'geoentity',
 'rgeoentity',
 'robject',
 'rperson',
 'rentity',
 'aregister',
 'group_element',
 'entity']

#### Get columns of an entity type

In [38]:
table = tlnb.db.get_table("entity")
print(table.name)
list(table.columns)

entities


[Column('id', String(), table=<entities>, primary_key=True, nullable=False),
 Column('class', String(), table=<entities>, nullable=False),
 Column('inside', String(), ForeignKey('entities.id'), table=<entities>),
 Column('the_order', Integer(), table=<entities>),
 Column('the_level', Integer(), table=<entities>),
 Column('the_line', Integer(), table=<entities>),
 Column('groupname', String(), table=<entities>),
 Column('updated', DateTime(), table=<entities>, default=CallableColumnDefault(<function datetime.utcnow at 0x137c672e0>)),
 Column('indexed', DateTime(), table=<entities>)]

#### Search any entity type

## IMPROVE

In [39]:
from timelink.api.models import Entity
from sqlalchemy import select, func

Geoentity = tlnb.db.get_model("geoentity")
stmt = select(Geoentity).where(Geoentity.name.like('Hang%'))
print(stmt)
with tlnb.db.session() as session:

    result = session.execute(stmt).scalars().all()
    for row in result:
        print()
        print(row.the_type,row.name,row.id,row.obs)
        print(row.to_kleio())

SELECT geoentities.id, entities.id AS id_1, entities.class, entities.inside, entities.the_order, entities.the_level, entities.the_line, entities.groupname, entities.updated, entities.indexed, geoentities.name, geoentities.obs, geoentities.the_type 
FROM entities JOIN geoentities ON entities.id = geoentities.id 
WHERE geoentities.name LIKE :name_1

geo2 Hangchou deh-r1644-hangchou None
geo2$deh-r1644-hangchou/type=geoentity
  rel$function-in-act/geo2/deh-chre-1644/16440000
  atr$activa/sim/16110000
  geo3$deh-r1644-fuyang/type=geoentity
    atr$geoentity:name@wikidata/"https://www.wikidata.org/wiki/Q1011103"/16440000
    rel$function-in-act/geo3/deh-chre-1644/16440000
    atr$activa/sim/16420000
  geo3$deh-r1644-jenho/type=geoentity
    atr$activa/sim/16080000
    rel$function-in-act/geo3/deh-chre-1644/16440000
  atr$geoentity:name@wikidata/"https://www.wikidata.org/wiki/Q4970"/16440000
  atr$residencia-missao/Jesuíta/16110000


### Search Acts by date

## BUG relations do not appear inside the correct group.

This is a problem with `to_kleio()` in `Entity`. The output should list
the attributes and relations first, then the contained groups that are neither
attributes nor relations. Contained groups should be ordered by their order (`the_order`).

In [40]:
from timelink.api.models import Act
from sqlalchemy import select, func

stmt = select(Act).where(Act.the_date > '16850800',
                         Act.the_date < '16850900')

print(stmt)
with tlnb.db.session() as session:

    result = session.execute(stmt).scalars().all()
    for row in result:
        print()
        print(row.groupname,row.the_date,row.loc,row.obs)
        print(row.to_kleio())

SELECT acts.id, entities.id AS id_1, entities.class, entities.inside, entities.the_order, entities.the_level, entities.the_line, entities.groupname, entities.updated, entities.indexed, acts.the_type, acts.the_date, acts.loc, acts.ref, acts.obs 
FROM entities JOIN acts ON entities.id = acts.id 
WHERE acts.the_date > :the_date_1 AND acts.the_date < :the_date_2

bap 16850802 igreja de s. tiago None
bap$b1685.3/16850802/type=bap/ref=?/loc=igreja de s. tiago
  rel$parentesco/pai/ana/b1685.3-per1/16850802
  rel$parentesco/marido/maria de oliveira/b1685.3-per1-per3/16850802
  n$ana/f/id=b1685.3-per1
    rel$function-in-act/n/b1685.3/16850802
    mae$maria de oliveira/f/id=b1685.3-per1-per3
      ls$ec/c/16850802
      rel$function-in-act/mae/b1685.3/16850802
    pad$domingos simoes/m/id=b1685.3-per4
      ls$residencia/sobral/16850802
      rel$function-in-act/pad/b1685.3/16850802
      ls$profissao/padre/16850802
    mad$ana velho/f/id=b1685.3-per5
      mrmad$nomes desconhecido/m/id=b1685.3
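
The query above compares dates as strings. This works because the dates appear in a fixed-width `YYYYMMDD` form, so lexicographic order is the same as chronological order. A small sketch with dates taken from outputs in this notebook:

```python
# Dates in YYYYMMDD form, taken from outputs in this notebook
dates = ["16850430", "16850802", "16850827", "16850912", "16851213"]

# Same bounds as in the query above; '00' as the day sorts before any real day
lower, upper = "16850800", "16850900"

in_august_1685 = [d for d in dates if lower < d < upper]
print(in_august_1685)  # ['16850802', '16850827']
```

The comparison is only reliable when all values have the same width; a shorter value such as `1640` would be compared by its leading characters only.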

In [14]:
from timelink.api.models import Relation
from sqlalchemy import select

stmt = select(Relation).where(Relation.destination=='b1685.8-per1')
print(stmt)
with tlnb.db.session() as session:

    result = session.execute(stmt).scalars().all()
    for row in result:
        print()
        print(row.origin,row.the_type, row.the_value,row.destination)
        print(row.to_kleio())

SELECT relations.id, entities.id AS id_1, entities.class, entities.inside, entities.the_order, entities.the_level, entities.the_line, entities.groupname, entities.updated, entities.indexed, relations.origin, relations.destination, relations.the_type, relations.the_value, relations.the_date, relations.obs 
FROM entities JOIN relations ON relations.id = entities.id 
WHERE relations.destination = :destination_1


#### Get person by id

Show a single person or entity in Kleio notation

In [35]:
p = tlnb.db.get_person(person.id)
print(p.to_kleio())

n$Gonçalo de Aboim/m/id=140357/obs="""
      """

                  Id: 140357
                  Código de referência: PT/AUC/ELU/UC-AUC/B/001-001/A/000024

                  Nome        : Gonçalo de Aboim, vide Brito
                  Data inicial:
                  Data final  :
                  Filiação:

                  Naturalidade: Santarém
                  Faculdade:

                  Matrícula(s):
                  Instituta:
                  Bacharel:
                  Formatura:
                  Licenciado:
                  Doutor:

                  Outras informações:
              """
  """
  atr$código-de-referência/""PT/AUC/ELU/UC-AUC/B/001-001/A/000024""/2021-05-17
  atr$url/""https://pesquisa.auc.uc.pt/details?id=140357""/2021-05-17
  ls$uc.saida/0000-00-00/20200211
  ls$nome/Gonçalo de Aboim/20200211
  ls$nome.apelido/Aboim/20200211
  ls$naturalidade.ano/Santarém.0000/20200211
  rel$function-in-act/n/auc-alumni-A-140337-140771/20200211
  atr$data-do-registo/20

### Show other types of entities by id in Kleio notation

In [36]:
from timelink.api.models import Entity

ent = tlnb.db.get_entity("deh-r1644-chekiang")
print(ent.to_kleio())

geo1$deh-r1644-chekiang/type=geoentity
  rel$function-in-act/geo1/deh-chre-1644/16440000
  geo2$deh-r1644-hangchou/type=geoentity
    atr$activa/sim/16110000
    geo3$deh-r1644-fuyang/type=geoentity
      atr$activa/sim/16420000
      rel$function-in-act/geo3/deh-chre-1644/16440000
    rel$function-in-act/geo2/deh-chre-1644/16440000
    atr$residencia-missao/Jesuíta/16110000
    geo3$deh-r1644-jenho/type=geoentity
      rel$function-in-act/geo3/deh-chre-1644/16440000
      atr$activa/sim/16080000
  geo2$deh-r1644-chuchow/type=geoentity
    atr$activa/sim/16130000
    rel$function-in-act/geo2/deh-chre-1644/16440000
  geo2$deh-r1644-kashing/type=geoentity
    geo3$deh-r1644-kaosham/type=geoentity
      atr$activa/sim/1640
      rel$function-in-act/geo3/deh-chre-1644/16440000
    geo3$deh-r1644-tsungteh/type=geoentity
      atr$activa/1629/16440000
      rel$function-in-act/geo3/deh-chre-1644/16440000
      atr$alternative-name/Shimen/16440000
    geo3$deh-r1644-tungsiang/type=geoentity
 



### Get a dataframe from attributes

#### Example: faculty, entry date, exit date and degree of people born in Coimbra

In [37]:
from timelink.pandas import entities_with_attribute


# Get the list of people with a certain value in a specific attribute
df = entities_with_attribute(
                    entity_type='person',
                    the_type='naturalidade',
                    the_value='Coimbra',
                    more_info=['name','sex'],
                    name_like='% Aboim',
                    more_cols=['faculdade','uc.entrada','uc.saida'],
                    db=tlnb.db,
                    sql_echo=False)
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 2 entries, 140349 to 140367
Data columns (total 14 columns):
 #   Column             Non-Null Count  Dtype 
---  ------             --------------  ----- 
 0   name               2 non-null      object
 1   sex                2 non-null      object
 2   naturalidade       2 non-null      object
 3   naturalidade.date  2 non-null      object
 4   naturalidade.obs   0 non-null      object
 5   faculdade          2 non-null      object
 6   faculdade.date     2 non-null      object
 7   faculdade.obs      0 non-null      object
 8   uc.entrada         2 non-null      object
 9   uc.entrada.date    2 non-null      object
 10  uc.entrada.obs     0 non-null      object
 11  uc.saida           2 non-null      object
 12  uc.saida.date      2 non-null      object
 13  uc.saida.obs       0 non-null      object
dtypes: object(14)
memory usage: 348.0+ bytes


In [38]:
df.head(5)

| id | name | sex | naturalidade | naturalidade.date | naturalidade.obs | faculdade | faculdade.date | faculdade.obs | uc.entrada | uc.entrada.date | uc.entrada.obs | uc.saida | uc.saida.date | uc.saida.obs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 140349 | António de Aboim | m | Coimbra | 1566-12-20 | | Cânones | 1566-12-20 | | 1566-12-20 | 1566-12-20 | | 1574-07-24 | 1574-07-24 | |
| 140367 | Manuel de Vargas de Aboim | m | Coimbra | 20200211 | | Cânones | 20200211 | | 0000-00-00 | 20200211 | | 0000-00-00 | 20200211 | |


#### Get attributes of other entities

In [39]:
from timelink.pandas import entities_with_attribute


# Get the list of geoentities with a certain value in a specific attribute
df = entities_with_attribute(
                    entity_type='geoentity',
                    more_info=['name'],
                    the_type='activa',
                    the_value='sim',
                    more_cols=['residencia-missao'],
                    db=tlnb.db,
                    sql_echo=False)
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 165 entries, deh-r1644-anhai to deh-r1644-yunnan
Data columns (total 7 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   name                    165 non-null    object
 1   activa                  165 non-null    object
 2   activa.date             165 non-null    object
 3   activa.obs              4 non-null      object
 4   residencia-missao       38 non-null     object
 5   residencia-missao.date  38 non-null     object
 6   residencia-missao.obs   4 non-null      object
dtypes: object(7)
memory usage: 10.3+ KB


In [23]:
df.head(30)

| id | name | activa | activa.date | activa.obs | residencia-missao | residencia-missao.date | residencia-missao.obs |
|---|---|---|---|---|---|---|---|
| deh-r1644-anhai | Anhai | sim | 1634 | | | | |
| deh-r1644-bankao | Bankao | sim | 1636 | | | | |
| deh-r1644-cantao | Cantão | sim | 1555 | | Jesuíta | 1580.0 | |
| deh-r1644-cantao | Cantão | sim | 1555 | | Franciscanos | 0.0 | |
| deh-r1644-cantao | Cantão | sim | 1555 | | Dominicanos | 0.0 | |
| deh-r1644-chala | Chala | sim | 1610 | | | | |
| deh-r1644-changchow-fou | Changchow | sim | 1643 | | | | |
| deh-r1644-changshu | Changshu | sim | 1623 | | Jesuíta | 1635.0 | R 1635 «Cham Xo», cf AHSI 28 (1951) 311-312 |
| deh-r1644-chengting | Chengting | sim | 1621 | | | | |
| deh-r1644-chinkiang | Chinkiang | sim | 1611 | | | | |
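
Note that `deh-r1644-cantao` appears three times in the output above: there is one row per value of `residencia-missao`. Counting rows therefore overcounts entities; count unique ids instead. A sketch with the `(id, residencia-missao)` pairs copied from the rows above:

```python
# (id, residencia-missao) pairs copied from the first rows of the output above;
# None stands for an empty cell
rows = [
    ("deh-r1644-anhai", None),
    ("deh-r1644-bankao", None),
    ("deh-r1644-cantao", "Jesuíta"),
    ("deh-r1644-cantao", "Franciscanos"),
    ("deh-r1644-cantao", "Dominicanos"),
    ("deh-r1644-chala", None),
    ("deh-r1644-changchow-fou", None),
    ("deh-r1644-changshu", "Jesuíta"),
    ("deh-r1644-chengting", None),
    ("deh-r1644-chinkiang", None),
]

n_rows = len(rows)
n_entities = len({entity_id for entity_id, _ in rows})
print(n_rows, n_entities)  # 10 8
```

With a dataframe indexed by id, the equivalent count is `df.index.nunique()`.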




### Remove empty columns

In [24]:
df.dropna(how='all', axis=1, inplace=True)
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 165 entries, deh-r1644-anhai to deh-r1644-yunnan
Data columns (total 7 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   name                    165 non-null    object
 1   activa                  165 non-null    object
 2   activa.date             165 non-null    object
 3   activa.obs              4 non-null      object
 4   residencia-missao       38 non-null     object
 5   residencia-missao.date  38 non-null     object
 6   residencia-missao.obs   4 non-null      object
dtypes: object(7)
memory usage: 10.3+ KB


In [25]:
df.head(5)

| id | name | activa | activa.date | activa.obs | residencia-missao | residencia-missao.date | residencia-missao.obs |
|---|---|---|---|---|---|---|---|
| deh-r1644-anhai | Anhai | sim | 1634 | | | | |
| deh-r1644-bankao | Bankao | sim | 1636 | | | | |
| deh-r1644-cantao | Cantão | sim | 1555 | | Jesuíta | 1580.0 | |
| deh-r1644-cantao | Cantão | sim | 1555 | | Franciscanos | 0.0 | |
| deh-r1644-cantao | Cantão | sim | 1555 | | Dominicanos | 0.0 | |
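
`dropna(how='all', axis=1)` removes a column only when every value in it is missing, which is why `df.info()` above still lists all 7 columns: each has at least a few non-null values. The same rule written in plain Python, with made-up rows, as a sketch:

```python
# Made-up rows for illustration; None marks a missing value
rows = [
    {"name": "Anhai", "activa.obs": None, "empty_col": None},
    {"name": "Bankao", "activa.obs": "note", "empty_col": None},
]

# Keep a column if at least one row has a value in it
kept = [
    column
    for column in rows[0]
    if any(row[column] is not None for row in rows)
]
print(kept)  # ['name', 'activa.obs']
```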



## Counting

### Count attributes from an existing dataframe



In [28]:
from timelink.pandas import entities_with_attribute
import pandas as pd

# Get the list of people with a certain value in a specific attribute
df = entities_with_attribute(
                    entity_type='person',
                    the_type='naturalidade',
                    the_value='Coimbra',
                    more_info=['name','sex'],
                    more_cols=['faculdade','uc.entrada','uc.saida'],
                    db=tlnb.db,
                    sql_echo=False)
df.info()

# Create a column with the index values, which are the id numbers
df['id'] = df.index.values

col = 'faculdade' # subtotal by this column

# Use pandas groupby and count the unique ids in each group
df_totals = df.groupby(col).agg({'id':'nunique',
                                 'uc.entrada':'min',
                                 'uc.saida':'max'})

df_totals.sort_values('id',ascending=False).head(30)

<class 'pandas.core.frame.DataFrame'>
Index: 14 entries, 140349 to 140367
Data columns (total 14 columns):
 #   Column             Non-Null Count  Dtype 
---  ------             --------------  ----- 
 0   name               14 non-null     object
 1   sex                14 non-null     object
 2   naturalidade       14 non-null     object
 3   naturalidade.date  14 non-null     object
 4   naturalidade.obs   0 non-null      object
 5   faculdade          14 non-null     object
 6   faculdade.date     14 non-null     object
 7   faculdade.obs      1 non-null      object
 8   uc.entrada         14 non-null     object
 9   uc.entrada.date    14 non-null     object
 10  uc.entrada.obs     0 non-null      object
 11  uc.saida           14 non-null     object
 12  uc.saida.date      14 non-null     object
 13  uc.saida.obs       0 non-null      object
dtypes: object(14)
memory usage: 2.2+ KB


| faculdade | id | uc.entrada | uc.saida |
|---|---|---|---|
| Cânones | 7 | 0000-00-00 | 1767-07-27 |
| Medicina | 5 | 1670-10-01 | 1823-10-20 |
| Direito | 1 | 1748-10-19 | 1748-10-19 |
| Matemática | 1 | 1868-10-02 | 1872-10-03 |



### Counting directly in the database

When the attribute has many values and it is not
necessary to have all the people in memory,
count directly in the database.




In [5]:
from timelink.pandas import attribute_values

df_totals = attribute_values('grau',db=tlnb.db)


In [6]:
df_totals.head(10)


| value | count | date_min | date_max |
|---|---|---|---|
| Bacharel | 171 | 1554-07-19 | 1912-08-03 |
| Formatura | 153 | 1574-07-24 | 1905-06-19 |
| Licenciado | 33 | 1574-06-03 | 1886-02-27 |
| Bacharel em Artes | 25 | 1574-03-14 | 1766-07-19 |
| Doutor | 11 | 1560-12-22 | 1887-11-27 |
| Licenciado em Artes | 5 | 1574-05-15 | 1738-06-17 |
| Mestre | 2 | 1710-10-05 | 1768-10-23 |


#### Filter by dates

Use a date range to avoid cross-references with a zero date.

##ERROR

In [33]:
df_totals = attribute_values('grau',dates_between=('1772','1919'),db=tlnb.db)

In [34]:
df_totals.head(10)

| value | count | date_min | date_max |
|---|---|---|---|
| Bacharel | 53 | 1775-06-10 | 1912-08-03 |
| Formatura | 42 | 1776-05-11 | 1905-06-19 |
| Licenciado | 4 | 1800-07-26 | 1886-02-27 |
| Doutor | 3 | 1800-07-31 | 1887-11-27 |
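
The kind of summary shown above (a count plus the earliest and latest date per value, restricted to a date range) can be sketched in plain Python. This is only an illustration of the idea, not how `attribute_values` is implemented; the records are made up, and the `0000-00-00` entry plays the role of a cross-reference with a zero date.

```python
from collections import defaultdict

# Made-up (value, date) records; '0000-00-00' mimics a zero-date cross-reference
records = [
    ("Bacharel", "1775-06-10"),
    ("Bacharel", "0000-00-00"),
    ("Bacharel", "1912-08-03"),
    ("Doutor", "1800-07-31"),
    ("Doutor", "1560-12-22"),
]

start, end = "1772", "1919"  # same bounds as in the call above

by_value = defaultdict(list)
for value, date in records:
    # Naive string comparison: zero dates fall below the lower bound.
    # Caveat: a date such as '1919-05-01' sorts after the string '1919'.
    if start <= date <= end:
        by_value[value].append(date)

summary = {v: (len(d), min(d), max(d)) for v, d in by_value.items()}
print(summary)
```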


## View entities

### Entity attributes in a dataframe, one line per attribute

#### A specific entity

Use `filter_by` to restrict the result to a list of entities

In [12]:
# Get the list of people with a certain value in a specific attribute
from timelink.pandas import entities_with_attribute

df = entities_with_attribute(
                    entity_type='person',
                    the_type='uc.entrada',  # we need a base attribute
                    more_info=['name','sex'],
                    more_cols=['instituta','matricula.ano'],
                    filter_by=['140349'],
                    db=tlnb.db,
                    sql_echo=False)
df.sort_values('matricula.ano.date')

| id | name | sex | uc.entrada | uc.entrada.date | uc.entrada.obs | instituta | matricula.ano | matricula.ano.date | matricula.ano.obs |
|---|---|---|---|---|---|---|---|---|---|
| 140349 | António de Aboim | m | 1566-12-20 | 1566-12-20 | | | Cânones.1571 | 1571-07-20 | 20.07.1571 |
| 140349 | António de Aboim | m | 1566-12-20 | 1566-12-20 | | | Cânones.1573 | 1573-10-07 | 07.10.1573 |
| 140349 | António de Aboim | m | 1566-12-20 | 1566-12-20 | | | Cânones.1574 | 1574-07-24 | 24.07.1574 |


In [13]:
from timelink.pandas import entities_with_attribute

neighbors = entities_with_attribute(
                                entity_type="person",
                                more_info=["groupname","name","sex"],
                                the_type='residencia',
                                column_name="residencia",
                                the_value="soure",
                                more_cols=['profissao'],
                                db=tlnb.db)
neighbors.head(10)

Unnamed: 0_level_0,groupname,name,sex,residencia,residencia.date,residencia.obs,profissao,profissao.date,profissao.obs
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
b1685.27d-per1-per2,pai,antonio francisco,m,soure,16850430,,,,
b1685.27d-r1,pad,antonio da rocha,m,soure,16850430,,,,
b1685.3-per6,mrmad,nomes desconhecido,m,soure,16850802,,boticario,16850802.0,
b1685.6-per4,pad,joao da costa de vasconcelos,m,soure,16850827,,,,
b1685.10-per1-per2,pai,matias de carvalho,m,soure,16850912,,,,
b1685.14-per4,pad,manuel duarte de morais,m,soure,16850923,,padre,16850923.0,
b1685.14-per5,mad,maria rosado de carvalho,f,soure,16850923,,,,
b1685.21-per6,mrmad,manuel de tavora,m,soure,16851104,,,,
b1685.27c-per1-per2,pai,manuel rodrigues,m,soure,16851125,,,,
b1685.29-per6,pmad,paulo ribeiro cabral,m,soure,16851213,,,,
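
The date columns in the output above appear as compact `YYYYMMDD` values, and `profissao.date` shows up as a float (`16850802.0`), presumably because the column also holds missing values. A minimal sketch of a converter to ISO dates follows. The function name `kleio_date_to_iso` is hypothetical and not part of Timelink, and the `YYYYMMDD` layout is an assumption based only on the values shown here.

```python
from datetime import date
from typing import Optional


def kleio_date_to_iso(value) -> Optional[str]:
    """Convert a YYYYMMDD value (int, float or str) to 'YYYY-MM-DD'.

    Hypothetical helper, not part of Timelink. Returns None for missing
    values and for values that are not valid calendar dates
    (for example a zero-filled month or day).
    """
    try:
        # 16850802.0 -> 16850802 -> "16850802"; NaN and None fail here
        digits = str(int(float(value)))
    except (TypeError, ValueError, OverflowError):
        return None
    if len(digits) != 8:
        return None
    try:
        year, month, day = int(digits[:4]), int(digits[4:6]), int(digits[6:])
        return date(year, month, day).isoformat()
    except ValueError:
        # month or day out of range
        return None


print(kleio_date_to_iso(16850430))
print(kleio_date_to_iso(16850802.0))
```

With a DataFrame such as `neighbors`, the helper could be applied per column, for example `neighbors['residencia.date'].map(kleio_date_to_iso)`. Plain strings are returned rather than timestamps so that early dates such as 1566 stay representable.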


### Visualizar um grupo de entidades

---

#### Show the attributes of a group of entities

In [7]:
from timelink.api.models import Person
from sqlalchemy import select

select(Person).selected_columns.keys()

['id',
 'id_1',
 'class',
 'inside',
 'the_order',
 'the_level',
 'the_line',
 'groupname',
 'updated',
 'indexed',
 'name',
 'sex',
 'obs']

In [8]:
from timelink.pandas.entities_with_attribute import entities_with_attribute
from timelink.pandas.group_attributes import group_attributes

# Get the list of people with a certain value in a specific attribute
colleagues_df = entities_with_attribute(entity_type='person',
                        the_type='uc.entrada.ano',
                        the_value='1750',db=tlnb.db)

colleagues = colleagues_df.index.values
print(colleagues)

# show their attributes
colleagues_cv = group_attributes(colleagues,
    entity_type='person',
    more_info=['name'],
    include_attributes=['instituta','matricula','faculdade','grau','exame'],
    exclude_attributes=['data-do-registo'],
    sql_echo=True,
    db=tlnb.db)

colleagues_cv.sort_values(['the_date','the_type','the_value'])

['140387' '140449' '140568' '140652' '140442' '140488' '140461' '140477'
 '140589']
SELECT persons.id, persons.name, attributes.the_type, attributes.the_value, attributes.the_date, attributes.obs AS attr_obs 
FROM entities JOIN persons ON entities.id = persons.id LEFT OUTER JOIN attributes ON attributes.entity = persons.id 
WHERE persons.id IN (__[POSTCOMPILE_id_1]) AND attributes.the_type IN (__[POSTCOMPILE_the_type_1]) AND (attributes.the_type NOT IN (__[POSTCOMPILE_the_type_2]))


id,name,the_type,the_value,the_date,attr_obs
140449,Luís António de Abranches,faculdade,Cânones,1750-10-01,
140488,José Ribeiro de Abrantes,faculdade,Cânones,1750-10-01,
140568,António da Costa e Abreu,faculdade,Cânones,1750-10-01,
140387,António Lobo da Costa Borges e Abranches,faculdade,Cânones,1750-10-01,
140652,Caetano de Abreu,faculdade,Leis,1750-10-01,
140442,José Nunes Abranches,faculdade,Medicina,1750-10-01,
140449,Luís António de Abranches,instituta,1750-10-01,1750-10-01,01.10.1750 1750-10-01
140488,José Ribeiro de Abrantes,instituta,1750-10-01,1750-10-01,01.10.1750 1750-10-01
140568,António da Costa e Abreu,instituta,1750-10-01,1750-10-01,01.10.1750 1750-10-01
140387,António Lobo da Costa Borges e Abranches,instituta,1750-10-01,1750-10-01,01.10.1750 1750-10-01
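
The result above is in long format: one row per person and attribute. As an illustration of what the sort by `the_date`, `the_type`, `the_value` produces for each person, here is a small sketch in plain Python that groups such rows into per-person timelines. The function `attributes_by_person` is a hypothetical helper, not part of Timelink, and the sample rows are copied from the output above.

```python
from collections import defaultdict

# (id, name, the_type, the_value, the_date), copied from the output above
rows = [
    ("140449", "Luís António de Abranches", "instituta", "1750-10-01", "1750-10-01"),
    ("140652", "Caetano de Abreu", "faculdade", "Leis", "1750-10-01"),
    ("140449", "Luís António de Abranches", "faculdade", "Cânones", "1750-10-01"),
]


def attributes_by_person(rows):
    """Group long-format attribute rows by person.

    Rows are first ordered by (the_date, the_type, the_value), so each
    person's list reads as a chronological timeline.
    """
    grouped = defaultdict(list)
    ordered = sorted(rows, key=lambda r: (r[4], r[2], r[3]))
    for person_id, name, the_type, the_value, the_date in ordered:
        grouped[(person_id, name)].append((the_date, the_type, the_value))
    return dict(grouped)


for (person_id, name), attrs in attributes_by_person(rows).items():
    print(person_id, name)
    for the_date, the_type, the_value in attrs:
        print("   ", the_date, the_type, the_value)
```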


In [1]:
from timelink.notebooks import TimelinkNotebook

tlnb = TimelinkNotebook(db_type='postgres', db_name="tests")

In [9]:
from timelink.pandas.group_attributes import display_group_attributes
import pandas as pd

display_group_attributes(
    colleagues,
    entity_type='person',
    header_cols=['name','uc.entrada'],
    more_info=['name'],
    include_attributes=['instituta','matricula','faculdade','grau','exame'],
    sort_attributes=['date','type','value'],
    cmap_name='Pastel1',
    db=tlnb.db)

InvalidRequestError: Label name name is being renamed to an anonymous label due to disambiguation which is not supported right now.  Please use unique names for explicit labels.
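
The error message complains about a label called `name` being used more than once. In the call above, `'name'` is passed in both `header_cols` and `more_info`, which may be the source of the clash. This is an assumption, since the notebook does not confirm the cause. A minimal sketch of a check that could be run before the call follows; `shared_labels` is a hypothetical helper, not part of Timelink.

```python
def shared_labels(header_cols, more_info):
    """Return the names present in both lists, in header_cols order."""
    extra = set(more_info)
    return [col for col in header_cols if col in extra]


# Values taken from the failing call above
header_cols = ['name', 'uc.entrada']
more_info = ['name']

clashes = shared_labels(header_cols, more_info)
if clashes:
    # Assumption: dropping the repeated name from one of the lists
    # avoids the duplicate label; this has not been verified here.
    print(f"Names requested twice: {clashes}")
```

If the check reports a clash, one option to try is to keep `'name'` in only one of the two parameters.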

In [None]:
from timelink.pandas.group_attributes import display_group_attributes
import pandas as pd

pd.set_option('display.max_rows',250)

no_show=['código-de-referência','data-do-registo','url','faculdade.ano','naturalidade.ano',
         'matricula-faculdade.ano','nome-apelido','nome-primeiro','nome-geografico.ano',
         'grau.ano','matricula-outra.ano','nome-geografico','instituta.ano']

dup_ids = ['b1685.6-per4', 'b1685.29-per4', 'b1685.27b-per3']

display_group_attributes(
    dup_ids,
    header_cols=['uc-entrada','naturalidade','faculdade','nome-pai'],
    exclude_attributes=no_show,
    sort_attributes=['date','type','value'],
    cmap_name='Pastel1',
    db=tlnb.db)  # pass the database explicitly, as in the other calls

#### Notação Kleio

Ver [Kleio notation](README_kleio.md) [EN]

---

#### Kleio notation

See [Kleio notation](README_kleio.md)

#### Notação Kleio directamente da base de dados

Ver [Kleio notation](README_kleio.md) [EN]

---

#### Kleio notation directly from database

See [Kleio notation](README_kleio.md)

In [33]:
from timelink.mhk.models.person import Person

with tlnb.db.session() as session:
    # Fetch the first person (ordered by id) and print it in Kleio notation
    p: Person = session.query(Person).order_by(Person.id).first()
    print(p.to_kleio())


n$António Pinto Abadeço/m/id=140337/obs="""
      """

                  Id: 140337
                  Código de referência: PT/AUC/ELU/UC-AUC/B/001-001/A/000001

                  Nome        : António Pinto Abadeço
                  Data inicial: 1705-10-01
                  Data final  : 1710-10-01
                  Filiação: António Pinto

                  Naturalidade: Abrantes
                  Faculdade: Cânones

                  Matrícula(s): 01.10.1705
                  01.10.1706
                  01.01.1707
                  01.10.1708
                  01.10.1709
                  01.10.1710
                  Instituta:
                  Bacharel: 28.06.1709
                  Formatura:
                  Licenciado:
                  Doutor:

                  Outras informações:
              """
  """
  rel$function-in-act/n/António Pinto Abadeço/auc-alumni-A-140337-140771/20200211
  atr$código-de-referência/""PT/AUC/ELU/UC-AUC/B/001-001/A/000001""/2021-05-17
  atr$data-