## Part 1: Loading and querying RDF data

#### In the first part of this analysis, the script loads RDF data from a file, then queries and manipulates it to extract specific pieces of information (mainly the contents of a set of letters) and stores them in a CSV file and a pandas Series. The goal is to preprocess a complex RDF dataset and extract the data needed for further analysis in Part 2.

1. **Library installation and importation**
   - The script starts by installing `rdflib`, a Python library for working with RDF, a standard model for data interchange on the Web. The other imported modules and functions are used for manipulating and querying RDF data.

2. **Google Drive mount**
   - The script then mounts Google Drive to access the file stored there.

3. **Data loading and RDF graph creation**
   - It loads the data from the file `newVespasiano.nq`, stored on Google Drive, into an RDF graph `g`. The data is in the N-Quads format (`format='nquads'` in rdflib), a line-based, plain-text format for encoding an RDF dataset.

4. **Data exploration**
   - It performs some basic operations to explore and understand the RDF graph. For example, it calculates the number of triples in the graph, prints all the contexts (i.e., the named graphs within the graph), prints all quads (i.e., the triples with their associated named graph), and finds all unique objects in the quads.

5. **Retrieving specific information from the graph**
   - The script then finds and prints all properties and objects for a specific subject URI (Uniform Resource Identifier), specifically the maintext of a letter. Then, it gets the content of that specific letter.

6. **Creating a list of letter URIs**
   - The script creates a list of URIs for different letters and removes one specific URI from the list.

7. **Extracting letter contents**
   - It defines a function `func_uri` that, given an index and a list, retrieves the content of the letter corresponding to the URI at that index in `letters_id` and appends it to the list. Then it calls this function for each index in `letters_id`, storing the results in the list `letter_texts`.

8. **Data validation**
   - It checks that `letters_id` and `letter_texts` contain the same number of elements.

9. **Dataframe creation**
   - It creates a pandas DataFrame with two columns: 'id' and 'testo', and populates it with the URIs from `letters_id` and the letter contents from `letter_texts`.

10. **Data saving**
   - It saves the DataFrame to a CSV file named 'letters.csv'.

11. **Further data extraction and loading**
   - It extracts the descriptions of another specific letter and appends them to the list `doc`, then converts `doc` to a pandas Series and prints its first few elements.

In [1]:
!pip install rdflib

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [2]:
import pprint
import rdflib
from rdflib import URIRef, Literal, Namespace
from rdflib.namespace import XSD, RDFS, DCTERMS

In [3]:
# loading data from google drive
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [4]:
# create an empty Graph
data_path = "/content/drive/MyDrive/Colab Notebooks/dati/newVespasiano.nq"
#DateClassifier-TextClassification-BERT/newVespasiano.nq
g = rdflib.ConjunctiveGraph()
#result = g.parse("E:\\DHDK\\Tiziana\\vdbRdf\\vespasiano.nq", format='nquads')
result = g.parse(data_path, format='nquads')


In [5]:
## the number of triples (quadruples)
print(len(g))

30291


In [6]:
for n_graph in g.contexts():
    pprint.pprint(n_graph)

<Graph identifier=file:///content/drive/MyDrive/Colab%20Notebooks/dati/newVespasiano.nq (<class 'rdflib.graph.Graph'>)>


In [7]:
for quad in g.quads():
    pprint.pprint(quad)

Streaming output truncated to the last 5000 lines.
(rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/letter-35-analog'),
 rdflib.term.URIRef('http://purl.org/vocab/frbr/core#exemplar'),
 rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/t-498-a-f'),
 <Graph identifier=file:///content/drive/MyDrive/Colab%20Notebooks/dati/newVespasiano.nq (<class 'rdflib.graph.Graph'>)>)
(rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/letter-8-analog'),
 rdflib.term.URIRef('http://purl.org/vocab/frbr/core#exemplar'),
 rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/plut-90-sup-30-ff-14-15v'),
 <Graph identifier=file:///content/drive/MyDrive/Colab%20Notebooks/dati/newVespasiano.nq (<class 'rdflib.graph.Graph'>)>)
(rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-37-note-2'),
 rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'),
 rdflib.term.URIRef('http://purl.org/spar/doco/Footnote')

In [8]:
# iterate over all terms of the quads
for subj, pred, obj, context in g.quads():
    pprint.pprint(obj)

Streaming output truncated to the last 5000 lines.
rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/letter-24-expr')
rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/letters/2')
rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/letters/33')
rdflib.term.Literal('Termine di lessico specifico "Scriptura velox"', lang='it')
rdflib.term.Literal('Publisher Neri Pozza', lang='en')
rdflib.term.URIRef('http://purl.org/emmedi/hico/InterpretationAct')
rdflib.term.Literal('Philological note', lang='en')
rdflib.term.URIRef('http://purl.org/vocab/frbr/core#Work')
rdflib.term.Literal('Gray, William (d. 1478) ', lang='en')
rdflib.term.Literal('M. Vattasso', lang='en')
rdflib.term.Literal("Sarei venuto chostì, se non mi fussi suto decto non vi contentate bene vi si venga, per rispecto del morbo. L'Etica vostra <è> finita di tutto: volendo ve la mandi, ve la manderò. <Manc>hami alchuni exempli per finire e libri della <Badia di Fiesole>; e sono s

In [9]:
unique_objs = set()
for subj, pred, obj, context in g.quads():
    unique_objs.add(obj)

for obj in unique_objs:
    pprint.pprint(obj)

Streaming output truncated to the last 5000 lines.
rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-37-note-4')
rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-30-note-4')
rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/letter-30-sender-attribution')
rdflib.term.Literal('137-139')
rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/note-3ref-to-lanconelli-ed')
rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/michelini-fdm-ed-bulzoni-publisher-1986')
rdflib.term.Literal('"Maestà del Re vostro padre" ... "Sua Maestà"')
rdflib.term.Literal('Mediceo avanti il Principato, filza XXVIII, n. 701')
rdflib.term.Literal('Lettera 2')
rdflib.term.Literal('Basil Blackwell', lang='it')
rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/p26')
rdflib.term.URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-11-note-8')
rdflib.term.URIRef('http://vespasianodabisti

In [10]:
from rdflib import URIRef, Literal
# all the properties and objects having 'maintext' as subject
for s, p, o, c in g.quads():
    if s == URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-30-maintext'): # the URI representing maintext of letter
        print(p, o)

http://www.essepuntato.it/2008/12/pattern#contains http://vespasianodabisticciletters.unibo.it/tomasi-letter-30-maintext-p7-attested-name-d250e110
http://purl.org/vocab/frbr/core#embodiment http://vespasianodabisticciletters.unibo.it/tomasi-letter-30-maintext-html
http://www.essepuntato.it/2008/12/pattern#contains http://vespasianodabisticciletters.unibo.it/tomasi-letter-30-maintext-philippo-monsigniore-attested-name-d250e119
http://www.essepuntato.it/2008/12/pattern#contains http://vespasianodabisticciletters.unibo.it/tomasi-letter-30-maintext-p8-attested-name-d250e88
http://www.essepuntato.it/2008/12/pattern#contains http://vespasianodabisticciletters.unibo.it/tomasi-letter-30-l30-pn2
http://purl.org/spar/c4o/hasContent Adì 9 scripsi alla vostra illustrissima Signoria d'uno degno facto d'arme facto in Valachia: aretela <ha>vuta. Di nuove qui sono poche chose degne. Aspectasi la speditione di questa pace da Roma, che chome havete inteso l'ha prolunghata infino adì 15, e il simile farà

In [11]:
# testing for getting content(text) of a specific letter
for s, p, o, c in g.quads():
    if s == URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-30-maintext') \
    and p == URIRef('http://purl.org/spar/c4o/hasContent'): # the URI representing content of letter
        print(o)

Adì 9 scripsi alla vostra illustrissima Signoria d'uno degno facto d'arme facto in Valachia: aretela <ha>vuta. Di nuove qui sono poche chose degne. Aspectasi la speditione di questa pace da Roma, che chome havete inteso l'ha prolunghata infino adì 15, e il simile farà per torre tempo alla lega e avere meso a pratichare di nuovo cho' Cardinali. Pare abbi pocha voglia abbi conclusione, e se potrà etiam none farà nulla. Solo resta a che via vanno i vinitiani: che se hanno retifichato, chome ce ne sono lettere da Roma, non è molto buono segnio, perché di principio quando il Papa lo publichò, i principali di Vinegia ne feciono infra loro grande allegreza; di poi, chom'eglino vidono la lega non retifichare, cominciorono in publicho a biasimare e dire che il Papa haveva facto male a publicharla in questa forma. Sendosi mossi hora a retificare, sarebbe l'opposito a quello hanno detto, e non pare molto buono segnio di volere la pacie. Dubito che i manchamenti d'altri non gli faccino più gagliar

In [12]:
# testing with a shorter line of code:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-2-maintext'), URIRef('http://purl.org/spar/c4o/hasContent'), None )):
    print(o)

Vehementer me oblectant littere tue, Philippe dulcissime, non eo solum quod tanta facilitate sunt scripte, ut a prisco epistolarum modo nequaquam abhorreant, sed etiam quia cum eas video semper recursat in animo dulcissima recordatio amicitie nostre. Superioribus vero meis certiorem te reddidi Florentie neminem esse qui ad fragmenta scribat. Reperirentur vero scriptores ad volumina eo pacto quo exoptas, hoc est ut unumquodque latus quinquaginta lineas, versus vero singuli elementa septuaginta continerent. Pretium unius voluminis essent grossi sex. Littere scriptoris potiores tuis sunt, quarum formam misissem certe cum his litteris, nisi prohibuissent nonnulle occupationes mee. Decerne igitur quicquid tibi videtur, deinde de proposito tuo fac aliquid sentiam. Praeterea, ut intelligas me sedulo invigilare rebus tuis, certiorem te reddo habuisse me equidem summo labore Lactantium super Statium, optimum sane et emendatum. Forma scripture fusa est et velox. Scribe igitur quid sit agendum. D

In [13]:
# creating list of letters URIs
letters_id = []
for i in range(42):
    letters_id.append("http://vespasianodabisticciletters.unibo.it/tomasi-letter-" + str(i+1)+ "-maintext")

In [14]:
letters_id[41]

'http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-maintext'

In [15]:
# removing letter 16 (Giannozzo Manetti a Vespasiano. Napoli, 28 luglio 1457)
# because its text is not available.

letters_id.remove('http://vespasianodabisticciletters.unibo.it/tomasi-letter-16-maintext')
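
Building the list and dropping letter 16 can also be done in one step. This is only an equivalent sketch of cells 13 to 15; the names `TEMPLATE`, `SKIPPED`, and `letters_id_demo` are introduced here for illustration.

```python
TEMPLATE = "http://vespasianodabisticciletters.unibo.it/tomasi-letter-{}-maintext"
SKIPPED = {16}  # letter numbers to leave out

# Letters are numbered 1 to 42; skip the ones listed in SKIPPED
letters_id_demo = [TEMPLATE.format(n) for n in range(1, 43) if n not in SKIPPED]

print(len(letters_id_demo))
print(letters_id_demo[-1])
```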

In [16]:
# define a function that extracts the text of a letter and appends it to a list ('letter_texts')
def func_uri(j, text_lst):
    for s, p, o in g.triples((URIRef(letters_id[j]), URIRef('http://purl.org/spar/c4o/hasContent'), None )):
        text_lst.append(str(o))


letter_texts=[]
for i in range(len(letters_id)):
    func_uri(i,letter_texts)

In [17]:
# check that the list of letter ids and the list of letter texts have the same length:
len(letter_texts) == len(letters_id)

True

In [18]:
import pandas as pd

#Create empty DataFrame with specific column names & types
df_letters = pd.DataFrame({'id': pd.Series(dtype='str'),
                   'testo': pd.Series(dtype='str')})

In [19]:
# populating each column with its content from lists of 'letter_texts' and 'letters_id'
df_letters['testo'] = letter_texts
df_letters['id'] = letters_id

In [20]:
# take a look at the first two rows:
df_letters.head(2)

Unnamed: 0,id,testo
0,http://vespasianodabisticciletters.unibo.it/to...,"Vespasiano mio dolcissimo, le lettere le quali..."
1,http://vespasianodabisticciletters.unibo.it/to...,"Vehementer me oblectant littere tue, Philippe ..."


In [21]:
# creating a csv file for saving our dataframe
df_letters.to_csv('letters.csv', index=False)
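
The empty-then-fill pattern above can be shortened by passing the lists to the constructor. A minimal sketch with made-up rows (the URIs and texts below are placeholders, not data from the graph), written to an in-memory buffer instead of a file:

```python
import io
import pandas as pd

ids = ["http://example.org/letter-1-maintext", "http://example.org/letter-2-maintext"]
texts = ["Testo 1", "Testo 2"]

# Build both columns in one call
df_demo = pd.DataFrame({"id": ids, "testo": texts})

# Round-trip through CSV to confirm that nothing is lost
buffer = io.StringIO()
df_demo.to_csv(buffer, index=False)
buffer.seek(0)
df_back = pd.read_csv(buffer)
print(df_back)
```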

In [22]:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-17'), URIRef('http://purl.org/dc/terms/description'), None )):
    print(o)

Letter "17 - Vespasiano a Piero de' Medici. Firenze, 19 aprile 1458", published by Francesca Tomasi
Lettera "17 - Vespasiano a Piero de' Medici. Firenze, 19 aprile 1458", edita da Francesca Tomasi


In [23]:
# Loading the descriptions into a pandas Series
doc = []
# with open('dates.txt') as file:
#     for line in file:
#         doc.append(line)
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-17'), URIRef('http://purl.org/dc/terms/description'), None )):
        doc.append(str(o))

df = pd.Series(doc)

df.head()

0    Letter "17 - Vespasiano a Piero de' Medici. Fi...
1    Lettera "17 - Vespasiano a Piero de' Medici. F...
dtype: object

## Part 2: Extracting "dates" info from RDF dataset using Regular Expressions

### This script looks at a collection of letters, works out when each one was written, and builds a table containing each letter, its contents, and its date. It then saves this table for later use.

1. **Making a List of Letters**
- The script builds a list of the URIs of all the letters it needs to examine. Letter 16 is removed from the list because its text is not available.
2. **Collecting the Dates**
- Next, it makes a blank list where it'll put all the dates. For every letter in its list, it looks up that letter in the dataset, finds the part of the record that says when the letter was written, and adds this date to the list.
- After collecting all the dates, it puts them into a table (DataFrame) called df_dates_test.
3. **Sorting through the Dates**
- The descriptions contain either a year range ('yyyy-yyyy') or a single year ('yyyy'). Two regular expressions find each form, and the matches are stored separately.
- There's one date that it fixes manually, then it puts all the dates it found into a single list (dates_list).
4. **Updating the Dates**
- It then goes back to its table of dates (df_dates_test) and replaces the original description strings with the dates it just extracted.
5. **Putting it All Together**
- Now that it has a list of when each letter was written, it can add this information to the table it made earlier with all the letters and their contents. It sticks the two tables together side by side.
6. **Saving the Results**
- Finally, it takes this combined table of letters, contents, and dates, and saves it as a new file called 'letters-with-dates.csv'.
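
The two regular-expression passes described in step 3 can be folded into one pattern that accepts either form. This is a sketch, not the notebook's own code: `YEAR_OR_RANGE` and `extract_date` are hypothetical names, and the pattern assumes a plain hyphen between the two years of a range.

```python
import re

# Four digits, optionally followed by a hyphen and four more digits
YEAR_OR_RANGE = re.compile(r"\d{4}(?:-\d{4})?")

def extract_date(description):
    """Return the first year or year range found in the text, or None."""
    match = YEAR_OR_RANGE.search(description)
    return match.group() if match else None

single = extract_date('Lettera "1 - Donato Acciaiuoli a Vespasiano. [Montegufoni], 28 settembre 1446", edita da Francesca Tomasi')
ranged = extract_date('Edizione digitale online della lettera "42 - Vespasiano a Pierfilippo Pandolfini. Antella, [1480-1485]", edita da Francesca Tomasi')
nothing = extract_date("no date here")
print(single, ranged, nothing)
```

Returning `None` for a description without a year keeps one entry per letter, so the result stays aligned with the list of letters.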

In [24]:
letters_date_id = []
for i in range(42):
    letters_date_id.append("http://vespasianodabisticciletters.unibo.it/tomasi-letter-" + str(i+1))

In [25]:
# again removing the problematic letter 16
letters_date_id.remove('http://vespasianodabisticciletters.unibo.it/tomasi-letter-16')


In [26]:
# print an example of a letter URI; its 'http://purl.org/dc/terms/description' value will be used to extract the date
letters_date_id[5]

'http://vespasianodabisticciletters.unibo.it/tomasi-letter-6'

In [27]:
# For each letter, keep the last dcterms:description returned by the graph.
# (DataFrame.append was removed in pandas 2.0, so the column is built from a list.)
letters_dates_test = []
for letter_uri in letters_date_id:
    descriptions = [str(o) for s, p, o in g.triples((URIRef(letter_uri), URIRef('http://purl.org/dc/terms/description'), None))]
    letters_dates_test.append(descriptions[-1])
df_dates_test = pd.DataFrame({'years': letters_dates_test})
df_dates_test['years'] = df_dates_test['years'].astype('string')

In [28]:
df_dates_test['years'][0]

'Lettera "1 - Donato Acciaiuoli a Vespasiano. [Montegufoni], 28 settembre 1446", edita da Francesca Tomasi'

In [29]:
import re

extract_test1 = dict()
for ind,vals in dict(df_dates_test['years'].apply(lambda x:re.search(r'\d{4}[-]\d{4}',x,re.M|re.I))).items():
    if vals:
        extract_test1[ind]=vals.group()

extract_test2 = dict()
for ind,vals in dict(df_dates_test['years'].apply(lambda x:re.search(r'\d{4}',x,re.M|re.I))).items():
    if vals and (ind not in list(extract_test1.keys())):
        extract_test2[ind]=vals.group()

extract_test1

{39: '1444-1448', 40: '1480-1485'}

In [30]:
extract_test2;


In [31]:
extract_test2[24]='1463-1467'

In [32]:
letters_date_id[24]

'http://vespasianodabisticciletters.unibo.it/tomasi-letter-26'

In [33]:
extract_test2.update(extract_test1)
dates_list=list(extract_test2.values())
#print(dates_list)
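
`list(extract_test2.values())` returns the dates in dictionary insertion order, which matches the row order here only because the two ranges sit at the last two positions. A sketch that makes the alignment explicit by sorting on the row index, using made-up values (`single_years`, `ranges`, and `dates_in_row_order` are illustrative names):

```python
# Hypothetical extraction results keyed by row index
single_years = {0: "1446", 1: "1448", 3: "1449"}
ranges = {2: "1444-1448"}  # a range in the middle, not at the end

merged = {**single_years, **ranges}

# Sorting the keys restores row order regardless of insertion order
dates_in_row_order = [merged[i] for i in sorted(merged)]
print(dates_in_row_order)
```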


In [34]:
df_dates_test['years'] = dates_list
df_dates_test

Unnamed: 0,years
0,1446
1,1448
2,1448
3,1448
4,1448
5,1449
6,1450
7,1451
8,1453
9,1454


In [35]:
# concatenate the two dataframes so that each letter has its URI, text and year:
final_df_edited = pd.concat([df_letters, df_dates_test], axis=1, join='inner')
display(final_df_edited)

Unnamed: 0,id,testo,years
0,http://vespasianodabisticciletters.unibo.it/to...,"Vespasiano mio dolcissimo, le lettere le quali...",1446
1,http://vespasianodabisticciletters.unibo.it/to...,"Vehementer me oblectant littere tue, Philippe ...",1448
2,http://vespasianodabisticciletters.unibo.it/to...,"Paucis ante diebus respondidi litteris tuis, P...",1448
3,http://vespasianodabisticciletters.unibo.it/to...,"Suscepi nuper litteras tuas, quibus nescio qui...",1448
4,http://vespasianodabisticciletters.unibo.it/to...,Reverende in Christo pater et domine mi singul...,1448
5,http://vespasianodabisticciletters.unibo.it/to...,"Egli è più dì ch'io ricevetti una tua, alla qu...",1449
6,http://vespasianodabisticciletters.unibo.it/to...,"Dici non potest, Vespasiane suavissime, quantu...",1450
7,http://vespasianodabisticciletters.unibo.it/to...,Honorevole come fratello et caetera. Ne' dì pa...,1451
8,http://vespasianodabisticciletters.unibo.it/to...,Egregie tanquam frater carissime. Perché dite ...,1453
9,http://vespasianodabisticciletters.unibo.it/to...,"Egregie tanquam frater, ho ricevuta vostra let...",1454
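
`pd.concat(..., axis=1)` pairs rows by index label, not by position, and `join='inner'` keeps only the labels present in both frames. A small sketch with made-up frames shows the effect when one index is out of step:

```python
import pandas as pd

left = pd.DataFrame({"id": ["a", "b", "c"]}, index=[0, 1, 2])
right = pd.DataFrame({"years": ["1446", "1448"]}, index=[1, 2])

# Only index labels 1 and 2 exist in both frames, so row 0 is dropped
combined = pd.concat([left, right], axis=1, join="inner")
print(combined)
```

In the notebook both frames use the default 0 to 40 index, so the inner join keeps every row.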


In [36]:
# creating csv file
final_df_edited.to_csv('letters-with-dates.csv', index=False)

In [37]:
# http://vespasianodabisticciletters.unibo.it/letters/41
# http://vespasianodabisticciletters.unibo.it/attested-name
# http://purl.org/spar/c4o/hasContent


## Part 3: Enriching the DataFrame `extended_fianl` with additional information drawn from the RDF graph

### This part focuses on refining the uri_from column and adding a few new columns to the DataFrame.

Here is a summary of the objectives of this part:

1. **Refinement of Place Information:** The uri_from column, which stores the place information, is further refined. It was initially filled based on the uri_maintext column, but in this part, another approach is also used to update the uri_from values. This second approach uses the 'p-sender-letter' URIs instead of the main text URIs to find and update the corresponding place information.

2. **Addition of New Columns:** The DataFrame is enriched with additional columns that represent new information:
- uri_expr: The URI of the expressions associated with each letter.
- sender-att: The URI of the sender attributions associated with each expression.
- p-sender-letter: The URI of the entity generated by each sender attribution.

3. **Data Export:** The final DataFrame, now with enriched information, is exported to a CSV file named 'extended-data.csv'.

In [38]:
# pandas is already imported as pd, so the star import is not needed
json_path = "/content/drive/MyDrive/Colab Notebooks/dati/vdbRdf/vdbRdf.json"
data_json = pd.read_json(json_path)
df_data_json = pd.read_json(json_path)

In [39]:
# keep only the rows that have a bio:place value, and select that column
place_col = 'http://purl.org/vocab/bio/0.1/place'
df_2 = df_data_json[df_data_json[place_col].notna()][place_col]


In [40]:
df_3_dropped = df_2.drop_duplicates()
df_3_dropped.reset_index(drop=True, inplace=True)


In [41]:
place_lst=[]
for i in range(34):
    place_lst.append(df_3_dropped[i][0]['@id'])

In [42]:
place_lst[0]

'http://vespasianodabisticciletters.unibo.it/antella'

In [43]:
lst_attested=[]
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/antella'), URIRef('http://purl.org/dc/terms/isReferencedBy'), None )):
    lst_attested.append(str(o))

lst_attested

['http://vespasianodabisticciletters.unibo.it/tomasi-letter-33-datatio-antella-attested-name-d253e134',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-35-tergo-antella-attested-name-d255e136',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-datatio-antella-attested-name-d263e257']

In [44]:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-datatio-antella-attested-name-d263e257'), URIRef('http://purl.org/dc/terms/source'), None )):
    print(o)

http://vespasianodabisticciletters.unibo.it/letters/42


In [45]:
# In [18]: %timeit df.set_value('C', 'x', 10)
# 100000 loops, best of 3: 2.9 µs per loop
#
# In [20]: %timeit df['x']['C'] = 10
# 100000 loops, best of 3: 6.31 µs per loop
#
# In [81]: %timeit df.at['C', 'x'] = 10
# 100000 loops, best of 3: 9.2 µs per loop

In [46]:
extended_fianl=final_df_edited.copy()

In [47]:
extended_fianl.insert(3,"from", " ")


In [48]:
extended_fianl

Unnamed: 0,id,testo,years,from
0,http://vespasianodabisticciletters.unibo.it/to...,"Vespasiano mio dolcissimo, le lettere le quali...",1446,
1,http://vespasianodabisticciletters.unibo.it/to...,"Vehementer me oblectant littere tue, Philippe ...",1448,
2,http://vespasianodabisticciletters.unibo.it/to...,"Paucis ante diebus respondidi litteris tuis, P...",1448,
3,http://vespasianodabisticciletters.unibo.it/to...,"Suscepi nuper litteras tuas, quibus nescio qui...",1448,
4,http://vespasianodabisticciletters.unibo.it/to...,Reverende in Christo pater et domine mi singul...,1448,
5,http://vespasianodabisticciletters.unibo.it/to...,"Egli è più dì ch'io ricevetti una tua, alla qu...",1449,
6,http://vespasianodabisticciletters.unibo.it/to...,"Dici non potest, Vespasiane suavissime, quantu...",1450,
7,http://vespasianodabisticciletters.unibo.it/to...,Honorevole come fratello et caetera. Ne' dì pa...,1451,
8,http://vespasianodabisticciletters.unibo.it/to...,Egregie tanquam frater carissime. Perché dite ...,1453,
9,http://vespasianodabisticciletters.unibo.it/to...,"Egregie tanquam frater, ho ricevuta vostra let...",1454,


In [49]:
extended_fianl.index[extended_fianl['id']=='http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-maintext'].tolist()

[40]

In [50]:
extended_fianl['id'][40]

'http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-maintext'

In [51]:
from rdflib import URIRef, Literal
# all the properties and objects having 'maintext' as subject
for s, p, o, c in g.quads():
    if s == URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-maintext'): # the URI representing maintext of letter
        print(p, o)

http://purl.org/dc/terms/type http://vespasianodabisticciletters.unibo.it/maintext
http://www.essepuntato.it/2008/12/pattern#contains http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-l42-pn14
http://purl.org/spar/c4o/hasContent Pierfilippo, io ti scrissi d'agosto per altra mia, e quelo significharò ora a tte: ch'e pechati <che> fai non sono per ignoranza, ma per l'oposito. Ti ricordo che 'l vivere è dato da Dio per gratia, e la morte per pena di pechato. In tuti gl'uomini, come tu sai, questo naturale apetito di vivere, e per chonseguitare questo efetto è necesario farne ogni chosa. Io non so donde in te si nascha tuto l'oposito in tute l'operationi tua, che questo desiderio pare che in te non solo sia istinto, ma morto, e di questo ce ne sono pruove infinite. In prima, del continovo ocupato i<n> mile fantasie, nele quali il dì e la notte mai non pe<n>ssi a altro, e pieno di mile farnetichi di roba, di stati, in forma tale che non solo hai un'ora di tempo vacua, ma tu non h

In [52]:
for s, p, o, c in g.quads():
    if s == URIRef('http://vespasianodabisticciletters.unibo.it/letters/42'): # the URI representing the letter number
        print(p, o)

http://www.w3.org/2000/01/rdf-schema#label 
http://www.w3.org/2000/01/rdf-schema#label 42 - Vespasiano a Pierfilippo Pandolfini. Antella, [1480-1485]
http://purl.org/dc/terms/source http://vespasianodabisticciletters.unibo.it/asf-strozziane-serie-i-vol-137-ff-288-289v
http://www.w3.org/2000/01/rdf-schema#label 42 - Vespasiano a Pierfilippo Pandolfini. Antella, [1480-1485]
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://purl.org/vocab/frbr/core#Item
http://purl.org/dc/terms/identifier 42
http://purl.org/dc/terms/description Edizione digitale online della lettera "42 - Vespasiano a Pierfilippo Pandolfini. Antella, [1480-1485]", edita da Francesca Tomasi
http://purl.org/dc/terms/description Online digital edition of the letter "42 - Vespasiano a Pierfilippo Pandolfini. Antella, [1480-1485]", published by Francesca Tomasi
http://purl.org/vocab/frbr/core#partOf http://vespasianodabisticciletters.unibo.it/


In [53]:
for s, p, o, c in g.quads():
    if o == URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-maintext'): # the URI representing maintext
        print(s)

http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-expr


In [54]:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/letters/42'), URIRef('http://purl.org/dc/terms/identifier'), None )):
    print(o)

42


In [55]:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/letters/42'), URIRef('http://purl.org/dc/terms/source'), None )):
    print(o)

http://vespasianodabisticciletters.unibo.it/asf-strozziane-serie-i-vol-137-ff-288-289v


In [56]:
letter42=[]
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-expr'), URIRef('http://purl.org/vocab/frbr/core#part'), None )):
    letter42.append(str(o))

letter42[1]

'http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-maintext'

In [57]:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/letters/42'), None, None )):
    print(p, o)

http://purl.org/dc/terms/description Edizione digitale online della lettera "42 - Vespasiano a Pierfilippo Pandolfini. Antella, [1480-1485]", edita da Francesca Tomasi
http://purl.org/dc/terms/description Online digital edition of the letter "42 - Vespasiano a Pierfilippo Pandolfini. Antella, [1480-1485]", published by Francesca Tomasi
http://purl.org/dc/terms/identifier 42
http://purl.org/dc/terms/source http://vespasianodabisticciletters.unibo.it/asf-strozziane-serie-i-vol-137-ff-288-289v
http://purl.org/vocab/frbr/core#partOf http://vespasianodabisticciletters.unibo.it/
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://purl.org/vocab/frbr/core#Item
http://www.w3.org/2000/01/rdf-schema#label 
http://www.w3.org/2000/01/rdf-schema#label 42 - Vespasiano a Pierfilippo Pandolfini. Antella, [1480-1485]
http://www.w3.org/2000/01/rdf-schema#label 42 - Vespasiano a Pierfilippo Pandolfini. Antella, [1480-1485]


In [58]:
for s, p, o, c in g.quads():
    if s == URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-expr'): # the URI representing letter expr
        print(p,o)

http://www.w3.org/2000/01/rdf-schema#label Testo della lettera "42 - Vespasiano a Pierfilippo Pandolfini. Antella, [1480-1485]"
http://purl.org/vocab/frbr/core#complement http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-note-1
http://purl.org/vocab/frbr/core#part http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-maintext
http://purl.org/vocab/frbr/core#partOf http://dx.doi.org/10.6092/unibo/vespasianodabisticciletters
http://purl.org/vocab/frbr/core#part http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-tergo
http://purl.org/spar/cito/citesAsAuthority http://vespasianodabisticciletters.unibo.it/frizzi-ed-letter-42-print
http://purl.org/dc/terms/description Content of the letter "42 - Vespasiano a Pierfilippo Pandolfini. Antella, [1480-1485]", published by Francesca Tomasi
http://purl.org/vocab/frbr/core#complement http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-note-3
http://purl.org/dc/terms/description Testo della lettera "42 - Vespasian

In [59]:
letters_uri=[]
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/'), URIRef('http://purl.org/vocab/frbr/core#part'), None)):
    letters_uri.append(str(o))

letters_uri[0]

'http://vespasianodabisticciletters.unibo.it/letters/1.html'

In [60]:
len(letters_uri)

44

In [61]:
letters_uri_id = []
for i in range(42):
    letters_uri_id.append("http://vespasianodabisticciletters.unibo.it/letters/" + str(i+1))

letters_uri_id.remove('http://vespasianodabisticciletters.unibo.it/letters/16')

In [62]:
letters_identifier=[]
for i in range(41):
    for s, p, o in g.triples((URIRef(letters_uri_id[i]), URIRef('http://purl.org/dc/terms/identifier'), None )):
        letters_identifier.append(str(o))

In [63]:
extended_fianl.insert(1,"identifier",letters_identifier)

ValueError: ignored

In [64]:
len(letters_identifier)

40

In [65]:
len(letters_uri_id)


41

We can see that the two lists do not have the same length: the identifier of one of the letter URIs seems to be missing:

In [66]:
letters_identifier

['1',
 '2',
 '3',
 '4',
 '5',
 '6',
 '7',
 '8',
 '9',
 '10',
 '11',
 '12',
 '13',
 '14',
 '15',
 '17',
 '18',
 '19',
 '20',
 '21',
 '22',
 '23',
 '24',
 '25',
 '26',
 '27',
 '28',
 '29',
 '30',
 '31',
 '32',
 '33',
 '34',
 '35',
 '36',
 '37',
 '39',
 '40',
 '41',
 '42']

Identifier 38 is missing, so we insert it manually at the position currently occupied by '39', which shifts '39' and the following identifiers one position to the right:

In [67]:
letters_identifier.insert(letters_identifier.index('39'), '38')
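
The gap can also be located programmatically by comparing the numbers that were expected with those that were retrieved. A minimal sketch with a shortened, made-up list (`expected`, `retrieved`, and `missing` are illustrative names):

```python
expected = [str(n) for n in range(36, 43)]        # '36' to '42'
retrieved = ["36", "37", "39", "40", "41", "42"]  # '38' was not returned

# Identifiers that were expected but never came back from the graph
seen = set(retrieved)
missing = [n for n in expected if n not in seen]
print(missing)

# Re-insert each missing identifier at its expected position
for n in missing:
    retrieved.insert(expected.index(n), n)
print(retrieved)
```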

In [68]:
extended_fianl.insert(1,"identifier",letters_identifier)

In [69]:
extended_fianl

Unnamed: 0,id,identifier,testo,years,from
0,http://vespasianodabisticciletters.unibo.it/to...,1,"Vespasiano mio dolcissimo, le lettere le quali...",1446,
1,http://vespasianodabisticciletters.unibo.it/to...,2,"Vehementer me oblectant littere tue, Philippe ...",1448,
2,http://vespasianodabisticciletters.unibo.it/to...,3,"Paucis ante diebus respondidi litteris tuis, P...",1448,
3,http://vespasianodabisticciletters.unibo.it/to...,4,"Suscepi nuper litteras tuas, quibus nescio qui...",1448,
4,http://vespasianodabisticciletters.unibo.it/to...,5,Reverende in Christo pater et domine mi singul...,1448,
5,http://vespasianodabisticciletters.unibo.it/to...,6,"Egli è più dì ch'io ricevetti una tua, alla qu...",1449,
6,http://vespasianodabisticciletters.unibo.it/to...,7,"Dici non potest, Vespasiane suavissime, quantu...",1450,
7,http://vespasianodabisticciletters.unibo.it/to...,8,Honorevole come fratello et caetera. Ne' dì pa...,1451,
8,http://vespasianodabisticciletters.unibo.it/to...,9,Egregie tanquam frater carissime. Perché dite ...,1453,
9,http://vespasianodabisticciletters.unibo.it/to...,10,"Egregie tanquam frater, ho ricevuta vostra let...",1454,


In [70]:
extended_fianl.rename(columns = {'id':'uri_maintext', 'identifier':'letter_number', 'from':'uri_from'}, inplace = True)

In [71]:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/antella'), URIRef('http://www.w3.org/2000/01/rdf-schema#label'), None )):
        print(o)

Antella


In [72]:
letters_uri_id

['http://vespasianodabisticciletters.unibo.it/letters/1',
 'http://vespasianodabisticciletters.unibo.it/letters/2',
 'http://vespasianodabisticciletters.unibo.it/letters/3',
 'http://vespasianodabisticciletters.unibo.it/letters/4',
 'http://vespasianodabisticciletters.unibo.it/letters/5',
 'http://vespasianodabisticciletters.unibo.it/letters/6',
 'http://vespasianodabisticciletters.unibo.it/letters/7',
 'http://vespasianodabisticciletters.unibo.it/letters/8',
 'http://vespasianodabisticciletters.unibo.it/letters/9',
 'http://vespasianodabisticciletters.unibo.it/letters/10',
 'http://vespasianodabisticciletters.unibo.it/letters/11',
 'http://vespasianodabisticciletters.unibo.it/letters/12',
 'http://vespasianodabisticciletters.unibo.it/letters/13',
 'http://vespasianodabisticciletters.unibo.it/letters/14',
 'http://vespasianodabisticciletters.unibo.it/letters/15',
 'http://vespasianodabisticciletters.unibo.it/letters/17',
 'http://vespasianodabisticciletters.unibo.it/letters/18',
 'http

In [73]:
# <http://vespasianodabisticciletters.unibo.it/antella> <http://www.w3.org/2000/01/rdf-schema#label> "Antella"@it .

#

In [74]:
len(place_lst)

34

In [75]:

for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/antella'),
                          URIRef('http://purl.org/dc/terms/isReferencedBy'), None)):
    print(o)

for s, p, o in g.triples((
                         URIRef('http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-datatio-antella-attested-name-d263e257'),
                         URIRef('http://purl.org/dc/terms/source'), None)):
    print(o)

http://vespasianodabisticciletters.unibo.it/tomasi-letter-33-datatio-antella-attested-name-d253e134
http://vespasianodabisticciletters.unibo.it/tomasi-letter-35-tergo-antella-attested-name-d255e136
http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-datatio-antella-attested-name-d263e257
http://vespasianodabisticciletters.unibo.it/letters/42


In [76]:
extended_fianl.insert(2, "uri_id", letters_uri_id)

In [77]:
extended_fianl.head(1)

Unnamed: 0,uri_maintext,letter_number,uri_id,testo,years,uri_from
0,http://vespasianodabisticciletters.unibo.it/to...,1,http://vespasianodabisticciletters.unibo.it/le...,"Vespasiano mio dolcissimo, le lettere le quali...",1446,


In [78]:
# Exploratory cell: for every place, follow dcterms:isReferencedBy and then
# dcterms:source to reach the letters that mention it, and store the place
# label in 'uri_from' for each matching row.
for i in range(len(place_lst)):
    for s, p, o in g.triples((URIRef(place_lst[i]),
                              URIRef('http://purl.org/dc/terms/isReferencedBy'), None)):
        s2 = str(o)
        for s, p, o in g.triples((
                                 URIRef(s2),
                                 URIRef('http://purl.org/dc/terms/source'), None)):
            # Boolean mask of the rows whose letter URI equals this source.
            is_match = extended_fianl['uri_id'] == str(o)
            if is_match.any():
                print(str(o))
                for s, p, o in g.triples((URIRef(place_lst[i]), URIRef('http://www.w3.org/2000/01/rdf-schema#label'), None )):
                    print(o)
                    # A single .loc write avoids chained assignment; when a place
                    # has several labels, the last one printed is the one kept.
                    extended_fianl.loc[is_match, 'uri_from'] = str(o)
                print("TRUE")

http://vespasianodabisticciletters.unibo.it/letters/33
Antella
TRUE
http://vespasianodabisticciletters.unibo.it/letters/35
Antella
TRUE
http://vespasianodabisticciletters.unibo.it/letters/42
Antella
TRUE
http://vespasianodabisticciletters.unibo.it/letters/17
Firenze
Firenze
Florentiae
TRUE
http://vespasianodabisticciletters.unibo.it/letters/18
Firenze
Firenze
Florentiae
TRUE
http://vespasianodabisticciletters.unibo.it/letters/19
Firenze
Firenze
Florentiae
TRUE
http://vespasianodabisticciletters.unibo.it/letters/2
Firenze
Firenze
Florentiae
TRUE
http://vespasianodabisticciletters.unibo.it/letters/20
Firenze
Firenze
Florentiae
TRUE
http://vespasianodabisticciletters.unibo.it/letters/25
Firenze
Firenze
Florentiae
TRUE
http://vespasianodabisticciletters.unibo.it/letters/3
Firenze
Firenze
Florentiae
TRUE
http://vespasianodabisticciletters.unibo.it/letters/30
Firenze
Firenze
Florentiae
TRUE
http://vespasianodabisticciletters.unibo.it/letters/31
Firenze
Firenze
Florentiae
TRUE
http://vespasia
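
The output above shows three labels for Firenze (`Firenze`, `Firenze`, `Florentiae`), and the loop keeps whichever is visited last. A hedged sketch of choosing one label deliberately, by language preference, is given below. `pick_label` is a hypothetical helper, and the language tags for Firenze are assumptions: only `"Antella"@it` is confirmed by the data shown in this notebook. With rdflib, each label is a `Literal` whose `.language` attribute carries the tag, so the pairs could be built as `(str(o), o.language)`.

```python
def pick_label(labels, preferred=('it', 'en', 'la')):
    """Pick one label from (text, language) pairs by language preference."""
    for lang in preferred:
        for text, label_lang in labels:
            if label_lang == lang:
                return text
    # Fall back to the first label when no preferred language is present.
    return labels[0][0] if labels else None


# Hypothetical language tags, used only to illustrate the selection.
firenze_labels = [('Firenze', 'it'), ('Firenze', 'en'), ('Florentiae', 'la')]
chosen = pick_label(firenze_labels)
```

Making the preference explicit means the stored value no longer depends on the order in which the graph returns the labels.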

In [79]:
extended_fianl['uri_id']=='http://vespasianodabisticciletters.unibo.it/letters/42'

0     False
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10    False
11    False
12    False
13    False
14    False
15    False
16    False
17    False
18    False
19    False
20    False
21    False
22    False
23    False
24    False
25    False
26    False
27    False
28    False
29    False
30    False
31    False
32    False
33    False
34    False
35    False
36    False
37    False
38    False
39    False
40     True
Name: uri_id, dtype: bool

In [80]:
extended_fianl['uri_id'][1]

'http://vespasianodabisticciletters.unibo.it/letters/2'

In [81]:
test_df=extended_fianl.copy()

In [82]:
# for i in range(test_df):
test_df['uri_from'][1] = 'hello'
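
Several cells in this notebook write into the dataframe with the chained form `df['col'][rows] = value`, as in `test_df['uri_from'][1] = 'hello'` above. pandas documents this as chained assignment: it can emit a `SettingWithCopyWarning`, and under Copy-on-Write it does not update the original frame. The sketch below shows the single-step `.loc` form on a small stand-in dataframe. The values are illustrative, not taken from the dataset.

```python
import pandas as pd

# Small stand-in for the notebook's dataframe (illustrative values).
df = pd.DataFrame({
    'uri_id': ['letters/1', 'letters/2', 'letters/3'],
    'uri_from': ['', '', ''],
})

# One indexing operation: rows selected by a boolean mask, column by name.
mask = df['uri_id'] == 'letters/2'
df.loc[mask, 'uri_from'] = 'hello'

# Assignment by index label works the same way.
df.loc[0, 'uri_from'] = 'Montegufoni'
```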

In [83]:
len(test_df)

41

In [84]:
extended_fianl

Unnamed: 0,uri_maintext,letter_number,uri_id,testo,years,uri_from
0,http://vespasianodabisticciletters.unibo.it/to...,1,http://vespasianodabisticciletters.unibo.it/le...,"Vespasiano mio dolcissimo, le lettere le quali...",1446,
1,http://vespasianodabisticciletters.unibo.it/to...,2,http://vespasianodabisticciletters.unibo.it/le...,"Vehementer me oblectant littere tue, Philippe ...",1448,Florentiae
2,http://vespasianodabisticciletters.unibo.it/to...,3,http://vespasianodabisticciletters.unibo.it/le...,"Paucis ante diebus respondidi litteris tuis, P...",1448,Florentiae
3,http://vespasianodabisticciletters.unibo.it/to...,4,http://vespasianodabisticciletters.unibo.it/le...,"Suscepi nuper litteras tuas, quibus nescio qui...",1448,Florentiae
4,http://vespasianodabisticciletters.unibo.it/to...,5,http://vespasianodabisticciletters.unibo.it/le...,Reverende in Christo pater et domine mi singul...,1448,Florentiae
5,http://vespasianodabisticciletters.unibo.it/to...,6,http://vespasianodabisticciletters.unibo.it/le...,"Egli è più dì ch'io ricevetti una tua, alla qu...",1449,
6,http://vespasianodabisticciletters.unibo.it/to...,7,http://vespasianodabisticciletters.unibo.it/le...,"Dici non potest, Vespasiane suavissime, quantu...",1450,Vacciano
7,http://vespasianodabisticciletters.unibo.it/to...,8,http://vespasianodabisticciletters.unibo.it/le...,Honorevole come fratello et caetera. Ne' dì pa...,1451,Napoli
8,http://vespasianodabisticciletters.unibo.it/to...,9,http://vespasianodabisticciletters.unibo.it/le...,Egregie tanquam frater carissime. Perché dite ...,1453,
9,http://vespasianodabisticciletters.unibo.it/to...,10,http://vespasianodabisticciletters.unibo.it/le...,"Egregie tanquam frater, ho ricevuta vostra let...",1454,Bologna


In [85]:
extended_fianl['uri_id'][0]

'http://vespasianodabisticciletters.unibo.it/letters/1'

In [86]:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/montegufoni'), URIRef('http://www.w3.org/2000/01/rdf-schema#label'), None )):
                    print(o)

Montegufoni


In [87]:
place_lst

['http://vespasianodabisticciletters.unibo.it/antella',
 'http://vespasianodabisticciletters.unibo.it/firenze',
 'http://vespasianodabisticciletters.unibo.it/roma',
 'http://vespasianodabisticciletters.unibo.it/citta-del-vaticano',
 'http://vespasianodabisticciletters.unibo.it/spoleto',
 'http://vespasianodabisticciletters.unibo.it/napoli',
 'http://vespasianodabisticciletters.unibo.it/torino',
 'http://vespasianodabisticciletters.unibo.it/milano',
 'http://vespasianodabisticciletters.unibo.it/poppi',
 'http://vespasianodabisticciletters.unibo.it/los-altos-hills',
 'http://vespasianodabisticciletters.unibo.it/venezia',
 'http://vespasianodabisticciletters.unibo.it/montegufoni',
 'http://vespasianodabisticciletters.unibo.it/bari',
 'http://vespasianodabisticciletters.unibo.it/padova',
 'http://vespasianodabisticciletters.unibo.it/forli',
 'http://vespasianodabisticciletters.unibo.it/verona',
 'http://vespasianodabisticciletters.unibo.it/londra',
 'http://vespasianodabisticciletters.unib

In [88]:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/montegufoni'),
                          URIRef('http://purl.org/dc/terms/isReferencedBy'), None)):
    # temp_lst.append(str(o))
    s2 = str(o)
    print(s2)
    print("a")
    for s, p, o in g.triples((
                             URIRef(s2),
                             URIRef('http://purl.org/dc/terms/source'), None)):
        # if str(o) == 'http://vespasianodabisticciletters.unibo.it/letters/42':
        print("b")
        if extended_fianl['uri_id'].where(extended_fianl['uri_id']==str(o)).all():
            print(str(o))
            # which_index=extended_fianl.index[extended_fianl['uri_id']=='http://vespasianodabisticciletters.unibo.it/letters/42'].tolist()
            which_index=extended_fianl.index[extended_fianl['uri_id']==str(o)].tolist()
            for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/montegufoni'), URIRef('http://www.w3.org/2000/01/rdf-schema#label'), None )):
                print(o)
                # extended_fianl.insert(which_index[0],'uri_from', str(o))
                # for i in range(len(extended_fianl)):
                #     extended_fianl['uri_from'][i] = str(o)
                # extended_fianl['uri_from'][which_index]=str(o)
            which_index
            print("TRUE")
        # else:
        #     print("ffffff")
            # print("which index: ", which_index)

http://vespasianodabisticciletters.unibo.it/tomasi-letter-41-datatio-montegufoni-attested-name-d262e166
a
b
http://vespasianodabisticciletters.unibo.it/letters/41
Montegufoni
TRUE


In [89]:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/montegufoni'), URIRef('http://www.w3.org/2000/01/rdf-schema#label'), None )):
                print(o)

Montegufoni


In [90]:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/montegufoni'),
                          URIRef('http://purl.org/dc/terms/isReferencedBy'), None)):
    print(o)

http://vespasianodabisticciletters.unibo.it/tomasi-letter-41-datatio-montegufoni-attested-name-d262e166


In [91]:
# <http://vespasianodabisticciletters.unibo.it/letter-42-sender-attribution> <http://purl.org/emmedi/hico/isExtractedFrom> <http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-expr>

In [92]:
# <http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-expr> <http://purl.org/vocab/frbr/core#part> <http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-maintext>

In [93]:
# <http://vespasianodabisticciletters.unibo.it/p0-sender-letter-42-1480-1485> <http://www.w3.org/ns/prov#wasGeneratedBy> <http://vespasianodabisticciletters.unibo.it/letter-42-sender-attribution>

In [94]:
# <http://vespasianodabisticciletters.unibo.it/p0-sender-letter-42-1480-1485> <http://purl.org/vocab/bio/0.1/place> <http://vespasianodabisticciletters.unibo.it/antella>

In [95]:
for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/montegufoni'),
                          URIRef('http://purl.org/vocab/bio/0.1/place'), None)):
    s2 = str(o)
    print(s2)
    print("a")
    for s, p, o in g.triples((
                             URIRef(s2),
                             URIRef('http://purl.org/dc/terms/source'), None)):
        print("b")
        if extended_fianl['uri_id'].where(extended_fianl['uri_id']==str(o)).all():
            print(str(o))
            which_index=extended_fianl.index[extended_fianl['uri_id']==str(o)].tolist()
            for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/montegufoni'), URIRef('http://www.w3.org/2000/01/rdf-schema#label'), None )):
                print(o)
                # extended_fianl.insert(which_index[0],'uri_from', str(o))
                # for i in range(len(extended_fianl)):
                #     extended_fianl['uri_from'][i] = str(o)
                # extended_fianl['uri_from'][which_index]=str(o)
            which_index
            print("TRUE")

In [96]:
for s, p, o in g.triples((None,
                          URIRef('http://purl.org/vocab/bio/0.1/place'),
                          URIRef('http://vespasianodabisticciletters.unibo.it/antella')
                          )):
    print(str(s))


http://vespasianodabisticciletters.unibo.it/casprini-ed-edizioni-crc-publisher-1993
http://vespasianodabisticciletters.unibo.it/p0-sender-letter-33-1479
http://vespasianodabisticciletters.unibo.it/p0-sender-letter-34-1479
http://vespasianodabisticciletters.unibo.it/p0-sender-letter-38-1489
http://vespasianodabisticciletters.unibo.it/p0-sender-letter-40-1497
http://vespasianodabisticciletters.unibo.it/p0-sender-letter-42-1480-1485
http://vespasianodabisticciletters.unibo.it/p76-sender-letter-35-1480
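
Most of the subjects listed above encode a letter number in the URI itself (`...-sender-letter-33-1479`), while the publisher entry does not. As an alternative to walking the graph, which is what the notebook does next, a rough sketch of reading the number straight from the URI follows. `parse_sender_uri` is a hypothetical helper, the meaning of the leading `pN` token is not stated in the notebook, and the pattern is inferred only from the URIs printed here.

```python
import re

# Pattern inferred from the URIs printed above; it is an assumption, not a spec.
SENDER_RE = re.compile(r'/(p\d+)-sender-letter-(\d+)-(\d{4}(?:-\d{4})?)$')


def parse_sender_uri(uri):
    """Split a '...-sender-letter-N-YYYY' URI into its parts, or return None."""
    match = SENDER_RE.search(uri)
    if match is None:
        return None
    prefix, letter_number, dates = match.groups()
    return {'prefix': prefix, 'letter_number': letter_number, 'dates': dates}


base = 'http://vespasianodabisticciletters.unibo.it/'
single_year = parse_sender_uri(base + 'p0-sender-letter-33-1479')
year_range = parse_sender_uri(base + 'p0-sender-letter-42-1480-1485')
not_a_sender = parse_sender_uri(base + 'casprini-ed-edizioni-crc-publisher-1993')
```

Parsing URIs is brittle compared with following the triples, so it is best treated as a cross-check on the graph-based result.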


In [97]:
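# NOTE: temp_lst, new_s and new_s2 hold plain str values. rdflib compares terms
# by type, so the patterns below (not wrapped in URIRef) most likely match
# nothing, which is consistent with this cell printing no output.
temp_lst=[]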
for i in range(len(place_lst)):
    for s, p, o in g.triples((None,
                              URIRef('http://purl.org/vocab/bio/0.1/place'),
                              URIRef(place_lst[i])
                              )):
        temp_lst.append(str(s))
        for t in range(len(temp_lst)):

            for s, p, o in g.triples((temp_lst[t],
                              URIRef('http://www.w3.org/ns/prov#wasGeneratedBy'),
                              None
                              )):
                new_s = str(o)
                for s, p, o in g.triples((new_s,
                              URIRef('http://purl.org/emmedi/hico/isExtractedFrom'),
                              None
                              )):
                    new_s2 = str(o)
                    for s, p, o in g.triples((new_s2,
                              URIRef('http://purl.org/vocab/frbr/core#part'),
                              None
                              )):
                        if extended_fianl['uri_maintext'].where(extended_fianl['uri_maintext']==str(o)).any():
                            print(str(o))
                            which_index=extended_fianl.index[extended_fianl['uri_maintext']==str(o)].tolist()
                            for s, p, o in g.triples((URIRef(place_lst[i]), URIRef('http://www.w3.org/2000/01/rdf-schema#label'), None )):
                                print(o)
                                # extended_fianl.insert(which_index[0],'uri_from', str(o))
                                # for i in range(len(extended_fianl)):
                                #     extended_fianl['uri_from'][i] = str(o)
                                extended_fianl['uri_from'][which_index]=str(o)
                            # which_index
                            print("TRUE")


In [98]:
extended_fianl

Unnamed: 0,uri_maintext,letter_number,uri_id,testo,years,uri_from
0,http://vespasianodabisticciletters.unibo.it/to...,1,http://vespasianodabisticciletters.unibo.it/le...,"Vespasiano mio dolcissimo, le lettere le quali...",1446,
1,http://vespasianodabisticciletters.unibo.it/to...,2,http://vespasianodabisticciletters.unibo.it/le...,"Vehementer me oblectant littere tue, Philippe ...",1448,Florentiae
2,http://vespasianodabisticciletters.unibo.it/to...,3,http://vespasianodabisticciletters.unibo.it/le...,"Paucis ante diebus respondidi litteris tuis, P...",1448,Florentiae
3,http://vespasianodabisticciletters.unibo.it/to...,4,http://vespasianodabisticciletters.unibo.it/le...,"Suscepi nuper litteras tuas, quibus nescio qui...",1448,Florentiae
4,http://vespasianodabisticciletters.unibo.it/to...,5,http://vespasianodabisticciletters.unibo.it/le...,Reverende in Christo pater et domine mi singul...,1448,Florentiae
5,http://vespasianodabisticciletters.unibo.it/to...,6,http://vespasianodabisticciletters.unibo.it/le...,"Egli è più dì ch'io ricevetti una tua, alla qu...",1449,
6,http://vespasianodabisticciletters.unibo.it/to...,7,http://vespasianodabisticciletters.unibo.it/le...,"Dici non potest, Vespasiane suavissime, quantu...",1450,Vacciano
7,http://vespasianodabisticciletters.unibo.it/to...,8,http://vespasianodabisticciletters.unibo.it/le...,Honorevole come fratello et caetera. Ne' dì pa...,1451,Napoli
8,http://vespasianodabisticciletters.unibo.it/to...,9,http://vespasianodabisticciletters.unibo.it/le...,Egregie tanquam frater carissime. Perché dite ...,1453,
9,http://vespasianodabisticciletters.unibo.it/to...,10,http://vespasianodabisticciletters.unibo.it/le...,"Egregie tanquam frater, ho ricevuta vostra let...",1454,Bologna


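Next, we create the sender-attribution column. The commented triples below show the chain that links a letter's maintext to the place it was sent from: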

In [99]:
# <http://vespasianodabisticciletters.unibo.it/letter-42-sender-attribution> <http://purl.org/emmedi/hico/isExtractedFrom> <http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-expr>
# <http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-expr> <http://purl.org/vocab/frbr/core#part> <http://vespasianodabisticciletters.unibo.it/tomasi-letter-42-maintext>
# <http://vespasianodabisticciletters.unibo.it/p0-sender-letter-42-1480-1485> <http://www.w3.org/ns/prov#wasGeneratedBy> <http://vespasianodabisticciletters.unibo.it/letter-42-sender-attribution>
# <http://vespasianodabisticciletters.unibo.it/p0-sender-letter-42-1480-1485> <http://purl.org/vocab/bio/0.1/place> <http://vespasianodabisticciletters.unibo.it/antella>

In [100]:
expr_lst=[]
for i in range(len(letters_id)):
    for s, p, o in g.triples((None,
                              URIRef('http://purl.org/vocab/frbr/core#part'),
                              URIRef(letters_id[i]))):
        # print(s)
        s2 = str(s)
        expr_lst.append(s2)
        # print(s2)
        # print("a")
        # for s, p, o in g.triples((
        #                          URIRef(s2),
        #                          URIRef('http://purl.org/dc/terms/source'), None)):
        #     print("b")
        #     if extended_fianl['uri_id'].where(extended_fianl['uri_id']==str(o)).all():
        #         print(str(o))
        #         which_index=extended_fianl.index[extended_fianl['uri_id']==str(o)].tolist()
        #         for s, p, o in g.triples((URIRef('http://vespasianodabisticciletters.unibo.it/montegufoni'), URIRef('http://www.w3.org/2000/01/rdf-schema#label'), None )):
        #             print(o)
        #             # extended_fianl.insert(which_index[0],'uri_from', str(o))
        #             # for i in range(len(extended_fianl)):
        #             #     extended_fianl['uri_from'][i] = str(o)
        #             # extended_fianl['uri_from'][which_index]=str(o)
        #         which_index
        #         print("TRUE")
expr_lst

['http://vespasianodabisticciletters.unibo.it/tomasi-letter-1-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-2-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-3-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-4-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-5-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-6-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-7-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-8-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-9-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-10-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-11-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-12-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-13-expr',
 'http://vespasianodabisticciletters.unibo.it/tomasi-letter-14-expr',
 'http://vespasianodabisticci

In [101]:
extended_fianl.insert(1,"uri_expr",expr_lst)

In [102]:
extended_fianl

Unnamed: 0,uri_maintext,uri_expr,letter_number,uri_id,testo,years,uri_from
0,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/to...,1,http://vespasianodabisticciletters.unibo.it/le...,"Vespasiano mio dolcissimo, le lettere le quali...",1446,
1,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/to...,2,http://vespasianodabisticciletters.unibo.it/le...,"Vehementer me oblectant littere tue, Philippe ...",1448,Florentiae
2,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/to...,3,http://vespasianodabisticciletters.unibo.it/le...,"Paucis ante diebus respondidi litteris tuis, P...",1448,Florentiae
3,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/to...,4,http://vespasianodabisticciletters.unibo.it/le...,"Suscepi nuper litteras tuas, quibus nescio qui...",1448,Florentiae
4,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/to...,5,http://vespasianodabisticciletters.unibo.it/le...,Reverende in Christo pater et domine mi singul...,1448,Florentiae
5,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/to...,6,http://vespasianodabisticciletters.unibo.it/le...,"Egli è più dì ch'io ricevetti una tua, alla qu...",1449,
6,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/to...,7,http://vespasianodabisticciletters.unibo.it/le...,"Dici non potest, Vespasiane suavissime, quantu...",1450,Vacciano
7,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/to...,8,http://vespasianodabisticciletters.unibo.it/le...,Honorevole come fratello et caetera. Ne' dì pa...,1451,Napoli
8,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/to...,9,http://vespasianodabisticciletters.unibo.it/le...,Egregie tanquam frater carissime. Perché dite ...,1453,
9,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/to...,10,http://vespasianodabisticciletters.unibo.it/le...,"Egregie tanquam frater, ho ricevuta vostra let...",1454,Bologna


In [103]:
import re

# Compile the pattern once, outside the loop.
attRegex = re.compile(r'(-sender-attribution)')
sender_att_lst =[]
for i in range(len(expr_lst)):
    # Every resource extracted from this expression ...
    for s, p, o in g.triples((None,
                              URIRef('http://purl.org/emmedi/hico/isExtractedFrom'),
                              URIRef(expr_lst[i]))):
        s3 = str(s)
        # ... of which only the sender attributions are kept.
        if attRegex.search(s3):
            sender_att_lst.append(s3)

sender_att_lst

['http://vespasianodabisticciletters.unibo.it/letter-1-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-2-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-3-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-4-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-5-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-6-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-7-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-8-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-9-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-10-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-11-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-12-sender-attribution',
 'http://vespasianodabisticciletters.unibo.it/letter-13-sender-attribution',
 'http:/

In [104]:
extended_fianl.insert(1,"sender-att",sender_att_lst)

In [105]:
p_sender_letter=[]
for i in range(len(sender_att_lst)):
    for s, p, o in g.triples((None,
                              URIRef('http://www.w3.org/ns/prov#wasGeneratedBy'),
                              URIRef(sender_att_lst[i]))):
        # print(s)
        s4 = str(s)
        p_sender_letter.append(s4)

p_sender_letter

['http://vespasianodabisticciletters.unibo.it/p2-sender-letter-1-1446',
 'http://vespasianodabisticciletters.unibo.it/p0-sender-letter-2-1448',
 'http://vespasianodabisticciletters.unibo.it/p0-sender-letter-3-1448',
 'http://vespasianodabisticciletters.unibo.it/p0-sender-letter-4-1448',
 'http://vespasianodabisticciletters.unibo.it/p0-sender-letter-5-1448',
 'http://vespasianodabisticciletters.unibo.it/p70-sender-letter-6-1449',
 'http://vespasianodabisticciletters.unibo.it/p70-sender-letter-7-1450',
 'http://vespasianodabisticciletters.unibo.it/p70-sender-letter-8-1451',
 'http://vespasianodabisticciletters.unibo.it/p101-sender-letter-9-1453',
 'http://vespasianodabisticciletters.unibo.it/p101-sender-letter-10-1454',
 'http://vespasianodabisticciletters.unibo.it/p70-sender-letter-11-1454',
 'http://vespasianodabisticciletters.unibo.it/p70-sender-letter-12-1456',
 'http://vespasianodabisticciletters.unibo.it/p70-sender-letter-13-1456',
 'http://vespasianodabisticciletters.unibo.it/p70-

In [106]:
extended_fianl.insert(1,"p-sender-letter",p_sender_letter)
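
`DataFrame.insert` pairs the list with the rows purely by position, so the column is only correct if every letter produced exactly one match, in the same order. A small sketch of making that assumption explicit before inserting is shown below. `align_one_to_one` is a hypothetical helper, and the sample pairs reuse two URI fragments from the output above.

```python
def align_one_to_one(keys, pairs):
    """Return one value per key, in key order; raise if a key has 0 or >1 matches."""
    matches = {}
    for key, value in pairs:
        matches.setdefault(key, []).append(value)
    aligned = []
    for key in keys:
        values = matches.get(key, [])
        if len(values) != 1:
            raise ValueError(f'{key!r} has {len(values)} matches, expected 1')
        aligned.append(values[0])
    return aligned


# (attribution, sender) pairs as they might be collected from the graph.
pairs = [
    ('letter-2-sender-attribution', 'p0-sender-letter-2-1448'),
    ('letter-1-sender-attribution', 'p2-sender-letter-1-1446'),
]
keys = ['letter-1-sender-attribution', 'letter-2-sender-attribution']
aligned = align_one_to_one(keys, pairs)
```

If a letter had no sender attribution, or two of them, this raises instead of silently shifting every later row by one.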

In [107]:
for i in range(len(place_lst)):
    for s, p, o in g.triples((None,
                              URIRef('http://purl.org/vocab/bio/0.1/place'),
                              URIRef(place_lst[i])
                              )):
        s5 = str(s)
        print(s5)
        print("a")

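        # NOTE: .where(...).all() is truthy even when s5 is absent from the column
        # (see the 'casprini-...' entry in the output). In that case which_index
        # is empty, so the write below should leave the dataframe unchanged.
        if extended_fianl['p-sender-letter'].where(extended_fianl['p-sender-letter']==s5).all():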
            print("b")
            which_index=extended_fianl.index[extended_fianl['p-sender-letter']==s5].tolist()
            for s, p, o in g.triples((URIRef(place_lst[i]), URIRef('http://www.w3.org/2000/01/rdf-schema#label'), None )):
                print("c")
                # extended_fianl.insert(which_index[0],'uri_from', str(o))
                # for i in range(len(extended_fianl)):
                #     extended_fianl['uri_from'][i] = str(o)
                extended_fianl['uri_from'][which_index]=str(o)
            print("TRUE")

http://vespasianodabisticciletters.unibo.it/casprini-ed-edizioni-crc-publisher-1993
a
b
c
TRUE
http://vespasianodabisticciletters.unibo.it/p0-sender-letter-33-1479
a
b
c
TRUE
http://vespasianodabisticciletters.unibo.it/p0-sender-letter-34-1479
a
b
c
TRUE
http://vespasianodabisticciletters.unibo.it/p0-sender-letter-38-1489
a
b
c
TRUE
http://vespasianodabisticciletters.unibo.it/p0-sender-letter-40-1497
a
b
c
TRUE
http://vespasianodabisticciletters.unibo.it/p0-sender-letter-42-1480-1485
a
b
c
TRUE
http://vespasianodabisticciletters.unibo.it/p76-sender-letter-35-1480
a
b
c
TRUE
http://vespasianodabisticciletters.unibo.it/archivio-di-stato-firenze-repository-letter-16
a
b
c
c
c
TRUE
http://vespasianodabisticciletters.unibo.it/archivio-di-stato-firenze-repository-letter-17
a
b
c
c
c
TRUE
http://vespasianodabisticciletters.unibo.it/archivio-di-stato-firenze-repository-letter-18
a
b
c
c
c
TRUE
http://vespasianodabisticciletters.unibo.it/archivio-di-stato-firenze-repository-letter-26
a
b
c
c
c
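
In the output above, "b" and "TRUE" are printed even for subjects that are not sender URIs, such as the publisher entry. A short sketch of why the `.where(...).all()` test does not filter anything, next to a direct membership test, follows. The series is a two-row stand-in, and `dtype=object` is an assumption about how the notebook's string columns are stored.

```python
import pandas as pd

# Two-row stand-in for the 'p-sender-letter' column.
col = pd.Series(
    ['p0-sender-letter-33-1479', 'p0-sender-letter-34-1479'],
    dtype=object,
)
absent = 'casprini-ed-edizioni-crc-publisher-1993'

# Pattern used in the cell above: non-matching rows become NaN, and .all()
# skips NaN by default, so the result is truthy whether or not anything matched.
vacuous = bool(col.where(col == absent).all())

# Direct membership test on the boolean mask.
absent_found = bool((col == absent).any())
present_found = bool((col == 'p0-sender-letter-33-1479').any())
```

The final dataframe is still filled correctly because a non-matching subject produces an empty `which_index`, but the `.any()` form states the intended condition directly.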


In [108]:
for s, p, o in g.triples((None,
                         URIRef('http://purl.org/vocab/bio/0.1/place'),
                         URIRef('http://vespasianodabisticciletters.unibo.it/montegufoni'))):
    s5 = str(s)
    print(s5)

http://vespasianodabisticciletters.unibo.it/p11-sender-letter-41-1444-1448
http://vespasianodabisticciletters.unibo.it/p2-sender-letter-1-1446


In [109]:
extended_fianl

Unnamed: 0,uri_maintext,p-sender-letter,sender-att,uri_expr,letter_number,uri_id,testo,years,uri_from
0,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/p2...,http://vespasianodabisticciletters.unibo.it/le...,http://vespasianodabisticciletters.unibo.it/to...,1,http://vespasianodabisticciletters.unibo.it/le...,"Vespasiano mio dolcissimo, le lettere le quali...",1446,Montegufoni
1,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/p0...,http://vespasianodabisticciletters.unibo.it/le...,http://vespasianodabisticciletters.unibo.it/to...,2,http://vespasianodabisticciletters.unibo.it/le...,"Vehementer me oblectant littere tue, Philippe ...",1448,Florentiae
2,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/p0...,http://vespasianodabisticciletters.unibo.it/le...,http://vespasianodabisticciletters.unibo.it/to...,3,http://vespasianodabisticciletters.unibo.it/le...,"Paucis ante diebus respondidi litteris tuis, P...",1448,Florentiae
3,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/p0...,http://vespasianodabisticciletters.unibo.it/le...,http://vespasianodabisticciletters.unibo.it/to...,4,http://vespasianodabisticciletters.unibo.it/le...,"Suscepi nuper litteras tuas, quibus nescio qui...",1448,Florentiae
4,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/p0...,http://vespasianodabisticciletters.unibo.it/le...,http://vespasianodabisticciletters.unibo.it/to...,5,http://vespasianodabisticciletters.unibo.it/le...,Reverende in Christo pater et domine mi singul...,1448,Florentiae
5,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/p7...,http://vespasianodabisticciletters.unibo.it/le...,http://vespasianodabisticciletters.unibo.it/to...,6,http://vespasianodabisticciletters.unibo.it/le...,"Egli è più dì ch'io ricevetti una tua, alla qu...",1449,Florentiae
6,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/p7...,http://vespasianodabisticciletters.unibo.it/le...,http://vespasianodabisticciletters.unibo.it/to...,7,http://vespasianodabisticciletters.unibo.it/le...,"Dici non potest, Vespasiane suavissime, quantu...",1450,Vacciano
7,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/p7...,http://vespasianodabisticciletters.unibo.it/le...,http://vespasianodabisticciletters.unibo.it/to...,8,http://vespasianodabisticciletters.unibo.it/le...,Honorevole come fratello et caetera. Ne' dì pa...,1451,Napoli
8,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/p1...,http://vespasianodabisticciletters.unibo.it/le...,http://vespasianodabisticciletters.unibo.it/to...,9,http://vespasianodabisticciletters.unibo.it/le...,Egregie tanquam frater carissime. Perché dite ...,1453,Bologna
9,http://vespasianodabisticciletters.unibo.it/to...,http://vespasianodabisticciletters.unibo.it/p1...,http://vespasianodabisticciletters.unibo.it/le...,http://vespasianodabisticciletters.unibo.it/to...,10,http://vespasianodabisticciletters.unibo.it/le...,"Egregie tanquam frater, ho ricevuta vostra let...",1454,Bologna


In [110]:
extended_fianl.to_csv('extended-data.csv', index=False)
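
One thing to keep in mind when the file is loaded again: `pd.read_csv` turns empty fields into `NaN` by default, so letters without a place would come back as missing values rather than empty strings. A minimal sketch of the round trip follows, using an in-memory buffer instead of `extended-data.csv` and illustrative values.

```python
import io

import pandas as pd

# Illustrative two-row frame; the first letter has no place of sending.
df = pd.DataFrame({'letter_number': ['1', '2'], 'uri_from': ['', 'Florentiae']})

buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Default read: the empty field becomes NaN.
buffer.seek(0)
default_read = pd.read_csv(buffer)

# Keeping fields as literal strings preserves the empty value.
buffer.seek(0)
literal_read = pd.read_csv(buffer, keep_default_na=False, dtype=str)
```

Which of the two reads is appropriate depends on how the later analysis treats letters with no known place.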