# RDF - NLS - Encyclopaedia Britannica

This notebook creates the RDF triples used to build our RDFLib graph.

For each post-processed edition dataframe obtained from **Merging_EB_Terms.ipynb** (e.g. results_eb_1_edition_dataframe, results_eb_2_edition_dataframe, etc.), we add the information from the dataframe obtained from **Metadata_EB.ipynb** (metadata_eb_dataframe).

The idea is to have, for each edition dataframe (and each supplement dataframe), all the information, which is currently split across several dataframes, in a single dataframe.


This notebook stores the final dataframes in the results_NLS directory, using the naming scheme **final_eb_< NUM_EDITION >_dataframe**.

Each entry in these new dataframes has the following columns (the example shows one entry from the first edition):

- MMSID:                                              
- editionTitle:                          First edition, 1771, Volume 1, A-B
- editor:                                                  Smellie, William
- editor_date:                                                   1740-1795
- genre:                                                       encyclopedia
- language:                                                             eng
- termsOfAddress:                                                       NaN
- numberOfPages:                                                        832
- physicalDescription:               3 v., 160 plates : ill. ; 26 cm. (4to)
- place:                                                         Edinburgh
- publisher:              Printed for A. Bell and C. Macfarquhar; and so...
- referencedBy:           [Alston, R.C.  Engl. language III, 560, ESTC T...
- shelfLocator:                                                        EB.1
- editionSubTitle:        Illustrated with one hundred and sixty copperp...
- volumeTitle:            Encyclopaedia Britannica; or, A dictionary of ...
- year:                                                                1771
- volumeId:                                                       144133901
- metsXML:                                               144133901-mets.xml
- permanentURL:                            https://digital.nls.uk/144133901
- publisherPersons:                     [C. Macfarquhar, Colin Macfarquhar]
- volumeNum:                                                              1
- letters:                                                              A-B
- part:                                                                   0
- editionNum:                                                             1
- supplementTitle:                                                         
- supplementSubTitle:                                                      
- supplementsTo:                                                         []
- numberOfVolumes:                                                        6
- term:                                                                  OR
- definition:             A NEW A D I C T I A A, the name of several riv...
- relatedTerms:                                                          []
- header:                                           EncyclopaediaBritannica
- startsAt:                                                              15
- endsAt:                                                                15
- numberOfTerms:                                                         22
- numberOfWords:                                                         54
- positionPage:                                                           0
- typeTerm:                                                         Article
- altoXML:                                  144133901/alto/188082904.34.xml

### Loading the necessary libraries

In [1]:
import rdflib
from rdflib.extras.external_graph_libs import rdflib_to_networkx_multidigraph
import networkx as nx
import matplotlib.pyplot as plt
from rdflib import Graph, Namespace, Literal
from rdflib.plugins.sparql import prepareQuery

### Functions

### 1. Loading the graph

In [2]:
g = Graph()
g.parse("../../results_NLS/edition1st.ttl", format="ttl") 

<Graph identifier=Nd38de0568c984480a30237db2815f124 (<class 'rdflib.graph.Graph'>)>

List all the resources with the property eb:editor

In [3]:
eb = Namespace("https://w3id.org/eb#")

q1 = prepareQuery('''
  SELECT ?Edition WHERE { 
    ?Edition eb:editor ?FullName. 
  }
  ''',
  initNs = { "eb": eb}
)


for r in g.query(q1):
      print(r.Edition)

https://w3id.org/eb/i/Edition/992277653804341
https://w3id.org/eb/i/Edition/9929192893804340


The same query, but also returning the editor of each resource.

In [4]:
q2 = prepareQuery('''
  SELECT ?Subject ?FullName WHERE { 
    ?Subject eb:editor ?FullName.
  } 
  ''',
  initNs = { "eb": eb}
)

for r in g.query(q2):
  print(r.Subject, r.FullName)

https://w3id.org/eb/i/Edition/992277653804341 https://w3id.org/eb/i/Person/Smellie,William
https://w3id.org/eb/i/Edition/9929192893804340 https://w3id.org/eb/i/Person/Smellie,William


The same query, asking for the first 10 resources with the property eb:name

In [5]:
q2 = prepareQuery('''
  SELECT ?Subject ?FullName WHERE { 
    ?Subject eb:name ?FullName.
  } 
  LIMIT 10
  ''',
  initNs = { "eb": eb}
)

for r in g.query(q2):
    print(r.Subject, r.FullName)

https://w3id.org/eb/i/Article/992277653804341_144133903_TRAGIA_0 TRAGIA
https://w3id.org/eb/i/Article/9929192893804340_144850368_PASTIL_0 PASTIL
https://w3id.org/eb/i/Article/992277653804341_144133901_BURNISHER_0 BURNISHER
https://w3id.org/eb/i/Article/9929192893804340_144850366_ATRICES_0 ATRICES
https://w3id.org/eb/i/Article/992277653804341_144133903_PARTI_0 PARTI
https://w3id.org/eb/i/Article/9929192893804340_144850367_CHARA_0 CHARA
https://w3id.org/eb/i/Article/992277653804341_144133903_TAVISTOCK_0 TAVISTOCK
https://w3id.org/eb/i/Article/9929192893804340_144850366_BASARUCO_0 BASARUCO
https://w3id.org/eb/i/Article/9929192893804340_144850366_BAKAL_0 BAKAL
https://w3id.org/eb/i/Article/9929192893804340_144850367_EDDISH_0 EDDISH


Asking for resources whose name is "Smellie, William"

In [6]:
from rdflib import XSD
q3 = prepareQuery('''
  SELECT ?Subject WHERE { 
    ?Subject eb:name ?Family.
  } 
  ''',
    initNs = { "eb": eb}
)

for r in g.query(q3, initBindings = {'?Family' : Literal('Smellie, William', datatype=XSD.string)}):
  print(r.Subject)

https://w3id.org/eb/i/Person/Smellie,William


Asking for resources whose name is ABACUS

In [7]:
from rdflib import XSD
q3 = prepareQuery('''
  SELECT ?Subject WHERE { 
    ?Subject eb:name ?Term.
  } 
  ''',
    initNs = { "eb": eb}
)

for r in g.query(q3, initBindings = {'?Term' : Literal('ABACUS', datatype=XSD.string)}):
  print(r.Subject)

https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_3
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_2
https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_0
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_0
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_3
https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_1
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_1
https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_2


In [8]:
from rdflib import XSD
q4 = prepareQuery('''
  SELECT ?Subject ?Term WHERE { 
    ?Subject eb:name ?Term.
  } 
  ''',
    initNs = { "eb": eb}
)

for r in g.query(q4, initBindings = {'?Term' : Literal('ABACUS', datatype=XSD.string)}):
  print(r.Subject, r.Term)

https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_3 ABACUS
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_2 ABACUS
https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_0 ABACUS
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_0 ABACUS
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_3 ABACUS
https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_1 ABACUS
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_1 ABACUS
https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_2 ABACUS


In [9]:
q5 = prepareQuery('''
    SELECT ?edition
    WHERE {
       ?edition rdf:type eb:Edition .
    }
    ''',
  initNs = { "eb": eb}
)

for r in g.query(q5):
  print(r.edition)

https://w3id.org/eb/i/Edition/992277653804341
https://w3id.org/eb/i/Edition/9929192893804340


In [10]:
q5 = prepareQuery('''
    SELECT ?topic
    WHERE {
       ?topic rdf:type eb:Topic .
    }
    LIMIT 10
    ''',
  initNs = { "eb": eb}
)

for r in g.query(q5):
  print(r.topic)

https://w3id.org/eb/i/Topic/9929192893804340_144850368_TRIGONOMETRY_0
https://w3id.org/eb/i/Topic/992277653804341_144133902_ELECTRICITY_0
https://w3id.org/eb/i/Topic/9929192893804340_144850368_NATDNATHISTORY_0
https://w3id.org/eb/i/Topic/992277653804341_144133901_ALLIGATION_0
https://w3id.org/eb/i/Topic/9929192893804340_144850367_CONICSECTIONS_0
https://w3id.org/eb/i/Topic/992277653804341_144133903_SURGEY_0
https://w3id.org/eb/i/Topic/992277653804341_144133903_MUTMRMUTILATION_0
https://w3id.org/eb/i/Topic/9929192893804340_144850367_OJITCHBTTEPIXXX_0
https://w3id.org/eb/i/Topic/992277653804341_144133903_SHORTHANDWRITING_0
https://w3id.org/eb/i/Topic/9929192893804340_144850366_BLEACKING_0


In [11]:
q6 = prepareQuery('''
    SELECT *
    WHERE {
       ?topic a eb:Topic .
       ?topic eb:relatedTerms ?relatedTerms . 
       ?topic eb:name ?name .
       ?topic eb:startsAt ?startsAt.
       ?topic eb:endsAt ?endsAt.
    }
    LIMIT 10
    ''',
  initNs = { "eb": eb}
)

for r in g.query(q6):
  print(r.topic, r.relatedTerms, r.name, r.startsAt, r.endsAt)

https://w3id.org/eb/i/Topic/992277653804341_144133902_ELECTRICITY_0 https://w3id.org/eb/i/Article/992277653804341_144133901_AMBER_1 ELECTRICITY https://w3id.org/eb/i/Page/992277653804341_144133902_413 https://w3id.org/eb/i/Page/992277653804341_144133902_430
https://w3id.org/eb/i/Topic/992277653804341_144133902_ELECTRICITY_0 https://w3id.org/eb/i/Article/992277653804341_144133901_AMBER_0 ELECTRICITY https://w3id.org/eb/i/Page/992277653804341_144133902_413 https://w3id.org/eb/i/Page/992277653804341_144133902_430
https://w3id.org/eb/i/Topic/9929192893804340_144850368_NATDNATHISTORY_0 https://w3id.org/eb/i/Article/9929192893804340_144850366_AMPHIBIA_0 NATDNATHISTORY https://w3id.org/eb/i/Page/9929192893804340_144850368_412 https://w3id.org/eb/i/Page/9929192893804340_144850368_415
https://w3id.org/eb/i/Topic/9929192893804340_144850368_NATDNATHISTORY_0 https://w3id.org/eb/i/Article/9929192893804340_144850367_HOMO_0 NATDNATHISTORY https://w3id.org/eb/i/Page/9929192893804340_144850368_412 http

In [12]:
q6 = prepareQuery('''
    SELECT *
    WHERE {
       ?article a eb:Article .
       ?article eb:relatedTerms ?relatedTerms . 
       ?article eb:name ?name .
       ?article eb:definition ?definition.
    }
    LIMIT 5
    ''',
  initNs = { "eb": eb}
)

for r in g.query(q6):
  print(r.article, r.relatedTerms, r.name, r.definition)

https://w3id.org/eb/i/Article/9929192893804340_144850368_VICE_0 https://w3id.org/eb/i/Article/9929192893804340_144850367_KING_0 VICE in ethics, is ordinarily defined an eledlive habit, deviating, either in excess, or defedt, from the just medium wherein virtue is placed. Vice, in fmithery, and other arts employed in metals, is a machine, or instrument, serving to hold fast any thing they are at work upon, whether it is to be filed, bent, Vice is also used, iri the composition of divers words, to denote the relation of something that comes instead, or in the place, of another; as vice-admiral, vice-chancellor, vice chamberlain, vice-president, are officers who take place in the absence of admirals, circ. Vice-rov, a governor o: a kingdom, who commands in the name and stead of a king, v/ith full and iovereign authority. See King.
https://w3id.org/eb/i/Article/992277653804341_144133901_BOBBIO_0 https://w3id.org/eb/i/Article/992277653804341_144133903_SPARUS_0 BOBBIO a town of the Milanese,

In [13]:
q6 = prepareQuery('''
    SELECT *
    WHERE {
       ?article a eb:Article . 
       ?article eb:name ?name .
       ?article eb:definition ?definition.
    }
    LIMIT 5
    ''',
  initNs = { "eb": eb}
)

for r in g.query(q6):
  print(r.article, r.name, r.definition)

https://w3id.org/eb/i/Article/992277653804341_144133902_KINGSTON_0 KINGSTON a market-town of Surry, (ituated on the river Thames, twelve miles west of London. .Kingston, a port-town of Jamaica, situated on the north side of the bay of Port royal: W. long. 77 0 , N. lat. 17 0 32' .
https://w3id.org/eb/i/Article/992277653804341_144133901_BORIQUE_0 BORIQUE one of the Caribbee islands, lying southeast o T Porto Rico, in 64° 30' W. long, and 18° N.
https://w3id.org/eb/i/Article/9929192893804340_144850367_LEVER_0 LEVER or Leaver, in mechanics., See Mecha-
https://w3id.org/eb/i/Article/9929192893804340_144850368_VICE_0 VICE in ethics, is ordinarily defined an eledlive habit, deviating, either in excess, or defedt, from the just medium wherein virtue is placed. Vice, in fmithery, and other arts employed in metals, is a machine, or instrument, serving to hold fast any thing they are at work upon, whether it is to be filed, bent, Vice is also used, iri the composition of divers words, to denote 

In [22]:
q7 = prepareQuery('''
  SELECT  ?Subject ?Definition ?HasPart ?Title WHERE {
    ?Subject eb:name ?Term.
    ?Subject eb:definition ?Definition.
    ?HasPart eb:hasPart ?Subject.
    ?HasPart eb:title ?Title
  }  
  ''',
  initNs = { "eb": eb}
)

for r in g.query(q7, initBindings = {'?Term' : Literal('ABACUS', datatype=XSD.string)}):
  print(r.Subject, r.Definition, r.HasPart, r.Title)
  print("----")



https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_3 logijlicus, a right-angled triangle, whose sides forming the right angle contain the numbers from 1 to 60, and its area the fafta of every two of the numbers perpendicularly opposite. This is also called a canon Jk^&cus Pythagvricus, the multiplication-table, or any table of numbers that facilitates operations in arith- https://w3id.org/eb/i/Volume/992277653804341_144133901 Encyclopaedia Britannica; or, A dictionary of arts and sciences, compiled upon a new plan
----
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_2 is also the name of an ancient inllrument for facilitating operations in arithmetic. It is variously contrived. That chiefly used in Europe is made by drawing any number of parallel lines at the difiance of two diameters of one of the counters used in the calculation. A counter placed on the lowest line, fignifics i; on the 2d, 10; on the 3d, 100; on the 4th, 1000, &c. In the intermediate space

In [54]:
q7 = prepareQuery('''
  SELECT  ?Subject ?RelatedTerms WHERE {
    ?Subject eb:name ?Term.
    ?RelatedTerms eb:relatedTerms ?Subject.

    
  }  
  ''',
  initNs = { "eb": eb}
)

for r in g.query(q7, initBindings = {'?Term' : Literal('ABACUS', datatype=XSD.string)}):
  print(r.Subject, r.RelatedTerms)
  print("----")


https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_3 https://w3id.org/eb/i/Article/992277653804341_144133901_ABACISCUS_0
----
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_2 https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACISCUS_0
----
https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_0 https://w3id.org/eb/i/Article/992277653804341_144133901_ABACISCUS_0
----
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_0 https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACISCUS_0
----
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_3 https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACISCUS_0
----
https://w3id.org/eb/i/Article/992277653804341_144133901_ABACUS_1 https://w3id.org/eb/i/Article/992277653804341_144133901_ABACISCUS_0
----
https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACUS_1 https://w3id.org/eb/i/Article/9929192893804340_144850366_ABACISCUS_0
----
https://w3id.org/eb/i/Arti

In [14]:
#G = rdflib_to_networkx_multidigraph(result)

# Plot Networkx instance of RDF Graph
#pos = nx.spring_layout(G, scale=2)
#edge_labels = nx.get_edge_attributes(G, 'r')
#nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels)
#nx.draw(G, with_labels=True)

# only needed when not running in interactive mode
#plt.show()
