### ***What is ORKG?***

We are building the next generation of digital libraries for semantic scientific knowledge communicated in scholarly literature. We focus on the communicated content rather than its context (for example, the people and institutions involved), and that content is semantic, i.e., machine-interpretable.

Scientific knowledge continues to be confined to the document, seemingly inseparable from the medium, like hieroglyphs carved in stone. The global scientific knowledge base is little more than a collection of documents. It is written by humans for humans, as it has been for a long time. This makes perfect sense: after all, the audience is made up of people, and researchers in particular.

Yet, given the monumental progress in information technology over recent decades, one may wonder why the scientific knowledge communicated in scholarly literature remains largely inaccessible to machines. Surely it would be useful if some of that knowledge were more readily available for automated processing.

The Open Research Knowledge Graph (ORKG) project is working on answers and solutions. The recently initiated, TIB-coordinated project is open to the community and actively engages research infrastructures and research communities in the development of technologies and use cases for open graphs about research knowledge.



### ***Who can use the ORKG package?***

The short answer is: anybody. The package is designed to be minimalistic; basic knowledge of Python and JSON is enough to understand how it works.

You can use the ORKG package to add, edit, and list data on any instance of the Open Research Knowledge Graph. It integrates easily into data science workflows and can be used to fetch data for analysis and visualization.

The sky is your limit :)

### ***ORKG Python package documentation***

https://orkg.readthedocs.io/en/latest/introduction.html

# ***Installation***

The ORKG Python package is a simple wrapper for the ORKG API.

Start using it by installing it with pip:

In [2]:
!pip install orkg

Collecting orkg
  Downloading orkg-0.18.0-py3-none-any.whl (46 kB)
Collecting Deprecated<2.0.0,>=1.2.14 (from orkg)
  Downloading Deprecated-1.2.14-py2.py3-none-any.whl (9.6 kB)
Collecting Faker<20.0.0,>=19.1.0 (from orkg)
  Downloading Faker-19.6.2-py3-none-any.whl (1.7 MB)
Collecting Inflector<4.0.0,>=3.1.0 (from orkg)
  Downloading Inflector-3.1.0-py3-none-any.whl (12 kB)
Collecting cardinality<0.2.0,>=0.1.1 (from orkg)
  Downloading cardinality-0.1.1.tar.gz (2.3 kB)
  Preparing metadata (setup.py) ... done
Collecting hammock<0.3.0,>=0.2.4 (from orkg)
  Downloading hammock-0.2.4.tar.gz (4.8 kB)
  Preparing metada

# ***Usage***

To use the package in your Python code, import it and create an instance of the base class.

In [4]:
from orkg import ORKG # import base class from package

orkg = ORKG(host="<host-address-is-here>", creds=('email-address', 'password')) # create the connector to the ORKG

The package can connect to any ORKG instance, local or remote. The host parameter may be given as an address or as an environment name (see below).
Optionally, you can provide credentials to authenticate your requests. If host is not specified, it defaults to sandbox.

# ***Host***

Three different host environments are available:


1. ***Production:*** The most stable version of the ORKG. Use this host to access the current version of the graph and add persistent data.

In [None]:
# from orkg import ORKG, Hosts # import base class from package

# connector = ORKG(host=Hosts.PRODUCTION) # create the connector to the ORKG



2. ***Sandbox:*** The ORKG playground! It has the same features that exist on production at any given time. Use this host to experiment with the ORKG or Python package features without adding data to the main graph. Data added to sandbox is periodically deleted, and this version of the ORKG may not contain all of the data found in the production graph.



In [5]:
from orkg import ORKG, Hosts # import base class from package

connector = ORKG(host=Hosts.SANDBOX) # create the connector to the ORKG


3. ***Incubating:*** The ORKG version with the newest features being tested. Since features are still under development, this version may break or not function as expected. Data added to incubating is periodically deleted, and this version of the ORKG may not contain all of the data found in the production graph.



In [None]:
# from orkg import ORKG, Hosts # import base class from package

# connector = ORKG(host=Hosts.INCUBATING) # create the connector to the ORKG
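
To switch between the three environments without editing the notebook, the environment name could be read from a variable. The sketch below is only an illustration: the `ORKG_ENV` variable and the `pick_host_name` helper are hypothetical and not part of the ORKG package; only the names `PRODUCTION`, `SANDBOX`, and `INCUBATING` come from the `Hosts` values used above.

```python
import os
from typing import Mapping, Optional

# The three environment names used with `Hosts` in this notebook.
VALID_HOST_NAMES = ("PRODUCTION", "SANDBOX", "INCUBATING")


def pick_host_name(env: Optional[Mapping[str, str]] = None, default: str = "SANDBOX") -> str:
    """Return a validated host name read from the hypothetical ORKG_ENV variable."""
    env = os.environ if env is None else env
    name = env.get("ORKG_ENV", default).strip().upper()
    if name not in VALID_HOST_NAMES:
        raise ValueError(f"Unknown ORKG environment: {name!r}")
    return name


# Usage with the package (requires `pip install orkg`):
#   from orkg import ORKG, Hosts
#   connector = ORKG(host=getattr(Hosts, pick_host_name()))
print(pick_host_name({"ORKG_ENV": "incubating"}))  # INCUBATING
```

Falling back to `SANDBOX` mirrors the package default described earlier, so a missing variable never points a script at production data.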

There are three main classes to know when using the ORKG Python package:



1. ORKG class (main class to connect to an ORKG instance).
2. OrkgResponse (output encapsulation for the ORKG API).
3. OrkgUnpaginatedResponse (output encapsulation for pageable endpoints of the ORKG API).

An instance of the ORKG class exposes the following entry points:


In [6]:
connector.resources # entry point to manipulate ORKG resources
connector.predicates # entry point to manipulate ORKG predicates
connector.classes # entry point to manipulate ORKG classes
connector.literals # entry point to manipulate ORKG literals
connector.stats # entry point to get ORKG statistics
connector.statements # entry point to manipulate ORKG statements
connector.papers # entry point to manipulate ORKG papers
connector.comparisons # entry point to manipulate ORKG comparisons
connector.objects # entry point to manipulate ORKG objects
connector.templates # entry point to manipulate ORKG templates (Alpha)
connector.harvesters # entry point to run harvesting functionalities on top of ORKG data (Alpha)

<orkg.client.harvesters.harvesters.HarvestersClient at 0x7bcbc0c9a620>

The other main component is the response object, which represents the output of every request made through the package.

In [7]:
connector.ping() # checks the availability of the ORKG; returns True if the request succeeds, otherwise False

True
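
Because `ping()` returns a plain boolean, it can be wrapped in a small retry loop before running a longer workflow. This is a minimal sketch: `wait_until_available` is a hypothetical helper (not part of the ORKG package) and accepts any zero-argument callable that returns `True` or `False`.

```python
import time
from typing import Callable


def wait_until_available(ping: Callable[[], bool], attempts: int = 3, delay: float = 1.0) -> bool:
    """Call `ping` up to `attempts` times, sleeping `delay` seconds between failed tries."""
    for attempt in range(attempts):
        if ping():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # no sleep after the final failed attempt
    return False


# Usage with the connector created above:
#   if not wait_until_available(connector.ping):
#       raise RuntimeError("ORKG instance is not reachable")
```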

In [8]:
connector.resources.get()  # this call will fetch a collection of resources from the ORKG

(Success) [{'id': 'R0', 'label': "Gruber's design of ontologies", 'classes': [], 'shared': 1, 'featured': False, 'unlisted': False, 'verified': False, 'extraction_method': 'UNKNOWN', '_class': 'resource', 'created_at': '2019-01-06T15:04:07.692Z', 'created_by': '00000000-0000-0000-0000-000000000000', 'observatory_id': '00000000-0000-0000-0000-000000000000', 'organization_id': '00000000-0000-0000-0000-000000000000', 'formatted_label': None}, {'id': 'R172', 'label': 'Oceanography', 'classes': ['ResearchField'], 'shared': 152, 'featured': False, 'unlisted': False, 'verified': False, 'extraction_method': 'UNKNOWN', '_class': 'resource', 'created_at': '2019-01-06T15:04:07.692Z', 'created_by': '00000000-0000-0000-0000-000000000000', 'observatory_id': '00000000-0000-0000-0000-000000000000', 'organization_id': '00000000-0000-0000-0000-000000000000', 'formatted_label': None}, {'id': 'R173', 'label': 'Physics', 'classes': ['ResearchField'], 'shared': 8, 'featured': False, 'unlisted': False, 'veri
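
The content of a response is a list of plain dictionaries, so ordinary Python is enough to reshape it. The following sketch maps resource IDs to labels; `labels_by_id` is a hypothetical helper, and the sample data is an abridged copy of the first three resources printed above.

```python
from typing import Dict, Iterable, Optional


def labels_by_id(content: Iterable[dict], only_class: Optional[str] = None) -> Dict[str, str]:
    """Map resource IDs to labels, optionally keeping only resources of one class."""
    return {
        item["id"]: item["label"]
        for item in content
        if only_class is None or only_class in item.get("classes", [])
    }


# Abridged copies of the first three resources printed above.
sample_content = [
    {"id": "R0", "label": "Gruber's design of ontologies", "classes": []},
    {"id": "R172", "label": "Oceanography", "classes": ["ResearchField"]},
    {"id": "R173", "label": "Physics", "classes": ["ResearchField"]},
]

print(labels_by_id(sample_content, only_class="ResearchField"))
# {'R172': 'Oceanography', 'R173': 'Physics'}
```

With a live connection, the same helper could be applied to `connector.resources.get().content` after checking that the response succeeded.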

### ***Streamlit***

Streamlit documentation: https://github.com/streamlit/streamlit

Welcome to Streamlit 👋
A faster way to build and share data apps.



In [9]:
!pip install streamlit

Collecting streamlit
  Downloading streamlit-1.27.2-py2.py3-none-any.whl (7.6 MB)
Collecting validators<1,>=0.2 (from streamlit)
  Downloading validators-0.22.0-py3-none-any.whl (26 kB)
Collecting gitpython!=3.1.19,<4,>=3.0.7 (from streamlit)
  Downloading GitPython-3.1.37-py3-none-any.whl (190 kB)
Collecting pydeck<1,>=0.8.0b4 (from streamlit)
  Downloading pydeck

# Example code for Hackathon

Feel free to reuse the code below for your prototype.

In [11]:
def get_paper_via_doi(doi: str):
  # Look up the literal whose value is exactly this DOI
  dois = connector.literals.get_all(q=doi, exact=True).content
  if len(dois) > 0:
    # P26 is the DOI predicate; its ID was found by exploring https://orkg.org/resource/R36109?noRedirect
    statements = connector.statements.get_by_object_and_predicate(predicate_id="P26", object_id=dois[0]["id"]).content
    # Guard against a DOI literal that is not attached to any paper
    return statements[0]["subject"] if statements else None
  else:
    return None

get_paper_via_doi("10.1101/2020.03.03.20029983")

{'id': 'R36109',
 'label': 'Transmission interval estimates suggest pre-symptomatic spread of COVID-19',
 'classes': ['Paper', 'FeaturedPaper'],
 'shared': 0,
 'featured': True,
 'unlisted': False,
 'verified': True,
 'extraction_method': 'UNKNOWN',
 '_class': 'resource',
 'created_at': '2020-04-08T15:56:42.986436+02:00',
 'created_by': '3a55ad95-645b-4c1f-b614-d2293718ee0b',
 'observatory_id': '06144227-7efc-478c-8555-25147020f02f',
 'organization_id': 'edc18168-c4ee-4cb8-a98a-136f748e912e',
 'formatted_label': None}
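
Since `get_paper_via_doi` searches with `exact=True`, the input has to match the stored literal character for character. Assuming DOIs are stored in their bare form, as in the example above, a small normalization step can make the lookup more forgiving when users paste a resolver URL. `normalize_doi` is a hypothetical helper, not part of the ORKG package.

```python
import re

# Matches a leading resolver URL or "doi:" prefix (case-insensitive).
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:)\s*", re.IGNORECASE)


def normalize_doi(raw: str) -> str:
    """Strip whitespace and a leading resolver URL or 'doi:' prefix from a DOI string."""
    return _DOI_PREFIX.sub("", raw.strip())


print(normalize_doi("https://doi.org/10.1101/2020.03.03.20029983"))
# 10.1101/2020.03.03.20029983
```

The result can then be passed on, for example `get_paper_via_doi(normalize_doi(user_input))`.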

In [12]:
def get_contributions_of_paper(paper_id: str):
  if connector.resources.exists(paper_id):
    # P31 is the predicate that connects papers to contributions
    # It can be found when exploring any paper resource in the ORKG
    statements = connector.statements.get_by_subject_and_predicate(subject_id=paper_id, predicate_id="P31").content
    return [st["object"] for st in statements]
  else:
    raise ValueError("Paper doesn't exist in the system!")

get_contributions_of_paper("R36109")

[{'id': 'R306172',
  'label': 'Contribution 36',
  'classes': ['Contribution'],
  'shared': 1,
  'featured': False,
  'unlisted': False,
  'verified': False,
  'extraction_method': 'UNKNOWN',
  '_class': 'resource',
  'created_at': '2023-06-21T12:04:00.092248+02:00',
  'created_by': '00000000-0000-0000-0000-000000000000',
  'observatory_id': '00000000-0000-0000-0000-000000000000',
  'organization_id': '00000000-0000-0000-0000-000000000000',
  'formatted_label': None},
 {'id': 'R308579',
  'label': 'Contribution 110',
  'classes': ['Contribution'],
  'shared': 1,
  'featured': False,
  'unlisted': False,
  'verified': False,
  'extraction_method': 'UNKNOWN',
  '_class': 'resource',
  'created_at': '2023-08-09T12:04:27.355753+02:00',
  'created_by': '00000000-0000-0000-0000-000000000000',
  'observatory_id': '00000000-0000-0000-0000-000000000000',
  'organization_id': '00000000-0000-0000-0000-000000000000',
  'formatted_label': None},
 {'id': 'R308578',
  'label': 'Contribution 109',
  '
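
The two helpers above can be chained into a single lookup from DOI to contributions. The sketch below takes the connector as a parameter rather than relying on a global, which makes it easier to test; `contributions_for_doi` is a hypothetical name, and it uses only the client calls already shown in this notebook.

```python
from typing import Any, Dict, List

P_HAS_DOI = "P26"           # predicate linking a paper to its DOI literal
P_HAS_CONTRIBUTION = "P31"  # predicate linking a paper to its contributions


def contributions_for_doi(connector: Any, doi: str) -> List[Dict]:
    """Return the contribution resources of the paper with this DOI, or [] if none is found."""
    literals = connector.literals.get_all(q=doi, exact=True).content
    if not literals:
        return []
    papers = connector.statements.get_by_object_and_predicate(
        predicate_id=P_HAS_DOI, object_id=literals[0]["id"]
    ).content
    if not papers:
        return []
    paper_id = papers[0]["subject"]["id"]
    statements = connector.statements.get_by_subject_and_predicate(
        subject_id=paper_id, predicate_id=P_HAS_CONTRIBUTION
    ).content
    return [st["object"] for st in statements]


# Usage with the sandbox connector created above:
#   contributions_for_doi(connector, "10.1101/2020.03.03.20029983")
```

Returning an empty list in every "not found" case keeps the calling code free of `None` checks.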

In [13]:
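# SPARQL: list pairs of distinct properties that are used together on the same contribution
# PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
# PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# PREFIX orkgr: <http://orkg.org/orkg/resource/>
# PREFIX orkgp: <http://orkg.org/orkg/predicate/>

# SELECT DISTINCT ?prop1 ?label1 ?prop2 ?label2
# WHERE{
#   ?prop1 rdfs:label ?label1 .
#   ?prop2 rdfs:label ?label2 .
#   ?sub ?prop1 ?obj1 .
#   ?sub ?prop2 ?obj2 .
#   ?paper orkgp:P31 ?sub.
#   FILTER(?prop1 != ?prop2 && ?prop1 != rdf:type && ?prop2 != rdf:type && ?prop1 != rdfs:label && ?prop2 != rdfs:label)
# }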

In [14]:
# SPARQL: count papers per (research problem, approach) pair
# PREFIX orkgr: <http://orkg.org/orkg/resource/>
# PREFIX orkgp: <http://orkg.org/orkg/predicate/>

# SELECT ?research_problem ?approach (COUNT(?paper) as ?c)
# WHERE{
#   ?sub orkgp:P32 ?research_problem .
#   ?sub orkgp:HAS_APPROACH ?approach .
#   ?paper orkgp:P31 ?sub .
# }
# GROUP BY ?research_problem ?approach
# ORDER BY DESC(?c)

In [15]:
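# SPARQL: count how often each (property, value) pair occurs on contributions
# PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# PREFIX orkgr: <http://orkg.org/orkg/resource/>
# PREFIX orkgp: <http://orkg.org/orkg/predicate/>

# SELECT ?p ?pl ?o ?ol (COUNT(?p) as ?c)
# WHERE{
#   ?s ?p ?o .
#   ?p rdfs:label ?pl .
#   ?o rdfs:label ?ol .
#   ?paper orkgp:P31 ?s .
# }
# GROUP BY ?p ?pl ?o ?ol
# ORDER BY DESC(?c)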
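
The queries in the cells above are kept as comments. One way to run them from Python is through the standard SPARQL protocol, sketched below with the standard library only. `run_select` and `bindings_to_rows` are hypothetical helpers, and the endpoint address is deliberately left as a placeholder because this notebook does not name one.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List


def run_select(endpoint: str, query: str, timeout: float = 30.0) -> dict:
    """Send a SELECT query via HTTP GET and return the parsed SPARQL JSON result."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(url, headers={"Accept": "application/sparql-results+json"})
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)


def bindings_to_rows(result: dict) -> List[Dict[str, str]]:
    """Flatten a SPARQL 1.1 JSON result into a list of {variable: value} dictionaries."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in result["results"]["bindings"]
    ]


# Usage (the endpoint is a placeholder; query_text is one of the queries above, uncommented):
#   rows = bindings_to_rows(run_select("<sparql-endpoint-address>", query_text))
```

Note that all values come back as strings, so counts such as `?c` need an explicit `int(...)` before sorting or plotting.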

In [20]:
from orkg import ORKG, Hosts
from collections import Counter

# Uses the `connector` created in the Host section for every request

filters = []
print('Starting to add filters')
while True:

  # Inputting a property
  prop_label = input('Enter a property label or "ok" if you have enough filters: ')
  if prop_label == 'ok':
    break
  response = connector.predicates.get(q = prop_label, exact = False, size = 10)
  if response.succeeded and response.content:
    for prop in response.content:
      print(prop)
  else:
    print('No properties found, enter a filter again')
    continue
  prop_id = input('Enter ID of a selected property: ')
  response = connector.predicates.by_id(id = prop_id)
  if not response.succeeded or not response.content:
    print('Wrong ID, enter a filter again')
    continue

  # Inputting a property value
  obj_label = input('Enter a resource label: ')
  response = connector.resources.get(q = obj_label, exact = False, size = 10)
  if response.succeeded and response.content:
    for obj in response.content:
      print(obj)
  else:
    print('No resources found, enter a filter again')
    continue
  obj_id = input('Enter ID of a selected resource: ')
  if not connector.resources.exists(obj_id):
    print('Wrong ID, enter a filter again')
    continue

  filters.append((prop_id, obj_id))

print('Your filters:')
for prop_id, obj_id in filters:
  print(f'property: {prop_id}, resource: {obj_id}')

# Querying contributions: collect the subject of every statement that matches a filter
contrs = []
for prop_id, obj_id in filters:
  response = connector.statements.get_by_object_and_predicate(object_id = obj_id, predicate_id = prop_id)
  if response.succeeded:
    for contr in response.content:
      contrs.append(contr['subject']['id'])

print('Found related contributions:')
for contr_id, count in Counter(contrs).items():
  print(f'contribution id: {contr_id}, number of matches: {count}')

# Querying papers
print('Getting papers by related contributions')
while True:
  contr_id = input('Enter ID of a selected contribution, or "q" to quit: ')
  if contr_id == 'q':
    break
  # P31 connects a paper (subject) to its contribution (object)
  response = connector.statements.get_by_object_and_predicate(object_id = contr_id, predicate_id = 'P31', size = 10)
  if response.succeeded and response.content:
    paper = response.content[0]
    paper_id = paper['subject']['id']
    print(f'ID of a paper with the selected contribution: {paper_id}')
    # P26 connects a paper to its DOI literal
    response = connector.statements.get_by_subject_and_predicate(subject_id = paper_id, predicate_id = 'P26')
    if response.succeeded and response.content:
      doi = response.content[0]['object']['label']
      print(f'DOI of the paper: {doi}')
  else:
    print('Wrong ID, enter ID again')

In [None]:
# research problem, P32:
#  entity extraction, R567246
#  question answering, R567252
#  KG Completion, R573930
# approach, HAS_APPROACH:
#  Knowledge Graph Embedding, R69603
#  Deep Learning, R38156
# method, METHOD:
#  Bidirectional Encoder Representations from Transformers (BERT), R212628
#  Random Forest (RF), R212792
#  Support Vector Machine (SVM), R191529
# keywords, P3000:
#  Knowledge Graph, R286303
#  Natural language processing, R278000
#  Knowledge Representation, R109071
# evaluation, HAS_EVALUATION:
#  Precision, R286363
#  F1-Score, R318640
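
The interactive script prints contributions in the order they were found. For a prototype it may be more useful to rank them by how many filters they satisfy. The sketch below is one way to do that offline: `rank_contributions` is a hypothetical helper, the filter IDs are taken from the comment cell above, and the contribution IDs (`C1`, `C2`, `C3`) are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Filter = Tuple[str, str]  # (predicate ID, resource ID)


def rank_contributions(matches: Dict[Filter, Iterable[str]]) -> List[Tuple[str, int]]:
    """Rank contribution IDs by the number of distinct filters they satisfy."""
    counts: Counter = Counter()
    for contribution_ids in matches.values():
        counts.update(set(contribution_ids))  # each filter counts at most once per contribution
    # Highest count first; ties are broken alphabetically by ID for a stable order
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


# Filter IDs from the comment cell above; contribution IDs are invented placeholders.
matches = {
    ("P32", "R567252"): ["C1", "C2", "C2"],
    ("HAS_APPROACH", "R38156"): ["C2", "C3"],
}
print(rank_contributions(matches))
# [('C2', 2), ('C1', 1), ('C3', 1)]
```

In the script, `matches` could be filled by storing the subject IDs returned for each filter instead of appending them all to one list.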

# ***To Do:*** Develop a prototype (group idea)

***Contributors:***


1.   Gollam Rabby
2.   Ildar Baimuratov
3.   Yaser Jaradeh

