---
title: "Working with JSON-LD"
author: "Charles F. Vardeman II"
date: "2023-09-09"
format:
    html:
        code-fold: true
        grid:
          margin-width: 350px
reference-location: margin
citation-location: margin
bibliography: kg.bib
---

## Introduction to JSON-LD for Knowledge Graph Construction
Welcome to this tutorial on JSON-LD for constructing knowledge graphs. As data volumes grow, the ability to organize, link, and query information efficiently becomes critical. Knowledge graphs underpin a variety of applications, from search engines and recommendation systems to machine learning research. JSON-LD (JavaScript Object Notation for Linked Data) offers a lightweight, flexible, and web-friendly way to represent structured data. This tutorial is a guide to using JSON-LD to build robust, scalable knowledge graphs.

## Background
Before diving into the tutorial, it's essential to understand the foundational elements: JSON-LD and knowledge graphs. We'll also touch on some of the ethical considerations inherent to these technologies.

### JSON-LD: A Bridge Between JSON and the Semantic Web
JSON-LD extends the popular JSON format with semantic vocabulary and context, making data easier to interlink and share. It originated in the W3C JSON-LD Community Group; JSON-LD 1.0 was standardized by the W3C RDF Working Group in 2014, and JSON-LD 1.1 by the W3C JSON-LD Working Group in 2020. The goal was a unified data representation that works across platforms and languages. Because a JSON-LD document is still ordinary JSON, web developers can tap into linked data without a steep learning curve.

### Knowledge Graphs: Structuring the World's Information
A knowledge graph is a graph-structured representation of data in which nodes stand for entities and labeled edges stand for the relationships between them. Knowledge graphs allow for complex queries and give machine learning algorithms explicit context about how entities relate. These graphs serve not just as data storage but as a way to model real-world scenarios, which makes them useful in fields like AI, healthcare, and finance.
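To make the entity-and-relationship idea concrete, the following minimal sketch stores a graph as a set of (subject, predicate, object) triples and answers a query by pattern matching. The `ex:` identifiers and the `match` helper are hypothetical names chosen for illustration; the facts mirror rows of the ship table used later in this tutorial. A real system would use a triple store or graph database rather than a Python set.

```python
# Hypothetical "ex:" identifiers; each tuple is one edge of the graph.
triples = {
    ("ex:CVN-72", "ex:name", "USS Abraham Lincoln"),
    ("ex:CVN-72", "ex:class", "ex:Nimitz"),
    ("ex:SSBN-731", "ex:name", "USS Alabama"),
    ("ex:SSBN-731", "ex:class", "ex:Ohio"),
    ("ex:SSBN-732", "ex:name", "USS Alaska"),
    ("ex:SSBN-732", "ex:class", "ex:Ohio"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which entities belong to the Ohio class?
ohio_ships = [s for s, _, _ in match(triples, p="ex:class", o="ex:Ohio")]
print(ohio_ships)
```

The same pattern-matching idea, generalized with variables and joins, is what query languages for graph data provide.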

### Ethical Considerations
While JSON-LD and knowledge graphs offer powerful tools for data manipulation and representation, they come with ethical responsibilities. These include ensuring data privacy, mitigating biases in how knowledge is represented, and complying with open data standards.

## Why JSON-LD Was Created and Its Relationship to Web Data
JSON-LD emerged as a response to the need for a lightweight, easy-to-use format for representing linked data on the web. Prior to its introduction, existing formats like RDF/XML were robust but often too complex for everyday web development. JSON-LD bridges this gap, offering the ease of use of JSON while incorporating the semantic linking capabilities needed for a more interconnected web.

### Practical Use in Schema.org
JSON-LD plays a pivotal role in Schema.org, a collaborative initiative to create, maintain, and promote schemas for structured data on the Internet. It is one of the syntaxes Schema.org supports, alongside Microdata and RDFa. By using JSON-LD with Schema.org vocabularies, webmasters can make the structure of their pages explicit to crawlers and other consumers. This has practical implications, such as eligibility for rich results in search engines and better interoperability with other web services.

### Transforming Standard JSON to a Knowledge Graph
The beauty of JSON-LD lies in its ability to transform standard JSON data into a knowledge graph simply and efficiently. By adding an `@context` to a JSON document, developers can define how the data should be interpreted semantically. This turns a flat data structure into an interconnected web of information, opening up powerful querying and linking possibilities.
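As a rough illustration of what a context does, the sketch below attaches a context to a plain JSON record and then replaces each key with the IRI the context maps it to. The `hullNumber` IRI under `example.org` and both helper functions are assumptions made for this example. The `expand_terms` function is a deliberately simplified stand-in for the JSON-LD expansion algorithm: it handles only flat documents with string term definitions, whereas a conforming processor such as PyLD implements the full algorithm.

```python
import json

# A plain JSON record with no semantics attached.
plain = {"name": "USS Abraham Lincoln", "hullNumber": "CVN-72"}

# Map each key to an IRI. The hullNumber IRI is hypothetical.
context = {
    "name": "http://schema.org/name",
    "hullNumber": "http://example.org/vocab#hullNumber",
}


def with_context(data, context):
    """Return a JSON-LD document: the same data plus an @context."""
    return {"@context": context, **data}


def expand_terms(doc):
    """Simplified illustration only: swap each key for its IRI from @context."""
    ctx = doc.get("@context", {})
    return {ctx.get(key, key): value for key, value in doc.items() if key != "@context"}


doc = with_context(plain, context)
print(json.dumps(doc, indent=2))
print(json.dumps(expand_terms(doc), indent=2))
```

After the key replacement, two documents that use different local key names but the same IRIs can be recognized as describing the same properties, which is what makes the data linkable.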

### JSON-LD 1.1 Features
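JSON-LD 1.1, published as a W3C Recommendation in July 2020, added features that give finer control over how data is contextualized. Key additions include scoped contexts (term definitions that apply only within a given type or property), graph containers (`"@container": "@graph"`), which treat a property's value as a graph of its own, `@nest` for grouping properties under a key that carries no semantic meaning, id and type maps, and `@json` literals for embedding arbitrary JSON values. These features make it easier to map existing JSON structures onto a knowledge graph without reshaping the data first.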
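One of the 1.1 additions to context handling is the type-scoped context: a term can be given a different meaning inside nodes of a particular type. The sketch below shows such a context as a Python dictionary and mimics the lookup with a small helper. The `example.org` IRIs and the `resolve_term` function are assumptions for illustration, and the helper covers only this one case rather than the full context processing algorithm.

```python
context = {
    "@version": 1.1,
    # Default meaning of "name" everywhere in the document.
    "name": "http://schema.org/name",
    # Inside nodes typed "Ship", "name" is redefined (hypothetical IRIs).
    "Ship": {
        "@id": "http://example.org/vocab#Ship",
        "@context": {
            "name": "http://example.org/vocab#shipName",
        },
    },
}


def resolve_term(context, term, node_type=None):
    """Look up a term, preferring a definition scoped to node_type if one exists."""
    scoped = {}
    type_definition = context.get(node_type)
    if isinstance(type_definition, dict):
        scoped = type_definition.get("@context", {})
    return scoped.get(term, context.get(term))


print(resolve_term(context, "name"))
print(resolve_term(context, "name", node_type="Ship"))
```

Scoping of this kind lets one context serve several kinds of records that reuse the same short key names with different meanings.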

## JSON-LD and Web APIs
In the modern web ecosystem, APIs serve as the bridges between different services and applications. JSON-LD's compatibility with web APIs makes it a top choice for developers needing to consume or provide structured, linked data. Its seamless integration with RESTful services ensures that you can work within a familiar environment while benefiting from enhanced data semantics.

Moreover, because every entity in a JSON-LD response can carry a global identifier (`@id`), clients can aggregate, filter, and transform data from several API calls without custom joining logic. This creates opportunities for developing richer, more interactive applications that adapt in real time to changes in the underlying data. For example, by using JSON-LD in a RESTful API for a content management system, you could link related articles, authors, and tags by identifier, giving clients everything they need to follow those links.
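The content management example can be sketched as a function that turns an internal record into a JSON-LD node whose author, tags, and related articles are links rather than copies. Everything specific here is an assumption for illustration: the `yourapi.com` host, the URL paths, the record's field names, and the choice of the Schema.org properties `about` and `mentions` to carry tags and related articles.

```python
import json

BASE = "https://yourapi.com"  # placeholder host, not a real service


def article_to_jsonld(article):
    """Map a CMS record (hypothetical field names) to a JSON-LD node."""
    return {
        "@context": "https://schema.org/",
        "@id": f"{BASE}/articles/{article['slug']}",
        "@type": "Article",
        "headline": article["title"],
        # Links are expressed as {"@id": ...} so clients can dereference them.
        "author": {"@id": f"{BASE}/authors/{article['author_id']}"},
        "about": [{"@id": f"{BASE}/tags/{tag}"} for tag in article["tags"]],
        "mentions": [{"@id": f"{BASE}/articles/{slug}"} for slug in article["related"]],
    }


record = {
    "slug": "json-ld-intro",
    "title": "Working with JSON-LD",
    "author_id": 42,
    "tags": ["json-ld", "knowledge-graphs"],
    "related": ["rdf-basics"],
}
doc = article_to_jsonld(record)
print(json.dumps(doc, indent=2))
```

Because each linked resource is named by a URL, a client that has already fetched an author or tag can merge the two responses by matching `@id` values.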

Additionally, because JSON-LD is ordinary JSON carried over HTTP, it works unchanged with existing API infrastructure such as OAuth for authorization and CORS for cross-origin resource sharing. Adopting it therefore adds semantics to an API without requiring changes to how the API is secured or deployed.

Finally, JSON-LD also plays a significant role in the realm of Web APIs for semantic search engines and linked data platforms. These APIs can consume JSON-LD to understand the contextual relationships between different pieces of information, thereby enabling more intelligent and nuanced search queries.



## Example: Exposing JSON-LD Context via HTTP Link Header
When an API serves plain JSON (`application/json`), its JSON-LD context can be advertised in an HTTP `Link` header, similar to how Schema.org does it. This lets clients discover the context automatically and interpret the response as linked data.

Suppose you have a RESTful API for a blog platform, and you want to expose a JSON-LD context for articles. The HTTP response could include a link header pointing to the JSON-LD context:

```http
HTTP/1.1 200 OK
Content-Type: application/json
Link: <https://yourapi.com/docs/jsonldcontext.json>; rel="http://www.w3.org/ns/json-ld#context"; type="application/ld+json"
```

With this setup, clients consuming the API can follow the link to fetch the context and understand the semantics of the data. Here is a simplified example of what the `jsonldcontext.json` might look like:

```json
{
  "@context": {
    "title": "http://schema.org/headline",
    "author": "http://schema.org/author",
    "datePublished": "http://schema.org/datePublished",
    "content": "http://schema.org/text"
  }
}
```

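By using the `Link` header to expose the JSON-LD context, you make it easier for clients to consume and understand your API's data without changing the JSON payload itself. The mechanism is defined in the JSON-LD specification (interpreting JSON as JSON-LD). It applies to responses served as `application/json`; responses served as `application/ld+json` are expected to carry their own `@context`.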
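On the client side, discovering the context comes down to finding the link whose relation is `http://www.w3.org/ns/json-ld#context`. The following sketch does this with a small regular-expression parser. The `find_context_url` helper is a hypothetical name, and the parser is a simplification that assumes quoted parameter values with no embedded commas; a production client would use a complete Link header parser.

```python
import re

JSONLD_CONTEXT_REL = "http://www.w3.org/ns/json-ld#context"


def find_context_url(link_header):
    """Return the target of the first link with the JSON-LD context relation, or None."""
    # Each link looks like: <target>; param="value"; param="value"
    for target, params in re.findall(r"<([^>]*)>([^,]*)", link_header):
        rel = re.search(r'rel="([^"]*)"', params)
        # A rel value may hold several space-separated relation types.
        if rel and JSONLD_CONTEXT_REL in rel.group(1).split():
            return target
    return None


header = (
    '<https://yourapi.com/docs/jsonldcontext.json>; '
    'rel="http://www.w3.org/ns/json-ld#context"; '
    'type="application/ld+json"'
)
print(find_context_url(header))
```

Once the URL is known, the client can fetch the context document and apply it to the JSON body as if the body had contained that `@context` itself.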


## Example: Using JSON-LD to Construct a Knowledge Graph from Wikipedia Tables

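```python
import requests
from bs4 import BeautifulSoup

# Download the Wikipedia page that lists the ships
url = "https://en.wikipedia.org/wiki/List_of_current_ships_of_the_United_States_Navy"
response = requests.get(url)
response.raise_for_status()  # stop early if the request failed

# Parse the HTML and keep the first table on the page
soup = BeautifulSoup(response.text, 'html.parser')
table = soup.find_all("table")[0]
```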


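```python
from io import StringIO

import pandas as pd

# Wrap the HTML in StringIO: recent pandas releases deprecate passing
# a literal HTML string to read_html.
df = pd.read_html(StringIO(str(table)))[0]
df
```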


|     | Ship name | Hull number | Class | Type | Commission date | Homeport[2] | Note |
|-----|-----------|-------------|-------|------|-----------------|-------------|------|
| 0 | USS Abraham Lincoln | CVN-72 | Nimitz | Aircraft carrier | 11 November 1989 | San Diego, CA | [3] |
| 1 | USS Alabama | SSBN-731 | Ohio | Ballistic missile submarine | 25 May 1985 | Bangor, WA | [4] |
| 2 | USS Alaska | SSBN-732 | Ohio | Ballistic missile submarine | 25 January 1986 | Kings Bay, GA | [5] |
| 3 | USS Albany | SSN-753 | Los Angeles | Attack submarine | 7 April 1990 | Norfolk, VA | [6] |
| 4 | USS Alexandria | SSN-757 | Los Angeles | Attack submarine | 29 June 1991 | San Diego, CA | [7] Scheduled to be decommissioned 2026[8] |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 234 | USS William P. Lawrence | DDG-110 | Arleigh Burke | Destroyer | 19 May 2011 | San Diego, CA | [242] |
| 235 | USS Winston S. Churchill | DDG-81 | Arleigh Burke | Destroyer | 10 March 2001 | Norfolk, VA | [243] |
| 236 | USS Wichita | LCS-13 | Freedom | Littoral combat ship | 12 January 2019 | Mayport, FL | [244] Proposed to be decommissioned 2023[17] |
| 237 | USS Wyoming | SSBN-742 | Ohio | Ballistic missile submarine | 13 July 1996 | Kings Bay, GA | [245] |


This is a good start at getting the ship names. Let's get the columns in a list.

```python
column_names = df.columns.tolist()
column_names
```

```text
['Ship name',
 'Hull number',
 'Class',
 'Type',
 'Commission date',
 'Homeport[2]',
 'Note']
```

To use the column names as the basis for URIs later on, we first need to make them web safe by removing spaces, brackets, and other characters that are awkward in a URI.
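As a side note, a somewhat stricter normalization could also drop Wikipedia footnote markers such as `[2]` and any remaining punctuation. The sketch below is one way to do that; the `to_term` helper is a hypothetical name and is not part of this tutorial's pipeline, which uses a simpler replace-based approach that turns `Homeport[2]` into `Homeport2`.

```python
import re


def to_term(column_name):
    """Turn a table header into a URI-friendly term (illustrative helper)."""
    name = re.sub(r"\[\d+\]", "", column_name)   # drop footnote markers like [2]
    name = name.replace("\u00a0", " ").strip()   # non-breaking spaces become plain spaces
    name = re.sub(r"\s+", "_", name)             # runs of whitespace become one underscore
    return re.sub(r"[^A-Za-z0-9_]", "", name)    # remove any other unsafe characters


columns = ["Ship name", "Hull number", "Class", "Type",
           "Commission date", "Homeport[2]", "Note"]
terms = [to_term(column) for column in columns]
print(terms)
```

Whichever rule is chosen, it should be applied consistently, because the resulting names become the keys that the JSON-LD context maps to IRIs.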

```python
# Normalize column names: spaces become underscores and square brackets are dropped.
# The footnote digit survives, so "Homeport[2]" becomes "Homeport2".
normalized_columns = {col: col.replace(" ", "_").replace("[", "").replace("]", "") for col in column_names}
df.rename(columns=normalized_columns, inplace=True)

# Reformat the commission dates as YYYY-MM-DD, the lexical form of xsd:date
df['Commission_date'] = pd.to_datetime(df['Commission_date']).dt.strftime('%Y-%m-%d')

# Serialize the table as a JSON array with one object per row
df_json = df.to_json(orient="records")
df_json
```

'[{"Ship_name":"USS\\u00a0Abraham Lincoln","Hull_number":"CVN-72","Class":"Nimitz","Type":"Aircraft carrier","Commission_date":"1989-11-11","Homeport2":"San Diego, CA","Note":"[3]"},{"Ship_name":"USS\\u00a0Alabama","Hull_number":"SSBN-731","Class":"Ohio","Type":"Ballistic missile submarine","Commission_date":"1985-05-25","Homeport2":"Bangor, WA","Note":"[4]"},{"Ship_name":"USS\\u00a0Alaska","Hull_number":"SSBN-732","Class":"Ohio","Type":"Ballistic missile submarine","Commission_date":"1986-01-25","Homeport2":"Kings Bay, GA","Note":"[5]"},{"Ship_name":"USS\\u00a0Albany","Hull_number":"SSN-753","Class":"Los Angeles","Type":"Attack submarine","Commission_date":"1990-04-07","Homeport2":"Norfolk, VA","Note":"[6]"},{"Ship_name":"USS\\u00a0Alexandria","Hull_number":"SSN-757","Class":"Los Angeles","Type":"Attack submarine","Commission_date":"1991-06-29","Homeport2":"San Diego, CA","Note":"[7] Scheduled to be decommissioned 2026[8]"},{"Ship_name":"USS\\u00a0America","Hull_number":"LHA-6","Class