# What might we do differently?

## Purpose:
To briefly discuss the subjective and objective approaches available for analysing relationships in data.

## Background:
*LLMs* and *vector* search are great for the subjective assessment of data relationships: they identify collections that are semantically **"alike"**.  This enables LLMs to be *creative*, which is also why they get things wrong.
Graph databases, specifically *knowledge graphs*, are wholly objective in the inferences they draw from data relationships.  Wikidata is a graph database that sits behind a large proportion of Wikipedia.
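
The contrast can be sketched in a few lines of Python. This is only an illustration: the three-dimensional "embeddings" and the single triple are invented for the example and do not come from any real model or from Wikidata. A similarity score says how *alike* two things are, while a triple lookup says only whether a stated fact exists.

```python
import math

# Hypothetical toy embeddings, invented purely for illustration.
embeddings = {
    "playwright": [0.9, 0.1, 0.3],
    "dramatist": [0.85, 0.15, 0.35],
    "bricklayer": [0.1, 0.9, 0.2],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Subjective: vector search returns a degree of likeness, never a yes/no.
alike = cosine_similarity(embeddings["playwright"], embeddings["dramatist"])
unlike = cosine_similarity(embeddings["playwright"], embeddings["bricklayer"])

# Objective: a knowledge graph stores explicit (subject, predicate, object) facts.
triples = {("William Shakespeare", "occupation", "playwright")}


def has_fact(subject, predicate, obj):
    """Return True only if exactly this triple has been asserted."""
    return (subject, predicate, obj) in triples


print(alike > unlike)
print(has_fact("William Shakespeare", "occupation", "playwright"))
print(has_fact("William Shakespeare", "occupation", "dramatist"))
```

Note that the last lookup fails even though "dramatist" scores as nearly identical to "playwright" in the toy vectors: the graph answers only from what has been stated, which is the sense in which it is objective.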

## Method:
Use *LangChain* to orchestrate GPT-4 to write code that constructs an API call (and query) into Wikidata to answer a specific question.  LangChain will then manage the execution of the code and return the response to GPT-4 to format the layout.
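
The flow described above (write code, execute it, format the result) can be sketched without any LLM at all. Everything here is a hypothetical stand-in: `write_code` and `format_answer` play the role of GPT-4 with hard-coded behaviour, and `run_code` plays the role of the Python REPL tool. It is not how LangChain is implemented, only the shape of the loop.

```python
import contextlib
import io


def write_code(question: str) -> str:
    """Hypothetical stand-in for the LLM step that writes Python source."""
    # A real model would generate this text from the question; it is hard-coded here.
    return "result = {'born': 1564, 'died': 1616}\nprint(result)"


def run_code(source: str) -> str:
    """Hypothetical stand-in for the REPL tool: execute source, capture stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})  # unsafe for untrusted code; acceptable only in a sketch
    return buffer.getvalue().strip()


def format_answer(question: str, observation: str) -> str:
    """Hypothetical stand-in for the LLM step that formats the final reply."""
    return f"Question: {question}\nObserved output: {observation}"


def answer(question: str) -> str:
    """Chain the three steps: generate code, execute it, format the result."""
    source = write_code(question)
    observation = run_code(source)
    return format_answer(question, observation)


print(answer("When was William Shakespeare born, and when did he die?"))
```

The point of the split is that the model never touches the data directly: it only writes the code and reads back whatever that code printed.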

---

### What is a graph database?

<figure>
     <a href="https://link.springer.com/article/10.1007/s10639-023-11664-1/figures/3">
          <img src="https://media.springernature.com/full/springer-static/image/art%3A10.1007%2Fs10639-023-11664-1/MediaObjects/10639_2023_11664_Fig3_HTML.png"
               alt="image of a simple graph"
               height=600px>
     </a>
     <figcaption>A simple graph, used by the Metropolitan Museum of Art, to learn of the link between Madame X and the dress worn by Rita Hayworth in Gilda</figcaption>
</figure>




<figure>
     <a href="https://www.wikidata.org/wiki/Wikidata:Tools/Visualize_data#/media/File:Wikidata_Tree_Builder.png">
          <img src="https://upload.wikimedia.org/wikipedia/commons/1/1b/Wikidata_Tree_Builder.png"
               alt="image of a detailed graph"
               height=600px>
     </a>
     <figcaption>A dynamically generated graph, from data in Wikidata, showing the subclasses (and interconnections) of occupations to two levels, using <em>artist</em> as the root</figcaption>
</figure>
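
As a rough illustration of what such a graph holds, the sketch below stores a handful of "subclass of" edges and walks them breadth-first to a fixed depth, much as the tree above expands *artist* to two levels. The edges are a toy set invented for the example, not an extract from Wikidata, and `subclasses_within` is a hypothetical helper name.

```python
from collections import deque

# Toy (child, parent) "subclass of" edges, invented for illustration only.
subclass_of = [
    ("painter", "artist"),
    ("sculptor", "artist"),
    ("musician", "artist"),
    ("portrait painter", "painter"),
    ("pianist", "musician"),
    ("concert pianist", "pianist"),
]


def subclasses_within(root, edges, max_depth):
    """Map each subclass reachable from root within max_depth hops to its depth."""
    children = {}
    for child, parent in edges:
        children.setdefault(parent, []).append(child)

    found = {}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested number of levels
        for child in children.get(node, []):
            if child not in found:
                found[child] = depth + 1
                queue.append((child, depth + 1))
    return found


levels = subclasses_within("artist", subclass_of, max_depth=2)
for name, depth in levels.items():
    print(depth, name)
```

Every result here follows from an explicit edge, so "concert pianist" is left out simply because it sits three hops from the root, not because of any judgement about similarity.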





### Example

Demonstrate the use of Wikidata to source data, using LangChain to have GPT-4 construct the Python code and query, pass them to the Python REPL tool for execution, and then format the result neatly.
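
It may help to see the shape of the data the generated code has to navigate. The sketch below parses a hand-written sample that mimics the nesting of a Wikidata entity response. The sample values, including the `Q_PLACEHOLDER` identifier, are invented for illustration and are not real Wikidata content. The property IDs are Wikidata's own: P569 (date of birth), P570 (date of death), P19 (place of birth) and P20 (place of death). The helper names are hypothetical.

```python
# Hand-written sample in the shape of a Wikidata entity response (values invented).
SAMPLE_RESPONSE = {
    "entities": {
        "Q692": {
            "claims": {
                "P569": [{"mainsnak": {"datavalue": {"value": {"time": "+1564-04-23T00:00:00Z"}}}}],
                "P570": [{"mainsnak": {"datavalue": {"value": {"time": "+1616-04-23T00:00:00Z"}}}}],
                "P19": [{"mainsnak": {"datavalue": {"value": {"id": "Q_PLACEHOLDER"}}}}],
                "P20": [{"mainsnak": {"datavalue": {"value": {"id": "Q_PLACEHOLDER"}}}}],
            }
        }
    }
}


def first_value(claims, prop):
    """Return the value of the first statement for prop, or None if absent."""
    statements = claims.get(prop, [])
    if not statements:
        return None
    # Some statements carry no datavalue (unknown or no value), hence the .get chain.
    return statements[0].get("mainsnak", {}).get("datavalue", {}).get("value")


def birth_death_summary(response, entity_id):
    """Collect birth and death dates and place IDs for one entity."""
    claims = response["entities"][entity_id]["claims"]
    summary = {}
    for key, prop, field in [
        ("birth_date", "P569", "time"),
        ("death_date", "P570", "time"),
        ("birth_place_id", "P19", "id"),
        ("death_place_id", "P20", "id"),
    ]:
        value = first_value(claims, prop)
        summary[key] = value.get(field) if value else None
    return summary


summary = birth_death_summary(SAMPLE_RESPONSE, "Q692")
print(summary)
```

Places come back as entity IDs rather than names, so a real query would need a second lookup to turn each ID into a label; that step is left out of this sketch.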


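```python
import requests  # imported up front; the prompt below tells the agent it is available
import os
import openai

# Read the API key from the environment rather than hard-coding it.
openai.api_key = os.getenv("OPENAI_API_KEY")

from langchain.agents.agent_toolkits import create_python_agent
from langchain.tools.python.tool import PythonREPLTool
from langchain.llms.openai import OpenAIChat

# Build an agent that lets GPT-4 write Python and run it through the REPL tool.
agent_executor = create_python_agent(
    llm=OpenAIChat(temperature=0, model="gpt-4-0613"),
    tool=PythonREPLTool(),
    verbose=True  # print the agent's intermediate thoughts and actions
)
agent_executor.run("Using wikidata, fetch and printout the dates and locations of the birth and death of William Shakespeare.  The requests library is already installed.")
```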



> Entering new chain...
To fetch data from Wikidata, I can use the requests library to send a GET request to the Wikidata API. The API endpoint for Wikidata is "https://www.wikidata.org/w/api.php". I need to pass the action, format, and the entity ID as parameters. The entity ID for William Shakespeare is "Q692". I will then parse the response to extract the birth and death dates and locations.
Action: Python REPL
Action Input: 
```python
import requests

def get_entity_data(entity_id):
    url = "https://www.wikidata.org/w/api.php"
    params = {
        "action": "wbgetentities",
        "ids": entity_id,
        "format": "json"
    }
    response = requests.get(url, params=params)
    return response.json()

def get_birth_death_info(entity_data):
    claims = entity_data['entities']['Q692']['claims']
    birth_date = claims['P569'][0]['mainsnak']['datavalue']['value']['time']
    death_date = claims['P570'][0]['mainsnak']['datavalue']['value']['time']
    bi
```

'William Shakespeare was born on April 23, 1564 and died on April 23, 1616. Both his birth and death took place in Stratford-upon-Avon.'