<a href="https://colab.research.google.com/github/martin-fabbri/colab-notebooks/blob/master/tigergraph/tigergraph_inventor_schema_creation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Install pyTigerGraph

from IPython.display import clear_output
!pip install -qq watermark

################################
# Packages to install
################################
!pip install -U -qq pyTigerGraph

################################

clear_output()
%reload_ext watermark
%watermark -v -p numpy,pyTigerGraph

Python implementation: CPython
Python version       : 3.7.12
IPython version      : 5.5.0

numpy       : 1.21.5
pyTigerGraph: 0.0.9.9.2



# Global Schema

The **Global Schema** is an overarching schema that any number of **Graphs** can draw from. Its purpose is to give you a common set of definitions that you CAN pull from when setting up Graphs, which helps when different Graphs share elements that should have the same schema. For example, my **Global Schema** might describe my entire supply chain: suppliers, warehouses, transportation lines, user orders, product information, and user shipping information. One of our Graphs, however, might only need the suppliers, warehouses, and transportation, because that side of the business doesn't need access to customer information. I could populate that Graph with just those elements and their edges from the Global Schema, without including all of the information about the ordering user.

# Graph Schema
As discussed above, a Graph Schema can contain as much or as little of the Global Schema as you like, including none of it. It can also contain elements that are not in the Global Schema. Continuing the example above, suppose I want to keep track of the manufacturing side of my supply chain. I can include the suppliers, warehouses, and transportation from the Global Schema, and also add something like weather forecasts and supply forecasts that could be used to predict disruptions to my manufacturing chain. This data can exist in the Graph Schema for the Manufacturing Graph without being part of the Global Schema.
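
As a rough illustration of a Graph reusing part of the Global Schema, the sketch below builds a GSQL `CREATE GRAPH` statement as a plain Python string. The helper `create_graph_statement` and the type names (`Supplier`, `Warehouse`, `TransportLine`, `supplies`, `ships_via`) are hypothetical, chosen only to mirror the supply chain example. Graph-specific types such as the weather forecasts are added separately and are not shown here.

```python
def create_graph_statement(graph_name, global_types):
    """Build a GSQL CREATE GRAPH statement listing the global types to reuse."""
    return f"CREATE GRAPH {graph_name}({', '.join(global_types)})"


# Hypothetical vertex and edge type names taken from the supply chain example.
manufacturing_types = ["Supplier", "Warehouse", "TransportLine", "supplies", "ships_via"]

statement = create_graph_statement("Manufacturing", manufacturing_types)
print(statement)
```

The customer-facing types are simply left out of the list, so the Manufacturing Graph never sees them.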

## GSQL Schema Design

When you don't want to click through the interface to create your schema, you can define it in TigerGraph's language, **GSQL**. **GSQL** can be executed on your TigerGraph server through the **GSQL** terminal, or remotely through one of our many TigerGraph connectors. For this example we'll use [pyTigerGraph](https://github.com/pyTigerGraph/pyTigerGraph) to interface through this Python notebook. The GSQL remains the same regardless of which connector you use.
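
A minimal sketch of sending GSQL from the notebook might look like the following. The helper `run_gsql` and the stand-in class `EchoConnection` are hypothetical and exist only so the example runs without a server. With a live instance you would pass a pyTigerGraph connection instead; the host, username, and password in the comment are placeholders.

```python
def run_gsql(conn, statements):
    """Send GSQL statements through any connection object exposing gsql(str)."""
    script = "\n".join(statements)
    return conn.gsql(script)


class EchoConnection:
    """Stand-in for a real connection: returns the script instead of running it."""

    def gsql(self, query):
        return query


# With a live server, a pyTigerGraph connection would replace EchoConnection, e.g.
#   import pyTigerGraph as tg
#   conn = tg.TigerGraphConnection(host="https://<your-host>",
#                                  username="<user>", password="<password>")
result = run_gsql(EchoConnection(), ["USE GLOBAL", "ls"])
print(result)
```

Keeping the statements as plain strings means the same GSQL can be pasted into the GSQL terminal unchanged.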

### Creating a Vertex

The GSQL that creates the **Application** vertex looks like this:

`CREATE VERTEX Application(PRIMARY_ID id STRING, filingDate DATETIME, confirmationNumber STRING, docketNumber STRING, title STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"`

To simplify things, this is the pattern your most basic vertex declaration will follow:

`CREATE VERTEX <VertexType>(PRIMARY_ID id <DataType>, <attributeName1> <DataType1>) WITH PRIMARY_ID_AS_ATTRIBUTE="true"`

Additional attributes are separated by commas and placed after the first.
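
To make the pattern concrete, here is a small sketch that assembles a `CREATE VERTEX` statement from a vertex type, a primary id type, and a mapping of attribute names to data types. The helper `create_vertex_statement` is hypothetical and not part of pyTigerGraph. It only builds the string, which you would then send through your connector.

```python
def create_vertex_statement(vertex_type, id_type, attributes):
    """Build a CREATE VERTEX statement from a mapping of attribute name -> data type."""
    columns = [f"PRIMARY_ID id {id_type}"]
    # Additional attributes follow the primary id, separated by commas.
    columns += [f"{name} {data_type}" for name, data_type in attributes.items()]
    return (
        f"CREATE VERTEX {vertex_type}({', '.join(columns)}) "
        'WITH PRIMARY_ID_AS_ATTRIBUTE="true"'
    )


application = create_vertex_statement(
    "Application",
    "STRING",
    {
        "filingDate": "DATETIME",
        "confirmationNumber": "STRING",
        "docketNumber": "STRING",
        "title": "STRING",
    },
)
print(application)
```

The attribute order in the dictionary is the order used in the statement, so the result matches the **Application** vertex shown above.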

### Creating an Edge

The GSQL that creates an edge is very similar to the GSQL that creates a vertex. Here's the **is_continuation_type** edge:

`CREATE UNDIRECTED EDGE is_continuation_type(FROM Application, TO ContinuationType)`

Let's look at a **Directed** edge for comparison:

`CREATE DIRECTED EDGE has_child(FROM Application, TO Application, date DATETIME) WITH REVERSE_EDGE="reverse_has_child"`

And lastly, the pattern ( [  ] = **Optional**; `REVERSE_EDGE` applies only to directed edges ):

`CREATE DIRECTED|UNDIRECTED EDGE <edge_name>(FROM <VertexType>, TO <VertexType>, <attributeName1> <DataType1>) [WITH REVERSE_EDGE="<reverse_edge_name>"]`
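
The edge pattern can be sketched the same way. The helper `create_edge_statement` below is hypothetical and only builds the GSQL string. It rejects a reverse edge on an undirected edge, since `REVERSE_EDGE` is meaningful only for directed edges.

```python
def create_edge_statement(name, from_type, to_type, attributes=None,
                          directed=False, reverse_edge=None):
    """Build a CREATE DIRECTED|UNDIRECTED EDGE statement as a string."""
    if reverse_edge and not directed:
        raise ValueError("REVERSE_EDGE only applies to directed edges")
    parts = [f"FROM {from_type}", f"TO {to_type}"]
    # Optional edge attributes follow the FROM/TO pair, separated by commas.
    parts += [f"{attr} {dtype}" for attr, dtype in (attributes or {}).items()]
    kind = "DIRECTED" if directed else "UNDIRECTED"
    statement = f"CREATE {kind} EDGE {name}({', '.join(parts)})"
    if reverse_edge:
        statement += f' WITH REVERSE_EDGE="{reverse_edge}"'
    return statement


has_child = create_edge_statement(
    "has_child", "Application", "Application",
    attributes={"date": "DATETIME"},
    directed=True,
    reverse_edge="reverse_has_child",
)
print(has_child)
```

Called with `directed=True` and a reverse edge name, it reproduces the **has_child** statement shown earlier.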