# Creating Schema and Loading Data
## Notebook 0

This notebook creates the schema and loads the data we will be using. If you have already completed the prerequisites below, scroll to the bottom of the notebook for the schema creation and data loading steps.

### Pre-requisites
You must have a TigerGraph database instance provisioned. If you are using TigerGraph Cloud, you also need to create a graph called `KDD_2022_NFT` and generate a secret. The steps are outlined below:

#### 1. Provision a TigerGraph Cloud Instance

First, we need to provision a TigerGraph Cloud instance. Once you follow the invite link in the email you receive after submitting the Google Form, you will see a page like the one below:

<img src="../img/tgCreateSolution.png" alt="drawing" width="800"/>

Click **Create Solution** in the upper-right corner. You will then see:

<img src="../img/blankInstance.png" alt="drawing" width="800"/>

Select **Blank v3.6.1** and scroll to the bottom of the page to continue. This will then bring you to the instance configuration page.

<img src="../img/instanceConfig1.png" alt="drawing" width="800"/>

Select **AWS** as the platform, **N. Virginia** as the region, a **Public** endpoint, and the **TG.C8.M32** instance type. Leave the remaining settings at their defaults, scroll to the bottom of the page, and click **Next**.

<img src="../img/instanceConfig2.png" alt="drawing" width="800"/>

We will then name and tag the solution we are provisioning. **The names, tags, and subdomains must be unique, so choose something that is identifiable to you**.

<img src="../img/instanceName.png" alt="drawing" width="800"/>

After clicking **Next**, you should see a confirmation page where you can check the details and then click **Submit**.

<img src="../img/instanceConfirm.png" alt="drawing" width="800"/>

This will then take you back to the solutions page, where you can find your solution after a few minutes:

<img src="../img/mySolutionsPage.png" alt="drawing" width="800"/>

#### 2. Connect to GraphStudio

Once your solution is provisioned, we want to connect to the GraphStudio UI. To do this, click on the **Applications** icon and select GraphStudio.

<img src="../img/mySolutionsGS.png" alt="drawing" width="800"/>


#### 3. Create Graph

This will take you to the GraphStudio landing page, where we can create our graph. Click on the **Global View** button and select **Create a graph**:

<img src="../img/createGraphGS.png" alt="drawing" width="800"/>

Clicking this will bring up the following pop-up. Fill in the graph name as `KDD_2022_NFT`, and then select **Create**:

<img src="../img/createGraph.png" alt="drawing" width="800"/>

After this, you should see the following:

<img src="../img/designSchema.png" alt="drawing" width="800"/>

#### 4. Create Secret for Graph

Once your graph is created, we can create a secret we will use for authentication with the database. From the previous screen, click on **Admin Portal** in the upper right corner. It will bring you to this screen:

<img src="../img/apUsers.png" alt="drawing" width="800"/>

Under the **Management** tab in the left-hand menu bar, select **Users**. Here, you can create a secret. Define an alias and then click the **+** button:

<img src="../img/createSecret.png" alt="drawing" width="800"/>

Make sure to copy the generated secret right away, as it will not be shown again.

<img src="../img/copySecret.png" alt="drawing" width="800"/>

Finally, we can paste the secret into the notebook cell below:

<img src="../img/pasteSecret.png" alt="drawing" width="800"/>

## Connect to TigerGraph Database

The code block below connects to a TigerGraph database. Replace `YOUR_HOSTNAME_HERE` and both occurrences of `YOUR_SECRET_HERE` with your own values before running it, or the connection will fail.

In [None]:
from pyTigerGraph import TigerGraphConnection

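# Replace the placeholders with your instance's hostname and the secret created above.
conn = TigerGraphConnection(
    host="YOUR_HOSTNAME_HERE",
    graphname="KDD_2022_NFT",
    gsqlSecret="YOUR_SECRET_HERE",
)
# Request an API token using the same secret.
conn.getToken("YOUR_SECRET_HERE")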
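
Pasting the secret directly into a cell makes it easy to leak when the notebook is shared or committed. As a minimal sketch, assuming you have exported the hostname and secret as environment variables, the values can be read at run time instead. The variable names `TG_HOST` and `TG_SECRET` and the helper `read_tg_credentials` are hypothetical choices for this example, not part of pyTigerGraph.

```python
import os


def read_tg_credentials(env=None):
    """Return (host, secret) read from an environment-style mapping.

    TG_HOST and TG_SECRET are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    # Collect every missing or empty variable so the error names all of them.
    missing = [name for name in ("TG_HOST", "TG_SECRET") if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return env["TG_HOST"], env["TG_SECRET"]
```

With this in place, `host, secret = read_tg_credentials()` yields the two values to pass as `host=host` and `gsqlSecret=secret` to `TigerGraphConnection`, and `secret` again to `conn.getToken`.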

## Define and Create Schema
Here, we use a GSQL file that contains the schema definition. This step may take a few minutes.

In [None]:
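# Read the schema definition and run it; the context manager closes the file afterwards.
with open("../src/gsql/schema/schema.gsql", "r") as f:
    print(conn.gsql(f.read()))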

## Define Loading Job
We define a loading job to map our raw data into our schema.

In [None]:
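# Read the loading job definition and run it; the context manager closes the file afterwards.
with open("../src/gsql/loading/load_data.gsql", "r") as f:
    print(conn.gsql(f.read()))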
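
The schema and loading-job cells follow the same pattern: read a GSQL script from disk and pass its text to `conn.gsql`. The sketch below wraps that pattern in one helper that also fails early with a clear message when the path is wrong. The name `run_gsql_file` is a hypothetical helper for this notebook, not a pyTigerGraph function; it only assumes that `conn` has a `gsql` method that accepts a string.

```python
from pathlib import Path


def run_gsql_file(conn, path):
    """Read a GSQL script from disk and send its text to the database.

    `conn` is any object with a `gsql(str)` method, such as the
    TigerGraphConnection created earlier in this notebook.
    """
    script_path = Path(path)
    # Fail before contacting the database if the script is not where we expect.
    if not script_path.is_file():
        raise FileNotFoundError(f"GSQL script not found: {script_path}")
    return conn.gsql(script_path.read_text())
```

For example, `print(run_gsql_file(conn, "../src/gsql/loading/load_data.gsql"))` would behave like the cell above, with the added path check.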

## Run Loading Job
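The first cell below uploads our data file to the TigerGraph database by running the `load_nft_data` loading job. The second cell counts the `NFT_User` vertices to verify that the data was loaded. This step may take a few minutes.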

In [None]:
conn.runLoadingJobWithFile("../data/Data_API_sampled.csv", "file1", "load_nft_data", sep=";")
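
The upload passes `sep=";"`, so the loading job expects a semicolon-delimited file. If the job fails or loads nothing, a quick local look at the file can rule out a delimiter mismatch. The sketch below uses only the standard library; `preview_delimited_file` is a hypothetical helper, and it assumes the file is UTF-8 encoded and that every row has the same number of fields, which may not hold for every dataset.

```python
import csv
from itertools import islice


def preview_delimited_file(path, sep=";", n_rows=5):
    """Return the first n_rows of a delimited file as lists of fields."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=sep)
        rows = list(islice(reader, n_rows))
    if not rows:
        raise ValueError(f"{path} is empty")
    # A single field per row, or differing widths, suggests the wrong separator.
    widths = {len(row) for row in rows}
    if len(widths) != 1 or widths == {1}:
        raise ValueError(
            f"Unexpected column counts {sorted(widths)}; is sep={sep!r} correct?"
        )
    return rows
```

Calling `preview_delimited_file("../data/Data_API_sampled.csv")` returns the first few rows split into fields, so the column layout can be compared with what the loading job expects.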

In [None]:
conn.getVertexCount("NFT_User")
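
Reading the count by eye works in a notebook, but an explicit check makes a failed load harder to miss. A minimal sketch follows, assuming `conn.getVertexCount` returns an integer when given a single vertex type; `check_vertex_count` and its `minimum` parameter are hypothetical names introduced here.

```python
def check_vertex_count(conn, vertex_type, minimum=1):
    """Return the vertex count, raising if fewer than `minimum` were loaded."""
    # Assumes getVertexCount returns an int for a single vertex type.
    count = conn.getVertexCount(vertex_type)
    if count < minimum:
        raise RuntimeError(
            f"Expected at least {minimum} {vertex_type} vertices, found {count}"
        )
    return count
```

For example, `check_vertex_count(conn, "NFT_User")` returns the same number as the cell above and raises an error if no `NFT_User` vertices were loaded.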