### Imports and installs

In [None]:
!pip3 install locknessie[microsoft] pyiceberg pyarrow s3fs pandas requests

In [None]:
from locknessie.main import LockNessie
from pyiceberg.catalog import load_catalog
from pyiceberg.exceptions import SignError, CommitStateUnknownException
import pyarrow as pa
import requests

### Setting Up
The first thing we want to do is query some data from the existing `demo.sailboats` table. 
Since our user has `main.Read` permissions in the IDP, this should work. 

To query the data, we need a token that authenticates us as the current user. Let's get one now.

In [None]:
# authing a token
NESSIE_URL = "http://nessie:19120"
lnessie = LockNessie()
token = lnessie.get_token()
# If this is your first time running the script, this will print out a URL that 
# you need to copy and paste into a new browser tab. 
# Follow the auth instructions in that tab. 
# The script will block until you have completed this auth in your browser the first time. 

### Business Logic 
There are a few operations we need to perform repeatedly in this demo, so to keep things tidy they are namespaced in the `AuthDemo` class below.

In [None]:
class AuthDemo:
    nessie_url: str
    catalog: "pyiceberg.catalog.rest.RestCatalog"

    def __init__(self, nessie_url:str):
        self.nessie_url = nessie_url
        self.catalog = None 
        
    def checkout_catalog(self, branch:str) -> None:
        """sets the working catalog"""
        del self.catalog
        self.catalog = load_catalog(
        "nessie",
        **{
            "uri": f"{self.nessie_url}/iceberg/{branch}/",
            "token": token,
            "header.X-Iceberg-Access-Delegation": "remote-signing",
            "s3.session-token": "placeholder",
        }
        )
    def get_table_from_lakehouse(self, tablename:str) -> "pyiceberg.table.Table": 
        """loads the named table from the lakehouse on the current branch"""
        return self.catalog.load_table(tablename)

    @classmethod
    def as_pandas(cls, table:"pyiceberg.table.Table") -> "pandas.DataFrame":
        """converts the table to a pandas DataFrame, since the pandas API is more widely known"""
        return table.scan().to_arrow().to_pandas()

    def add_a_row(self, table: "pyiceberg.table.Table", row:dict) -> None:
        """writes a row to the datalake table (schema is hardcoded for the demo)"""
        schema = pa.schema([
            pa.field('id', pa.int32(), nullable=False),
            pa.field('name', pa.string()),
            pa.field('sailplan', pa.string()),
            pa.field('draft', pa.float64()),
        ])
        new_boat = pa.Table.from_pydict(row, schema=schema)
        try:
            table.append(new_boat)
        except (SignError, CommitStateUnknownException):
            RED = '\033[31m'
            RESET = '\033[0m'
            print(f"{RED}YOU DO NOT HAVE AUTHORIZATION!{RESET}")

    def create_new_branch(self, branch_name:str, from_branch:str) -> dict:
        """creates a new branch from head of from_branch 
        Returns: 
            the newly created reference, as returned by Nessie
        """
        headers={"Authorization": f"Bearer {token}",
                  "Content-Type": "application/json"
                 }
        btype={"type":"BRANCH"}
        current_branch_hash = requests.get(f"{self.nessie_url}/api/v2/trees/{from_branch}",
                                           headers=headers).json()["reference"]["hash"]
        new_branch_request = requests.post(f"{self.nessie_url}/api/v2/trees",
                                           headers=headers,
                                           params={"name":branch_name, **btype},
                                           json={"name":from_branch,
                                                 "hash":current_branch_hash,
                                                **btype}
                                          )
        return new_branch_request.json()["reference"]

We will start on the `main` branch, which is typically the canonical/production/read-only data source for your organization.
An easy way to check your connection is to list the available namespaces:

In [None]:
authdemo = AuthDemo(NESSIE_URL)
authdemo.checkout_catalog("main")

# this should output a single namespace "demo" that was created by el_script.py at startup
authdemo.catalog.list_namespaces()

<details>
    <summary style="color:red;">Did you not see a single namespace "demo"? </summary>
     If not, you will want to print out your token and start debugging. A good place to start is [jwt.io](https://jwt.io), where you can decode the token and make sure the token has: 
    
- the correct audience (should be your client ID)
- the correct roles/groups (depending on how your IDP exposes them)
- a valid signature
</details>
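
If you would rather inspect the token inside the notebook than paste it into a website, the claims can be read with the standard library alone. The sketch below assumes the token is a standard three-segment JWT; `decode_jwt_claims` is a hypothetical helper, not part of LockNessie, and it does **not** verify the signature, so use it for debugging only.

```python
import base64
import json


def decode_jwt_claims(jwt_token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    parts = jwt_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload_b64 = parts[1]
    # JWT segments are base64url-encoded with the padding stripped; restore it
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


# In this notebook you would call: decode_jwt_claims(token)
# and check the "aud" claim plus whichever claim your IDP uses for roles/groups.
```

The audience normally lives in the `aud` claim; the name of the claim that carries roles or groups varies by IDP.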

### Querying The Data Lakehouse
Assuming you did see the correct namespace, you can now load the `demo.sailboats` table and read all of its data into a dataframe (just to make it easy to manipulate, since most people are more familiar with the pandas API).

_**Note**: you can ignore the annoying "missing Cython implementation" message; this is just a demo, so we don't need all that speed._

In [None]:
authdemo = AuthDemo(NESSIE_URL)
authdemo.checkout_catalog("main")
table = authdemo.get_table_from_lakehouse("demo.sailboats")
authdemo.as_pandas(table).head()

### Doing Things You Shouldn't
Now let's try to write a new boat to the `main` branch. Since this is our production branch and is protected, the write should be rejected and you should see an authorization error. 

In [None]:
new_sailboat = {"id":[4], "name":["1979 Tayana 44"], "sailplan": ["Staysail Ketch"], "draft": [5.6]}
authdemo.add_a_row(table, new_sailboat)

### Switching Branches
You still want to add this new boat to the data lakehouse, and you can do that in your own branch of the data (not prod). This new branch will be named `new-feature`.
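
For reference, the branch creation in `AuthDemo.create_new_branch` boils down to a single POST whose query string names the new branch and whose body names the source reference. The sketch below only assembles that request so its shape is easy to see; `build_create_branch_request` is a hypothetical helper, and the hash value is a placeholder rather than a real commit hash.

```python
def build_create_branch_request(
    nessie_url: str, branch_name: str, from_branch: str, from_hash: str
) -> dict:
    """Assemble the url/params/json used to create a branch, mirroring AuthDemo."""
    ref_type = "BRANCH"
    return {
        "url": f"{nessie_url}/api/v2/trees",
        # query string: the name and type of the reference being created
        "params": {"name": branch_name, "type": ref_type},
        # body: the source reference (and its hash) that the new branch starts from
        "json": {"name": from_branch, "hash": from_hash, "type": ref_type},
    }


# "abc123" is a placeholder; the real hash comes from GET {nessie_url}/api/v2/trees/main
request_kwargs = build_create_branch_request(
    "http://nessie:19120", "new-feature", "main", "abc123"
)
# It could then be sent with: requests.post(**request_kwargs, headers=headers)
```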

In [None]:
authdemo = AuthDemo(NESSIE_URL)
authdemo.create_new_branch("new-feature", "main")
authdemo.checkout_catalog("new-feature")

Great! Now that we are off the production `main` branch and on `new-feature`, we can go ahead and manipulate the data. 

In [None]:
# confirm that the new branch matches main, initially
table = authdemo.get_table_from_lakehouse("demo.sailboats")
authdemo.as_pandas(table).head()

### Writing on the New Branch
Now we can add the row, and it will work because we are on a feature branch and not `main`.

In [None]:
# add the row to the sailboats table on new-feature
authdemo.add_a_row(table, new_sailboat)

In [None]:
# confirm that the new row was added to the table
table = authdemo.get_table_from_lakehouse("demo.sailboats")
authdemo.as_pandas(table)

### Confirm Main is Unchanged
Now we can switch back to `main` to be sure that the changes we made are isolated to the `new-feature` branch.
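
Eyeballing two dataframes works for a table this small, but the same check can be made explicit. The sketch below is a hypothetical helper that compares rows by `id`; the records are made-up stand-ins for what `authdemo.as_pandas(table).to_dict("records")` would give you on each branch.

```python
def rows_only_in_branch(branch_rows: list, main_rows: list, key: str = "id") -> list:
    """Return the rows whose key appears on the branch but not on main."""
    main_keys = {row[key] for row in main_rows}
    return [row for row in branch_rows if row[key] not in main_keys]


# Illustrative records only (not the actual demo data)
main_rows = [{"id": 1, "name": "boat-1"}, {"id": 2, "name": "boat-2"}]
branch_rows = main_rows + [{"id": 4, "name": "1979 Tayana 44"}]

new_rows = rows_only_in_branch(branch_rows, main_rows)
print(new_rows)
```

If the branches are isolated as expected, the new boat shows up in this difference and nothing from `main` is missing on the branch.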

In [None]:
authdemo.checkout_catalog("main")
main_table = authdemo.get_table_from_lakehouse("demo.sailboats")
authdemo.as_pandas(main_table)

### That's it! 
You are able to authenticate and correctly access only the data you _should_ be allowed to access. Yay!