# Iceberg with Polaris REST Catalog

This notebook is a basic code sample that connects to a Polaris REST catalog with PyIceberg, creates a table, and appends some dummy data.

## Prerequisites

The catalog 'polariscatalog' must already exist in Polaris before this notebook is executed; otherwise PyIceberg will not be able to connect to the REST catalog.
The credentials passed to `load_catalog` must also be replaced with your own.
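
One way to update the credentials without editing the notebook is to read them from the environment. The sketch below is a minimal example, assuming two hypothetical environment variables, `POLARIS_CLIENT_ID` and `POLARIS_CLIENT_SECRET`, plus an optional `POLARIS_URI` (none of these names come from this notebook). It builds the same kind of property dictionary that is passed to `load_catalog`.

```python
import os


def polaris_properties(env=os.environ):
    """Build REST catalog properties from environment variables.

    POLARIS_CLIENT_ID, POLARIS_CLIENT_SECRET and POLARIS_URI are hypothetical
    names chosen for this sketch; the other values mirror this notebook.
    """
    client_id = env.get("POLARIS_CLIENT_ID")
    client_secret = env.get("POLARIS_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Set POLARIS_CLIENT_ID and POLARIS_CLIENT_SECRET first")
    return {
        "uri": env.get("POLARIS_URI", "http://localhost:8181/api/catalog"),
        # The credential is passed as "<client_id>:<client_secret>".
        "credential": f"{client_id}:{client_secret}",
        "scope": "PRINCIPAL_ROLE:ALL",
        "warehouse": "polariscatalog",
    }
```

With this helper in place, the connection could be opened as `load_catalog("rest_catalog", **polaris_properties())`, which keeps secrets out of the saved notebook.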

In [1]:
from pyiceberg.catalog import load_catalog
from pyiceberg.schema import Schema
from pyiceberg.types import IntegerType, NestedField, StringType

import pandas as pd
import pyarrow as pa

In [2]:
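# Namespace (database) and fully qualified table name used below.
database = "haagenbergpolaris"
table_name = f"{database}.test"

# Iceberg schema: a required integer id and an optional string name.
schema = Schema(
    NestedField(field_id=1, name="id", field_type=IntegerType(), required=True),
    NestedField(field_id=2, name="name", field_type=StringType(), required=False),
)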


In [3]:
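catalog = load_catalog(
    "rest_catalog",
    **{
        "uri": "http://localhost:8181/api/catalog",
        # Replace with your own "<client_id>:<client_secret>" pair.
        "credential": "f4563fdc06cb3670:b2b29869e9106f7498ffc5d6055c295b",
        "scope": "PRINCIPAL_ROLE:ALL",
        # Name of the Polaris catalog created in the prerequisites.
        "warehouse": "polariscatalog",
    }
)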

In [7]:
# list_namespaces() returns namespace identifiers as tuples, so compare
# against a tuple instead of searching the string representation.
if (database,) not in catalog.list_namespaces():
    catalog.create_namespace(database)

if not catalog.table_exists(table_name):
    catalog.create_table(
        table_name,
        schema=schema,
        location=f"gs://tmp-erube-iceberg/polaris/{database}/test",
    )

table = catalog.load_table(table_name)
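
The namespace check in the cell above depends on what `list_namespaces()` returns. As a minimal sketch, assuming it returns a list of tuples such as `[("haagenbergpolaris",)]`, the following compares a substring search on the string form with an exact tuple comparison. The sample listing is made up, no catalog is contacted, and `namespace_exists` is a hypothetical helper.

```python
def namespace_exists(namespaces, name):
    """Return True only if `name` is exactly one of the listed namespaces."""
    return (name,) in namespaces


# Made-up listing: only a similarly named namespace exists.
listed = [("haagenbergpolaris_dev",), ("default",)]

# A substring search on the string form also matches the "_dev" entry.
substring_match = "haagenbergpolaris" in str(listed)
# An exact tuple comparison does not.
exact_match = namespace_exists(listed, "haagenbergpolaris")

print(substring_match, exact_match)  # prints: True False
```

The exact comparison avoids skipping `create_namespace` when a namespace with a longer, overlapping name happens to exist.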

In [5]:
catalog.properties

{'default-base-location': 'gs://tmp-erube-iceberg/polaris/',
 'uri': 'http://localhost:8181/api/catalog',
 'credential': 'f4563fdc06cb3670:b2b29869e9106f7498ffc5d6055c295b',
 'scope': 'PRINCIPAL_ROLE:ALL',
 'warehouse': 'polariscatalog',
 'token': 'principal:polarisuser;password:b2b29869e9106f7498ffc5d6055c295b;realm:default-realm;role:ALL',
 'prefix': 'polariscatalog'}
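
The output above shows the client credential and the token in plain text. A minimal sketch of masking such values before displaying them follows, assuming the sensitive keys are `credential` and `token`. The helper `redact` is hypothetical and not part of PyIceberg, and the example dictionary uses placeholder values.

```python
SENSITIVE_KEYS = {"credential", "token"}  # assumed set of secret-bearing keys


def redact(properties, sensitive=SENSITIVE_KEYS):
    """Return a copy of `properties` with sensitive values masked."""
    return {
        key: "***" if key in sensitive else value
        for key, value in properties.items()
    }


# Placeholder values standing in for catalog.properties.
example = {
    "uri": "http://localhost:8181/api/catalog",
    "credential": "<client_id>:<client_secret>",
    "warehouse": "polariscatalog",
}
print(redact(example))
```

In the notebook this could be used as `redact(catalog.properties)` so that saved cell output does not carry secrets.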

In [8]:
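# Build two sample rows and convert them to an Arrow table whose schema
# matches the Iceberg schema (required int32 id, optional string name).
df = pd.DataFrame([{"id": 1, "name": "tug"}, {"id": 3, "name": "tug"}])
data = pa.Table.from_pandas(df, schema=pa.schema([
    pa.field("id", pa.int32(), nullable=False),
    pa.field("name", pa.string()),
]))
table.append(data)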

  Expected `TableIdentifier` but got `dict` - serialized value may not be as expected
  return self.__pydantic_serializer__.to_json(
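
Because `id` is declared as required in the schema, every appended row should carry a non-null `id`. A small pre-flight check, sketched here in plain Python with a hypothetical helper `check_required` (not a PyIceberg or PyArrow function), can flag offending rows before the Arrow conversion. The sample rows are made up.

```python
def check_required(rows, required=("id",)):
    """Return the indexes of rows missing a value for any required column."""
    return [
        index
        for index, row in enumerate(rows)
        if any(row.get(column) is None for column in required)
    ]


# Made-up rows: the second has a null id, the third has no id key at all.
rows = [{"id": 1, "name": "tug"}, {"id": None, "name": "tug"}, {"name": "tug"}]
print(check_required(rows))  # prints: [1, 2]
```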


In [9]:
table

test(
  1: id: required int,
  2: name: optional string
),
partition by: [],
sort order: [],
snapshot: Operation.APPEND: id=2331249421951049425, parent_id=2666045365096731415, schema_id=0