## Create object store containers (S3 buckets)
We will use three S3-compatible buckets:

  - `UC-NODE-data`: for datasets
  - `UC-NODE-mlflow-metrics`: for metrics, logs
  - `UC-NODE-mlflow-artifacts`: for model checkpoints, images, etc.


The notebook cells use [swiftclient](https://docs.openstack.org/python-swiftclient/latest/) and [python-chi](https://python-chi.readthedocs.io/en/latest/index.html)
to automate object store bucket creation on Chameleon.

### Prerequisites
- This notebook assumes that you are logged in to Chameleon JupyterHub and that you run the following cells there.

In [None]:
# Import required packages
from chi import context
import swiftclient
import chi

In [None]:
project_name = "UC-NODE" # Define project name (this will prefix the bucket names)
context.choose_project() 
context.choose_site(default="CHI@UC")

## Authentication for Object Store

This step is used to **authenticate JupyterHub with the object store**.  
Since our JupyterHub session already has the necessary Chameleon credentials,
we don’t type in usernames or passwords here.  

The code below simply asks JupyterHub for a valid **token** and the **object store
endpoint**, and then uses them to connect to the object store service so that we can create the buckets.

In [None]:
# Authenticate - getting object store endpoint
os_conn = chi.clients.connection()
token = os_conn.authorize()
storage_url = os_conn.object_store.get_endpoint()
# Connect to Swift (S3-compatible) object store
swift_conn = swiftclient.Connection(preauthurl=storage_url,
                                    preauthtoken=token,
                                    retries=5)
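
Before creating buckets, it can help to confirm that the token and endpoint actually work. The sketch below assumes the `swift_conn` object from the cell above and relies on swiftclient's `head_account()`, which returns the account headers as a dictionary. The helper name `account_summary` is hypothetical and not part of either library.

```python
def account_summary(conn):
    """Return the container and object counts reported by the object store.

    `conn` is assumed to be a swiftclient.Connection (or any object with a
    compatible head_account() method). The helper name is illustrative.
    """
    # head_account() raises an exception if the token or endpoint is rejected.
    headers = conn.head_account()
    return {
        "containers": int(headers.get("x-account-container-count", 0)),
        "objects": int(headers.get("x-account-object-count", 0)),
    }
```

Calling `account_summary(swift_conn)` in the notebook should return without raising if authentication succeeded.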

## Create buckets for our project

Here we create our **three buckets** in the object store.  

Later in the workflow, MLflow will be configured to **write metrics to the metrics bucket** and **upload artifacts (models, plots, etc.) to the artifacts bucket**, while you can upload or download datasets from the data bucket.  

In [None]:
# List of buckets to create
buckets = [
    f"{project_name}-data",
    f"{project_name}-mlflow-metrics",
    f"{project_name}-mlflow-artifacts"
]
# Creating buckets
for bucket in buckets:
    print(f"Creating bucket: {bucket}")
    swift_conn.put_container(bucket)

### You can check your buckets in the Horizon GUI

Now that your S3 buckets are created, you can verify them in the Chameleon Horizon interface.

1. Log in to [Chameleon Horizon](https://chi.uc.chameleoncloud.org)
2. In the left-hand menu, go to **Project → Object Store → Containers**.
3. You should see the buckets you just created here.
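
The same check can also be scripted from the notebook. This is a minimal sketch, assuming the `swift_conn` connection and the `buckets` list defined earlier; it uses swiftclient's `get_account()`, which returns a `(headers, containers)` pair where each container is a dictionary with a `name` key. The helper name `missing_buckets` is hypothetical.

```python
def missing_buckets(conn, expected):
    """Return the expected bucket names that are not present in the account.

    `conn` is assumed to be a swiftclient.Connection; `expected` is a list
    of bucket (container) names. The helper name is illustrative.
    """
    # full_listing=True asks swiftclient to page through the whole account.
    _headers, containers = conn.get_account(full_listing=True)
    existing = {container["name"] for container in containers}
    return [name for name in expected if name not in existing]
```

If all three buckets were created, `missing_buckets(swift_conn, buckets)` should return an empty list.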

### Create application credentials
If we want our compute instance or script (MLflow, rclone, or a data pipeline) to have write access to the Chameleon object store, it needs to authenticate.
Instead of using your personal username and password (which is unsafe and requires an interactive login), we create an application credential.

The script below generates one for us; alternatively, you can create it using the GUI:

#### Using the GUI
Go to the Chameleon Cloud [Horizon GUI](https://chi.tacc.chameleoncloud.org/project/):
- Log in to the Horizon GUI.
- In the menu sidebar on the left side of the Horizon GUI, click “Identity”.
- Select “Application Credentials”.
- Click “Create Application Credential”.

#### If using the GUI
You will not be able to view the secret again from the Horizon GUI, so click “Download openrc file” to keep another copy of the secret.

#### Note
Copy the “ID” and “Secret” displayed in the dialog and save them in a safe place. In this notebook, make sure to clear your cell output before pushing it publicly.

In [None]:
user = os_conn.current_user_id

# create application credential
app_cred = os_conn.identity.create_application_credential(
    user=user,
    name="rclone_swift_access",
    description="Created via notebook using python-chi",
    unrestricted=False
)

# The secret is printed in the cell output: clear the output before sharing this notebook
print("Application Credential ID:", app_cred.id)
print("Application Credential Secret:", app_cred.secret)
print("User ID:", user)
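
Because the secret should not remain in the notebook output, one option is to write it to a local file that only your user can read. The sketch below is an assumption about how you might store it, not part of python-chi or openstacksdk; the helper name `save_credential`, the JSON key names, and the file path in the usage note are all hypothetical.

```python
import json
import os


def save_credential(path, cred_id, secret, user_id):
    """Write the credential to `path` as JSON, readable only by the owner.

    The helper name and JSON layout are illustrative choices.
    """
    payload = {
        "application_credential_id": cred_id,
        "application_credential_secret": secret,
        "user_id": user_id,
    }
    # O_EXCL refuses to overwrite an existing file; 0o600 is owner read/write.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump(payload, fh, indent=2)
    return path
```

For example, `save_credential("app_cred.json", app_cred.id, app_cred.secret, user)` would store the values next to the notebook; keep such a file out of version control.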