### Setup of the Hopsworks Feature Store with Databricks

This first notebook sets up the Databricks environment so that it can interact with the Hopsworks Feature Store. It assumes that you are running Hopsworks Enterprise Edition and that you have already installed the Hops Python library.
The library is available at https://github.com/logicalclocks/hops-util-py. Make sure the version of the Hops Python library matches your Hopsworks version. To install it, follow the Databricks instructions for installing Python libraries: https://docs.databricks.com/libraries.html

This notebook complements the Hopsworks documentation at https://hopsworks.readthedocs.io/en/1.2/featurestore/featurestore.html#connecting-from-databricks-notebooks. In particular, before running the notebook, make sure that you have generated a new API key (https://hopsworks.readthedocs.io/en/latest/user_guide/hopsworks/apiKeys.html) and that you have correctly set up the AWS Parameter Store or Secrets Manager.
If your Hopsworks instance is running on Azure, or you don't have the privileges to modify the AWS Parameter Store or Secrets Manager, you can write the API key to a file on DBFS and pass its path to the `setup_databricks` method using the `api_key_file` parameter.
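
If you take the file route, the key can be written with plain Python file I/O. The sketch below assumes the DBFS FUSE mount (`/dbfs/...`) is available on the cluster; the helper name `write_api_key_file` and the path `/dbfs/hopsworks/api_key` are hypothetical and chosen only for illustration. Whether `api_key_file` expects a `/dbfs/...` path or a `dbfs:/...` URI is not stated in this notebook, so check the method documentation for your library version.

```python
from pathlib import Path


def write_api_key_file(api_key: str, path: str) -> str:
    """Write the API key to `path` as a single line and return the path."""
    target = Path(path)
    # Create the parent folder if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Strip surrounding whitespace so the file holds only the key itself.
    target.write_text(api_key.strip())
    return str(target)


# Hypothetical usage on Databricks (placeholder key and path):
# key_path = write_api_key_file("<your-api-key>", "/dbfs/hopsworks/api_key")
# featurestore.setup_databricks(..., api_key_file=key_path)
```

Avoid leaving the real key in a saved notebook cell; paste it only for the run that writes the file, or read it from another secure source.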

You should also make sure that the Databricks cluster can communicate with the Hopsworks Feature Store. You can do this either by deploying Hopsworks in the same VPC as your Databricks cluster or by configuring VPC peering (https://docs.databricks.com/administration-guide/cloud-configurations/aws/vpc-peering.html#vpc-peering).
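
A quick way to check connectivity from the notebook is to attempt a TCP connection to the Hopsworks host. This is a minimal sketch, not part of the Hops library: the helper name `can_reach` is hypothetical, and port 443 is an assumption (adjust it if your Hopsworks instance listens elsewhere). It only tests reachability from the driver node, not from the workers.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example with the placeholder host used in this notebook:
# can_reach("host.aws.hopsworks.ai")
```

If this returns `False`, review the VPC, peering, and security group configuration before running the setup cell.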

In [2]:
from hops import featurestore

# Create a folder on DBFS to store the Hopsworks certificates. The Hops Python library uses them to authenticate with the Feature Store.
dbutils.fs.mkdirs("dbfs:/certs/")

# Call the library to configure the cluster with the certificates.
# host: the Hopsworks URL.
# project_name: the Feature Store project you want to access from Databricks.
# region_name: the region of the Parameter Store/Secrets Manager.
# secrets_store: where the API key is stored (here, the Secrets Manager).
# The complete method documentation is available here: http://hops-py.logicalclocks.com/hops.html#hops.featurestore.setup_databricks
featurestore.setup_databricks(
    host="host.aws.hopsworks.ai",
    project_name="demo_featurestore_admin000",
    region_name="us-west-2",
    secrets_store="secretsmanager",
)

If the setup is successful, the output is a set of configurations to apply to your Databricks cluster. In particular, you should configure `dbfs:/hops/scripts/initScript.sh` as an init script for your cluster. For instructions, see the Databricks documentation: https://docs.databricks.com/clusters/init-scripts.html#configure-a-cluster-scoped-init-script-using-the-ui

You should also add the configuration properties printed in the output to your cluster's Spark configuration.
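
Before attaching the init script to the cluster, it can help to confirm that the setup call actually wrote it to DBFS. The following is a rough sketch that assumes the DBFS FUSE mount at `/dbfs` is available on the driver; the helper names `dbfs_to_local` and `init_script_exists` are hypothetical and not part of the Hops library.

```python
import os


def dbfs_to_local(dbfs_uri: str, fuse_root: str = "/dbfs") -> str:
    """Map a `dbfs:/...` URI to its path under the DBFS FUSE mount."""
    prefix = "dbfs:/"
    if not dbfs_uri.startswith(prefix):
        raise ValueError(f"expected a dbfs:/ URI, got {dbfs_uri!r}")
    return os.path.join(fuse_root, dbfs_uri[len(prefix):])


def init_script_exists(dbfs_uri: str = "dbfs:/hops/scripts/initScript.sh",
                       fuse_root: str = "/dbfs") -> bool:
    """Return True if the init script is present as a regular file on DBFS."""
    return os.path.isfile(dbfs_to_local(dbfs_uri, fuse_root))


# Usage in the notebook, after the setup cell has run:
# init_script_exists()
```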