# Fraud Classification - Test Connection to SAP HANA
The purpose of this notebook is to test the connection to SAP HANA Cloud and to check the location of the data that will be used in the project.

### Steps in this notebook
-  Connect to HANA Cloud using the hana_ml Python package
-  Check the location of the data
-  Confirm that APL is installed

### Documentation
-  SAP HANA Python Client API for Machine Learning Algorithms:   
   https://help.sap.com/doc/0172e3957b5946da85d3fde85ee8f33d/latest/en-US/html/hana_ml.html
-  SAP HANA Predictive Analysis Library (PAL):  
   https://help.sap.com/viewer/319d36de4fd64ac3afbf91b1fb3ce8de/cloud/en-US
-  SAP HANA Automated Predictive Library (APL):
   https://help.sap.com/viewer/product/apl/2018/en-US

### Connect to HANA Cloud
Begin by checking whether the hana_ml package is installed.

If it is not installed, run '!pip install hana_ml' to install the latest version of the package.
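As a minimal sketch, the check can also be done without raising an ImportError when the package is missing. The helper name `package_version` is hypothetical and not part of hana_ml; it relies only on the Python standard library.

```python
import importlib.util
from importlib import metadata


def package_version(name):
    """Return the installed version of `name`, or None if it is not importable."""
    if importlib.util.find_spec(name) is None:
        return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata was found under this name.
        return "unknown"


version = package_version("hana_ml")
if version is None:
    print("hana_ml is not installed; run: pip install hana_ml")
else:
    print("hana_ml version:", version)
```

This avoids a failing cell when the package is absent, which can be convenient when the notebook is run on a fresh environment.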

In [1]:
import hana_ml
print(hana_ml.__version__)

2.5.20062609


Instantiate a connection object to SAP HANA.
-  For simplicity, and to help you get started, the values can be hardcoded here.
-  We recommend keeping these credentials in the Secure User Store of the SAP HANA Client. Retrieving them from the Secure User Store avoids specifying them in clear text.

To connect to a HANA Cloud instance, we need:
-  The address (URL) of the instance
-  The port (usually 443)
-  A user and password OR an hdbuserstore 'key'
-  encrypt = 'true' to ensure that we use an encrypted connection
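The parameters listed above can be collected in one place instead of being hardcoded in the notebook. The following is a sketch only: the helper `connection_kwargs` and the environment variable names (`HANA_KEY`, `HANA_ADDRESS`, `HANA_PORT`, `HANA_USER`, `HANA_PASSWORD`) are assumptions made for illustration and are not defined by hana_ml.

```python
import os


def connection_kwargs(env=os.environ):
    """Build keyword arguments for hana_ml.dataframe.ConnectionContext.

    Prefers an hdbuserstore key (HANA_KEY); otherwise falls back to
    address, port, user and password. All variable names are hypothetical.
    """
    kwargs = {"encrypt": "true"}  # always request an encrypted connection
    key = env.get("HANA_KEY")
    if key:
        kwargs["key"] = key
        return kwargs
    kwargs.update(
        address=env["HANA_ADDRESS"],
        port=int(env.get("HANA_PORT", "443")),  # 443 is the usual HANA Cloud port
        user=env["HANA_USER"],
        password=env["HANA_PASSWORD"],
    )
    return kwargs
```

The resulting dictionary could then be passed on as `dataframe.ConnectionContext(**connection_kwargs())`, which keeps credentials out of the notebook itself.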

In [2]:
import hana_ml.dataframe as dataframe

# Instantiate connection object
# conn = dataframe.ConnectionContext(address = 'd1c353c3-ea74-4e9a-bda5-897d70fd7376.hana.prod-us10.hanacloud.ondemand.com',
#                                    port = 443, 
#                                    user = 'BIUSER', 
#                                    password = 'xxxx', 
#                                    encrypt = 'true'
#                                   )

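# Alternatively, connect with a key stored in hdbuserstore.
# Example command on a Mac (run in /Applications/sap/hdbclient):
# hdbuserstore -i SET PRESALESHC "d1c353c3-ea74-4e9a-bda5-897d70fd7376.hana.prod-us10.hanacloud.ondemand.com:443" BIUSER
# The key passed below must match the name of a key stored in hdbuserstore.
# Note: sslValidateCertificate = 'false' skips server certificate validation; use it for testing only.
conn = dataframe.ConnectionContext(key = 'NA_POC_HANA_CLOUD', encrypt = 'true', sslValidateCertificate = 'false')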

In [3]:
# Send basic SELECT statement and display the result
sql = 'SELECT 12345 FROM DUMMY'
df_remote = conn.sql(sql)
print(df_remote.collect())

   12345
0  12345


### Check the location of the data
For this project, all the data is stored in the SOURCEDATA schema.

In [4]:
conn.sql("SELECT * FROM TABLES WHERE SCHEMA_NAME = 'SOURCEDATA'").collect()

Unnamed: 0,SCHEMA_NAME,TABLE_NAME,TABLE_OID,COMMENTS,FIXED_PART_SIZE,IS_LOGGED,IS_SYSTEM_TABLE,IS_COLUMN_TABLE,TABLE_TYPE,IS_INSERT_ONLY,...,ROW_ORDER_TYPE,CREATE_TIME,TEMPORAL_TYPE,HAS_MASKED_COLUMNS,PERSISTENT_MEMORY,HAS_RECORD_COMMIT_TIMESTAMP,IS_REPLICATION_LOG_ENABLED,NUMA_NODE_INDEXES,IS_MOVABLE,LOAD_UNIT
0,SOURCEDATA,CARD,165754,Cards,48.0,True,False,True,COLUMN,False,...,,2020-08-15 23:33:28.660,,False,,False,False,,True,COLUMN
1,SOURCEDATA,CARD_TRANSACTIONS,165767,Card Transactions (full datsaet in-memory),56.0,True,False,True,COLUMN,False,...,,2020-08-15 23:33:28.681,,True,,False,False,,True,COLUMN
2,SOURCEDATA,CARD_TRANSACTIONS_HOTWARM,165781,Card Transactions hot & warm data (partition),56.0,True,False,True,COLUMN,False,...,,2020-08-15 23:33:28.700,,True,,False,False,,True,DEFAULT
3,SOURCEDATA,CUSTOMER,165796,Customer,72.0,True,False,True,COLUMN,False,...,,2020-08-15 23:33:28.720,,False,,False,False,,True,COLUMN
4,SOURCEDATA,MERCHANT,165813,Merchant,56.0,True,False,True,COLUMN,False,...,,2020-08-15 23:33:28.737,,False,,False,False,,True,COLUMN
5,SOURCEDATA,MERCHANT_CA,171144,,88.0,True,False,True,COLUMN,False,...,,2020-08-26 13:34:46.053,,False,,False,False,,True,DEFAULT
6,SOURCEDATA,CARD_TRANSACTIONS_COLD,165825,,,True,False,False,VIRTUAL,True,...,,2020-08-15 23:35:36.808,,False,,False,False,,False,DEFAULT
7,SOURCEDATA,EXCEL_Tax_20000,166282,,,True,False,False,VIRTUAL,True,...,,2020-08-17 20:17:36.429,,False,,False,False,,False,DEFAULT
8,SOURCEDATA,SQLSERVER_Tag,166293,,,True,False,False,VIRTUAL,True,...,,2020-08-18 20:28:43.593,,False,,False,False,,False,DEFAULT
9,SOURCEDATA,VT_CARD_TRANSACTIONS,171504,,,True,False,False,VIRTUAL,True,...,,2020-08-28 15:58:23.429,,False,,False,False,,False,DEFAULT
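To turn this visual check into an explicit one, the table names returned by the query can be compared against the tables the project expects. This is a sketch under assumptions: the helper `missing_tables` is hypothetical, and the `expected` list is only a guess at which tables later notebooks need. The `found` list is copied from the TABLE_NAME column of the output above.

```python
def missing_tables(found_names, expected_names):
    """Return the expected table names that are absent from found_names, sorted."""
    return sorted(set(expected_names) - set(found_names))


# Table names copied from the TABLE_NAME column of the query output.
found = [
    "CARD", "CARD_TRANSACTIONS", "CARD_TRANSACTIONS_HOTWARM", "CUSTOMER",
    "MERCHANT", "MERCHANT_CA", "CARD_TRANSACTIONS_COLD", "EXCEL_Tax_20000",
    "SQLSERVER_Tag", "VT_CARD_TRANSACTIONS",
]

# Assumption: the tables the project relies on; adjust to your own needs.
expected = ["CARD", "CARD_TRANSACTIONS", "CUSTOMER", "MERCHANT"]

print(missing_tables(found, expected))  # prints []
```

In the notebook, `found` could instead be taken from the TABLE_NAME column of the collected DataFrame.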


### Confirm that APL is installed

When APL (Automated Predictive Library) is installed on SAP HANA Cloud, its functions are listed in SYS.AFL_FUNCTIONS with AREA_NAME 'APL_AREA' and PACKAGE_NAME 'APL'.

In [5]:
conn.sql("SELECT * FROM SYS.AFL_FUNCTIONS WHERE AREA_NAME = 'APL_AREA'").collect()

Unnamed: 0,FUNCTION_OID,SCHEMA_NAME,AREA_NAME,PACKAGE_NAME,FUNCTION_NAME,CREATE_TIMESTAMP,INPUT_PARAMETER_COUNT,RETURN_VALUE_COUNT,FUNCTION_TYPE,TECHNICAL_CATEGORY
0,156970,_SYS_AFL,APL_AREA,APL,CREATE_MODEL,2020-07-19 04:27:28.559,3,2,LFunc,var_columns
1,156971,_SYS_AFL,APL_AREA,APL,CREATE_MODEL__OVERLOAD_3_1,2020-07-19 04:27:28.563,3,1,LFunc,var_columns
2,156972,_SYS_AFL,APL_AREA,APL,IMPORT_VARIABLEDESCRIPTIONS,2020-07-19 04:27:28.567,3,1,LFunc,var_columns
3,156973,_SYS_AFL,APL_AREA,APL,EXPORT_VARIABLEDESCRIPTIONS,2020-07-19 04:27:28.570,2,1,LFunc,var_columns
4,156974,_SYS_AFL,APL_AREA,APL,CREATE_MODEL_AND_TRAIN,2020-07-19 04:27:28.573,5,4,LFunc,var_columns
5,156975,_SYS_AFL,APL_AREA,APL,CREATE_MODEL_AND_TRAIN__OVERLOAD_5_1,2020-07-19 04:27:28.578,5,1,LFunc,var_columns
6,156976,_SYS_AFL,APL_AREA,APL,CREATE_MODEL_AND_TRAIN__OVERLOAD_7_1,2020-07-19 04:27:28.582,7,1,LFunc,var_columns
7,156977,_SYS_AFL,APL_AREA,APL,CREATE_MODEL_AND_TRAIN__OVERLOAD_7_4,2020-07-19 04:27:28.586,7,4,LFunc,var_columns
8,156978,_SYS_AFL,APL_AREA,APL,TRAIN_MODEL,2020-07-19 04:27:28.590,5,4,LFunc,var_columns
9,156979,_SYS_AFL,APL_AREA,APL,TRAIN_MODEL__OVERLOAD_5_1,2020-07-19 04:27:28.594,5,1,LFunc,var_columns
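Rather than reading through the full listing, the same check can be reduced to a single number. The sketch below assumes `conn` is an open hana_ml ConnectionContext, as created earlier in this notebook; the helper name `apl_function_count` is hypothetical.

```python
def apl_function_count(conn):
    """Count the functions registered under APL_AREA in SYS.AFL_FUNCTIONS.

    `conn` is expected to provide sql(...).collect() returning a pandas
    DataFrame, as hana_ml's ConnectionContext does.
    """
    sql = "SELECT COUNT(*) AS N FROM SYS.AFL_FUNCTIONS WHERE AREA_NAME = 'APL_AREA'"
    result = conn.sql(sql).collect()  # one row, one column named N
    return int(result["N"].iloc[0])
```

A count of zero would suggest that APL is not installed or not visible to the connected user, for example `if apl_function_count(conn) == 0: print("APL functions not found")`.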


Once these preparation steps are done, we can close the connection to HANA Cloud. Continue with the next notebook, "05 Introduction".

In [6]:
conn.close()