# Machine Learning with SAP Datasphere, Hands-On Workshop
## Load the workshop data into SAP Datasphere

Load the data from the CSV file into a pandas DataFrame. The file uses a semicolon as the column separator.

In [1]:
import pandas as pd
df_data = pd.read_csv("LucerneElectricity.csv", sep=";")  # semicolon-delimited file
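To illustrate why the `sep` argument matters, here is a minimal sketch that parses a small in-memory sample instead of the workshop file. The two sample rows and the two-column layout are assumptions made for illustration; the exact layout of `LucerneElectricity.csv` may differ.

```python
import io
import pandas as pd

# Hypothetical sample in a semicolon-delimited layout (not the real file)
sample_csv = (
    "TIMESTAMP;CONSUMPTION\n"
    "2022-01-01T00:00:00.000Z;25112.140855\n"
    "2022-01-01T00:15:00.000Z;24611.179355\n"
)

# With the default comma separator, each line is read as one single column
df_wrong = pd.read_csv(io.StringIO(sample_csv))

# With sep=";", the fields are split into separate columns
df_right = pd.read_csv(io.StringIO(sample_csv), sep=";")

print(df_wrong.shape)
print(df_right.shape)
```

Comparing the two shapes is a quick way to spot a wrong separator before any data is uploaded.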

Look at the first five rows of the data.

In [2]:
df_data.head(5)

| Unnamed: 0 | TIMESTAMP | CONSUMPTION |
|---|---|---|
| 0 | 2022-01-01T00:00:00.000Z | 25112.140855 |
| 1 | 2022-01-01T00:15:00.000Z | 24611.179355 |
| 2 | 2022-01-01T00:30:00.000Z | 21784.375855 |
| 3 | 2022-01-01T00:45:00.000Z | 20987.941355 |
| 4 | 2022-01-01T01:00:00.000Z | 19466.977355 |


Convert the TIMESTAMP column from strings to pandas datetime values.

In [3]:
df_data["TIMESTAMP"] = pd.to_datetime(df_data["TIMESTAMP"])
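The effect of this conversion can be sketched on two timestamp strings copied from the output above. The variable names `raw`, `parsed`, and `step` are illustrative only. Because the strings end in `Z`, pandas parses them as timezone-aware UTC values, and datetime arithmetic such as computing the sampling interval becomes possible.

```python
import pandas as pd

# Two timestamp strings in the same ISO 8601 format as the workshop data
raw = pd.Series(["2022-01-01T00:00:00.000Z", "2022-01-01T00:15:00.000Z"])

# Parse the strings; the trailing "Z" marks the values as UTC
parsed = pd.to_datetime(raw)

# Datetime values support arithmetic, e.g. the gap between consecutive rows
step = parsed.diff().iloc[1]

print(parsed.dt.tz)
print(step)
```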

Read the credentials for connecting to SAP Datasphere from a local JSON file.

In [4]:
import json
# Read the connection details; the with-block closes the file automatically
with open("credentials.json", "r") as file:
    credentials = json.load(file)

Use these credentials to establish a connection to SAP Datasphere.

In [5]:
import hana_ml.dataframe as dataframe
conn = dataframe.ConnectionContext(
    address=credentials["hana_address"],
    port=credentials["hana_port"],
    user=credentials["hana_user"],
    password=credentials["hana_password"],
)
conn.connection.isconnected()  # returns True while the connection is open

True
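The connection cell reads four keys from `credentials.json`: `hana_address`, `hana_port`, `hana_user`, and `hana_password`. As a minimal sketch of what such a file might look like, the example below writes a file with placeholder values to a temporary directory and checks that all four keys are present. The helper `load_credentials` and every value shown are hypothetical and not part of the workshop material.

```python
import json
import os
import tempfile

# Keys that the connection cell expects to find in the JSON file
REQUIRED_KEYS = ("hana_address", "hana_port", "hana_user", "hana_password")


def load_credentials(path):
    """Hypothetical helper: load the JSON file and verify the expected keys."""
    with open(path, "r") as f:
        creds = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in creds]
    if missing:
        raise KeyError(f"Missing credential keys: {missing}")
    return creds


# Placeholder values only; replace them with the details of your own system
example = {
    "hana_address": "<your-host>",
    "hana_port": 443,
    "hana_user": "<your-user>",
    "hana_password": "<your-password>",
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "credentials.json")
    with open(path, "w") as f:
        json.dump(example, f)
    creds = load_credentials(path)

print(sorted(creds))
```

Keeping the credentials in a separate file keeps passwords out of the notebook itself, which matters when the notebook is shared.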

Upload the pandas DataFrame to SAP Datasphere as a table.

In [6]:
df_remote = dataframe.create_dataframe_from_pandas(
    connection_context=conn,
    pandas_df=df_data,
    table_name='LUCERNEELECTRICITY',
    force=True,
    drop_exist_tab=True,
    replace=False)

100%|██████████| 2/2 [00:00<00:00,  3.25it/s]


Count the rows in the uploaded table.

In [7]:
df_remote.count()

52404
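A simple sanity check after an upload is to compare the remote row count with the length of the local DataFrame. The sketch below assumes only that the local object supports `len()` and the remote object has a `count()` method, as `df_remote` does above. The helper `row_counts_match` and the `FakeRemote` stand-in are hypothetical and are used here so the example runs without a live connection.

```python
def row_counts_match(local_df, remote_df):
    """Return True if the local length equals the remote row count."""
    return len(local_df) == remote_df.count()


class FakeRemote:
    """Stand-in for a remote DataFrame; only mimics count()."""

    def __init__(self, n_rows):
        self._n_rows = n_rows

    def count(self):
        return self._n_rows


# range(52404) stands in for a local frame with 52404 rows
print(row_counts_match(range(52404), FakeRemote(52404)))
print(row_counts_match(range(52404), FakeRemote(50000)))
```

In the workshop notebook, the equivalent call would be `row_counts_match(df_data, df_remote)`.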

Retrieve the first five rows from SAP Datasphere and display them.

In [8]:
df_remote.head(5).collect()

| Unnamed: 0 | TIMESTAMP | CONSUMPTION |
|---|---|---|
| 0 | 2022-01-01 00:00:00 | 25112.140855 |
| 1 | 2022-01-01 00:15:00 | 24611.179355 |
| 2 | 2022-01-01 00:30:00 | 21784.375855 |
| 3 | 2022-01-01 00:45:00 | 20987.941355 |
| 4 | 2022-01-01 01:00:00 | 19466.977355 |
