# Load data to SAP HANA Cloud

Establish a connection to SAP HANA Cloud, here through the Secure User Store. The key `MYHC` is created once on the command line; the `-i` option prompts for the password interactively:

C:\Program Files\SAP\hdbclient>hdbuserstore -i SET MYHC "YOURENDPOINT:PORT" ML

The Secure User Store is installed as part of the SAP HANA Client. For details see https://blogs.sap.com/2020/07/27/hands-on-tutorial-automated-predictive-apl-in-sap-hana-cloud/#create_user
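
Where no Secure User Store key is available, the connection can presumably also be opened with explicit logon parameters. The following is only a minimal sketch and not the approach used in this notebook: the environment variable names (`HANA_ADDRESS`, `HANA_PORT`, `HANA_USER`, `HANA_PASSWORD`), the default port 443, and both helper functions are assumptions made for illustration.

```python
import os


def connection_kwargs(env=os.environ):
    """Collect logon parameters from environment variables (hypothetical names)."""
    return {
        "address": env["HANA_ADDRESS"],
        "port": int(env.get("HANA_PORT", "443")),  # assumed default port
        "user": env["HANA_USER"],
        "password": env["HANA_PASSWORD"],
        "encrypt": "true",  # assumption: the cloud endpoint expects TLS
    }


def connect(env=os.environ):
    """Open a ConnectionContext from the parameters collected above."""
    # Imported lazily so connection_kwargs() works without hana_ml installed.
    import hana_ml.dataframe as dataframe
    return dataframe.ConnectionContext(**connection_kwargs(env))
```

Reading the password from the environment keeps it out of the notebook itself, which is the same concern the Secure User Store addresses.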

In [1]:
import hana_ml.dataframe as dataframe
conn = dataframe.ConnectionContext(userkey='MYHC')
conn.connection.isconnected()

True

Load the CSV file into a pandas DataFrame.

In [2]:
import pandas as pd
df_data = pd.read_csv('USEDCARS.csv', sep=',')
df_data.head(5)

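|   | CAR_ID | BRAND | MODEL | VEHICLETYPE | YEAROFREGISTRATION | HP | FUELTYPE | GEARBOX | KILOMETER | PRICE |
|---|--------|-------|-------|-------------|--------------------|----|----------|---------|-----------|-------|
| 0 | 3 | volkswagen | golf | kleinwagen | 2001 | 75 | benzin | manuell | 150000 | 1500 |
| 1 | 4 | skoda | fabia | kleinwagen | 2008 | 69 | diesel | manuell | 90000 | 3600 |
| 2 | 6 | peugeot | 2_reihe | cabrio | 2004 | 109 | benzin | manuell | 150000 | 2200 |
| 3 | 7 | volkswagen | andere | limousine | 1980 | 50 | benzin | manuell | 40000 | 0 |
| 4 | 10 | mazda | 3_reihe | limousine | 2004 | 105 | benzin | manuell | 150000 | 2000 |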


Upload the data as a table into SAP HANA Cloud.

In [3]:
df_remote = dataframe.create_dataframe_from_pandas(connection_context=conn, 
                            pandas_df=df_data, 
                            table_name='USEDCARS',
                            force=True,
                            drop_exist_tab=False,
                            replace=False)

100%|████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:18<00:00,  3.15s/it]


Download and display just a few records to verify that the upload succeeded.

In [4]:
df_remote = conn.table('USEDCARS')
df_remote.head(5).collect()

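|   | CAR_ID | BRAND | MODEL | VEHICLETYPE | YEAROFREGISTRATION | HP | FUELTYPE | GEARBOX | KILOMETER | PRICE |
|---|--------|-------|-------|-------------|--------------------|----|----------|---------|-----------|-------|
| 0 | 3 | volkswagen | golf | kleinwagen | 2001 | 75 | benzin | manuell | 150000 | 1500 |
| 1 | 4 | skoda | fabia | kleinwagen | 2008 | 69 | diesel | manuell | 90000 | 3600 |
| 2 | 6 | peugeot | 2_reihe | cabrio | 2004 | 109 | benzin | manuell | 150000 | 2200 |
| 3 | 7 | volkswagen | andere | limousine | 1980 | 50 | benzin | manuell | 40000 | 0 |
| 4 | 10 | mazda | 3_reihe | limousine | 2004 | 105 | benzin | manuell | 150000 | 2000 |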
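
Looking at the first rows confirms that the table exists and has the expected columns, but not that every row arrived. A complementary check could compare row counts. The sketch below is an illustration only: `row_counts_match` is a hypothetical helper, and it assumes the remote object offers a `count()` method returning an integer, as the hana_ml DataFrame does later in this notebook.

```python
def row_counts_match(local_df, remote_df):
    """Return True if the local data and the remote table have equal row counts."""
    local_rows = len(local_df)       # pandas DataFrame: number of rows
    remote_rows = remote_df.count()  # hana_ml DataFrame: row count computed in the database
    return local_rows == remote_rows
```

In this notebook it might be called as `row_counts_match(df_data, conn.table('USEDCARS'))`.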


Create a filtered view on top of the data. This view will be used as the source for machine learning.

In [5]:
df_remote = df_remote.filter('PRICE >= 1000 AND PRICE <= 20000 AND HP >= 50 AND HP <= 300 AND YEAROFREGISTRATION >= 1995 AND YEAROFREGISTRATION <= 2015')
df_remote.count()

182385
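
The filter condition above is a plain SQL expression made of three inclusive ranges. When the bounds change often, it may be less error-prone to assemble the string from a dictionary. The helper `build_range_filter` below is a hypothetical illustration, not part of hana_ml, and it interpolates values directly, so it is only suitable for trusted numeric bounds.

```python
def build_range_filter(ranges):
    """Build a condition like 'A >= 1 AND A <= 2 AND B >= 3 AND B <= 4'."""
    parts = []
    for column, (low, high) in ranges.items():
        # Each column contributes one inclusive lower and upper bound.
        parts.append(f"{column} >= {low} AND {column} <= {high}")
    return " AND ".join(parts)


condition = build_range_filter({
    "PRICE": (1000, 20000),
    "HP": (50, 300),
    "YEAROFREGISTRATION": (1995, 2015),
})
print(condition)
```

The resulting string could then be passed on as `df_remote.filter(condition)`.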

In [6]:
df_remote.save('V_USEDCARS', table_type='VIEW', force=True)

<hana_ml.dataframe.DataFrame at 0x17f64f8bbe0>

In [7]:
df_remote.head(5).collect()

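|   | CAR_ID | BRAND | MODEL | VEHICLETYPE | YEAROFREGISTRATION | HP | FUELTYPE | GEARBOX | KILOMETER | PRICE |
|---|--------|-------|-------|-------------|--------------------|----|----------|---------|-----------|-------|
| 0 | 717 | volkswagen | golf | limousine | 2004 | 75 | benzin | manuell | 150000 | 3499 |
| 1 | 860 | volkswagen | golf | limousine | 1999 | 75 | benzin | manuell | 150000 | 1670 |
| 2 | 1557 | volkswagen | golf | limousine | 1999 | 75 | benzin | manuell | 150000 | 1300 |
| 3 | 2326 | volkswagen | golf | limousine | 2000 | 75 | benzin | manuell | 150000 | 1350 |
| 4 | 2580 | volkswagen | golf | limousine | 1999 | 75 | benzin | manuell | 150000 | 1299 |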
