<a href="https://colab.research.google.com/github/bytehub-ai/code-examples/blob/main/tutorials/03_bytehub_cloud_intro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ByteHub Cloud introduction

Start by installing ByteHub with pip. Install `bytehub[cloud]` rather than plain `bytehub` to pull in the extra dependencies used by the cloud-hosted feature store.

In [1]:
!pip install -q bytehub[cloud]

In [2]:
import pandas as pd
import numpy as np
import os
import shutil
import bytehub as bh
print(f'ByteHub version {bh.__version__}')

ByteHub version 0.3.4


Connect to the cloud-hosted feature store and log in. [Contact us](https://www.bytehub.ai/feature-store/request-access) to register for an account.

In [3]:
fs = bh.CloudFeatureStore()

Please go to https://auth.bytehub.ai/login?response_type=code&client_id=67lckd9mdmb8h3obkr8jtndamt&redirect_uri=https%3A%2F%2Fwww.bytehub.ai%2Fauthenticated&state=CJYkToWP1sQXIYRMpl0BARilNgpESW and login. Copy the response code and paste below.
Response: ··········


In [4]:
fs.list_features()

| | namespace | name | version | description | meta | partition | serialized | transform |
|---|---|---|---|---|---|---|---|---|
| 0 | toby | rawdata.carbon | 1 | | {} | year | True | False |
| 1 | toby | rawdata.generation | 1 | | {} | year | True | False |
| 2 | toby | feature.carbon-forecast | 1 | | {} | date | False | True |
| 3 | toby | feature.carbon-actual | 1 | | {} | date | False | True |
| 4 | toby | feature.low-carbon-fuels | 1 | | {} | date | False | True |


All features are stored inside a namespace that matches your username, so a feature name takes the form `username/name`. For example, to create a new feature:

In [5]:
# Edit this to match your own username
fs.create_feature('toby/numbers', description='Timeseries of numbers', partition='year')
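Because every later cell repeats the namespaced feature name, it may be convenient to build it once from your username. The snippet below is a minimal sketch: `feature_name`, `USERNAME` and `NUMBERS` are hypothetical names introduced here for illustration and are not part of the `bytehub` package.

```python
# Hypothetical helper for illustration only; not part of the bytehub package
def feature_name(username: str, name: str) -> str:
    """Build a namespaced feature name such as 'toby/numbers'."""
    return f'{username}/{name}'


USERNAME = 'toby'  # replace with your own ByteHub username
NUMBERS = feature_name(USERNAME, 'numbers')
print(NUMBERS)
```

The resulting string could then be passed wherever the cells in this notebook use the literal `'toby/numbers'`.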

Now we can generate a pandas DataFrame with `time` and `value` columns to store.

In [6]:
dts = pd.date_range('2020-01-01', '2021-02-09')
df = pd.DataFrame({'time': dts, 'value': list(range(len(dts)))})

df.head()

| | time | value |
|---|---|---|
| 0 | 2020-01-01 | 0 |
| 1 | 2020-01-02 | 1 |
| 2 | 2020-01-03 | 2 |
| 3 | 2020-01-04 | 3 |
| 4 | 2020-01-05 | 4 |
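Before saving, it can be worth confirming that the frame has the shape this example relies on. The following is a minimal sketch using only pandas. The `time` and `value` column names come from the cell above; the specific checks are a choice made for this sketch, not requirements documented by ByteHub.

```python
import pandas as pd

# Rebuild the example frame so this snippet runs on its own
dts = pd.date_range('2020-01-01', '2021-02-09')
df = pd.DataFrame({'time': dts, 'value': list(range(len(dts)))})

# Illustrative sanity checks (assumptions of this sketch, not ByteHub rules)
has_expected_columns = list(df.columns) == ['time', 'value']
time_is_datetime = pd.api.types.is_datetime64_any_dtype(df['time'])
time_is_unique_and_sorted = bool(
    df['time'].is_unique and df['time'].is_monotonic_increasing
)

print(len(df), has_expected_columns, time_is_datetime, time_is_unique_and_sorted)
```

The frame holds one row per day from 2020-01-01 to 2021-02-09 inclusive, which is 406 rows with values 0 to 405.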


In [7]:
fs.save_dataframe(df, 'toby/numbers')

This data is now stored securely in the cloud. Let's read it back and try some resampling operations.

In [8]:
fs.load_dataframe('toby/numbers', from_date='2020-10-01', to_date='2020-10-31') # Query date range

| time | toby/numbers |
|---|---|
| 2020-10-01 | 274 |
| 2020-10-02 | 275 |
| 2020-10-03 | 276 |
| 2020-10-04 | 277 |
| 2020-10-05 | 278 |
| 2020-10-06 | 279 |
| 2020-10-07 | 280 |
| 2020-10-08 | 281 |
| 2020-10-09 | 282 |
| 2020-10-10 | 283 |
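To cross-check the date-range query, the same selection can be made locally on the original frame. This is a sketch in plain pandas, not a description of how ByteHub filters data. It includes both endpoints, which is an assumption: the output above does not show whether `to_date` is inclusive.

```python
import pandas as pd

# Rebuild the example frame so this snippet runs on its own
dts = pd.date_range('2020-01-01', '2021-02-09')
df = pd.DataFrame({'time': dts, 'value': list(range(len(dts)))})

# Select October 2020 locally; both endpoints are included in this sketch
start, end = pd.Timestamp('2020-10-01'), pd.Timestamp('2020-10-31')
october = df[(df['time'] >= start) & (df['time'] <= end)].set_index('time')

print(october.head(10))
```

The first local value, 274 on 2020-10-01, matches the first row returned by the feature store.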


In [9]:
fs.load_dataframe('toby/numbers', freq='1M') # Monthly sampling

| | toby/numbers |
|---|---|
| 2020-01-31 | 30 |
| 2020-02-29 | 59 |
| 2020-03-31 | 90 |
| 2020-04-30 | 120 |
| 2020-05-31 | 151 |
| 2020-06-30 | 181 |
| 2020-07-31 | 212 |
| 2020-08-31 | 243 |
| 2020-09-30 | 273 |
| 2020-10-31 | 304 |
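The monthly values above coincide with the last stored value in each calendar month (30 on 2020-01-31, 59 on 2020-02-29, and so on). That reading is inferred from the output, not taken from the ByteHub documentation. A minimal pandas sketch of "last value per month" on the local frame reproduces the same numbers:

```python
import pandas as pd

# Rebuild the example frame so this snippet runs on its own
dts = pd.date_range('2020-01-01', '2021-02-09')
df = pd.DataFrame({'time': dts, 'value': list(range(len(dts)))})

# Group rows by calendar month and keep the last value in each month
monthly_last = df.groupby(df['time'].dt.to_period('M'))['value'].last()

print(monthly_last.head(10))
```

Grouping by a monthly period is used here instead of `resample` with a frequency string only to sidestep differences in frequency aliases between pandas versions. The local result is indexed by month period, not by month-end date.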


In [10]:
fs.last('toby/numbers') # Last value

{'toby/numbers': 405}

See the [ByteHub documentation](https://docs.bytehub.ai) for more examples and further details on the feature store.