# Fog-RTX Cloud Data Collection Demo

Fog-RTX supports multiple cloud service providers. This notebook demonstrates support for AWS and Google Cloud.

### AWS

In [None]:
! git clone https://github.com/KeplerC/fog_x.git
! cd fog_x && git checkout cloud-demo-dev && pip install .

Install the AWS CLI and configure it with your AWS credentials.

In [None]:
! curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
! unzip -q awscliv2.zip
! sudo ./aws/install
! aws configure

In [3]:
# create an S3 bucket named fog-rtx-test-east-2 (pick any name you like; S3 bucket names must be globally unique)
!aws s3api create-bucket --bucket fog-rtx-test-east-2 --region us-east-1

{
    "Location": "/fog-rtx-test-east-2"
}


In [None]:
!pip3 install boto3

### Creating or loading existing datasets


Fog-RTX can load a dataset from an existing bucket and add new episodes to it.

In [5]:
import fog_x

dataset = fog_x.dataset.Dataset(
    name="demo_ds",
    path='s3://fog-rtx-test-east-2',
)

INFO:fog_x.database.polars_connector:Prepare to load table demo_ds loaded from s3://fog-rtx-test-east-2/demo_ds.parquet.
INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials
INFO:fog_x.database.polars_connector:Table demo_ds loaded from s3://fog-rtx-test-east-2/demo_ds.parquet.
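
The log lines above suggest that the dataset table lives at `<path>/<name>.parquet` inside the bucket. The following is a minimal sketch of that naming pattern, useful for checking which object to look for in the bucket. `table_uri` is a hypothetical helper written for illustration and is not part of the `fog_x` API; the trailing-slash handling is an assumption.

```python
def table_uri(path: str, name: str) -> str:
    """Build the Parquet object URI for a dataset, following the pattern in the logs.

    Hypothetical helper for illustration only; not part of fog_x.
    """
    # Drop any trailing slash so 'bucket/' and 'bucket' produce the same URI.
    return f"{path.rstrip('/')}/{name}.parquet"


print(table_uri("s3://fog-rtx-test-east-2", "demo_ds"))
# s3://fog-rtx-test-east-2/demo_ds.parquet
```
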


Read episode metadata from the existing dataset stored on AWS.

In [6]:
dataset.get_episode_info()

episode_id,Finished,feature_arm_camera_view_type,feature_arm_camera_view_shape,arm_camera_view_count,feature_gripper_acton_type,feature_gripper_acton_shape,gripper_acton_count
i64,bool,str,str,f64,str,str,f64
0,True,"""float64""","""(480, 640, 3)""",0.0,"""float64""","""(7,)""",0.0


### Adding new data to the dataset

In [7]:
import numpy as np

# create a new trajectory
episode = dataset.new_episode()
# collect step data for the episode
episode.add(feature = "arm_camera_view", value = np.random.rand(480, 640, 3))
episode.add(feature = "gripper_acton", value = np.random.rand(7))
# Automatically time-aligns and saves the trajectory
episode.close()

INFO:fog_x.database.db_manager:Closing the episode with metadata {}
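
An episode usually contains many steps rather than one. The sketch below wraps the three calls shown in the cell above (`new_episode`, `add`, `close`) in a loop. `record_episode` and its `steps` argument are hypothetical names introduced here for illustration; the function accepts any object that exposes those three methods.

```python
from typing import Any, Iterable, Mapping


def record_episode(dataset: Any, steps: Iterable[Mapping[str, Any]]) -> int:
    """Write one episode from an iterable of {feature_name: value} steps.

    Hypothetical helper; it relies only on new_episode(), add() and close()
    as used in the cell above. Returns the number of values added.
    """
    episode = dataset.new_episode()
    added = 0
    for step in steps:
        # Each step maps feature names to the value observed at that step.
        for feature, value in step.items():
            episode.add(feature=feature, value=value)
            added += 1
    # Closing the episode finalizes it, as in the cell above.
    episode.close()
    return added
```

For example, `record_episode(dataset, [{"arm_camera_view": np.random.rand(480, 640, 3), "gripper_acton": np.random.rand(7)}])` would write a single-step episode like the one above.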


In [8]:
dataset.get_episode_info()

episode_id,Finished,feature_arm_camera_view_type,feature_arm_camera_view_shape,arm_camera_view_count,feature_gripper_acton_type,feature_gripper_acton_shape,gripper_acton_count
i64,bool,str,str,f64,str,str,f64
0,True,"""float64""","""(480, 640, 3)""",0.0,"""float64""","""(7,)""",0.0
1,True,"""float64""","""(480, 640, 3)""",0.0,"""float64""","""(7,)""",0.0


### Loading the cloud dataset from a different location
The data is uploaded to the cloud automatically.
We can therefore create a second reader that points at the same bucket (you can run this on a different machine).
The existing data is loaded and read automatically.

In [9]:
dataset2 = fog_x.dataset.Dataset(
    name="demo_ds",
    path='s3://fog-rtx-test-east-2',
)

INFO:fog_x.database.polars_connector:Prepare to load table demo_ds loaded from s3://fog-rtx-test-east-2/demo_ds.parquet.
INFO:fog_x.database.polars_connector:Table demo_ds loaded from s3://fog-rtx-test-east-2/demo_ds.parquet.


In [10]:
# metadata
trajectory_metadata = dataset2.get_episode_info()
trajectory_metadata

episode_id,Finished,feature_arm_camera_view_type,feature_arm_camera_view_shape,arm_camera_view_count,feature_gripper_acton_type,feature_gripper_acton_shape,gripper_acton_count
i64,bool,str,str,f64,str,str,f64
0,True,"""float64""","""(480, 640, 3)""",0.0,"""float64""","""(7,)""",0.0
1,True,"""float64""","""(480, 640, 3)""",0.0,"""float64""","""(7,)""",0.0
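
The metadata table stores each feature's shape as a string, for example `(480, 640, 3)`. The sketch below shows one way to turn such a string back into a tuple of integers. `parse_shape` is a hypothetical helper, not part of `fog_x`, and it assumes the column values look like the ones displayed above.

```python
import ast
from typing import Tuple


def parse_shape(text: str) -> Tuple[int, ...]:
    """Turn a shape string such as '(480, 640, 3)' into a tuple of ints."""
    # Strip whitespace and any quote characters left over from display or export.
    cleaned = text.strip().strip('"')
    # literal_eval only evaluates Python literals, so it is safe for untrusted strings.
    shape = ast.literal_eval(cleaned)
    # A bare number such as '7' evaluates to an int, so normalize it to a tuple.
    if isinstance(shape, int):
        shape = (shape,)
    return tuple(int(dim) for dim in shape)


print(parse_shape("(480, 640, 3)"))  # (480, 640, 3)
print(parse_shape('"(7,)"'))         # (7,)
```
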


### Google Cloud Platform

The same workflow also works on GCP.

First, register your Google Cloud credentials. In Colab, use the authentication cell below. In a non-Colab environment, run the following command instead:
```
gcloud auth application-default login --quiet --no-launch-browser
```


In [11]:
from google.colab import auth
PROJECT_ID = "canvas-rampart-342500"
auth.authenticate_user(project_id=PROJECT_ID)

INFO:google.colab.auth:Failure refreshing credentials: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7e1292cdaad0>)
INFO:google.colab.auth:Failure refreshing credentials: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7e1292dbbf70>)
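
Choosing between the two authentication paths can be automated. The sketch below uses a common heuristic: Colab kernels normally have the `google.colab` package imported already, so its presence in `sys.modules` indicates a Colab runtime. `in_colab` is a hypothetical helper name, and the check is a heuristic rather than an official API.

```python
import sys


def in_colab() -> bool:
    """Return True when the code appears to run inside a Google Colab kernel."""
    # Heuristic: Colab kernels normally have google.colab imported at startup.
    return "google.colab" in sys.modules


if in_colab():
    print("Authenticate with google.colab.auth.authenticate_user(...)")
else:
    print("Run: gcloud auth application-default login --quiet --no-launch-browser")
```
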


In [12]:
! gcloud storage buckets create gs://fog_rtx_test  --location=us-east1

Creating gs://fog_rtx_test/...
ERROR: (gcloud.storage.buckets.create) HTTPError 409: Your previous request to create the named bucket succeeded and you already own it.


In [13]:
dataset = fog_x.dataset.Dataset(
    name="demo_ds",
    path='gs://fog_rtx_test/',
)

INFO:fog_x.database.polars_connector:Prepare to load table demo_ds loaded from gs://fog_rtx_test/demo_ds.parquet.
ERROR:fog_x.database.polars_connector:Table demo_ds does not exist, available tables are dict_keys([]).


In [14]:
import numpy as np

# create a new trajectory
episode = dataset.new_episode()
# collect step data for the episode
episode.add(feature = "arm_camera_view", value = np.random.rand(480, 640, 3))
episode.add(feature = "gripper_acton", value = np.random.rand(7))
# Automatically time-aligns and saves the trajectory
episode.close()

INFO:fog_x.database.db_manager:Closing the episode with metadata {'Finished': True, 'arm_camera_view_count': 0, 'gripper_acton_count': 0}


In [17]:
dataset2 = fog_x.dataset.Dataset(
    name="demo_ds",
    path='gs://fog_rtx_test/',
)

INFO:fog_x.database.polars_connector:Prepare to load table demo_ds loaded from gs://fog_rtx_test/demo_ds.parquet.
INFO:fog_x.database.polars_connector:Table demo_ds loaded from gs://fog_rtx_test/demo_ds.parquet.


In [18]:
dataset2.get_episode_info()

episode_id,Finished,feature_arm_camera_view_type,feature_arm_camera_view_shape,arm_camera_view_count,feature_gripper_acton_type,feature_gripper_acton_shape,gripper_acton_count
i64,bool,str,str,f64,str,str,f64
0,True,"""float64""","""(480, 640, 3)""",0.0,"""float64""","""(7,)""",0.0


### Known issues

1. Exporting directly to the cloud in RLDS format with `export` does not work yet for S3 (a known issue with TensorFlow's GFile).
2. (will fix) Existence is not yet checked automatically.