# Example 1: Grab metadata from a dataset

In this example, we will learn how to get the metadata of each file in a dataset.  
The metadata contains deterministic information (e.g. recording date and duration)
as well as heuristic information such as tags.

## Grab a DataFrame of metadata from a database

First, let's import a metadata handler from the toolkit and initialize it.

In [1]:
from pydtk.db import V3DBHandler as DBHandler

db_handler = DBHandler(
    db_class='meta',
    db_host='./example.db',
    base_dir_path='../test',
    read_on_init=True
)

If you set `read_on_init` to `True`, the entire contents of the database are loaded
and stored in local memory as a pandas DataFrame.  
You can access the contents as follows.

In [2]:
db_handler.df

uuid_in_df,description,database_id,record_id,sub_record_id,data_type,path,start_timestamp,end_timestamp,content_type,contents,msg_type,msg_md5sum,count,frequency,tags,creation_time_in_df
48b887f575f4ee7bb4cb99cb97a05bf6,Driving Database,Driving Behavior Database,B05_17000000010000000829,,raw_data,records/B05_17000000010000000829/data/records.bag,1517463000.0,1517463000.0,application/rosbag,/vehicle/gnss,sensor_msgs/NavSatFix,2d3a8cd499b9b4a0249fb98fd05cfa48,1.0,,vehicle;gnss,1601023000.0
5738d5ce2396719f8547cdba8a11fd87,Driving Database,Driving Behavior Database,B05_17000000010000000829,,raw_data,records/B05_17000000010000000829/data/records.bag,1517463000.0,1517463000.0,application/rosbag,/vehicle/analog/turn_signal,std_msgs/UInt8,7c8164229e7d2c17eb95e9231617fdee,20.0,20.000019,vehicle;analog;turn;signal,1601023000.0
62467411122b2cadd9fc84d5971e26da,Driving Database,Driving Behavior Database,B05_17000000010000000829,,raw_data,records/B05_17000000010000000829/data/records.bag,1517463000.0,1517463000.0,application/rosbag,/vehicle/analog/speed_pulse,std_msgs/UInt8,7c8164229e7d2c17eb95e9231617fdee,20.0,20.000019,vehicle;analog;speed;pulse,1601023000.0
64a509adebdb5e302cb0d3abac7c274e,Driving Database,Driving Behavior Database,B05_17000000010000000829,,raw_data,records/B05_17000000010000000829/data/records.bag,1517463000.0,1517463000.0,application/rosbag,/vehicle/acceleration,geometry_msgs/AccelStamped,d8a98a5d81351b6eb0578c78557e7659,10.0,10.00001,vehicle;acceleration,1601023000.0
755a42842ae9a45379b8a0b436e39560,Driving Database,Driving Behavior Database,016_00000000030000000240,,raw_data,records/016_00000000030000000240/data/camera_0...,1489728000.0,1489729000.0,text/csv,camera/front-center,,,,,camera;front;center;timestamps,1601023000.0
aef4120ef2309a1ddb79f92f6087f559,Driving Database,Driving Behavior Database,B05_17000000010000000829,,raw_data,records/B05_17000000010000000829/data/records.bag,1517463000.0,1517463000.0,application/rosbag,/vehicle/analog/brake_signal,std_msgs/Bool,8b94c1b53db61fb6aed406028ad6332a,20.0,20.000019,vehicle;analog;brake;signal,1601023000.0
b2fa32697cd93b0de12bc64d28421753,Driving Database,Driving Behavior Database,B05_17000000010000000829,,raw_data,records/B05_17000000010000000829/data/records.bag,1517463000.0,1517463000.0,application/rosbag,/vehicle/analog/back_signal,std_msgs/Bool,8b94c1b53db61fb6aed406028ad6332a,20.0,20.000019,vehicle;analog;back;signal,1601023000.0
dc5b2c8e704cc50d5ef0ec8218921ef9,Driving Database,Driving Behavior Database,sample,,raw_data,records/sample/data/records.bag,1550126000.0,1550126000.0,application/rosbag,/points_concat_downsampled,sensor_msgs/PointCloud2,1158d486dd51d683ce2f1be655c3c181,4.0,10.0,lidar;downsampled,1601023000.0
dced43c9a965c75b0204f2fa5a79e088,Urban driving situation description.,METI2019,20191001_094731_000_car3,20191001_094731_car3_101732,raw_data,records/meti2019/ssd7.bag,1569893000.0,1569893000.0,application/rosbag,/vehicle/can_raw,autoware_can_msgs/CANPacket,8315bda71683b8ece50e17e529eea4c1,698217.0,4173.436816,vehicle;can;raw,1601023000.0
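
Once the metadata is in memory, it can be narrowed down with ordinary pandas operations. The following is a minimal sketch, not part of the toolkit: `df` is a hypothetical stand-in for `db_handler.df` built from a few rows of the table above, and `has_all_tags` is an illustrative helper that assumes tags are stored as a `;`-separated string, as they appear in the `tags` column.

```python
import pandas as pd

# Hypothetical stand-in for `db_handler.df`: three rows and three columns
# copied from the table above.
df = pd.DataFrame(
    {
        "record_id": [
            "B05_17000000010000000829",
            "016_00000000030000000240",
            "20191001_094731_000_car3",
        ],
        "contents": ["/vehicle/gnss", "camera/front-center", "/vehicle/can_raw"],
        "tags": ["vehicle;gnss", "camera;front;center;timestamps", "vehicle;can;raw"],
    }
)


def has_all_tags(tag_string, wanted):
    """Return True if every wanted tag appears in a ';'-separated tag string."""
    return set(wanted) <= set(tag_string.split(";"))


# Boolean mask selecting the rows tagged with both 'camera' and 'front'.
mask = df["tags"].apply(lambda tags: has_all_tags(tags, ("camera", "front")))
front_camera = df[mask]
print(front_camera["contents"].tolist())
```

Comparing whole tags after splitting on `;` avoids accidental substring matches, at the cost of scanning every loaded row.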


When you handle a very large dataset, the metadata contains a huge amount of information and, as a result,
it takes a long time to load all of it.  
If you only need a limited scope (e.g. metadata of files tagged 'camera' and 'front'),
it is costly to load the whole dataset and then search the loaded DataFrame.  
Therefore, the toolkit provides a method that executes an SQL query before loading the database
and limits the items to load.  

To execute an SQL query before loading metadata, set the `read_on_init` option to `False` as follows.

In [3]:
db_handler = DBHandler(
    db_class='meta',
    db_host='./example.db',
    base_dir_path='../test',
    read_on_init=False
)
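# Load only the files whose start timestamp is later than the given UNIX time
db_handler.read(where='start_timestamp > 1500000000')
print('# of metadata: {}'.format(len(db_handler.df)))
# Load only the files whose tags contain both 'camera' and 'front'
db_handler.read(where='tags like "%camera%" and tags like "%front%"')
print('# of metadata: {}'.format(len(db_handler.df)))
# Load only the files whose tags contain both 'can' and 'steering'
db_handler.read(where='tags like "%can%" and tags like "%steering%"')
print('# of metadata: {}'.format(len(db_handler.df)))
# Load the files whose tags contain either 'can' or 'camera'
db_handler.read(where='tags like "%can%" or tags like "%camera%"')
print('# of metadata: {}'.format(len(db_handler.df)))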

# of metadata: 8
# of metadata: 1
# of metadata: 0
# of metadata: 2
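
The `where` strings above read like ordinary SQL conditions. To see how such conditions behave, here is a small sketch using Python's built-in `sqlite3` module on an in-memory table. It illustrates SQL `WHERE` and `LIKE` semantics only: the table name `meta`, the helper `count_where`, and the four sample rows are assumptions made for this example, and it does not show how the toolkit stores or queries its data.

```python
import sqlite3

# Hypothetical in-memory table. The column names and values are copied from
# the metadata table shown earlier, not from the toolkit's real schema.
rows = [
    ("/vehicle/gnss", "vehicle;gnss", 1517463000.0),
    ("camera/front-center", "camera;front;center;timestamps", 1489728000.0),
    ("/points_concat_downsampled", "lidar;downsampled", 1550126000.0),
    ("/vehicle/can_raw", "vehicle;can;raw", 1569893000.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meta (contents TEXT, tags TEXT, start_timestamp REAL)")
conn.executemany("INSERT INTO meta VALUES (?, ?, ?)", rows)


def count_where(condition):
    """Count the rows that satisfy a trusted, hand-written SQL condition."""
    query = f"SELECT COUNT(*) FROM meta WHERE {condition}"
    return conn.execute(query).fetchone()[0]


print(count_where("start_timestamp > 1500000000"))
print(count_where("tags LIKE '%camera%' AND tags LIKE '%front%'"))
print(count_where("tags LIKE '%can%' AND tags LIKE '%steering%'"))
print(count_where("tags LIKE '%can%' OR tags LIKE '%camera%'"))

# LIKE with '%' wildcards matches substrings, so '%can%' also matches a
# longer tag such as 'scanner'.
substring_match = conn.execute("SELECT 'lidar;scanner' LIKE '%can%'").fetchone()[0]
print(bool(substring_match))
```

Because the sample table holds only four rows, the counts differ from those of the full database. The last query shows a caveat worth keeping in mind when filtering on tags: a `%can%` pattern matches any tag string that contains `can`, not only the exact tag `can`.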


## Get the list of record IDs corresponding to the metadata

Each row of the DataFrame acquired above corresponds to a file in the dataset.  
If you want to know which record ID a file belongs to, you can get a DataFrame of records as follows.

In [4]:
db_handler.record_id_df

,record_id,start_timestamp,end_timestamp,tags,duration
0,016_00000000030000000240,1489728000.0,1489729000.0,"[timestamps, camera, front, center]",79.957
1,20191001_094731_000_car3,1569893000.0,1569893000.0,"[vehicle, can, raw]",0.95
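
The relationship between per-file rows and per-record rows can be sketched in plain Python. This is an illustration under stated assumptions, not the toolkit's implementation: `file_rows` holds a few `(record_id, contents, tags)` values copied from the first metadata table, and `group_by_record` is a hypothetical helper that collects the contents of each record and the union of their tags.

```python
from collections import defaultdict

# Rows copied from the first metadata table: (record_id, contents, tags).
file_rows = [
    ("B05_17000000010000000829", "/vehicle/gnss", "vehicle;gnss"),
    ("B05_17000000010000000829", "/vehicle/acceleration", "vehicle;acceleration"),
    ("016_00000000030000000240", "camera/front-center", "camera;front;center;timestamps"),
    ("20191001_094731_000_car3", "/vehicle/can_raw", "vehicle;can;raw"),
]


def group_by_record(rows):
    """Map each record_id to its contents and the union of its tags."""
    records = defaultdict(lambda: {"contents": [], "tags": set()})
    for record_id, contents, tags in rows:
        records[record_id]["contents"].append(contents)
        # Tags are assumed to be ';'-separated, as in the `tags` column.
        records[record_id]["tags"].update(tags.split(";"))
    return dict(records)


records = group_by_record(file_rows)
for record_id, info in records.items():
    print(record_id, len(info["contents"]), sorted(info["tags"]))
```

Several files can share one record ID, so the per-record view has at most as many rows as the per-file view.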


You can get a list of contents as well.

In [5]:
db_handler.content_df



uuid_in_df,record_id,path,content,msg_type,tag
755a42842ae9a45379b8a0b436e39560,016_00000000030000000240,/opt/pydtk/test/records/016_000000000300000002...,camera/front-center,,"[timestamps, camera, front, center]"
dced43c9a965c75b0204f2fa5a79e088,20191001_094731_000_car3,/opt/pydtk/test/records/meti2019/ssd7.bag,/vehicle/can_raw,autoware_can_msgs/CANPacket,"[vehicle, can, raw]"
