# Example 2: Grab data based on metadata

In this example, we will learn how to grab the actual data based on metadata.

## Filter metadata

As in the previous example, let's start by searching for the metadata of analog signals.

In [1]:
import os
from pydtk.db import V3DBHandler as DBHandler

db_handler = DBHandler(
    db_class='meta',
    db_host='./example.db',
    base_dir_path='../test',
    read_on_init=False
)
db_handler.read(where='tags like "%analog%"')
db_handler.content_df

| uuid_in_df | record_id | path | content | msg_type | tag |
|---|---|---|---|---|---|
| 5738d5ce2396719f8547cdba8a11fd87 | B05_17000000010000000829 | /opt/pydtk/test/records/B05_170000000100000008... | /vehicle/analog/turn_signal | std_msgs/UInt8 | [signal, vehicle, turn, analog] |
| 62467411122b2cadd9fc84d5971e26da | B05_17000000010000000829 | /opt/pydtk/test/records/B05_170000000100000008... | /vehicle/analog/speed_pulse | std_msgs/UInt8 | [vehicle, pulse, analog, speed] |
| aef4120ef2309a1ddb79f92f6087f559 | B05_17000000010000000829 | /opt/pydtk/test/records/B05_170000000100000008... | /vehicle/analog/brake_signal | std_msgs/Bool | [brake, vehicle, analog, signal] |
| b2fa32697cd93b0de12bc64d28421753 | B05_17000000010000000829 | /opt/pydtk/test/records/B05_170000000100000008... | /vehicle/analog/back_signal | std_msgs/Bool | [signal, vehicle, analog, back] |


You can also use pandas' `query` method to filter the metadata further.  
For example, to narrow the results down to speed signals:

In [2]:
db_handler.df.query('tags.str.contains("speed")', inplace=True)
db_handler.content_df

| uuid_in_df | record_id | path | content | msg_type | tag |
|---|---|---|---|---|---|
| 62467411122b2cadd9fc84d5971e26da | B05_17000000010000000829 | /opt/pydtk/test/records/B05_170000000100000008... | /vehicle/analog/speed_pulse | std_msgs/UInt8 | [vehicle, pulse, analog, speed] |
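
The same kind of string filter can be tried on a small stand-alone DataFrame. The sketch below does not touch pydtk: the table contents and the assumption that tags are stored as one delimited string per row are made up for illustration. It passes `engine="python"` so that the expression may call `.str.contains()`, and it returns a new frame instead of filtering in place.

```python
import pandas as pd

# Stand-in metadata table (illustrative values, not read from the database).
df = pd.DataFrame(
    {
        "content": [
            "/vehicle/analog/turn_signal",
            "/vehicle/analog/speed_pulse",
            "/vehicle/analog/brake_signal",
        ],
        # Assumption: tags are stored as one string per row.
        "tags": [
            "signal;vehicle;turn;analog",
            "vehicle;pulse;analog;speed",
            "brake;vehicle;analog;signal",
        ],
    }
)

# engine="python" lets the query expression call string methods.
speed_df = df.query('tags.str.contains("speed")', engine="python")
matched_contents = speed_df["content"].tolist()
print(matched_contents)
```

Because `inplace=True` is not used here, `df` keeps all of its rows and only `speed_df` holds the filtered result.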


## Iterate metadata

You can retrieve metadata entries one by one because `DBHandler` works as an iterator.  
To get a sample, just call the built-in `next()` function on it.  
Each metadata entry is returned as a dict.

In [3]:
sample = next(db_handler)
sample

{'description': 'Driving Database',
 'database_id': 'Driving Behavior Database',
 'record_id': 'B05_17000000010000000829',
 'sub_record_id': None,
 'data_type': 'raw_data',
 'path': '/opt/pydtk/test/records/B05_17000000010000000829/data/records.bag',
 'start_timestamp': 1517463303.0,
 'end_timestamp': 1517463303.95,
 'content_type': 'application/rosbag',
 'contents': {'/vehicle/analog/speed_pulse': {'msg_type': 'std_msgs/UInt8',
   'msg_md5sum': '7c8164229e7d2c17eb95e9231617fdee',
   'count': 20,
   'frequency': 20.000019073504518,
   'tags': ['vehicle', 'analog', 'speed', 'pulse']}}}
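
Since the entry is a plain dict, its fields can be read with ordinary indexing. The sketch below copies a subset of the fields from the output above into a literal dict so that it runs without a database; the variable names `duration` and `summary` are only illustrative.

```python
# Metadata entry copied from the output above (trimmed to the fields used here).
sample = {
    "record_id": "B05_17000000010000000829",
    "start_timestamp": 1517463303.0,
    "end_timestamp": 1517463303.95,
    "content_type": "application/rosbag",
    "contents": {
        "/vehicle/analog/speed_pulse": {
            "msg_type": "std_msgs/UInt8",
            "count": 20,
            "frequency": 20.000019073504518,
            "tags": ["vehicle", "analog", "speed", "pulse"],
        }
    },
}

# Length of the recording in seconds.
duration = sample["end_timestamp"] - sample["start_timestamp"]

# "contents" maps each topic name to its per-topic metadata.
summary = []
for topic, info in sample["contents"].items():
    summary.append((topic, info["msg_type"], info["count"]))
    print(f"{topic}: {info['count']} messages of type {info['msg_type']}")

print(f"duration: {duration:.2f} s")
```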

## Grab data

Based on the metadata, we can load the actual data from the corresponding file as a NumPy array.  
`BaseFileReader` automatically chooses an appropriate model for loading the file based on the given metadata.  
Thus, you can simply call the `read` method to grab the data as follows.

In [4]:
from pydtk.io import BaseFileReader, NoModelMatchedError

reader = BaseFileReader()

try:
    timestamps, data, columns = reader.read(sample)
    print('# of frames: {}'.format(len(timestamps)))
except NoModelMatchedError as e:
    print(str(e))

Failed to load Python extension for LZ4 support. LZ4 compression will not be available.


# of frames: 20
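
The idea of picking a loader from the metadata can be illustrated with a small registry. This is a generic sketch and not how pydtk is implemented: `LOADERS`, `register`, `load_csv`, `read`, and `NoLoaderMatchedError` are all hypothetical names, and the loader only returns a string instead of opening a file.

```python
# Hypothetical registry that maps a content type to a loader function.
LOADERS = {}


class NoLoaderMatchedError(Exception):
    """Raised when no registered loader fits the metadata (hypothetical)."""


def register(content_type):
    """Return a decorator that registers a loader for one content type."""
    def decorator(func):
        LOADERS[content_type] = func
        return func
    return decorator


@register("text/csv")
def load_csv(metadata):
    # Placeholder: a real loader would open metadata["path"] and parse it.
    return f"csv loader for {metadata['path']}"


def read(metadata):
    """Pick a loader based on the metadata's content type and call it."""
    try:
        loader = LOADERS[metadata["content_type"]]
    except KeyError:
        raise NoLoaderMatchedError(
            f"no loader for {metadata['content_type']}"
        ) from None
    return loader(metadata)


print(read({"content_type": "text/csv", "path": "a.csv"}))
```

With this layout, supporting a new file type only requires registering another loader; the caller keeps using `read` and handles the error when nothing matches, much like the `try`/`except` in the cell above.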


Let's inspect the returned objects with IPython's `?` help syntax.

In [5]:
timestamps?

In [6]:
data?

In [7]:
columns?
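
Outside IPython, where the `?` syntax is unavailable, the same values can be inspected through their `shape` and `dtype` attributes. The sketch below uses synthetic stand-ins rather than the real output of `reader.read`: the shapes, the `uint8` dtype, and the column name `"data"` are assumptions chosen to resemble 20 frames sampled at 20 Hz.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins; shapes, dtype, and the column name are assumptions.
timestamps = 1517463303.0 + np.arange(20) * 0.05
data = np.zeros((20, 1), dtype=np.uint8)
columns = ["data"]

# Basic inspection of the arrays.
print(timestamps.shape, timestamps.dtype)
print(data.shape, data.dtype)

# Combine the values into a time-indexed table for easier viewing.
frame = pd.DataFrame(
    data,
    index=pd.Index(timestamps, name="timestamp"),
    columns=columns,
)
print(frame.head())
```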