This repository was archived by the owner on Apr 1, 2026. It is now read-only.

BigTable: Async client for high throughput mutate_rows #9

@mayurjain0312

Description

Environment: Ubuntu 16.04
Python version and virtual environment information: python --version reports Python 2.7.12

With a batch size of 300 and a total of 3 nodes in the instance, write throughput to Bigtable is poor when using the mutate_rows API call.

import json

from google.cloud import bigtable

BULK_WRITE_BATCH_SIZE = 300

with open(file_path) as sensor_data_input_file:
    list_direct_row_obj = []
    for line in sensor_data_input_file:
        # Skip blank lines in the input file.
        if not line.strip():
            continue

        sensor_json_data = json.loads(line)
        row_key = create_sensor_data_id(sensor_json_data)
        value = line

        # One DirectRow per input line, with a single cell holding the raw JSON.
        direct_row_obj = bigtable.row.DirectRow(row_key, table)
        column_id = 'column_id_data'.encode('utf-8')
        direct_row_obj.set_cell(column_family_id, column_id, value.encode('utf-8'))

        list_direct_row_obj.append(direct_row_obj)

        # Flush a full batch; mutate_rows blocks until the server has
        # processed all BULK_WRITE_BATCH_SIZE mutations.
        if len(list_direct_row_obj) == BULK_WRITE_BATCH_SIZE:
            table.mutate_rows(list_direct_row_obj)
            list_direct_row_obj[:] = []

    # Flush any remaining rows as a final, smaller batch.
    if list_direct_row_obj:
        table.mutate_rows(list_direct_row_obj)
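
Until an async client exists, one alternative worth trying is the library's MutationsBatcher, which buffers DirectRow objects and issues mutate_rows calls for you once a size threshold is reached. A minimal sketch, assuming the same table, column_family_id, file_path, and create_sensor_data_id names as in the snippet above; this moves the batching bookkeeping into the client library but is not a true async path:

import json

from google.cloud import bigtable

FLUSH_COUNT = 300  # rows buffered before the batcher issues a mutate_rows call

# table, column_family_id, file_path, and create_sensor_data_id are assumed
# to be the same objects/helpers used in the snippet above.
batcher = table.mutations_batcher(flush_count=FLUSH_COUNT)

with open(file_path) as sensor_data_input_file:
    for line in sensor_data_input_file:
        if not line.strip():
            continue

        sensor_json_data = json.loads(line)
        row = bigtable.row.DirectRow(create_sensor_data_id(sensor_json_data), table)
        row.set_cell(column_family_id, 'column_id_data'.encode('utf-8'),
                     line.encode('utf-8'))

        # The batcher buffers rows and calls mutate_rows automatically
        # once flush_count rows have accumulated.
        batcher.mutate(row)

# Flush whatever is still buffered after the file is exhausted.
batcher.flush()

Note that the batcher's flushes are still synchronous, so a genuinely asynchronous (or multi-threaded) client would still be needed to overlap network round trips, which is what this issue requests.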

Metadata

Labels

api: bigtable (Issues related to the googleapis/python-bigtable API)
type: feature request (‘Nice-to-have’ improvement, new feature or different behavior or design)
