New Data Dictionary and atomic dictionary operations #40

Closed
yoshinorim opened this Issue · 1 comment

@yoshinorim
Owner

Since we need to manage more dictionary information, it is time to define a better dictionary data model. This has to be done before rolling out widely.

Candidate dictionary model after discussion:

  • Table Name => internal index ids
    key: RDBSE_KEYDEF::DDL_ENTRY_INDEX_START_NUMBER(0x1) + dbname.tablename
    value: version, {index_id}*n_indexes_of_the_table

  • Internal index id => CF id
    key: RDBSE_KEYDEF::INDEX_CF_MAPPING(0x2) + index_id
    value: version, cf_id

  • CF id => cf flags
    key: RDBSE_KEYDEF::CF_DEFINITION(0x3) + cf_id
    value: version, {is_reverse_cf, is_auto_cf}

  • Ongoing drop index entry
    key: RDBSE_KEYDEF::DDL_DROP_INDEX_ONGOING(0x4) + index_id
    value: version

  • Binlog entry (updated at commit)
    key: RDBSE_KEYDEF::BINLOG_INFO_INDEX_NUMBER (0x5)
    value: version, {binlog_name,binlog_pos,binlog_gtid}

  • table_stats (same as innodb)
    key: RDBSE_KEYDEF::TABLE_STATISTICS(0x6) + dbname.tablename
    value: version, {n_rows, clustered_index_size, sum_of_other_index_sizes, last_update}

  • index_stats (same as innodb)
    key: RDBSE_KEYDEF::INDEX_STATISTICS(0x7) + index_id
    value: version, {stat_value, sample_size, last_update, stat_description}
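
To make the key layout above concrete, here is a minimal sketch of how the first two entry types might be encoded. The issue only fixes the tag values (0x1 to 0x7) and suggests a version of "2 bytes possibly"; the 4-byte big-endian width for the tag, index_id, and cf_id, the enum name `DictEntryType`, and the helper names `append_be`, `table_key`, `index_cf_key`, and `index_cf_value` are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <string>

// Tag values come from the list above; the enum name itself is hypothetical.
enum DictEntryType : uint32_t {
  DDL_ENTRY_INDEX_START_NUMBER = 0x1,
  INDEX_CF_MAPPING = 0x2,
  CF_DEFINITION = 0x3,
  DDL_DROP_INDEX_ONGOING = 0x4,
  BINLOG_INFO_INDEX_NUMBER = 0x5,
  TABLE_STATISTICS = 0x6,
  INDEX_STATISTICS = 0x7,
};

// Append the low `nbytes` bytes of `v` in big-endian order, so that
// byte-wise key comparison matches numeric order.
inline void append_be(std::string *out, uint64_t v, int nbytes) {
  for (int shift = (nbytes - 1) * 8; shift >= 0; shift -= 8)
    out->push_back(static_cast<char>((v >> shift) & 0xff));
}

// Key for "Table Name => internal index ids": tag + dbname.tablename.
// The 4-byte tag width is an assumption.
std::string table_key(const std::string &dbname_tablename) {
  std::string key;
  append_be(&key, DDL_ENTRY_INDEX_START_NUMBER, 4);
  key += dbname_tablename;
  return key;
}

// Key for "Internal index id => CF id": tag + index_id (assumed 4 bytes).
std::string index_cf_key(uint32_t index_id) {
  std::string key;
  append_be(&key, INDEX_CF_MAPPING, 4);
  append_be(&key, index_id, 4);
  return key;
}

// Value for the same entry: 2-byte version followed by cf_id (assumed 4 bytes).
std::string index_cf_value(uint16_t version, uint32_t cf_id) {
  std::string value;
  append_be(&value, version, 2);
  append_be(&value, cf_id, 4);
  return value;
}
```

Putting the tag first means all entries of one type form a contiguous key range, so a prefix scan on the tag would enumerate, say, every index-to-CF mapping.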

side notes:

  • We agreed to have a dedicated column family for the data dictionary.
  • The new data dictionary is not compatible with the current dictionary. This is fine as long as we're in the alpha stage.
  • Each value carries a version number (possibly 2 bytes) to make future format changes easier.
  • As for implementation, I think extending Table_ddl_manager would be fine.
  • DDL operations should be atomic. For example, adding a new index with a new column family requires three Put() calls. They have to be applied atomically, by using a WriteBatch that includes all of the Puts.
  • Currently the dictionary is cached in ddl_hash. I think table_name->index_id->cf_id could also be managed via a MySQL hash, in denormalized form, for performance reasons.
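
The atomicity requirement in the side notes can be sketched as follows. This is not RocksDB code: `ToyBatch` and `ToyDict` are hypothetical in-memory stand-ins that mimic the shape of collecting Puts in a `rocksdb::WriteBatch` and applying them with a single `DB::Write()` call. The readable `"1:"`, `"2:"`, `"3:"` key prefixes, the `"v1"` version marker, and the function `add_index_with_new_cf` are likewise assumptions for illustration only.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a write batch: it only records Puts.
struct ToyBatch {
  std::vector<std::pair<std::string, std::string>> puts;
  void Put(const std::string &k, const std::string &v) {
    puts.emplace_back(k, v);
  }
};

// Hypothetical stand-in for the dictionary column family.
struct ToyDict {
  std::map<std::string, std::string> rows;

  // Apply every Put in the batch, or none of them.
  bool Write(const ToyBatch &batch, bool simulate_failure = false) {
    if (simulate_failure) return false;  // nothing has been applied yet
    std::map<std::string, std::string> staged = rows;
    for (const auto &kv : batch.puts) staged[kv.first] = kv.second;
    rows.swap(staged);
    return true;
  }
};

// Adding an index in a new column family touches three dictionary entries:
// table => index ids, index id => CF id, and CF id => CF flags.
bool add_index_with_new_cf(ToyDict *dict, const std::string &table,
                           const std::vector<uint32_t> &all_index_ids,
                           uint32_t new_index_id, uint32_t new_cf_id,
                           bool simulate_failure = false) {
  std::string ids = "v1";
  for (uint32_t id : all_index_ids) ids += "," + std::to_string(id);

  ToyBatch batch;
  batch.Put("1:" + table, ids);
  batch.Put("2:" + std::to_string(new_index_id),
            "v1," + std::to_string(new_cf_id));
  batch.Put("3:" + std::to_string(new_cf_id),
            "v1,is_reverse_cf=0,is_auto_cf=0");
  return dict->Write(batch, simulate_failure);
}
```

If the three Puts were issued separately, a crash between them could leave an index id with no column family mapping; grouping them in one batch avoids that intermediate state.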
@yoshinorim yoshinorim self-assigned this
@yoshinorim yoshinorim closed this