
Lift the storage limit for tag and attribute management#12447

Merged
JackieTien97 merged 11 commits into apache:master from linxt20:Lift_the_storage_limit_for_tag_and_attribute_management
May 6, 2024

Conversation

Contributor

@linxt20 linxt20 commented Apr 29, 2024

The goal of this implementation is that a tag map whose serialized size exceeds tagAttributeTotalSize no longer fails: the program allocates a new block of length tagAttributeTotalSize and continues storing into it, repeating as needed.

  • Storage design:

    • How the newly allocated blocks of length tagAttributeTotalSize are recorded: during tag-map serialization, tag maps that exceed tagAttributeTotalSize are distinguished from those that do not by an identifier, which is the first int value read.
      • If the tag map does not exceed tagAttributeTotalSize, the original storage format is kept (mapSize key-value key-value). The first value is mapSize, recorded as a positive number, 0, or -1 (-1 when the map is null; 0 when the map is non-null but empty).
      • If the tag map exceeds tagAttributeTotalSize, the number of storage blocks it occupies, offsetListNum, is recorded as -offsetListNum. The format is -offsetListNum offset1 ... offsetN. Note that offset1 records the offset of the second storage block; the offset of the starting block is already recorded in the measurement node, so the number of offsets is one less than offsetListNum. The mapSize and key-value pairs then follow in the normal format.
    • In this way, the storage format of old data is unchanged, new data beyond the old limit can be stored, and the original deserialization interface can be reused.
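The discriminator scheme above can be sketched as follows. This is a minimal illustration; the class and method names, and the use of a plain ByteBuffer, are invented for this sketch and are not the actual IoTDB code:

```java
import java.nio.ByteBuffer;
import java.util.List;

public final class TagMapHeaderSketch {

    // Single block: the first int is mapSize itself
    // (-1 when the map is null, 0 when it is empty, positive otherwise).
    public static void writeSingleBlockHeader(ByteBuffer buf, int mapSize) {
        buf.putInt(mapSize);
    }

    // Multi block: the first int is -offsetListNum, followed by the offsets
    // of blocks 2..n (the first block's offset already lives in the
    // measurement node, so one fewer offset than blocks is stored).
    public static void writeMultiBlockHeader(ByteBuffer buf, List<Long> tailOffsets) {
        buf.putInt(-(tailOffsets.size() + 1));
        for (long offset : tailOffsets) {
            buf.putLong(offset);
        }
    }

    // Reader side: any first int >= -1 means the old single-block format.
    public static boolean isSingleBlock(int firstInt) {
        return firstInt >= -1;
    }
}
```

Because -offsetListNum is always at most -2 in the multi-block case, old readers' values (mapSize of -1, 0, or positive) never collide with it, which is what lets the old format stay untouched.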
  • Implementation details:

    • The read and readTag interfaces of TagLogFile were simplified to take only an offset parameter. On the one hand, the read length is no longer fixed, so the caller cannot know in advance how much space is needed; on the other hand, TagLogFile is an externally facing interface and should be fully encapsulated, so the concrete length is not the caller's concern.
    • The number of required offsets is derived from the inequality Num * MAX_LENGTH < TotalMapSize + 4 + Long.BYTES * Num <= MAX_LENGTH * (Num + 1). Within range there are at most two solutions, and the smaller one is taken to save space; a second solution exists exactly when merely adding one more offset would push the payload past the current block's capacity.
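The calculation above can be sketched as a loop that finds the smallest Num satisfying the inequality. MAX_LENGTH stands in for tagAttributeTotalSize and its value here is an assumption for illustration:

```java
public final class BlockCountSketch {
    // Stand-in for tagAttributeTotalSize; the real value is a config parameter.
    static final int MAX_LENGTH = 700;

    // Smallest num such that the serialized map, the 4-byte header, and
    // 8 bytes per extra offset fit into (num + 1) blocks. Because each step
    // adds only 8 bytes of offset while a block adds MAX_LENGTH bytes of
    // capacity, the first num that fits is the smaller of the two solutions.
    public static int extraOffsetNum(int totalMapSize) {
        int num = 0;
        while (totalMapSize + 4 + Long.BYTES * num > MAX_LENGTH * (num + 1)) {
            num++;
        }
        return num;
    }
}
```

For example, a 2100-byte map with MAX_LENGTH = 700 needs 3 extra offsets rather than 2, because 2100 + 4 + 16 = 2120 just exceeds the 3-block capacity of 2100 — the "adding the offset overflows the block" case mentioned above.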
  • Current reading process:

    • The reading flows of read and readTag are very similar, except that read additionally reads the attributes, so the shared logic is factored into a function parseByteBuffer.
    • This function first reads one blockSize of data. If the first int is greater than or equal to -1, the record occupies a single block; otherwise it spans multiple blocks. For a single block, the buffer position is simply restored to before the first int and the buffer is returned. For multiple blocks, a buffer large enough for all blocks is allocated based on the offset-list size, and the first block's data is copied in; then the 8-byte offsets following the first int are read one after another, and the block at each offset is read and appended to the large buffer. After the loop, the buffer's position is set just before tagMapSize and the buffer is returned.
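A simplified sketch of this reading flow, assuming a readBlock function in place of the real FileChannel read and an illustrative BLOCK_SIZE (neither is the actual IoTDB code):

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongFunction;

public final class ParseByteBufferSketch {
    // Stand-in for tagAttributeTotalSize.
    static final int BLOCK_SIZE = 64;

    public static ByteBuffer parse(LongFunction<ByteBuffer> readBlock, long startOffset) {
        ByteBuffer first = readBlock.apply(startOffset);
        int firstInt = first.getInt();
        if (firstInt >= -1) {
            // Single block: step back so the caller reads mapSize itself.
            first.position(first.position() - Integer.BYTES);
            return first;
        }
        int blockNum = -firstInt;
        List<Long> tailOffsets = new ArrayList<>();
        for (int i = 1; i < blockNum; i++) {
            tailOffsets.add(first.getLong()); // offsets of blocks 2..n
        }
        // One large buffer that can hold every block, then fill it.
        ByteBuffer all = ByteBuffer.allocate(BLOCK_SIZE * blockNum);
        all.put(first); // rest of the first block, after the header
        for (long offset : tailOffsets) {
            all.put(readBlock.apply(offset));
        }
        all.flip(); // position now sits just before mapSize
        return all;
    }
}
```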
  • Current writing process:

    • Serializing the content: first compute the actual serialized length. If it does not exceed one blockSize, the content is stored in the single-block format (which is also the old storage format). If it does, space is reserved for the offset list and the actual content is serialized after it.
    • Writing the content to the file:
      • During writing, a negative write offset means appending at the end of the file.
      • First, the original data at the target offset is read to obtain the block positions it occupies. The function parseOffsetList implements this; compared with parseByteBuffer it is pruned to read only the offset list at the head of the record, reducing processing complexity.
      • Three cases are then handled:
        • Case 1: the record previously occupied one block and the new data also fits in one block, so it is written directly.
        • Case 2: the record previously occupied more blocks than the new data needs, so the original blocks are reused. A ByteBuffer of the original total block size is created, the original offset-list space is preserved, and the new data is placed after it; if the new data itself spans multiple blocks, the position must be adjusted to skip the space reserved during serialization. Each block is then written back, in order, to the offsets in the original offset list.
        • Case 3: the new data needs at least as much space as the original. The space reserved during serialization is sufficient, but some offsets are only known after writing, so the serialized content (without the offsets) is first written to the blocks recorded in the original offset list; when those run out, writing continues at the end of the FileChannel, and each newly written block's offset is appended to the offset list. Finally the offset list is assembled into a buffer and written sequentially starting from the record's first position, completing the write.
  • The difficulty in this part is that file offsets change as reads and writes proceed, so the position and limit of each ByteBuffer must be watched and adjusted throughout.
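The three-way split in the write path can be summarized with a small selector. This is purely illustrative (the names are invented, and the real code operates on buffers rather than returning labels):

```java
public final class WriteCaseSketch {
    public static String chooseCase(int oldBlockNum, int newBlockNum) {
        if (oldBlockNum == 1 && newBlockNum == 1) {
            return "case1: single block before and after, overwrite in place";
        }
        if (oldBlockNum > newBlockNum) {
            return "case2: fewer blocks needed, reuse old blocks in order";
        }
        // oldBlockNum <= newBlockNum
        return "case3: fill old blocks first, append new blocks at the end, "
            + "then patch the offset list";
    }
}
```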

  • Extra work

    • Optimization of SRStatementGenerator: a non-standard re-implementation of tag-map reading in SRStatementGenerator was found and fixed. Making parseByteBuffer static lets SRStatementGenerator call it directly, reducing changes to the source code and avoiding inconsistency.
    • Limits on the number and size of tags and attributes: parameters restricting the number of tags and attributes and the size of each entry are introduced. Tags and attributes currently get the same upper limit each, rather than one combined limit, which is more reasonable. The checks are enforced during serialization in the write path: when the serialized sizes of the tag map and attribute map are computed, checks on mapSize and entrySize are added, so validation completes before anything is written and lengths are not processed twice.
    • Adding the tag index to metadata memory control: detailed memory accounting is implemented for the addIndex and removeIndex functions in TagManager, together with allocation and release of the storage structures, completing the basic memory-accounting interface for index growth and shrinkage. When memory would overflow, new tags cannot be created. The corresponding accounting is added in renameTagOrAttributeKey, setTagsOrAttributesValue, addTags, and upsertAliasAndTagsAndAttributes in SchemaRegionMemoryImpl and SchemaRegionPBTreeImpl when the tag map is not empty; on overflow, execution is refused and an error is thrown.
    • Fixing a memory-control omission: the alias was already counted in memory control, but additions and modifications of an alias were not checked for overflow, so an alias could still be added after memory had overflowed, worsening the overflow. The fix is in upsertAlias in SchemaRegionMemoryImpl and SchemaRegionPBTreeImpl, which now checks for memory overflow when the alias is not null.
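The overflow guard described in the last two points can be sketched as follows. The class name, method names, and exception type are assumptions for illustration, not the actual SchemaRegion code:

```java
import java.util.concurrent.atomic.AtomicLong;

public final class SchemaMemoryGuardSketch {
    private final AtomicLong used = new AtomicLong();
    private final long limit;

    public SchemaMemoryGuardSketch(long limit) {
        this.limit = limit;
    }

    // Refuse the update and throw when the new total would exceed the limit,
    // mirroring "execution is refused and an error is thrown" above.
    public void allocate(long delta) {
        long newUsed = used.addAndGet(delta);
        if (newUsed > limit) {
            used.addAndGet(-delta); // roll back before refusing
            throw new IllegalStateException("schema memory limit exceeded");
        }
    }

    // Counterpart for removeIndex / alias removal paths.
    public void release(long delta) {
        used.addAndGet(-delta);
    }

    public long used() {
        return used.get();
    }
}
```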

Contributor

@JackieTien97 JackieTien97 left a comment


We also need to add some ITs (integration tests) for this.

Comment on lines +254 to +255
private int tagAttributeEachMaxNum = 20;
private int tagAttributeEachMaxSize = 100;

Add both of these configs to iotdb-common.properties in iotdb-core/node-commons/src/assembly/resources/conf/iotdb-common.properties and load them in CommonDescriptor.loadCommonProps.


Also remember to update the comments about tag_attribute_total_size in iotdb-core/node-commons/src/assembly/resources/conf/iotdb-common.properties.

Comment on lines +228 to +229
if (blockOffset.size() > blockNumReal) { // if the original space is larger than the new space, the original

>= or >?

blockOffset.add(position);
for (int i = 1; i < blockNum; i++) {
blockOffset.add(ReadWriteIOUtils.readLong(byteBuffers));
Long nextPosition = ReadWriteIOUtils.readLong(byteBuffers);

Suggested change:
- Long nextPosition = ReadWriteIOUtils.readLong(byteBuffers);
+ long nextPosition = ReadWriteIOUtils.readLong(byteBuffers);

@@ -181,18 +180,16 @@ public void addIndex(String tagKey, String tagValue, IMeasurementMNode<?> measur

int memorySize = 0;

Make it a long.

@@ -212,20 +209,18 @@ public void removeIndex(String tagKey, String tagValue, IMeasurementMNode<?> mea
// init memory size
int memorySize = 0;

Make it a long.

@JackieTien97 JackieTien97 merged commit 7df7e5c into apache:master May 6, 2024
SzyWilliam pushed a commit to SzyWilliam/iotdb that referenced this pull request Nov 26, 2024