ReferencedTabletFile has getTableId method that may be misleading. #4094

Open
EdColeman opened this issue Dec 20, 2023 · 2 comments
Labels
enhancement This issue describes a new feature, improvement, or optimization.

@EdColeman
Contributor

In ReferencedTabletFile and inherited by StoredTabletFile there is a method

    public TableId getTableId() ...

The id is derived by parsing the file path. If a table has been cloned, the id in the path may not be the id of the table that references the file. At a minimum, the method could be renamed, and it could possibly return an id type other than TableId so that an id derived from the path cannot be used directly as a TableId.

The issue would arise if someone assumed that the TableId returned from a ReferencedTabletFile was the same as the "real" table id and then made metadata changes based on incorrect information. Going by the method signature alone, it seems easy to make an incorrect assumption about what the returned id represents.
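
One way to picture the suggestion is a minimal sketch in which the id parsed from the path gets its own type. Everything here is hypothetical and for illustration only: `DemoTableId`, `PathDirId`, `DemoTabletFile`, `DemoMetadata`, and the method name `getTableDirIdFromPath` are not Accumulo classes or methods, and the path parsing is a simplified stand-in for whatever the real code does.

```java
// Hypothetical stand-in for a "real" table id; NOT Accumulo's TableId class.
final class DemoTableId {
    final String canonical;

    DemoTableId(String canonical) {
        this.canonical = canonical;
    }
}

// Hypothetical type for an id that was only parsed out of a file path.
// It is deliberately unrelated to DemoTableId.
final class PathDirId {
    final String value;

    PathDirId(String value) {
        this.value = value;
    }
}

// Hypothetical, simplified tablet file that only knows its path.
final class DemoTabletFile {
    private final String path;

    DemoTabletFile(String path) {
        this.path = path;
    }

    // The name says where the id comes from. Assumes the layout
    // .../tables/<id>/<tablet dir>/<file name>.
    PathDirId getTableDirIdFromPath() {
        String[] parts = path.split("/");
        return new PathDirId(parts[parts.length - 3]);
    }
}

final class DemoMetadata {
    // Accepts only DemoTableId, so a PathDirId cannot be passed by accident.
    static String describeUpdate(DemoTableId owner) {
        return "update table " + owner.canonical;
    }

    public static void main(String[] args) {
        DemoTabletFile file = new DemoTabletFile(
            "hdfs://localhost:8020/accumulo/tables/1/t-000004d/A0000009.rf");
        PathDirId fromPath = file.getTableDirIdFromPath();
        System.out.println("id parsed from path: " + fromPath.value);
        // DemoMetadata.describeUpdate(fromPath); // would not compile: wrong type
        System.out.println(describeUpdate(new DemoTableId("5")));
    }
}
```

With separate types, a caller who wants to treat the path-derived id as a table id has to convert it explicitly, which makes the assumption visible in the code.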

@EdColeman EdColeman added the enhancement This issue describes a new feature, improvement, or optimization. label Dec 20, 2023
@EdColeman EdColeman added this to To do in 3.1.0 via automation Dec 20, 2023
@rsingh433
Contributor

I will look at this.

@EdColeman
Contributor Author

To illustrate the issue, I created a clone and then listed the metadata files. The original table is tableA (id=1), the clone is cloneA (id=5). The resulting file entries for the clone look like this:

    5;0020 file:hdfs://localhost:8020/accumulo/tables/1/t-000004d/A0000009.rf

If the table id is pulled from the file path, it would show id=1, but this entry really belongs to id=5. Both tableA and cloneA are "sharing" the file. If cloneA (id=5) is compacted, new files will be created under the id=5 directory.
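
The mismatch can be reproduced with plain string handling on the entry above. This is only a sketch: `CloneIdDemo`, `idFromRow`, and `idFromPath` are hypothetical helpers written for this comment, not Accumulo code, and they assume the row format `<id>;<end row>` or `<id><` and the path layout `.../tables/<id>/<tablet dir>/<file name>` seen in the listings.

```java
final class CloneIdDemo {

    // Table id taken from a metadata row such as "5;0020" or "5<":
    // everything before the first ';' or '<'.
    static String idFromRow(String row) {
        int end = 0;
        while (end < row.length() && row.charAt(end) != ';' && row.charAt(end) != '<') {
            end++;
        }
        return row.substring(0, end);
    }

    // Id taken from the file path: the directory two levels above the file name.
    static String idFromPath(String path) {
        String[] parts = path.split("/");
        return parts[parts.length - 3];
    }

    public static void main(String[] args) {
        String row = "5;0020";
        String path = "hdfs://localhost:8020/accumulo/tables/1/t-000004d/A0000009.rf";

        String rowId = idFromRow(row);
        String pathId = idFromPath(path);

        System.out.println("id from metadata row: " + rowId);
        System.out.println("id from file path:    " + pathId);
        System.out.println("ids match: " + rowId.equals(pathId));
    }
}
```

For the cloned tablet the two ids differ, so only the id from the metadata row identifies the table that owns the entry.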

Additional listings

Output of tables -ls:

    > tables -ls
    accumulo.metadata    =>        !0
    accumulo.root        =>        +r
    accumulo.replication =>      +rep
    tableA               =>         1
    tableB               =>         2
    tableC               =>         3
    tableD               =>         4
    cloneA               =>         5

Listing the metadata files:

    > scan -np -t accumulo.metadata -c file
    1;0020 file:hdfs://localhost:8020/accumulo/tables/1/t-000004d/A0000009.rf []	299,21
    1;0040 file:hdfs://localhost:8020/accumulo/tables/1/t-000004e/A0000008.rf []	296,20
    1;0060 file:hdfs://localhost:8020/accumulo/tables/1/t-000004f/A000000f.rf []	298,20
    1;0080 file:hdfs://localhost:8020/accumulo/tables/1/t-000004g/A000000h.rf []	303,20
    1< file:hdfs://localhost:8020/accumulo/tables/1/default_tablet/A000000e.rf []	288,19
    2;0020 file:hdfs://localhost:8020/accumulo/tables/2/t-000004h/A000000i.rf []	291,21
    ...
    4< file:hdfs://localhost:8020/accumulo/tables/4/default_tablet/A0000016.rf []	292,19
    5;0020 file:hdfs://localhost:8020/accumulo/tables/1/t-000004d/A0000009.rf []	299,21
    5;0040 file:hdfs://localhost:8020/accumulo/tables/1/t-000004e/A0000008.rf []	296,20
    5;0060 file:hdfs://localhost:8020/accumulo/tables/1/t-000004f/A000000f.rf []	298,20
    5;0080 file:hdfs://localhost:8020/accumulo/tables/1/t-000004g/A000000h.rf []	303,20
    5< file:hdfs://localhost:8020/accumulo/tables/1/default_tablet/A000000e.rf []	288,19

@EdColeman EdColeman assigned EdColeman and unassigned rsingh433 and EdColeman Jan 10, 2024