@JingsongLi JingsongLi commented Apr 24, 2022

We can store the table schema (field types, options, etc.) under the table store's filesystem path.

The schema should be stored in a format that supports evolution, which means that each field carries an ID.
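
To illustrate why field IDs help with evolution, here is a minimal sketch. It is not the PR's actual schema classes: `SchemaEvolutionSketch`, `Field`, `rename`, and `mapNewToOldNames` are hypothetical names chosen for this example. The idea is that a rename changes only the name, so data written under the old schema can still be matched to the new schema by ID.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SchemaEvolutionSketch {

    /** Hypothetical field descriptor: the id stays fixed while the name may change. */
    static final class Field {
        final int id;
        final String name;
        final String type;

        Field(int id, String name, String type) {
            this.id = id;
            this.name = name;
            this.type = type;
        }
    }

    /** Returns a copy of the fields with one field renamed; ids are left untouched. */
    static List<Field> rename(List<Field> fields, String oldName, String newName) {
        List<Field> result = new ArrayList<>();
        for (Field f : fields) {
            result.add(f.name.equals(oldName) ? new Field(f.id, newName, f.type) : f);
        }
        return result;
    }

    /** Maps each field name in the new schema to its name in the old schema, matching by id. */
    static Map<String, String> mapNewToOldNames(List<Field> oldFields, List<Field> newFields) {
        Map<Integer, String> oldNamesById = new LinkedHashMap<>();
        for (Field f : oldFields) {
            oldNamesById.put(f.id, f.name);
        }
        Map<String, String> mapping = new LinkedHashMap<>();
        for (Field f : newFields) {
            // A field that did not exist in the old schema maps to null.
            mapping.put(f.name, oldNamesById.get(f.id));
        }
        return mapping;
    }
}
```

Matching by name alone would lose track of a renamed column; matching by ID does not.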

@JingsongLi JingsongLi changed the title from "[FLINK-27366] Record metadata on filesystem path" to "[FLINK-27366] Record schema on filesystem path" on Apr 25, 2022
@JingsongLi JingsongLi force-pushed the schema branch 3 times, most recently from 2e1c59d to c7c9095 on May 16, 2022 08:32
/** Json serializer for jackson. */
public interface JsonSerializer<T> {

void serializer(T t, JsonGenerator generator) throws IOException;
Contributor

The method name should be serialize.


void serializer(T t, JsonGenerator generator) throws IOException;

T deserializer(JsonNode node);
Contributor

The method name should be deserialize.

return fieldIds.stream().max(Integer::compareTo).orElse(-1);
}

private static void collectFieldIds(Set<Integer> fieldIds, DataType type) {
Contributor

Use the order (DataType type, Set<Integer> fieldIds). Input arguments should come before output arguments.
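
The suggested parameter order can be sketched as follows. This is an illustration only: `TypeNode` is a hypothetical stand-in for the real DataType, and `highestFieldId` is an assumed name for the enclosing method, which the snippet above does not show.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FieldIdSketch {

    /** Hypothetical stand-in for a nested type: a field id plus child types. */
    static final class TypeNode {
        final int id;
        final List<TypeNode> children;

        TypeNode(int id, TypeNode... children) {
            this.id = id;
            this.children = Arrays.asList(children);
        }
    }

    /** Returns the highest field id found in the given fields, or -1 if there are none. */
    static int highestFieldId(List<TypeNode> fields) {
        Set<Integer> fieldIds = new HashSet<>();
        for (TypeNode field : fields) {
            collectFieldIds(field, fieldIds);
        }
        return fieldIds.stream().max(Integer::compareTo).orElse(-1);
    }

    // Input argument (type) first, output argument (fieldIds) last.
    private static void collectFieldIds(TypeNode type, Set<Integer> fieldIds) {
        fieldIds.add(type.id);
        for (TypeNode child : type.children) {
            collectFieldIds(child, fieldIds);
        }
    }
}
```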

Comment on lines +60 to +63
return listVersionedFiles(schemaDirectory(), SCHEMA_PREFIX)
.reduce(Math::max)
.map(this::schema);
Contributor

Add hint files just like SnapshotFinder? Maybe extract common classes for both snapshot and schema.

Contributor Author

I don't think that's necessary for schemas, because there won't be many schema versions.

Contributor Author

Schema changes are very infrequent compared to snapshot generation.
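
For reference, the lookup without hint files amounts to listing the schema directory and keeping the highest version. The sketch below uses java.nio rather than the PR's own file utilities. The class name `LatestSchemaSketch` and the `schema-<id>` file naming are assumptions, since the value of SCHEMA_PREFIX is not shown in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

final class LatestSchemaSketch {

    // Assumed naming convention: "schema-<numeric id>".
    static final String SCHEMA_PREFIX = "schema-";

    /** Picks the highest version among file names; other files (e.g. temp files) are ignored. */
    static Optional<Long> latestVersion(Stream<String> fileNames) {
        return fileNames
                .filter(name -> name.matches(SCHEMA_PREFIX + "\\d+"))
                .map(name -> Long.parseLong(name.substring(SCHEMA_PREFIX.length())))
                .reduce(Long::max);
    }

    /** Lists the directory and returns the latest version, or empty if no schema exists yet. */
    static Optional<Long> latestVersion(Path schemaDirectory) {
        if (!Files.isDirectory(schemaDirectory)) {
            return Optional.empty();
        }
        try (Stream<Path> files = Files.list(schemaDirectory)) {
            return latestVersion(files.map(p -> p.getFileName().toString()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A full listing costs time proportional to the number of schema files, which stays small if schema changes are as rare as the author expects.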

return schema;
} else {
// retry
FileUtils.deleteOrWarn(temp);
Contributor

If this method fails with an exception, the temp file will not be cleaned up.

Contributor Author

Yes, but we don't have another solution.

Contributor

You do. Wrap the code in try ... finally.

import static org.junit.jupiter.api.Assertions.assertThrows;

/** Test for {@link SchemaManager}. */
public class SchemaManagerTest {
Contributor

This lacks concurrency tests and cleanup tests for commitNewVersion.

Contributor Author

For the cleanup tests, I'll add a check that runs after each test.

Contributor Author

I'll add one testConcurrentCommit test.
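
A concurrent-commit test could look roughly like the sketch below. It is not the test added in the PR. To stay self-contained it replaces the filesystem and lock with an in-memory map whose `putIfAbsent` plays the role of "rename succeeds only if the target does not exist". `ConcurrentCommitSketch` and `runConcurrentCommits` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ConcurrentCommitSketch {

    // In-memory stand-in for the schema directory: version id -> schema content.
    private final ConcurrentSkipListMap<Long, String> committed = new ConcurrentSkipListMap<>();

    /** Commits under the next free id, retrying when another writer takes that id first. */
    long commitNewVersion(String schema) {
        while (true) {
            Map.Entry<Long, String> last = committed.lastEntry();
            long id = last == null ? 0 : last.getKey() + 1;
            if (committed.putIfAbsent(id, schema) == null) {
                return id;
            }
            // Another writer committed this id in the meantime: retry with a fresh id.
        }
    }

    /** Runs the given number of concurrent commits and returns the ids they obtained. */
    static Set<Long> runConcurrentCommits(int writers) {
        ConcurrentCommitSketch manager = new ConcurrentCommitSketch();
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < writers; i++) {
                String schema = "schema from writer " + i;
                tasks.add(() -> manager.commitNewVersion(schema));
            }
            Set<Long> ids = new TreeSet<>();
            for (Future<Long> result : pool.invokeAll(tasks)) {
                ids.add(result.get());
            }
            return ids;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The property worth asserting is that every writer ends up with a distinct id and that the ids have no gaps, for example that 8 writers produce exactly the ids 0 through 7.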

Comment on lines 124 to 126
FileUtils.writeFileUtf8(temp, schema.toString());

Boolean success = lock.runWithLock(() -> temp.getFileSystem().rename(temp, finalFile));
if (success) {
return schema;
} else {
// retry
FileUtils.deleteOrWarn(temp);
boolean success = false;
try {
Contributor

The try block should also cover the file write, because the write may be only partially completed.
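
Combined with the earlier try ... finally suggestion, the cleanup could be structured roughly as below. This is a sketch using java.nio rather than the PR's FileUtils and lock: `TempFileCommitSketch`, `commit`, and the local `deleteOrWarn` helper are hypothetical. `Files.move` without options stands in for the locked rename and is not atomic across processes, which is presumably why the PR guards the rename with a lock.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

final class TempFileCommitSketch {

    /**
     * Writes content to temp and moves it to finalFile. Returns false if finalFile already
     * exists. The temp file is removed on every unsuccessful path, including a failed write.
     */
    static boolean commit(Path temp, Path finalFile, String content) throws IOException {
        boolean success = false;
        try {
            // The write is inside the try block: a partial write must be cleaned up too.
            Files.write(temp, content.getBytes(StandardCharsets.UTF_8));
            try {
                Files.move(temp, finalFile);
                success = true;
            } catch (FileAlreadyExistsException e) {
                // Another writer committed this version first; the caller may retry.
            }
            return success;
        } finally {
            if (!success) {
                deleteOrWarn(temp);
            }
        }
    }

    // Best-effort delete that never masks the original exception.
    private static void deleteOrWarn(Path file) {
        try {
            Files.deleteIfExists(file);
        } catch (IOException e) {
            System.err.println("Failed to delete " + file + ": " + e);
        }
    }
}
```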

@JingsongLi JingsongLi merged commit 1aac9a1 into apache:master May 18, 2022
@JingsongLi JingsongLi deleted the schema branch January 3, 2024 06:27