HIVE-17580 : Remove dependency of get_fields_with_environment_context API to serde #310
This version of the patch moves TypeInfo and its subclasses to standalone-metastore. The motivation is that the metastore needs TypeInfo-like classes to store metadata about types, which Hive implements with TypeInfo. The metastore needs this information because tables such as Avro tables can define their schema externally, either through a URL to a file containing the schema or through a schema string set as a table property. In such cases the metastore needs to parse this information and convert it into FieldSchema objects. Before this patch, the String -> FieldSchema conversion went through the SerDes, using ObjectInspectors and the TypeInfos obtained from them. This patch bypasses most of that to remove the dependency on the SerDes, so the conversion becomes String -> TypeInfo -> FieldSchema.
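For illustration only, the String -> TypeInfo -> FieldSchema path could look roughly like the sketch below. It does not use the real Hive classes: `TypeInfo`, `PrimitiveType`, `ListType`, `MapType`, `Field`, `Parser`, and `toField` are simplified, hypothetical stand-ins, and the parser handles only primitives, `array<...>`, and `map<...,...>`.

```java
public class TypeStringSketch {
    // Hypothetical stand-in for the TypeInfo hierarchy (not the Hive classes).
    public interface TypeInfo { String getTypeName(); }

    static final class PrimitiveType implements TypeInfo {
        private final String name;
        PrimitiveType(String name) { this.name = name; }
        public String getTypeName() { return name; }
    }

    static final class ListType implements TypeInfo {
        private final TypeInfo element;
        ListType(TypeInfo element) { this.element = element; }
        public String getTypeName() { return "array<" + element.getTypeName() + ">"; }
    }

    static final class MapType implements TypeInfo {
        private final TypeInfo key;
        private final TypeInfo value;
        MapType(TypeInfo key, TypeInfo value) { this.key = key; this.value = value; }
        public String getTypeName() {
            return "map<" + key.getTypeName() + "," + value.getTypeName() + ">";
        }
    }

    // Hypothetical stand-in for FieldSchema: a column name plus a type string.
    public static final class Field {
        public final String name;
        public final String type;
        Field(String name, String type) { this.name = name; this.type = type; }
    }

    // Minimal recursive-descent parser for type strings; whitespace is ignored.
    static final class Parser {
        private final String s;
        private int pos;
        Parser(String s) { this.s = s.replaceAll("\\s+", ""); }

        TypeInfo parse() {
            TypeInfo t = parseType();
            if (pos != s.length()) {
                throw new IllegalArgumentException("Unexpected input at " + pos);
            }
            return t;
        }

        private TypeInfo parseType() {
            int start = pos;
            while (pos < s.length()
                    && (Character.isLetterOrDigit(s.charAt(pos)) || s.charAt(pos) == '_')) {
                pos++;
            }
            String name = s.substring(start, pos);
            if (name.isEmpty()) {
                throw new IllegalArgumentException("Expected a type name at " + start);
            }
            if (name.equals("array")) {
                expect('<');
                TypeInfo element = parseType();
                expect('>');
                return new ListType(element);
            }
            if (name.equals("map")) {
                expect('<');
                TypeInfo key = parseType();
                expect(',');
                TypeInfo value = parseType();
                expect('>');
                return new MapType(key, value);
            }
            return new PrimitiveType(name);
        }

        private void expect(char c) {
            if (pos >= s.length() || s.charAt(pos) != c) {
                throw new IllegalArgumentException("Expected '" + c + "' at " + pos);
            }
            pos++;
        }
    }

    // String -> TypeInfo -> Field, with no SerDe or ObjectInspector involved.
    public static Field toField(String columnName, String typeString) {
        TypeInfo info = new Parser(typeString).parse();
        return new Field(columnName, info.getTypeName());
    }

    public static void main(String[] args) {
        Field f = toField("tags", "map<string, array<int>>");
        System.out.println(f.name + " : " + f.type);
    }
}
```

The point of the sketch is that the type string alone carries enough information to build the column metadata, which is why the SerDe layer can be left out of this path.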
To achieve this, and to reduce duplicate code and keep the design clean, this patch moves TypeInfo, its subclasses (ListTypeInfo, MapTypeInfo, StructTypeInfo, UnionTypeInfo), and TypeInfoParser to standalone-metastore. PrimitiveTypeInfo is a special case, because Hive has added much more than type metadata to it. Specifically, PrimitiveTypeEntry and PrimitiveCategory are type implementation details that cannot be moved to standalone-metastore, and bringing in PrimitiveTypeEntry would pull in a large amount of dependent code. To work around this, a new class called MetastorePrimitiveTypeInfo is introduced in standalone-metastore. It contains only the information the metastore needs from PrimitiveTypeInfo, and PrimitiveTypeInfo extends MetastorePrimitiveTypeInfo. This greatly reduces the scope of the changes. PrimitiveTypeInfo now contains the implementation details of Hive's primitive types. Moving TypeInfo to standalone-metastore also requires the Category enum, which unfortunately was defined in ObjectInspector. To get around this, ObjectInspector is moved to storage-api so that standalone-metastore can access the Category enum from TypeInfo.
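The base-class split described above can be pictured with the following simplified sketch. Only the class names `MetastorePrimitiveTypeInfo`, `PrimitiveTypeInfo`, and `PrimitiveCategory` come from the description; the fields, constructors, the three enum constants, and the `describe` helper are assumptions made for illustration and do not reflect the actual code in the patch.

```java
import java.util.Locale;

public class PrimitiveSplitSketch {
    // Metastore side: carries only the type name, which is all the metastore
    // needs (assumed shape, not the real class).
    public static class MetastorePrimitiveTypeInfo {
        private final String typeName;
        public MetastorePrimitiveTypeInfo(String typeName) { this.typeName = typeName; }
        public String getTypeName() { return typeName; }
    }

    // Hive side: a reduced, hypothetical version of the category enum.
    public enum PrimitiveCategory { INT, STRING, BOOLEAN }

    // Hive side: extends the metastore class and adds implementation detail.
    public static class PrimitiveTypeInfo extends MetastorePrimitiveTypeInfo {
        public PrimitiveTypeInfo(String typeName) { super(typeName); }

        // In this sketch the category is derived from the type name; the real
        // class resolves it differently.
        public PrimitiveCategory getPrimitiveCategory() {
            return PrimitiveCategory.valueOf(getTypeName().toUpperCase(Locale.ROOT));
        }
    }

    // Metastore-style code depends on the base class only, so it accepts both.
    public static String describe(MetastorePrimitiveTypeInfo info) {
        return "column type: " + info.getTypeName();
    }

    public static void main(String[] args) {
        PrimitiveTypeInfo hiveInt = new PrimitiveTypeInfo("int");
        System.out.println(describe(hiveInt));
        System.out.println(hiveInt.getPrimitiveCategory());
    }
}
```

Because metastore-side code is written against the base class, Hive can keep passing its richer subclass without the metastore depending on any Hive-only types.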
Moving TypeInfoFactory would also have been very disruptive, so an interface called ITypeInfoFactory is created in the metastore, and both the metastore and Hive implement it. The Avro storage schema reader can now use the TypeInfoToSchema and SchemaToTypeInfo utility classes (also moved to the metastore) through the ITypeInfoFactory interface.
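As a rough sketch of the factory arrangement, shared conversion code can take an `ITypeInfoFactory` and work with either implementation. The method names `getPrimitiveTypeInfo` and `getListTypeInfo`, the two implementing classes, the simplified `TypeInfo` class, and the `listOf` helper are assumptions for illustration; the interface in the patch may declare different methods.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TypeInfoFactorySketch {
    // Simplified, hypothetical TypeInfo: just a type name.
    public static class TypeInfo {
        private final String typeName;
        public TypeInfo(String typeName) { this.typeName = typeName; }
        public String getTypeName() { return typeName; }
    }

    // Assumed shape of the shared interface.
    public interface ITypeInfoFactory {
        TypeInfo getPrimitiveTypeInfo(String typeName);
        TypeInfo getListTypeInfo(TypeInfo elementTypeInfo);
    }

    // Metastore-style implementation: creates plain objects on every call.
    public static class MetastoreTypeInfoFactory implements ITypeInfoFactory {
        public TypeInfo getPrimitiveTypeInfo(String typeName) {
            return new TypeInfo(typeName);
        }
        public TypeInfo getListTypeInfo(TypeInfo elementTypeInfo) {
            return new TypeInfo("array<" + elementTypeInfo.getTypeName() + ">");
        }
    }

    // Hive-style implementation (hypothetical): caches instances by type name.
    public static class CachingTypeInfoFactory implements ITypeInfoFactory {
        private final Map<String, TypeInfo> cache = new ConcurrentHashMap<>();
        public TypeInfo getPrimitiveTypeInfo(String typeName) {
            return cache.computeIfAbsent(typeName, TypeInfo::new);
        }
        public TypeInfo getListTypeInfo(TypeInfo elementTypeInfo) {
            String name = "array<" + elementTypeInfo.getTypeName() + ">";
            return cache.computeIfAbsent(name, TypeInfo::new);
        }
    }

    // Shared utility code depends on the interface only, never on a concrete factory.
    public static TypeInfo listOf(ITypeInfoFactory factory, String elementTypeName) {
        return factory.getListTypeInfo(factory.getPrimitiveTypeInfo(elementTypeName));
    }

    public static void main(String[] args) {
        System.out.println(listOf(new MetastoreTypeInfoFactory(), "int").getTypeName());
        System.out.println(listOf(new CachingTypeInfoFactory(), "int").getTypeName());
    }
}
```

Passing the factory in as a parameter is what would let utility classes such as the schema converters live in the metastore module while still producing Hive's own type objects when called from Hive.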