New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HAWQ-1228. Use profile based on file format in HCatalog integration(HiveRC, HiveText profiles) for homogeneous tables. #1076
Conversation
Each property in the profile was meant to be a particular class which allowed any user to customize/extend it. In the same theme, instead of definiging outputFormat in the pxf-profile, lets go with and get the classname. |
|
||
@Override | ||
void parseDelimiterChar(InputData input) { | ||
delimiter = 44; //, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a temporary workaround ?
Why not use a static variable to store the delimiter value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's temporary, work is in progress to actually use appropriate parameters from Hive's SerDe.
initPartitionFields(toks[3]); | ||
filterInFragmenter = new Boolean(toks[4]); | ||
HiveUserData hiveUserData = HiveUtilities.parseHiveUserData(input); | ||
initPartitionFields(hiveUserData.getPartitionKeys()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we always have value for partitionKeys ? If not, initParitionFields will throw NPE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, even if a table has no partitions, we have "!HNPT!" value.
@@ -57,11 +59,11 @@ | |||
*/ | |||
public class HiveColumnarSerdeResolver extends HiveResolver { | |||
private static final Log LOG = LogFactory.getLog(HiveColumnarSerdeResolver.class); | |||
private ColumnarSerDeBase deserializer; | |||
//private ColumnarSerDeBase deserializer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove line
initPartitionFields(toks[HiveInputFormatFragmenter.TOK_KEYS]); | ||
filterInFragmenter = new Boolean(toks[HiveInputFormatFragmenter.TOK_FILTER_DONE]); | ||
HiveUserData hiveUserData = HiveUtilities.parseHiveUserData(input, PXF_HIVE_SERDES.LAZY_SIMPLE_SERDE); | ||
initPartitionFields(hiveUserData.getPartitionKeys()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible NPE in this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
partitionKeys always have not-null value.
…pes in HiveText profile.
@@ -49,6 +49,7 @@ under the License. | |||
<accessor>org.apache.hawq.pxf.plugins.hive.HiveAccessor</accessor> | |||
<resolver>org.apache.hawq.pxf.plugins.hive.HiveResolver</resolver> | |||
<metadata>org.apache.hawq.pxf.plugins.hive.HiveMetadataFetcher</metadata> | |||
<outputFormat>BINARY</outputFormat> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defaultOutputFormat
void initSerde(InputData input) { | ||
/* nothing to do here */ | ||
void initSerde(InputData input) throws Exception { | ||
if (((ProtocolData) inputData).outputFormat() == OutputFormat.TEXT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Invert check.
Please make sure you resolve the conflicts when you merge with master |
@@ -68,7 +68,6 @@ void free_token_resources(PxfInputData *inputData); | |||
Datum gpbridge_import(PG_FUNCTION_ARGS) | |||
{ | |||
gpbridge_check_inside_extproto(fcinfo, "gpbridge_import"); | |||
// ExternalSelectDesc desc = EXTPROTOCOL_GET_SELECTDESC(fcinfo); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove instead of commenting out
|
||
/* location - should be an array of text with one element: | ||
* pxf://<ip:port/namaservice>/<db>.<path>?Profile=profileName&delimiter=delimiterCode */ | ||
static Datum GetLocationForFormat(char *profile, List *outputFormats, char *pxf_service_address, char *path, char *name, int delimiter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename path to db
and name to path
, the comment and the actual variable names are confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
this(name, type, sourceType); | ||
this.modifiers = modifiers; | ||
} | ||
|
||
public Field(String name, EnumHawqType type, boolean isComplexType, String sourceType, String[] modifiers) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add isComplexType as the final argument to keep the method overloading structure clean
@@ -31,6 +31,8 @@ | |||
*/ | |||
public class InputData { | |||
|
|||
public static final String DELIMITER_KEY = "DELIMITER"; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra line
|
||
HiveUserData hiveUserData = new HiveUserData(toks[0], toks[1], toks[2], toks[3], Boolean.valueOf(toks[4]), toks[5]); | ||
|
||
if (supportedSerdes.length > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation
Work still in progress, but the main part is done. I want to have earlier feedback on it, it's quite a big diff.