New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support nested fields in Elasticsearch Connector #1001
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we know the type of the column, it might be better to drive the "un-flattening" based on the fields that are actually needed. I'll think about how we might be able to do this and comment later.
if (length1 == length2) { | ||
return key1.compareTo(key2); | ||
} | ||
return length2 - length1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use Integer.compare()
. Comparing using subtraction can result in subtle issues due to overflow (unlikely to be an issue here, but let's use the more robust way).
processFields(fieldsMap); | ||
} | ||
|
||
private void processFields(TreeMap<String, Object> fieldsMap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd just call the argument fields
fieldsMap.put(prefix, value); | ||
processFields(fieldsMap); | ||
} | ||
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can only happen if there are no fields with .
left, but it's not immediately obvious. I have the feeling we should be able to structure it in such a way that it's easier to read, but at a minimum, it deserves a comment.
@zhenxiao, here's my suggestion for an alternate implementation of the unflatten logic: private void extractFromSource(SearchHit hit)
{
List<Field> fields = new ArrayList<>();
for (Map.Entry<String, Object> entry : hit.getSourceAsMap().entrySet()) {
fields.add(new Field(entry.getKey(), entry.getValue()));
}
Collections.sort(fields, Comparator.comparing(Field::getName));
for (Map.Entry<String, Object> entry : unflatten(fields).entrySet()) {
setFieldIfExists(entry.getKey(), entry.getValue());
}
}
private static Map<String, Object> unflatten(List<Field> fields)
{
return unflatten(fields, 0, 0, fields.size());
}
private static Map<String, Object> unflatten(List<Field> fields, int level, int start, int length)
{
checkArgument(length > 0, "length must be > 0");
int limit = start + length;
Map<String, Object> result = new HashMap<>();
int anchor = start;
int current = start;
do {
Field field = fields.get(anchor);
String name = field.getPathElement(level);
current++;
if (current == limit || !name.equals(fields.get(current).getPathElement(level))) {
// We assume that fields can't be both leaves and intermediate nodes
Object value;
if (level < field.getDepth() - 1) {
value = unflatten(fields, level + 1, anchor, current - anchor);
}
else {
value = field.getValue();
}
result.put(name, value);
anchor = current;
}
}
while (current < limit);
return result;
}
private final static class Field
{
private final String name;
private final List<String> path;
private final Object value;
public Field(String name, Object value)
{
this.name = name;
this.path = Arrays.asList(name.split("\\."));
this.value = value;
}
public String getName()
{
return name;
}
public int getDepth()
{
return path.size();
}
public String getPathElement(int level)
{
return path.get(level);
}
public Object getValue()
{
return value;
}
} |
thank you, @martint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I'll merge it once travis passes
No description provided.