@orhankislal
Contributor

This commit adds a partial kd-tree implementation to be used for knn
operations. The function is designed to work independently in case
future modules require its functionality.


asfgit commented Jan 7, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/madlib-pr-build/723/


asfgit commented Jan 8, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/madlib-pr-build/725/

Fixes import- and keyword-related bugs as well.

asfgit commented Jan 11, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/madlib-pr-build/731/



def kd_tree(schema_madlib, source_table, output_table, point_column_name, depth, **kwargs):

Contributor

Please add a docstring with info on function objective and arguments.
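
A docstring along the lines requested might look like the sketch below. The argument descriptions are inferred from the parameter names and the PR description, not taken from the merged code, and the function body is only a stub.

```python
def kd_tree(schema_madlib, source_table, output_table,
            point_column_name, depth, **kwargs):
    """Build a partial kd-tree over the points in a table (sketch).

    The descriptions below are assumptions inferred from the argument
    names; they are not copied from the MADlib source.

    Args:
        schema_madlib: Name of the schema where MADlib is installed.
        source_table: Name of the table containing the input points.
        output_table: Name of the table to create for the results.
        point_column_name: Name of the array column holding each point.
        depth: Depth of the kd-tree to build.

    Returns:
        None in this stub; the real return value is not shown in the review.
    """
    # Stub body: the actual implementation lives in the PR under review.
    return None
```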

with MinWarning("error"):

    validate_kd_tree(source_table, output_table, point_column_name, depth)
    num_features = plpy.execute("SELECT array_upper({point_column_name},1) AS num FROM {source_table}".format(**locals()))[0]['num']
Contributor

Please replace with num_features() from utilities.py_in
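
One reason a shared helper is attractive here: the inline query has no LIMIT, so it returns one row per input row even though only the first is read. The sketch below is a hypothetical stand-in, not the actual num_features() in utilities.py_in. The name count_features, its signature, and the execute parameter (standing in for plpy.execute) are all assumptions made so the example runs outside the database.

```python
def count_features(execute, source_table, column_name):
    """Return the array length of `column_name` in the first row (sketch)."""
    query = ("SELECT array_upper({col}, 1) AS num "
             "FROM {tbl} LIMIT 1").format(col=column_name, tbl=source_table)
    return execute(query)[0]['num']


def fake_execute(query):
    # Stand-in for plpy.execute; returns a canned single-row result.
    assert query.endswith("LIMIT 1")
    return [{'num': 3}]


n = count_features(fake_execute, "points", "pt")
```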

# Traverse the clauses list in reverse order because we only want the leaf conditions
for i in reversed(range(n_leaves)):
    # Remove the first 14 characters to get rid of the "WHERE 1=1 AND " segment
    cond = clauses.pop()[14:]
Contributor

Can we remove the [14:] completely? The WHERE looks like the only problem, since 1=1 is a valid condition that is TRUE for every row. The WHERE can be moved into the query itself (line 182) instead of being part of every clause.
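
A minimal sketch of that suggestion, assuming the clauses are stored without the WHERE keyword. The clause strings, the table name points, and the column pt are made up for illustration; the real clauses are generated by the code in the PR.

```python
# Hypothetical clauses for a depth-1 tree on one dimension.
# Note that none of them starts with "WHERE".
clauses = ["1=1", "1=1 AND pt[1] < 5.0", "1=1 AND pt[1] >= 5.0"]
n_leaves = 2

# The leaves are the last n_leaves entries; no [14:] slicing is needed
# because the harmless "1=1 AND " prefix can stay in the condition.
leaf_conditions = clauses[-n_leaves:]

# WHERE is added exactly once, at the point where the query is assembled.
queries = ["SELECT * FROM {0} WHERE {1}".format("points", cond)
           for cond in leaf_conditions]
```

This removes the dependency on the prefix being exactly 14 characters long, which would break silently if the prefix text ever changed.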



def kd_tree(schema_madlib, source_table, output_table, point_column_name, depth, **kwargs):

Contributor

Could we please add docstrings for this and the knn_tree function describing the purpose and meaning of each argument?

format(output_table_tree,
" ,".join(map(str, cutoffs))))

case_when_clause = []
Contributor

Assuming we remove the WHERE out of the clauses, this can be built in a list comprehension:

case_when_clause = ["WHEN {0} THEN {1}::INTEGER".format(cond, i) 
                                     for i, cond in enumerate(clauses[-n_leaves:])]
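
To see what the comprehension produces, here is a small runnable sketch. The sample clauses and the output column name leaf_id are assumptions for illustration, not values from the PR.

```python
# Hypothetical clauses, already stored without the WHERE keyword.
clauses = ["1=1", "1=1 AND pt[1] < 5.0", "1=1 AND pt[1] >= 5.0"]
n_leaves = 2

# One WHEN branch per leaf condition, numbered from 0.
case_when_clause = ["WHEN {0} THEN {1}::INTEGER".format(cond, i)
                    for i, cond in enumerate(clauses[-n_leaves:])]

# Join the branches into a single CASE expression for the SELECT list.
case_expr = "CASE {0} END AS leaf_id".format(" ".join(case_when_clause))
```

The last clause still receives index n_leaves - 1, matching the numbering of the reversed pop() loop, but the slice leaves the clauses list unmodified.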

@orhankislal deleted the feature/kd-tree branch on April 11, 2019 at 18:26

3 participants