Support name filter at get workflows #1525

Merged

Conversation

serihiro
Contributor

@serihiro serihiro commented Jan 22, 2021

What does this PR change?

This PR adds a name parameter to the GET /api/workflows endpoint so that users can filter workflows by partial name match.
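
As a rough illustration of how a client might use the new parameter, the sketch below builds a request URI with the name and count query parameters that appear in this PR. The base URL, class name, and method name are placeholders chosen for the example and are not part of the PR.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class WorkflowNameFilterRequest {
    // Builds the request URI for GET /api/workflows with a name filter.
    // baseUrl is a placeholder for wherever the server is running (assumption).
    static URI buildUri(String baseUrl, String name, int count) {
        // URL-encode the name so that spaces and reserved characters are safe in a query string.
        String query = "name=" + URLEncoder.encode(name, StandardCharsets.UTF_8)
                + "&count=" + count;
        return URI.create(baseUrl + "/api/workflows?" + query);
    }

    public static void main(String[] args) {
        // prints http://localhost:65432/api/workflows?name=test&count=10
        System.out.println(buildUri("http://localhost:65432", "test", 10));
    }
}
```

The server then wraps the supplied value in % wildcards, so name=test would match any workflow whose name contains "test".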

I confirmed with the EXPLAIN command that this update does not change the SQL statement execution plan on PostgreSQL.

Comparison of SQL statement execution plan

Currently, the workflow_definitions_on_revision_id_and_name index on the workflow_definitions table is already used.

   ->  Nested Loop  (cost=688.77..797.66 rows=28 width=162)
         ->  Nested Loop  (cost=688.35..784.73 rows=28 width=104)
               ->  Limit  (cost=687.92..687.99 rows=28 width=31)
                     InitPlan 1 (returns $1)
                       ->  GroupAggregate  (cost=0.71..539.00 rows=294 width=8)
                             Group Key: r_1.project_id
                             ->  Nested Loop  (cost=0.71..534.59 rows=294 width=8)
                                   ->  Index Only Scan using projects_on_site_id_and_id on projects p_1  (cost=0.28..71.50 rows=213 width=4)
                                         Index Cond: (site_id = 1)
                                   ->  Index Only Scan using revisions_on_project_id_and_id on revisions r_1  (cost=0.43..2.08 rows=9 width=8)
                                         Index Cond: (project_id = p_1.id)
                     ->  Sort  (cost=148.92..148.99 rows=28 width=31)
                           Sort Key: wf.id
                           ->  Nested Loop  (cost=1.28..148.24 rows=28 width=31)
                                 ->  Nested Loop  (cost=0.86..135.61 rows=28 width=35)
                                       ->  Index Scan using workflow_definitions_on_revision_id_and_name on workflow_definitions wf  (cost=0.43..39.15 rows=28 width=31)
                                             Index Cond: (revision_id = ANY ($1))
                                             Filter: (id > 0)
                                       ->  Index Scan using revisions_pkey on revisions rev  (cost=0.43..3.45 rows=1 width=8)
                                             Index Cond: (id = wf.revision_id)
                                 ->  Index Only Scan using projects_pkey on projects proj  (cost=0.43..0.45 rows=1 width=4)
                                       Index Cond: (id = rev.project_id)
               ->  Index Scan using revisions_pkey on revisions r  (cost=0.43..3.45 rows=1 width=77)
                     Index Cond: (id = wf.revision_id)
         ->  Index Scan using projects_pkey on projects p  (cost=0.43..0.46 rows=1 width=62)
               Index Cond: (id = r.project_id)
   ->  Index Scan using workflow_configs_pkey on workflow_configs wc  (cost=0.43..3.45 rows=1 width=1323)
         Index Cond: (id = wf.config_id)

After the modification in this PR, the same index is used. The new name condition appears only in the Filter line, not in the Index Cond.

 Nested Loop  (cost=583.41..589.50 rows=1 width=1481)
   ->  Nested Loop  (cost=582.98..586.05 rows=1 width=162)
         ->  Nested Loop  (cost=582.56..585.59 rows=1 width=104)
               ->  Limit  (cost=582.13..582.13 rows=1 width=31)
                     InitPlan 1 (returns $1)
                       ->  GroupAggregate  (cost=0.71..539.00 rows=294 width=8)
                             Group Key: r_1.project_id
                             ->  Nested Loop  (cost=0.71..534.59 rows=294 width=8)
                                   ->  Index Only Scan using projects_on_site_id_and_id on projects p_1  (cost=0.28..71.50 rows=213 width=4)
                                         Index Cond: (site_id = 1)
                                   ->  Index Only Scan using revisions_on_project_id_and_id on revisions r_1  (cost=0.43..2.08 rows=9 width=8)
                                         Index Cond: (project_id = p_1.id)
                     ->  Sort  (cost=43.13..43.13 rows=1 width=31)
                           Sort Key: wf.id
                           ->  Nested Loop  (cost=1.28..43.12 rows=1 width=31)
                                 ->  Nested Loop  (cost=0.86..42.66 rows=1 width=35)
                                       ->  Index Scan using workflow_definitions_on_revision_id_and_name on workflow_definitions wf  (cost=0.43..39.22 rows=1 width=31)
                                             Index Cond: (revision_id = ANY ($1))
                                             Filter: ((id > 0) AND (name ~~ '%test%'::text))
                                       ->  Index Scan using revisions_pkey on revisions rev  (cost=0.43..3.45 rows=1 width=8)
                                             Index Cond: (id = wf.revision_id)
                                 ->  Index Only Scan using projects_pkey on projects proj  (cost=0.43..0.45 rows=1 width=4)
                                       Index Cond: (id = rev.project_id)
               ->  Index Scan using revisions_pkey on revisions r  (cost=0.43..3.45 rows=1 width=77)
                     Index Cond: (id = wf.revision_id)
         ->  Index Scan using projects_pkey on projects p  (cost=0.43..0.46 rows=1 width=62)
               Index Cond: (id = r.project_id)
   ->  Index Scan using workflow_configs_pkey on workflow_configs wc  (cost=0.43..3.45 rows=1 width=1323)
         Index Cond: (id = wf.config_id)

Kazuhiro Serizawa added 3 commits January 21, 2021 16:24
siteId,
pageSize,
lastId.or(0L),
name.isPresent() && !name.get().isEmpty() ? "%" + escapeLikeParameter(name.get()) + "%" : "%",
Contributor Author

If the name parameter is absent, this method just sets % so that the name filter has no effect.

siteId,
pageSize,
lastId.or(0L),
name.isPresent() && !name.get().isEmpty() ? "%" + escapeLikeParameter(name.get()) + "%" : "%",
Contributor

I found it difficult to understand what this line is trying to do. How about adding a comment on this line, or extracting it into a private method such as createWorkflowNamePattern()?

Contributor Author

I will add a comment to explain the intention, the index structure on workflow_definitions, and the performance characteristics of the SQL statement in dao.getLatestActiveWorkflowDefinitions.

Contributor Author

Thank you for your suggestion.

  • I refactored this in af05bb1
  • I left a comment about the performance and the index used in this SQL statement in 983b31a

@komamitsu
Contributor

Can you post the SQL query you used to get the query plan? I'm interested in whether the current index works with suffix/partial match.

" and <acFilter>" +
" order by wd.id" +
" limit :limit")
List<StoredWorkflowDefinitionWithProject> getLatestActiveWorkflowDefinitions(
@Bind("siteId") int siteId,
@Bind("limit") int limit,
@Bind("lastId") long lastId,
@Bind("name") String name,
Contributor

Would namePattern or something similar be more accurate?

Contributor Author

renamed at 601788e

Contributor

Hmm... I can't see the change on this code review....

Contributor Author

Sorry. I have now fixed it in 102cb3a

@@ -625,6 +639,7 @@ public void deleteSchedules(int projId)
@Bind("siteId") int siteId,
@Bind("limit") int limit,
@Bind("lastId") long lastId,
@Bind("name") String name,
Contributor

Same as above

Contributor Author

renamed at 601788e

Contributor Author

Sorry. I have now fixed it in 102cb3a

@QueryParam("count") Integer count)
@QueryParam("count") Integer count,
@ApiParam(value="name pattern to be partially matched", required=false)
@QueryParam("name") String name
Contributor

Same as above?

Contributor Author

renamed at 601788e

Contributor Author

Sorry. I have now fixed it in 102cb3a

Kazuhiro Serizawa added 3 commits January 22, 2021 16:27
a generation process of partial match pattern
to describe the performance of like search
@serihiro
Contributor Author

@komamitsu
Thank you for your review. I have updated this PR according to your comments. Could you take another look?

@@ -355,6 +360,19 @@ public TimeZoneMap getWorkflowTimeZonesByIdList(List<Long> defIdList)
Map<Long, ZoneId> map = IdTimeZone.listToMap(list);
return new TimeZoneMap(map);
}

private String generatePartialMatchPattern(Optional<String> pattern)
Contributor

👍

{
// If provided pattern is absent or empty string, just set '%'
// so that the pattern does not affect to a where clause.
return pattern.isPresent() && !pattern.get().isEmpty() ? "%" + escapeLikePattern(pattern.get()) + "%" : "%";
Contributor

Would pattern.or("").isEmpty() be simpler?

Contributor Author

fixed at 3ea375a
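
For reference, here is a minimal self-contained sketch of what the helper could look like after this simplification. It uses java.util.Optional, so orElse stands in for the or call suggested above. The escapeLikePattern method here is a hypothetical stand-in that escapes %, _ and \ with a backslash (PostgreSQL's default LIKE escape character). The helper actually used in the PR may differ.

```java
import java.util.Optional;

public class LikePatternSketch {
    // Hypothetical stand-in for the PR's escapeLikePattern helper (assumption):
    // escapes the escape character first, then the LIKE wildcards % and _.
    static String escapeLikePattern(String raw) {
        return raw.replace("\\", "\\\\")
                  .replace("%", "\\%")
                  .replace("_", "\\_");
    }

    // If the pattern is absent or empty, return "%" so the LIKE condition
    // does not narrow the result; otherwise wrap the escaped value in wildcards.
    static String generatePartialMatchPattern(Optional<String> pattern) {
        String value = pattern.orElse("");
        return value.isEmpty() ? "%" : "%" + escapeLikePattern(value) + "%";
    }

    public static void main(String[] args) {
        System.out.println(generatePartialMatchPattern(Optional.of("test")));  // prints %test%
        System.out.println(generatePartialMatchPattern(Optional.of("50%_a"))); // prints %50\%\_a%
        System.out.println(generatePartialMatchPattern(Optional.empty()));     // prints %
    }
}
```

Escaping the backslash before the wildcards matters: doing it last would double the backslashes that were just added for % and _.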

@yoyama yoyama added this to the v0.11.0 milestone Jan 25, 2021
Contributor

@komamitsu komamitsu left a comment

👍

@serihiro
Contributor Author

serihiro commented Feb 4, 2021

Thanks!

@serihiro serihiro merged commit aea4b4c into treasure-data:v0_11 Feb 4, 2021
@serihiro serihiro deleted the support-name-filter-at-get-workflows branch February 4, 2021 08:03
yoyama pushed a commit that referenced this pull request Jan 24, 2022
…kflows

Support name filter at get workflows