Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a query param to /tables for getting sorted table names based on time metadata #7142

Merged
merged 8 commits into from
Jul 20, 2021

Conversation

yashrsharma44
Copy link
Contributor

@yashrsharma44 yashrsharma44 commented Jul 8, 2021

Signed-off-by: Yash Sharma yashrsharma44@gmail.com

Description

Adds a query parameter -

  • sortType - for sorting based on name or creation time of table.
  • sortAsc - sort ascending/descending order - use boolean values

Fixes #6865

Upgrade Notes

Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)

  • Yes (Please label as backward-incompat, and complete the section below on Release Notes)

Does this PR fix a zero-downtime upgrade introduced earlier?

  • Yes (Please label this as backward-incompat, and complete the section below on Release Notes)

Does this PR otherwise need attention when creating release notes? Things to consider:

  • New configuration options
  • Deprecation of configurations
  • Signature changes to public methods/interfaces
  • New plugins added or old plugins removed
  • Yes (Please label this PR as release-notes and complete the section on Release Notes)

Release Notes

Documentation

@yashrsharma44
Copy link
Contributor Author

yashrsharma44 commented Jul 8, 2021

I have added some boilerplate code for the sorting feature required, wanted to make sure if the tests are not failing.
Will add the sorting param after tests pass.

Made the PR ready for Review :)

@yashrsharma44
Copy link
Contributor Author

I have a java specific question 😅

  • I am trying to import import org.apache.pinot.common.utils.config.TableConfigUtils;, and this package with the same name - import org.apache.pinot.segment.local.utils.TableConfigUtils;.
  • My import contains this function - toZNRecord(...) - but the older import does not contain this function. Now since Java doesn't provide imports with same name, any ideas how should I move ahead?

@yashrsharma44 yashrsharma44 marked this pull request as ready for review July 8, 2021 21:15
@Jackie-Jiang
Copy link
Contributor

I have a java specific question 😅

  • I am trying to import import org.apache.pinot.common.utils.config.TableConfigUtils;, and this package with the same name - import org.apache.pinot.segment.local.utils.TableConfigUtils;.

  • My import contains this function - toZNRecord(...) - but the older import does not contain this function. Now since Java doesn't provide imports with same name, any ideas how should I move ahead?

You can only import one of them, and use the full class path for the other one when accessing it

@codecov-commenter
Copy link

codecov-commenter commented Jul 9, 2021

Codecov Report

Merging #7142 (1db76ce) into master (738a584) will increase coverage by 0.02%.
The diff coverage is 8.82%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #7142      +/-   ##
============================================
+ Coverage     73.53%   73.55%   +0.02%     
  Complexity       92       92              
============================================
  Files          1506     1506              
  Lines         73801    73832      +31     
  Branches      10644    10655      +11     
============================================
+ Hits          54266    54308      +42     
+ Misses        15997    15990       -7     
+ Partials       3538     3534       -4     
Flag Coverage Δ
integration 41.93% <8.82%> (+0.08%) ⬆️
unittests 65.24% <0.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ntroller/helix/core/PinotHelixResourceManager.java 62.44% <0.00%> (-0.05%) ⬇️
...oller/api/resources/PinotTableRestletResource.java 50.17% <9.67%> (-5.43%) ⬇️
...data/manager/realtime/SegmentCommitterFactory.java 64.70% <0.00%> (-35.30%) ⬇️
...ot/segment/local/customobject/MinMaxRangePair.java 75.86% <0.00%> (-24.14%) ⬇️
...er/api/resources/LLCSegmentCompletionHandlers.java 52.40% <0.00%> (-8.66%) ⬇️
...altime/ServerSegmentCompletionProtocolHandler.java 49.52% <0.00%> (-8.58%) ⬇️
...e/data/manager/realtime/SplitSegmentCommitter.java 53.84% <0.00%> (-7.70%) ⬇️
.../pinot/core/query/scheduler/PriorityScheduler.java 80.82% <0.00%> (-2.74%) ⬇️
...ation/function/MinMaxRangeAggregationFunction.java 87.50% <0.00%> (-1.25%) ⬇️
...ot/common/protocols/SegmentCompletionProtocol.java 94.78% <0.00%> (-0.95%) ⬇️
... and 23 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 738a584...1db76ce. Read the comment docs.

@mcvsubbu
Copy link
Contributor

mcvsubbu commented Jul 11, 2021

Why should sorting be the function of the API? There are external tools available to sort the results and parse them in various ways. If we need to sort by retention and then by table name, would you then extend the API? I am not in favor of adding more complexity to the API when there is more flexibility doing such operations outside of Pinot.

@Jackie-Jiang
Copy link
Contributor

@mcvsubbu What external tool are you referring to? IMO listing tables in a cluster sorted with creation time/last modified time is quite common request, and currently there is no easy way to get this info. This is similar to list files within a directory sorted with creation time/last modified time.

@mcvsubbu
Copy link
Contributor

@mcvsubbu What external tool are you referring to? IMO listing tables in a cluster sorted with creation time/last modified time is quite common request, and currently there is no easy way to get this info. This is similar to list files within a directory sorted with creation time/last modified time.

A combination of jq and sort, for instance. You can sort/massage/filter data to your heart's content

@mqliang
Copy link
Contributor

mqliang commented Jul 13, 2021

A combination of jq and sort, for instance. You can sort/massage/filter data to your heart's content

I agree:

  • For sorting by name, client can use the @Path("/tables") API and sort the result.
  • For sorting by creation/modification time, client can use the @Path("/tables/{tableName}/stats") API, which will return the table name as well as the creation/modification time info, client can then parse/sort the response

@Jackie-Jiang
Copy link
Contributor

@mcvsubbu @mqliang Are you suggesting calling /tables first, then loop over all the tables and get stats for each individual table, then sort on the client side? This could translate to hundreds of rest calls

@Jackie-Jiang
Copy link
Contributor

I still don't see why adding sort type a bad thing to the list table api. If not provided, it will just work as is. Adding it can save lots of effort on the user side

@mqliang
Copy link
Contributor

mqliang commented Jul 13, 2021

Are you suggesting calling /tables first, then loop over all the tables and get stats for each individual table, then sort on the client side? This could translate to hundreds of rest calls

@Jackie-Jiang Sorry for the confusing, I was assuming there is a @Path("/tables/stats") API which will return all data table names as well as their stats info (creation time, modification time), but it turns out it's a @Path("/tables/{tableName}/stats") API which will return stats for a single table. So my suggestion is:

  • If client want to sort by name, use the @path("/tables") API and sort the result at client side.
  • Add a new @Path("/tables/stats") API, which will return all data table names as well as their stats info, if client want to sort by creation/modification time, use this API and parse/sort at client side.

Does it sounds good?

@Jackie-Jiang
Copy link
Contributor

@mqliang I think it is good to add the list stats API, and I agree it can solve the sorting problem, but I still suggest adding an option in list tables API to sort on common fields because that will be much easier to use.
Lots of APIs we created are for ease of use. I don't think it is handy to force the user to further process the response from the swagger UI (maybe okay to professional administrator but they are not the only user).

@yashrsharma44
Copy link
Contributor Author

Hi @Jackie-Jiang!
Any suggestions for this PR, or are we good to merge this? Just curious 😄

yashrsharma44 and others added 8 commits July 19, 2021 16:26
…n time metadata

Signed-off-by: Yash Sharma <yashrsharma44@gmail.com>
Signed-off-by: Yash Sharma <yashrsharma44@gmail.com>
Signed-off-by: Yash Sharma <yashrsharma44@gmail.com>
Signed-off-by: Yash Sharma <yashrsharma44@gmail.com>
Signed-off-by: Yash Sharma <yashrsharma44@gmail.com>
Signed-off-by: Yash Sharma <yashrsharma44@gmail.com>
Signed-off-by: Yash Sharma <yashrsharma44@gmail.com>
@Jackie-Jiang Jackie-Jiang merged commit fe83e95 into apache:master Jul 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add controller rest API to list recently created/modified tables
5 participants