Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API to expose the contract/rules imposed by pinot on tableConfig #10655

Merged

Conversation

shounakmk219
Copy link
Contributor

@shounakmk219 shounakmk219 commented Apr 20, 2023

This PR tries to address the items mention in issue #10647

Changes :

  1. Moved the switch case logic from Schema.validate() to Schema.validate(FieldType fieldType, DataType dataType) so that the code can be reused to just validate if a DataType is allowed by a FieldType.
  2. Created FieldSpec.FIELD_SPEC_METADATA to hold the spec metadata.
  3. Created a wrapper to hold the spec metadata. Currently is just has the valid (fieldType, dataType) pairs along with the respective default null values.
{
  "fieldTypes": {
    "METRIC": {
      "allowedDataTypes": {
        "INT": {
            "nullDefault": 0
        },
        "STRING": {
          "nullDefault": "null"
        },
        .
        .
        .
      ]
    },
    "DIMENSION": {
      "allowedDataTypes": [
        "INT": {
          "nullDefault": -2147483648
        },
        "STRING": {
          "nullDefault": "null"
        },
        .
        .
        .
      }
    },
    .
    .
    .
  },
  "dataTypes": {
    "INT": {
      "storedType": "INT",
      "size": 4,
      "sortable": true,
      "numeric": true
    },
    "UNKNOWN": {
      "storedType": "UNKNOWN",
      "size": -1,
      "sortable": true,
      "numeric": false
    },
    .
    .
    .
}
  1. Expose the metadata from a new API endpoint at /schemas/fieldSpec

@codecov-commenter
Copy link

codecov-commenter commented Apr 20, 2023

Codecov Report

Merging #10655 (c649b12) into master (c7e05a7) will increase coverage by 42.57%.
The diff coverage is 90.56%.

@@              Coverage Diff              @@
##             master   #10655       +/-   ##
=============================================
+ Coverage     27.66%   70.23%   +42.57%     
- Complexity       58     6431     +6373     
=============================================
  Files          2090     2107       +17     
  Lines        112578   113835     +1257     
  Branches      16972    17201      +229     
=============================================
+ Hits          31145    79955    +48810     
+ Misses        78333    28286    -50047     
- Partials       3100     5594     +2494     
Flag Coverage Δ
integration1 24.37% <0.00%> (-0.09%) ⬇️
integration2 23.95% <0.00%> (-0.16%) ⬇️
unittests1 67.81% <96.00%> (?)
unittests2 13.85% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ller/api/resources/PinotSchemaRestletResource.java 67.07% <0.00%> (+44.09%) ⬆️
...rc/main/java/org/apache/pinot/spi/data/Schema.java 75.21% <86.66%> (+75.21%) ⬆️
...main/java/org/apache/pinot/spi/data/FieldSpec.java 85.48% <100.00%> (+85.48%) ⬆️

... and 1507 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@saurabhd336 saurabhd336 requested a review from npawar April 21, 2023 10:28
@ApiOperation(value = "Get TableConfig metadata", notes = "Get TableConfig metadata")
public String getFieldSpecMetadata() {
try {
return JsonUtils.objectToString(FieldSpec.FIELD_SPEC_METADATA);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return just the object. It'll automatically be jsonified.

}

public static class DataTypeMetadata {
public FieldSpec.DataType _name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DataType has a bunch of other useful properties that can be part of the response.


    private final DataType _storedType;
    private final int _size;
    private final boolean _sortable;
    private final boolean _numeric;


Think we should include all that in the API result. 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. Maybe I need to introduce a separate dataTypes property at fieldTypes level so as to not have the dataType specific info under allowedDataTypes and that too with redundant replication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the wrapper structure to accommodate the data type properties.

@mcvsubbu
Copy link
Contributor

I understand the re-org of code to have metadata on fieldSpec, but why expose this via API? How will a user of Pinot make use of this API?

Should we call the API /schemata/fieldSpec? Is this endpoint always going to be some general stuff (i.e. tableName, segmentName, etc. are not an argument)?

@shounakmk219
Copy link
Contributor Author

Hey @mcvsubbu as mentioned in the issue (link in description) the API consumer will be mostly another application built on top of pinot rather than a pinot user. Motivation is to enable client side user data input validations.

The endpoint should return in general what are the constraints imposed by pinot on the table config or schema. It's not specific to any particular table or segment. Using this info client application should be able to impose data validations dynamically.

I have no strong reason behind the endpoint path, I can change it to /schemas/fieldSpec

@shounakmk219
Copy link
Contributor Author

Hey @mcvsubbu I have changed the endpoint path as suggested, let me know if the above comment addressed your concerns.

@saurabhd336 saurabhd336 merged commit f5d7586 into apache:master May 2, 2023
14 checks passed
@shounakmk219 shounakmk219 deleted the tableconfig-defaults-metadata branch May 2, 2023 09:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants