Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The queryables resource use invalid JSON Schema types #1091

Closed
jerstlouis opened this issue Jan 10, 2023 · 9 comments
Closed

The queryables resource use invalid JSON Schema types #1091

jerstlouis opened this issue Jan 10, 2023 · 9 comments
Assignees
Labels
bug Something isn't working

Comments

@jerstlouis
Copy link

Description
When importing data types from PostgreSQL database, unrecognized types are simply copied, resulting in invalid JSON Schema types for queryables.

Steps to Reproduce
See https://aviationapi.skymantics.com/faa/collections/faa_flight_plans/queryables

Expected behavior
The queryables response should always conform to the JSON Schema. Invalid types should be converted to valid types (e.g., strings, objects).

Additional context
This issue was discovered in Testbed 18 - Advanced SWIM Filtering with Skymantics Aviation service.

The problem may be in openapi.py, about line 601:

                    if type_ == 'date':
                        schema = {
                            'type': 'string',
                            'format': 'date'
                        }
                    elif type_ == 'float':
                        schema = {
                            'type': 'number',
                            'format': 'float'
                        }
                    elif type_ == 'long':
                        schema = {
                            'type': 'integer',
                            'format': 'int64'
                        }
                    else:
                        schema = type_
@jerstlouis jerstlouis added the bug Something isn't working label Jan 10, 2023
@tomkralidis tomkralidis self-assigned this Jan 13, 2023
@tomkralidis
Copy link
Member

@jerstlouis thanks for filing the issue. This depends on the backend provider in use.

If I assume/guess a PostgreSQL provider, then this would mean that its get_fields function (which is supposed to provide the JSON Schema types) would need to be updated.

Can someone confirm PostgreSQL is indeed the backend being used here?

cc @skyNacho

@skyNacho
Copy link

Yes, that is correct. The PostgreSQL provider was used in this case. @tomkralidis do you want me to give it a try?

@jerstlouis
Copy link
Author

@tomkralidis @skyNacho Thank you for looking into this.
My suggestion would be for validating and ensuring providers cannot possibly result in pygeoapi outputting JSON schemas that do not conform to the JSON Schema spec.

e.g., get_fields() possibly returning some pygeoapi schema construct, but pygeoapi itself either converting it into a JSON schema and/or validating it.

@tomkralidis
Copy link
Member

Yes, that is correct. The PostgreSQL provider was used in this case. @tomkralidis do you want me to give it a try?

@skyNacho sure, that would be great. cc @KoalaGeo

@KoalaGeo
Copy link
Contributor

@ximenesuk & @volcan01010 adding to the thread for info...

@volcan01010
Copy link
Contributor

The get_fields() method returns the types of the columns as reported by SQLAlchemy. The following example comes from the unit tests.

    expected_fields = {
        'blockage': {'type': 'VARCHAR(80)'},
        'covered': {'type': 'VARCHAR(80)'},
        'depth': {'type': 'VARCHAR(80)'},
        'layer': {'type': 'VARCHAR(80)'},
        'name': {'type': 'VARCHAR(80)'},
        'natural': {'type': 'VARCHAR(80)'},
        'osm_id': {'type': 'INTEGER'},
        'tunnel': {'type': 'VARCHAR(80)'},
        'water': {'type': 'VARCHAR(80)'},
        'waterway': {'type': 'VARCHAR(80)'},
        'width': {'type': 'VARCHAR(80)'},
        'z_index': {'type': 'VARCHAR(80)'}
    }

I wasn't aware of the JSON Schema requirement. The best fix is probably to map between SQLAlchemy types and JSON Schema types within the get_fields function itself.

def get_fields(self):

@jerstlouis
Copy link
Author

@volcan01010 @tomkralidis

It seems that at minimum the get_fields() prototype needs to be better documented so that it is clear to providers implementors what should be returned. Some kind of validation would be best. I don't know if this could be addressed with types?

@tomkralidis
Copy link
Member

@skyNacho if you do submit a PR/update, feel free to update to pygeoapi/provider/base.py get_fields docstring to something like:

        """ 
        Get provider field information (field names, JSON Schema data types)

        Example response: {'field1': 'string', 'field2': 'number'}}

        :returns: dict of field names and their associated JSON Schema typess
        """ 

@tomkralidis
Copy link
Member

Fixed in master (thanks @totycro!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants