Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Path variables with segments may produce invalid Swagger file #407

Closed
calbach opened this issue Jun 2, 2017 · 21 comments · Fixed by #2461
Closed

Path variables with segments may produce invalid Swagger file #407

calbach opened this issue Jun 2, 2017 · 21 comments · Fixed by #2461

Comments

@calbach
Copy link

calbach commented Jun 2, 2017

For example:

service YourService {
  rpc Foo(StringMessage) returns (StringMessage) {
    option (google.api.http) = {
      post: "/v1/{name=examples/*/foos/*}"
      body: "*"
    };
  rpc Bar(StringMessage) returns (StringMessage) {
    option (google.api.http) = {
      post: "/v1/{name=examples/*/bars/*}"
      body: "*"
    };
  }
}

This produces conflicting paths in the YAML file: /v1/{name} twice; the pattern of the variable is not captured in the swagger spec.

@tmc
Copy link
Collaborator

tmc commented Jun 2, 2017

@calbach thank you for your issue report. Does openapi spec have the descriptive power to differentiate between those two?

@calbach
Copy link
Author

calbach commented Jun 8, 2017

I'm not sure. I couldn't find much in the specification about paths and templates. From searching around, my best guess is that you cannot use pattern matching within a path template - if that's the case the best approximation of this might be something like:

/v1/examples/{name1}/bars/{name2}

I don't know if such an approach can generalize to all possible http patterns, but at least it would support Google's recommended approach for defining resource names.

@calbach
Copy link
Author

calbach commented Jun 22, 2017

For future reference, looks like the relevant code is here.

@boosh
Copy link

boosh commented Jul 4, 2017

Came here to post about this issue. The issue is obviously that the extracted path name (e.g. name in the OP) is used as the key into a dict.

TBH I'm still not 100% clear on why Google recommend naming resources like they do. Anyone any ideas?

@boosh
Copy link

boosh commented Jul 4, 2017

I suppose the question is, what's the right thing to do? If I have the following grpc paths:

  • "/v1/{name1=prefix1/*}"
  • "/v1/{name1=prefix1/*/suffix1}/{name2=prefix2/*}"
  • "/v1/{name1=prefix1/*/wildcard2-prefix/*}"

What should the corresponding swagger paths look like?

Perhaps we could give wildcards sequentially generated names after extracting and prepending strings, e.g.:

  • "/v1/prefix1/{segment1}"
  • "/v1/prefix1/{segment1}/suffix1/prefix2/{segment2}"
  • "/v1/name1/{segment1}/wildcard2-prefix/{segment2}"

or, if a segment only contains a single wildcard, use the key of the segment as the placeholder, e.g.:

  • "/v1/prefix1/{name1}"
  • "/v1/prefix1/{name1}/suffix1/prefix2/{name2}"
  • "/v1/{name1=prefix1/*/wildcard2-prefix/*}" <- fallback to the above scheme with sequential names?

Since I have basically no idea of the relationship between this plugin and how the gRPC gateway works (i.e. if it uses this plugin internally) does this approach look feasible and make sense @tmc ?

It also doesn't resolve the problem of paths using the ** wildcard which can contain / characters, but that's a more complicated case we should think about later.

@achew22
Copy link
Collaborator

achew22 commented Jul 4, 2017

@boosh to answer your question (though it may be retorical)

TBH I'm still not 100% clear on why Google recommend naming resources like they do. Anyone any ideas?

Not speaking for the people who designed it, but as I understand it, idea is that you don't want your generators changing variable names on you willy nilly. Right now it is pretty easy to rename things to name2 and get a deterministic behavior. If you have a solution to this I would recommend writing it up in a PR with some tests so we can see what it does and how it modifies the existing code. That will help drive the conversation forward.

I'm looking forward to some more discussion!

@boosh
Copy link

boosh commented Jul 5, 2017

@achew22 what do you mean by "generators changing variable names on you"?

I thought the reason for Google's recommendations for declaring endpoints like get: "/v1/{name=shelves/*/books/*}" was so that resource names are unique and fully qualified (e.g. shelves/classics/books/war-and-peace).

I've found in my own APIs this does indeed lead to cleaner interfaces (e.g. because I can do things like return author as authors/russians/leo-tolstoy which clients can then lookup directly, instead of needing a specific Author response message type). It makes API payloads more generic.

That's my understanding at least.

Do you have any feedback on my suggestion for how we can map these sorts of names to swagger path templates? I'd like to get cracking on this as soon as possible because it's thrown a spanner in the works of a project I've been working on and need a resolution or workaround.

@boosh
Copy link

boosh commented Jul 6, 2017

I've created the following python code as a testbed:

def template(path):
    output = []
    buffer = []

    depth = 0

    for char in path:
        if char == '{':
            depth += 1
        elif char == '}':
            depth -= 1

            if depth == 0:
                # we've reached the end of a path template, so process the buffer
                updated_string = process_string(''.join(buffer))
                output.extend([x for x in updated_string])
                buffer = []
        else:
            if depth > 0:
                buffer.append(char)
            else:
                output.append(char)

    return ''.join(output)



def process_string(buffer):
    """
    Return the path with strings prefixed and suffixed, and correctly templated
    :param buffer:
    :return: Updated string
    """
    key, segments = buffer.split('=')

    # convert ** to *
    segments = segments.replace('**', '*')

    if segments.count('*') == 1:
        return segments.replace('*', '{%s}' % key)
    else:
        output = []
        split_segments = segments.split('*')

        for index, segment in enumerate(split_segments):
            # trim empties from the end
            if not segment and index + 1 == len(split_segments):
                continue

            output.append(segment)
            # output 1-indexed numbers since end users are humans
            output.append("{%s%d}" % (key, index + 1))

        return ''.join(output)


def main(path):
    print("Input=%s" % path)
    output = template(path)
    print("Output=%s" % output)


if __name__=='__main__':
    inputs = [
        '/v1/{name=customers/*/surname/*}',
        '/v1/{name=customers/*/surname/*}/likes/{category=sites/*}',
        '/v1/{name=path/**}',
    ]

    for input in inputs:
        main(input)

It yields:

Input=/v1/{name=customers/*/surname/*}
Output=/v1/customers/{name1}/surname/{name2}
Input=/v1/{name=customers/*/surname/*}/likes/{category=sites/*}
Output=/v1/customers/{name1}/surname/{name2}/likes/sites/{category}
Input=/v1/{name=path/**}
Output=/v1/path/{name}

I believe the above would solve this issue if translated into go.

Any thoughts?

@boosh
Copy link

boosh commented Jul 13, 2017

Sooo... any thoughts on this @calbach @tmc @achew22? I might have some time to do the work, but I don't want to waste my time if the PR won't be accepted.

@boosh
Copy link

boosh commented Jul 19, 2017

So converting "/v1/{name1=prefix1/*}" to "/v1/prefix1/{name1}" works fine, but I get an error trying to generate a swagger spec after replacing "/v1/{name1=prefix1/*/wildcard2-prefix/*}" with "/v1/prefix1/{segment1}/wildcard2-prefix/{segment2}":

fatal: [localhost]: FAILED! => {
  "changed": true, 
  "cmd": "protoc -I/usr/local/include -I. -I$GOPATH/src -I$GOPATH/src/github.com/grpc-ecosystem/grpc-gateway/third_party/googleapis --swagger_out=logtostderr=true:. api/mine/v1/my_service.proto", 
  "delta": "0:00:00.024865", 
  "end": "2017-07-19 10:54:40.209476", 
  "failed": true, 
  "rc": 1,
  "start": "2017-07-19 10:54:40.184611", 
  "stderr": "--swagger_out: no field \"segment2\" found in APIRequest", 
  "stderr_lines": ["--swagger_out: no field \"segment2\" found in APIRequest"], 
  "stdout": "", 
  "stdout_lines": []}

@Multiply
Copy link

Has there been any traction on this since 2017?

I'd like to help out where possible, as this problem is currently biting us a little hard, and I can't seem to find a way to 'override' the http path used for swagger, without changing the entire implementation.

@vtsao
Copy link

vtsao commented Feb 13, 2019

Is this a dupe of #702?

@calbach
Copy link
Author

calbach commented Feb 13, 2019

Yes - I think it's a dupe AFAICT. That said, I don't have an explanation of why @Multiply hit this if that dupe is marked "fixed", unless they're on an older version.

@johanbrandhorst
Copy link
Collaborator

I'll close this as a dupe of #702, please reply if you're still experiencing this problem @Multiply

@Multiply
Copy link

Not sure if you want me to reply here, or elsewhere, but I am indeed still hitting this.

Just to make sure I wasn't using an older version, I just removed the protoc-gen-swagger, and installed it again, and same result.

Currently I'm seeing the following results:

# Gist of a proto
option (google.api.http) = {
  get: "/v1/{id=prefix/*}"
};

# Gist of the swagger path
"/v1/{id}": {
  "get": {

I kind of hoped, this would be generating the following swagger path instead:

"/v1/prefix/{id}": {
  "get": {

Please let me know if that's not how it's meant to work.

@johanbrandhorst
Copy link
Collaborator

@Multiply thanks for replying, I think that sounds reasonable, looking at the PR that closed #702 it appears we're only doing this expansion for fields called parent or name (ref: de0b679#diff-d419038d63db0fdabc61f0d6cc6683aeR567), unless I'm misreading it. An easy fix might be to add your id field there as well?

@Multiply
Copy link

@johanbrandhorst I tried adding || field == "id" in there, and the generated swagger is the same.

I also tried using name in my proto, rather than using id and it didn't seem to help.

What is the expected path when using prefix?

@johanbrandhorst
Copy link
Collaborator

I don't know, sorry. If you'd like to experiment with it and find something that works for your use case feel free to open a PR. @achew22 might have a better understanding about why it does or doesn't work.

@Multiply
Copy link

Multiply commented Feb 14, 2019

Also limiting it to just parent, name or id, also seems kind of odd, as in some instances you might have more than a single nested resource, like /v1/{company_id=companies/*}/{team_id=teams/*}/{user_id=users/*}

I might just have the wrong expectations of the functionality, and in that case, what I'd rather want, is a way to override the path used for swagger, using an option of sorts. (Edit: if that even makes sense?)

@achew22
Copy link
Collaborator

achew22 commented Feb 14, 2019

To be honest I think this is just one of those things where the gRPC expectation and the swagger expectation differ. I don't really have a good solution. As an example of the difference it could interpret /v1/{company_id=companies/*}/{team_id=teams/*}/{user_id=users/*} as /v1/companies/{company_id}/teams/{team_id}/users/{user_id}, but that doesn't really seem to express the same thing since the consumer would be receiving the (company|team|user)_id value without their matching literal prefix.

I do not know what a great is.I would expect || field == "id" to not work, since it doesn't match team_id, but if you were to use a . to delimit it instead of an _ I think it might work.

Do we know the exact definition of a resource name? I think getting clarity on that would be very useful.

@Multiply
Copy link

The reason I use underscore, is because my gRPC implementation takes company_id in the body. (for this specific request)
In other instances I'd use company.id because the body takes a full company in the body, but takes the id from the URL.

All of the scenarios can be fixed for my use-case, if I can override the generated swagger path. I don't want to modify the actual implementation details, just the generated code the user sees in the UI.

The consumer of the documentation might not know from reading /v1/companies/{company_id} that the id is in fact companies/<id>, but I honestly don't mind, since the end result is the same. They get what they were searching for, and grouping/nesting of the paths is easier.

How feasible is it to be able to override the generated path?

johanbrandhorst pushed a commit that referenced this issue Dec 15, 2021
… multiple HTTP path/method endpoint support (#2461)

* Support standard OpenAPI paths for all valid path params for gRPC endpoints in protoc-gen-openapiv2 (#407)

* Fix issue with URL encoded path segments not being split when unescaping mode is legacy, default or all characters

* Adding docs for path template syntax and new functionality (#1700)

* Removed unused old code

* Additional tests for regular expressions

* Formatting code

* Fix lint error

* Removing duplicate test

* Adding additional test for path escaping

* Fix failing test because of path change

* Fix failing tests in code coverage

* Change determination of unescaping by supporting default mode change

* Regenerated files from changes

* Remove unused parameters in test

* Remove code warnings in test

* Implement changes from PR #2461 comments

* Correcting incorrect documentation

* Adding test case to show change in docs is warranted
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants