Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for auth token endpoint #142

Closed
wants to merge 19 commits into from

Conversation

martimors
Copy link

@martimors martimors commented Jun 28, 2021

This adds the TokenEndpointAuthenticator class, which can be used to retrieve an auth token by making a POST-request to an HTTP endpoint before each poll. It is not the most efficient approach possible, as these types of tokens have a life span which is typically far larger than the desired polling interval (some providers such as Okta charge per. token too), but at least it is a start and it can be expanded later.

These config attributes were added:

  • "http.auth.url": The full URL for the token endpoint
  • "http.auth.body": Payload for the token request (ie. some escaped json)
  • "http.auth.tokenkeypath": Where in the returned payload to retreive the token from

Here is an example of a connector which also shows the three new config parameters added:

{
    "name": "sertica-voyage-reports.http.source",
    "config": {
        "connector.class": "com.github.castorm.kafka.connect.http.HttpSourceConnector",
        "tasks.max": "1",
        "http.offset.initial": "timestamp=2020-05-08T07:55:44Z",
        "http.request.url": "{{url}}/VoyageReports/search",
        "http.request.headers": "accept: text/plain,Content-Type: application/json",
        "http.request.body": "{  \"distinct\": true,  \"includeSubObjects\": true,  \"criteria\": [    {      \"field\": \"updateDate\",      \"operator\": \"GreaterThan\",      \"values\": [        \"${offset.timestamp?datetime.iso}\"      ]    }  ],  \"sorting\": [    {      \"field\": \"updateDate\",      \"direction\": \"Asc\"    }  ]}",
        "http.request.method": "POST",
        "http.auth.type": "Token_Endpoint", # Token_Endpoint option was added in this PR
        "http.auth.url": "http://dkcph-pmstapp1.dk.dfds.root:9001/Auth", # This was added in this PR
        "http.auth.body": "{\"login\": \"mamor\", \"password\": \"{{password}}\"}", # This was added in this PR
        "http.auth.tokenkeypath": "accessToken", # This was added in this PR
        "http.response.list.pointer": "/",
        "http.response.record.offset.pointer": "key=/voyageReportNo, timestamp=/updateDate",
        "http.response.record.timestamp.parser.zpme": "UTC",
        "http.response.record.timestamp.parser": "com.github.castorm.kafka.connect.http.response.timestamp.NattyTimestampParser",
        "http.timer.interval.millis": "30000",
        "http.timer.catchup.interval.millis": "1000",
        "kafka.topic": "sertica-voyage-reports-cdc"
    }
}

I have also added a dev-dependency (mockhttpserver) to be able to test this functionality with okhttp calls mocked.

What do you think?

@salimane
Copy link

@dingobar I love the changes in this PR. Something that could be added to address the life span of the token is for example

  • add http.auth.tokenexpirypath to define the place to get the lifespan of the token
  • use some cache in the OkHttpClient to cache the accessToken

what do you think @dingobar?

@martimors
Copy link
Author

In order:

  • I think this is a bit complicated, as there is no consistent way that this is implemented. Some times, it can only be read from the JWT claims, some times the validity end timestamp is obtained and some times the number of seconds until expiry. Therefore, I don't think this should be part of this particular feature. We can discuss it separately, because whatever solution we land on most likely won't create any breaking changes.
  • This is a great idea, but it requires us to also be aware of what to do when we try to use an expired token, ie. the API returns a 401 or something. I know it's wasteful to create a new token for every single request, but I think its a good way to keep the feature general enough to cover all potential use-cases. It requires further refinement and discussion in my opinion, and since it won't break the current setup it could wait.

For now, it would be nice if I could get the current feature set reviewed, perhaps with backwards compatibility for the above ideas in mind. The features implemented in this PR covers our use-case at our company, and I have only limited resources for further commitments unfortunately.

What do you think?

@martimors
Copy link
Author

@castorm Any chance we could revive this project somehow? I think it's an awesome connector and it will have many use-cases in our organization.

@lobbin
Copy link

lobbin commented Sep 24, 2021

  • I think this is a bit complicated, as there is no consistent way that this is implemented. Some times, it can only be read from the JWT claims, some times the validity end timestamp is obtained and some times the number of seconds until expiry. Therefore, I don't think this should be part of this particular feature. We can discuss it separately, because whatever solution we land on most likely won't create any breaking changes.

What about letting the user set a fixed TTL on the token? In our case I know the token will be valid for one hour so I wouldn't have a problem using that.

@lobbin
Copy link

lobbin commented Oct 18, 2021

pr-142.diff.txt

Please see attached diff. It's an initial go at caching the access token for a fixed time. The token is saved for "http.auth.tokenexpirytime" seconds. The default value of 0 continues the current behaviour or fetching a new token on every request.

Not sure if it's the proper way of storing the data locally in private properties, but at least it works ;)

I also took the liberty ot changing how to find the token from path() to at(), given that our token result looks like this

{
  "data": {
    "tokens": {
      "accessToken": "<token data">
    }
  }
}

@martimors
Copy link
Author

@lobbin Fixed TTL for the token given in the connector config is fine. However, seeing as this project is pretty dead the past months (no master branch commits since june 16th this year), I won't spend any more time on this until I get a reply from one of the maintainers of this repo. We are not wanting to support our fork of this project and just wanted to introduce some new functionality, so if the community is dead that unfortunately means we cannot use this software for now...

@lobbin
Copy link

lobbin commented Oct 18, 2021

@dingobar Yes, I saw that as well. However, I haven't found any other suitable connector so I'm stuck with this one for a while longer.

@castorm
Copy link
Owner

castorm commented Oct 24, 2021

Hi @dingobar and @lobbin,

Thank you very much for your contributions.

I'm sorry it took me a bit to reply, I wouldn't say this project is dead, but I definitely don't have the time or incentive to spend too much time on it, so I'd be happy to accept new maintainers if you are up for the task.

Regarding this particular contribution, I'd love to merge it, as this feature is a recurrent request.

However, I'd like for a couple of things to be considered before moving forward, and I perfectly understand if you wouldn't want to invest any minute on it.

In any case, those things are:

  • There were already some integration tests starting up kafka & kafka connect servers. I believe the approach should be unified (preferably continue using testcontainers)
  • This plugin's HttpAuthenticator is integrated as an OkHttp's Authenticator. That means it's called automatically after receiving a 401 status code, and request retried after token refreshed. There might be some flaws in the current implementation preventing us from taking advantage of this, but before merging this PR, I'd like for this to be reviewed. Meaning, IMO Authenticator should only be called after a 401 to refresh the token (or as an optimization on first request as well), for the rest of the requests, last token should be reused. (Might be worth checking this out: https://sangsoonam.github.io/2019/03/06/okhttp-how-to-refresh-access-token-efficiently.html).

Again, thank you very much for your contribution, I believe even if the project is not as active as it used to be, it's still used by some people, and I'm sure this feature would be very much welcome. So I'll be happy to hear back from you.

Best regards.

@martimors
Copy link
Author

Thanks for the reply - I'll see if I can implement those changes.

About your time and the state of this repo, I agree that this is a very nice plug-in to connect and it will have immense value for us if we manage to implement it. Perhaps you could ask some of those people who you know use it and have contributed to it in the past to help you maintain it? I think the most crucial thing is being able to merge bugfixes and security patches and release them to confluent hub. Perhaps a conversation for another issue.

@roger-schlachter-waypoint

I understand this plugin has gone a bit stale, but this would definitely be a welcome feature.

@martimors
Copy link
Author

I unfortunately no longer use Kafka Connect. I welcome you to take over the feature and implement the requested changes from @castorm .

@martimors martimors closed this Mar 3, 2023
@szalapski
Copy link

szalapski commented May 8, 2023

I would like to try this or perhaps complete it if possible.

@szalapski
Copy link

@dingobar or @castorm, can you give guidance on how I might help? Above you say you want integration tests as well as adaptation to work with OkHttp's Authenticator.

I'm not sure I can get my environment set up to build or run integration tests.

Also, for what it's worth, I think fetching a token perhaps should be done preemptively, not having to get a 401 before doing so. If Token_Endpoint is the mode but we have no token, my vote is we should go get one before attempting the API call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants