Upgrade to Api v2 #20

StevenMHernandez · 2018-06-28T15:23:44Z

This directly handles issue #9.

In addition to the stated issue of API v1 being deprecated by Harvest, this pull request also syncs many new data points that v2 offers (in addition to some data points not available in the previous version).

Indirectly; as a result of this pull request, the following issues will be handled as well:

Time entries not sync'ed properly #18 because we are now correctly paginating all collection api endpoints
Paging not supported for invoices. #17 because we are now correctly paginating all collection api endpoints
404 response from expense_categories #6 because we are now pulling this data in a completely different method

As a result of moving from API v1 up to v2 for Harvest, there are a few configuration changes that must occur (for use in systems such as Stitch). These config.json changes can be seen in the updated README

account_id is required instead of account_name
user_agent is required (though it appears that it was required in the past, just not documented)

I've changed the version # for the module because it makes large changes to the underlying schemas compared to that of module v1.1.1

many new schemas were added
many new schema attributes were added (as per new data from the v2 Harvest API)
many old schema attributes were removed (no longer available in the v2 Harvest API)
people schema was renamed to users (as per new Harvest API endpoints)
schema attribute names have been changed to match the current api (ex: active vs is_active)

Please let me know any questions, concerns or recommendations.

* update authentication method required for new v2 of the api * add additional required config variables for tap usage * update access_token request to comply with v2 of api * update schemas to reflect data returned from the v2 api * add additional schemas to reflect additional data points available @see https://help.getharvest.com/api-v2/ @see singer-io#9

…a and loops through data, thus removing redundancies * allow each parent object loaded from the `sync_endpoint` to recursively sync additional endpoints (related to the parent object) through a `for_each_handler` * remove api requests from the individual `sync_$item` methods, instead prefer using the generic `sync_endpoint` method

…nc_endpoint` method)

* allow key_properties to be set to anything other than just `["id"]` for use with pivot table style data * update config.json and state.json documentation in README.md * add additional schemas and subsequent loading-code that was missed in the first pass * additional minor fixes such as * incorrect schema attribute names * missing attributes * incorrect schema attribute type/formats * minor formatting tweaks

…ompany, otherwise the user receives a 403 response error * load company metadata from harvest api * ensure specific features are either enabled or disabled before beginning endpoint syncs

* define default key_properties directly in `load_and_write_schema` method signature * update version * fix other minor inconsistencies

cmerrick · 2018-06-28T15:23:46Z

Hi @StevenMHernandez, thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes.

cmerrick · 2018-06-28T16:01:30Z

You did it @StevenMHernandez!

Thank you for signing the Singer Contribution License Agreement.

nick-mccoy · 2018-06-29T18:25:26Z

This is awesome, @StevenMHernandez!! Thanks for the well thought out, comprehensive updates. We are going to review and test these changes during our next sprint. We'll update soon!

StevenMHernandez · 2018-08-07T14:30:49Z

@nick-mccoy Any update on this? Is there anything I can do on my end to help?

nick-mccoy · 2018-08-07T14:32:45Z

Hey @StevenMHernandez -- sorry for the delay -- reviewing this is in our current sprint, so we should be getting to it by next week at the latest.

StevenMHernandez · 2018-08-07T14:37:44Z

Cool, no worries and thanks!

dmosorast · 2018-08-10T15:28:42Z

Hi @StevenMHernandez, I'm taking a look at this to see about getting it merged. Right now, I'm having a bit of trouble with the authorization, and I'm not sure what pattern the tap is using when looking at the Harvest docs on auth.

It looks to be a mix of personal access token (using account_id) and OAuth (using refresh_token), but neither have worked thus far.

Here's the details of what I've done.

Personal Access Token

I've tried creating a personal access token using the developer UI and passing it in the config like so (put it in refresh_token, assuming the name was just off, since that's a required field and there's no special handling around it other than the OAuth access token requests):

{
    "refresh_token": "<personal access token>",
    "account_id": "<id provided with access token>",
    ...
}

but this failed the Auth object's access token request with this partial stack trace.

CRITICAL invalid_grant
CRITICAL The provided access grant is invalid, expired, or revoked (e.g. invalid assertion, expired authorization token, bad end-user password credentials, or mismatching authorization code and redirection URI).

...

  File "/opt/code/tap-harvest/tap_harvest/__init__.py", line 61, in _refresh_access_token
    self._access_token = resp_json['access_token']
KeyError: 'access_token'

OAuth

I created an OAuth app and was able to retrieve a refresh token and access token, and specified them in the config. I also removed the account_id parameter requirement and usage, since their OAuth flow doesn't seem to return that? This gets further, but then I get a 401 (Unauthorized) from the company endpoint.

Have you run into these troubles locally? It would be great if the tap could be used with both of these methods.

dmosorast · 2018-08-10T17:56:21Z

It appears that the problem was that I wasn't specifying the account_id, but I'm a bit confused about how to get that without creating a personal access token. So I have a bit of research to do.

In the meantime, if you have any thoughts on the above regarding the auth method, and how this flow should work, I'm interested!

StevenMHernandez · 2018-08-10T18:01:13Z

Sorry for the issues with this.

OAuth

Yup, account_id is required, which would make sense why you are getting a 401 Unauthorized. However, I understand from what you are saying, this should be automated. Looking through example code that harvest provides here - https://github.com/harvesthq/harvest_api_samples/tree/master/v2/oauth it appears they are able to retrieve this account_id value automatically, however I am noticing that it is actually using the v1 api endpoints (which defeats the purpose of this pull request! V1 is being deprecated!).

I'm not sure if this is an oversight in the design of the harvest API or if I'm missing something.

Personal Access Token

I hadn't considered this method in this pull request. Do you think this is an additional requirement you all will need to accept this pull request or perhaps something that can be handled independently?

StevenMHernandez · 2018-08-10T18:11:16Z

Actually, it appears the https://id.getharvest.com/api/v2/accounts endpoint accepts just the access_code parameter as authentication which does itself return the account_id required. So it wouldn't be required that the user specifies this id necessarily.

dmosorast · 2018-08-10T18:14:38Z

Yes, I was just about to comment that! When I query it with Postman and my OAuth token, I get an object like this:

{
    "user": {
        "id": 123456,
        "first_name": "Name",
        "last_name": "Lastname",
        "email": "my-email@me.net"
    },
    "accounts": [
        {
            "id": 98765,
            "name": "Company Name",
            "product": "harvest",
            "google_sign_in_required": false
        }
    ]
}

Potentially, this could be requested by the tap to see which account the access token has been granted for. It would make sense to take in both the access_token and refresh_token from the OAuth flow, request the accounts endpoint, and pull out the first account from that endpoint to run. This could also give the option to extend it to allow for multiple accounts authed with the same token (though I wouldn't suggest adding that in to this PR heh).

Personal Access Token

As for the Personal Access Token method, I was just confused by the Account ID, since that auth method was the only mention of it I could find at the time. OAuth is definitely preferred.

… from user config

StevenMHernandez · 2018-08-10T18:48:00Z

Ok @dmosorast That last commit allows the tap to automatically retrieve the first account_id.

dmosorast · 2018-08-10T18:51:34Z

Great! Thanks for the quick turnaround. It's working on my end now. I'll continue with the review and figure out the plans for merging. All in all, it's looking pretty good. I should have something concrete early next week.

StevenMHernandez · 2018-08-10T19:02:10Z

Sure thing and thanks for taking a look into this!

dmosorast · 2018-08-14T15:23:58Z

@StevenMHernandez I've got everything set up to get this merged and released. If you have a Stitch account, can you PM me in the Singer slack @.dmosora and we can see about getting you set up there for testing?

StevenMHernandez added 7 commits June 27, 2018 12:28

store the id of nested objects instead of the full nested object

1893e5e

automatically paginate all "collection" results (all calls to the `sy…

41a965b

…nc_endpoint` method)

skip syncing any endpoints that are not enabled by the given user's c…

7cc1c6e

…ompany, otherwise the user receives a 403 response error * load company metadata from harvest api * ensure specific features are either enabled or disabled before beginning endpoint syncs

update module to v2.0.0, minor tweaks

8c2e535

* define default key_properties directly in `load_and_write_schema` method signature * update version * fix other minor inconsistencies

cmerrick added the cla-missing label Jun 28, 2018

cmerrick removed the cla-missing label Jun 28, 2018

nick-mccoy mentioned this pull request Jun 29, 2018

Adding projects attributes relating to fixed fee projects #19

Closed

determine account_id automatically as opposed to requiring account_id…

d9b523c

… from user config

dmosorast merged commit ca87f8b into singer-io:master Aug 14, 2018

This was referenced Sep 19, 2018

Upgrade to V2 API #9

Closed

Time entries not sync'ed properly #18

Closed

Paging not supported for invoices. #17

Closed

dmosorast mentioned this pull request Sep 19, 2018

404 response from expense_categories #6

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Upgrade to Api v2 #20

Upgrade to Api v2 #20

StevenMHernandez commented Jun 28, 2018

cmerrick commented Jun 28, 2018

cmerrick commented Jun 28, 2018

nick-mccoy commented Jun 29, 2018

StevenMHernandez commented Aug 7, 2018

nick-mccoy commented Aug 7, 2018

StevenMHernandez commented Aug 7, 2018

dmosorast commented Aug 10, 2018 •

edited

dmosorast commented Aug 10, 2018

StevenMHernandez commented Aug 10, 2018

StevenMHernandez commented Aug 10, 2018

dmosorast commented Aug 10, 2018

StevenMHernandez commented Aug 10, 2018

dmosorast commented Aug 10, 2018

StevenMHernandez commented Aug 10, 2018

dmosorast commented Aug 14, 2018 •

edited

Upgrade to Api v2 #20

Upgrade to Api v2 #20

Conversation

StevenMHernandez commented Jun 28, 2018

cmerrick commented Jun 28, 2018

cmerrick commented Jun 28, 2018

nick-mccoy commented Jun 29, 2018

StevenMHernandez commented Aug 7, 2018

nick-mccoy commented Aug 7, 2018

StevenMHernandez commented Aug 7, 2018

dmosorast commented Aug 10, 2018 • edited

Personal Access Token

OAuth

dmosorast commented Aug 10, 2018

StevenMHernandez commented Aug 10, 2018

StevenMHernandez commented Aug 10, 2018

dmosorast commented Aug 10, 2018

Personal Access Token

StevenMHernandez commented Aug 10, 2018

dmosorast commented Aug 10, 2018

StevenMHernandez commented Aug 10, 2018

dmosorast commented Aug 14, 2018 • edited

dmosorast commented Aug 10, 2018 •

edited

dmosorast commented Aug 14, 2018 •

edited