Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic structure for API client class #163

Closed
Tracked by #217
JWCook opened this issue Jun 2, 2021 · 15 comments · Fixed by #165
Closed
Tracked by #217

Add basic structure for API client class #163

JWCook opened this issue Jun 2, 2021 · 15 comments · Fixed by #165
Labels
enhancement New feature or request
Milestone

Comments

@JWCook
Copy link
Member

JWCook commented Jun 2, 2021

The data models (#145) are mostly complete. The next step is to add support for API requests that return model objects instead of JSON. For backwards-compatibility, this should be an optional feature, not a replacement for the existing API functions.

Update: Decided to go with a compromise between options 2 and 3, described in comments below. This will involve an API client class and one controller class per resource type.

Options

There are at least two ways to go about this:

Option 1: Add a keyword arg to existing functions to specify return format

Example:

observations = get_observations(user_id='my_username', format='model')

Pros:

  • This could work nicely with other possible return formats, if we add optional integration with pyinaturalist-convert later on. format could be used to specify any other supported format, for example format='csv'.
  • No new functions for users to learn

Cons:

  • Adding multiple return types for the same function generally isn't good practice
    • For other readers of client code that uses these functions, it can take a bit longer to understand what's going on
    • A long Union[x, y, z] isn't very useful as a return type annotation

Option 2: Add wrapper @classmethods to model objects

Example:

observations = Observation.search(user_id='my_username')

Pros:

  • Results in more readable client code
  • Makes a more obvious separation between the "low-level" API (raw JSON) and "high-level" API (model objects)

Cons:

  • Mixes data with logic
    • It can be nice to keep those separate (as in MVC patterns), but not totally necessary
  • Requires addling slightly more code to implement
    • This can be mitigated somewhat by using forge to reuse existing function signatures & docstrings

Examples in other API clients:

  • Service resources in boto3, and other service-specific interfaces like S3.Bucket
@JWCook JWCook added the enhancement New feature or request label Jun 2, 2021
@JWCook JWCook added this to the v0.14 milestone Jun 2, 2021
@JWCook
Copy link
Member Author

JWCook commented Jun 3, 2021

Okay, after writing that out, I may be leaning toward option 2.

Do you have any thoughts on this, @niconoe?

@JWCook
Copy link
Member Author

JWCook commented Jun 3, 2021

Also @synrg, since you expressed interest in this, do you have a preference between these two options?

@niconoe
Copy link
Collaborator

niconoe commented Jun 3, 2021

@JWCook: thanks!

While option 1 is also perfectIy fine, I really like option 2. It's like a brand new and exciting API (between client code and Pyinaturalist).

In that case, I'd also suggest communicate to users in terms that keep the separation very clear and also give guidance: "hey, we have a new major version of Pyinaturalist, you can continue using the old (mostly unchanged API) but there's a new one that's probably nicer and recommended for new projects, here is the tutorials/cheat sheet.

I'm not too concerned about the "mixes data with logic" part, in think it's inevitable to some extend, especially for a data-oriented project like this one. But maybe I'm not understanding the issue fully, do you have a specific example that bothers you and that you want to discuss?

Just my 2 cents, I think in the end you're in the best position to make the final call! Also curious to hear @synrg, it's amazing to have more people contributing (even if just suggestions, we're better together!)

@synrg
Copy link
Contributor

synrg commented Jun 3, 2021

I'm also in favour of option 2. Pragmatism wins out here for me over strict separation of code & logic. And if not 2, then something else other than 1, which is not very appealing for the reasons already stated.

Regarding the example given, why Observation.search() and not Observations.search() (as on a collection of Observation objects)?

@JWCook
Copy link
Member Author

JWCook commented Jun 3, 2021

Thanks for the input! My example was for adding that to the current Observation model, just because that's what I have so far.

Adding separate classes for model collections is a possibility. I've actually experimented with that a bit over here, but I wasn't quite happy with how that was turning out. It seemed a bit confusing to work with a List[Observation] in some places and an Observations object in others. I also couldn't find any good examples of something similar in other API clients out there. Not to say it's not doable, it would just take some more thought.

@synrg
Copy link
Contributor

synrg commented Jun 3, 2021

I'm all for keeping things simpler. It wasn't something so much that I thought was wrong as something I didn't have a clear idea yet of how it ought to work. Looking for examples from other API clients as a guide would be where I'd start, too.

@JWCook
Copy link
Member Author

JWCook commented Jun 3, 2021

Option 3: API client class

Another slightly different option to consider would be a single API client class containing wrappers for all API functions. There are some good examples of that out there, like boto3 again (with Sessions), the other AWS SDKs, and python-gitlab.

Usage could look something like:

client = InatClient()  # Probably needs a better name
observations = client.get_observations(user_id='my_username')

Pros:

  • A client object could store additional state, like requests.Session settings, rate-limit settings, cache settings (for Add integration with requests-cache #158), etc.
  • Requires fewer imports
  • Instance methods may be more intuitive than class methods for some users
  • ???

Cons:

  • At first glance, usage maybe doesn't look quite as nice as <ModelName>.search()
  • A lot of methods for one class, making docs a little harder to navigate compared to docs for a single resource type
  • ???

I need to think through that one a bit more. Any thoughts on that?

@synrg
Copy link
Contributor

synrg commented Jun 3, 2021

That's exactly the choice I made in Dronefly. I wouldn't hold up my newbie python code as a shining example to follow, though. :) And yeah, it's ... messy with all those methods and no grouping together per resource type, I agree.

@synrg
Copy link
Contributor

synrg commented Jun 3, 2021

A hybrid approach? Client object for all the settings and stuff, but with each of the models represented thusly (from Gitlab link you gave):

# list all the projects
projects = gl.projects.list()
for project in projects:
    print(project)

# get the group with id == 2
group = gl.groups.get(2)
for project in group.projects.list():
    print(project)

# create a new user
user_data = {'email': 'jen@foo.com', 'username': 'jen', 'name': 'Jen'}
user = gl.users.create(user_data)
print(user)

We could have inat.observations.search(), etc.

@JWCook
Copy link
Member Author

JWCook commented Jun 3, 2021

Yeah, I like the look of that!

@niconoe
Copy link
Collaborator

niconoe commented Jun 4, 2021

I'm not sure I fully understand the last example, but the settings discussion made me think of the requests approach which I quite like:

you can either call methods (such as get()) directly (nice for newcomers), or via a method of a session object (allows advanced settings, sharing those settings across subsequent calls, ...).

@synrg
Copy link
Contributor

synrg commented Jun 4, 2021

As for how I understood the last example, inat would be an instance of INatClient (or whatever) containing settings, session, and all the other goodies. inat.observations would be how you organize all the model methods off of the client, i.e. inat.observations.search() would do an Observation.search() within the context of the inat client (settings, session, etc.), I think.

This resolves my earlier discomfort with Observation.search(), i.e. I want to "search the observations" (plural), not "search the observation". inat.observations.search() is a very natural way to express that, and also resolves the objection Jordan raised that putting everything on the client object would add way too many methods to it and not organize them in any fashion.

I guess it's kind of like requests, only different, because we have all of these different kinds of resources to contend with, unlike requests, which is much simpler at the top level. So we share more in common with Gitlab. Unsurprising, since both are APIs for websites with numerous interrelated resources on them.

@JWCook
Copy link
Member Author

JWCook commented Jun 4, 2021

Exactly, I see that as a good way to get the benefits of both options 2 and 3 without most of the drawbacks we mentioned. So InatClient.observations would be a "controller" object that contains the wrapper methods for observation requests, and InatClient.taxa for taxon requests, etc., rather than putting all the wrapper methods in a single class.

The only downside is, well, it will be more work, but I'm up for it. I should have time within the next week or so to get a start on this.

@JWCook JWCook removed this from the v0.14 milestone Jun 18, 2021
@JWCook JWCook changed the title Integrate data models and API requests Add API client class to integrate data models and API requests Jun 18, 2021
@JWCook
Copy link
Member Author

JWCook commented Jun 28, 2021

Based on a suggestion from Ben, it would be useful to have some default request params that could be set on the client and then passed to any applicable API requests. Possible params include:

  • preferred_place_id
  • locale
  • per_page

@JWCook JWCook added this to the v0.15 milestone Jul 6, 2021
@JWCook JWCook changed the title Add API client class to integrate data models and API requests Add basic structure for API client class Jul 30, 2021
@JWCook JWCook mentioned this issue Jul 30, 2021
23 tasks
@JWCook
Copy link
Member Author

JWCook commented Jul 30, 2021

Continuing this in #217

@JWCook JWCook modified the milestones: v0.15, v1.0 Sep 2, 2021
@JWCook JWCook removed this from the v1.0 milestone Mar 8, 2023
@JWCook JWCook added this to the v1.1 milestone Mar 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants