Design Principles and Goals and Other High Level Stuff (EXTREMELY TENTATIVE, PLEASE EDIT/DISCUSS)
Our long-term goal is to provide the best possible experience for interacting with the ASpace API from Python: make reading from ASpace easy, make writing as easy as is feasible, and provide clean abstractions over the API without sacrificing the capability to accomplish tasks.
Loose coupling to the API
The ASpace API is subject to change and is not versioned; there are essentially no hard guarantees that any aspect of the API will remain stable. Thus, it's desirable to, where possible, rely on parts of the API that are likely to be stable because they would be hard to change without breaking internal assumptions ASpace itself relies on. Code that relies on structural qualities of the API is thus encouraged, versus code that relies on specific endpoints and features of endpoints.
Code that can read attributes from ASpace or from ASpace code is a good second choice (e.g. processing JSONModel schema files or yml config files from ASpace, probing the API for route info if possible/plausible). Custom code that depends directly on individual API methods and details should be considered carefully, but will certainly be necessary.
Lazy Evaluation Where Possible
ASpace installations often contain large numbers of objects, so it's often valuable to defer evaluation to conserve memory. Additionally, fetching objects on demand is sometimes required: altering an object can affect other objects, which can invalidate JSON that has already been downloaded.
Documentation and Testing
Have it and do it. Docs and tests are kindness done for both others and one's future self. (Note: EXPAND THIS)
Client API (implemented)
Basically shorthand for requests, plus an authorize method and a method for dealing with paged endpoints.
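To make the paged-endpoint idea concrete, here is a minimal sketch, not the actual ASnake client. `SketchClient`, `FakeSession`, and `FakeResponse` are hypothetical names, and the `results` / `last_page` response fields and the `page` / `page_size` parameters are assumptions about what a paged endpoint returns. The fake session stands in for requests so the sketch runs without a server.

```python
class SketchClient:
    """Hypothetical thin wrapper: prefixes a base URL and walks paged endpoints."""

    def __init__(self, session, baseurl):
        self.session = session              # anything with a requests-style .get(url, params=...)
        self.baseurl = baseurl.rstrip("/")

    def get(self, path, params=None):
        return self.session.get(self.baseurl + path, params=params or {})

    def get_paged(self, path, page_size=2):
        """Yield every record from a paged endpoint, one page request at a time."""
        page = 1
        while True:
            body = self.get(path, params={"page": page, "page_size": page_size}).json()
            yield from body["results"]
            if page >= body["last_page"]:   # assumed response field
                return
            page += 1


class FakeResponse:
    def __init__(self, body):
        self._body = body

    def json(self):
        return self._body


class FakeSession:
    """Stand-in for a requests session so the sketch needs no network."""

    def __init__(self, items):
        self.items = items
        self.urls = []

    def get(self, url, params=None):
        self.urls.append(url)
        size, page = params["page_size"], params["page"]
        start = (page - 1) * size
        last_page = max(1, -(-len(self.items) // size))   # ceiling division
        return FakeResponse({"results": self.items[start:start + size],
                             "last_page": last_page})


session = FakeSession([{"id": n} for n in range(1, 6)])
client = SketchClient(session, "http://localhost:8089/")
ids = [record["id"] for record in client.get_paged("/repositories")]
```

Because `get_paged` is a generator, a caller that stops iterating early never requests the remaining pages.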
ORM-ish abstraction layer (proposed)
Currently ongoing work and discussion located here
A layer that abstracts fetching and interrogating API methods, handling three kinds of objects:
- Collections: generator or relation object wrapping/providing a generator over API index requests that returns the individual objects
- Individual objects: Python object representing a JSONModel object, where attribute access gets you either simple values (strings, numbers, etc), other individual objects, or relations
- Refs: Python object that, on access, converts itself into the individual object by fetching it from the API
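The "ref" behaviour described in the list above could be sketched roughly as follows. This is an illustration under assumptions, not the proposed implementation: `LazyObject` and `fake_fetch` are hypothetical names, and a ref is assumed to be a JSON object whose only key is `ref`.

```python
class LazyObject:
    """Hypothetical wrapper over a JSON record; a bare ref is fetched on first access."""

    def __init__(self, json_rep, fetch):
        self._json = json_rep
        self._fetch = fetch                       # hypothetical callable: uri -> dict
        self._is_ref = list(json_rep) == ["ref"]  # assumption: a ref has only a "ref" key

    def reify(self):
        """Swap the ref for the full record, fetching at most once."""
        if self._is_ref:
            self._json = self._fetch(self._json["ref"])
            self._is_ref = False
        return self

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for JSON fields.
        if name.startswith("_"):
            raise AttributeError(name)
        self.reify()
        try:
            value = self._json[name]
        except KeyError:
            raise AttributeError(name) from None
        # Nested objects (including refs) are wrapped; simple values pass through.
        return LazyObject(value, self._fetch) if isinstance(value, dict) else value


# Fake data and fetcher so the sketch runs without a server.
fetched = []
records = {"/repositories/2": {"uri": "/repositories/2", "name": "Main Repository"}}

def fake_fetch(uri):
    fetched.append(uri)
    return records[uri]

resource = LazyObject({"title": "Papers", "repository": {"ref": "/repositories/2"}}, fake_fetch)
repo = resource.repository      # still an unresolved ref, nothing fetched yet
fetched_before = list(fetched)
name = repo.name                # first field access triggers the fetch
```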
The goal is to have an API like this:

    from asnake.aspace_orm import ASpace

    aspace = ASpace()
    for repo in aspace.repositories:
        print(repo.name)
        for ao in repo.archival_objects:
            # do stuff with aos
            pass
For right now, I (@pobocks) think having this be a strict "you make either a direct or index call" and not worrying about search/selection makes sense, but definitely needs a lot of thought and probably discussion by people.
Proposed API For JSON Relation class.
The JSONModelRelation class should:
- allow iteration over the collection it represents
- allow fetching of single instances from that collection by id
I think this is best served by a relation class that stores the URL (and maybe param values?) for the collection in question, has __iter__ defined as a get_paged call to said URL with said values, and has a method (maybe shorthanded via __call__ or __getitem__) to find by integer or string ID. __getitem__ has the advantage of being able to accept slice notation, but the downside of being potentially confusing, because it would mean __getitem__ referring to a concept of index unrelated to what gets iterated when you do for x in thing.
Example of proposed API:
    relation = aspace.repositories

    # __call__ version
    relation(3)                   # repository with id=3
    relation("/repositories/3")   # same

    # __getitem__ version
    relation[3]                   # repository with id=3
    relation["/repositories/3"]   # same
    relation[3:5]                 # repositories 3-5? Honestly this seems a bit dangerous and hard to impl

    # Or we can just have methods or such
    relation.find_by(id=3)
    relation.find_by(uri="/repositories/3")
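A rough sketch of the `__call__` variant, under stated assumptions: `RelationSketch` and `FakeClient` are hypothetical, and the fake client returns plain dicts from `get` and `get_paged`, which is a simplification for illustration rather than a claim about the real client.

```python
class RelationSketch:
    """Hypothetical relation: stores a collection URI, iterates it, finds by id."""

    def __init__(self, uri, client):
        self.uri = uri.rstrip("/")
        self.client = client

    def __iter__(self):
        # Iteration is just a paged walk over the stored collection URI.
        return iter(self.client.get_paged(self.uri))

    def __call__(self, ident):
        # Accept 3, "3", or a full URI such as "/repositories/3".
        ident = str(ident)
        uri = ident if ident.startswith("/") else f"{self.uri}/{ident}"
        return self.client.get(uri)


class FakeClient:
    """Stand-in client backed by a dict of uri -> record (no network)."""

    def __init__(self, records):
        self.records = records

    def get_paged(self, uri):
        prefix = uri + "/"
        for key, record in self.records.items():
            if key.startswith(prefix):
                yield record

    def get(self, uri):
        return self.records[uri]


records = {f"/repositories/{n}": {"uri": f"/repositories/{n}"} for n in (2, 3, 4)}
relation = RelationSketch("/repositories", FakeClient(records))

by_id = relation(3)
by_uri = relation("/repositories/3")
all_uris = [repo["uri"] for repo in relation]
```

Using `__call__` keeps lookup by id visibly separate from iteration order, which sidesteps the index confusion that `__getitem__` would introduce.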
Potential Misc Planned Enhancements
client arg to JSONModel classes optional
Both object and relation constructors should do something useful in the absence of a client being passed - I think expecting end users to always pass a client is unfriendly enough that it's to be avoided. I think, given that we're focusing on lazy eval, there are a couple of options. In ALL cases, I think client should be universally passed in ASnake code - this is a convenience for the user, not something we should rely on internally.
1. Have a "default client" resident on the class, configured with default values. Thus, in the expected most common case, where people care about a single ASpace instance and have configured ASnake with a yml file as suggested in the docs, it'll "just work."
2. Option 1, but replace the "default client" any time a new asnake.aspace.ASpace instance is created, thus getting the "last created" client.
   - Advantages: workflows that talk to one ASpace, then another in sequence? In a REPL, it'll be the last one you set up?
   - Disadvantages: less predictable, more effort
3. Some sort of "clientless mode," where if a client isn't passed, dereferencing or other operations that make client calls are omitted or fail.
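The options above can be compared in a small sketch. All names here (`ClientConsumer`, `ASpaceSketch`, `default_client`) are hypothetical, and plain strings stand in for client objects to keep the example short.

```python
class ClientConsumer:
    """Hypothetical base for object/relation classes that need a client."""

    default_client = None   # option 1: a default client resident on the class

    def __init__(self, client=None):
        self._client = client

    @property
    def client(self):
        # An explicitly passed client always wins over the class default.
        if self._client is not None:
            return self._client
        if type(self).default_client is not None:
            return type(self).default_client
        # Option 3, failing variant: the error surfaces only when a client is needed.
        raise RuntimeError("no client passed and no default client configured")


class ASpaceSketch:
    """Hypothetical entry point; creating one replaces the default (option 2)."""

    def __init__(self, client):
        self.client = client
        ClientConsumer.default_client = client   # "last created" wins


explicit = ClientConsumer(client="explicit client").client
```

Since the lookup happens at access time rather than at construction, an object created before any `ASpaceSketch` exists still picks up the default later. That is part of what makes option 2 less predictable.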
Retry on failures that are clearly "haven't authed yet"
Possibly wrap get/post/delete/etc in wrappers that attempt to authorize and then retry (once) if the "not authorized" error happens? It would be very forgiving, and I can't think of MUCH reason someone would deliberately want to access authed things while unauthed. Definitely should have a config var if included.
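One way the retry wrapper might look, as a sketch only. `retry_after_auth`, `RetryingClient`, and `FakeTransport` are hypothetical names, and the set of status codes treated as "not authorized" is an assumption that would need checking against a real ASpace instance.

```python
import functools

# Assumption: which status codes mean "haven't authed yet" must be verified
# against the ASpace instance in use.
UNAUTHORIZED_CODES = {401, 403}


def retry_after_auth(method):
    """Wrap a client method so one auth failure triggers authorize() and one retry."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        response = method(self, *args, **kwargs)
        if self.retry_on_unauthorized and response.status_code in UNAUTHORIZED_CODES:
            self.authorize()
            response = method(self, *args, **kwargs)   # retried once, never looped
        return response

    return wrapper


class RetryingClient:
    def __init__(self, transport, retry_on_unauthorized=True):
        self.transport = transport                            # hypothetical HTTP layer
        self.retry_on_unauthorized = retry_on_unauthorized    # the config var

    def authorize(self):
        self.transport.login()

    @retry_after_auth
    def get(self, path):
        return self.transport.get(path)


class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code


class FakeTransport:
    """Stand-in server: refuses requests until login() has been called."""

    def __init__(self):
        self.logged_in = False
        self.gets = 0

    def login(self):
        self.logged_in = True

    def get(self, path):
        self.gets += 1
        return FakeResponse(200 if self.logged_in else 403)


transport = FakeTransport()
response = RetryingClient(transport).get("/repositories")
```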
Prolly ought to be a recursive version of reify
We're gonna want this, should be easy-ish to implement as long as nothing has 1000 levels of nesting?
Hmmm, circular refs might actually make this LESS GOOD AN IDEA, investigate.
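As a starting point for that investigation, here is a sketch of a recursive reify that leaves a ref unresolved when following it would loop back to a record already being expanded. `reify_deep` and the sample records are hypothetical; refs are assumed to be dicts with a `ref` key, and records are assumed to carry their own `uri`.

```python
def reify_deep(node, fetch, ancestors=frozenset()):
    """Recursively replace {"ref": uri} dicts with fetched records.

    `ancestors` holds the URIs currently being expanded on this path, so a ref
    pointing back up the chain is left as-is instead of recursing forever.
    """
    if isinstance(node, list):
        return [reify_deep(item, fetch, ancestors) for item in node]
    if not isinstance(node, dict):
        return node                      # strings, numbers, None, etc.
    uri = node.get("ref")
    if uri is not None:
        if uri in ancestors:
            return node                  # circular: keep the unresolved ref
        return reify_deep(fetch(uri), fetch, ancestors | {uri})
    own_uri = node.get("uri")
    if own_uri is not None:
        ancestors = ancestors | {own_uri}
    return {key: reify_deep(value, fetch, ancestors) for key, value in node.items()}


# Fake records containing a cycle: the resource points at a child that points back.
records = {
    "/repositories/2/resources/1": {
        "uri": "/repositories/2/resources/1",
        "title": "Papers",
        "child": {"ref": "/repositories/2/archival_objects/7"},
    },
    "/repositories/2/archival_objects/7": {
        "uri": "/repositories/2/archival_objects/7",
        "title": "Letters",
        "resource": {"ref": "/repositories/2/resources/1"},
    },
}

result = reify_deep({"ref": "/repositories/2/resources/1"}, records.__getitem__)
```

This only guards against cycles. Deep but acyclic nesting would still recurse once per level, and every followed ref costs a fetch, so a depth limit may still be worth considering.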