-
Notifications
You must be signed in to change notification settings - Fork 56
Description
As mentioned in #277 (comment), we now have
ItemSearch.item_collection() and ItemSearch.item_collections(). I don't think it's immediately obvious what
the difference between the two is (spoiler: item_collections() returns an iterator of pystac.ItemCollection objects, one per page;
.item_collection() returns a single pystac.ItemCollection with all the results).
Let's take stock of the current methods for getting stuff out of an ItemSearch, ignoring some deprecated methods:
| method | return type | Notes |
|---|---|---|
items |
Iterator[Item] |
|
items_as_dicts |
Iterator[Dict[str, Any]] |
|
item_collections |
Iterator[ItemCollection] |
|
item_collection |
ItemCollection |
|
get_all_items_as_dicts |
Iterator[Dict[str, Any]] |
Deprecated in favor of get_items or get_item_collections, (?) |
We have a couple of dimensions expressed in these methods:
- The result set:
a. A single item (.items(),.items_as_dicts())
b. Pages at a time (.item_collections(),)
c. Everything (.item_collection()) - The result type:
a. Plain dict (items_as_dicts,get_all_items_as_dicts)
b. pystac object (.items(),.item_collection(),.item_collections())
I think each of these is useful.
I've personally used both single items and everything (1a and 1b). I'm not sure how commonly used getting items by page is, but I can imagine some usecases that want to carefully control when HTTP calls happen (and pystac-client uses it internally).
And getting pystac objects or plain dictionaries are both useful. pystac objects for convenience, and plain dictionaries for performance.
So the motivation for this issue is item_collections() vs. item_collection() but I think we can take this opportunity to change a bit more and get to a nice state.
My proposal: make the "by page" explicit by deprecating item_collections() in favor of .pages(). And we want full symmetry,
then we'd have a total of six methods representing the production of the result set & result type.
| method | return type |
|---|---|
items |
Iterator[pystac.Item] |
items_as_dicts |
Iterator[dict] |
pages |
Iterator[pystac.ItemCollection] |
pages_as_dicts |
Iterator[dict] |
item_collection |
pystac.ItemCollection |
item_collection_as_dict |
dict |
The result set is expressed in the method name (items, pages, item_collection). item_collection is a bit of an odd duck there, but I think that's OK (IMO it's better than .everything())
The pystac vs. plain is expressed by the presence or absence of the as_dict suffix.
The item_collection_as_dict and the values in pages_as_dicts would (I think) be the feature collection things:
{
"type": "FeatureCollection",
"features": [
...
]
}
With all that, I think we'd satisfy every use-case and have a coherent naming scheme.