-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for new version of lookup-by-base-path #812
Comments
GET does seem the semantic choice for this. How many of these do you look up with a single request? Do we know if we have a limit since it's just an internal request? Longest base_path we have is 799 characters percent encoded. Re: "draft returns details of the content item currently in the draft content store. There should always be a draft object." I like that this matches up with the "unpublishing" implementation on get content 👍 it would be closer to the semantics we use in other places if draft was nil when there isn't an active draft. Eg if you run Also.. would you want locale returned too? for content that is in multiple locales the content id is only half the picture. |
Sounds like an interesting URL we've got there 🤔 We're not normally going to be sending a massive amount of base paths, but I think we could still hit limits like this one. The multi-base-path form of the request is only used in a few places: The biggest usage seems to be for translating related item tags into content ids. I don't have good data on this but the largest amount of I was a bit unsure about always returning drafts (I assume you mean "if you requested something that is not in draft we'd return nothing rather than live one"). I don't have a good reason for differing from the standard semantics though, so I'll make it return nil unless there's an active draft. Good point about locale. I'm not sure we need it, since link-set-tagging affects all editions of a document regardless of locale. But I'm happy to add it if it's useful more generally. |
yes - it's loaded up with utf8 characters percent encoded (a whole other problem) hmm maybe we can make it respond to both GET and POST and make a decision on which one of those based off number of base_paths in API Adapters? yeah locale sounds like it'd be good for general purpose. Less of a risk than linkables but what do you think about adding a limit to the amount of base_paths we accept? Just something big like 1000? |
👍 to both of those suggestions. I'll start implementing something today. |
This allows draft and unpublished documents to be looked up in addition to live content. See #812
Have started the work in #822 |
This allows draft and unpublished documents to be looked up in addition to live content. See #812 Locale and content id are returned for each successful lookup, as well as unpublishing details if applicable. I've also included document type, because there are some document types that the application may not want. For example it makes no sense to tag something to something with document type "redirect".
This allows draft and unpublished documents to be looked up in addition to live content. See #812 Locale and content id are returned for each successful lookup, as well as unpublishing details if applicable. I've also included document type, because there are some document types that the application may not want. For example it makes no sense to tag something to something with document type "redirect".
Background
In regular publishing apps, we refer to documents by content ID, and using base paths as an identifier is discouraged.
GET /v2/content
can be used to search for editions.In content tagger, things work a bit differently. We work at the document level rather than with specific editions, and use
PATCH /v2/links
to add links via the LinkSet.For the basic tagging workflow, users enter a URL or base path of anything on GOV.UK, and we lookup the content id with
POST /lookup-by-base-path
.This request may take multiple content ids to lookup simultaneously, which we use for bulk tagging workflows.
Problem
Currently if a document has not been published yet, we can't look up the item in content tagger.
We want to allow content tagger to work with any draft content, while continuing to support live content (which may be unpublished and withdrawn). We should distinguish between the two so the user can either view their changes immediately (live) or preview them on the draft-origin (draft).
For error handling, it would also be useful to distinguish between content that doesn't exist, and content that used to exist, but has been redirected/removed.
This is something we noticed when bulk tagging to the education taxonomy, because while identifying content to tag to the taxonomy, education publishers also improved the content and consolidated a lot of pages. In this case the bulk imports triggered a lot of "Content not on GOV.UK" errors.
Proposal
I'd like to create a new version of
/lookup-by-base-path
, that is less restrictive and returns more context about what content store it's in and whether it's been unpublished.Request
Response
In the response there is now an object representing the current state of each content store.
draft
returns details of the content item currently in the draft content store. There should always be a draft object.live
returns details of the content item currently in the live content store. There will only be a live object for published and unpublished content.unpublishing
is returned for thelive
content item if the live content itemis an unpublished document.
Usage
We'd add a new gds-api-adapter to return the response.
In content tagger we'd use it like this:
Rejected Alternatives
We could combine the two objects into a single one, with the current state as a field.
I don't like this solution as much because a state field is easily confusable with the publishing state of an edition. I think for this request, we are thinking more in terms of content store content items than publishing api editions. I'm also not convinced this is flexible enough, as it assumes the content id will never change for a base path, and we'll never be interested in additional fields that differ between draft and live.
We could use edition states instead of content store states
This would make the responses more difficult to work with.
We could keep using v1 and use a query parameter to toggle the new response
I'd prefer to indicate the version more explicitly but not sure if there's other considerations here.
POST vs GET
In the proposal above I've left it as
POST
(matching the existing request), although I thinkGET
would convey the semantics better.IIRC, the request was
POST
rather thanGET
to avoid hitting problems with limits on the length of a URI due to a long query string.Alternatively we could use
GET
with a body. Happy to go with whichever approach is preferred.The text was updated successfully, but these errors were encountered: