Admin UI: Publisher Search #417
Comments
Thanks for raising this issue @robredpath. The main motivation for having this functionality is so that we can search across the whole set of publishers in one place, something that we don't have at the moment. aiui the use of checkboxes as suggested makes it possible for a user to check "has published" without checking "is approved". I don't think a UI should allow this and it suggests to me that checkboxes are unsuitable here. Radio buttons with "all", "registered (unapproved)", "approved (unpublished)" and "published" or similar might be better. In the publishers list, as it is currently implemented, the number of published datasets is shown. I think this is useful and would like to see it retained here, please. What is the purpose of "presence of errors in data"? Is this a Boolean, numerical or some other type of field? What is the logic for excluding unapproved publishers from non-Sysadmin users, please? At the moment, people sometimes try to register the same organisation more than once. Not allowing people to check if their organisation is in the registration process is likely to increase the number of duplicate registrations that we see. |
I don't have strong feelings on this. I agree that the UI allowing the user to select (hopefully!) impossible combinations isn't ideal, but having a single control that selects based on a combination of fields doesn't feel great either. We could make some JavaScript to auto-select "is approved" if they select "is published"?
Agreed. Implicit - but should be explicit - is that there's no loss of display or other functionality as a result of this change.
As I understand it, this is to make it easier to locate the correct publisher in a list where other search terms might lead to lots of results, and it stops someone having to click through a long list of publishers one-by-one to see if they have errors in their data. I would expect the control to be a Boolean, and the results to either be Boolean or numeric, depending on implementation considerations
This is a security consideration: if someone creates a publisher for some non-IATI-related purpose (such as to advertise their gambling website, or some illegal pursuit) then we don't want any content that they create, even whatever they entered into the publisher name field, to be displayed on the website until it's been reviewed.
Perhaps a mitigation to this might be to indicate if the search term appears in unapproved publishers, without actually listing them? Some carefully-crafted help text would be required to explain what was going on, however. |
I think what I am suggesting is a single control that allows someone to either search across all categories of publisher, or just one category. Let's move on and see what @cormachallinanderilinx suggests when implementing this. snip
I'm still unclear about the use case for a member of IATI support to be using this. Should we be considering how this particular filter interacts, or doesn't, with http://dashboard.iatistandard.org/data_quality.html ?
Thanks. That makes sense.
It feels tricky and something that will be difficult to implement. Let's see what options @cormachallinanderilinx can offer us. |
#310 paraphrased here:
Can this be added to the acceptance criteria, please? |
Thanks @siwhitehouse. I've updated the initial comment in line with our conversation here. Two unresolved points, though:
I'm not sure where this requirement came from. Maybe @IsabelBirds might know? I have no attachment to it, it's just in the Miro board so it's made its way here!
To clarify: would this make things worse, or is this the current situation (and so this just doesn't make things better)? |
The error field was an idea to reduce the amount of digging we have to do to offer support. Then if I'm already engaged with an org and can easily notice that they have errors, I can bring this up and offer support. This is likely to increase uptake and changes to data quality compared to contacting orgs out of the blue. |
For "is approved", "has published" and "has errors": this view is already available for viewing pending publishers: https://www.iatiregistry.org/dashboard/mypublishers-pending Should we just use this ticket to improve the searching (fuzzy logic) and fix the sorting? |
I think we can return to the user story to help us here:
The end state that we're trying to get to here is a situation where, when someone contacts IATI Support, we can quickly understand which Registry publisher(s) correspond to the person and/or organisation who has contacted us, and what the current state of them is. Ideally, I think that would be part of the existing publisher search, because then there's just one place that you go to look for information about publishers. However, I think we're open to it being a separate admin tool if that's more straightforward in terms of implementation and security. If the information is split across multiple tabs or multiple searches it becomes harder to use: at best you need to do the search multiple times, and it becomes very easy for people to either not know about or forget to use the other tabs. Is that feasible: a tab in the dashboard which supports all the functionality that we've discussed here in a single view that's admin-only? |
yes, that sounds good to me |
Cool - I want to hear from @siwhitehouse before we proceed, though! |
Estimate 3 days |
I'm not clear still, my apologies. How could we show a per-activity error count when the search is at publisher level? |
@robredpath I don't think we should have a single control for "is approved", "has published", "has errors in data". I think we want to be able to filter by the three statuses that a publisher might be in: "registered (unapproved)", "approved (unpublished)" and "published". By default, a search should show all statuses. Either a single control, or a set of controls, should let us filter by status. Separately, we want to be able to filter by whether a publisher has errors in its set of published files. That should show the number of errors, which we should be able to order on. @IsabelBirds have I specified what you have in mind here? Thanks to @cormachallinanderilinx for the estimate. |
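A minimal sketch of the three-status filter described above, assuming each publisher record carries an approval flag and a count of published datasets. The field names `is_approved` and `dataset_count` are hypothetical and are not taken from the Registry schema.

```python
from enum import Enum
from typing import Iterable, List, Optional


class PublisherStatus(Enum):
    REGISTERED = "registered (unapproved)"
    APPROVED = "approved (unpublished)"
    PUBLISHED = "published"


def status_of(is_approved: bool, dataset_count: int) -> PublisherStatus:
    """Collapse the two underlying fields into one of the three valid states."""
    if not is_approved:
        # Assumption: an unapproved publisher is never treated as published.
        return PublisherStatus.REGISTERED
    if dataset_count == 0:
        return PublisherStatus.APPROVED
    return PublisherStatus.PUBLISHED


def filter_by_status(
    publishers: Iterable[dict], wanted: Optional[PublisherStatus] = None
) -> List[dict]:
    """Return every publisher when `wanted` is None, else only that status."""
    return [
        p
        for p in publishers
        if wanted is None
        or status_of(p["is_approved"], p["dataset_count"]) is wanted
    ]
```

Deriving a single status from the two fields means the impossible combination ("has published" without "is approved") cannot be selected, whichever control ends up in the UI.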
I had misinterpreted @IsabelBirds' latest comment. What we would like is the mean average of errors per activity for the publisher as a column in the search results. That figure should also contain a link to the publisher's page on the IATI Validator, please. |
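One way the publisher-level figure could be computed, as a sketch only: sum the errors and the activities over all of the publisher's files, then divide. The keys `error_count` and `activity_count` are hypothetical. Whether the Validator reports expose an activity count per file is an assumption here, not something confirmed in this thread.

```python
from typing import Iterable, Optional


def mean_errors_per_activity(files: Iterable[dict]) -> Optional[float]:
    """Mean errors per activity across all of a publisher's files.

    Returns None when the publisher has no activities, so the UI can show
    a blank cell rather than a misleading zero.
    """
    files = list(files)
    total_errors = sum(f["error_count"] for f in files)
    total_activities = sum(f["activity_count"] for f in files)
    if total_activities == 0:
        return None
    return total_errors / total_activities
```

Dividing the totals weights each activity equally. Averaging the per-file means instead would give small files the same weight as large ones.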
Is this the URL you would like included? https://validator.iatistandard.org/organisation/aiddata Do you know if there is a Validator API that can be used to access this and that would allow us to get a count of errors, as we don't store the count? To the best of my knowledge the Validator only exposes two APIs: https://developer.iatistandard.org/api-details#api=iati-validator-v2&operation=get-pub-get-report |
The Validator API returns These are on a per-file basis; the way that we use CKAN in the Registry means that "file" and "dataset" are synonymous. The pipeline that feeds the Validator starts with the Registry, so any file that exists on the Registry should have an entry in the Validator. There will be a time lag, I'm not sure what it is precisely, but it won't be long! @simon-20 or @odscjames might be able to advise. Likewise, the Registry should know about updates to files first out of any of our systems. I'm not sure if there's an edge case where a file at an unchanged URL has been updated; again I hope that @simon-20 or @odscjames can advise on that. We discussed on the call that this could result in a lot of API calls if the results page has a lot of publishers on, each of whom have a lot of datasets. Given that the Registry knows about changes to files first, it should be fine to cache results and invalidate the cache based on Registry / archiver updates. The API isn't actually as fast as I thought (I'm seeing 300-400ms response times); we can look into improving that but caching will likely be important. The API should support a reasonable number of concurrent queries, which would hopefully speed up total time to compile the list. |
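The caching approach described above could be sketched roughly as follows. It assumes the Registry can supply, for each dataset, an identifier and a value that changes whenever the file changes (for example a last-modified timestamp). `fetch_report` is a hypothetical callable standing in for the real Validator API request, which is not shown in this thread.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Tuple


class ValidationCache:
    """Cache Validator reports, refetching only when the Registry says a file changed."""

    def __init__(self, fetch_report: Callable[[str], dict], max_workers: int = 8):
        self._fetch = fetch_report  # hypothetical: dataset id -> report
        self._max_workers = max_workers
        self._store: Dict[str, Tuple[str, dict]] = {}  # id -> (modified, report)

    def get_many(self, datasets: Iterable[Tuple[str, str]]) -> Dict[str, dict]:
        """`datasets` is an iterable of (dataset_id, modified_marker) pairs."""
        datasets = list(datasets)
        modified_by_id = dict(datasets)
        # A dataset is stale if it is unseen or its marker differs from the cached one.
        stale = [
            ds_id
            for ds_id, modified in modified_by_id.items()
            if self._store.get(ds_id, (None, None))[0] != modified
        ]
        # Fetch stale reports concurrently to offset the per-request latency.
        with ThreadPoolExecutor(max_workers=self._max_workers) as pool:
            fresh = list(pool.map(self._fetch, stale))
        for ds_id, report in zip(stale, fresh):
            self._store[ds_id] = (modified_by_id[ds_id], report)
        return {ds_id: self._store[ds_id][1] for ds_id in modified_by_id}
```

This sketch keeps the cache in memory for brevity; a real deployment would presumably want it persisted, and would need to account for the validation time lag mentioned in this thread.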
@cormachallinanderilinx do you already have an IATI API key? We can help you get signed up and increase your access level once you're up and running if not. |
This gist is an example response for a file with several ruleset errors, but that is valid IATI data. The summary elements are at the end of the response. |
I have opened several issues against the Validator API repos for us to investigate whether we can make the Validator API more suitable for this use. Depending on complexity and how well this sits alongside other work we're doing on the Validator API, we may be able to make these changes very quickly, or not for several months. The issues are: |
I did a quick check, and there is a fair bit of variation. Over half of the datasets currently known about by the Datastore were validated within 30 minutes; but there is a long tail on this one, some can take a few hours, and if there is a problem--a publisher is flagged, for instance, for too much invalid data too quickly--then full validation may take much longer. |
Hi @robredpath yes I have an API key set up for some work we were looking into previously |
@odscjames could you get in touch with @cormachallinanderilinx via email and make sure that we know which is Derilinx' API key and that it has appropriately high limits? I want to make sure we're ahead of any rate limiting complications. |
We discussed this and we prefer to have them both included, please. What about Warnings @robredpath ? Are they queryable through the API too?
So, is it possible to get a mean average of errors per activity then? |
Is it more useful for them to be provided separately, or added together into one aggregate figure? I'm conscious that fixing a structural issue might then allow validation to proceed to the point where many warnings are triggered, so this number might appear to get worse as the data is actually improving.
Yes,
@cormachallinanderilinx this one's for you! |
I've found the API key and it looks like it is already at high limits. |
I have had to do quite a bit of refactoring here. Here are some examples of searching I added:
### Searching
Search by name of title:
Search by Country: https://staging.iatiregistry.org/publisher/?q=publisher_country%3DAS&sort=title+asc
Search by publisher id: https://staging.iatiregistry.org/publisher/?q=publisher_iati_id%3Dtest_publisher_id_date&sort=title+asc
It seems that searching on both country and publisher_id at the same time is breaking with my latest changes.
### Sorting
Paging is not working in the UI, so just update the URL for now.
@siwhitehouse and @robredpath, if ye would like to have a play around and give me any feedback on how you would like this better implemented in the UI, please let me know. I'm going to work on tests and fix the known issues mentioned above before working on the UI, so ye can have a feel and provide any feedback. |
Searching |
@siwhitehouse
### Search
If you want to search for an exact IATI publisher ID you can also add it here and it should match.
### Country
Search is a bit different as you cannot search on country name.
### Paging
### State
This is the URL to get all publishers that need approval. Do you think we should have a checkbox, a button, or whatever makes sense to add this query through the UI? |
Hello @cormachallinanderilinx - thank you for the update and my apologies for not responding sooner.
### Sorting, no pagination
I searched for the word "foreign" and was returned nineteen organisations. Handy for testing the sorting without pagination.
Created date - looks good
### Sorting, with pagination
I searched for the word "the" and was returned eight pages of results.
### Name
I sorted on "name ascending" and it looked fine until the last entry, "National Association of Municipalities of Benin (ANCB), The", a bit of a jump from the previous result of "Doctors of the World / Medecins du Monde". Clicking on the second page led to a page sorted by "Created Descending". To see the second page of results sorted by name ascending I typed https://staging.iatiregistry.org/publisher/?q=the&sort=name+asc&page=2 directly into the address bar. The first entry was Doctors of the World UK. I think my description of how pagination currently behaves is different to yours, but this may be due to changes you've made since your update. Could you follow my steps, see if you can replicate and look at why "National Association of Municipalities of Benin (ANCB), The" appears out of order, please?
### IATI Organisation Identifier
Using the dropdown menu, all numerical codes (e.g. '30001') start appearing after 'GB-CHC-1000566'.
### Organisation type
Ascending starts from "government". Descending starts from "Academic, Training and Research". I suspect we are ordering by the code value rather than the name. See https://iatistandard.org/en/iati-standard/104/codelists/organisationtype/
### Country/Region
South Africa appears between United States and Uganda. |
@siwhitehouse Name IATI Organisation Identifier Organisation type Country/Region |
@cormachallinanderilinx My apologies in turn for the delay in getting back to you. I assume the pagination still isn't ready for testing.
### Name
I don't understand your question. At the moment it looks to me that you no longer 'normalise' names by placing 'the's at the end of a name. I think that is good for the display, but I suspect we would still want to sort on the 'normalised' version. At the moment all organisations whose names begin with "The" are ordered using it, meaning they are all bunched together. @robredpath can you advise on best practice here, please?
### IATI Organisation Identifier
### Organisation Type and Country/Region
Thanks, these both look good now.
That leaves pagination to be fixed. On Name ordering, we should wait for Rob's opinion. |
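To illustrate sorting on a 'normalised' name while displaying the original, here is a small sketch. It assumes the only normalisation wanted is ignoring a leading "The" (or a trailing ", The") and letter case. `name_sort_key` is a hypothetical helper, not existing Registry code, and whatever convention the rest of the site uses should take precedence.

```python
import re

# Matches a leading "The " or a trailing ", The", in any letter case.
_LEADING_ARTICLE = re.compile(r"^the\s+", re.IGNORECASE)
_TRAILING_ARTICLE = re.compile(r",\s*the$", re.IGNORECASE)


def name_sort_key(name: str) -> str:
    """Key used for ordering only; the displayed name is left untouched."""
    key = name.strip()
    key = _LEADING_ARTICLE.sub("", key)
    key = _TRAILING_ARTICLE.sub("", key)
    return key.casefold()


names = [
    "The National Association of Municipalities of Benin (ANCB)",
    "Oxfam",
    "Doctors of the World / Medecins du Monde",
]
ordered = sorted(names, key=name_sort_key)
```

With this key, names beginning with "The" are ordered by the word that follows, so they are no longer bunched together under "T".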
I have a couple of comments about styling/layout.
### Table settings
The table looks like it has fixed-width columns. Here is a screenshot of the top of the table when I search on 'development'. Could we configure the table display so that it avoids such text wrapping? From a fixed-width perspective, I think we could give more width to the left-hand side columns, taking it from those on the right-hand side. Ideally, the table would adapt to the person's browser/display settings. I don't know the possibilities and limitations of an approach like this though.
### Ordering by table header
I would like ordering by table header to be clear to the user and for it to perform the same sorting as the dropdown, i.e. across all of the returned data. |
@siwhitehouse I have fixed the pagination. I'm still doing a bit of testing myself, but it looks good. On the table header clicks, I will look at this now. |
I think that being clear about the normalisation and having it be consistent across the site is more important than whichever approach we choose - so, whatever we do elsewhere is what we should do here. |
We discussed this on our call today. @cormachallinanderilinx will remove the sort from the column headers in the table, @siwhitehouse will check the pagination and then share this with the rest of IATI Support for feedback. |
Pagination looks good now, thank you @cormachallinanderilinx. Is the API set up for the Staging instance? I'm asking because if I query
then I get a list of organisations, but if I query
I get a 401 response. We'd like to be able to check the organisations in the "approval needed" state through the API and the UI. |
@siwhitehouse python example: Postman: |
@cormachallinanderilinx Thank you. Unfortunately, I am still receiving a
error when I follow your instructions. I logged into my https://staging.iatiregistry.org/user/simonwhitehouse account and I created an API token. I then amended the code you posted above to include my username and API token. Running the code returns the 401. I have just shared the code with you (via Deepnote) for you to troubleshoot. I'd note that originally I was sending this as a get request without authentication, as per https://iatistandard.org/en/iati-tools-and-resources/iati-registry/iati-registry-api/publisher-endpoints/#ListPub |
Hi @siwhitehouse Thanks for sharing the Deepnote, I was able to fix it up there along with 1 or 2 small changes. FYI the API will have paging (offset and limit), by default the limit is 20 so the next page will be: You can also set a higher limit but response will be slower, example of 100 at a time: |
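For anyone who needs to page through the whole list, the offset/limit scheme described above can be walked with a loop like the following sketch. The request URLs are not shown in this thread, so `fetch_page` is a hypothetical callable that takes `offset` and `limit` and returns one page of results as a list. The default of 20 matches the default limit mentioned above.

```python
from typing import Callable, Iterator, List


def iter_all_publishers(
    fetch_page: Callable[..., List[dict]], limit: int = 20
) -> Iterator[dict]:
    """Yield every result by requesting successive pages until one comes back short."""
    offset = 0
    while True:
        page = fetch_page(offset=offset, limit=limit)
        if not page:
            return
        yield from page
        if len(page) < limit:
            # A short page means there is nothing left to fetch.
            return
        offset += limit
```

Counting the items this yields gives a total that can be compared against the number of rows in the UI or in a CSV download.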
When logged in as https://staging.iatiregistry.org/user/simonwhitehouse I receive an internal server error when I click on the link to the last page (117). I also receive an internal server error when I select Order By "Created Ascending". @cormachallinanderilinx I don't know why I am seeing these now when I didn't before. Can you investigate this, please? Let me know if you need any information from me. |
When logged in as https://staging.iatiregistry.org/user/simonwhitehouse I see twenty publishers per page and (I assume) 116 pages return results. So, I would expect to see 2320-2340 publishers in the CSV download. I only see 1357. This is my alternative check on the number of organisations appearing in the UI matching those in the database, as I don't have the coding skills to page through the API. @cormachallinanderilinx I think the check here should be that the data in the CSV download matches that returned in the UI via the query in the URL. The API should also be consistent. This doesn't appear to be the case at the moment. Can you investigate this before we do any more testing, please? Happy to provide more information if you need it. |
Hi @siwhitehouse |
Hi @siwhitehouse |
What do you mean by the "actual publisher code" here, please Cormac? |
aiui we have two use cases for aligning the downloads:
I can't offer an opinion on the detail of how you propose to fix this, other than to say it looks like you are focusing on this end goal. It's fine to spend the time on this, so please do go ahead. I have a couple of other observations:
Finally, I'll leave it to you if you think it is better to set up a separate issue for aligning the Downloads. My preference would be for a new issue at this point. |
|
As a member of IATI Support,
I want to find publishers using the information I have available†
so that I can quickly discover the Registry situation for someone that I'm helping.
† Organisation name (e.g. "Open Data Services"), publishing status, presence of errors in data, org-id, country
In conversation with @cormachallinanderilinx we refined this to:
Checkboxes for "is approved", "has published", "has errors in data": per @siwhitehouse below, this UI could be confusing and so we'd prefer to find a way of making a single control for this.
### Acceptance criteria
††† I don't think that we should have both the table headings and the Order By: dropdown for sorting results. We should choose one. My preference would be for the table headings to sort the whole list.
This search interface can be available to all users, apart from the ability to see unapproved publishers which should be restricted to logged-in Sysadmin users only.
EDIT: Update 2024-01-04 in line with discussions below